Pests and Diseases Forewarning System. Amrender Kumar. Scientist Indian Agricultural Statistics Research Institute, Library Avenue, New Delhi, INDIA akjha@iasri.res.in. Crop – Pests - Weather Relationship. Crop. Weather. Pests.
• Diseases and pests are major causes of reduction in crop yields.
• However, if information about the time and severity of an outbreak is available in advance, timely control measures can be taken to reduce the losses.
• Weather plays an important role in pest and disease development.
• Therefore, weather based models can be an effective scientific tool for forewarning diseases and pests in advance.
Why pests and disease forewarning
• Forewarning / assessment of disease is important for crop production management
• For timely plant protection measures
  - information on whether the disease status is expected to be below or above the threshold level is enough, so models based on qualitative data can be used (qualitative models)
• For loss assessment
  - forewarning of the actual intensity is required (quantitative models)
Variables of interest
• Maximum pest population or disease severity.
• Pest population / disease severity at the most damaging stage (e.g. egg, larva, pupa, adult).
• Pest population or disease severity at different stages of crop growth or at various standard weeks.
• Time of first appearance of pests and diseases.
• Time of maximum population / severity of pests and diseases.
• Weekly monitoring of pest and disease progress.
• Occurrence / non-occurrence of pests and diseases.
• Extent of damage.
Data Structure Historical data at periodical intervals for 10-15 years
Historical data for 10-15 years at one point of time • overall status • disease intensity • crop damage.
• Data for 5-6 years at periodic intervals
  - For week-wise models, data points are inadequate
  - Combined model for the whole data, in two steps
• Data at one point of time for 5-6 years
  - Model development not possible
• Qualitative data for 10-15 years
  - Qualitative forewarning
  - Occurrence / non-occurrence of disease
• Mixed data: conversion to qualitative categories
• Data collected at periodic intervals for one year
  - Within-year growth model
Choice of explanatory variables • Relevant weather variables • appropriate lag periods depending on life cycle • Crop stage / age • Natural enemies • Starting / previous year’s last population of pathogen
Forecast Models
• Between-year models
  - These models are developed using previous years' data.
  - The forecast for pests and diseases is obtained by substituting the current year's data into a model developed on previous years' data.
• Within-year models
  - Sometimes past data are not available, but the pest and disease status at different points of time during the current crop season is available.
  - In such situations a within-year growth model can be used, provided there are 10-12 data points between the time of first appearance of pests and diseases and the maximum or most damaging stage.
  - The methodology consists of fitting an appropriate growth pattern to the partial pest and disease data.
Thumb rules
• Most common
• Extensively used
• Judgment based on past experience, with little or no mathematical background
Example: a day is favourable for potato late blight if
• the average temperature over the last 5 days is < 25.5 °C
• the total rainfall for the last 10 days is > 3.0 cm
• the minimum temperature on that day is > 7.2 °C
Trivedi et al. (1999)
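The thumb rule above can be written as a small predicate. This is only an illustrative sketch: the function name and the list-based inputs are assumptions, and "temperature average" is taken to mean the average of daily mean temperatures. Temperatures are in °C and rainfall in cm, with the thresholds quoted from Trivedi et al. (1999).

```python
def is_blight_favourable(mean_temps, rainfall, min_temp_today):
    """Return True if the day meets all three thumb-rule conditions.

    mean_temps     : daily mean temperatures (deg C), most recent day last
    rainfall       : daily rainfall (cm), most recent day last
    min_temp_today : minimum temperature (deg C) on the day being assessed
    """
    if len(mean_temps) < 5 or len(rainfall) < 10:
        raise ValueError("need at least 5 days of temperature and 10 days of rainfall")
    avg_temp_5d = sum(mean_temps[-5:]) / 5      # last 5-day average
    total_rain_10d = sum(rainfall[-10:])        # last 10-day total
    return avg_temp_5d < 25.5 and total_rain_10d > 3.0 and min_temp_today > 7.2
```

A day is flagged only when all three conditions hold together, which is how the rule is stated; any other combination logic would be a different rule.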
Regression models
• Relationship between two or more quantitative variables
• The model is of the form $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + e$, where
  - $\beta_i$'s are regression coefficients
  - $X_i$'s are independent variables
  - $Y$ is the variable to forecast
  - $e$ is the random error
• Variables can be used as such or after suitable transformations
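A minimal sketch of fitting such a model by ordinary least squares is given below. It assumes NumPy is available; the function names and the small data set in the usage note are made up for illustration and are not from the studies discussed here.

```python
import numpy as np

def fit_linear_model(X, y):
    """Least-squares estimates (b0, b1, ..., bp) for y = b0 + sum(bi * Xi) + e."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so that the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict(coef, X):
    """Forecast Y for new rows of X using fitted coefficients."""
    X = np.asarray(X, dtype=float)
    return coef[0] + X @ coef[1:]
```

For example, with hypothetical data generated exactly from y = 2 + 3 x1 - x2, `fit_linear_model([[1, 0], [0, 1], [1, 1], [2, 1], [3, 5]], [5, 1, 4, 7, 6])` recovers coefficients close to (2, 3, -1). Transformed variables (logs, squares, reciprocals) can be passed in as extra columns of X.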
Cotton
• % incidence of bacterial blight (Akola): weekly models (42nd to 44th SMW)
• Data used: 1993-1999 on MAXTemp, MINTemp, RH1 (morning), RH2 (afternoon) and RF [X1 to X5], lagged by 2 to 4 weeks
• Model for 44th SMW: Y = 133.18 - 3.09 RH2L4 + 1.68 RFL4 ($R^2 = 0.78$)
Potato
• Potato aphid is an abundant potato pest and a vector of potato leaf-roll virus, potato virus Y, PVA, etc.
• Potato aphid population, Pantnagar (weekly models)
• Data used: 1974-96 on MAXT, MINT and RH [X1 to X3], lagged by 2 weeks
• Model for December 3rd week: $Y = 80.25 + 40.25 \cos(2.70\,X_{12} - 14.82) + 35.78 \cos(6.81\,X_{22} + 8.03)$
GDD approach
• $\mathrm{GDD} = \sum (\text{mean temperature} - \text{base temperature})$
• Decisions required: base temperature and initial time
• Not much work has been done on base temperature for various diseases
• Normally the base temperature is taken as 5 °C
• Under Indian conditions, the mean temperature is seldom below 5 °C
• Use of GDD and simple accumulation of mean temperature will therefore give similar results in statistical models
• Work is needed on the base temperature and the initial time of calculation
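A short sketch of the GDD accumulation follows. The function name is an assumption, the default base of 5 °C is the value mentioned above, and flooring each day's contribution at zero is a common convention that is assumed here rather than taken from the slide.

```python
def growing_degree_days(mean_temps, base_temp=5.0):
    """Accumulate (mean temperature - base temperature) over the given days.

    Days colder than the base contribute zero (assumed convention).
    """
    return sum(max(t - base_temp, 0.0) for t in mean_temps)
```

When no day falls below the base, the result equals the simple sum of mean temperatures minus (number of days × base). For a fixed number of days the two accumulations therefore differ only by a constant, which is why they give similar results in a statistical model.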
Under Indian conditions, other variables also important • Model using simple accumulations not found appropriate • Models based on weighted weather indices where
• $Y$: variable to forecast
• $x_{iw}$: value of the $i$-th weather variable in the $w$-th period
• $r_{iw}$: weight given to the $i$-th weather variable in the $w$-th period
• $r_{ii'w}$: weight given to the product of $x_i$ and $x_{i'}$ in the $w$-th period
• $p$: number of weather variables
• $n_1$ and $n_2$: initial and final periods for which weather variables are to be included in the model
• $e$: error term
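One form of the weighted-index model that is consistent with the symbols defined above is sketched below. The coefficient names $a_0$, $a_i$, $a_{ii'}$ and the index names $Z_i$, $Z_{ii'}$ are assumed notation, not taken from the slide.

```latex
Y = a_0 + \sum_{i=1}^{p} a_i Z_i + \sum_{i \neq i'}^{p} a_{ii'} Z_{ii'} + e,
\qquad
Z_i = \sum_{w=n_1}^{n_2} r_{iw}\, x_{iw},
\qquad
Z_{ii'} = \sum_{w=n_1}^{n_2} r_{ii'w}\, x_{iw}\, x_{i'w}
```

Each $Z_i$ collapses the weekly values of one weather variable into a single weighted total, and each $Z_{ii'}$ does the same for the product of two variables, so the regression has far fewer terms than one using every week separately.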
Experience based weights • Subjective weights based on experience. • Weather variable not favourable : weight = 0 • Weather variable favourable : weight = ½ • Weather variable very favourable : weight = 1
Example:
• Favourable relative humidity: 92%
• Most favourable relative humidity: 98%
Weather data (relative humidity, %)
Year    Week 1   Week 2   Week 3   Week 4   Week 5   Week 6
1993    88.7     90.1     94.4     98.3     98.0     95.0
        94.0     93.3     94.9     93.3     92.0     88.1
        90.3     91.9     90.4     87.9     86.4     89.7
...
Weighted Index
• 0 × 88.7 + 0 × 90.1 + 0.5 × 94.4 + 1 × 98.3 + 1 × 98.0 + 0.5 × 95.0 = 291.0
• 0.5 × 94.0 + 0.5 × 93.3 + 0.5 × 94.9 + 0.5 × 93.3 + 0.5 × 92.0 + 0 × 88.1 = 233.75
• 0 × 90.3 + 0 × 91.9 + 0 × 90.4 + 0 × 87.9 + 0 × 86.4 + 0 × 89.7 = 0.0
...
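The weighting in this example can be reproduced with a few lines of code. The sketch assumes the thresholds are inclusive (92.0 receives weight ½ and 98.0 receives weight 1, as in the worked rows); the function names are illustrative.

```python
def experience_weight(rh, favourable=92.0, most_favourable=98.0):
    """Subjective weight: 0 (not favourable), 0.5 (favourable), 1 (very favourable)."""
    if rh >= most_favourable:
        return 1.0
    if rh >= favourable:
        return 0.5
    return 0.0

def weighted_index(values):
    """Sum of weight * value over the weeks of one year."""
    return sum(experience_weight(v) * v for v in values)

# Relative humidity (%) for weeks 1-6, one row per year, from the example.
rows = [
    [88.7, 90.1, 94.4, 98.3, 98.0, 95.0],
    [94.0, 93.3, 94.9, 93.3, 92.0, 88.1],
    [90.3, 91.9, 90.4, 87.9, 86.4, 89.7],
]
indices = [round(weighted_index(r), 2) for r in rows]
```

With the weights as stated, the three rows give 291.0, 233.75 and 0.0 respectively.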
Interaction:
• Both variables not favourable: weight = 0
• One variable not favourable, one variable favourable: weight = 1/8
• One variable not favourable, one variable highly favourable: weight = ¼
• Both variables favourable: weight = ½
• One variable favourable, one variable highly favourable: weight = ¾
• Both variables highly favourable: weight = 1
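The interaction weights can be held in a small lookup table. In this sketch the numeric coding of favourability levels (0, 1, 2) and the function names are assumptions made for illustration.

```python
# Favourability levels (an assumed encoding): 0 = not favourable,
# 1 = favourable, 2 = highly favourable.
INTERACTION_WEIGHTS = {
    (0, 0): 0.0,
    (0, 1): 1 / 8,
    (0, 2): 1 / 4,
    (1, 1): 1 / 2,
    (1, 2): 3 / 4,
    (2, 2): 1.0,
}

def interaction_weight(level_a, level_b):
    """Weight for a pair of variables; the order of the two levels does not matter."""
    return INTERACTION_WEIGHTS[tuple(sorted((level_a, level_b)))]

def interaction_index(x1, x2, levels1, levels2):
    """Sum over periods of weight * x1 * x2."""
    return sum(
        interaction_weight(a, b) * u * v
        for u, v, a, b in zip(x1, x2, levels1, levels2)
    )
```

Sorting the pair of levels means only six entries are needed to cover all nine combinations.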
Correlation based weights
• $r_{iw}$: correlation coefficient between $Y$ and the $i$-th weather variable in the $w$-th period
• $r_{ii'w}$: correlation coefficient between $Y$ and the product of $x_i$ and $x_{i'}$ in the $w$-th period
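A sketch of how correlation based weights might be computed and applied is shown below. The data layout (one list of yearly values per period) and the function names are assumptions. The same functions work for interaction terms if the products of two weather variables are passed in place of a single variable.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)  # undefined if either series is constant

def correlation_weights(y, x_by_period):
    """One weight per period: correlation of Y with the weather variable in that period.

    x_by_period[w][t] is the value of the variable in period w of year t.
    """
    return [pearson(y, x_w) for x_w in x_by_period]

def weighted_indices(weights, x_by_period):
    """Weighted total over periods, one value per year."""
    n_years = len(x_by_period[0])
    return [
        sum(r * x_w[t] for r, x_w in zip(weights, x_by_period))
        for t in range(n_years)
    ]
```

Periods in which the weather variable is strongly related to the variable being forecast therefore contribute most to the index, and periods with negative correlation contribute with a negative sign.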
Modified model • Model using both weighted and unweighted indices where
For each weather variable, two types of indices have been developed:
• Simple total of the values of the weather variable in different periods
• Weighted total, the weights being correlation coefficients between the variable to forecast and the weather variable in the respective periods
The first index represents the total amount of the weather variable received by the crop during the period under consideration. The other one takes care of the distribution of the weather variable, with reference to its importance in different periods in relation to the variable to forecast. On similar lines, composite indices were computed with products of weather variables (taken two at a time) for joint effects.
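A form of the modified model consistent with this description is sketched below. The coefficient names $a_0$, $a_{ij}$, $a_{ii'j}$ are assumed notation; the index subscripts are chosen so that a term such as Z121 in the fitted models that follow reads as the weighted ($j = 1$) index for the product of weather variables 1 and 2.

```latex
Y = a_0 + \sum_{i=1}^{p} \sum_{j=0}^{1} a_{ij} Z_{ij}
        + \sum_{i \neq i'}^{p} \sum_{j=0}^{1} a_{ii'j} Z_{ii'j} + e,
\qquad
Z_{ij} = \sum_{w=n_1}^{n_2} r_{iw}^{\,j}\, x_{iw},
\qquad
Z_{ii'j} = \sum_{w=n_1}^{n_2} r_{ii'w}^{\,j}\, x_{iw}\, x_{i'w}
```

Setting $j = 0$ makes every weight equal to 1 and gives the simple total; $j = 1$ gives the correlation-weighted total.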
Pigeon pea Phytophthora blight (Kanpur)
• Average percent incidence of Phytophthora blight at one point of time
• Data used: 1985-86 to 1999-2000 on MAXT, MINT, RH1, RH2 and RF (X1-X5) from 28th to 33rd SMW
• $Y = 330.77 + 0.12\,Z_{121}$ ….. ($R^2 = 0.77$)
Sterility Mosaic
• Average percent incidence of sterility mosaic
• Data used: 1983-84 to 1999-2000 on MAXT, MINT, RH1, RH2 and RF (X1-X5) from 20th to 32nd SMW
• $Y = -180.41 + 0.09\,Z_{121}$ …… ($R^2 = 0.84$)
Groundnut Late Leaf Spot & Rust (Tirupati)
• Disease indices at one point of time
• Data used: MAXT, MINT, RH1, RH2, RF and WS (X1-X6) from
  - 10th to 14th SMW (Rabi or post-rainy)
  - 41st to 46th SMW (Kharif or rainy)
Models for LLS and Rust Disease Index - groundnut (Tirupati)
Principal component regression
• Used when the independent variables are numerous and correlated
• Independent variables are transformed to principal components
• The first few principal components explaining the desired variation are selected
• Regression model is fitted using the principal components as regressors
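The steps above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions: the function names are made up, variables are only centred (not scaled to unit variance, which is often advisable when weather variables have different units), and the number of components is chosen by the user.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Regress y on the first n_components principal components of X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = X.mean(axis=0)
    Xc = X - x_mean
    # Rows of Vt are the principal directions, ordered by variance explained.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components].T
    scores = Xc @ components
    design = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    explained = s ** 2 / np.sum(s ** 2)  # share of variation per component
    return {"mean": x_mean, "components": components, "coef": coef,
            "explained": explained}

def pcr_predict(model, X):
    """Forecast from new weather data using a fitted PCR model."""
    scores = (np.asarray(X, dtype=float) - model["mean"]) @ model["components"]
    return model["coef"][0] + scores @ model["coef"][1:]
```

The `explained` vector can be inspected to decide how many components are needed to reach the desired share of variation before refitting with that number.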
Discriminant function analysis
• Years are grouped into categories (low, medium, high) based on disease status
• Linear / quadratic discriminant function is fitted using weather data in these categories
• Discriminant score of weather is computed for each year
• Regression model is fitted with disease data as the dependent variable and discriminant scores of weather as independent variables
• Requires more data
• Can also be used if disease data are qualitative
• Johnson et al. (1996) used discriminant analysis for forecasting potato late blight
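A simplified sketch of the two-stage idea is given below, assuming NumPy. It uses only two groups (low = 0, high = 1) and Fisher's linear discriminant in place of the three-category linear / quadratic functions described above; the function names are illustrative.

```python
import numpy as np

def fisher_direction(X, labels):
    """Fisher linear discriminant direction for two groups labelled 0 and 1."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    X0, X1 = X[labels == 0], X[labels == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-group scatter matrix.
    S = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    return np.linalg.solve(S, m1 - m0)

def discriminant_scores(X, w):
    """Project each year's weather data onto the discriminant direction."""
    return np.asarray(X, dtype=float) @ w

def regress_on_scores(scores, disease):
    """Second stage: simple regression of disease on scores -> (intercept, slope)."""
    slope, intercept = np.polyfit(scores, disease, 1)
    return intercept, slope
```

The discriminant score condenses several weather variables into one number per year, so the second-stage regression needs only a single regressor per discriminant function.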
Deviation method • Useful when only 5-6 year data available for different periods • Week-wise data not adequate for modeling • Combined model considering complete data. • Not used for disease forewarning but in pest forewarning
Assumption : pest population / disease incidence in particular year at a given point of time composed of two components. • Natural growth pattern • Weather fluctuations • Natural pattern to be identified using data in different periods averaged over years. • Deviation of individual years in different periods from predicted natural pattern to be related with deviations of weather.
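The decomposition can be sketched as follows. As a simplification, the natural pattern here is just the week-wise average over years rather than a fitted curve in week number; the function names and data layout (rows = years, columns = weeks) are assumptions.

```python
def natural_pattern(values_by_year):
    """Average value for each week across years (rows = years, columns = weeks)."""
    n_years = len(values_by_year)
    return [sum(week) / n_years for week in zip(*values_by_year)]

def deviations(values_by_year, pattern):
    """Deviation of each year's weekly values from the common pattern."""
    return [[v - p for v, p in zip(row, pattern)] for row in values_by_year]
```

Applying the same two functions to pest counts and to each weather variable gives the deviations that are then pooled over all weeks and years and related to each other in a single combined model.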
Mango
• Mango fruitfly, Lucknow (weekly models)
• Data used: 1993-94 to 1998-99 on MAXT, MINT and RH [X1 to X3]
• Model for natural pattern, where
  - $t$ = week number
  - $Y_t$ = fruitfly population count at week $t$
Forecast model
$Y = 125.766 + 0.665\,Y_2 + 0.115\,(1/X_{22}^2) + 10.658\,X_{12}^2 + 0.0013\,Y_3^2 + 31.788\,(1/Y_3) - 21.317\,X_{12} - 2.149\,(1/X_{33}^2) - 1.746\,(1/X_{34}^2)$
where
• $Y$ = deviation of fruitfly population from the natural cycle
• $Y_i$ = fruitfly population in the $i$-th lag week
• $X_{ij}$ = deviation from average of the $i$-th weather variable ($i$ = 1, 2, 3 corresponds to maximum temperature, minimum temperature and relative humidity) in the $j$-th lag week
With the development of computer hardware and software and the rapid computerization of business, huge amounts of data have been collected and stored in centralized or distributed databases. The data are heterogeneous (a mixture of text, symbolic, numeric, texture and image data), huge (both in dimension and size) and scattered, and the volume stored is growing at a phenomenal rate. As a result, traditional statistical techniques and data management tools are no longer adequate for analyzing this vast collection of data.
One application of information technology that has drawn the attention of researchers is data mining, to which pattern recognition, image processing and machine intelligence (the development of algorithms and techniques that allow a system to "learn") are directly related.
Data mining involves
• Statistics: provides the background for the algorithms.
• Artificial intelligence: provides the heuristics required for the system to learn.
• Data management: provides the platform for storage and retrieval of raw and summary data.
In data mining, pattern recognition and machine learning principles are applied to a very large (both in size and dimension) heterogeneous database for knowledge discovery. Knowledge discovery is the process of identifying valid, novel, potentially useful and ultimately understandable patterns in data. Patterns may include associations, correlations, trends, anomalies, statistically significant structures, etc. Without soft computing, machine intelligence and data mining may remain incomplete.
Soft Computing
Soft computing is a multidisciplinary field proposed by Dr. Lotfi Zadeh, whose goal was to construct a new generation of artificial intelligence, known as computational intelligence. The concept has evolved over time. In its latest form, Dr. Zadeh defined soft computing as the fusion of fuzzy logic, neural networks and neuro-computing, evolutionary and genetic computing, and probabilistic computing into one multidisciplinary system. It brings together methodologies designed to model and solve real-world problems that have not been modeled or are too difficult to model mathematically. Such problems are typically associated with fuzzy, complex and dynamical systems with uncertain parameters. These are the systems that describe the real world and are of most interest to modern science.
The main goal of soft computing is to develop intelligent systems and to solve nonlinear and mathematically unmodelled system problems [Zadeh 1993, 1996, and 1999]. Its applications have two main advantages. First, it makes it possible to solve nonlinear problems for which mathematical models are not available. Second, it introduces human knowledge such as cognition, recognition, understanding and learning into the field of computing. This has made it possible to construct intelligent systems such as autonomous self-tuning systems and automated design systems.
Soft computing tools
Soft computing tools include
• Fuzzy sets: provide a natural framework for dealing with uncertainty
• Artificial neural networks: widely used for modelling complex functions; provide learning and generalization capabilities
• Genetic algorithms: an efficient search and optimization tool
• Rough set theory: helps in granular computation and knowledge discovery
Why neural networks are desirable
• The human brain can generalize from the abstract
• Recognize patterns in the presence of noise
• Recall memories
• Make decisions for current problems based on prior experience
Why desirable in statistics
• Prediction of future events based on past experience
• Able to classify patterns in memory
• Predict latent variables that are not easily measured
• Non-linear regression problems
Applications of ANNs
• Classification: medical diagnosis, signature verification, character recognition, voice recognition, image recognition, face recognition, loan risk evaluation, data mining
• Modelling and control: control systems, system identification, composing music
• Forecasting: economic indicators, energy requirements, medical outcomes, crop forecasts, environmental risks
Neural networks are being successfully applied across an extraordinary range of problem domains, in areas as diverse as finance, medicine, engineering, geology, biology, physics and agriculture. From a statistical perspective, neural networks are interesting because of their potential use in prediction and classification problems. A very important feature of these networks is their adaptive nature, where "learning by example" replaces "programming" in solving problems. The basic capability of neural networks is to learn patterns from examples.