14 January 2009
2009 AMS Artificial Intelligence Conference
A Data Mining Approach to Soil Temperature and Moisture Prediction
Bill Myers, Seth Linden, Gerry Wiener
Project Overview and Goals
• Improve soil temperature and moisture prediction
• Integrate and evaluate NASA-MODIS data sets
  • Leaf Area Index (LAI)
  • Green Vegetation Fraction (i.e. FPAR)
  • Albedo
• Deliver tailored products to end users
  • Soil forecasts will drive agriculture-specific models (e.g. pest models)
  • RAL partnered with DTN/Meteorlogix
  • The DTN DSS delivers Ag-specific forecasts to 80,000 users
Soil State Prediction
[Diagram labels: Solar Energy, Weather, Subsurface Nodes, Fixed Node]
• Current soil state is modified by atmospheric forcing conditions
• Heat and moisture are transferred between adjacent nodes
• Typically done with a physical model called a Land-Surface Model (LSM)
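As a rough illustration of heat being passed between adjacent nodes, the sketch below advances a soil column by one explicit finite-difference step. It is not the Noah LSM. It assumes evenly spaced nodes, a constant thermal diffusivity, atmospheric forcing reduced to a prescribed surface temperature, and a fixed node at the bottom of the column. Moisture transfer is left out, and all names and numbers are hypothetical.

```python
def step_soil_temperature(temps, surface_temp, diffusivity, dz, dt):
    """Advance soil node temperatures by one explicit diffusion step.

    temps[0] is the shallowest node; temps[-1] is the fixed node (assumed
    here to sit at the bottom of the column) and is never updated.
    """
    r = diffusivity * dt / dz ** 2
    if r > 0.5:
        raise ValueError("time step too large for a stable explicit scheme")
    new_temps = list(temps)
    for i in range(len(temps) - 1):  # skip the fixed node
        above = surface_temp if i == 0 else temps[i - 1]
        below = temps[i + 1]
        # Each node exchanges heat with its two neighbours.
        new_temps[i] = temps[i] + r * (above - 2.0 * temps[i] + below)
    return new_temps


# Hypothetical column: four nodes at 10 C under a 20 C surface, one-hour step.
column = [10.0, 10.0, 10.0, 10.0]
stepped = step_soil_temperature(column, surface_temp=20.0,
                                diffusivity=5e-7, dz=0.1, dt=3600.0)
```

Only the top node warms in the first step, because deeper nodes see no temperature difference with their neighbours yet.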
Physical Model
• This project uses the High Resolution Land Data Assimilation System (HRLDAS) and the Noah LSM
  • Used by NCEP as part of the NAM (WRF model)
• Many parameters are necessary to model soil type and land surface characteristics
  • These affect incident solar energy, heat transfer, etc.
• Parameters must be generalized
  • “Sandy loam” will have the same parameterization at all sites
  • Chemical compositions of “sandy loam” differ between sites
  • Heat and moisture transfer will not be exact at ANY site
• Goal of this study: determine whether a data mining approach can produce results comparable to those of the physical model
Data Mining System
• Regression tree (Cubist)
  • Available from www.rulequest.com
  • Looks for patterns in data
  • Builds rule-based numerical models
  • Rules are developed based on training data
  • At each leaf node, a regression equation is developed that best fits that subset of the training data
  • Effectively, linear approximations are made when certain conditions are met
  • Soil state forecasts are generated by applying the rule set to forecast data
• Training data
  • 29 Soil Climate Analysis Network (SCAN) sites
  • Two years of observational history at each site used to develop rules
  • NCAR scientists were consulted to determine the most important inputs to soil state evolution
  • These were extracted or derived from the observed variable set
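The idea of fitting a regression equation at each leaf can be sketched with a toy model tree. This is not Cubist's algorithm, which handles many inputs and builds full rule sets. The sketch assumes a single input, a single split, and ordinary least squares in each leaf, and every function name and data value is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        return mean_y, 0.0
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - b * mean_x, b


def squared_error(xs, ys, a, b):
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def fit_one_split_model_tree(xs, ys, min_leaf=2):
    """Choose the threshold whose two linear leaves give the lowest total error."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        lx, ly = zip(*pairs[:i])
        rx, ry = zip(*pairs[i:])
        left = fit_line(lx, ly)
        right = fit_line(rx, ry)
        err = squared_error(lx, ly, *left) + squared_error(rx, ry, *right)
        if best is None or err < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
            best = (err, threshold, left, right)
    if best is None:
        raise ValueError("not enough distinct data to split")
    return best[1], best[2], best[3]


def predict(tree, x):
    """Apply the rule: pick the leaf by threshold, then use its linear model."""
    threshold, left, right = tree
    a, b = left if x <= threshold else right
    return a + b * x


# Hypothetical data with a change of regime at x = 5.
xs = list(range(10))
ys = [2.0 * x if x < 5 else 30.0 - 3.0 * x for x in xs]
tree = fit_one_split_model_tree(xs, ys)
```

Each leaf here plays the role of one rule: a condition on the input plus a linear equation that applies only when the condition holds.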
Regression Tree Model Generation
• 10 regression trees were developed for each site
  • One regression tree each for soil temperature and soil moisture at each depth (5, 10, 20, 50, 100 cm)
• Input variables:
  • Julian day
  • Air temperature
  • Delta air temperature (in current hour)
  • Downward shortwave radiation
  • Wind speed
  • Dew point temperature
  • Precipitation amount
  • Previous soil state: previous hour’s soil temperature and moisture at adjacent depths
• A target variable (e.g. current soil temperature at 5 cm) was provided with each hour’s data
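One way to assemble such hourly training rows is sketched below. It assumes the observations arrive as a time-ordered list of hourly records and that the delta air temperature is the difference between consecutive hourly values. The slides do not say how it was derived, so treat that as a guess. The record keys and values are hypothetical, and only a few of the listed inputs are included.

```python
def build_training_rows(hourly):
    """Pair each hour's weather with the previous hour's soil state and the target."""
    rows = []
    for prev, curr in zip(hourly, hourly[1:]):
        rows.append({
            "AirT": curr["AirT"],
            # Assumption: change in air temperature between consecutive hours.
            "deltaT": curr["AirT"] - prev["AirT"],
            "dsw": curr["dsw"],
            # Previous soil state at the target depth and the adjacent depth.
            "ST5_prev": prev["ST5"],
            "ST10_prev": prev["ST10"],
            # Target variable: current 5 cm soil temperature.
            "ST5_curr": curr["ST5"],
        })
    return rows


# Hypothetical observations for three consecutive hours at one site.
hourly_obs = [
    {"AirT": 6.0, "dsw": 0.0, "ST5": 8.4, "ST10": 9.9},
    {"AirT": 5.0, "dsw": 0.0, "ST5": 8.2, "ST10": 9.8},
    {"AirT": 4.5, "dsw": 0.0, "ST5": 8.0, "ST10": 9.7},
]
rows = build_training_rows(hourly_obs)
```

The first hour yields no row because it has no previous soil state to draw on.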
Example training data
Names file:
  | Names file for 5cm temperature prediction
  ST5_curr | Predictand in list of variables below
  siteID: ignore | SCAN site ID
  date: ignore | YYYYMMDDHH
  mon: continuous | fraction of Julian year
  AirT: continuous | 2m air temp (avg over last hr)
  deltaT: continuous | air temp change over last hour
  dsw: continuous | avg downward shortwave radiation over last hr
  wspd: continuous | avg wind speed over last hour
  TD: continuous | avg dew point temp over last hour
  qpf: continuous | precip amt over last hour
  ST5_prev: continuous | 5 cm soil temp at previous hour
  ST10_prev: continuous | 10 cm soil temp at previous hour
  SM5_prev: continuous | 5 cm soil moisture at previous hour
  SM10_prev: continuous | 10 cm soil moisture at previous hour
  ST5_curr: continuous | 5 cm soil temp at current hour
Sample line of training data:
  2001, 2007110211, 0.9167, 4.53, -0.89, 0.00, 2.81, -3.28, 0.00, 8.158, 9.847, 33.858, 39.616, 8.32
Annotations on the sample line:
  • 0.9167: time of year
  • 4.53: air temp
  • -0.89: air temp falling in this hour
  • 0.00: no downward radiation (night)
  • 2.81: wind speed
  • -3.28: dewpoint temp
  • 0.00: no precip
  • 8.158, 9.847: previous hour’s soil temperature at 5 cm and 10 cm
  • 33.858, 39.616: previous hour’s soil moisture at 5 cm and 10 cm
  • 8.32: current hour’s 5 cm soil T (predictand)
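To make the column order concrete, the sketch below reads the sample line into named fields, following the order of the names file and setting aside the two columns marked "ignore". The parser is an illustration written for this example and is not part of Cubist. The function and constant names are hypothetical.

```python
FIELDS = [
    "siteID", "date", "mon", "AirT", "deltaT", "dsw", "wspd", "TD", "qpf",
    "ST5_prev", "ST10_prev", "SM5_prev", "SM10_prev", "ST5_curr",
]
IGNORED = {"siteID", "date"}   # marked "ignore" in the names file
TARGET = "ST5_curr"            # the predictand


def parse_training_line(line):
    """Split one comma-separated case into (site, date, predictors, target)."""
    values = [value.strip() for value in line.split(",")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} values, got {len(values)}")
    record = dict(zip(FIELDS, values))
    predictors = {
        name: float(record[name])
        for name in FIELDS
        if name not in IGNORED and name != TARGET
    }
    return record["siteID"], record["date"], predictors, float(record[TARGET])


sample = ("2001, 2007110211, 0.9167, 4.53, -0.89, 0.00, 2.81, -3.28, 0.00, "
          "8.158, 9.847, 33.858, 39.616, 8.32")
site, date, predictors, target = parse_training_line(sample)
```

The site ID and date are kept as strings because they identify the case and are not used as model inputs.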
Rules Development and Application
• Regression trees generated for each predictand at each site
  • Separate tree for soil temperature and moisture at each depth
  • Two years of training data for most sites
• Example rule and associated regression:
    if dsw <= 0.09 and ST5_prev > 12.05
    ST5_curr = -0.211 + 0.3165 dsw + 0.83 ST5_prev + 0.13 ST10_prev + 0.02 AirT + 0.02 TD
• 48-hour forecasts were generated iteratively
  • Starting with the observed soil state and the first hour’s weather predictions
  • Regression trees were applied for each predictand to generate the forecast state at hour 1
  • Using the forecast soil state and weather predictions, each subsequent hour’s forecast was generated in turn
• Soil forecasts generated for the 2007 growing season (April-June)
• Data mining and HRLDAS forecasts were compared to observations
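The example rule and the iterative forecast loop can be sketched as follows. The rule's condition and coefficients come from the slide. Everything else is an assumption made for illustration: the function names, the persistence fallback used when no rule fires, the stand-in model for the 10 cm temperature, and the weather values.

```python
def st5_rule(inputs):
    """The example rule from the slide; returns None when its conditions fail."""
    if inputs["dsw"] <= 0.09 and inputs["ST5_prev"] > 12.05:
        return (-0.211
                + 0.3165 * inputs["dsw"]
                + 0.83 * inputs["ST5_prev"]
                + 0.13 * inputs["ST10_prev"]
                + 0.02 * inputs["AirT"]
                + 0.02 * inputs["TD"])
    return None


def forecast_iteratively(initial_state, weather_by_hour, models):
    """Feed each hour's forecast soil state back in as the next hour's input."""
    state = dict(initial_state)
    forecasts = []
    for weather in weather_by_hour:
        inputs = dict(weather)
        inputs.update({f"{name}_prev": value for name, value in state.items()})
        next_state = {}
        for name, model in models.items():
            predicted = model(inputs)
            # Assumption: keep the previous value when no rule applies.
            next_state[name] = state[name] if predicted is None else predicted
        state = next_state
        forecasts.append(state)
    return forecasts


# Hypothetical setup: observed starting state and 48 hours of constant weather.
observed = {"ST5": 15.0, "ST10": 14.0}
weather = [{"dsw": 0.0, "AirT": 10.0, "TD": 5.0}] * 48
models = {"ST5": st5_rule, "ST10": lambda inputs: None}  # ST10 model is a stand-in
forecasts = forecast_iteratively(observed, weather, models)
```

A full rule set would cover every combination of conditions, so the fallback would not be needed. It is here only because a single rule is shown.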
Results
• Statistically, data mining performed better than HRLDAS at nearly all of the 29 SCAN sites
• Median (and quartile) MAEs were significantly lower for data mining
• Data mining errors were generally 30% or more lower than HRLDAS errors
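For reference, the comparison statistics named here (MAE, its median across sites, and the relative error reduction) can be computed as in the sketch below. The numbers are invented for illustration and are not the study's results. The function names are hypothetical.

```python
from statistics import median


def mean_absolute_error(forecasts, observations):
    """Average absolute difference between paired forecasts and observations."""
    if not forecasts or len(forecasts) != len(observations):
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(f - o) for f, o in zip(forecasts, observations)) / len(forecasts)


def percent_reduction(candidate_mae, baseline_mae):
    """How much lower the candidate error is, as a percentage of the baseline."""
    if baseline_mae == 0.0:
        raise ValueError("baseline error must be non-zero")
    return 100.0 * (baseline_mae - candidate_mae) / baseline_mae


# Invented example values, not taken from the study.
site_mae = mean_absolute_error([1.0, 2.0, 4.0], [1.5, 2.0, 3.0])
median_mae = median([0.5, 0.7, 0.9])          # median of per-site MAEs
reduction = percent_reduction(0.7, 1.0)       # candidate vs. baseline MAE
```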
Summary
• Data mining with Cubist regression trees
  • Reduces soil temperature and moisture errors
  • Rules are simple to develop
  • Rules/regressions can be displayed easily
  • Regression tree forecasts are tuned to the site
  • HRLDAS forecast parameters are more generic
• Applicability to non-observing sites
  • Rules, as developed, are site specific
  • Not valid away from that location
  • HRLDAS can generate forecasts at any location
  • Observing sites do not begin to cover all land use and soil type combinations
Future Directions
• Add vegetation state (from NASA MODIS data) to the data mining training sets to determine whether these results can be improved upon
• Train Cubist with all observing sites lumped together, but include land use and soil type as input variables
• Investigate combining the data mining approach and the LSM to get the best of both
Acknowledgements
• This research effort has been supported by a NASA-ROSES grant.
• We appreciate the help provided by personnel at the USDA Natural Resources Conservation Service and various NASA labs.
• Soil forecast web site: www.rap.ucar.edu/projects/nasa-ag/hrldas/display_hrldas_animation.html
• Cubist is available at www.rulequest.com