NPS

NPS. Robert DeFeo, chief horticulturalist for the National Park Service, is responsible for predicting when the cherry blossoms bloom. He has been making predictions and recording observations since 1992

guang

Presentation Transcript


  1. NPS • Robert DeFeo, chief horticulturalist for the National Park Service, is responsible for predicting when the cherry blossoms bloom. • He has been making predictions and recording observations since 1992 • He defined and recorded six phases of development: Green Color in Buds, Florets Visible, Extension of Florets, Peduncle Elongation, Puffy White, Peak Bloom

  2. NPS • DeFeo attempts to predict when the trees will reach peak bloom so that this will occur during the Cherry Blossom Festival • Default prediction is April 4th • He announces his prediction 2 weeks before the bloom

  3. NPS • What factors does DeFeo consider? • Bloom of other plants • High and low temperature • Photoperiod

  4. Collecting the Data • DeFeo provided bloom dates going back to 1921 and dates for the six development stages going back to 1992

  5. Collecting the Data • Weather data was obtained from the Weather Underground site • Data was pulled using screen scraping • Weather data started in 1948 • Incomplete data for 1948, 1994, 1995, 1996, 2000, and 2001.

  6. Data Collected

  7. Visualizing the Data • Year and average temperature in March

  8. Visualizing the Data • Year and Bloom Date

  9. Visualizing the Data • Bloom and temperature

  10. What can we learn from this? • Data “looks” linear • Strong correlation between temperature and when the peak bloom occurs
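
The strength of the relationship on this slide can be checked with a Pearson correlation coefficient. The following is a minimal sketch; the function name `pearson_r` and the idea of pairing each year's average March temperature with its bloom day are illustrative assumptions, not taken from the presentation.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations about the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

A value near -1 would indicate that warmer months go with earlier bloom dates, which is the direction the negative regression slopes in later slides suggest.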

  11. Heuristics • GDD (growing degree days) • Given by the equation: • GDD = (T_high + T_low)/2 - T_baseline • Calculated accumulated GDD using baseline values from 0°F to 60°F in 10° increments • Used linear regression • Found that a 0°F baseline produced the best adjusted R² value.
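
The GDD equation above can be sketched in a few lines. The function names and the sample temperatures are hypothetical. The sketch follows the slide's formula literally and does not clamp negative daily values to zero, which some GDD definitions do.

```python
def daily_gdd(t_high, t_low, t_base=0.0):
    """GDD for one day: mean of the high and low temperature minus the baseline (°F)."""
    return (t_high + t_low) / 2.0 - t_base

def accumulated_gdd(daily_temps, t_base=0.0):
    """Running GDD total over a sequence of (high, low) pairs, one pair per day."""
    total = 0.0
    running = []
    for t_high, t_low in daily_temps:
        total += daily_gdd(t_high, t_low, t_base)
        running.append(total)
    return running

# Hypothetical three-day sample of (high, low) temperatures in °F.
sample = [(50, 30), (60, 40), (45, 35)]
print(accumulated_gdd(sample, t_base=0.0))
```

Repeating the call with `t_base` set to 10, 20, and so on up to 60 would reproduce the kind of baseline sweep the slide describes.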

  12. Calculated the accumulated GDD from Jan 1 to bloom date • Created a program that reserved the records for 26 of the 53 years • Performed linear regression • Calculated RMSE on the cross-validation set
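
The procedure on this slide can be sketched as a split, a least-squares fit, and an RMSE calculation. This is an illustration only: the slide does not say how the 26 years were chosen, so the random split, the assumption that the reserved years form the validation set, and all function names here are hypothetical.

```python
import math
import random

def split_years(records, n_reserved, seed=0):
    """Shuffle a copy of the records and reserve n_reserved of them for validation."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    return shuffled[n_reserved:], shuffled[:n_reserved]  # (training, reserved)

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(xs, ys, slope, intercept):
    """Root mean squared error of the fitted line on the given points."""
    squared = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))
```

In use, the line would be fitted on the training years and `rmse` evaluated only on the reserved years, so the error reflects data the model has not seen.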

  13. Heuristics • Regression: -0.0242095469666847x+88.8862936319553 • Total error: 7.2 • March 1st error: 5.7 • March 15th error: 5.8 • Bloom Date error: 8.5
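
To show how a regression like the one above might be applied, here is a hedged sketch. The slide does not state units, so this assumes x is the accumulated GDD since January 1 (0°F baseline) and the output is the number of days remaining until peak bloom. The function names and the sample GDD value are hypothetical.

```python
from datetime import date, timedelta

# Coefficients copied from the slide; their interpretation is an assumption.
SLOPE = -0.0242095469666847
INTERCEPT = 88.8862936319553

def days_until_bloom(accumulated_gdd):
    """Assumed meaning of the regression: days remaining until peak bloom."""
    return SLOPE * accumulated_gdd + INTERCEPT

def predicted_bloom_date(today, accumulated_gdd):
    """Add the rounded days-until-bloom estimate to the observation date."""
    return today + timedelta(days=round(days_until_bloom(accumulated_gdd)))

# Hypothetical reading: 2400 accumulated GDD observed on March 1.
print(predicted_bloom_date(date(2011, 3, 1), 2400.0))
```

The same pattern would apply to the other GDD regressions in the following slides by swapping in their coefficients and start dates.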

  14. Heuristics • Used linear regression on same data but excluded January and February from regression model • Regression: -0.0216093918184858x+84.1086368228915 • Total error: 5.9 • March 1st error: 5.9 • March 15th error: 6.3 • Bloom Date error: 5.6

  15. Heuristics • Used linear regression on same data but excluded January from regression model • Regression: -0.0217918555433408x+85.0674318813992 • Total error: 6.2 • March 1st error: 6.6 • March 15th error: 6.2 • Bloom Date error: 5.9

  16. Heuristics • Recalculate GDD excluding January and February • Regression: -0.0189188233985223x+33.4894170814518 • Total error: 5.7 • March 1st error: 6.9 • March 15th error: 6.4 • Bloom Date error: 6.4

  17. Calculate GDD beginning February 1st • Regression: -0.022215061163981x+61.0588653664869 • Total error: 5.8 • March 1st error: 6.1 • March 15th error: 5.8 • Peak Bloom error: 5.2

  18. Heuristics • Calculate GDD beginning February 1st. Create regression model starting March 1st. • Regression: -0.021961372553719x+59.6303409305679 • Total error: 5.5 • March 1st error: 5.3 • March 15th error: 5.0 • Peak Bloom error: 4.8

  19. Heuristics • Use GDD with ANN • Use accumulated GDD since January 1st as input • Preprocessed data to create a single lag-file for all the years • Processed data using CortexPro Neural Networks tool, v.5.0 • Days till bloom is output

  20. Total RMSE: 6.7

  21. Heuristics • Use average temperature as indicator of bloom date • Use linear regression on average temperature in March • Regression: -1.1550140341924x+147.033044143914 • RMSE: 4.8 • Use linear regression on average temperature of first 15 days in March • Regression: -0.505593057443286x+115.921494363343 • RMSE: 6.0
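
The March-average regression above can be turned into a calendar date with a short sketch. The slide does not state units, so this assumes x is the average March temperature in °F and the output is the bloom day counted from January 1. The function names are hypothetical.

```python
from datetime import date, timedelta

# Coefficients copied from the slide; the day-of-year interpretation is an assumption.
SLOPE = -1.1550140341924
INTERCEPT = 147.033044143914

def predict_bloom_day(avg_march_temp_f):
    """Assumed meaning of the regression: predicted bloom day of the year."""
    return SLOPE * avg_march_temp_f + INTERCEPT

def day_of_year_to_date(day_of_year, year):
    """Convert a (possibly fractional) day of the year to a calendar date."""
    return date(year, 1, 1) + timedelta(days=round(day_of_year) - 1)

# Hypothetical input: an average March temperature of 46°F in a non-leap year.
print(day_of_year_to_date(predict_bloom_day(46.0), 2011))
```

Under these assumptions, an average of 46°F maps to about day 94, which is April 4 in a non-leap year and matches the default prediction mentioned earlier.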

  22. Heuristics • Use average bloom date (April 4th) as prediction. • RMSE: 6.5
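
The baseline on this slide amounts to predicting the same day every year and measuring the RMSE against the observed bloom days. A minimal sketch follows, assuming bloom dates are expressed as day of the year and that April 4 is day 94 (true for non-leap years). The function name and sample values are hypothetical.

```python
import math

def constant_baseline_rmse(observed_days, predicted_day=94):
    """RMSE when every year is predicted to bloom on the same day of the year."""
    squared = [(day - predicted_day) ** 2 for day in observed_days]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical observations: two years blooming on day 90 and day 98.
print(constant_baseline_rmse([90, 98]))
```

Any model in the preceding slides is only useful to the extent that its RMSE falls below this constant-prediction figure.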

  23. Conclusions • Utility of model varies depending upon data available • While DeFeo’s model is accurate, powerful models were created that do not rely on direct observation of data • Models were “good enough” to fall into timespan of festival

  24. Future Work • The models created can be refined as the knowledge base grows • Include a standard measure of error for all models • Include photoperiod as a factor • Incorporate electronic GDD recordings • Include image data with pattern recognition
