The 2008 Artificial Intelligence Competition



  1. The 2008 Artificial Intelligence Competition Valliappa Lakshmanan National Severe Storms Laboratory & University of Oklahoma Elizabeth E. Ebert Bureau of Meteorology Research Centre, Australia Sue Ellen Haupt Penn State University, State College, PA Sponsored by Weather Decision Technologies lakshman@ou.edu

  2. Why a competition? • The AI committee organizes: • A conference with papers • A tutorial session before the conference (every 2 years) • The tutorial sessions are very popular, but: • They get repetitive • The same set of techniques is presented too often • Often by the same speakers! • It is not clear what the differences between techniques are • Different datasets, etc. • "Can't I just use a machine intelligence or neural network toolbox?" • The purpose of the competition is to replace the tutorial while still providing a learning experience • Same dataset, different techniques • The competitive aspect is just a sideshow – don't put too much stock in it! lakshman@ou.edu

  3. The 2008 Artificial Intelligence Competition • Dataset • Results lakshman@ou.edu

  4. Project 1: Skill Score By Storm Type • Try to answer this question (posed by Travis Smith): Does the skill score of a forecast office as evaluated by the NWS depend on the type of storms that the NWS office faced that year? • Very critical, but hard to answer based on current knowledge • Is it the type of weather, or is it the forecaster skill? • Initially, concentrate on tornadoes • Based on radar imagery, classify the type of storms at every time step • Take NWS warnings and ground truth information for a lot of cases • Compute skill scores by type of storm • Summer REU project • Eric Guillot, Lyndon State • Mentors: Travis Smith, Don Burgess, Greg Stumpf, V Lakshmanan lakshman@ou.edu

  5. Project 2: National Storm Events Database • Build a national storm events database • With high-resolution radar data combined from multiple radars • Derived products • Support spatiotemporal queries • Collaboration between NSSL, NCDC and OU (CAPS, CSA) lakshman@ou.edu

  6. Approach • Project 1: How do we classify lots and lots of radar imagery? • Need an automated way to identify storm type • Technique: • Cluster radar fields • Extract storm characteristics for each cluster • Associate storm characteristics with human-identified storm types • Train a learning technique (NN/decision tree) to do this automatically • Let it loose on the entire dataset • Project 2: How do we support spatiotemporal queries on radar data? • Can create polygons by thresholding the data (see the sketch below) • But need to tie together different data sources • Need an automated way to extract storm characteristics for querying lakshman@ou.edu
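
A minimal sketch of the thresholding idea from Project 2: contiguous above-threshold pixels become labeled regions whose bounding boxes can back a simple spatial query. This is illustrative only (it is not the WDSS-II implementation), and the 35 dBZ threshold and toy grid are assumptions:

```python
# Illustrative sketch: turn a gridded field into queryable storm regions by
# thresholding and connected-component labeling. Not the actual WDSS-II code.
import numpy as np
from scipy import ndimage

def storm_regions(grid, threshold=35.0):
    """Label contiguous areas where the field exceeds `threshold` and
    return a bounding box per region for simple spatial queries."""
    labels, _ = ndimage.label(grid >= threshold)
    boxes = ndimage.find_objects(labels)  # one (row_slice, col_slice) per region
    return labels, boxes

# Toy 100x100 grid with one synthetic storm cell.
grid = np.zeros((100, 100))
grid[40:55, 60:75] = 50.0
labels, boxes = storm_regions(grid)
print(len(boxes), boxes[0])  # -> 1 region and its bounding-box slices
```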

  7. WDSS-II CONUS Grids • In real-time, combine data from 130+ WSR-88Ds • Reflectivity and azimuthal shear fields • Use these to derive products: • Reflectivity Composite • VIL • Echo top heights • Hail probability (POSH), Hail size estimates (MESH), etc. • Low-level, mid-level shear • Many others (90+) • Have the 3D reflectivity and shear products archived • Can use these to recreate derived products lakshman@ou.edu
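
Two of the simpler derivations above can be sketched directly from the archived 3-D grid. This is an illustrative reconstruction, not NSSL's operational code; the 18 dBZ echo-top threshold comes from slide 11:

```python
# Sketch of deriving products from an archived 3-D reflectivity grid
# (levels x rows x cols). Illustrative only; not the operational algorithms.
import numpy as np

def composite_reflectivity(refl3d):
    """Reflectivity Composite: the column maximum of reflectivity."""
    return refl3d.max(axis=0)

def echo_top(refl3d, heights_km, threshold_dbz=18.0):
    """Height of the highest level with reflectivity >= threshold."""
    above = refl3d >= threshold_dbz
    # index of the topmost exceeding level in each column
    top_idx = above.shape[0] - 1 - above[::-1].argmax(axis=0)
    tops = np.asarray(heights_km)[top_idx].astype(float)
    tops[~above.any(axis=0)] = np.nan  # no echo above threshold anywhere
    return tops
```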

  8. Cluster Identification Using K-means • Hierarchical clustering using texture segmentation and K-means clustering • Lakshmanan, V., R. Rabin, and V. DeBrunner, 2003: Multiscale storm identification and forecast. Atmos. Res., 67-68, 367-380 • The technique yields 3 different scales of clustering • Scale D, with cluster attributes at 420 km^2, was used to train the decision tree in our study lakshman@ou.edu
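
To give a flavor of the K-means step (only a flavor: the actual method is hierarchical and multiscale, per the paper cited above), here is a minimal sketch that clusters pixels on value plus a local-standard-deviation texture feature. The feature choice, window size, and k are all assumptions:

```python
# Minimal K-means texture-segmentation sketch, not the multiscale hierarchical
# method of Lakshmanan et al. (2003). Features and k are illustrative.
import numpy as np
from scipy import ndimage
from sklearn.cluster import KMeans

def kmeans_segment(field, k=4, window=5):
    """Cluster pixels of a 2-D field into k classes on (value, texture)."""
    mean = ndimage.uniform_filter(field, size=window)
    meansq = ndimage.uniform_filter(field**2, size=window)
    texture = np.sqrt(np.maximum(meansq - mean**2, 0.0))  # local std. dev.
    features = np.column_stack([field.ravel(), texture.ravel()])
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(features)
    return labels.reshape(field.shape)
```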

  9. Manual Storm Classification • Manually classified over 1,000 storms from three days' worth of data (March 28th, May 5th, and May 28th of 2007) • Used all the fields ultimately available to the automated algorithm • VIL, POSH, MESH, Rotation Tracks, etc. • Available in real time at http://wdssii.nssl.noaa.gov/ over the entire CONUS lakshman@ou.edu

  10. Hail Case (Apr. 19, 2003; Kansas) Reflectivity Composite from KDDC, KICT, KVNX and KTWX lakshman@ou.edu

  11. Echo Top: Height of the echo above 18 dBZ lakshman@ou.edu

  12. MESH: Maximum Expected Size of Hail lakshman@ou.edu

  13. VIL: Vertically Integrated Liquid lakshman@ou.edu

  14. Cluster Table • Each identified cluster has these properties: • ConvectiveArea in km^2 • MaxEchoTop and LifetimeEchoTop • MESH and LifetimeMESH • MaxVIL, IncreaseInVIL and LifetimeMaxVIL • Centroid, LatRadius, LonRadius, Orientation of ellipse fitted to cluster • MotionEast, MotionSouth in m/s • Size in km^2 • One set of clusters per scale • We used only the 420km^2 cluster lakshman@ou.edu
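
For concreteness, a hypothetical schema for one row of this table; the field names follow the slide, but the types and units are assumptions:

```python
# Hypothetical cluster-table row; names mirror the slide, types/units assumed.
from dataclasses import dataclass

@dataclass
class ClusterRow:
    convective_area_km2: float
    max_echo_top_km: float
    lifetime_echo_top_km: float
    mesh_mm: float
    lifetime_mesh_mm: float
    max_vil: float
    increase_in_vil: float
    lifetime_max_vil: float
    centroid_lat: float
    centroid_lon: float
    lat_radius_km: float      # ellipse fitted to the cluster
    lon_radius_km: float
    orientation_deg: float
    motion_east_mps: float
    motion_south_mps: float
    size_km2: float
```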

  15. Controlling the Cluster Table • Can choose any gridded field for output • From gridded field, can compute the following statistics within cluster • Minimum value, Maximum value • Average, Standard deviation • Area within interval (Useful to create histograms) • Increase in value temporally • Does not depend on cluster association being correct • Computed image-to-image • Lifetime maximum/minimum • Depends on cluster association being correct, so better on larger clusters lakshman@ou.edu
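
A minimal sketch of those within-cluster statistics, assuming a 2-D field and a boolean mask marking the cluster's pixels (the names and the pixel-area constant are illustrative):

```python
# Per-cluster statistics from any gridded field, given a boolean cluster mask.
# Illustrative sketch; names and pixel_area_km2 are assumptions.
import numpy as np

def cluster_stats(field, mask, interval=None, pixel_area_km2=1.0):
    vals = field[mask]
    stats = {"min": vals.min(), "max": vals.max(),
             "mean": vals.mean(), "std": vals.std()}
    if interval is not None:  # area within [lo, hi): useful for histograms
        lo, hi = interval
        inside = np.count_nonzero((vals >= lo) & (vals < hi))
        stats["area_in_interval_km2"] = inside * pixel_area_km2
    return stats
```

The temporal statistics on the slide (increase in value, lifetime maximum/minimum) would additionally require the image-to-image cluster association the slide mentions.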

  16. Input Parameters [table of input parameters; continued on the next slide] lakshman@ou.edu

  17. Input Parameters (contd.) [table of input parameters, continued] lakshman@ou.edu

  18. Input Parameters (contd.) [table of input parameters, concluded] lakshman@ou.edu

  19. Types of Storms • Four categories: • Not organized • Isolated supercell • Convective lines • Includes lines with embedded supercells • Pulse storms lakshman@ou.edu

  20. Decision Tree Training • Trained a decision tree on the manually classified storms to develop a logical process for classifying them automatically (sketched below) • Tested this decision tree on three additional cases (April 21st of 2007, and May 10th and 14th of 2006) • TSS = 0.58; good enough for the NWS study to continue lakshman@ou.edu
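
A sketch of what that training step might look like with an off-the-shelf decision tree (scikit-learn here; the slide does not say which tree software or settings were used, and the toy arrays below merely stand in for the cluster-table attributes and hand labels):

```python
# Decision-tree training sketch. The tree package, hyperparameters, and toy
# data are assumptions; only the workflow mirrors the slide.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))    # stand-in for ~1,000 labeled clusters
y = rng.integers(0, 4, size=1000)  # stand-in storm types from slide 19

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
tree = DecisionTreeClassifier(max_depth=6, min_samples_leaf=20).fit(X_tr, y_tr)
print(export_text(tree))           # human-readable rules (see slide 21)
print("accuracy:", tree.score(X_te, y_te))
```

One reason a tree suits this study (next slide): export_text turns the fitted model into rules a meteorologist can sanity-check.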

  21. Decision Tree • Why decision tree? • Didn’t know whether the dataset was tractable • Wanted to be able to analyze resulting “machine” • Make sure extracted rules were reasonable lakshman@ou.edu

  22. The 2008 Artificial Intelligence Competition • Dataset • Results lakshman@ou.edu

  23. Entries • Received six official entries, and one unofficial entry, by the competition deadline • The unofficial entry was not accompanied by an abstract or AMS manuscript • Neil Gordon (MetService, New Zealand): random forest • Not eligible for a prize, but included in the comparisons • Official entries: • John K. Williams and Jenny Abernethy: random forests and fuzzy logic • Ron Holmes: neural network • David Gagne and Amy McGovern: boosted decision tree • Jenny Abernethy and John Williams: support vector machines • Luna Rodriguez: genetic algorithms • Kimberly Elmore: discriminant analysis and support vector machines lakshman@ou.edu

  24. Distribution of storm categories [charts compare Truth and Baseline with each entry: Abernethy & Williams, Elmore & Richman, Gagne & McGovern, Gordon, Holmes, Rodriguez, Williams & Abernethy] lakshman@ou.edu

  25. Classifications for observed class 0 (Not severe) [per-entry charts: Baseline, Abernethy & Williams, Elmore & Richman, Gagne & McGovern, Gordon, Holmes, Rodriguez, Williams & Abernethy; legend: Not severe, Isolated supercell, Convective line, Pulse storm] lakshman@ou.edu

  26. Classifications for observed class 1 (Isolated supercell) [per-entry charts as on the previous slide] lakshman@ou.edu

  27. Classifications for observed class 2 (Convective line) [per-entry charts as on the previous slides] lakshman@ou.edu

  28. Classifications for observed class 4 (Pulse storm) [per-entry charts as on the previous slides] lakshman@ou.edu

  29. Similarity matrix - % of identical classifications among entries lakshman@ou.edu
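
Computing such a matrix is straightforward; a sketch, assuming a dict mapping each entry's name to its vector of class labels on the common test set:

```python
# Pairwise % of identical classifications among entries (illustrative names).
import numpy as np

def similarity_matrix(preds):
    names = list(preds)
    sim = np.zeros((len(names), len(names)))
    for i, a in enumerate(names):
        for j, b in enumerate(names):
            sim[i, j] = 100.0 * np.mean(np.asarray(preds[a]) == np.asarray(preds[b]))
    return names, sim

names, sim = similarity_matrix({"entry A": [0, 1, 2, 2], "entry B": [0, 1, 1, 2]})
print(names, sim)  # diagonal is 100%; off-diagonal is 75% here
```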

  30. Statistical results: True Skill Statistic [bar chart for Baseline and each entry: Abernethy & Williams, Elmore & Richman, Gagne & McGovern, Gordon, Holmes, Rodriguez, Williams & Abernethy; joint first and third place finishes marked] lakshman@ou.edu

  31. Statistical results: Accuracy and Heidke Skill Score [bar charts for Baseline and each entry: Abernethy & Williams, Elmore & Richman, Gagne & McGovern, Gordon, Holmes, Rodriguez, Williams & Abernethy] lakshman@ou.edu
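
For reference, the scores on these two slides (and the TSS on the previous one) have standard multi-category forms; a sketch computing them from a contingency table n[i, j] = count(observed class i, forecast class j), following the usual definitions (e.g., Wilks):

```python
# Accuracy, Heidke Skill Score, and True Skill Statistic from a multi-category
# contingency table n[i, j] = count(observed i, forecast j). Standard forms.
import numpy as np

def verification_scores(n):
    p = n / n.sum()                # joint relative frequencies
    obs, fcst = p.sum(axis=1), p.sum(axis=0)
    hits = np.trace(p)             # fraction correct = accuracy
    chance = np.sum(obs * fcst)    # hits expected by chance
    hss = (hits - chance) / (1.0 - chance)
    tss = (hits - chance) / (1.0 - np.sum(obs**2))
    return hits, hss, tss

n = np.array([[50, 5, 3, 2], [4, 30, 6, 1], [2, 7, 40, 3], [3, 2, 2, 20]])
print(verification_scores(n.astype(float)))
```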

  32. Acknowledgements • Thanks to: • Weather Decision Technologies for sponsoring the prizes • The AMS probability and statistics committee • For loaning us Beth Ebert's expertise • All the participants, for entering the competition and explaining their methodologies • It can be hard to find time for "extra-curricular" work • We are very grateful that you could enter this competition lakshman@ou.edu

  33. Where to go from here? • Please share your thoughts and suggestions with us • Is such a competition worth doing? • Was this session a learning experience? • How can it be improved in the future? • Is there something that you would have done differently? Why? • Our thoughts: • Classification is not the only aspect of machine intelligence • Estimation, association finding, knowledge capture, clustering, … • Perhaps a future competition could address one of these areas • Or address another area of AMS interest besides short-term severe weather lakshman@ou.edu
