The 2008 Artificial Intelligence Competition

Valliappa Lakshmanan, National Severe Storms Laboratory & University of Oklahoma
Elizabeth E. Ebert, Bureau of Meteorology Research Center, Australia
Sue Ellen Haupt, Penn State University, State College, PA
Sponsored by Weather Decision Technologies
lakshman@ou.edu

Why a competition?
• AI committee organizes:
  • Conference with papers
  • Tutorial session before conference (every 2 years)
• The tutorial sessions are very popular, but:
  • They get repetitive
  • Same set of techniques presented too often
  • Often by the same speakers!
  • Not clear what the differences between techniques are
  • Different datasets, etc.
  • Can I not just use a machine intelligence or neural network toolbox?
• Purpose of the competition is to replace the tutorial but still provide a learning experience
  • Same dataset, different techniques
• Competitive aspect is just a sideshow; don’t put too much stock in it!

The 2008 Artificial Intelligence Competition
• Dataset
• Results

Project 1: Skill Score By Storm Type
• Try to answer this question (posed by Travis Smith): Does the skill score of a forecast office as evaluated by the NWS depend on the type of storms that the NWS office faced that year?
  • Very critical, but hard to answer based on current knowledge
  • Is it the type of weather or is it the forecaster skill?
• Initially, concentrate on tornadoes
• Based on radar imagery, classify the type of storms at every time step
• Take NWS warnings and ground truth information for a lot of cases
• Compute skill scores by type of storm
• Summer REU project
  • Eric Guillot, Lyndon State
  • Mentors: Travis Smith, Don Burgess, Greg Stumpf, V Lakshmanan

Project 2: National Storm Events Database
• Build a national storm events database
  • With high-resolution radar data combined from multiple radars
  • Derived products
  • Support spatiotemporal queries
• Collaboration between NSSL, NCDC and OU (CAPS, CSA)

Approach
• Project 1: How to classify lots and lots of radar imagery?
  • Need an automated way to identify storm type
  • Technique:
    • Cluster radar fields
    • Extract storm characteristics for each cluster
    • Associate storm characteristics with human-identified storm type
    • Train a learning technique (NN/decision tree) to do this automatically
    • Let it loose on the entire dataset
• Project 2: How to support spatiotemporal queries on radar data?
  • Can create polygons based on thresholding data
  • But need to tie together different data sources
  • Need an automated way to extract storm characteristics for querying

WDSS-II CONUS Grids
• In real-time, combine data from 130+ WSR-88Ds
  • Reflectivity and azimuthal shear fields
• Use these to derive products:
  • Reflectivity Composite
  • VIL
  • Echo top heights
  • Hail probability (POSH), Hail size estimates (MESH), etc.
  • Low-level, mid-level shear
  • Many others (90+)
• Have the 3D reflectivity and shear products archived
  • Can use these to recreate derived products

Cluster Identification Using K-means
• Hierarchical clustering using texture segmentation and K-means clustering
  • Lakshmanan, V., R. Rabin, and V. DeBrunner, 2003: Multiscale storm identification and forecast. J. Atm. Res., 67, 367-380
• Technique yields 3 different scales of clustering
  • Chose D to train the decision tree
  • Cluster attributes at 420 km^2 (scale D) used for our study
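
The K-means step named on this slide can be illustrated with a minimal sketch. This is not the cited hierarchical, texture-based technique; it is only plain Lloyd's K-means on scalar values, and the function name `kmeans_1d`, the evenly spaced starting centers and the dBZ samples are all assumptions made for illustration.

```python
def kmeans_1d(values, k, iterations=20):
    """Plain Lloyd's K-means on scalar values; returns (centers, labels)."""
    lo, hi = min(values), max(values)
    # Deterministic start (an assumption): spread centers evenly over the range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break  # converged: assignments can no longer change
        centers = new_centers
    return centers, labels


# Hypothetical reflectivity samples in dBZ (made up, not from the dataset).
dbz = [5, 8, 10, 12, 30, 33, 35, 55, 58, 60]
centers, labels = kmeans_1d(dbz, k=3)
```

On a real radar grid the same idea would be applied to per-pixel feature vectors rather than single values, and spatially contiguous pixels sharing a label would then be grouped into clusters.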

Manual Storm Classification
• Manually classified over 1,000 storms from three days’ worth of data (March 28th, May 5th, and May 28th of 2007)
• Used all the fields ultimately available to the automated algorithm
  • VIL, POSH, MESH, Rotation Tracks, etc.
  • Available in real-time at http://wdssii.nssl.noaa.gov/ over entire CONUS

Hail Case (Apr. 19, 2003; Kansas): Reflectivity Composite from KDDC, KICT, KVNX and KTWX

Echo Top: height of echo above 18 dBZ

MESH: maximum expected size of hail

VIL: vertically integrated liquid
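
As a rough illustration of how such derived products relate to the 3D reflectivity grid, the sketch below computes three of them for a single vertical column. It assumes the textbook definitions (composite as the column maximum, echo top as the highest level at or above 18 dBZ, VIL from the standard Greene and Clark formula); the WDSS-II implementations may differ, for example by interpolating between levels. The function name and sample column are hypothetical.

```python
def column_products(heights_km, dbz, echo_threshold_dbz=18.0):
    """Composite reflectivity, echo top and VIL for one vertical column.

    heights_km must be ascending and the same length as dbz.
    """
    # Composite reflectivity: the maximum reflectivity anywhere in the column.
    composite = max(dbz)
    # Echo top: the highest sampled level whose reflectivity meets the threshold.
    tops = [h for h, z in zip(heights_km, dbz) if z >= echo_threshold_dbz]
    echo_top_km = max(tops) if tops else None
    # VIL (kg/m^2): sum over layers of 3.44e-6 * Zbar^(4/7) * depth, with Zbar
    # the layer-mean reflectivity factor in mm^6/m^3 and depth in metres.
    vil = 0.0
    for i in range(len(dbz) - 1):
        z_low = 10 ** (dbz[i] / 10)
        z_high = 10 ** (dbz[i + 1] / 10)
        depth_m = (heights_km[i + 1] - heights_km[i]) * 1000
        vil += 3.44e-6 * ((z_low + z_high) / 2) ** (4 / 7) * depth_m
    return composite, echo_top_km, vil


# Hypothetical column (made-up values): heights in km, reflectivity in dBZ.
composite, echo_top_km, vil = column_products([1, 3, 5, 7, 9], [45, 55, 40, 20, 5])
```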

Cluster Table
• Each identified cluster has these properties:
  • ConvectiveArea in km^2
  • MaxEchoTop and LifetimeEchoTop
  • MESH and LifetimeMESH
  • MaxVIL, IncreaseInVIL and LifetimeMaxVIL
  • Centroid, LatRadius, LonRadius, Orientation of ellipse fitted to cluster
  • MotionEast, MotionSouth in m/s
  • Size in km^2
• One set of clusters per scale
  • We used only the 420 km^2 cluster

Controlling the Cluster Table
• Can choose any gridded field for output
• From a gridded field, can compute the following statistics within a cluster:
  • Minimum value, maximum value
  • Average, standard deviation
  • Area within interval (useful to create histograms)
  • Increase in value temporally
    • Does not depend on cluster association being correct
    • Computed image-to-image
  • Lifetime maximum/minimum
    • Depends on cluster association being correct, so better on larger clusters
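
The single-image statistics listed above can be sketched as follows. This is only an illustration under stated assumptions: the field and the cluster labels are plain nested lists on the same grid, every grid cell has the same area, and the standard deviation is the population form. The function name, the half-open interval convention and the toy grids are hypothetical; the temporal statistics are left out because they need more than one image.

```python
import math


def cluster_statistics(field, labels, cluster_id, interval, cell_area_km2=1.0):
    """Summarize one gridded field over the cells labelled cluster_id."""
    # Collect the field values at every grid cell belonging to the cluster.
    values = [v for field_row, label_row in zip(field, labels)
              for v, lab in zip(field_row, label_row) if lab == cluster_id]
    mean = sum(values) / len(values)
    # Population variance (divide by N); an assumption, not stated in the slides.
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    low, high = interval
    in_interval = sum(1 for v in values if low <= v < high)
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "std": math.sqrt(variance),
        "area_in_interval_km2": in_interval * cell_area_km2,
    }


# Toy 3x3 grids (made-up values): a field and a matching cluster-label image.
field = [[10, 20, 0],
         [30, 40, 0],
         [0, 0, 50]]
labels = [[1, 1, 0],
          [1, 1, 0],
          [0, 0, 2]]
stats = cluster_statistics(field, labels, cluster_id=1, interval=(25, 45))
```

Calling the function once per interval of interest yields the per-cluster histogram mentioned on the slide.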

Input Parameters (table spanning three slides; contents not captured in the transcript)

Types of Storms
• Four categories:
  • Not organized
  • Isolated supercell
  • Convective lines
    • Includes lines with embedded supercells
  • Pulse storms

Decision Tree Training
• Trained a decision tree on the manually classified storms to develop a logical process for classifying storms automatically
• Tested this decision tree on three additional cases (April 21st of 2007, and May 10th and 14th of 2006)
• TSS = 0.58; good enough for the NWS study to continue
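
For readers unfamiliar with the True Skill Statistic (TSS), here is a minimal sketch of the usual multi-category form (also known as the Peirce or Hanssen-Kuipers skill score) computed from a confusion matrix. The slides do not say exactly how the 0.58 was computed, so treat this as the textbook definition rather than the authors' procedure; the function name and the 4x4 counts are made up.

```python
def true_skill_statistic(confusion):
    """Multi-category TSS; confusion[i][j] counts storms observed as class i
    and classified as class j. Undefined if every storm has one observed class."""
    n = sum(sum(row) for row in confusion)
    k = len(confusion)
    # Marginal frequencies of the observed and the predicted classes.
    observed = [sum(confusion[i]) / n for i in range(k)]
    predicted = [sum(confusion[i][j] for i in range(k)) / n for j in range(k)]
    proportion_correct = sum(confusion[i][i] for i in range(k)) / n
    # Agreement expected if predictions were independent of observations.
    chance = sum(o * p for o, p in zip(observed, predicted))
    # Normalize by the score that unbiased perfect predictions would achieve.
    return (proportion_correct - chance) / (1.0 - sum(o * o for o in observed))


# Made-up counts for four storm types (rows: observed, columns: classified).
toy_confusion = [[50, 3, 5, 2],
                 [4, 20, 4, 2],
                 [6, 3, 30, 1],
                 [5, 1, 2, 12]]
toy_tss = true_skill_statistic(toy_confusion)
```

With only two classes this reduces to the probability of detection minus the probability of false detection.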

Decision Tree
• Why a decision tree?
  • Didn’t know whether the dataset was tractable
  • Wanted to be able to analyze the resulting “machine”
  • Make sure extracted rules were reasonable
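
Part of what makes a decision tree easy to analyze is that each node is a single readable rule of the form "attribute <= threshold". The sketch below shows how one such rule could be chosen by minimizing Gini impurity, as in CART-style trees. The slides do not say which tree algorithm was used, so the Gini criterion is an assumption, and the attribute values and class labels are toy data.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best rule 'value <= threshold'."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate two equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        # Impurity of the two children, weighted by how many samples each gets.
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


# Toy attribute values and class labels (made up, not from the dataset).
attribute = [5, 8, 12, 30, 35, 40]
storm_class = ["A", "A", "A", "B", "B", "B"]
threshold, impurity = best_threshold(attribute, storm_class)
```

A full tree repeats this search over every attribute and recurses on each side of the chosen split, so the final classifier can be read off as a list of nested rules.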

The 2008 Artificial Intelligence Competition
• Dataset
• Results

Entries
• Received six official entries and one unofficial entry by the competition deadline
• Unofficial entry not accompanied by abstract or AMS manuscript
  • Neil Gordon (Met Service, New Zealand): random forest
  • Not eligible for prize, but included in comparisons
• Official entries:
  • John K. Williams and Jenny Abernethy: random forests and fuzzy logic
  • Ron Holmes: neural network
  • David Gagne and Amy McGovern: boosted decision tree
  • Jenny Abernethy and John Williams: support vector machines
  • Luna Rodriguez: genetic algorithms
  • Kimberly Elmore: discriminant analysis and support vector machines

Distribution of storm categories (figure; panels: Truth, Baseline, Abernethy & Williams, Elmore & Richman, Gagne & McGovern, Gordon, Holmes, Rodriguez, Williams & Abernethy)

Classifications for each observed class (figures; panels: Baseline, Abernethy & Williams, Elmore & Richman, Gagne & McGovern, Holmes, Rodriguez, Williams & Abernethy, Gordon; legend: Not severe, Isolated supercell, Convective line, Pulse storm)
• Classifications for observed class 0 (Not severe)
• Classifications for observed class 1 (Isolated supercell)
• Classifications for observed class 2 (Convective line)
• Classifications for observed class 4 (Pulse storm)

Similarity matrix: percentage of identical classifications among entries
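
A similarity matrix of this kind can be computed by comparing every pair of entries storm by storm. The sketch below is a minimal illustration; the entry names and class labels are placeholders, not the competition submissions.

```python
def similarity_matrix(predictions):
    """Percentage of storms that each pair of entries classified identically.

    predictions maps an entry name to its list of class labels, one per storm,
    with all lists in the same storm order.
    """
    matrix = {}
    for name_a, labels_a in predictions.items():
        for name_b, labels_b in predictions.items():
            same = sum(1 for a, b in zip(labels_a, labels_b) if a == b)
            matrix[(name_a, name_b)] = 100.0 * same / len(labels_a)
    return matrix


# Placeholder entries with made-up class labels for five storms.
toy_predictions = {
    "entry_a": [0, 1, 2, 3, 0],
    "entry_b": [0, 1, 2, 0, 0],
    "entry_c": [1, 1, 2, 3, 3],
}
similarity = similarity_matrix(toy_predictions)
```

The matrix is symmetric with 100 on the diagonal, so only the upper triangle needs to be reported.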

Statistical results: True Skill Statistic (figure; entries shown: Baseline, Abernethy & Williams, Elmore & Richman, Gagne & McGovern, Gordon, Holmes, Rodriguez, Williams & Abernethy; annotations: Joint First, Third)

Statistical results: Accuracy and Heidke Skill Score (figure; entries shown: Baseline, Abernethy & Williams, Elmore & Richman, Gagne & McGovern, Gordon, Holmes, Rodriguez, Williams & Abernethy)
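
For reference, accuracy and the Heidke Skill Score can both be read off a confusion matrix. The sketch below uses the standard multi-category definitions; the function name and the counts are made up, and nothing here reproduces the competition results.

```python
def accuracy_and_heidke(confusion):
    """Return (accuracy, Heidke Skill Score); confusion[i][j] counts storms
    observed as class i and classified as class j."""
    n = sum(sum(row) for row in confusion)
    k = len(confusion)
    observed = [sum(confusion[i]) / n for i in range(k)]
    predicted = [sum(confusion[i][j] for i in range(k)) / n for j in range(k)]
    # Accuracy: fraction of storms placed in the correct class.
    accuracy = sum(confusion[i][i] for i in range(k)) / n
    # Accuracy expected from random predictions with the same marginals.
    chance = sum(o * p for o, p in zip(observed, predicted))
    # Heidke: improvement over chance, scaled so a perfect classifier scores 1.
    heidke = (accuracy - chance) / (1.0 - chance)
    return accuracy, heidke


# Made-up two-class counts (rows: observed, columns: classified).
toy_accuracy, toy_heidke = accuracy_and_heidke([[50, 10], [5, 35]])
```

Unlike raw accuracy, the Heidke score gives no credit for agreement that random guessing with the same class frequencies would achieve, which matters when one storm type dominates the dataset.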

Acknowledgements
• Thanks to:
  • Weather Decision Technologies for sponsoring the prizes
  • The AMS probability and statistics committee
    • For loaning us Beth Ebert’s expertise
  • All the participants for entering the competition and explaining their methodology
    • Can be hard to find time to do “extra-curricular” work
    • Very grateful that you could enter this competition

Where to go from here?
• Please share with us your thoughts and suggestions
  • Is such a competition worth doing?
  • Was this session a learning experience?
  • How can it be improved in the future?
  • Is there something that you would have done differently? Why?
• Our thoughts:
  • Classification is not the only aspect of machine intelligence
    • Estimation, association finding, knowledge capture, clustering, …
    • Perhaps a future competition could address one of these areas
  • Address another aspect of AMS besides short-term severe weather
