A study of using a neural network for tornado detection from Mesocyclone Detection Algorithm (MDA) and Near Storm Environment (NSE) attributes. Performance is evaluated with several scalar skill measures and training methodologies, compared with existing methods, and insights on feature selection are discussed.
A Neural Network for Detecting and Diagnosing Tornadic Circulations
V Lakshmanan, Gregory Stumpf, Arthur Witt
University of Oklahoma, National Severe Storms Laboratory, Meteorological Development Laboratory
lakshman@ou.edu
Motivation
• MDA and NSE developed at NSSL
• MDA identifies storm-scale circulations
  • which may be precursors to tornadoes
• Marzban (1997) developed a NN based on MDA parameters to classify tornadoes
  • using 43 cases
  • found incorporation of NSE promising
• The Radar Operations Center wanted us to examine using an MDA+NSE NN operationally
• Extended Marzban's work to 83 cases
  • with a few modifications
MDA and NSE
• Mesocyclone Detection Algorithm (MDA)
  • designed to detect a wide variety of circulations of varying size and strength by analyzing the radial velocity data from a Doppler weather radar
  • 23 attributes for each circulation
• Near Storm Environment (NSE)
  • uses analysis grids from the RUC model to derive 245 different attributes
• Full list of attributes used is in the conference pre-prints
Scalar Measures of Performance
• POD = hit / (hit + miss)
• FAR = fa / (hit + fa)
• CSI = hit / (hit + miss + fa)
• HSS = 2*(null*hit - miss*fa) / {(fa+hit)*(fa+null) + (null+miss)*(miss+hit)}
• We also report Receiver Operating Characteristic (ROC) curves
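For reference, all four scores follow directly from the 2×2 contingency table of hits, misses, false alarms, and correct nulls. A minimal Python sketch; the function name and example counts are illustrative, not from the paper:

```python
def skill_scores(hit, miss, fa, null):
    """Compute POD, FAR, CSI and HSS from a 2x2 contingency table.

    hit  = tornado observed and detected
    miss = tornado observed but not detected
    fa   = false alarm (detected but not observed)
    null = correctly forecast non-event
    """
    pod = hit / (hit + miss)
    far = fa / (hit + fa)
    csi = hit / (hit + miss + fa)
    hss = 2.0 * (null * hit - miss * fa) / (
        (fa + hit) * (fa + null) + (null + miss) * (miss + hit)
    )
    return pod, far, csi, hss

# Illustrative counts only: 60 hits, 20 misses, 30 false alarms, 400 nulls
print(skill_scores(60, 20, 30, 400))
```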
Neural Network
• Fully connected feedforward NN trained with resilient backpropagation (RProp)
• Tanh activation function on hidden nodes
• Logistic (sigmoid) activation function on the output node
• Error function: weighted sum of the cross-entropy and the sum of squares of all the weights in the network (weight decay)
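A sketch of this architecture and error criterion with NumPy; the single hidden layer, the weight-decay coefficient, and all names are illustrative assumptions, not the authors' code:

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """One-hidden-layer feedforward pass: tanh hidden units and a
    logistic (sigmoid) output unit, as described on this slide."""
    h = np.tanh(x @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))

def nn_error(y_true, y_prob, weights, decay=1e-3):
    """Weighted sum of the cross-entropy and the sum of squares of all
    network weights (weight decay). The decay coefficient 1e-3 is an
    illustrative value, not the one used by the authors."""
    eps = 1e-12  # guard against log(0)
    ce = -np.mean(y_true * np.log(y_prob + eps)
                  + (1.0 - y_true) * np.log(1.0 - y_prob + eps))
    l2 = sum(np.sum(w ** 2) for w in weights)
    return ce + decay * l2
```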
Truthing
• Ground truth based on temporal and spatial proximity
• Done by hand: every circulation was classified
• Look for the radar signature from 20 minutes before a tornado is on the ground to 5 minutes after
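The classification was done by hand, but the time-window rule can be expressed as a simple check. A hedged sketch, where the 10 km spatial radius and all names are assumptions made for illustration:

```python
import math
from datetime import timedelta

def is_tornadic(circ_time, circ_xy, reports, max_dist_km=10.0):
    """Label a circulation as tornadic if a tornado report is nearby and
    the circulation falls in the window from 20 minutes before the tornado
    is on the ground to 5 minutes after. The 10 km radius is an
    illustrative assumption; the actual classification was manual."""
    for tor_time, tor_xy in reports:
        dt = tor_time - circ_time  # positive if the tornado starts later
        in_window = timedelta(minutes=-5) <= dt <= timedelta(minutes=20)
        dist_km = math.hypot(circ_xy[0] - tor_xy[0], circ_xy[1] - tor_xy[1])
        if in_window and dist_km <= max_dist_km:
            return True
    return False
```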
NN Training Method
• Extract the truthed MDA detections
• Normalize the input features
• Determine a priori probability thresholds
  • 13 attributes known to have univariate tendencies are used to prune the training set
• Divide the set in the ratio 46:20:34 (train : validate : test)
• Bootstrap the train/validate sets
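A minimal sketch of the case-wise split and normalization steps, assuming z-score normalization and the helper names shown here (neither is stated in the talk):

```python
import numpy as np

def split_by_case(case_ids, ratios=(0.46, 0.20, 0.34), seed=0):
    """Divide whole cases (not individual detections) into train,
    validation and test sets in the 46:20:34 ratio from this slide.
    Splitting case-wise keeps detections from one event together."""
    rng = np.random.default_rng(seed)
    cases = np.array(sorted(set(case_ids)))
    rng.shuffle(cases)
    n_train = int(round(ratios[0] * len(cases)))
    n_val = int(round(ratios[1] * len(cases)))
    return np.split(cases, [n_train, n_train + n_val])  # train, val, test

def normalize(X_train, X):
    """Scale each input feature using training-set statistics. Z-scoring
    is an assumption; the exact normalization is not given in the talk."""
    mu = X_train.mean(axis=0)
    sd = X_train.std(axis=0) + 1e-12
    return (X - mu) / sd
```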
NN Training Method (contd.)
• Find the optimal number of hidden nodes
  • beyond which the validation cross-entropy error increases
• Choose as the warning threshold the value at which the NN output on the validation set has maximum HSS
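The threshold selection can be sketched as a scan over candidate thresholds, scoring each by HSS on the validation set; the grid of 101 thresholds is an illustrative choice, not the authors' procedure:

```python
import numpy as np

def best_threshold(y_true, y_prob, n_steps=101):
    """Return the warning threshold that maximizes HSS on the
    validation set, as described on this slide."""
    best_t, best_hss = 0.5, -np.inf
    for t in np.linspace(0.0, 1.0, n_steps):
        pred = y_prob >= t
        hit = int(np.sum(pred & (y_true == 1)))
        miss = int(np.sum(~pred & (y_true == 1)))
        fa = int(np.sum(pred & (y_true == 0)))
        null = int(np.sum(~pred & (y_true == 0)))
        denom = (fa + hit) * (fa + null) + (null + miss) * (miss + hit)
        if denom == 0:
            continue
        hss = 2.0 * (null * hit - miss * fa) / denom
        if hss > best_hss:
            best_t, best_hss = t, hss
    return best_t, best_hss
```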
Our Method vs. Marzban and Stumpf
• Slightly different from Marzban/Stumpf:
  • Error criterion different
    • weight decay
  • Error minimization method different
    • RProp vs. SCG
  • Bootstrapped case-wise instead of pattern-wise
  • Automatic pruning based on a priori probability
43-Case Comparison
• We compared against the same 43 cases (with the same independent test cases)
• Most of the difference is due to better generalization
  • case-wise bootstrapping
MDA NN (83 cases)
• The 43-case data set used by Marzban consisted of large/tall/strong events
  • a rather easy data set for tornado detection
• The next 40 cases were more atypical
  • mini-supercells, squall-line tornadoes, tropical events, etc.
• Manually selected 27 independent cases to have a similar distribution of strong and weak tornadoes
• The remaining 56 cases were used to verify the network
• Then, all 83 cases were used to create the "operational" network
83-Case MDA NN
• The performance of the best network on the 27-case independent test set, compared with the results on the 43-case set
• And the performance of the best network trained using all 83 cases (no independent test set)
83-Case MDA NN
• ROC curves for the 27-case independent test set
MDA + NSE
• The statistics of the data set change dramatically when we add NSE parameters as inputs
• 10× as many inputs, so the chance of over-fitting is much greater
• NSE parameters are not tied to individual detections
• NSE parameters are highly correlated in space and time
• NSE parameters are not resolved to radar resolution (20 km × 20 km vs. 1 km × 1 km)
• NSE parameters are available hourly; radar data every 5-6 minutes
Feature Selection
• Reduce parameters from 245 to 76 based on meteorological understanding
• Remove one attribute of each highly correlated pair (Pearson's correlation coefficient)
• Take the top "f" fraction of univariate predictors
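A hedged sketch of the correlation-based pruning step; the 0.9 cutoff and the greedy keep-first strategy are assumptions for illustration, since the talk only states that one attribute of each highly correlated pair was removed:

```python
import numpy as np

def drop_correlated(X, names, max_r=0.9):
    """Greedily keep a feature only if its Pearson correlation with every
    previously kept feature is below max_r; otherwise drop it. The 0.9
    cutoff is an illustrative assumption."""
    corr = np.corrcoef(X, rowvar=False)
    keep = []
    for j in range(X.shape[1]):
        if all(abs(corr[j, k]) < max_r for k in keep):
            keep.append(j)
    return X[:, keep], [names[j] for j in keep]
```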
Choose the Most General Network
• Variation of the neural network training and validation errors as the number of input features is increased
• Choose the number of features where the generalization error is minimum (f = 0.3)
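The choice of f can be sketched as a sweep over candidate fractions of the ranked univariate predictors, keeping the one with the lowest validation error; `train_and_score` is a hypothetical placeholder for whatever trains a network and returns its validation error:

```python
def pick_fraction(fractions, ranked_features, train_and_score):
    """Train a network on the top-f fraction of univariate predictors for
    each candidate f and keep the f with the lowest validation error
    (the slide reports f = 0.3 as the minimum)."""
    best_f, best_err = None, float("inf")
    for f in fractions:
        n = max(1, int(round(f * len(ranked_features))))
        err = train_and_score(ranked_features[:n])
        if err < best_err:
            best_f, best_err = f, err
    return best_f, best_err
```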
MDA+NSE
• On the independent 27-case set
MDA+NSE (27-case set)
Generalization
• Similar HSS scores on the training, validation, and independent test data sets
• In the MDA+NSE network, we sacrificed higher performance to get better generalization
Is NSE Information Helpful?
• The NSE parameters changed the statistics of the data set
• The MDA+NSE neural network is only marginally better than an MDA NN, but:
  • NSE information has the potential to be useful
  • We used only 4 of the 76 features retained from the original 245!
Going Further
• Where can we go further with this approach?
  • Find better ways to reduce the number of features
  • Use the time history of detections
  • Generate many more data cases
• All of which will yield very little (we believe)
Spatio-temporal Tornado Guidance
• Formulate the tornado prediction problem differently
  • instead of devising a machine intelligence approach to classify detections
  • spatio-temporal: estimate the probability of a tornado event at a particular spatial location within a given time window
Spatio-temporal Approach
• Our initial approach:
  • modify the ground truth to create a spatial truth field
  • use a least-squares methodology to estimate shear
  • morphological image processing to estimate gradients
  • fuzzy logic to generate compact measures of tornado possibility
  • a classification neural network to generate the final spatio-temporal probability field
• Past and future history, both of observed tornadoes and of the candidate regions, is obtained by tracking clustered radar reflectivity values
• Integrate data from other sensors (e.g., numerical models and lightning)
• Paper at IJCNN 2005
Acknowledgements
• Funding for this research was provided under NOAA-OU Cooperative Agreement NA17RJ1227 and supported by the Radar Operations Center.
• Caren Marzban and Don Burgess, both of the University of Oklahoma, helped us immensely on the methods and attributes used in this paper.