Real-Time Mining of Integrated Weather Information Setup meeting (Aug. 30, 2002) lakshman@ou.edu NSF Medium ITR
Goals • Develop dynamic data mining applications (wherein information is extracted and provided to forecasters in real-time). • Develop applications of radar data to identify severe weather signatures in a probabilistic manner. • Build a prototype system so that these applications can be developed and tested on real-time and on archived data sets.
Tasks • Projects • Dual-polarization algorithms • Clustering and Prediction • Vortex Identification • Areas of IT research • SVMs (identification & prediction), multivariate feature identification techniques, probabilistic feature extraction, high performance issues • I will talk about the tasks: we can decide the applicable areas as a group.
Funding • Funded at $300K for the first year. • May get $650K over the next two years. • We need to show results at the end of the year, so it is good to know what the reviewers liked and did not like about our proposal.
Negative Reviews (NSF) • Unfocused • No high-performance computing or numerical simulations • Real-time not explicitly defined • Budget way too high • No human-factors expertise • No details of how the approach could solve the problem.
Reviewers liked these • Develop sensor compensation techniques for faulty sensors • Strong application focus on a complex domain • Experience with disseminating systems and WSR-88D algorithms • We seem to have been funded based on what we have done before, rather than on the merits of this particular proposal.
From the 6th reviewer • Extend their previous working system (WDSS) with the following features: • integrating multiple sources of data • learning in real-time, thus improving the prediction capabilities • using statistics-based instead of heuristics-based decisions. • Use of these methodologies for teaching purposes, as well as the dissemination of this software to other research laboratories and the creation of a common research tool
Also from the 6th reviewer • Could have been improved: • the proposal seems to be an enumeration of different techniques, without any justification of why these methods have been chosen instead of other ones. • detailed explanations are sometimes missing. • My recommendation is to fund this proposal, but at a lower level than the one proposed by the investigators.
Tasks • Three tasks: • Vortex Detection • Clustering and prediction • Polarimetric Radar
Real-Time • Classical: data periodicity (keep up with data). • Hard to define for multi-sensor applications • If you have a 3-radar domain, with a new elevation scan every 30 seconds from each radar, you get an updated virtual volume on average every 10 seconds. Is the periodicity 10 seconds? • Lightning strikes are essentially asynchronous. • Proposed: based on required lead-time • Example: average lead-time for a tornado warning is 11 minutes. We could set as a goal predicting tornadoes 20 minutes into the future. If we can do it with data from 30 minutes before the event, then we have 10 minutes to process the data. • Keep in mind that the forecasts have to be continuous. We have to make runs once every 10 minutes.
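The arithmetic behind the two definitions of real-time above can be written out as a small sketch. The function names are hypothetical and serve only to make the numbers on the slide explicit; it assumes the radars' scans are evenly interleaved.

```python
def virtual_volume_update_seconds(num_radars, scan_period_s):
    """Average time between virtual-volume updates when each radar
    delivers a new elevation scan every scan_period_s seconds and the
    scans from different radars are evenly interleaved (assumption)."""
    return scan_period_s / num_radars


def processing_budget_minutes(data_age_at_event_min, forecast_horizon_min):
    """Time left to process the data when the forecast has to be issued
    forecast_horizon_min before the event and the input data were
    observed data_age_at_event_min before the event."""
    return data_age_at_event_min - forecast_horizon_min


# Numbers taken from the slide.
print(virtual_volume_update_seconds(3, 30))   # 3 radars, 30 s per scan
print(processing_budget_minutes(30, 20))      # 30 min old data, 20 min goal
```

Under the lead-time definition, the processing budget also sets the run interval: a run that takes the full 10 minutes has to be restarted every 10 minutes to keep the forecasts continuous.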
Task 1: Vortex Detection • At the end of this year, aim to have a vortex identification and prediction technique that: • Uses data from multiple sensors • Uses some novel data (more on this follows) • Accommodates faulty information • Is capable of better skill than MDA/TDA • Is capable of providing more lead-time to a forecaster. • Decision Support System: provide forecaster with rationale for all suggested decisions.
Current MDA/TDA • Mesocyclone detection technique • find 2D detections by analyzing azimuthal shear • associate them based on rank and time into 3D circulation features if they meet some strength thresholds • 3D circulations that meet depth, base and strength criteria are classified as mesocyclones.
Problems with current vortex algorithms • Defined on radial velocity field. • Single radar • Simple use of radar reflectivity (>0 dBZ) • Mesocyclone spatial extent based on radial velocity values, which are noisy • How can we improve it?
Use of LLSD • One promising source of data is a linear least-squares fit of radial velocity in the neighborhood of a gate. • The size of the neighborhood depends on the range from the radar. • Fit to a linear combination of azimuth and range • Coefficient for azimuth is an estimate of azimuthal shear • Coefficient for range is the divergence.
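A minimal sketch of the local fit described above, in plain Python. It assumes the neighborhood is given as offsets from the center gate, with `d_az` the distance along the azimuthal direction and `d_rng` the distance along the radial; the function names and the sample layout are hypothetical and not taken from WDSS-II. The range-dependent choice of neighborhood size is left out.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    n = 3
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate neighborhood: cannot fit a plane")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def llsd_fit(samples):
    """Least-squares fit of v ~ v0 + shear * d_az + div * d_rng.

    samples: iterable of (d_az, d_rng, v) tuples for the gates in the
    neighborhood. Returns (azimuthal_shear, divergence_estimate).
    """
    ata = [[0.0] * 3 for _ in range(3)]   # normal-equation matrix A^T A
    atv = [0.0] * 3                       # right-hand side A^T v
    for d_az, d_rng, v in samples:
        row = (1.0, d_az, d_rng)
        for i in range(3):
            atv[i] += row[i] * v
            for j in range(3):
                ata[i][j] += row[i] * row[j]
    _v0, shear, div = solve3(ata, atv)
    return shear, div


# Synthetic 3x3 neighborhood with known coefficients.
gates = [(da, dr, 2.0 + 0.5 * da - 0.25 * dr)
         for da in (-1.0, 0.0, 1.0) for dr in (-1.0, 0.0, 1.0)]
shear, div = llsd_fit(gates)
print(shear, div)
```

Because the fit averages over a neighborhood, the resulting shear field is smoother than gate-to-gate velocity differences, which is what makes it attractive as an input.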
LLSD usage • Azimuthal shear field
Boundaries • Tornadoes frequently happen at the boundaries between air masses • A boundary is not a necessary condition • Image shows dry-line boundary • Image processing for boundaries to detect gust-fronts would be useful.
Input Sources • The LLSD has never been used in vortex detection. Unlike the raw radial velocity, it can be combined across multiple radars. • Also have satellite data for the spatial domain • Have national/regional lightning data. • The Near Storm Environment (RUC model) • Still need to assimilate LLSD and reflectivity data from multiple radars in a fault-tolerant manner. (Can now do fault-tolerant time-based merges).
Learning • Add a learning component • Incorporate warnings issued by forecaster into the learning by the algorithm. • Warnings can be faulty. Different forecasters have different skills. Therefore, this has to be achieved by the algorithm learning on the fly. • Validate the algorithm against storm reports. The verification data is noisy. Have to come up with robust ways of doing this verification.
Data: status • The WDSS-II system already ingests radar data from multiple radars and national/regional lightning data. • Work is underway to ingest satellite data in real-time (archived cases can be done already). • We have archived warnings and RUC data since April of this year. • Currently testing process to compute LLSD at different scales. • RUC model data needs to be ingested.
Discussion • What kinds of techniques are appropriate for vortex detection? • Multiple-sensor reflectivity, LLSD • RUC model data (in Lambert projection) • Multivariate analysis • Gust-front detection
Task 2: Clustering and Prediction • Currently there are two ways to identify storms: • Heuristic threshold-based technique that operates on radial reflectivity field. • Texture segmentation method. • Once identified, the storms are predicted by: • Matching centroids of storms identified and linear extrapolation • Find motion estimate by minimizing mean absolute error on actual field. Then, forecast.
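The second prediction approach, finding the motion that minimizes mean absolute error and then forecasting, can be sketched as follows. This is a simplified illustration with hypothetical names: it searches integer grid shifts over the whole field, whereas the technique discussed here uses each cluster as the kernel, and it breaks ties in favor of the smaller displacement (an assumption made for this sketch).

```python
def mean_abs_error(prev, curr, dy, dx):
    """MAE between prev shifted by (dy, dx) and curr, over the overlap."""
    rows, cols = len(prev), len(prev[0])
    total, count = 0.0, 0
    for y in range(rows):
        for x in range(cols):
            yy, xx = y + dy, x + dx
            if 0 <= yy < rows and 0 <= xx < cols:
                total += abs(curr[yy][xx] - prev[y][x])
                count += 1
    return total / count if count else float("inf")


def estimate_motion(prev, curr, max_shift=2):
    """Return the (dy, dx) shift per frame that minimizes the MAE."""
    candidates = []
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            err = mean_abs_error(prev, curr, dy, dx)
            # Tie-break on displacement size so that shifts which merely
            # push the storm out of the overlap region are not preferred.
            candidates.append((err, abs(dy) + abs(dx), dy, dx))
    _err, _size, dy, dx = min(candidates)
    return dy, dx


def advect(field, dy, dx, fill=0.0):
    """Forecast one step ahead by moving the field by (dy, dx)."""
    rows, cols = len(field), len(field[0])
    out = [[fill] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            yy, xx = y + dy, x + dx
            if 0 <= yy < rows and 0 <= xx < cols:
                out[yy][xx] = field[y][x]
    return out


# Two synthetic 8x8 reflectivity frames with one storm cell moving (1, 2).
prev = [[0.0] * 8 for _ in range(8)]
curr = [[0.0] * 8 for _ in range(8)]
prev[3][3] = 40.0
curr[4][5] = 40.0
motion = estimate_motion(prev, curr)
forecast = advect(curr, *motion)
print(motion)
```

Unlike centroid matching with linear extrapolation, this produces a forecast of the whole field rather than of a single point per storm.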
SCIT / kmeans • The centroid and threshold-based technique called SCIT (storm cell identification and tracking) is used on the WSR-88D. • The texture segmentation and error-field minimization technique is being worked on. • I will show the results from the second technique because the first technique predicts only centroid location. (We want to do field forecasts).
Kmeans • These clusters are actually found at different scales. • The clusters are used as the domain within which the error minimization is done (the kernel that is moved around in the previous frame). • And using these, a motion estimate (“wind field”) is obtained at different scales.
Performance • Compared to a persistence forecast. • Skill at predicting the location of 30dBZ or higher values. • Clutter at the end of sequence. (Random data are assigned motion estimates)
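One way to score a forecast against persistence at the 30 dBZ threshold is sketched below. The slide does not say which skill measure was used; the Critical Success Index (hits divided by hits plus misses plus false alarms) is an assumption chosen for illustration, and the function name is hypothetical.

```python
def critical_success_index(forecast, observed, threshold=30.0):
    """CSI for the event 'value >= threshold' on two equal-sized grids."""
    hits = misses = false_alarms = 0
    for f_row, o_row in zip(forecast, observed):
        for f, o in zip(f_row, o_row):
            f_yes, o_yes = f >= threshold, o >= threshold
            if f_yes and o_yes:
                hits += 1
            elif o_yes:
                misses += 1
            elif f_yes:
                false_alarms += 1
    denom = hits + misses + false_alarms
    return hits / denom if denom else float("nan")


# Persistence uses the previous frame unchanged as the forecast.
previous = [[35.0, 40.0, 0.0], [0.0, 0.0, 0.0]]
observed = [[0.0, 35.0, 40.0], [0.0, 0.0, 0.0]]
advected = [[0.0, 35.0, 40.0], [0.0, 0.0, 0.0]]  # storm moved one cell

print(critical_success_index(previous, observed))  # persistence score
print(critical_success_index(advected, observed))  # motion-based score
```

A forecast shows skill over persistence when its score is higher on the same observed field. Clutter that is assigned a motion estimate counts against the forecast as false alarms or misses.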
Ideas for future work • Drawbacks with current approach: • Operates on radar or on satellite, not on both. • Cannot handle faulty data (as with clutter) • Use multiple inputs in deriving motion estimates: • Storm core movement (as the technique does) • Dual-Doppler wind field retrieval (?) • Wind-field estimates from mesoscale model (RUC)
Discussion • Why go through wind-field estimate (and not directly to a forecast)? • To allow forecast of fields other than the input. • Physically reasonable assimilation. • Better ways of identifying storms. • Better ways of predicting location and values (field forecast).
Task 3: Polarimetric Radar Algorithms • Essentially open field for research. • Currently only one AI algorithm: a hydrometeor classification algorithm. • Low-hanging fruit: a hail-size estimation technique.
Hail Size Estimation • Currently done on Doppler radar (an algorithm to compute a field of hail size estimates is already in WDSS-II). • High reflectivity data aloft are assumed to produce hail. • Polarimetric radar provides a way of identifying hail near the surface (via aspect ratio). • Come up with a way to estimate hail size.
Learning • Train the technique on actual hail reports (which are noisy). • Problems with polarimetric radar include calibration errors. Techniques have to account for this. • Use the polarimetric hail-size estimation technique to improve the predicted hail-size from the Doppler-based method.
Contacts • People at CIMMS/NSSL who can advise on each of these tasks: • Vortex Detection: Greg Stumpf, Travis Smith • greg.stumpf@noaa.gov • travis.smith@noaa.gov • Clustering/Prediction: V Lakshmanan, Bob Rabin • v.lakshmanan@noaa.gov • rabin@nssl.noaa.gov • Polarimetric Radar: Terry Schuur • terry.schuur@noaa.gov