Topic 4: Spatial forecast verification
Participants: Dave Ahijevych, Randy Bullock, Jason Nachamkin, Sukanta Basu, Beth Ebert, Bill Gallus, Tom Hamill, Mike Baldwin, Efi Foufoula-Georgiou, Scott Sandgathe, Barbara Casati, Mike Kay, Eric Gilleland, Barbara Brown
1. Do the traditional continuous and categorical scores have a place in verifying high-resolution model forecasts? If so, how should they best be used?
• Yes, they will always have a role
• Some users – including model developers – use this information (e.g., bias)
• Also can evaluate traditional scores at different scales, to understand performance as a function of scale (also could average across different scales)
• Provides a good foundation and baseline (i.e., continue a historical record)
• Many people understand them
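The idea of evaluating traditional scores at different scales can be sketched as follows. This is a minimal illustration, not a method endorsed by the group: it assumes forecast and observed fields are equal-sized 2-D lists, uses simple block averaging as the upscaling step, and all function names (`block_mean`, `rmse`, `frequency_bias`, `scores_by_scale`) are hypothetical.

```python
from math import sqrt


def block_mean(field, n):
    """Average a 2-D list over non-overlapping n x n blocks.

    Rows/columns that do not fill a complete block are dropped.
    """
    rows = len(field) // n
    cols = len(field[0]) // n
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            total = sum(field[i * n + di][j * n + dj]
                        for di in range(n) for dj in range(n))
            row.append(total / (n * n))
        out.append(row)
    return out


def rmse(fcst, obs):
    """Root-mean-square error over all grid points (continuous score)."""
    sq = [(f - o) ** 2
          for f_row, o_row in zip(fcst, obs)
          for f, o in zip(f_row, o_row)]
    return sqrt(sum(sq) / len(sq))


def frequency_bias(fcst, obs, threshold):
    """Forecast event count divided by observed event count (categorical score)."""
    f_yes = sum(v >= threshold for row in fcst for v in row)
    o_yes = sum(v >= threshold for row in obs for v in row)
    return f_yes / o_yes if o_yes else float("nan")


def scores_by_scale(fcst, obs, threshold, scales=(1, 2, 4)):
    """Compute both scores after upscaling the fields to each block size."""
    results = {}
    for n in scales:
        f, o = block_mean(fcst, n), block_mean(obs, n)
        results[n] = {"rmse": rmse(f, o),
                      "freq_bias": frequency_bias(f, o, threshold)}
    return results
```

For a forecast feature displaced by one grid point, the grid-scale RMSE is penalized twice (a miss and a false alarm), while the same score at a 2 x 2 block scale can drop to zero, which is the kind of scale dependence the bullet refers to.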
2. Some new methods provide fields of displacement vectors – how can this information best be used? Is it useful to combine distance and intensity errors into a single metric (e.g., Venugopal et al. 2005, Keil and Craig 2007)?
• Need to be very careful in looking at displacements
• Can be due to bias (e.g., different sizes of systems)
• Especially of concern when there isn’t clear displacement
• However, there are different dimensions to this question
• For CRAs (contiguous rain areas) the displacement error is very useful, and could be very useful for forecasters
• May be more meaningful to look at this at different scales – consider a pyramidal approach of moving from one scale to another
• Also could define the scale of interest (e.g., for sea breezes)
• Or could “step through” different scales
• Combining distance and intensity into a single metric
• Some methods do this
• Model developer might not want it
• Need to consider the user in determining how to weight these
• Similar to “Integrity-Fidelity” concepts
• Measure trade-offs from one attribute to another – apply costs to different types of error
• Sukanta and Efi also have used this kind of approach
• Used Hausdorff metric instead because of computational costs, but could probably be more efficient now
• Also requires “equal mass” in the fields – i.e., unbiased
• Also like a combined energy measure
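For readers unfamiliar with the Hausdorff metric mentioned above, here is a minimal sketch of the textbook definition applied to thresholded fields. It is a brute-force version and not necessarily the variant used in the cited work; the function names and the thresholding helper `points_above` are assumptions made for illustration.

```python
from math import hypot


def points_above(field, threshold):
    """Grid indices (row, col) where a 2-D list meets or exceeds the threshold."""
    return [(i, j)
            for i, row in enumerate(field)
            for j, v in enumerate(row)
            if v >= threshold]


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    if not a or not b:
        raise ValueError("both point sets must be non-empty")
    return max(min(hypot(ax - bx, ay - by) for bx, by in b)
               for ax, ay in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets (in grid units)."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))
```

The distance depends only on where the thresholded features are, not on their intensities, so an intensity term would have to be supplied separately if a combined metric is wanted. The brute-force cost grows with the product of the two set sizes.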
5. When using object-based verification methods how do errors in object matching impact upon the interpretation of verification results? Is it important that matched objects be physically similar?
• Probably should weight intensity more than overall shape/geometry in matching objects
• Could use other parameters to help with matching (e.g., in cluster analysis, MODE)
• Might want to use additional meteorological parameters to make matching more robust in physical terms (to guarantee matched objects are generated by the same physical processes)
• Ex: Could look at temperature to help identify frontal regions
• New verification methods bring in a new level of uncertainty – in the matching and merging (how do we explain this to managers, etc.?)
• Different people (humans) also would match things in different ways
• Looking at time dimension might help a lot – tracking objects through time may make the results more robust
• However, this is a problem that is faced by nowcasting, and they have not solved this problem
• Lifetime scale of systems may not be long enough to be completely helpful with NWP
• Also always a spectrum of systems going on at the same time (e.g., issues of merging, splitting, etc.)
• This issue is of critical concern for all object methods – needs much deeper analysis
• Could make use of the SPC spring program to help with this
• Methods also provide some information about the strength of a match, and this could be used in some way, rather than making a strict cutoff to define a match
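One way the last point could work in practice is to use the match strength as a weight when aggregating errors, instead of discarding pairs below a threshold. The sketch below is only an illustration under stated assumptions: each matched pair is represented as a hypothetical `(strength, error)` tuple with strength between 0 and 1, and neither function corresponds to an existing tool's API.

```python
def cutoff_mean(pairs, min_strength):
    """Mean error over pairs whose match strength meets a strict cutoff."""
    kept = [err for strength, err in pairs if strength >= min_strength]
    return sum(kept) / len(kept) if kept else float("nan")


def weighted_mean(pairs):
    """Mean error with each pair weighted by its match strength."""
    total_weight = sum(strength for strength, _ in pairs)
    if total_weight == 0:
        return float("nan")
    return sum(strength * err for strength, err in pairs) / total_weight
```

With pairs `[(1.0, 10.0), (0.5, 40.0), (0.5, 20.0)]`, a cutoff of 0.7 keeps only the first pair and gives 10.0, while the weighted version uses all three and gives 20.0. The weak matches still contribute, but with less influence than the confident one.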
6. How should results from object-based methods be aggregated?
• Curse of dimensionality…
• Amount of info is overwhelming. What do you present? (e.g., in a verification system)
• Criteria: Should take less time than looking at individual cases
• Or you could think of it as a wealth of information – can answer many different questions
• Provide info and allow users to select what they want to look at
• Scale aggregation approach is informative
• Stratification may be one of the most important aspects to allow more meaningful interpretation
• Allow the toolkit to take in user-defined stratifications (e.g., locations of shallow cold pools); or other kinds of information (e.g., bad air traffic data)
• Stratify by forecast characteristics
• Some of these may have to wait for later versions of the toolkit
• Wind rose type of diagram is useful for showing displacement distributions
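The stratification and wind-rose ideas can be combined in a small aggregation step: bin each object's displacement vector into a compass sector and keep separate counts per user-defined stratum. This is a hedged sketch only. The input layout `(stratum, dx, dy)` with `dx` positive eastward and `dy` positive northward, the eight-sector binning, and the function names are all assumptions.

```python
from collections import Counter, defaultdict
from math import atan2, degrees

SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def sector(dx, dy):
    """Compass sector toward which the displacement (dx east, dy north) points."""
    if dx == 0 and dy == 0:
        return "none"  # no displacement, so no direction to report
    bearing = degrees(atan2(dx, dy)) % 360.0  # 0 = north, increasing clockwise
    return SECTORS[int((bearing + 22.5) // 45) % 8]


def rose_counts(cases):
    """Count displacement directions separately for each stratum.

    cases: iterable of (stratum, dx, dy) tuples.
    Returns {stratum: Counter({sector: count})}.
    """
    counts = defaultdict(Counter)
    for stratum, dx, dy in cases:
        counts[stratum][sector(dx, dy)] += 1
    return counts
```

The per-stratum counters hold the numbers a wind-rose plot would display. Magnitude classes could be added as a second key if the distance of the displacement matters as well as its direction.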
8. Should spatial verification methods account for variations and/or directionality in surface properties (land/sea, topography)? If so, how?
• Accounting for this should be considered but may only be an issue for some methods
• Others might not be affected
• However might provide information for users (could be a stratification)
• This is clearly an issue for the fuzzy neighborhood methods – should be considered
• Further research needed
9. Some of the newer diagnostic methods have tunable parameters. How should they be treated in routine automated verification?
• Need to look at a lot of cases
• Don’t know how the methods will perform on different types of systems and seasons
• Need a verification testbed to test the verification methods
• Everyone will need to run new methods on identified test cases
• Parameters may need to be adjusted for different kinds of situations
• Look at cases ahead of time to help define what the best parameter choices are
• Automated approach for this?
• However – different users may have different needs
• May have to be able to tune the parameters themselves
• Fuzzy GUI
• If parameters are tunable, guidance will need to be given to users
• It is possible that some of the parameters may be “fix-able”
• May depend on some other aspects of the weather (e.g., average daily precipitation)
• Some methods use some kind of a Bayesian approach that would make this objective
• However, these depend on “optimizing” on some scoring rule, which may not be what we are really after
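A simple automated way to look at cases ahead of time is a parameter sweep over a set of test cases, followed by a check of how much the mean score moves when each parameter varies. Parameters with little influence are candidates for being fixed. The sketch below assumes a hypothetical `score_fn(fcst, obs, **params)` supplied by the user; the names `sweep` and `sensitivity` are illustrative and do not refer to any existing toolkit.

```python
from itertools import product
from statistics import mean


def sweep(score_fn, cases, grid):
    """Mean score over all test cases for every parameter combination.

    cases: list of (fcst, obs) pairs.
    grid:  {parameter_name: [candidate values]}.
    Returns (names, {value_tuple: mean_score}); tuples follow the order of names.
    """
    names = sorted(grid)
    results = {}
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        results[values] = mean(score_fn(f, o, **params) for f, o in cases)
    return names, results


def sensitivity(names, results):
    """Range of the mean score as each parameter varies, pooling the others."""
    out = {}
    for k, name in enumerate(names):
        by_value = {}
        for values, score in results.items():
            by_value.setdefault(values[k], []).append(score)
        level_means = [mean(scores) for scores in by_value.values()]
        out[name] = max(level_means) - min(level_means)
    return out
```

This only measures sensitivity of one chosen score on the chosen cases. As noted above, a parameter that looks safe to fix for one score or one type of weather may not be safe for another.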
4. How can results from diagnostic verification methods be explained to administrators or non-scientists? Is this important?
3. How can spatial verification methods be adapted to incorporate the time dimension?
7. Are there additional methods from the image processing community, as used in medical, land use, motion picture, and military applications, that can be applied to spatial verification in meteorology?
10. What granularity of forecast and observation data (i.e., spatial and temporal resolution) is required to apply spatial approaches and methods for evaluation of spatial forecasts?