470 likes | 669 Views
Verification of forecasts from the SWFDP – SE Asia. Richard (Rick)Jones Regional Training Workshop on Severe Weather Forecasting Macau, April 8 -13 , 2013. Verification. lawrence.wilson@ec.gc.ca WMO sponsored Joint Working Group on Forecast Verification Research JWGFVR
E N D
Verification of forecasts from the SWFDP – SE Asia Richard (Rick)Jones Regional Training Workshop on Severe Weather Forecasting Macau, April 8 -13 , 2013
Verification • lawrence.wilson@ec.gc.ca • WMO sponsored Joint Working Group on Forecast Verification Research JWGFVR • http://www.wmo.int/pages/prog/arep/wwrp/new/Forecast_Verification.html • http://www.cawcr.gov.au/projects/verification/
Resources • The EUMETCAL training site on verification – computer aided learning: http://www.eumetcal.org/resources/ukmeteocal/verification/www/english/courses/msgcrs/index.htm • The website of the Joint Working Group on Forecast Verification Research: http://www.cawcr.gov.au/projects/verification/ • WMO/TD 1083 : Guidelines on performance assessment of the performance of Public Weather Systems
Why? • “you can't know where you're going until you know where you've been” Proverb or • George Santayana-"Those who are unaware of history are destined to repeat it.” • Quality management- “Plan Do Check Act” – Deming • How to verify ….“Begin with the end in mind” … Covey
Verification as a measure of Forecast quality • to monitor forecast quality • to improve forecast quality • to compare the quality of different forecast systems
Introduction • “Verification activity has value only if the information generated leads to a decision about the forecast or system being verified” – A. Murphy • “User-Oriented Verification” • Verification methods designed with the needs of a specific user in mind. • “Users” are those who are interested in verification results, and who will take action based on verification results • Forecasters, modelers are users too.
SWFDP Goals PROGRESS AGAINST SWFDP GOALS To improve the ability of NMSs to forecast severe weather events To improve the lead time of alerting these events To improve the interaction of NMSs with Disaster Management and Civil Protection authorities before, during and after severe weather events To identify gaps and areas for improvements To improve the skill of products from Global Centres through feedback from NMSs • EVALUATION OF WEATHER WARNINGS • Feedback from the public • Feedback from the DMCPA to include comments of the timeliness and usefulness of the warnings • Feedback from the media • Warning verification by the NMCs
Introduction • Importance of verification • Relative ease by any group can put out of forecast for any where in the world • How good are they? • Product differentiation
Goals of Verification • Scientific • To identify the strengths and weaknesses of a forecast product in sufficient detail that actions can be specified that will lead to improvements in the product, ie to provide information to direct R&D. • Demands more detail in verification methodology “diagnostic verification” • SWFDP: Both administrative goals and scientific goals
Forecast “goodness” • What makes a forecast good? • QUALITY: How well it corresponds with the actual weather, as revealed by observations. (Verification) • VALUE: The increase or decrease in economic or other value to a user, attributable to his use of the forecast. (satisfaction) • Requires information from the user to assess, in addition to verification • Can be assessed by methods of decision theory. (Cost-Loss etc)
Principles of (Objective) Verification • Verification activity has value only if the information generated leads to a decision about the forecast or system being verified • User of the information must be identified • Purpose of the verification must be known in advance • No single verification measure provides complete information about the quality of a forecast product.
The contingency Table Observations Yes No Yes Forecasts No
Preparation of the table • Start with matched forecasts and observations • Forecast event is precipitation >50 mm / 24 h Next day • Threshold – medium risk • Count up the number of each of hits, false alarms, misses and correct negatives over the whole sample • Enter them into the corresponding 4 boxes of the table.
Exercice • Mozambique contingency table • Review on Thursday 11 April
Outline • Introduction: • Purposes and Principles of verification • Some relevant verification measures: Contingency table and scores • Verification of products from the SWFDP • Verification of probability forecasts • Exercise results and interpretation
Forecast “goodness” • Evaluation of forecast system • Forecast goodness • Evaluation of delivery system • timeliness (are forecasts issued in time to be useful?) • relevance (are forecasts delivered to intended users in a form they can understand and use?) • robustness (level of errors or failures in the delivery of forecasts)
Principles of (Objective) Verification • Forecast must be stated in such a way that it can be verified • What about subjective verification? • With care, is OK. If subjective, should not be done by anyone directly connected with the forecast. • Sometimes necessary due to lack of objective information
Goals of Verification • Administrative • Justify cost of provision of weather services • Justify additional or new equipment • Monitor the quality of forecasts and track changes • Usually means summarizing the verification into few numbers (scoring) • Impact - $ and injuries
Verification Procedure • Start with dataset of matched observations and forecasts • Data preparation is the major part of the effort of verification • Establish purpose • Scientific vs. administrative • Pose question to be answered, for specific user or set of users • Stratification of dataset • On basis of user requirements (seasonal, extremes etc) • Take care to maintain sufficient sample size
Verification Procedure • Nature of variable being verified • Continuous: Forecasts of specific value at specified time and place • Categorical: Forecast of an “event”, defined by a range of values, for a specific time period, and place or area • Probabilistic: Same as categorical, but uncertainty is estimated • SWFDP: Predicted variables are categorical: Extreme events, where extreme is defined by thresholds of precipitation and wind. Some probabilistic forecasts are available too.
What is the Event? • For categorical and probabilistic forecasts, one must be clear about the “event” being forecast • Location or area for which forecast is valid • Time range over which it is valid • Definition of category • Example?
What is the Event? • And now, what is defined as a correct forecast? A “hit” • The event is forecast, and is observed – anywhere in the area? Over some percentage of the area? • Scaling considerations • Discussion:
Events for the SWFDP • Best if “events” are defined for similar time period and similar-sized areas • One day 24h • Fixed areas; should correspond to forecast areas and have at least one reporting stn. • The smaller the areas, the more useful the forecast, potentially, BUT… • Predictability lower for smaller areas • More likely to get missed event/false alarm pairs
Events for the SWFDP • Correct negatives a problem • Data density a problem • Best to avoid verification where there is no data.
The contingency Table Observations Yes No Yes Forecasts No
Contingency tables Observations range: 0 to 1 best score = 1 Forecasts range: 0 to 1 best score = 0 • Characteristics: • PoD= “Prefigurance” or “probability of detection”, “hit rate” • Sensitive only to missed events, not false alarms • Can always be increased by overforecasting rare events • FAR= “False alarm ratio” • Sensitive only to false alarms, not missed events • Can always be improved by underforecasting rare events
Contingency tables Observations range: 0 to 1 best score = 1 Forecasts best score = 1 • Characteristics: • PAG= “Post agreement” • PAG= (1-FAR), and has the same characteristics • Bias: This is frequency bias, indicates whether the forecast distribution is similar to the observed distribution of the categories (Reliability)
Contingency tables Observations Forecasts range: 0 to 1 best score = 1 • Characteristics: • Critical success index, better known as the Threat Score • Sensitive to both false alarms and missed events; a more balanced measure than either PoD or FAR
Contingency tables Observations Forecasts range: negative value to 1 best score = 1 • Characteristics: • Heidke skill score against chance (as shown) • Easy to show positive values • Better to use climatology or persistence • needs another table
Contingency tables Observations range: 0 to 1 best score = 1 Forecasts best score = 0 • Characteristics: • Hit Rate (HR) is the same as the PoD and has the same characteristics • False alarm RATE. This is different from the false alarm ratio. • These two are used together in the Hanssen-Kuipers score, and in the ROC, and are best used in comparison.
Contingency tables Observations Forecasts range: -1 to 1 best score = 1 • Extreme dependency score characteristics: • Score can be improved by incurring more false alarms • Considered useful for extremes because does not converge to 0 as • the base rate (observed frequency of events) decreases • A relatively new score – not yet widely used.
Contingency tables Observations Forecasts range: 0 to 1 best score = 1 • Characteristics: • Better known as the Threat Score • Sensitive to both false alarms and missed events; a more balanced measure than either PoD or FAR
Contingency tables Observations range: 0 to 1 best score = 1 Forecasts best score = 0 • Characteristics: • Hit Rate (HR) is the same as the PoD and has the same characteristics • False alarm RATE. This is different from the false alarm ratio. • These two are used together in the Hanssen-Kuipers score, and in the ROC, and are best used in comparison.
Extreme weather scores • Extreme Dependency Score EDS • Extreme Dependency Index EDI • Symmetric Extremal Dependency Score SEDS • Symmetric Extremal Dependency Index SEDI
Verification of extreme, high-impact weather • EDS – EDI – SEDS - SEDI Novelty categorical measures! • Standard scores tend to zero for rare events Ferro & Stephenson, 2010: Improved verification measures for deterministic forecasts of rare, binary events. Wea. and Forecasting Base rate independence Functions of H and F Extremal Dependency Index - EDI Symmetric Extremal Dependency Index - SEDI
Example - Madagascar 211 Cases Separate tables assuming low, medium, high risk as thresholds Can plot the hit rate vs the false alarm RATE = FA/total obs no
Discrimination • User-perspective: • Does the model or forecast tend to give higher values of precipitation when heavy precipitation occurs than when it doesn’t? (or temperature?)
Misses Hits False alarms Observed Forecast Contingency Table for spatial data • Possible interpretation for spatially defined threat areas: • Put grid of equal area boxes over overlaid obs and fcsts • Entries are just the number of boxes covered by the areas as shown. • Correct negatives problematic, but could limit to total forecast domain • Likely to result in overforecasting bias – different interpretation? • Can be done only where spatially continuous obs and forecasts are available – hydro estimator?
Probability forecast verification – Reliability tables • Reliability: • The level of agreement between the forecast probability and the observed frequency of an event • Usually displayed graphically • Measures the bias in a probability forecast: Is there a tendency to overforecast or underforecast. • Cannot be evaluated on a single forecast.
Summary – NMS products • Warnings issued by NMSs • Contingency tables as above, if enough data is gathered • Important for a warning to determine the lead time – must archive the issue time of the warning and the occurrence time of the event. • Data problems – verify the “reporting of the event”
Summary and discussion…. • Summary • Keep the data! • Be clear about all forecasts! • Know why you are verifying and for whom! • Keep the verification simple but relevant! • Just do it! • Case studies – post-mortem
SWFDP verification Thank you