Verification methods - towards a user-oriented verification (WG5)
Course of verification in principle
[Diagram: Modeller, Forecast, Analysis, End user, Data control (twice), Verification, Observation]
What are the results of verification?
RMSE, BIAS, S1, ANOC, ETS, FBI, BSS, ROC, ISS, FSS
Attributes of a forecast related to observations (I)
• Bias - the correspondence between the mean forecast and the mean observation.
• Association - the strength of the linear relationship between the forecasts and the observations (for example, the correlation coefficient measures this linear relationship).
• Accuracy - the level of agreement between the forecast and the truth (as represented by observations). The difference between the forecast and the observation is the error. The lower the errors, the greater the accuracy.
• Skill - the relative accuracy of the forecast over some reference forecast. The reference forecast is generally an unskilled forecast such as random chance, persistence (defined as the most recent set of observations; "persistence" implies no change in condition), or climatology. Skill refers to the increase in accuracy due purely to the "smarts" of the forecast system. Weather forecasts may be more accurate simply because the weather is easier to forecast; skill takes this into account.
Source: http://www.bom.gov.au/bmrc/wefor/staff/eee/verif/verif_web_page.html#What%20makes%20a%20forecast%20good referring to: A.H. Murphy, Weather and Forecasting, 8 (1993), Iss. 2, 281-293
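The first four attributes can be illustrated with a few lines of Python. This is only a minimal sketch: the function names are invented for this example, the mean squared error is assumed as the accuracy measure, and the skill score is assumed to take the form 1 - MSE(forecast)/MSE(reference), with climatology (the mean observation) as reference.

```python
import math

def bias(fcst, obs):
    """Bias: mean forecast minus mean observation."""
    return sum(fcst) / len(fcst) - sum(obs) / len(obs)

def rmse(fcst, obs):
    """Accuracy: root mean square of the errors (forecast - observation)."""
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(fcst, obs)) / len(obs))

def correlation(fcst, obs):
    """Association: Pearson correlation coefficient."""
    n = len(obs)
    mean_f, mean_o = sum(fcst) / n, sum(obs) / n
    cov = sum((f - mean_f) * (o - mean_o) for f, o in zip(fcst, obs))
    var_f = sum((f - mean_f) ** 2 for f in fcst)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_f * var_o)

def mse_skill_score(fcst, ref, obs):
    """Skill: accuracy relative to a reference forecast (assumed MSE-based)."""
    return 1.0 - rmse(fcst, obs) ** 2 / rmse(ref, obs) ** 2

# Hypothetical sample values, not taken from the slides.
obs = [2.0, 4.0, 6.0, 8.0]
fcst = [3.0, 5.0, 7.0, 9.0]
climatology = [sum(obs) / len(obs)] * len(obs)  # unskilled reference

print(bias(fcst, obs), rmse(fcst, obs), correlation(fcst, obs))
print(mse_skill_score(fcst, climatology, obs))
```

In this constructed case the forecast is perfectly associated with the observations (correlation 1) but still has a bias of 1, which shows why association and bias have to be examined separately.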
Attributes of a forecast related to observations (II)
• Reliability - the average agreement between the forecast values and the observed values. If all forecasts are considered together, then the overall reliability is the same as the bias. If the forecasts are stratified into different ranges or categories, then the reliability is the same as the conditional bias.
• Resolution - the ability of the forecast to sort or resolve the set of events into subsets with different frequency distributions. This means that the distribution of outcomes when "A" was forecast is different from the distribution of outcomes when "B" was forecast. Even if the forecasts are wrong, the forecast system has resolution if it can successfully separate one type of outcome from another.
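For probability forecasts of a binary event, one common way to put numbers on reliability and resolution is the Murphy decomposition of the Brier score, BS = reliability - resolution + uncertainty. The sketch below assumes that the forecasts take only a few distinct probability values, so that each value can serve as its own category; the function name and the sample data are illustrative only.

```python
from collections import defaultdict

def brier_decomposition(probs, outcomes):
    """Return (brier_score, reliability, resolution, uncertainty).

    probs    : forecast probabilities (a few distinct values assumed)
    outcomes : observed events, 1 if the event occurred, else 0
    """
    n = len(outcomes)
    base_rate = sum(outcomes) / n

    # Stratify the observations by the forecast value that was issued.
    groups = defaultdict(list)
    for p, o in zip(probs, outcomes):
        groups[p].append(o)

    reliability = 0.0
    resolution = 0.0
    for p, obs_in_group in groups.items():
        n_k = len(obs_in_group)
        freq_k = sum(obs_in_group) / n_k           # observed frequency in category
        reliability += n_k * (p - freq_k) ** 2     # conditional bias
        resolution += n_k * (freq_k - base_rate) ** 2
    reliability /= n
    resolution /= n

    uncertainty = base_rate * (1.0 - base_rate)
    brier = sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / n
    return brier, reliability, resolution, uncertainty

# Hypothetical sample: two forecast categories, both perfectly reliable.
probs = [0.2] * 5 + [0.8] * 5
outcomes = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
bs, rel, res, unc = brier_decomposition(probs, outcomes)
print(bs, rel, res, unc)
```

A small reliability term means the forecast probabilities match the observed frequencies; a large resolution term means the categories separate the outcomes well.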
Attributes of a forecast related to observations (III)
• Sharpness - the tendency of the forecast to predict extreme values. To use a counter-example, a forecast of "climatology" has no sharpness. Sharpness is a property of the forecast only, and like resolution, a forecast can have this attribute even if it is wrong (in this case it would have poor reliability).
• Discrimination - the ability of the forecast to discriminate among observations, that is, to have a higher prediction frequency for an outcome whenever that outcome occurs.
• Uncertainty - the variability of the observations. The greater the uncertainty, the more difficult the forecast will tend to be.
Current focal points of verification
• Spatial verification methods
  - object-oriented methods
  - "fuzzy" techniques
• Verification of probabilistic and ensemble forecasts
  - ensemble pdf
  - generic probability forecasts
  - probability of an event
• Verification of extreme (rare) events
  - high-impact events
• Operational verification
  - evaluation and monitoring
• User-oriented verification strategies
  - tailored verification for any user
• Forecast value
  - cost-loss analysis, development of a universal score
• Verification packages
  - VERSUS, MET
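The cost-loss analysis mentioned under "Forecast value" can be sketched for a yes/no event as follows. This is an assumed textbook formulation, not a method taken from the slides: a user pays a cost C whenever protective action is taken and suffers a loss L when the event occurs unprotected, and the relative value compares the mean expense of following the forecast with that of climatology and of a perfect forecast. The function name and the numbers are hypothetical.

```python
def relative_value(hits, false_alarms, misses, correct_negatives, cost, loss):
    """Relative economic value of a yes/no forecast (assumes 0 < cost < loss
    and an event frequency strictly between 0 and 1)."""
    n = hits + false_alarms + misses + correct_negatives
    base_rate = (hits + misses) / n

    # Mean expense when the user acts on every "yes" forecast.
    expense_forecast = ((hits + false_alarms) * cost + misses * loss) / n
    # Cheapest fixed strategy without forecasts: always protect or never protect.
    expense_climate = min(cost, base_rate * loss)
    # Perfect forecast: protect exactly when the event occurs.
    expense_perfect = base_rate * cost

    return (expense_climate - expense_forecast) / (expense_climate - expense_perfect)

# Hypothetical contingency table and cost/loss values.
print(relative_value(30, 10, 10, 50, cost=1.0, loss=5.0))
```

A value of 1 corresponds to a perfect forecast, 0 to no gain over climatology, and negative values mean the user would be better off ignoring the forecast for that cost/loss ratio.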
Some problems concerning significance
• Some scores have a statistical form and seem to lend themselves to significance tests.
• Traditional significance tests require a defined number of degrees of freedom.
• In most cases observations, forecasts and errors are correlated.
• Therefore, the degrees of freedom cannot be obtained easily.
• One way out: resampling and bootstrapping.
• What about statistical significance versus meteorological significance?
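The resampling idea can be sketched with a moving-block bootstrap, which is one possible variant rather than a method prescribed by the slides. Consecutive errors are resampled in blocks so that part of their serial correlation is preserved, and both models are resampled with the same indices because their errors are paired in time. The function names, the block length and the number of resamples are assumptions for illustration.

```python
import math
import random

def rmse_of_errors(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def block_bootstrap_rmse_diff(err_a, err_b, block_len=5, n_boot=1000,
                              alpha=0.05, seed=0):
    """Confidence interval for RMSE(A) - RMSE(B) from paired error series."""
    rng = random.Random(seed)
    n = len(err_a)
    starts = range(n - block_len + 1)       # admissible block start indices
    n_blocks = math.ceil(n / block_len)

    diffs = []
    for _ in range(n_boot):
        idx = []
        for _ in range(n_blocks):
            s = rng.choice(starts)
            idx.extend(range(s, s + block_len))
        idx = idx[:n]                        # trim to the original length
        sample_a = [err_a[i] for i in idx]
        sample_b = [err_b[i] for i in idx]   # same indices: paired resampling
        diffs.append(rmse_of_errors(sample_a) - rmse_of_errors(sample_b))

    diffs.sort()
    lower = diffs[int(alpha / 2 * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical error series: model A has clearly larger errors than model B.
err_a = [2.0 * (-1) ** i for i in range(60)]
err_b = [0.5 * (-1) ** i for i in range(60)]
print(block_bootstrap_rmse_diff(err_a, err_b, n_boot=200))
```

If the interval excludes zero, the difference is statistically significant at the chosen level; whether it is also meteorologically significant remains a separate question.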
User-oriented verification strategies: What are the interests of the users?
• Administrator:
  - Did the forecasts yield better results during the last period of interest and in general?
  - Which focal points of model development are of current interest?
  - ...
• Modeller:
  - What types of errors occur in general?
  - What are the reasons for such errors?
  - How should the model be modified in order to avoid or to reduce these errors?
  - If one has found the reason(s) for the error(s) and has reduced the effect(s), is the forecast then improved?
  - ...
• External users and forecasters:
  - How can I interpret the forecasts?
  - What is the benefit of the forecasts for me?
  - ...
User-oriented verification step by step
1. Diagnosis of errors - normally done by examining the BIAS or the FBI
The problem - mean values of observed and forecast T2m over Germany during summer 2005 and winter 2005/2006 (RMSE/STDV)
[Figure panels: Summer 2005, Winter 2005/2006]
User-oriented verification step by step
1. Diagnosis of errors - normally done by examining the BIAS or the FBI
The problem - mean values of observed and forecast gusts over Germany during spring 2007 (RMSE/STDV)
[Figure panels: COSMO-DE, COSMO-EU]
User-oriented verification step by step
1. Diagnosis of errors - normally done by examining the BIAS or the FBI
Examples for four scores in four stylized situations:
An example for conditional verification
[Figure panels, each showing RMSE and STDV:]
- Forecast and observed values of surface level pressure over the region of Germany during DJF 2005/2006
- The same, restricted to observed and forecast values lower than 1020 hPa
- The same, restricted to observed and forecast values higher than 1020 hPa
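The stratification behind these three panels can be sketched as below. Two things are assumed rather than taken from the slides: that a pair enters a subset only when both the forecast and the observation lie on the same side of the threshold, and that STDV denotes the standard deviation of the error. The function names and the sample pressures are hypothetical.

```python
import math

def rmse_and_stdv(pairs):
    """RMSE and standard deviation of the error for (forecast, observation) pairs."""
    errors = [f - o for f, o in pairs]
    n = len(errors)
    mean_error = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    stdv = math.sqrt(sum((e - mean_error) ** 2 for e in errors) / n)
    return rmse, stdv

def conditional_verification(fcst, obs, threshold=1020.0):
    """Scores for all pairs and for the subsets below and above the threshold."""
    pairs = list(zip(fcst, obs))
    subsets = {
        "all": pairs,
        "low": [(f, o) for f, o in pairs if f < threshold and o < threshold],
        "high": [(f, o) for f, o in pairs if f > threshold and o > threshold],
    }
    # An empty subset yields None instead of a score.
    return {name: rmse_and_stdv(sub) if sub else None
            for name, sub in subsets.items()}

# Hypothetical pressure values in hPa.
fcst = [1010.0, 1015.0, 1025.0, 1030.0]
obs = [1012.0, 1013.0, 1024.0, 1031.0]
for name, scores in conditional_verification(fcst, obs).items():
    print(name, scores)
```

Comparing the "low" and "high" rows shows at a glance whether the error depends on the pressure regime, which the overall score alone would hide.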
User-oriented verification step by step
2. Some changes made by modellers
• New diagnosis of gusts
  - to reduce the overestimation of gusts: use the wind at 10 m instead of the wind interpolated from 30 m to compute gusts
• New diagnosis of 2m temperature
  - to reduce the strong negative bias during winter and to get a more realistic diurnal cycle: set z0 to 2 cm over land
• New SSO scheme (currently under examination)
User-oriented verification step by step
3. The effects
• New diagnosis of gusts
  - The overestimation of gusts is now reduced.
  - But: extreme gusts are underestimated.
• New diagnosis of 2m temperature
  - The systematic negative bias during winter is now reduced.
  - The diurnal cycle seems to be more realistic.
  - But: a positive bias occurs during night and summer.
• New SSO scheme (currently under examination)
User-oriented verification step by step
4. The proof of the effects: New diagnosis of gusts, gusts > 12 m/s
[Figure: gust verification of the experiments; Exp. 6278 (COSMO-EU), operational run, Exp. 6301 (COSMO-DE); panels: ETS, FBI]
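Both scores in this figure derive from a 2x2 contingency table for the event "gust above threshold". The sketch below uses the standard definitions of the frequency bias index (FBI) and the equitable threat score (ETS); the function names and the sample gust values are hypothetical and not taken from the experiments.

```python
def contingency_table(fcst, obs, threshold):
    """Count hits, false alarms, misses and correct negatives for value > threshold."""
    hits = sum(1 for f, o in zip(fcst, obs) if f > threshold and o > threshold)
    false_alarms = sum(1 for f, o in zip(fcst, obs) if f > threshold and o <= threshold)
    misses = sum(1 for f, o in zip(fcst, obs) if f <= threshold and o > threshold)
    correct_negatives = sum(1 for f, o in zip(fcst, obs) if f <= threshold and o <= threshold)
    return hits, false_alarms, misses, correct_negatives

def fbi(hits, false_alarms, misses):
    """Frequency bias index: forecast event count / observed event count."""
    return (hits + false_alarms) / (hits + misses)

def ets(hits, false_alarms, misses, correct_negatives):
    """Equitable threat score: threat score corrected for hits expected by chance."""
    n = hits + false_alarms + misses + correct_negatives
    hits_random = (hits + false_alarms) * (hits + misses) / n
    return (hits - hits_random) / (hits + false_alarms + misses - hits_random)

# Hypothetical gust values in m/s.
fcst = [14.0, 13.0, 10.0, 8.0, 15.0, 9.0, 11.0, 16.0]
obs = [13.0, 11.0, 14.0, 7.0, 16.0, 8.0, 10.0, 12.0]
h, fa, m, cn = contingency_table(fcst, obs, threshold=12.0)
print(h, fa, m, cn)
print(fbi(h, fa, m), ets(h, fa, m, cn))
```

An FBI above 1 indicates that the event is forecast more often than it is observed (overestimation), below 1 the opposite; raising the threshold to 25 m/s gives the corresponding scores for extreme gusts.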
User-oriented verification step by step
4. The proof of the effects: New diagnosis of 2m temperature
[Figure: comparison of COSMO-EU with experiment 6343, 00 UTC, April/June 2007: RMSE, COSMO-EU area]
A basic law of model development: there are no gains without losses (maybe with some exceptions). Therefore, one has to look at both benefits and risks.
One of the known exceptions: the effect of an SSO scheme in COSMO-EU
[Figure panels: new SSO scheme, reference experiment]
User-oriented verification step by step
5. The risk: New diagnosis of gusts, gusts > 25 m/s
[Figure: gust verification of the experiments; Exp. 6278 (COSMO-EU), operational run, Exp. 6301 (COSMO-DE); panels: ETS, FBI]
User-oriented verification step by step
5. The risk: New diagnosis of gusts
[Figure: wind gusts, old vs. new, for 16.01.-17.03.08 over Switzerland; panels: old, new]
User-oriented verification step by step
5. The risk: New diagnosis of 2m temperature
[Figure: comparison of COSMO-EU with experiment 6343, 00 UTC, April/June 2007: BIAS, COSMO-EU area]
User-oriented verification step by step
6. The operational effect: New diagnosis of 2m temperature
Impact on the mean diurnal cycle for stations over Switzerland
[Figure panels: Summer 2007, Summer 2008]
The (well-known) errors of
- too strong a temperature increase in the morning
- maxima reached ~ 1.5-2 h too early
are removed with the new 2m temperature diagnostics (introduced operationally on 12.03.2008 at DWD and on 09.06.2008 at MeteoSwiss).
User-oriented verification step by step
6. The operational effect: New diagnosis of 2m temperature for stations over Germany
User-oriented verification step by step
7. The effect for administrators expressed in "The Score COSI"
User-oriented verification step by step
7. The effect for administrators expressed in "The Score COSI"
[Figure annotations: asr, prognostic cloud ice, prognostic precipitation, LME, V 3.19, V 3.22, T 2m]
User-oriented verification step by step
8. Questions
Are there any questions to WG5?
Question from WG5: What do users require of the verification process in order to make the process of model development as effective as possible?
Verification of surface weather elements: RMSE of 10 m wind speed, comparison of LM (COSMO-EU) and GME, 2002-2007