
Verification of continuous variables

Presentation Transcript


  1. Good afternoon! नमस्कार! Guten Tag! Buenos dias! До́брый день! Qwertyuiop asdfghjkl! Bom dia! Bonjour! Please, verify!

  2. Verification of continuous variables. Martin Göber, Deutscher Wetterdienst (DWD), Hans-Ertel-Centre for Weather Research (HErZ). Acknowledgements: thanks to Barb Brown and Barbara Casati!

  3. Types of forecasts, observations • Continuous • Temperature • Rainfall amount • 500 hPa geopotential height • Categorical • Dichotomous • Rain vs. no rain • Thresholding of continuous variables • Strong wind vs. no strong wind • Often formulated as yes/no (contingency table cells: YY, NY, YN, NN) • Multi-category • Cloud amount category • Precipitation type. Unless it is meaningful to do so, forecasts should not be degraded to categorical form, because of the resulting loss of information.
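Thresholding a continuous variable into a dichotomous event, as described above, can be sketched in a few lines; the rainfall values and the 1.0 mm threshold here are hypothetical example numbers, not from the presentation:

```python
import numpy as np

# Hypothetical hourly rainfall amounts (mm).
rain_mm = np.array([0.0, 0.2, 1.5, 0.0, 4.0, 0.0, 12.3, 0.6])

# Threshold the continuous variable to get a dichotomous (yes/no) event.
threshold = 1.0  # "rain" if the amount reaches 1.0 mm
rain_event = rain_mm >= threshold

print(int(rain_event.sum()), "of", rain_event.size, "hours count as rain")  # 3 of 8
```

Note that the boolean array keeps only the event/no-event information; the amounts themselves (0.2 mm vs. 12.3 mm above the threshold) are lost, which is exactly the loss of information the slide warns about.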

  4. The joint probability distribution p(f,o) of observation o and forecast f: (961 classes) × (100 stations) × (2 days) × (5 kinds of forecasts) ≈ 1 million numbers to analyse, the "curse of dimensionality". Boiling this down to a few numbers means a (little?) loss of information. Example: joint frequency distribution of road surface temperature, winter 2011.

  5. Continuous verification: normally distributed errors

  6. Normally distributed errors. If errors are normally distributed, then two parameters are enough to answer all questions approximately. If the systematic error ("bias") is small, then √MSE ≈ standard deviation of the errors.

  7. Bias • mean error ME, ideally = 0 • "systematic error": on average, something goes wrong in one direction, e.g. model physics wrongly tuned, missing processes, wrong interpretation of guidance • tells us nothing about the pairwise match of forecasts and observations • large in the past, rather small nowadays on average, but possibly still large, e.g. for certain weather types • misleading for multi-modal error distributions → take the mean absolute error MAE
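The two measures above can be computed directly from paired forecasts and observations; the five temperature pairs below are hypothetical example values:

```python
import numpy as np

# Hypothetical forecast/observation pairs (temperature, °C).
f = np.array([21.0, 18.5, 25.0, 14.0, 19.0])
o = np.array([20.0, 19.5, 23.0, 15.0, 18.0])

err = f - o                 # errors: +1, -1, +2, -1, +1
me = err.mean()             # mean error ("bias")
mae = np.abs(err).mean()    # mean absolute error

print(f"ME  = {me:+.2f} K")   # ME  = +0.40 K
print(f"MAE = {mae:.2f} K")   # MAE = 1.20 K
```

The ME of +0.4 K hides the fact that individual forecasts miss by up to 2 K, which is what the MAE reveals.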

  8. ME and MAE. Q: If the ME is similar to the MAE, a bias correction is safe; if MAE >> ME, a bias correction is dangerous. Why? A: MAE >> ME means that positive and negative errors cancel out in the bias evaluation, so subtracting the ME would not correct the errors that individual forecasts actually make.
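The cancellation effect in the answer above can be demonstrated with synthetic data (the distributions and seed are illustrative assumptions, not from the presentation): an unbiased but noisy forecast has ME near zero while the MAE stays large.

```python
import numpy as np

rng = np.random.default_rng(0)
o = rng.normal(10.0, 3.0, size=10_000)       # synthetic observations
f = o + rng.normal(0.0, 2.0, size=o.size)    # unbiased but noisy forecasts

err = f - o
me = err.mean()             # near 0: positive and negative errors cancel
mae = np.abs(err).mean()    # stays large despite the tiny bias

print(f"ME  = {me:+.3f}")
print(f"MAE = {mae:.3f}")
```

Subtracting this near-zero ME from the forecasts would change almost nothing, even though individual forecasts are often wrong by well over a degree.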

  9. RMSE • mean squared error MSE, or root mean square error RMSE • accuracy measure: measures the distance between individual forecasts and observations • ideally RMSE = 0 • "It might be useful on average, but when it's really important it's not good!" Not necessarily; note the quadratic penalty: • one five-degree error is penalised like 25 one-degree errors • one ten-degree error is penalised like 100 one-degree errors
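The quadratic penalty claimed in the last two bullets is easy to verify numerically: a single five-degree error contributes the same sum of squares as twenty-five one-degree errors.

```python
import numpy as np

one_big = np.array([5.0])    # a single 5-degree error
many_small = np.ones(25)     # twenty-five 1-degree errors

sse_big = (one_big ** 2).sum()      # 5^2 = 25
sse_small = (many_small ** 2).sum() # 25 * 1^2 = 25

print(sse_big, sse_small)  # 25.0 25.0
```

This is why the MSE rewards forecasts that avoid large outlier errors, even at the cost of many small ones.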

  10. Interpretation of RMSE. If errors are normally distributed and the bias is small, the RMSE is the standard deviation of the errors, so roughly two thirds of all errors lie within ±RMSE.

  11. Decomposition of the MSE. The bias can be subtracted! Consequence: smooth forecasts verify better.
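One part of the decomposition, the fact that the bias can be split off, can be checked numerically: the MSE always equals the squared mean error plus the (population) variance of the errors. The synthetic data below is an illustrative assumption, not from the presentation.

```python
import numpy as np

rng = np.random.default_rng(1)
o = rng.normal(0.0, 2.0, size=1_000)
f = o + 0.5 + rng.normal(0.0, 1.0, size=o.size)  # biased, noisy forecast

err = f - o
mse = (err ** 2).mean()
me = err.mean()

# MSE = bias^2 + variance of the error (an algebraic identity).
print(np.isclose(mse, me ** 2 + err.var()))  # True
```

Subtracting the bias therefore removes exactly the me² term from the MSE, which is what "the bias can be subtracted" means.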

  12. Correlation coefficient • Measures the level of “association” between the forecasts and observations • Related to the “phase error” of the harmonic decomposition of the forecast • Is familiar and relatively easy to interpret • Has a nonparametric analog based on ranks

  13. Correlation coefficient

  14. Correlation coefficient

  15. Correlation coefficient. What is wrong with the correlation coefficient as a measure of performance? It does not take biases or amplitude errors into account, and so can inflate the performance estimate. It is more appropriate as a measure of "potential" performance.
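The blindness to bias and amplitude noted above follows from the correlation coefficient being invariant under positive linear transformations of the forecast; the synthetic forecast/observation data below is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(2)
o = rng.normal(15.0, 5.0, size=500)           # observations
f = o + rng.normal(0.0, 2.0, size=o.size)     # a reasonable forecast

r_plain = np.corrcoef(f, o)[0, 1]
# Same forecast, but badly biased (+40) and with damped amplitude (x0.1):
r_warped = np.corrcoef(0.1 * f + 40.0, o)[0, 1]

print(np.isclose(r_plain, r_warped))  # True: r ignores bias and amplitude
```

A forecast that is always 40 degrees too warm can therefore correlate perfectly with the observations, which is why r measures only "potential" performance.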

  16. Comparative verification • Generic skill score definition: SS = (M − Mref) / (Mperf − Mref), where M is the verification measure for the forecasts, Mref is the measure for the reference forecasts, and Mperf is the measure for perfect forecasts • Measures the percentage improvement of the forecast over the reference • Positively oriented (larger is better) • Choice of the standard matters (a lot!)
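The generic definition above is a one-liner in code; the MAE values plugged in below (forecast 1.2 K, climatology reference 2.0 K) are hypothetical example numbers:

```python
def skill_score(m, m_ref, m_perf=0.0):
    """Generic skill score: (M - M_ref) / (M_perf - M_ref).

    For negatively oriented measures such as MAE or MSE, a perfect
    forecast has M_perf = 0 and the formula reduces to 1 - M / M_ref.
    """
    return (m - m_ref) / (m_perf - m_ref)

# Hypothetical MAEs: forecast 1.2 K vs. climatology reference 2.0 K.
ss = skill_score(1.2, 2.0)
print(f"{ss:.0%} improvement over the reference")  # 40% improvement over the reference
```

Swapping in a tougher reference (say, a competitor model with MAE 1.4 K) shrinks the score, which is the point of "choice of the standard matters".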

  17. Comparative verification Skill scores • A skill score is a measure of relative performance • Ex: How much more accurate are my temperature predictions than climatology? How much more accurate are they than the model’s temperature predictions? • Provides a comparison to a standard • Standard of comparison can be • Chance (easy?) • Long-term climatology (more difficult) • Sample climatology (difficult) • Competitor model / forecast (most difficult) • Persistence (hard or easy)

  18. Skill scores. General skill score definition: SS = (M − Mref) / (Mperf − Mref). Reduction of error variance, also often simply called the "skill score" SS: SS = 1 − MSE / MSEref.

  19. Accuracy vs. skill. (Figure: 24 h mean wind forecast; reduced variance, MSE(persistence) vs. MSE(forecast).) A forecast can have higher skill and yet lower accuracy.

  20. "Hits" and RMSE. "Hits" = percentage of "acceptable" forecast errors (e.g. ICAO criteria: wind direction dd ±30°, wind speed ff ±5 kt up to 25 kt, etc.). (Figure: "hits" in % vs. forecast error in K.)

  21. "Hits" and RMSE. Reduction of the error "mass" through reduction of large errors. (Figure: "hits" in % vs. forecast error in K.)

  22. Long-term trends. Every 10 years, one day better. (Figure: "hit rate" (errors within ±2 K) in %, maximum temperature, Potsdam.)

  23. Linear Error in Probability Space (LEPS) • LEPS is an MAE evaluated using the cumulative frequencies of the observations • Errors in the tails of the distribution are penalised less than errors in the centre of the distribution
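A minimal sketch of LEPS using an empirical climatological CDF; the `leps` helper, the synthetic climatology, and the chosen temperatures are all illustrative assumptions, not from the presentation:

```python
import numpy as np

def leps(f, o, climatology):
    """LEPS sketch: MAE evaluated in probability space.

    Errors are measured as |F(f) - F(o)|, where F is the empirical CDF
    of the observed climatology, so tail errors are penalised less.
    """
    clim = np.sort(np.asarray(climatology))
    cdf = lambda x: np.searchsorted(clim, x, side="right") / clim.size
    return np.abs(cdf(np.asarray(f)) - cdf(np.asarray(o))).mean()

rng = np.random.default_rng(3)
clim_sample = rng.normal(15.0, 4.0, size=10_000)  # synthetic temperature climatology

# The same 2 K raw error, once near the climatological centre, once in the tail:
centre = leps([15.0], [17.0], clim_sample)
tail = leps([25.0], [27.0], clim_sample)
print(centre > tail)  # True: the tail error costs less in probability space
```

Because the climatological CDF is steep in the centre and flat in the tails, the identical 2 K miss spans a much larger probability interval near the mean than far out in the distribution.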

  24. Summary • Verification is a high-dimensional problem → it can be boiled down to a lower-dimensional one under certain assumptions or interests • If forecast errors are normally distributed, continuous verification can rely on only a few numbers, such as bias and RMSE • Accuracy and skill are different things
