Verifying Modelled Currents using Threshold Exceedance Approach

NPE - Cross-cutting research on verification techniques
Presentation Session Code: SCI-PS153.03
Verifying modelled currents using a threshold exceedance approach: An exploration of the Gerrity Skill Score
Dr Ray Mahdon

Table of Contents
• Introduction
• Data Source & Locations
• Differing Current Regimes
• Time Series, Continuous Statistics & Simple Categorical Metrics
• Neighbourhood Methods
• Bias Removal Questions
• Multi-Categorical Metric: Gerrity Skill Score & Ocean Currents
• Threshold Choices

Introduction
• Surface current forecasts are important for commercial or defence "weather windows"
  • e.g. current speed stays below 1 kt for 12 hours
  • e.g. current speed does not exceed 1 kt more than x times
• Good for site-specific & threshold-based analysis
• Some questions we are trying to answer:
  • Does the model capture extreme events or "weather windows"?
  • In which locations or at what time of year do the models perform best; is there a significant difference between regimes, times or areas?

Data Source & Locations
[Map: MyOcean - Puertos Del Estado, 26-56N, 19W-5E. Moored observation sites shown: Donostia, Matxitxako, 62025, 6201030, 62083, 62024, 61430, 61280, 61281, 61417, 62085, 61198. Current regimes labelled: shelf circulation; wind & tidal currents; eddies; general ocean circulation; slope current.]

Data, Time Series & Continuous Statistics
• Hourly frequency, Jan 2012 – Jun 2014 (30 months)
• Collocated model & in-situ moored observations of surface currents
• Continuous statistics are helpful to describe overall behaviour
  • e.g. q-q & histogram plots describe climatology
• Time series can show seasonal patterns or significant events
• These do not quantify the performance of a system when exceeding thresholds is of interest
• We focus on surface currents
  • validation is relatively sparse for this parameter
• → Categorical Metric Assessment
  • Simple 2x2 (binary) contingency table per chosen threshold
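The 2x2 contingency table per threshold can be sketched as below. This is a minimal illustration, not the code used for the study: the function name, the strict ">" exceedance convention and the sample speeds are all assumptions.

```python
from typing import Sequence, Tuple


def contingency_2x2(model: Sequence[float], obs: Sequence[float],
                    threshold: float) -> Tuple[int, int, int, int]:
    """Return (hits, misses, false_alarms, correct_rejections).

    An "event" is a current speed strictly above `threshold`
    (an assumed convention; the slides do not state which is used).
    """
    if len(model) != len(obs):
        raise ValueError("model and obs must be collocated, equal-length series")
    hits = misses = false_alarms = correct_rejections = 0
    for m, o in zip(model, obs):
        forecast_event = m > threshold
        observed_event = o > threshold
        if forecast_event and observed_event:
            hits += 1
        elif observed_event:
            misses += 1
        elif forecast_event:
            false_alarms += 1
        else:
            correct_rejections += 1
    return hits, misses, false_alarms, correct_rejections


# Made-up speeds, for illustration only.
model_speed = [0.05, 0.30, 0.40, 0.10, 0.20]
obs_speed = [0.10, 0.35, 0.20, 0.30, 0.05]
print(contingency_2x2(model_speed, obs_speed, threshold=0.25))  # (1, 1, 1, 2)
```

One table is built per chosen threshold, so a set of thresholds gives a set of independent binary assessments.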

Neighbourhood Sampling
• Spatial neighbourhoods: 1x1, 3x3, 5x5, ..., NxN
• Temporal neighbourhoods: T-1, T+0, T+1
• Time averaging & shifting
• Combinations of spatial & temporal neighbourhoods trialled
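Time averaging and shifting over a T-1, T+0, T+1 neighbourhood might look like the sketch below. The slides do not say how the neighbourhoods were applied, so both helpers (`centred_mean`, `shifted_pairs`) and their edge handling are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def centred_mean(series: Sequence[float], half_width: int = 1) -> List[float]:
    """Running mean over T-half_width .. T+half_width.

    Only full windows are returned, so the output is shorter than the
    input by 2 * half_width values.
    """
    width = 2 * half_width + 1
    return [sum(series[i:i + width]) / width
            for i in range(len(series) - width + 1)]


def shifted_pairs(model: Sequence[float], obs: Sequence[float],
                  lag: int) -> List[Tuple[float, float]]:
    """Pair the observation at time T with the model value at time T+lag."""
    return [(model[t + lag], obs[t])
            for t in range(len(obs))
            if 0 <= t + lag < len(model)]


print(centred_mean([1, 2, 3, 4, 5], half_width=1))  # [2.0, 3.0, 4.0]
print(shifted_pairs([10, 20, 30], [1, 2, 3], lag=1))  # [(20, 1), (30, 2)]
```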

Simple Categorical Metrics
[Figure: 2x2 contingency table (hits, misses, false alarms, correct rejections) with CSI (Critical Success Index) and ETS (Equitable Threat Score) results]
• Improvements from temporal averaging
• Hour-to-hour assessment is not good: where CSI suggests skill, ETS says the model is mostly correct by chance!
• CSI & ETS require unbiased input data
• Over what period should a tidally dominated field be normalised: 1 tidal cycle, spring-neap cycle, or astronomical cycle?
• How should negative currents be handled?
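The difference between CSI and ETS can be made concrete with their standard definitions. ETS subtracts the hits expected by chance, so it drops below CSI whenever random agreement is likely. The counts below are illustrative only, and the function names are assumptions.

```python
import math


def csi(hits: int, misses: int, false_alarms: int) -> float:
    """Critical Success Index: hits / (hits + misses + false alarms)."""
    denom = hits + misses + false_alarms
    return hits / denom if denom else math.nan


def ets(hits: int, misses: int, false_alarms: int,
        correct_rejections: int) -> float:
    """Equitable Threat Score: CSI with chance hits removed."""
    n = hits + misses + false_alarms + correct_rejections
    if n == 0:
        return math.nan
    # Hits expected if forecasts and observations were independent.
    hits_random = (hits + misses) * (hits + false_alarms) / n
    denom = hits + misses + false_alarms - hits_random
    return (hits - hits_random) / denom if denom else math.nan


# Illustrative counts: 19 hits, 6 misses, 16 false alarms, 272 correct rejections.
print(round(csi(19, 6, 16), 3))       # 0.463
print(round(ets(19, 6, 16, 272), 3))  # 0.424
```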

Multi-Categorical Metric Method: The Gerrity Skill Score

Gerrity* Skill Score (GSS)
• Refinement of binary categorical methods
• Does not depend on the forecast distribution
• Rewards/penalises for rare (extreme)/disparate events
  • does not reward conservative forecasting
• Large choice of threshold divisions
• Good observation (sample) climatology required
• Contingency table distribution leads to scoring matrix
• Equitable (i.e., random & constant forecasts score a value of 0)
[Worked example: contingency table × scoring matrix, GSS = 0.38]
* Gerrity, J.P. (1992), Monthly Weather Review, 120, 2709-2712.
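How the contingency table leads to the scoring matrix can be sketched with the standard Gerrity (1992) construction. The scoring weights are built from the cumulative observed category frequencies, and the score is the weighted sum of the joint frequencies. This is a minimal illustration under assumed names and an assumed layout (rows are forecast categories, columns are observed categories), not the code used for the study. The 2x2 counts are those of the example on the "GSS - Threshold Choices Cont." slide.

```python
from typing import List, Sequence


def gerrity_scoring_matrix(obs_probs: Sequence[float]) -> List[List[float]]:
    """Build the K x K Gerrity scoring matrix from observed category probabilities."""
    k = len(obs_probs)
    odds = []  # a_r = (1 - cumulative p) / cumulative p, for r = 1 .. K-1
    cum = 0.0
    for p in obs_probs[:-1]:
        cum += p
        if not 0.0 < cum < 1.0:
            raise ValueError("first and last observed categories must be populated")
        odds.append((1.0 - cum) / cum)
    b = 1.0 / (k - 1)
    s = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i, k):
            inv_sum = sum(1.0 / odds[r] for r in range(i))
            tail_sum = sum(odds[r] for r in range(j, k - 1))
            # Symmetric matrix; (j - i) penalises forecasts far from the observation.
            s[i][j] = s[j][i] = b * (inv_sum - (j - i) + tail_sum)
    return s


def gerrity_skill_score(table: Sequence[Sequence[int]]) -> float:
    """GSS for a K x K count table with table[forecast][observed]."""
    k = len(table)
    n = sum(sum(row) for row in table)
    obs_probs = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    s = gerrity_scoring_matrix(obs_probs)
    return sum(table[i][j] * s[i][j] for i in range(k) for j in range(k)) / n


counts = [[272, 6], [16, 19]]  # rows: forecast, columns: observed
print(round(gerrity_skill_score(counts), 2))  # 0.7
```

A constant forecast, for example `[[288, 25], [0, 0]]`, scores 0 with the same function, which is the equitability property listed above.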

GSS - Threshold Choices
• 1 year of rolling data per point, captured from 2½ years (365 × 24 = 8760 pts., a good climatology!)
• Skewed Thresholds = [0.10, 0.25, 0.45, 0.70]
• Equal Frequency Distribution = [20, 40, 60, 80] percentiles
• Variability in skill versus thresholds, neighbourhood & time
• Clues in events from time series & data captured
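The two threshold choices can be sketched as follows: equal-frequency thresholds come from the 20/40/60/80th percentiles of the observations, and either threshold set then bins model and observed speeds into a multi-category table. The function names, the upper-inclusive bin edges (C <= threshold) and the sample data are assumptions. `statistics.quantiles` uses its default interpolation, which may differ slightly from the percentile method used in the study.

```python
import bisect
import statistics
from typing import List, Sequence


def equal_frequency_thresholds(obs: Sequence[float],
                               n_categories: int = 5) -> List[float]:
    """Cut points that split the observations into equally populated bins."""
    return statistics.quantiles(obs, n=n_categories)


def category_index(speed: float, thresholds: Sequence[float]) -> int:
    """Index of the bin holding `speed`, with upper-inclusive edges."""
    return bisect.bisect_left(thresholds, speed)


def multi_category_table(model: Sequence[float], obs: Sequence[float],
                         thresholds: Sequence[float]) -> List[List[int]]:
    """K x K counts with table[forecast category][observed category]."""
    k = len(thresholds) + 1
    table = [[0] * k for _ in range(k)]
    for m, o in zip(model, obs):
        table[category_index(m, thresholds)][category_index(o, thresholds)] += 1
    return table


skewed = [0.10, 0.25, 0.45, 0.70]
print(category_index(0.25, skewed))  # 1, i.e. the 0.10 < C <= 0.25 bin

# Made-up observations 0.01, 0.02, ..., 0.99, for illustration only.
sample_obs = [i / 100 for i in range(1, 100)]
print(equal_frequency_thresholds(sample_obs))
```

Before scoring, it is worth checking that every observed category in the resulting table is populated; otherwise the multi-category test silently collapses to fewer categories.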

GSS - Threshold Choices Cont.
[Figure: Daily Max/Min Current Speed - 62024]
• Equal Frequency Distribution = [0.07, 0.12, 0.18, 0.25]
• Skewed Thresholds = [0.10, 0.25, 0.45, 0.70]
• Mean error = -0.03 m s⁻¹
• RMSE = 0.11 m s⁻¹

GSS - Threshold Choices Cont.
[Figure: Daily Max/Min Current Speed - 62024]
• Equal Frequency Distribution = [0.05, 0.10, 0.15, 0.20]
• Skewed Thresholds = [0.10, 0.25, 0.45, 0.70]
• Mean error = -0.03 m s⁻¹
• RMSE = 0.11 m s⁻¹

GSS - Threshold Choices Cont.
• 1 year's data captured from 2½ years (365 × 24 = 8760 pts., a good climatology)
• Equal Frequency Distribution = [0.07, 0.12, 0.18, 0.25]
• Regular Thresholds = [0.25, 0.5, 0.75, 1.0]
• CHECK YOUR ANALYSIS: multi-category test reduced to 2x2 in many cases!

Example reduced to 2x2 (contingency table × scoring matrix):

Contingency table (counts)
                    OBS C<=0.25    OBS 0.25<C<=0.5
FC C<=0.25              272               6
FC 0.25<C<=0.5           16              19

Scoring matrix
                    OBS C<=0.25    OBS 0.25<C<=0.5
FC C<=0.25              0.09           -1.00
FC 0.25<C<=0.5         -1.00           11.52

GSS = 0.7

Other trials & results
• Various spatial & temporal neighbourhoods
  • report similar results
• Preliminary results on other model systems show similar skill scores
  • Met Office FOAM-Shelf system
• Maximum skill versus neighbourhood size
• Other binning thresholds
  • the lack of a firm a priori binning remains a deficiency
• Decoupling tidal cycle & residual current from raw signal to highlight skill partitioning
  • Doodson sea surface height decoupler trialled
  • Separation of potentially non-parallel (orthogonal) fields not addressed


Conclusions
• Hourly frequency currents, Jan 2012 – Jun 2014 (30 months)
• Threshold-based assessment
• Continuous statistics are helpful to describe overall behaviour
• Time series can show seasonal patterns
• Neither quantifies spatially or temporally coordinated model/obs values
• → Categorical Metric Assessment
• Gerrity Skill Score has attractive attributes for rewards/penalties

Conclusions cont.
• Choice of thresholds is important
• Model CAN CAPTURE EXTREME EVENTS, but this is threshold dependent!
• Equal Frequency Distribution appears to be the fairest a priori choice
• Can be personalised to a particular regime or current distribution
• Time series are needed alongside Gerrity
• Missing data can skew results
• Similar locations/regimes appear to give broadly similar Gerrity Skill Scores
• Winter months tend to show better skill (more extreme events)
• Multi-category methods on surface ocean current speed are relatively new, so the expected level of skill is unknown

Future Work
• Now that the concept is established, apply it to forecast data
• Include other regional models which have a long-term observation record
• Bootstrapping the Gerrity Skill Score
  • Error estimation around each score
• Return to the bias removal issue
  • Scaled currents, rather than constant removal?
• Assess wind speed with the Gerrity Skill Score & compare to surface currents
  • Potentially highlights the efficiency of wind speed transmission to surface currents at the Ocean:Atmosphere boundary
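The proposed bootstrapping could be sketched as a percentile bootstrap over paired (forecast category, observed category) samples. Everything here is an assumption for illustration: the function names, the resampling scheme and the confidence level. Hourly currents are autocorrelated, so resampling individual hours as done below is likely to understate the uncertainty, and a block bootstrap would be a natural refinement.

```python
import random
from typing import List, Sequence, Tuple

Pair = Tuple[int, int]  # (forecast category, observed category)


def gss_from_pairs(pairs: Sequence[Pair], k: int) -> float:
    """Gerrity Skill Score for K categories from paired category indices."""
    n = len(pairs)
    table = [[0] * k for _ in range(k)]
    for fc, ob in pairs:
        table[fc][ob] += 1
    odds, cum = [], 0
    for j in range(k - 1):
        cum += sum(table[i][j] for i in range(k))
        if cum == 0 or cum == n:
            raise ValueError("degenerate observed distribution")
        odds.append((n - cum) / cum)
    total = 0.0
    for i in range(k):
        for j in range(k):
            lo, hi = min(i, j), max(i, j)
            weight = (sum(1.0 / odds[r] for r in range(lo)) - (hi - lo)
                      + sum(odds[r] for r in range(hi, k - 1))) / (k - 1)
            total += table[i][j] * weight
    return total / n


def bootstrap_gss(pairs: Sequence[Pair], k: int, n_boot: int = 1000,
                  alpha: float = 0.05, seed: int = 0) -> Tuple[float, float]:
    """Percentile confidence interval for GSS from resampled pairs."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(pairs)
    scores: List[float] = []
    for _ in range(n_boot):
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        try:
            scores.append(gss_from_pairs(sample, k))
        except ValueError:
            continue  # skip resamples with an empty end category
    scores.sort()
    lower = scores[int(alpha / 2 * len(scores))]
    upper = scores[min(len(scores) - 1, int((1 - alpha / 2) * len(scores)))]
    return lower, upper


# Pairs rebuilt from illustrative 2x2 counts [[272, 6], [16, 19]].
pairs = [(0, 0)] * 272 + [(0, 1)] * 6 + [(1, 0)] * 16 + [(1, 1)] * 19
print(round(gss_from_pairs(pairs, k=2), 2))  # 0.7
print(bootstrap_gss(pairs, k=2, n_boot=200))
```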

Acknowledgement • Thank you to MyOcean for funding towards this work

THANK YOU FOR YOUR ATTENTION
Any Questions (& answers)?
