Fuzzy verification using the Fractions Skill Score
Marion Mittermaier and Nigel Roberts
Spatial verification methods intercomparison meeting, Boulder, 20.02.07
Verification approach
We want to know:
• How the forecast skill varies with neighbourhood size.
• The smallest neighbourhood size that can be used to give sufficiently accurate forecasts.
• Whether higher resolution provides more accurate forecasts on scales of interest (e.g. river catchments).
Compare forecast fractions with fractions from radar over different-sized neighbourhoods (squares for convenience) using gridded data.
Use rainfall accumulations to apply temporal smoothing.
Reference: Roberts and Lean, "Scale-selective verification of rainfall accumulations from high-resolution forecasts of convective events" (accepted in MWR, Feb 2007).
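The fraction calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `neighbourhood_fractions`, the use of `>=` for "threshold exceeded", and the treatment of points outside the domain as not exceeding the threshold are all assumptions made for the example.

```python
def neighbourhood_fractions(field, threshold, n):
    """Fraction of grid squares at or above `threshold` in the n x n square
    centred on each grid point (n must be odd).

    Assumption of this sketch: squares outside the domain count as
    not exceeding the threshold.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    rows, cols = len(field), len(field[0])
    half = n // 2
    # Convert the gridded accumulations to a binary exceedance field.
    binary = [[1 if value >= threshold else 0 for value in row] for row in field]
    fractions = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            count = 0
            for di in range(-half, half + 1):
                for dj in range(-half, half + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < rows and 0 <= jj < cols:
                        count += binary[ii][jj]
            out_row.append(count / (n * n))
        fractions.append(out_row)
    return fractions


# Hypothetical 3 x 3 radar accumulations (mm) with one wet square.
radar = [[0.0, 0.0, 0.0],
         [0.0, 4.0, 0.0],
         [0.0, 0.0, 0.0]]
grid_scale = neighbourhood_fractions(radar, threshold=1.0, n=1)
smoothed = neighbourhood_fractions(radar, threshold=1.0, n=3)
```

With `n=1` the result is just the binary exceedance field; larger `n` spreads each wet square over its neighbourhood, which is what allows a forecast and the radar to be compared at different scales.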
Schematic comparison of fractions
[Figure: observed and forecast panels. The threshold is exceeded where squares are blue.]
Skill score for fractions/probabilities - Fractions Skill Score (FSS)
• A score for comparing fractions with fractions
• Brier score for comparing fractions
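The score can be sketched as follows, assuming the definition published in the Roberts and Lean paper cited above: a Brier-type score (FBS) is the mean squared difference between forecast and observed fractions, and FSS = 1 - FBS / (mean squared forecast fraction + mean squared observed fraction). The function name `fss`, the nested-list inputs, and returning NaN when both fields are entirely zero are choices made for this illustration.

```python
import math


def fss(fcst_fractions, obs_fractions):
    """Fractions Skill Score for two equally shaped fields of fractions."""
    fcst = [p for row in fcst_fractions for p in row]
    obs = [p for row in obs_fractions for p in row]
    if len(fcst) != len(obs) or not fcst:
        raise ValueError("fields must be non-empty and the same shape")
    n_points = len(fcst)
    # Brier-type score for fractions: mean squared difference.
    fbs = sum((pf - po) ** 2 for pf, po in zip(fcst, obs)) / n_points
    # Largest FBS obtainable from these fractions (no overlap at all).
    fbs_worst = (sum(pf ** 2 for pf in fcst) + sum(po ** 2 for po in obs)) / n_points
    if fbs_worst == 0.0:
        return math.nan  # assumption: score undefined when nothing exceeds
    return 1.0 - fbs / fbs_worst


# Grid scale: the forecast puts the rain one square away from the radar.
score_grid = fss([[1.0, 0.0]], [[0.0, 1.0]])
# Same case after averaging both fields over a neighbourhood spanning both squares.
score_neighbourhood = fss([[0.5, 0.5]], [[0.5, 0.5]])
```

In this toy case the displaced forecast scores 0 at the grid scale and 1 once the neighbourhood is large enough to contain the displacement.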
Strengths and weaknesses
Strengths
• Measures skill on fair terms from the model's perspective.
• Gets round the double penalty problem by sampling around precipitation areas.
• Can be used to determine the scale over which a forecast system has sufficient skill.
• Is intuitive and can be directly related to the way forecasts are presented, i.e. by generating spatial probability forecasts.
• Is particularly useful for high-resolution precipitation forecasts, in which we expect the fine detail to be unpredictable.
• Can be used for single or composite events.
Weaknesses
1. The spatial skill signal may be swamped by the bias.
2. It is sensitive to small base rates at higher thresholds, i.e. it is threshold dependent (as is any method using thresholds!).
3. Like any score, it doesn't tell the whole story on its own.
Hourly accumulations
[Figure: four panels of hourly accumulations, with maxima of 94 mm (3.7 in), 78 mm (3.1 in), 98 mm (3.9 in) and 48 mm (1.9 in).]
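Physical thresholds
Increasing bias for higher thresholds.
[Figure: results for physical thresholds, with a legend in mm and in (inch values 0.04, 0.08, 0.16, 0.32, 0.64 and 1.28); scale marker ~60 mi.]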
Frequency thresholds
Top 25, 5 and 1% of the distribution (including zeros).
[Figure: results for frequency thresholds, with annotations "0.5-1 mm (0.02-0.04 in), representative of rain/no rain boundary" and "4-6 mm (0.16-0.24 in)"; scale marker ~60 mi.]
• >75% zeros in domain
• The top 1% are values of ~5 mm (0.2 in) or more (large range)
• 1% of pixels is ~3000 (still a lot)
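One way to turn a "top X%" frequency threshold into a physical value is sketched below. It is an illustration only: the function name `frequency_threshold`, the rounding rule, and the idea of computing the threshold separately for each field are assumptions, not details given on the slides.

```python
import math


def frequency_threshold(field, top_percent):
    """Value such that roughly the top `top_percent` % of all grid squares
    (zeros included) are at or above it."""
    values = sorted((v for row in field for v in row), reverse=True)
    # Number of squares that should sit at or above the threshold (at least one).
    k = max(1, math.floor(len(values) * top_percent / 100.0))
    return values[k - 1]


# Hypothetical 10 x 10 field holding the values 0..99.
field = [[10 * r + c for c in range(10)] for r in range(10)]
thresholds = {pct: frequency_threshold(field, pct) for pct in (25, 5, 1)}
```

If the same percentage is applied to the forecast and to the radar field separately, both binary fields contain about the same number of exceeding squares (ties aside), so the comparison is less affected by bias than one using a fixed physical threshold. When most of the domain is dry, as noted above, a large percentage such as 25% can return a threshold of zero.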
Hourly accumulations
[Figure: four panels of hourly accumulations, with maxima of 120 mm (4.7 in), 84 mm (3.3 in), 69 mm (2.7 in) and 74 mm (2.9 in).]
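Physical thresholds
[Figure: results for physical thresholds, with a legend in mm and in (inch values 0.04, 0.08, 0.16, 0.32, 0.64 and 1.28); scale marker ~60 mi.]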
Frequency thresholds
[Figure: results for frequency thresholds, with annotations "0.5-1 mm (0.02-0.04 in), representative of rain/no rain boundary" and "4-6 mm (0.16-0.24 in)"; scale marker ~60 mi.]
Issues
The following points have cropped up and are listed here as general issues or as issues specific to the FSS. At the very least they require a bit more thought and possibly some extra tests.
• Currently the FSS is computationally expensive (run time depends on domain size).
• Results may depend on domain size (a larger domain gives scope for larger spatial errors). Other spatial methods may suffer in the same way. (Do we know enough about this?)
• Independence issues (regarding adjacent pixels). This affects all spatial methods. (Should we be worried?)
• Impact of data sparseness(?) and domain edge effects. A bit of a grey area, but again it may apply more widely.
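On the computational cost: the slides do not say how the fractions are computed, so the following is not a description of the authors' implementation. It is a sketch of one generic technique, a summed-area table (cumulative sums), which makes the cost of each neighbourhood sum independent of the neighbourhood size. The function name `fractions_via_summed_area` and the convention that squares outside the domain contribute zero are assumptions.

```python
def fractions_via_summed_area(binary, n):
    """Neighbourhood fractions of a 0/1 field using a summed-area table.

    `n` is the (odd) neighbourhood width. Squares outside the domain
    contribute zero (an assumption of this sketch).
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    rows, cols = len(binary), len(binary[0])
    half = n // 2
    # sat[i][j] holds the sum of binary[0:i][0:j].
    sat = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        row_sum = 0
        for j in range(cols):
            row_sum += binary[i][j]
            sat[i + 1][j + 1] = sat[i][j + 1] + row_sum
    fractions = []
    for i in range(rows):
        r0, r1 = max(0, i - half), min(rows, i + half + 1)
        out_row = []
        for j in range(cols):
            c0, c1 = max(0, j - half), min(cols, j + half + 1)
            # Four table look-ups give the sum over the clipped window.
            total = sat[r1][c1] - sat[r0][c1] - sat[r1][c0] + sat[r0][c0]
            out_row.append(total / (n * n))
        fractions.append(out_row)
    return fractions


all_wet = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
wet_fractions = fractions_via_summed_area(all_wet, 3)
```

The table is built in one pass over the domain, so the total work grows with the number of grid squares rather than with the number of squares times the neighbourhood area. The corner values in the example also show the edge effect listed above: the fraction drops to 4/9 at a corner even though every square in the domain is wet.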