Spatial verification of NWP model fields Beth Ebert BMRC, Australia WRF Verification Toolkit Workshop, Boulder, 21-23 February 2007
New approaches are needed to quantitatively evaluate high resolution model output
It's not so easy! How can I score?
What modelers want
• Diagnostic information
• What scales are well represented by the model?
• How realistic are forecast features / structures?
• How realistic are distributions of intensities / values?
• What are the sources of error?
• How can I improve the model?
Spatial forecasts
Weather variables defined over spatial domains have coherent spatial structure and features (intrinsic spatial correlation).
Spatial verification techniques aim to:
• account for field spatial structure
• provide information on error in physical terms
• account for uncertainties in timing and location
Recent research in spatial verification
• Scale decomposition methods: measure scale-dependent error
• Fuzzy (neighborhood) verification methods: give credit to "close" forecasts
• Object-oriented methods: evaluate attributes of identifiable features
• Field verification: evaluate phase errors
Scale decomposition methods: scale-dependent error
Wavelet scale components, Briggs and Levine (1997)
[Figure: ECMWF analysis and 36-h forecast (CCM-2), 500 mb GZ, 9 Dec 1992, 12:00 UTC, N. America]
Intensity-scale verification technique, Casati et al. (2004)
Measures the skill as a function of the intensity and spatial scale of the error.
• Intensity: threshold (categorical approach)
• Scale: 2D wavelet decomposition of binary images
• For each threshold and scale: skill score associated with the MSE of binary images = Heidke Skill Score
[Figure: skill as a function of spatial scale (km) and threshold (mm/h); example at threshold = 1 mm/h, annotated "Intense storm displaced"]
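The intensity-scale idea can be sketched in a few lines. This is a simplified illustration and not the published implementation: it assumes a square domain whose side is a power of two, uses nested block averages as a stand-in for the 2D Haar wavelet decomposition, skips any recalibration of the forecast, and assumes the random-forecast MSE is split equally across scales. The function names (`block_mean`, `scale_components`, `intensity_scale_skill`) are hypothetical.

```python
import numpy as np

def block_mean(field, size):
    """Average the field over non-overlapping size x size blocks, kept on the original grid."""
    n = field.shape[0]
    coarse = field.reshape(n // size, size, n // size, size).mean(axis=(1, 3))
    return np.kron(coarse, np.ones((size, size)))

def scale_components(field):
    """Split a 2^L x 2^L field into L detail components plus the domain mean.

    Differences between successive block averages are mutually orthogonal,
    so the mean squared values of the components add up to that of the field.
    """
    n = field.shape[0]
    levels = int(round(np.log2(n)))
    field = field.astype(float)
    components, previous = [], field
    for level in range(1, levels + 1):
        smooth = block_mean(field, 2 ** level)
        components.append(previous - smooth)
        previous = smooth
    components.append(previous)  # largest scale: domain mean
    return components

def intensity_scale_skill(forecast, observed, threshold):
    """MSE and skill of the binary error at each scale for one threshold."""
    bf = (forecast > threshold).astype(float)
    bo = (observed > threshold).astype(float)
    comps = scale_components(bf - bo)
    mse_by_scale = np.array([np.mean(c ** 2) for c in comps])
    f, e = bf.mean(), bo.mean()
    # Expected MSE when binary forecasts and observations are independent.
    mse_random = f * (1 - e) + e * (1 - f)
    # Assumption: the random MSE is shared equally among the scales.
    skill = 1.0 - mse_by_scale / (mse_random / len(comps))
    return mse_by_scale, skill

# Hypothetical example: a rain area forecast four grid points too far east.
observed = np.zeros((16, 16)); observed[4:8, 4:8] = 5.0
forecast = np.zeros((16, 16)); forecast[4:8, 8:12] = 5.0
mse_by_scale, skill = intensity_scale_skill(forecast, observed, threshold=1.0)
```

Repeating the call over a range of thresholds would give a skill table in threshold and scale. The sketch does not guard against the case where neither field exceeds the threshold, in which `mse_random` is zero.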
Multiscale statistical properties, Harris et al. (2001)
Does a model produce the observed precipitation scale-dependent variability, i.e. does it look like real rain?
Compare multi-scale statistics for model and radar data:
• Power spectrum
• Structure function
• Moment scaling
Fuzzy (multi-scale) verification methods: give credit to "close" forecasts
"Fuzzy" verification methods
Why is it called "fuzzy"? Squint your eyes!
• Look in a space / time neighborhood around the point of interest
• Evaluate using categorical, continuous, probabilistic scores / methods
• Don't require an exact match between forecasts and observations: unpredictable scales, uncertainty in observations
[Figure: forecast and observation neighborhoods at times t - 1, t, t + 1; frequency distribution of forecast values]
"Fuzzy" verification methods
Treatment of forecast data within a window:
• Mean value (upscaling)
• Occurrence of event* somewhere in window
• Frequency of event in window (probability)
• Distribution of values within window
May apply to observations as well as forecasts (neighborhood observation-neighborhood forecast approach)
* Event defined here as a value exceeding a given threshold, for example, rain exceeding 1 mm/hr
Spatial multi-event contingency table, Atger (2001)
"High probability of some heavy rain near Sydney", not "62 mm of rain will fall in Sydney"
Forecasters mentally "calibrate" the deterministic forecast according to how close the forecast is to the place / time / magnitude of interest.
• Very close: high probability
• Not very close: low probability
Vary decision thresholds:
• magnitude (e.g. 1 mm/h to 20 mm/h)
• distance from point of interest (e.g. within 10 km, ..., within 100 km)
• timing (e.g. within 1 h, ..., within 12 h)
• anything else that may be important in interpreting the forecast
[Figure: ROC for Sydney; labels: single threshold, EPS]
Fractions skill score, Roberts (2005)
Compare forecast fractions with observed fractions (radar) in a probabilistic way over different sized neighbourhoods.
We want to know:
• How forecast skill varies with neighbourhood size
• The smallest neighbourhood size that can be used to give sufficiently accurate forecasts
• Whether higher resolution provides more accurate forecasts on scales of interest (e.g. river catchments)
[Figure: observed and forecast fractions]
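A minimal sketch of the fractions comparison, assuming the usual form of the score, FSS = 1 - mean((Pf - Po)^2) / (mean(Pf^2) + mean(Po^2)), where Pf and Po are the forecast and observed event fractions in a square neighbourhood. The function names, the zero padding at the domain edge, and the example fields are illustrative choices, not taken from the slides.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def fractions(binary, size):
    """Fraction of event grid points in a size x size window around each point (size odd)."""
    pad = size // 2
    # Assumption: no events outside the domain (zero padding).
    padded = np.pad(binary.astype(float), pad, mode="constant")
    return sliding_window_view(padded, (size, size)).mean(axis=(2, 3))

def fss(forecast, observed, threshold, size):
    """Fractions skill score for one threshold and one neighbourhood size."""
    pf = fractions(forecast > threshold, size)
    po = fractions(observed > threshold, size)
    mse = np.mean((pf - po) ** 2)
    reference = np.mean(pf ** 2) + np.mean(po ** 2)
    return 1.0 - mse / reference if reference > 0 else np.nan

# Hypothetical example: forecast rain area displaced four grid points east.
observed = np.zeros((40, 40)); observed[10:15, 10:15] = 5.0
forecast = np.zeros((40, 40)); forecast[10:15, 14:19] = 5.0

scores = {size: fss(forecast, observed, threshold=1.0, size=size)
          for size in (1, 3, 5, 9, 13)}
```

Scanning `scores` for the smallest neighbourhood size that reaches a chosen target value is one way to address the second question above; the target itself is a user decision.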
Fractions skill score, Roberts (2005)
Decision models
* NO-NF = neighborhood observation-neighborhood forecast, SO-NF = single observation-neighborhood forecast
Fuzzy verification framework
[Figure legend: good performance, poor performance]
Object-oriented methods: evaluate attributes of features
Entity-based approach (CRA), Ebert and McBride (2000)
• Define entities using threshold (Contiguous Rain Areas)
• Horizontally translate the forecast until a pattern matching criterion is met: minimum total squared error between forecast and observations, maximum correlation, or maximum overlap
• The displacement is the vector difference between the original and final locations of the forecast.
[Figure: observed and forecast entities]
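The translation step can be sketched as a brute-force search over integer grid shifts using the minimum squared error criterion. This is a simplified illustration under stated assumptions: it works on the whole array rather than on a bounding box around one CRA, it fills cells vacated by the shift with zero, and `shift_field`, `best_shift` and `max_shift` are hypothetical names.

```python
import numpy as np

def shift_field(field, dy, dx):
    """Translate a 2D field by (dy, dx) grid points, filling vacated cells with zero."""
    ny, nx = field.shape
    out = np.zeros_like(field)
    src = field[max(0, -dy):ny - max(0, dy), max(0, -dx):nx - max(0, dx)]
    out[max(0, dy):max(0, dy) + src.shape[0],
        max(0, dx):max(0, dx) + src.shape[1]] = src
    return out

def best_shift(forecast, observed, max_shift):
    """Return the (dy, dx) shift of the forecast that minimises total squared error."""
    best = (0, 0)
    best_err = np.sum((forecast - observed) ** 2)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            err = np.sum((shift_field(forecast, dy, dx) - observed) ** 2)
            if err < best_err:
                best, best_err = (dy, dx), err
    return best, best_err

# Hypothetical example: forecast rain area two rows south and three columns west of the observed one.
observed = np.zeros((30, 30)); observed[10:14, 10:14] = 5.0
forecast = np.zeros((30, 30)); forecast[12:16, 7:11] = 5.0

displacement, error = best_shift(forecast, observed, max_shift=6)
```

Swapping the squared-error line for a correlation or overlap measure (and maximising instead of minimising) would give the other two matching criteria listed above.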
CRA information
Gives information on:
• Location error
• RMSE and correlation before and after shift
• Attributes of forecast and observed entities
• Error components: displacement, volume, pattern
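The three error components can be illustrated with a short sketch. It assumes the decomposition commonly attributed to Ebert and McBride (2000): the displacement error is the drop in MSE achieved by the shift, the volume error is the squared difference between the mean shifted forecast and the mean observation, and the pattern error is what remains. The function name and the example fields are hypothetical.

```python
import numpy as np

def cra_error_components(forecast, shifted_forecast, observed):
    """Split the total MSE into displacement, volume and pattern parts."""
    mse_total = np.mean((forecast - observed) ** 2)
    mse_shifted = np.mean((shifted_forecast - observed) ** 2)
    mse_volume = (shifted_forecast.mean() - observed.mean()) ** 2
    return {
        "total": mse_total,
        "displacement": mse_total - mse_shifted,  # removed by the shift
        "volume": mse_volume,                     # mean bias after the shift
        "pattern": mse_shifted - mse_volume,      # remaining fine-scale error
    }

# Hypothetical example: forecast is displaced and too wet (6 mm against 4 mm observed).
observed = np.zeros((20, 20)); observed[5:9, 5:9] = 4.0
forecast = np.zeros((20, 20)); forecast[11:15, 11:15] = 6.0
shifted = np.zeros((20, 20)); shifted[5:9, 5:9] = 6.0  # forecast after the best-match shift

components = cra_error_components(forecast, shifted, observed)
```

In this example most of the total error is attributed to displacement, which matches the intuition that the forecast rain area has roughly the right shape but is in the wrong place.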
MODE*, Davis et al. (2006)
* Method for Object-based Diagnostic Evaluation
• Two parameters: convolution radius, threshold
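How the two parameters interact can be sketched as follows: smooth the field with a circular filter of the given radius, threshold the smoothed field to obtain an object mask, and keep the raw values inside the mask. This is an illustrative sketch under those assumptions, not the MODE code; `disk_kernel` and `convolve_threshold` are hypothetical names, and the separation of the mask into individual labelled objects is left out.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def disk_kernel(radius):
    """Normalised circular averaging kernel with the given radius in grid points."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    kernel = (x * x + y * y <= radius * radius).astype(float)
    return kernel / kernel.sum()

def convolve_threshold(field, radius, threshold):
    """Return (mask, objects): the thresholded smooth field and raw values inside it."""
    kernel = disk_kernel(radius)
    # Assumption: values outside the domain are treated as zero.
    padded = np.pad(field.astype(float), radius, mode="constant")
    windows = sliding_window_view(padded, kernel.shape)
    smoothed = np.einsum("ijkl,kl->ij", windows, kernel)
    mask = smoothed >= threshold
    return mask, np.where(mask, field, 0.0)

# Hypothetical example: one broad rain area and one isolated single-point spike.
field = np.zeros((30, 30))
field[5:13, 5:13] = 10.0
field[25, 25] = 10.0

mask, objects = convolve_threshold(field, radius=2, threshold=5.0)
```

In this example the broad area survives while the single-point spike is smoothed below the threshold, so a larger radius or higher threshold leaves fewer, larger objects.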
MODE object matching/merging
Compare attributes:
- centroid location
- intensity distribution
- area
- orientation
- etc.
When objects not matched:
- false alarms
- missed events
- rain volume
- etc.
[Figure: 24h forecast of 1h rainfall on 1 June 2005]
MODE methodology
Steps: Identification, Measure attributes, Merging, Matching, Comparison, Summarize
• Identification: convolution-threshold process
• Fuzzy logic approach: compare forecast and observed attributes; merge single objects into composite objects; compute interest values; identify matched pairs
• Summarize: accumulate and examine comparisons across many cases
Cluster analysis approach, Marzban and Sandgathe (2006)
• Goal: Assess the agreement between fields using clusters identified using agglomerative hierarchical cluster analysis (CA)
• Optimize clusters (and numbers of clusters) based on binary images (x-y optimization) or magnitude images (x-y-p optimization)
• Compute Euclidean distance between clusters in forecast and observed fields (in x-y and x-y-p space)
[Figure: MM5 precipitation forecasts, 8 clusters identified in x-y-p space]
Cluster analysis example
Error = average distance between matched clusters in x-y-p space
[Figure: Stage IV and COAMPS fields; log_e error]
Composite approach, Nachamkin (2004)
• Goal: Characterize distributions of errors from both a forecast and observation perspective
• Procedure:
• Identify events of interest in the forecasts
• Define a kernel and collect coordinated samples
• Compare forecast PDF to observed PDF
• Repeat process for observed events
[Figure: observation and forecast samples around the event center (x)]
Composite example
• Compare kernel grid-averaged values
[Figure: average rain (mm) given an event was predicted; average rain (mm) given an event was observed; forecast shaded, observations contoured]
Field verification: evaluate phase errors
Feature calibration and alignment (Hoffman et al., 1995; Nehrkorn et al., 2003)
Error decomposition:
e = X_f(r) - X_v(r)
where X_f(r) is the forecast, X_v(r) is the verifying analysis, and r is the position.
e = e_p + e_b + e_r, where
e_p = X_f(r) - X_d(r) (phase error)
e_b = X_d(r) - X_a(r) (local bias error)
e_r = X_a(r) - X_v(r) (residual error)
[Figure: original forecast X_f(r), 500 mb analysis X_v(r), forecast adjustment, adjusted forecast X_a(r), residual error e_r]
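The three components add up to the total error because the intermediate fields cancel in pairs. Written out as a worked step, with the subscripts read as on the slide (forecast f, verifying analysis v, adjusted forecast a) and with the assumption that X_d denotes the displaced forecast, which the slide does not define:

```latex
\begin{aligned}
e_p + e_b + e_r
  &= \bigl[X_f(r) - X_d(r)\bigr] + \bigl[X_d(r) - X_a(r)\bigr] + \bigl[X_a(r) - X_v(r)\bigr] \\
  &= X_f(r) - X_v(r) \\
  &= e
\end{aligned}
```

The decomposition is therefore exact at every position r, whatever displacement and amplitude adjustment are chosen; the choice only changes how the total is shared among the three terms.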
Forecast quality measure (FQM), Keil and Craig (2007)
• Combines distance measure and intensity difference measure
• Pyramidal image matching (optical flow) to get vector displacement field: e_distance
• Unmatched features are penalized for their intensity errors: e_intensity
• Forecast quality measure
[Figure: satellite, original model, morphed model]
Conclusions
• What method should you use for model verification? Depends what question(s) you would like to address
• Many spatial verification approaches:
• Scale decomposition: scale-dependent error
• Fuzzy (neighborhood): credit for "close" forecasts
• Object-oriented: attributes of features
• Field verification: phase error