Toward Short-Range Ensemble Prediction of Mesoscale Forecast Skill
Eric P. Grimit, University of Washington
Supported by: NWS Western Region/UCAR-COMET Student Career Experience Program (SCEP); DoD Multi-Disciplinary University Research Initiative (MURI)
NSSL/SPC Spring Program Seminar
Forecasting Forecast Skill • Like any other scientific prediction or measurement, weather forecasts should be accompanied by error bounds, or a statement of uncertainty. • Atmospheric predictability changes from day to day and depends on: • Atmospheric flow configuration • Magnitude/orientation of initial-state errors • Sensitivity of the flow to the initial-state errors • Numerical model deficiencies • Sensitivity of the flow to model errors Example: T2m = 3 °C ± 2 °C → P(T2m < 0 °C) = 6.7 %
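The 6.7% figure follows from treating the forecast as a Gaussian distribution with mean 3 °C and standard deviation 2 °C; a one-line check (mine, not from the talk):

```python
# A small check (not from the talk) of the 6.7% figure: with a Gaussian
# forecast distribution T2m ~ N(3 degC, (2 degC)^2), the freezing
# probability is the CDF evaluated at 0 degC.
from scipy.stats import norm

mean_degc, std_degc = 3.0, 2.0
p_freeze = norm.cdf(0.0, loc=mean_degc, scale=std_degc)
print(f"P(T2m < 0 degC) = {p_freeze:.1%}")  # -> 6.7%
```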
[Forecast graphic: FRI — Showers, Low 46 °F, High 54 °F; SAT — Showers, Low 47 °F, High 57 °F] Forecasting Forecast Skill • Operational forecasters need this crucial information to know how much to trust model forecast guidance • Current uncertainty knowledge is partial and largely subjective • End users could greatly benefit from knowing the expected forecast reliability • Allows sophisticated users to make optimal decisions in the face of uncertainty (economic cost-loss or utility) • Gives common users of weather forecasts a confidence index Take protective action if: P(T2m < 0 °C) > cost/loss
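A hedged sketch of the cost-loss decision rule quoted above; the dollar figures are invented for illustration:

```python
# Hedged sketch of the cost-loss decision rule; dollar figures are invented.
def take_protective_action(p_event: float, cost: float, loss: float) -> bool:
    """Optimal action for a simple cost-loss user: protect if P > cost/loss."""
    return p_event > cost / loss

# Protecting costs $10; an unprotected freeze loses $100 (cost/loss = 0.1).
print(take_protective_action(p_event=0.067, cost=10.0, loss=100.0))  # False
```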
Probabilistic Weather Forecasts • One approach to estimating forecast uncertainty is to use a collection of different forecasts—an ensemble. • Ensemble weather forecasting diagnoses the sensitivity of the predicted flow to initial-state and model errors—provided they are well-sampled.
Probabilistic Weather Forecasts • Agreement/disagreement among ensemble member forecasts provides information about forecast certainty/uncertainty. • Agreement → better forecast reliability; disagreement → worse forecast reliability • → Use ensemble forecast variance as a predictor of forecast skill
Observed Skill Predictions: A Disappointment • NCEP SREF precipitation [c.f. Hamill and Colucci 1998], tropical cyclone tracks [c.f. Goerss 2000], and SAMEX ’98 SREFs [c.f. Hou et al. 2001]: highly scattered spread-skill relationships, thus low correlations • Northwest MM5 SREF 10-m wind direction: a unique 5-member short-range ensemble developed in 2000 showed promise, with spread-skill correlations near 0.6, higher for cases with extreme spread [c.f. Grimit and Mass 2002]
Temporal (Lagged) Ensemble Related to dprog/dt and lagged-average forecasting (LAF) [Hoffman and Kalnay 1983; Reed et al. 1998; Palmer and Tibaldi 1988; Roebber 1990; Brundage et al. 2001; Hamill 2003] • Palmer and Tibaldi (1988) and Roebber (1990) found lagged forecast spread to be moderately correlated with lagged forecast skill • Roebber (1990) did not look for correlation between lagged forecast spread and current forecast skill Is temporal ensemble spread a useful second predictor of the current forecast skill?
Estimating Forecast Skill: Verification • A choice must be made whether to compare forecasts and verifications in grid-box space or in observation space • Representing data at a scale other than its own inherent scale introduces an error • Verification schemes introduce their own error, potentially masking true forecast error, especially for: • Fields with large small-scale variability • Low observation density (grid-based) [From idealized verification experiments (Grimit et al. 200x)]
Estimating Forecast Skill: Verification • User-dependency • Scoring metric • Deterministic or probabilistic? • Categorized? • Are timing errors important? [c.f. Mass et al. 2002]
Limitations to Forecast Skill Prediction • Definition of forecast skill • Traditional spread approach is inherently deterministic • A fully probabilistic approach requires an accurately forecast PDF • In practice, the PDF is not well forecast • Under-dispersive ensemble forecasts • Under-sampling (distribution tails not well captured) • Unaccounted-for sources of uncertainty • Sub-grid-scale processes • Systematic model biases • Need to develop superior ensemble generation and/or statistical post-processing to accurately depict the true forecast PDF • Until then, we must find ways to extract flow-dependent uncertainty information from current (suboptimal) ensembles
Project Goal • Develop a short-range forecast skill prediction system using an imperfect mesoscale ensemble (short-range = 0–48 h; imperfect = suboptimal, cannot correctly forecast the PDF) • Estimate the upper bound of forecast skill predictability • Assess the sensitivity of the relationship to different metrics • Use the existing UW MM5 SREF system, a unique resource • Initialized using an international collection of large-scale analyses • Spatial resolution (12-km grid spacing) • Include spatially and temporally dependent bias correction • Use temporal ensemble spread as a secondary predictor of forecast skill, if viable • Attempt a new method of probabilistic forecast skill prediction
Simple Stochastic Spread-Skill Model an extension of the Houtekamer (1993) model
Spread-Skill Correlation Theory (Houtekamer 1993) The Original Simple Stochastic Model: s = ensemble standard deviation (spread); β = temporal spread variability; E = ensemble forecast error (skill)

$\rho^2(s,|E|) = \dfrac{\frac{2}{\pi}\left(1 - e^{-\beta^2}\right)}{1 - \frac{2}{\pi}\,e^{-\beta^2}}, \qquad \beta = \operatorname{std}(\ln s)$

• Spread-skill correlation depends on the time variation of spread • For constant spread day-to-day (β = 0), ρ = 0 • For large spread variability (β → ∞), ρ → √(2/π) ≈ 0.8 • Assumes that E is the ensemble mean error and an infinite ensemble
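As a sanity check on the formula as reconstructed above, a small Python sketch (mine, not from the talk) evaluates the H93 correlation for several illustrative values of β:

```python
# Numerical check of the reconstructed H93 formula; beta values illustrative.
import numpy as np

def h93_spread_skill_corr(beta: float) -> float:
    """Theoretical spread-skill correlation for beta = std(ln s)."""
    num = (2.0 / np.pi) * (1.0 - np.exp(-beta**2))
    den = 1.0 - (2.0 / np.pi) * np.exp(-beta**2)
    return float(np.sqrt(num / den))

for beta in (0.0, 0.25, 0.5, 1.0, 3.0):
    print(f"beta = {beta:4.2f} -> rho = {h93_spread_skill_corr(beta):.3f}")
# beta = 0 gives rho = 0; large beta approaches sqrt(2/pi) ~ 0.80
```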
A Modified Simple Stochastic Model • Stochastically simulated ensemble forecasts at a single grid point with 50,000 realizations (cases) • Assume perfect ensemble forecasts • Draw today's "forecast uncertainty" from a log-normal distribution (Houtekamer 1993 model): $\ln(s) \sim N(\ln \sigma_f, \beta^2)$ • Create synthetic ensemble forecasts by drawing M values from the "true" distribution (perfect ensemble): $F_i \sim N(Z, s^2)$, i = 1, 2, …, M • Draw the verifying observation from the same "true" distribution: $V \sim N(Z, s^2)$ • Calculate ensemble spread and skill using varying metrics • Assumed Gaussian statistics • Varied: temporal spread variability (β), finite ensemble size (M), and the spread and skill metrics
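A minimal Monte Carlo sketch of the simulation procedure just described, under the slide's stated assumptions (Gaussian statistics, perfect ensemble, single grid point); σf = 1 and β = 0.5 are illustrative choices, not the study's settings:

```python
# Monte Carlo sketch of the modified stochastic model described above.
import numpy as np

rng = np.random.default_rng(0)
n_cases, M = 50_000, 8                # realizations and ensemble size
sigma_f, beta = 1.0, 0.5              # climatological spread, its variability

s = np.exp(rng.normal(np.log(sigma_f), beta, n_cases))  # today's true spread
F = rng.normal(0.0, s[:, None], size=(n_cases, M))      # members, truth Z = 0
V = rng.normal(0.0, s)                                  # verifying observation

spread = F.std(axis=1, ddof=1)             # STD (ensemble spread)
aem = np.abs(F.mean(axis=1) - V)           # AEM (abs error of ensemble mean)
print("STD-AEM correlation:", np.corrcoef(spread, aem)[0, 1])
```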
Simple Model Results – Traditional Spread-Skill • STD-AEM correlation increases with spread variability and ensemble size. • STD-AEM correlations asymptote to the H93 values. STD = Standard Deviation AEM = Absolute Error of the ensemble Mean
What Measure of Skill? • STD is a better predictor of the average ensemble member error than of the ensemble mean error: AEM = $|\overline{E}|$, MAE = $\overline{|E|}$ • Different measures of ensemble variation may be required to predict other measures of skill. Spread: STD = Standard Deviation. Error: RMS = Root-Mean-Square error; MAE = Mean Absolute Error; AEM = Absolute Error of the ensemble Mean; AEC = Absolute Error of a Control
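For concreteness, a hedged sketch of how the spread and error metrics listed above could be computed for a single case; the function and argument names are mine:

```python
# Sketch of the spread/error metrics for one case; names are illustrative.
import numpy as np

def spread_skill_metrics(F: np.ndarray, v: float, control: float) -> dict:
    """F: ensemble member forecasts (length M); v: verification."""
    e = F - v                                  # individual member errors
    return {
        "STD": float(F.std(ddof=1)),           # ensemble standard deviation
        "RMS": float(np.sqrt(np.mean(e**2))),  # root-mean-square member error
        "MAE": float(np.mean(np.abs(e))),      # mean absolute member error
        "AEM": float(abs(F.mean() - v)),       # abs error of the ensemble mean
        "AEC": float(abs(control - v)),        # abs error of a control forecast
    }
```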
Linear? [Figure: scatterplots comparing the STD-AEM correlation and the STD-RMS correlation]
Mesoscale Ensemble Forecast and Verification Data Two suboptimal mesoscale short-range ensembles designed for the U.S. Pacific Northwest
The Challenges for Mesoscale SREF • Lagging development of SREF systems compared to large-scale, medium-range ensemble prediction systems. • Limited-area domain (necessity for boundary conditions) may constrain mesoscale ensemble spread. [Errico and Baumhefner 1987; Paegle et al. 1997; Du and Tracton 1999; Nutter 2003] • Error growth due to model deficiency plays a significant role in the short range. [Brooks and Doswell 1993; Stensrud et al. 2000; Orrell et al. 2001] • Error growth in the short range (< 24 h) is predominantly large-scale and linear. [Gilmour et al. 2001] • IC selection methodologies from medium-range ensembles do not transfer well to short-range ensembles • A suboptimal, but highly effective, approach was adopted in 2000: use multiple analyses/forecasts from major operational weather centers
Grid Sources for Multi-Analysis Approach (resolutions approximate at 45°N; computational / distributed; objective analysis)
• avn — Global Forecast System (GFS), National Centers for Environmental Prediction; spectral; T254/L64 (~55 km) / 1.0°/L14 (~80 km); SSI/3D-Var
• cmcg — Global Environmental Multi-scale (GEM), Canadian Meteorological Centre; spectral; T199/L28 (~100 km) / 1.25°/L11 (~100 km); 3D-Var
• eta — Eta limited-area mesoscale model, National Centers for Environmental Prediction; finite difference; 12 km/L45 / 90 km/L37; SSI/3D-Var
• gasp — Global AnalysiS and Prediction model, Australian Bureau of Meteorology; spectral; T239/L29 (~60 km) / 1.0°/L11 (~80 km); 3D-Var
• jma — Global Spectral Model (GSM), Japan Meteorological Agency; spectral; T106/L21 (~135 km) / 1.25°/L13 (~100 km); OI
• ngps — Navy Operational Global Atmos. Pred. System, Fleet Numerical Meteorological & Oceanographic Center; spectral; T239/L30 (~60 km) / 1.0°/L14 (~80 km); OI
• tcwb — Global Forecast System, Taiwan Central Weather Bureau; spectral; T79/L18 (~180 km) / 1.0°/L11 (~80 km); OI
• ukmo — Unified Model, United Kingdom Meteorological Office; finite difference; 5/6° × 5/9°/L30 (~60 km) / same/L12; 3D-Var
UW's Ensemble of Ensembles
• ACME — 17 members; SMMA; ICs: 8 independent analyses, 1 centroid, 8 mirrors; "standard" MM5; 00Z cycle; 36-km, 12-km domains (homegrown)
• ACMEcore — 8 members; SMMA; ICs: independent analyses; "standard" MM5; 00Z; 36-km, 12-km (homegrown)
• ACMEcore+ — 8 members; PMMA; same ICs; 8 MM5 variations; 00Z; 36-km, 12-km (homegrown)
• PME — 8 members; MMMA; same ICs; 8 "native" large-scale models; 00Z, 12Z; 36-km (imported)
SMMA: Single-Model Multi-Analysis; PMMA: Perturbed-Model Multi-Analysis; MMMA: Multi-Model Multi-Analysis; ACME: Analysis-Centroid Mirroring Ensemble; PME: Poor Man's Ensemble; MM5: PSU/NCAR Mesoscale Modeling System Version 5
Multi-Analysis, Fixed Physics: ACMEcore • Single limited-area mesoscale modeling system (MM5) • 2-day (48-h) forecasts at 0000 UTC in real time since Jan. 2000 • Initial condition selection: large-scale, multi-analysis [from different operational centers] • Lateral boundary conditions: prescribed by the corresponding large-scale forecasts [Figure: configurations of the MM5 short-range ensemble grid domains. (a) Outer 151 × 127 domain with 36-km horizontal grid spacing. (b) Inner 103 × 100 domain with 12-km horizontal grid spacing.]
Multi-Analysis, Mixed Physics: ACMEcore+ see Eckel (2003) for further details
Temporal (Lagged) Ensemble Using Lagged-Centroid Forecasts [Schematic: centroid (c) of the member analyses (M) in the analysis region, carried through the 48-h forecast region] Advantages: • Run-to-run consistency of the best deterministic forecast estimate of "truth" (without any weighting) • Less sensitive to a single member's temporal variability • Yields mesoscale spread [equal weighting of lagged forecasts]
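A rough sketch (array layout and names are my assumptions, not the study's code) of how a lagged-centroid temporal ensemble could be assembled:

```python
# The centroid of the member analyses is carried forward each cycle, and
# spread is taken across the lagged centroid forecasts valid at one time.
import numpy as np

def centroid(members: np.ndarray) -> np.ndarray:
    """Equally weighted mean over members; members: (n_members, ny, nx)."""
    return members.mean(axis=0)

def temporal_spread(lagged_centroids: np.ndarray) -> np.ndarray:
    """STD across lagged centroid forecasts; input: (n_lags, ny, nx)."""
    return lagged_centroids.std(axis=0, ddof=1)
```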
Verification Data: Surface Observations • Network of surface observations from many different agencies • Observations are preferentially located at lower elevations and near urban centers • Focus in this study is on 10-m wind direction • More extensive coverage and a greater number of reporting sites than SLP • Greatly influenced by regional orography, the mesoscale pressure pattern, and synoptic-scale changes • Systematic forecast biases in the other near-surface variables can dominate stochastic errors • Will also use temperature and wind speed
Key Questions • Is there a significant spread-skill relationship in the MM5 ensemble predictions? Can it be used to form a forecast skill prediction system? • Is the spread of a temporal ensemble a useful second predictor of forecast skill? • Is there a significant difference between expected spread-skill correlations indicated by a simple stochastic model and the observed MM5 ensemble spread-skill correlations? • Do the MM5 ensemble spread-skill correlations improve after a simple bias correction is applied? • Are probabilistic error forecasts useful for predicting short-range mesoscale forecast skill?
Preliminary Results Observation-based verification of 10-m wind direction Evaluated over one cool season (2002–2003)
ACMEcore Spread-Skill Correlations • Latest spread-skill correlations are lower than in early MM5 ensemble work. • Observed STD-RMS correlations are higher than STD-AEM correlations. • ACMEcore forecast skill predictability is comparable to the expected predictability, given a perfect ensemble (with the same spread variability). • Clear diurnal variation—affected by IC & MM5 biases? Ensemble Size = 8 members (AVN, CMC, ETA, GASP, JMA, NOGAPS, TCWB, UKMO) Verification Period: Oct 2002 – Mar 2003 (130 cases) Verification Strategy: Interpolate Model to Observations Variable: 10-m Wind Direction
ACMEcore+ Spread-Skill Correlations • Temporal spread variability (β) decreases! • STD-RMS correlations are higher than, and improve more than, STD-AEM correlations. • Exceedance of the expected and idealized correlations may be due to: • Simple model assumptions • Domain averaging • Less diurnal variation, but still present—affected by unique MM5 biases? Ensemble Size = 8 members (PLUS01, PLUS02, PLUS03, PLUS04, PLUS05, PLUS06, PLUS07, PLUS08) Verification Period: Oct 2002 – Mar 2003 (130 cases) Verification Strategy: Interpolate Model to Observations Variable: 10-m Wind Direction
Initial Temporal Spread-Skill Correlations • Lagged CENT-MM5 ensemble spread has a moderate-to-strong correlation (r = 0.7 / 0.8) with the lagged CENT-MM5 ensemble skill. • Weaker correlation with the current mean skill, but it is still a useful secondary predictor. • Relatively weak correlation with the current ensemble skill
Temporal Spread-Skill Correlations • Different results for the 2002–2003 season – much weaker correlations. • Preliminary results – the calculations may contain errors. • Are model improvements a factor? A difference in component members (added JMA-MM5)? Year-to-year variability? • VERY weak correlation with the current ensemble skill
Summary • Forecast skill predictability depends largely on the definition of skill itself. • User-dependent needs • Spread-skill correlation is sensitive to the spread and skill metrics • For 10-m wind direction, ACMEcore spread (STD) is a good predictor (r = 0.5–0.75) of ensemble forecast skill (RMS). ACMEcore+ STD is slightly better (r = 0.6–0.8). • Larger improvements are expected for T and WSPD. • It is unclear whether the variance of a temporal ensemble (using lagged centroid forecasts from ACME) is a useful secondary forecast skill predictor.
Proposed Work • Additional cool season of evaluation (2003–2004) • Grid-based verification * • Forecast skill predictability with bias-corrected forecasts * • Other variables (T and WSPD) • Categorical approach • Probabilistic forecast skill prediction *
Verification Data: Mesoscale Gridded Analysis • Reduced concern about impacts of observational errors on results, if observation- and grid-based spread-skill relationships are qualitatively similar. • Use the Rapid Update Cycle 20-km (RUC20) analysis as "gridded truth" for MM5 ensemble verification and calibration. • Smooth 12-km MM5 ensemble forecasts to the RUC20 grid. • An improved analysis could be used in the future.
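A hedged sketch of remapping a 12-km field onto a coarser grid; plain bilinear interpolation stands in for whatever smoothing the study actually applied, and all coordinates and names are illustrative:

```python
# Remap a 2-D field from a fine grid onto a coarser grid (illustrative).
import numpy as np
from scipy.interpolate import RegularGridInterpolator

def to_coarse_grid(field12, y12, x12, y20, x20):
    """Interpolate field12 from (y12, x12) onto the coarser (y20, x20).
    Target coordinates must lie within the source grid's bounds."""
    interp = RegularGridInterpolator((y12, x12), field12, method="linear")
    Y, X = np.meshgrid(y20, x20, indexing="ij")
    pts = np.stack([Y.ravel(), X.ravel()], axis=-1)
    return interp(pts).reshape(Y.shape)
```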
Simple Bias Correction [Schematic: rolling monthly training periods (e.g., November, December, January), each feeding the following bias-corrected forecast period] • Overall goal is to correct the majority of the bias in each member forecast, while using the shortest possible training period • Will be performed separately using both observations and the RUC20 analysis as verifications
1) Calculate the bias at every location and lead time using the previous forecasts/verifications: $\mathrm{bias}_{i,j,t} = \frac{1}{N}\sum_{n=1}^{N}\left(f^{\,n}_{i,j,t} - o^{\,n}_{i,j}\right)$, where N = number of forecast cases, $f_{i,j,t}$ = forecast at location (i, j) and lead time t, $o_{i,j}$ = verification
2) Post-process the current forecast using the calculated bias: $f^{*}_{i,j,t} = f_{i,j,t} - \mathrm{bias}_{i,j,t}$, where $f^{*}_{i,j,t}$ = bias-corrected forecast at location (i, j) and lead time t
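A minimal sketch of the training-window bias correction reconstructed above; array shapes and names are my own assumptions:

```python
# Mean (forecast - verification) over an N-case training window, then
# subtracted from the current forecast at each location and lead time.
import numpy as np

def bias_correct(f_train, o_train, f_current):
    """f_train: (N, n_lead, ny, nx) past forecasts; o_train: (N, ny, nx)
    verifications; f_current: (n_lead, ny, nx) forecast to correct."""
    bias = (f_train - o_train[:, None]).mean(axis=0)   # (n_lead, ny, nx)
    return f_current - bias
```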
Probabilistic (2nd-order) Forecast Skill Prediction • Even for perfect ensemble forecasts, there is scatter in the spread-skill relationship; error is a multi-valued function of spread • Additional information about the range of forecast errors associated with each spread value could be passed on to the user • Include error bounds with the error bounds… T2m = 3 °C ± 1.5–2.5 °C [Figure: AEM vs. STD scatter]
Probabilistic (2nd-order) Forecast Skill Prediction • Ensemble forecast errors (RMS in this case) are divided into categories by spread amount. • A gamma distribution is fit to the empirical forecast errors in each spread bin to form a probabilistic error forecast. • The skill of the probabilistic error forecasts is evaluated using a cross-validation approach and the CRPS. • Forecast skill predictability can be defined as a CRPS skill score: SS = (CRPScli – CRPS) / CRPScli [Figure: RMS errors binned by STD, with fitted error distributions]
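A rough sketch (my own construction, with synthetic data) of the bin-fit-and-score idea: gamma fits per spread bin, scored with a numerically integrated CRPS; the skill score would then compare against a climatological reference, SS = (CRPScli − CRPS) / CRPScli:

```python
# Fit a gamma distribution to the errors in each spread bin, then score the
# resulting probabilistic error forecast with a numerically integrated CRPS.
import numpy as np
from scipy.stats import gamma
from scipy.integrate import trapezoid

def crps_numeric(cdf, obs: float, hi: float = 50.0, n: int = 2001) -> float:
    """CRPS = integral of (F(x) - H(x - obs))^2 dx on a finite grid."""
    x = np.linspace(0.0, hi, n)
    return float(trapezoid((cdf(x) - (x >= obs)) ** 2, x))

rng = np.random.default_rng(1)
spread = rng.lognormal(0.0, 0.5, 1000)          # synthetic spread values
error = np.abs(rng.normal(0.0, spread))         # synthetic forecast errors

edges = np.quantile(spread, np.linspace(0.0, 1.0, 6))  # 5 equal-count bins
which = np.digitize(spread, edges[1:-1])               # bin index, 0..4
fits = [gamma.fit(error[which == b], floc=0.0) for b in range(5)]

a, loc, scale = fits[2]                  # error distribution for bin 2
print(crps_numeric(lambda x: gamma.cdf(x, a, loc=loc, scale=scale), obs=1.2))
```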
"No forecast is complete without a forecast of forecast skill!" -- H. Tennekes, 1987 QUESTIONS?
Contributions • Development of a mesoscale forecast skill prediction system • Forecast users (of the Northwest MM5 predictions) will gain useful information on forecast reliability that they do not have now. • Probabilistic predictions of deterministic forecast errors • Probabilistic predictions of average ensemble member errors • Incorporation of a simple bias-correction procedure • This has not been previously accomplished, only suggested • Temporal ensemble spread approach with lagged-centroid forecasts • Extension of a simple stochastic spread-skill model to include sampling effects and non-traditional measures • Idealized verification experiments may provide useful guidance on how mesoscale forecast verification should be conducted
An Alternative Categorical Approach • Ensemble mode population is the predictor (Toth et al. 2001) • Largest fraction of ensemble members falling into a bin • Bins are determined by climatologically equally likely classes • Skill is measured by the success rate • Success, if the verification falls into the ensemble mode bin • Mode population and statistical entropy (ENT) are better predictors of the success rate than STD (Ziehmann 2001) • The key here is the classification of forecast and observed data [c.f. Toth et al. 2001, Fig. 2: 500-hPa height, NH extratropics]
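A sketch of the mode-population predictor under my own assumptions (bin count and function names are illustrative): members are binned into climatologically equally likely classes, the modal fraction is the predictor, and success is scored when the verification lands in the mode bin:

```python
# Mode-population predictor with climatologically equally likely bins.
import numpy as np

def mode_population(members, clim_sample, n_bins: int = 10):
    """Return (modal fraction, mode bin index, interior bin edges)."""
    edges = np.quantile(clim_sample, np.linspace(0, 1, n_bins + 1)[1:-1])
    counts = np.bincount(np.digitize(members, edges), minlength=n_bins)
    mode_bin = int(counts.argmax())
    return counts[mode_bin] / len(members), mode_bin, edges

def success(verification: float, mode_bin: int, edges) -> bool:
    """True if the verification falls into the ensemble mode bin."""
    return int(np.digitize(verification, edges)) == mode_bin
```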
Spread: STD = Standard Deviation; ENT* = Statistical Entropy; MOD* = Mode Population. Error: AEM = Absolute Error of the ensemble Mean; MAE = Mean Absolute Error; IGN* = Ignorance. (* = binned quantity)
Spread: STD = Standard Deviation; ENT* = Statistical Entropy; MOD* = Mode Population. Skill: Success = 0 / 1. (* = binned quantity)
Multiple (Combined) Spread-Skill Correlations • Early results suggested that temporal spread would be a useful secondary predictor. The latest results suggest otherwise: at or below the minimum useful correlation.
Multiple (Combined) Spread-Skill Correlations
Simple Stochastic Model with Forecast Bias
Spread-Skill Correlations for Temperature