Ensemble Forecasting: Calibration, Verification, and Use in Applications

Tom Hopson

Outline
• Motivation for ensemble forecasting and post-processing
• Introduce the Quantile Regression (QR; Koenker and Bassett, 1978) post-processing procedure
• Ensemble forecast verification
• THORPEX-TIGGE data set
• Ensemble forecast examples:
  a) Southwestern African flooding
  b) African meningitis
  c) US Army test range weather forecasting
  d) Bangladesh flood forecasting

Goals of an Ensemble Prediction System (EPS)
• Predict the observed distribution of events and atmospheric states
• Predict uncertainty in the day’s prediction
• Predict the extreme events that are possible on a particular day
• Provide a range of possible scenarios for a particular forecast

More technically …
• Greater accuracy of the ensemble mean forecast (half the error variance of a single forecast)
• Likelihood of extremes
• Non-Gaussian forecast PDFs
• Ensemble spread as a representation of forecast uncertainty
=> All rely on forecasts being calibrated
Further …
• Argue that calibration is essential for tailoring to local applications: NWP provides spatially and temporally averaged gridded forecast output
• Applying gridded forecasts to point locations requires location-specific calibration to account for local spatial and temporal scales of variability (=> increasing ensemble dispersion)

Take home message: For a “calibrated ensemble”, the error variance of the ensemble mean is 1/2 the error variance of any ensemble member (on average), independent of the distribution being sampled.
[Figure: forecast PDF with the observation (obs) marked; axes: Probability versus Discharge]
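
The factor of 1/2 can be checked with a small simulation. The sketch below is illustrative only: it assumes a "calibrated" ensemble means that the observation and every member are independent draws from the same distribution, and it uses a skewed gamma distribution (an arbitrary choice) to show that the result does not depend on Gaussianity. For a finite ensemble of M members the expected ratio under that assumption is (1 + 1/M)/2, which approaches 1/2 as M grows.

```python
import random

def error_variances(n_members, n_cases, seed=0):
    """Return (mean squared error of one member, mean squared error of the ensemble mean)."""
    rng = random.Random(seed)
    member_sq_err = 0.0
    mean_sq_err = 0.0
    for _ in range(n_cases):
        # Assumed calibrated ensemble: observation and members share one
        # (deliberately skewed, non-Gaussian) distribution.
        obs = rng.gammavariate(2.0, 1.5)
        members = [rng.gammavariate(2.0, 1.5) for _ in range(n_members)]
        ens_mean = sum(members) / n_members
        member_sq_err += (members[0] - obs) ** 2
        mean_sq_err += (ens_mean - obs) ** 2
    return member_sq_err / n_cases, mean_sq_err / n_cases

single_err, ens_mean_err = error_variances(n_members=50, n_cases=10000)
ratio = ens_mean_err / single_err
print(f"ensemble-mean / single-member error variance: {ratio:.2f}")
```

Swapping the gamma draws for any other distribution with finite variance should give a similar ratio, which is the point of the slide.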

Forecast “calibration” or “post-processing” “bias” obs Forecast PDF Probability Probability Forecast PDF obs “spread” or “dispersion” calibration Flow rate [m3/s] Flow rate [m3/s] • Post-processing has corrected: • the “on average” bias • as well as under-representation of the 2nd moment of the empirical forecast PDF (i.e. corrected its “dispersion” or “spread”) • Our approach: • under-utilized “quantile regression” approach • probability distribution function “means what it says” • daily variation in the ensemble dispersion directly relate to changes in forecast skill => informative ensemble skill-spread relationship

Rank Histograms – measuring the reliability of an ensemble forecast
• You cannot verify an ensemble forecast with a single observation.
• The more data you have for verification, the more certain you are (as is true in general for other statistical measures).
• Rare (low-probability) events require more data to verify, as do systems with many ensemble members.
From Barb Brown
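
A rank histogram can be built by counting, over many forecast cases, where the observation falls within the sorted ensemble. The sketch below uses synthetic Gaussian data (an assumption made purely for illustration) to contrast a calibrated ensemble, which should give a roughly flat histogram, with an under-dispersive one, which should give a U shape.

```python
import random

def rank_histogram(ensembles, observations):
    """Count how often the observation takes each rank within the ensemble."""
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        # Rank = number of members below the observation (0 .. n_members).
        rank = sum(1 for m in members if m < obs)
        counts[rank] += 1
    return counts

rng = random.Random(1)
n_cases, n_members = 5000, 9  # 9 members give 10 possible ranks
obs = [rng.gauss(0.0, 1.0) for _ in range(n_cases)]
calibrated = [[rng.gauss(0.0, 1.0) for _ in range(n_members)] for _ in range(n_cases)]
# Synthetic under-dispersive ensemble: spread is half of what it should be.
underdispersive = [[rng.gauss(0.0, 0.5) for _ in range(n_members)] for _ in range(n_cases)]

flat_counts = rank_histogram(calibrated, obs)
u_counts = rank_histogram(underdispersive, obs)
print("calibrated:     ", flat_counts)
print("under-dispersed:", u_counts)
```

With only a handful of cases the calibrated histogram would look noisy, which is why verification needs a large sample.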

From Tom Hamill

Troubled Rank Histograms
[Figure: two rank histograms; axes: Counts (0 to 30) versus Ensemble # (1 to 10)]
Slide from Matt Pocernic

From Tom Hamill

Example of Quantile Regression (QR)
Our application: fitting T quantiles using QR conditioned on:
1) ranked forecast ens
2) ensemble mean
3) ensemble median
4) ensemble stdev
5) persistence
R package: quantreg
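
To make the idea concrete, here is a minimal sketch of quantile regression with a single regressor (the ensemble mean) on synthetic temperatures. It is not how the quantreg package solves the problem: it uses a brute-force search that relies on the fact that an optimal two-coefficient quantile line can always be found passing through two data points. All data and names (`ens_mean`, `observed`, `fit_quantile_line`) are hypothetical.

```python
import random
from itertools import combinations

def pinball_loss(residual, tau):
    """Quantile-regression cost of one residual (observation minus fit)."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual

def fit_quantile_line(x, y, tau):
    """Brute-force fit of y ~ a + b*x at quantile tau (small samples only)."""
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # two points with equal x do not define a candidate line
        b = (y[j] - y[i]) / (x[j] - x[i])
        a = y[i] - b * x[i]
        cost = sum(pinball_loss(yk - (a + b * xk), tau) for xk, yk in zip(x, y))
        if best is None or cost < best[0]:
            best = (cost, a, b)
    return best[1], best[2]

def fraction_below(x, y, a, b):
    """Fraction of observations at or below the fitted line (small tolerance)."""
    return sum(yk <= a + b * xk + 1e-6 for xk, yk in zip(x, y)) / len(y)

rng = random.Random(2)
# Synthetic data: ensemble-mean temperature [K] and a noisy "observation".
ens_mean = [rng.uniform(270.0, 300.0) for _ in range(60)]
observed = [m + rng.gauss(0.0, 2.0) for m in ens_mean]

a90, b90 = fit_quantile_line(ens_mean, observed, tau=0.9)
a10, b10 = fit_quantile_line(ens_mean, observed, tau=0.1)
frac90 = fraction_below(ens_mean, observed, a90, b90)
frac10 = fraction_below(ens_mean, observed, a10, b10)
print(f"fraction of observations below the 0.9 line: {frac90:.2f}")
print(f"fraction of observations below the 0.1 line: {frac10:.2f}")
```

Fitting one line per quantile in this way gives a set of conditional quantiles that together describe the forecast PDF; a production fit with several regressors would use a linear-programming solver instead of the pairwise search.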

Step 1: Determine climatological quantiles
[Figure: climatological PDF (Probability/°K versus Temperature [K]) alongside time series of observed and forecast T [K]]
Step 2: For each quantile, use “forward step-wise cross-validation” to iteratively select the best subset of regressors
Regressor set: 1. reforecast ens 2. ens mean 3. ens stdev 4. persistence 5. LR quantile (not shown)
Selection requirements: a) QR cost function minimum, b) satisfy binomial distribution at 95% confidence
If the requirements are not met, retain the climatological “prior”
Step 3: Segregate forecasts into differing ranges of ensemble dispersion (I., II., III.) and refit models (Step 2) uniquely for each range
Final result: “sharper” posterior PDF represented by interpolated quantiles
[Figure: prior and posterior forecast PDFs (Probability/°K versus Temperature [K]) and forecast time series of T [K]]
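
The slide does not spell out how the binomial requirement in Step 2 is tested. One plausible reading, sketched below as an assumption, is that for a fitted tau-quantile the number of observations falling below the forecast quantile should lie within the central 95% range of a Binomial(n, tau) distribution. The function names are hypothetical.

```python
from math import comb

def binomial_interval(n, tau, confidence=0.95):
    """Central interval for the count of n observations below a tau-quantile."""
    tail = (1.0 - confidence) / 2.0
    cumulative = 0.0
    lower, upper = 0, n
    lower_found = False
    for k in range(n + 1):
        cumulative += comb(n, k) * tau**k * (1.0 - tau) ** (n - k)
        if not lower_found and cumulative > tail:
            lower, lower_found = k, True
        if cumulative >= 1.0 - tail:
            upper = k
            break
    return lower, upper

def passes_binomial_check(observations, quantile_forecasts, tau, confidence=0.95):
    """True if the count of observations below the forecast quantile is plausible."""
    below = sum(o < q for o, q in zip(observations, quantile_forecasts))
    lower, upper = binomial_interval(len(observations), tau, confidence)
    return lower <= below <= upper

# Toy check with 100 observations and a constant forecast of the median.
toy_obs = list(range(100))
good_median = passes_binomial_check(toy_obs, [49.5] * 100, tau=0.5)  # 50 below
bad_median = passes_binomial_check(toy_obs, [79.5] * 100, tau=0.5)   # 80 below
print(binomial_interval(100, 0.5), good_median, bad_median)
```

A candidate model that fails such a check would, following the slide, be discarded in favour of the climatological prior.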

Rank Probability Score for multi-categorical or continuous variables
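
As a hedged illustration, the score for a single multi-category forecast can be computed as the sum of squared differences between the cumulative forecast probabilities and the cumulative observation (a step function at the observed category). Some definitions divide by the number of categories minus one; this sketch does not. The category probabilities are made up.

```python
def ranked_probability_score(forecast_probs, observed_category):
    """RPS for one forecast over ordered categories (lower is better)."""
    cum_forecast = 0.0
    score = 0.0
    for k, p in enumerate(forecast_probs):
        cum_forecast += p
        # Cumulative observation: 0 before the observed category, 1 from it onward.
        cum_observed = 1.0 if k >= observed_category else 0.0
        score += (cum_forecast - cum_observed) ** 2
    return score

# Three ordered categories, e.g. below / near / above normal.
sharp_hit = ranked_probability_score([0.1, 0.8, 0.1], observed_category=1)
near_miss = ranked_probability_score([0.8, 0.1, 0.1], observed_category=1)
far_miss = ranked_probability_score([0.8, 0.1, 0.1], observed_category=2)
print(sharp_hit, near_miss, far_miss)
```

Because the score works on cumulative probabilities, a forecast that misses by one category is penalised less than one that misses by two, which is what distinguishes it from scoring each category independently.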

Scatter-plot and Contingency Table
Brier Score
Does the forecast correctly detect temperatures above 18 degrees?
$$ BS = \frac{1}{n} \sum_{i=1}^{n} (y_i - o_i)^2 $$
y = forecast event occurrence (a probability between 0 and 1)
o = observed occurrence (0 or 1)
i = sample # of total n samples
=> Note similarity to MSE
Slide from Barbara Casati
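
A minimal sketch of the Brier score for the "temperature above 18 degrees" event follows. It assumes the forecast probability is taken as the fraction of ensemble members exceeding the threshold, which is one common choice and not necessarily the one used on the slide; the temperatures are invented.

```python
def event_probability(ensemble, threshold):
    """Forecast probability = fraction of members above the threshold."""
    return sum(m > threshold for m in ensemble) / len(ensemble)

def brier_score(forecast_probs, outcomes):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    return sum((y - o) ** 2 for y, o in zip(forecast_probs, outcomes)) / len(outcomes)

threshold = 18.0
# Hypothetical 4-member temperature ensembles for three days, with observations.
ensembles = [
    [17.0, 18.5, 19.2, 20.1],
    [15.0, 16.2, 17.5, 18.1],
    [19.0, 19.5, 20.0, 21.0],
]
observed_temps = [19.0, 16.0, 17.5]

probs = [event_probability(e, threshold) for e in ensembles]
outcomes = [1 if t > threshold else 0 for t in observed_temps]
bs = brier_score(probs, outcomes)
print(probs, outcomes, bs)
```

The third day, where every member exceeded 18 degrees but the observation did not, dominates the score, showing how heavily confident misses are penalised.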

Other post-processing approaches …
1) Bayesian Model Averaging (BMA): Raftery et al. (1997)
2) Analogue approaches: Hopson and Webster, J. Hydromet (2010)
3) Kalman Filter with analogues: Delle Monache et al. (2010)
4) Quantile regression: Hopson and Hacker, MWR (under review)
5) Quantile-to-quantile (quantile matching) approach: Hopson and Webster, J. Hydromet (2010)
… many others

Quantile Matching: another approach, for when matched forecast-observation pairs are not available => useful for climate change studies
ECMWF 51-member ensemble precipitation forecasts compared to observations
• 2004 Brahmaputra catchment-averaged forecasts
• black line: satellite observations
• colored lines: ensemble forecasts
• Basic structure of catchment rainfall is similar for both forecasts and observations
• But the forecasts have a large relative positive bias (over-forecasting)

Forecast Bias Adjustment
• done independently for each forecast grid
• (bias-correct the whole PDF, not just the median)
[Figure: Model Climatology CDF and “Observed” Climatology CDF; axes: Precipitation (up to Pmax) versus Quantile (25th, 50th, 75th, 100th); the forecast value Pfcst is mapped to the adjusted value Padj at the same quantile]
In practical terms …
[Figure: ranked forecasts and ranked observations; axes: Precipitation from 0 to 1m]
Hopson and Webster (2010)
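
The mapping from Pfcst to Padj can be sketched as a lookup between two ranked samples: find the quantile of the forecast within the model climatology, then read off the value at the same quantile of the observed climatology. The sketch below handles a single grid cell with invented climatologies and simple nearest-rank lookup (no interpolation), both of which are assumptions for illustration.

```python
from bisect import bisect_right

def quantile_match(forecast_value, model_climatology, observed_climatology):
    """Map a forecast to the observed value at the same climatological quantile."""
    model_sorted = sorted(model_climatology)
    obs_sorted = sorted(observed_climatology)
    n, m = len(model_sorted), len(obs_sorted)
    # Number of model-climatology values at or below the forecast (0 .. n).
    k = bisect_right(model_sorted, forecast_value)
    # Nearest-rank index at the same quantile k/n in the observed sample,
    # computed with integers: ceil(k * m / n) - 1, clipped to a valid index.
    index = max((k * m + n - 1) // n - 1, 0)
    return obs_sorted[index]

# Hypothetical climatologies for one grid cell [mm/day]; the model is too wet.
model_clim = [0, 2, 4, 6, 8, 10, 12, 14, 16, 20]
obs_clim = [0, 1, 2, 3, 4, 5, 6, 7, 8, 10]

adjusted = quantile_match(12, model_clim, obs_clim)
adjusted_extreme = quantile_match(25, model_clim, obs_clim)
print(adjusted, adjusted_extreme)
```

Applying the same mapping to every ensemble member corrects the whole forecast distribution rather than only its median, and no matched forecast-observation pairs are needed because only the two climatologies are compared.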

Bias-corrected Precipitation Forecasts
[Figure: Brahmaputra precipitation forecasts; panels: Original Forecast, Corrected Forecast]
=> Now observed precipitation is within the “ensemble bundle”

THORPEX Interactive Grand Global Ensemble (TIGGE)
• component of the World Weather Research Programme
• TIGGE archive consists of ensemble forecast data from ten global NWP centers
• designed to accelerate improvements in the accuracy of 1-day to 2-week high-impact weather forecasts for the benefit of humanity
• archive starts from October 2006
• available for scientific research
• near-real-time forecasts (some centers delayed)

Archive Status and Monitoring, Data Receipt
[Diagram: data flow between data providers and archive centres; centres shown: UKMO, CMC, CMA, ECMWF, MeteoFrance, NCAR, NCEP, JMA, KMA, NCDC, CPTEC, BoM; transfer methods: IDD/LDM, HTTP, FTP; legend: Archive Centre, Current Data Provider]
Unidata IDD/LDM (Internet Data Distribution / Local Data Manager): commodity internet application to send and receive data

Archive Status and Monitoring, Variability between providers

Archive Status and Monitoring, Archive Completeness PL = Pressure Level, PT = 320K θ Level, PV = ± 2 Potential Vorticity Level, SL = Single/Surface Level

Archive Status and Monitoring, Archive Completeness PL = Pressure Level, PT = 320K θ Level, PV = ± 2 Potential Vorticity Level, SL = Single/Surface Level

Early May 2011, floods in southwestern Africa

Early May 2011, floods in southwestern Africa -- examine ens forecasts … ECMWF 24hr precip

Early May 2011, floods in southwestern Africa -- examine ens forecasts … NCEP GEFS 24hr precip

Early May 2011, floods in southwestern Africa -- examine ens forecasts … ECMWF 5-day precip

Early May 2011, floods in southwestern Africa -- examine ens forecasts … NCEP GEFS 5-day precip
