An analysis of different bias-correction algorithms in a synthetic environment

Joo-Hyung Son (1), Zoltan Toth (2), and Dingchen Hou (3). (1) Numerical Weather Prediction Division, KMA; (2) Environmental Modeling Center, NCEP/NWS/NOAA; (3) EMC/NCEP/NWS/NOAA and SAIC.


Presentation Transcript


  1. An analysis of different bias-correction algorithms in a synthetic environment. Joo-Hyung Son (1), Zoltan Toth (2), and Dingchen Hou (3). (1) Numerical Weather Prediction Division, KMA; (2) Environmental Modeling Center, NCEP/NWS/NOAA; (3) EMC/NCEP/NWS/NOAA and SAIC.

  2. OUTLINE • Introduction • Generation of a Synthetic Data Set • Effects of Sample size on the Bias Estimation • Bias Estimation Based on Bayesian Approach • Effect of Bias Correction on Probabilistic Forecast • Summary

  3. Introduction Background • NWP products are subject to both systematic and random errors. • Estimating the bias from historical data and then subtracting it from the forecast is an effective way of reducing systematic errors. Existing Questions • How should the bias be estimated? Various bias-correction methods exist, e.g. the equal-weight method and a Kalman Filter type algorithm (Cui et al., 2005). • How long a historical data set is required to estimate the bias with reasonable accuracy? This has not been investigated systematically. This Study: A Simplified Approach • Single forecast of a single variable at a single grid point. • Simulated forecast (synthetic data): no dynamic evolution. • Simulated forecasts of various skill (lead time) and bias levels. • The simulation can be extended to represent more realistic forecasts.

  4. Generation of synthetic data - analysis • Assumptions • Annual cycle removed • Standardized (daily climate data, climate mean, climate standard deviation) • Stationary process • Analysis • General ARMA(p,q) model, defined by: • Order of the autoregressive part (p) • Order of the moving-average part (q) • White noise • Autocorrelation parameters • Moving-average parameters • Parameters estimated from • 40 years of climate data at 37.5N, 117.5W • 2 m temperature • Selected orders: p = 20, q = 1 [Figure: autocorrelation; vertical axis -0.2 to 1.2, horizontal axis 0 to 400.]
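The ARMA(p,q) step on this slide can be illustrated with a short simulation. This is a minimal sketch, not the authors' code: the function name `simulate_arma`, the burn-in length, and the ARMA(1,1) coefficients are all hypothetical, chosen only to keep the example small (the slides fit p = 20, q = 1 to real climate data, and those fitted coefficients are not in the transcript).

```python
import numpy as np

def simulate_arma(phi, theta, n, burn_in=1000, seed=0):
    """Simulate x_t = sum_i phi[i]*x_(t-1-i) + e_t + sum_j theta[j]*e_(t-1-j)."""
    rng = np.random.default_rng(seed)
    phi = np.asarray(phi, dtype=float)      # autoregressive parameters
    theta = np.asarray(theta, dtype=float)  # moving-average parameters
    p, q = len(phi), len(theta)
    total = n + burn_in
    e = rng.standard_normal(total)          # white noise
    x = np.zeros(total)
    for t in range(total):
        ar = sum(phi[i] * x[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(theta[j] * e[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        x[t] = ar + e[t] + ma
    x = x[burn_in:]                         # drop the spin-up segment
    return (x - x.mean()) / x.std()         # standardize to mean 0, std 1

# Hypothetical ARMA(1,1) coefficients, for illustration only.
analysis = simulate_arma(phi=[0.8], theta=[0.3], n=730)
lag1 = np.corrcoef(analysis[:-1], analysis[1:])[0, 1]
print(f"lag-1 autocorrelation of the synthetic analysis: {lag1:.2f}")
```

The final standardization mirrors the slide's assumption that the analysis series has zero mean and unit standard deviation.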

  5. Generation of synthetic data - analysis [Figure: time series of analysis; climate generated by ARMA(20,1); horizontal axis 0 to 730, vertical axis -3 to 4.]

  6. Generation of synthetic data - forecast Notation: • analysis: generated by the ARMA model, N(0,1) • forecast: N(0,1) • forecast error (noise): N(0,1) • bias: constant • alpha: correlation between forecast and analysis Requirements: • The time series of analysis and forecast are similar stationary stochastic processes. • The forecast is correlated with the analysis, with a coefficient alpha reflecting the skill of the forecast: alpha = 1 for a perfectly correlated forecast and alpha = 0 for an uncorrelated forecast (simulating lead times of 1 to 16 days). • The forecast is subject to random error (independent of the analysis) of varying amplitude (1: no skill, 0: no noise). • Apart from the bias, the forecast is statistically the same as the analysis (N(0,1)). This is satisfied by setting the noise amplitude to sqrt(1 - alpha**2). • A constant (time-independent) bias is added to the forecast. Model: forecast = alpha * analysis + sqrt(1 - alpha**2) * noise + bias.
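The requirements on this slide can be checked numerically. The sketch below assumes the model takes the form forecast = alpha * analysis + sqrt(1 - alpha**2) * noise + bias, which is a reconstruction from the listed requirements, since the slide's own equation did not survive in the transcript. The function name `synthetic_forecast`, the values alpha = 0.75 and bias = 0.1, and the white-noise stand-in for the ARMA analysis are assumptions made for illustration.

```python
import numpy as np

def synthetic_forecast(analysis, alpha, bias, seed=0):
    """Biased forecast with correlation alpha to the analysis (assumed model form)."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(len(analysis))   # independent of the analysis
    return alpha * np.asarray(analysis) + np.sqrt(1.0 - alpha**2) * noise + bias

rng = np.random.default_rng(1)
analysis = rng.standard_normal(100_000)   # white-noise stand-in for the ARMA analysis
forecast = synthetic_forecast(analysis, alpha=0.75, bias=0.1)

corr = np.corrcoef(analysis, forecast)[0, 1]
print(f"correlation {corr:.3f}, forecast std {forecast.std():.3f}, "
      f"forecast mean {forecast.mean():.3f}")
```

With a long series the sample correlation should sit close to alpha, the forecast standard deviation close to 1, and the forecast mean close to the imposed bias.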

  7. Generation of synthetic data - forecast

  8. Testing synthetic forecast model against real forecast data: comparison between real data and synthetic data Purple line: • “Prediction” of how the forecast would look. • Normal forecast distribution centered on alpha times a, where • alpha: correlation estimated over the whole observation period • a: mean of all analysis values falling between 3 and 4 • Spread: standard deviation of the forecast when the corresponding analysis is between 3 and 4 Histogram: • Forecast after removing the bias

  9. Testing synthetic forecast model against real forecast data [Figure: four panels by lead time (labels include day 3, day 10, and day 16); horizontal axis -20 to 20, vertical axis 0 to 0.5; additional label: mean.]

  10. Bias-correction algorithms • Traditional method (method 1) • Bias ~ weighted average of past forecast errors (forecast minus analysis) • Bias estimation • Equal weight • Kalman Filter (with a Kalman Filter weight) • Bias correction
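The two bias estimates named on this slide can be sketched as follows. The slides do not spell out the Kalman Filter type update, so the decaying-average form used here is an assumption (a common form for such schemes); the function names, the weight of 0.02, and the zero initial estimate are hypothetical.

```python
import numpy as np

def equal_weight_bias(forecast, analysis):
    """Running mean of forecast - analysis: every past error gets the same weight."""
    err = np.asarray(forecast, dtype=float) - np.asarray(analysis, dtype=float)
    return np.cumsum(err) / np.arange(1, len(err) + 1)

def decaying_average_bias(forecast, analysis, weight=0.02, initial=0.0):
    """Assumed Kalman Filter type update: b_t = (1 - w) * b_(t-1) + w * (f_t - a_t)."""
    err = np.asarray(forecast, dtype=float) - np.asarray(analysis, dtype=float)
    est = np.empty(len(err))
    b = initial
    for t, d in enumerate(err):
        b = (1.0 - weight) * b + weight * d   # recent errors weigh more than old ones
        est[t] = b
    return est

# Noise-free check: a forecast that is always 0.1 too warm.
a = np.zeros(1000)
f = a + 0.1
ew = equal_weight_bias(f, a)
da = decaying_average_bias(f, a)
print(ew[-1], da[0], da[-1])
```

In use, the forecast at step t would be corrected with the estimate available at step t - 1, so that the verifying analysis is never used to correct its own forecast.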

  11. Absolute bias error of Method 1 Kalman Filter: absolute bias error for 100 cases. Red points: the point at which the equal-weight bias error corresponds to the average of the KF bias error from 1001 to 10000, based on the correlation (~120)

  12. Bias-correction algorithms • Traditional method (method 1) • Bias estimation • Equal weight • Kalman Filter • Bias correction • A longer time series is needed to sample the whole distribution • New method (method 2) • Based on a Bayesian approach • Given the forecast model, for a particular value • Bias ~ weighted average (with a Kalman Filter weight) • Note: works without sampling the whole distribution, so a shorter time series suffices • Bias estimation • Equal weight • Kalman Filter • Bias correction
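The formula for method 2 did not survive in the transcript, so the sketch below illustrates only one possible reading and should not be taken as the authors' method. It assumes that "given the forecast model" means the known correlation alpha is used, so the bias is averaged from forecast - alpha * analysis rather than from forecast - analysis. Under the assumed model forecast = alpha * analysis + sqrt(1 - alpha**2) * noise + bias, the first residual has variance 1 - alpha**2 and the second 2 * (1 - alpha), which would explain why a shorter sample can suffice. All names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, bias, n = 0.75, 0.1, 200_000

a = rng.standard_normal(n)                     # stand-in analysis, N(0,1)
f = alpha * a + np.sqrt(1 - alpha**2) * rng.standard_normal(n) + bias

d1 = f - a            # method-1 style error: forecast minus analysis
d2 = f - alpha * a    # assumed method-2 style residual using the forecast model

var1, var2 = d1.var(), d2.var()
print(f"variance of f - a:         {var1:.4f} (theory {2 * (1 - alpha):.4f})")
print(f"variance of f - alpha * a: {var2:.4f} (theory {1 - alpha**2:.4f})")
print(f"mean of f - alpha * a:     {d2.mean():.4f} (imposed bias {bias})")
```

Both residuals average to the imposed bias; the difference lies in how noisy each individual sample is.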

  13. Absolute bias error of Method 2 Kalman Filter: absolute bias error for 100 cases. Red points: the point at which the equal-weight bias error corresponds to the average of the KF bias error from 1001 to 10000, based on the correlation (~90)

  14. Comparison of Methods 1 & 2 Equal weight method: sample size required for the error to be less than a specific percentage of the real bias. [Figure: panels comparing m1 and m2.]
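A rough back-of-envelope version of this sample-size question can be sketched under two simplifying assumptions that are not from the slides: the errors being averaged are independent, and "error" is taken to mean the standard error of the equal-weight mean. The standard error of a mean of n independent samples is sqrt(variance / n), so requiring it to stay below fraction * bias gives n >= variance / (fraction * bias)**2. The function name and the example values are hypothetical.

```python
import math

def required_sample_size(error_variance, bias, fraction):
    """Smallest n with sqrt(error_variance / n) <= fraction * bias (independent samples)."""
    return math.ceil(error_variance / (fraction * bias) ** 2)

# Hypothetical values: the same bias and tolerance, two different error variances.
n_small_var = required_sample_size(error_variance=0.4375, bias=0.1, fraction=0.5)
n_large_var = required_sample_size(error_variance=0.5, bias=0.1, fraction=0.5)
print(n_small_var, n_large_var)
```

Because the analysis series is autocorrelated, successive forecast-minus-analysis errors are not independent, so this estimate is only a lower bound for that case.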

  15. Effects on ensemble based probabilistic forecast: Continuous Ranked Probability Score (CRPS) test • Assumption • Uncertainty is perfectly known (no bias in the 2nd moment) • Forecast • Bias increases with lead time (decreases with correlation) • Modified bias • Bias is standardized by the climate standard deviation [Figure: BIAS (Kalman Filter, method 1); bias (0 to 0.25) against correlation (0.95 down to 0.00) and lead time (days 1 to 16); additional labels: 0.95, 0.75, 0.20.]

  16. Effects on ensemble based probabilistic forecast: Continuous Ranked Probability Score (CRPS) test • Ensemble distribution = forecast uncertainty • PDF of forecast • CDF of forecast and CDF of analysis • CRPS
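The slide's CRPS formula is not preserved in the transcript. As a hedged illustration, the sketch below uses the standard ensemble estimator CRPS = mean|x_i - y| - 0.5 * mean|x_i - x_j|, which equals the integrated squared difference between the ensemble's empirical CDF and the step-function CDF of the verifying analysis y. The function name `crps_ensemble`, the ensemble size, and the bias of 1.0 are assumptions for illustration.

```python
import numpy as np

def crps_ensemble(members, observation):
    """CRPS of an ensemble (empirical CDF) against a single verifying value."""
    x = np.asarray(members, dtype=float)
    term1 = np.mean(np.abs(x - observation))                # distance to the observation
    term2 = 0.5 * np.mean(np.abs(x[:, None] - x[None, :]))  # ensemble spread term
    return term1 - term2

rng = np.random.default_rng(0)
members = rng.standard_normal(500)            # ensemble centered near the truth
unbiased = crps_ensemble(members, 0.0)
biased = crps_ensemble(members + 1.0, 0.0)    # same spread, shifted by a bias of 1.0
print(f"CRPS unbiased: {unbiased:.3f}, CRPS biased: {biased:.3f}")
```

Lower CRPS is better, so shifting the ensemble away from the verifying value while keeping its spread raises the score, which is the effect a bias correction is meant to remove.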

  17. Effects on ensemble based probabilistic forecast: Continuous Ranked Probability Score (CRPS) test [Figure panels: synthetic forecast with error levels similar to those in the real forecast; synthetic forecast with error levels larger than those in the real forecast. Legend: raw fcst, 100 warming period, 5000 warming period.]

  18. Summary • Working with synthetic analysis/forecast data sets is useful for investigating the performance of various statistical bias-correction methods (quick assessment and comparison). • A Bayesian-type bias estimation method may have additional benefits (bias error). • Bias error is independent of the bias level, but bias correction reduces the probabilistic forecast error more when the bias is larger. • Realistic ensemble forecasts and more complex bias estimation algorithms need to be considered (comparing frequentist and Bayesian approaches).
