statistical validation of numerical models: some methods
Ricardo Lemos
subject index
• data setup
• standard methods of model validation
• model validation for a single location - time-series analysis
• model validation for a single instant - spatial data analysis
• model validation for variable space and time - spatiotemporal data analysis
• summary
1. data setup
a) deterministic model already calibrated – the whole dataset is used to validate the model
b) deterministic model needs calibration – data subsetting according to the purpose of the numerical model (description vs. prediction)
[figure: space-time diagrams showing the calibration and validation subsets under random subsampling and under subsampling with the aim of forecasting]
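The two subsetting schemes can be sketched as below. This is a minimal illustration, not the author's procedure: the function names, the 70% calibration fraction and the toy time axis are all assumptions.

```python
import numpy as np

def random_split(n, frac_calibration=0.7, seed=0):
    """Random subsampling: records are assigned to either subset at random."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_cal = int(round(frac_calibration * n))
    return np.sort(idx[:n_cal]), np.sort(idx[n_cal:])

def forecast_split(times, frac_calibration=0.7):
    """Chronological split: calibrate on the earliest records, validate on the latest."""
    order = np.argsort(times, kind="stable")
    n_cal = int(round(frac_calibration * len(times)))
    return order[:n_cal], order[n_cal:]

# Hypothetical time axis with 10 records
times = np.arange(10)
cal_r, val_r = random_split(len(times))
cal_f, val_f = forecast_split(times)
```

With the chronological split every validation record is later than every calibration record, which mimics the forecasting use of the model.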
model validation
«Because there is not a single best performance measure or best evaluation methodology, it is recommended that a suite of different performance measures be applied.» (Chang and Hanna, 2004)
Chang, J.C., Hanna, S.R., 2004. Air quality model performance evaluation. Meteorol Atmos Phys 87: 167–196
2. standard methods of model validation
WWRP/WGNE Joint Working Group on Verification
Forecast Verification - Issues, Methods and FAQ
Introduction - what is this web site about?
Issues:
• Why verify?
• Types of forecasts and verification
• What makes a forecast good?
• Forecast quality vs. value
• What is "truth"?
• Validity of verification results
• Pooling vs. stratifying results
Methods:
Standard verification methods:
• Methods for dichotomous (yes/no) forecasts
• Methods for multi-category forecasts
• Methods for forecasts of continuous variables
• Methods for probabilistic forecasts
Scientific or diagnostic verification methods:
• Methods for spatial forecasts
• Methods for probabilistic forecasts, including ensemble prediction systems
• Other methods
Sample forecast datasets:
• Finley tornado forecasts
• Sydney 2000 Forecast Demonstration Project radar-based rainfall nowcasts
• ..... climate example .....
Some Frequently Asked Questions
Discussion group
References:
• Links to other verification sites
• References and further reading
• Contributors to this site
World Weather Research Program / Working Group on Numerical Experimentation
http://www.bom.gov.au/bmrc/wefor/staff/eee/verif/verif_web_page.shtml
methods:
a) compare raw model predictions and observed data
b) analyse residuals (observed – predicted)
a) compare raw model predictions and observed data
a1) “eyeball” verification
a2) straightforward statistical analysis - steps:
i. define important features of the data
ii. quantify them in some way - “statistical probes” (Kendall et al., 1999)
iii. investigate to what extent those features are captured by the model
Kendall, B.E., et al., 1999. Why do populations cycle? A synthesis of statistical and mechanistic modeling approaches. Ecology 80(6): 1789-1805
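Steps i to iii might look as follows in practice. The three probes chosen here (mean, standard deviation, dominant period) and the synthetic monthly series are assumptions for illustration; they are not taken from Kendall et al. (1999).

```python
import numpy as np

def probes(x, dt=1.0):
    """Quantify a few features of a series: mean, spread and dominant period."""
    x = np.asarray(x, dtype=float)
    anomalies = x - x.mean()
    power = np.abs(np.fft.rfft(anomalies)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=dt)
    k = 1 + np.argmax(power[1:])  # skip the zero frequency
    return {"mean": x.mean(), "std": x.std(ddof=1), "dominant_period": 1.0 / freqs[k]}

# Hypothetical monthly series: the model has the right period but a bias
# in the mean and too small an amplitude
t = np.arange(240)
observed = 15.0 + 3.0 * np.sin(2 * np.pi * t / 12)
predicted = 15.5 + 2.0 * np.sin(2 * np.pi * t / 12)

p_obs, p_mod = probes(observed), probes(predicted)
for name in p_obs:
    print(f"{name}: observed={p_obs[name]:.2f} model={p_mod[name]:.2f}")
```

Comparing the probes side by side shows which features the model captures (here the period) and which it misses (here the mean and the amplitude).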
b) analyse residuals (observed – predicted)
rationale: if the model performs well, it should closely follow the observations and leave white noise only
b1) “eyeball” verification:
[figure: residuals plotted against time or space for three cases: validated model, overfitted?, incomplete?]
b2) straightforward statistical analysis:
[figure: periodogram of the residuals (unmatched periodicity) and autocorrelation plot (significant lag-1 autocorrelation in residuals)]
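The two diagnostics in b2) can be sketched with NumPy alone. The residual series below are synthetic, and the ±1.96/√n bound is the usual large-sample approximation for white noise; both are assumptions of this example.

```python
import numpy as np

def lag1_autocorrelation(r):
    """Sample lag-1 autocorrelation of a residual series."""
    r = np.asarray(r, dtype=float) - np.mean(r)
    return np.sum(r[1:] * r[:-1]) / np.sum(r * r)

def periodogram(r, dt=1.0):
    """Raw periodogram of the residuals, zero frequency excluded."""
    r = np.asarray(r, dtype=float) - np.mean(r)
    power = np.abs(np.fft.rfft(r)) ** 2 / r.size
    freqs = np.fft.rfftfreq(r.size, d=dt)
    return freqs[1:], power[1:]

rng = np.random.default_rng(1)
n = 400
t = np.arange(n)
white = rng.normal(size=n)                          # residuals of a well-performing model
unmatched = white + 2.0 * np.sin(2 * np.pi * t / 20)  # model missed a period-20 cycle

bound = 1.96 / np.sqrt(n)  # approximate 95% bound under white noise
rho_white = lag1_autocorrelation(white)
rho_unmatched = lag1_autocorrelation(unmatched)
freqs, power = periodogram(unmatched)
peak_period = 1.0 / freqs[np.argmax(power)]
```

A lag-1 autocorrelation well outside the bound, or a sharp periodogram peak, suggests that structure remains in the residuals.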
3. model validation for a single location - time-series analysis
methods:
i. compare the performance of the numerical model with the performance of statistical time-series models:
• Autoregressive Integrated Moving Average Models (ARIMA)
• Bayesian Dynamic Linear Models (DLM)
• Analogue Forecasting Models
this method requires subsetting in order to build the statistical models.
ii. examine in detail the performance of the numerical model:
• models for known periodicities
• bootstrap R
• process convolutions
i. compare the performance of the numerical model with the performance of statistical time-series models:
a) Autoregressive Integrated Moving Average Models (ARIMA; Box and Jenkins, 1976)
[figure: time series of T [ºC] with forecasts at t+1, t+2 and t+3]
$X_{t+1} = 0.9X_t - 0.2X_{t-1} + e_t$, $e_t \sim N(0, 1.2)$
ARIMA models can contain seasonal components.
Box, G. E. P., Jenkins, G. M. 1976. Time Series Analysis: forecasting and control. Holden Day, Oakland, CA.
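As a rough illustration of the benchmark idea, the sketch below simulates the AR(2) example from the slide, fits it by least squares on a calibration subset, and scores one-step forecasts on the validation subset. Treating 1.2 as the noise variance, the series length and the 80/20 split are assumptions; a full ARIMA fit would normally use a dedicated time-series library.

```python
import numpy as np

def simulate_ar2(n, phi1=0.9, phi2=-0.2, noise_var=1.2, seed=0):
    """Simulate X[t+1] = phi1*X[t] + phi2*X[t-1] + e[t], e ~ N(0, noise_var)."""
    rng = np.random.default_rng(seed)
    e = rng.normal(0.0, np.sqrt(noise_var), size=n)
    x = np.zeros(n)
    for t in range(1, n - 1):
        x[t + 1] = phi1 * x[t] + phi2 * x[t - 1] + e[t]
    return x

def fit_ar2(x):
    """Least-squares estimate of (phi1, phi2)."""
    X = np.column_stack([x[1:-1], x[:-2]])  # lag-1 and lag-2 regressors
    coef, *_ = np.linalg.lstsq(X, x[2:], rcond=None)
    return coef

x = simulate_ar2(5000)
n_cal = 4000                                # calibration subset: first 80%
phi_hat = fit_ar2(x[:n_cal])

# One-step-ahead forecasts over the validation subset
pred = phi_hat[0] * x[n_cal - 1:-1] + phi_hat[1] * x[n_cal - 2:-2]
rmse_ar = np.sqrt(np.mean((x[n_cal:] - pred) ** 2))
```

The numerical model's error on the same validation subset can then be set against `rmse_ar`.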
b) Bayesian Dynamic Linear Models (West and Harrison, 2000)
[figure: time series of T [ºC] with forecast at t+1]
$X_{t+1} = a_t X_t + b_t X_{t-1} + e_t$
$a_t = a_{t-1} + d_t$, $b_t = b_{t-1} + g_t$
$e_t \sim N(0, 1.2)$, $d_t \sim N(0, 0.05)$, $g_t \sim N(0, 0.02)$
West, M., Harrison, J., 2000. Bayesian Forecasting and Dynamic Models. Springer-Verlag, NY.
c) Analogue Forecasting Models (McNames, 2002)
[figure: time series of T [ºC] showing the latest segment L and two past analogue segments A1 and A2, each followed by its value at t+1]
$X_{t+1,L} = 0.7X_{t+1,A1} + 0.3X_{t+1,A2} + e$, $e \sim N(0, 1.2)$
McNames, J. 2002. Local averaging optimization for chaotic time series prediction, Neurocomputing 48(1-4): 279-297
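The analogue idea can be sketched as a nearest-neighbour search over past segments. The window length `m`, the number of analogues `k` and the inverse-distance weights are assumptions of this example; McNames (2002) optimises such choices, which is not attempted here.

```python
import numpy as np

def analogue_forecast(x, m=3, k=2):
    """Forecast the next value from the successors of the k past segments
    most similar to the latest m values."""
    x = np.asarray(x, dtype=float)
    latest = x[-m:]
    n_cand = x.size - m  # windows x[i:i+m] whose successor x[i+m] is known
    windows = np.lib.stride_tricks.sliding_window_view(x, m)[:n_cand]
    dist = np.linalg.norm(windows - latest, axis=1)
    nearest = np.argsort(dist)[:k]
    weights = 1.0 / (dist[nearest] + 1e-12)  # closer analogues weigh more
    weights /= weights.sum()
    return float(np.dot(weights, x[nearest + m]))

# Hypothetical periodic series: past cycles are perfect analogues
t = np.arange(120)
series = np.sin(2 * np.pi * t / 12)
next_value = analogue_forecast(series)
```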
[figure: time series of T [ºC] comparing observations, numerical model, ARIMA, Bayesian DLM and analogue forecasting model]
[figure: diagram summarising multiple aspects of model performance (Taylor, 2001)]
Taylor, K.E. 2001. Summarizing multiple aspects of model performance in a single diagram. J Geophys Res 106(D7): 7183–7192
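The statistics that a Taylor diagram displays for each model can be computed as in the sketch below: the standard deviations, the correlation with the observations, and the centred root-mean-square difference. The function name and the synthetic series are assumptions.

```python
import numpy as np

def taylor_statistics(reference, model):
    """Standard deviations, correlation R and centred RMS difference."""
    r = np.asarray(reference, dtype=float)
    f = np.asarray(model, dtype=float)
    sigma_r, sigma_f = r.std(), f.std()
    corr = np.corrcoef(r, f)[0, 1]
    crmsd = np.sqrt(np.mean(((f - f.mean()) - (r - r.mean())) ** 2))
    return sigma_r, sigma_f, corr, crmsd

rng = np.random.default_rng(0)
obs = rng.normal(size=200)
mod = 0.8 * obs + rng.normal(scale=0.5, size=200) + 1.0  # biased, noisy model

sigma_r, sigma_f, corr, crmsd = taylor_statistics(obs, mod)
```

These quantities satisfy crmsd² = σ_f² + σ_r² − 2 σ_f σ_r R, which is the law-of-cosines relation that lets a single point on the diagram encode all of them. The mean bias (here 1.0) is removed by the centring and has to be reported separately.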
ii. examine in detail the performance of the numerical model
a) models for known periodicities
e.g.: is the numerical model emulating the major tide components?
model for the observations: [equation not shown]
model for the numerical model output: [equation not shown]
if, for example, d2 is significantly different from 0, we may conclude that the model is not reproducing the f1 periodicity well.
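Since the slide's equations are not available, the sketch below uses a different, simpler formulation of the same idea: regress the difference between observations and model output on a cosine and sine at a known frequency and check whether the coefficients differ significantly from 0. The M2 tidal period of 12.42 h is used as the known periodicity; the synthetic data are assumptions, and the t-statistics assume independent errors.

```python
import numpy as np

def harmonic_fit(t, y, freq):
    """OLS fit of y = c0 + c1*cos(2*pi*f*t) + c2*sin(2*pi*f*t); returns coefficients and t-statistics."""
    X = np.column_stack([np.ones(t.size),
                         np.cos(2 * np.pi * freq * t),
                         np.sin(2 * np.pi * freq * t)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    s2 = resid @ resid / (y.size - X.shape[1])
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    return coef, coef / se

rng = np.random.default_rng(2)
t = np.arange(500.0)   # hourly records
f1 = 1.0 / 12.42       # M2 tidal frequency in cycles per hour

observed = 2.0 * np.cos(2 * np.pi * f1 * t) + rng.normal(scale=0.3, size=t.size)
predicted = 1.5 * np.cos(2 * np.pi * f1 * t)  # model underestimates the amplitude

coef, t_stats = harmonic_fit(t, observed - predicted, f1)
misfit_at_f1 = bool(np.any(np.abs(t_stats[1:]) > 1.96))
```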
b) bootstrap R (Mudelsee, 2003) – time series usually have positive serial dependence, a.k.a. persistence (i.e., lagged autocorrelations are significant and positive). This affects the estimation of confidence intervals for the cross-correlation (R).
[figure: time series of T [ºC] for observations and numerical model]
Mudelsee, M., 2003. Estimating Pearson’s Correlation Coefficient With Bootstrap Confidence Interval From Serially Dependent Time Series. Mathematical Geology 35(6): 651-665
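One common way to respect persistence is to resample blocks of consecutive (observation, prediction) pairs rather than individual pairs. The sketch below is a generic pairwise moving-block bootstrap with a percentile interval. It is not a reproduction of Mudelsee's algorithm: the fixed block length, the percentile interval and the synthetic AR(1) data are assumptions.

```python
import numpy as np

def block_bootstrap_r(x, y, block_len, n_boot=500, alpha=0.05, seed=0):
    """Pearson's R with a moving-block bootstrap percentile confidence interval."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    rng = np.random.default_rng(seed)
    n_blocks = int(np.ceil(n / block_len))
    offsets = np.arange(block_len)
    r_boot = np.empty(n_boot)
    for b in range(n_boot):
        starts = rng.integers(0, n - block_len + 1, size=n_blocks)
        idx = (starts[:, None] + offsets[None, :]).ravel()[:n]  # same indices for x and y
        r_boot[b] = np.corrcoef(x[idx], y[idx])[0, 1]
    lo, hi = np.quantile(r_boot, [alpha / 2, 1 - alpha / 2])
    return np.corrcoef(x, y)[0, 1], lo, hi

# Hypothetical persistent observations and a model that tracks them with error
rng = np.random.default_rng(3)
n = 300
obs = np.zeros(n)
for i in range(1, n):
    obs[i] = 0.8 * obs[i - 1] + rng.normal()
mod = obs + rng.normal(scale=0.5, size=n)

r, r_lo, r_hi = block_bootstrap_r(obs, mod, block_len=15)
```

The block length should be long relative to the persistence time of the series; 15 is an arbitrary choice for this example.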
c) process convolutions (Higdon, 2002) – help to define time periods where observations and predictions differ significantly. Should be applied to residuals (observations – predictions).
[figure: upper panel, T [ºC] against time for observations and numerical model; lower panel, residual [ºC] against time with a 95% confidence band around 0; annotations: observational missing values, wider confidence bands, significant model misfit]
Higdon, D., 2002. Space and space-time modeling using process convolutions. In Quantitative Methods for Current Environmental Issues, eds. C. Anderson, V. Barnett, P. C. Chatwin, and A. H. El-Shaarawi, 37–56. London: Springer-Verlag
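A much simplified version of the idea is sketched below: the residual process is written as Gaussian kernels centred on a grid of knots, each multiplied by a latent N(0, prior_sd²) weight, and the posterior of the smooth process is obtained in closed form. Fixing the kernel width, noise level and prior scale in advance is an assumption made to keep the example short; Higdon (2002) treats such quantities within a full Bayesian model.

```python
import numpy as np

def process_convolution_band(t_obs, resid, t_grid, knots, kernel_sd, noise_sd, prior_sd):
    """Posterior mean and 95% band of a kernel-convolution process fitted to residuals."""
    def kernel(t):
        return np.exp(-0.5 * ((t[:, None] - knots[None, :]) / kernel_sd) ** 2)
    K = kernel(t_obs)
    precision = K.T @ K / noise_sd**2 + np.eye(knots.size) / prior_sd**2
    cov = np.linalg.inv(precision)            # posterior covariance of the knot weights
    mean = cov @ K.T @ resid / noise_sd**2    # posterior mean of the knot weights
    Kg = kernel(t_grid)
    fit = Kg @ mean
    sd = np.sqrt(np.einsum("ij,jk,ik->i", Kg, cov, Kg))
    return fit, fit - 1.96 * sd, fit + 1.96 * sd

rng = np.random.default_rng(4)
t_grid = np.arange(200.0)
misfit = 2.0 * np.exp(-0.5 * ((t_grid - 135.0) / 8.0) ** 2)  # model error around t = 135
resid_full = misfit + rng.normal(scale=0.5, size=t_grid.size)
available = ~((t_grid >= 40) & (t_grid <= 70))               # missing observations

fit, lower, upper = process_convolution_band(
    t_grid[available], resid_full[available], t_grid,
    knots=np.arange(0.0, 201.0, 5.0), kernel_sd=5.0, noise_sd=0.5, prior_sd=1.0)

significant = (lower > 0) | (upper < 0)  # band excludes 0
width = upper - lower
```

In this example the band widens over the stretch with missing observations and excludes 0 around the simulated misfit.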
4. model validation for a single instant – spatial data analysis
[figure: T [ºC] field from the output of the numerical model, with in-situ measurements]
methods:
i. direct comparison between numerical model and observations
a) figure of merit in space (FMS) / measure of effectiveness (MOE)
b) entity-based verification
ii. residual analysis
a) process convolutions
i. direct comparison between numerical model and observations
a) figure of merit in space (FMS) / measure of effectiveness (MOE)
[figure: T [ºC] field with the output of the numerical model (predictions) and in-situ measurements (observations)]
AP: area where T1 < TP < T2 (predictions)
AO: area where T1 < TO < T2 (observations)
regions shown: AP, AO, AP∩AO, A False Positive, A False Negative
[figure: AP, AO, AP∩AO, A False Positive and A False Negative shown against distance d and azimuth [º] (0º, 45º, 90º)]
this is a simple statistical approach, with easy interpretation and potential impact on decision-makers. However, it depends on some subjective criteria that have a strong impact on the outcome: boundaries (T1 and T2), interpolation algorithm, interpolation smoothness; the density and location of the observations are also important.
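The slides do not give formulas for these scores, so the sketch below uses their usual definitions: FMS as the overlap area divided by the union of AP and AO, and MOE as the pair (overlap/AO, overlap/AP). It assumes that predictions and observations have already been interpolated to the same grid of equal-area cells, and that neither area is empty; the toy fields and thresholds are made up.

```python
import numpy as np

def overlap_scores(predicted, observed, t1, t2):
    """FMS and two-dimensional MOE for the areas where T1 < T < T2."""
    ap = (predicted > t1) & (predicted < t2)  # predicted area AP
    ao = (observed > t1) & (observed < t2)    # observed area AO
    overlap = np.sum(ap & ao)
    fms = overlap / np.sum(ap | ao)
    moe = (overlap / np.sum(ao),  # 1 - false-negative fraction
           overlap / np.sum(ap))  # 1 - false-positive fraction
    return fms, moe

predicted = np.array([[10.0, 12.0, 14.0],
                      [16.0, 18.0, 20.0]])
observed = np.array([[11.0, 13.0, 15.0],
                     [17.0, 19.0, 21.0]])

fms, moe = overlap_scores(predicted, observed, t1=11.5, t2=18.5)
print(fms, moe)  # 0.75 (1.0, 0.75)
```

Rerunning with different T1, T2 or a different interpolation of the observations shows how sensitive the scores are to those choices.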
b) entity-based verification (Ebert and McBride, 2000)
the total mean squared error (MSE) can be written as:
$MSE_{total} = MSE_{displacement} + MSE_{volume} + MSE_{pattern}$
the difference between the mean squared error before and after translation is the contribution to total error due to displacement:
$MSE_{displacement} = MSE_{total} - MSE_{shifted}$
the error component due to volume represents the bias in mean intensity:
$MSE_{volume} = (\bar{F} - \bar{X})^2$
where $\bar{F}$ and $\bar{X}$ are the entity’s mean forecast and observed values after the shift. The pattern error accounts for differences in the fine structure of forecast and observed fields:
$MSE_{pattern} = MSE_{shifted} - MSE_{volume}$
Ebert, E.E., McBride, J.L. 2000. Verification of precipitation in weather systems: Determination of systematic errors. J. Hydrology 239: 179-202.
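The decomposition can be sketched on a gridded field as below. The best translation is found here by a brute-force search over integer shifts that minimises the MSE, and `np.roll` wraps the field around the domain edges; both are simplifying assumptions of this example, as is the synthetic Gaussian entity.

```python
import numpy as np

def cra_decomposition(forecast, observed, max_shift=5):
    """Split the total MSE into displacement, volume and pattern components."""
    def mse(a, b):
        return float(np.mean((a - b) ** 2))
    mse_total = mse(forecast, observed)
    best = (mse_total, 0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            m = mse(np.roll(forecast, (dy, dx), axis=(0, 1)), observed)
            if m < best[0]:
                best = (m, dy, dx)
    mse_shifted, dy, dx = best
    shifted = np.roll(forecast, (dy, dx), axis=(0, 1))
    volume = float((shifted.mean() - observed.mean()) ** 2)
    return {"total": mse_total,
            "displacement": mse_total - mse_shifted,
            "volume": volume,
            "pattern": mse_shifted - volume,
            "shift": (dy, dx)}

# Hypothetical entity: the forecast is displaced, too intense and biased
yy, xx = np.mgrid[0:30, 0:30]
observed = np.exp(-0.5 * (((yy - 10) / 3.0) ** 2 + ((xx - 10) / 3.0) ** 2))
forecast = np.roll(1.2 * observed, (3, -2), axis=(0, 1)) + 0.1

result = cra_decomposition(forecast, observed)
```

By construction the three components add up to the total MSE, and the recovered shift undoes the displacement imposed on the forecast.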
ii. residual analysis
a) process convolutions (Higdon, 2002)
[figure: residual surface z over the (x, y) plane, with 95% confidence interval around 0]
5. model validation for variable space and time - spatiotemporal data analysis
methods:
i. analyse observations and predictions at a single location (time-series analysis) or time instant (spatial data analysis) – see sections 3 & 4
ii. residual analysis – dynamic process convolutions
ii. residual analysis - dynamic process convolutions (Higdon, 2002)
[figure: spatial process at three instants: Time 1, S(., b1); Time 2, S(., b2); Time 3, S(., b3)]
Residuals = Spatial Process + Noise
$y_i = S(x_i, b_2) + e_i$
6. summary
in essence, two validation approaches were proposed:
• signal analysis – used to investigate to what extent the most important features of the data are captured by the numerical model
• residual analysis – used to investigate if some significant features were left out by the numerical model
a third option is available: compare the performance of the numerical model with that of statistical models (ARIMA, DLMs, etc.).