
Presentation Transcript


  1. Methods of Multi-Model Consolidation, with Emphasis on the Recommended Cross Validation Approach. Huug van den Dool, CTB seminar, May 11, 2009. Acknowledgement: Malaquias Pena, Ake Johansson, Wanqiu Wang, Tony Barnston, Suranjana Saha

  2. Traditional Anomaly Correlation. F' = (F - C_obs), A' = (A - C_obs), where F is the forecast, A the verifying analysis, and C_obs the observed climatology. AC = Σ F'A' / (Σ F'F' Σ A'A')^(1/2). Summation is in space, or in space and time; weighting may be involved. C_obs is known at the time the forecast is made, i.e. determined from previous data. A (and obviously F) are not part of the sample from which C_obs is calculated. Relationship of AC (skill) to MSE. AC is calculated from 'raw' data.
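A minimal sketch of this computation (not from the slides), assuming `fcst`, `anal`, and `clim_obs` are NumPy arrays on a common grid and `weights` is an optional area weight:

```python
import numpy as np

def anomaly_correlation(fcst, anal, clim_obs, weights=None):
    """Traditional AC: sum of F'A' over the product of the anomaly norms."""
    f_anom = fcst - clim_obs                  # F' = F - C_obs
    a_anom = anal - clim_obs                  # A' = A - C_obs
    w = np.ones_like(f_anom) if weights is None else weights  # e.g. cos(latitude)
    num = np.sum(w * f_anom * a_anom)
    den = np.sqrt(np.sum(w * f_anom ** 2) * np.sum(w * a_anom ** 2))
    return num / den
```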

  3. New trend due to availability of hindcast data sets: F'' = (F - C_mdl), A' = (A - C_obs), and C_obs tends to be calculated from data that matches the model data.

  4. Short-Cut Anomaly Correlation. F'' = (F - C_mdl), A' = (A - C_obs). AC_sc = Σ F''A' / (Σ F''F'' Σ A'A')^(1/2). F'' = (F - C_mdl) = (F - C_obs) - (C_mdl - C_obs), so F'' = F' - (C_mdl - C_obs) (1). Using F'' amounts to a systematic error correction (SEC), which requires a cross-validation (CV) to be honest. {{Eq. (1) becomes more involved if the periods for C_mdl and C_obs are not the same.}}
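A companion sketch for the short-cut version, assuming `clim_mdl` is the model's own hindcast climatology (function and variable names are illustrative, not from the talk); as the slide notes, an honest skill estimate requires the implied SEC to be cross-validated:

```python
import numpy as np

def shortcut_anomaly_correlation(fcst, anal, clim_mdl, clim_obs):
    """AC_sc: model anomalies F'' = F - C_mdl, i.e. F' minus the systematic error (C_mdl - C_obs)."""
    f_anom = fcst - clim_mdl                  # F'' = F - C_mdl
    a_anom = anal - clim_obs                  # A'  = A - C_obs
    num = np.sum(f_anom * a_anom)
    den = np.sqrt(np.sum(f_anom ** 2) * np.sum(a_anom ** 2))
    return num / den
```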

  5. Why do we need CV?
  • To obtain an estimate of skill on future (independent) data. While there is no substitute for real-time forecasts on future data, a CV procedure attempts to help us out (without having to wait too long).
  • Leaving N years out of a sample of M creates N independent data points. Or does it??
  • Details of CV procedures used by authors are exceedingly ad hoc and often wrong.
  • We recommend 3CVRE.

  6. Meaning of 3CVRE
  • 3: Leave 3 years out (3 as a minimum).
  • R: Leave 3 years out, namely the test year plus two others chosen at Random; see example.
  • E: Use 'External' observed climatology, not an observed climatology that changes in response to leaving out a particular set of 3 years.

  7. Example, 1981-2001. Three years are left out; the first is the test year, the other two are picked at random.
  Years left out (test year: two random years):
  1981: 1985 1989
  1982: 2000 1989
  1983: 1990 1998
  1984: 1993 1981
  1985: 1992 1995
  1986: 1999 1987
  1987: 1996 1989
  1988: 1988 1989
  1989: 1983 1992
  1990: 1985 2000
  1991: 1990 2001
  1992: 1996 2001
  1993: 1985 1995
  1994: 1989 1991
  1995: 1986 1996
  1996: 1991 1990
  1997: 1991 1990
  1998: 1991 1988
  1999: 2001 1995
  2000: 2001 1991
  2001: 1998 1999
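A small sketch of how such left-out triplets could be drawn; the exact sampling rules (for instance whether a random pick may coincide with another row's choices) follow whatever the authors used, so treat this as illustrative only:

```python
import random

def leave_three_out_sets(years, seed=0):
    """For each test year, leave out that year plus two others chosen at random."""
    rng = random.Random(seed)
    sets = []
    for test_year in years:
        others = rng.sample([y for y in years if y != test_year], 2)
        sets.append((test_year, *others))
    return sets

for triplet in leave_three_out_sets(range(1981, 2002)):
    print("years left out:", *triplet)
```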

  8. Why leave three out, as opposed to just one? Two very different reasons:
  • Anomaly Correlation does not change between 'raw' and CV-1-out. (This can be shown analytically.)
  • CV-1-out leads to serious 'degeneracy' problems when the forecast involves a regression (as it does for an MME with unequal weights) and skill is not that high to begin with (which unfortunately applies).

  9. M. Peña Mendez and H. van den Dool, 2008: Consolidation of Multi-Method Forecasts at CPC. J. Climate, 21, 6521-6538. Unger, D., H. van den Dool, E. O'Lenic, and D. Collins, 2009: Ensemble Regression. Manuscript accepted, Monthly Weather Review; early online release posted January 2009, DOI: 10.1175/2008MWR2605.1. (1) CTB, (2) why do we need 'consolidation'?

  10. Context: Consolidation of Several Models

  11. OFFicial Forecast(element, lead, location, initial month) = a * F1 + b * F2 + c * F3 + … Honest hindcasts are required, 1950-present. The covariances (F1, F2), (F1, F3), (F2, F3), and (F1, A), (F2, A), (F3, A) allow solution for a, b, c (element, lead, location, initial month).

  12. CON is color blind

  13. Apply to:
  • Monthly SST, 1981-2001, 4 starts, leads 1-5
  • 9 models
  • Domain is the 20S-20N Pacific Ocean (gridpoints, not the Nino3.4 index)
  M. Peña Mendez and H. van den Dool, 2008: Consolidation of Multi-Method Forecasts at CPC. J. Climate, 21, 6521-6538.

  14. Table 1. Some information on the DEMETER-PLUS models. * Institutions developing these models: European Centre for Medium-Range Weather Forecasts, Max Planck Institute, Meteo-France, United Kingdom Met Office, Istituto Nazionale di Geofisica e Vulcanologia, Laboratoire d'Oceanographie Dynamique et de Climatologie, European Centre for Research and Advanced Training in Scientific Computation.

  15. CON = Σ_{k=1}^{K} α_k SST_k, i.e. a weighted mean over K model estimates. One finds the K alphas typically by minimizing the distance between CON and the observed SST.

  16. Classic or Unconstrained Regression (UR). The general problem of consolidation consists of finding a vector of weights, α, that minimizes the Sum of Squared Errors, SSE, given by the following expression: SSE = (Zα - o)^T (Zα - o) (5). This leads to Z^T Z α = Z^T o, so the weights are formally given by α = A^(-1) b (6), where A = Z^T Z is the covariance matrix and b = Z^T o. Equation (6) is the solution for ordinary (Unconstrained) linear Regression (UR).
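A hedged sketch of solution (6) in NumPy, assuming `Z` holds one column per model (hindcast cases as rows) and `o` the corresponding observations; the names are illustrative:

```python
import numpy as np

def unconstrained_regression_weights(Z, o):
    """Solve the normal equations Z^T Z alpha = Z^T o for the consolidation weights."""
    A = Z.T @ Z          # covariance matrix A = Z^T Z
    b = Z.T @ o          # b = Z^T o
    return np.linalg.solve(A, b)
```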

  17. Why ridge regression? One of the preferred methods that:
  • Tries to minimize damage due to overfitting (too many coefficients from too little data)
  • Tries to handle co-linearity as much as possible
  • Has a smaller difference in correlation (MSE) between dependent and independent data

  18. Essentially, ridging is a multiple linear regression with an additional penalty term to constrain the size of the squared weights in the minimization of SSE (5): J = (Zα - o)^T (Zα - o) + λ α^T α (7). Minimization of J leads to α = (A + λI)^(-1) b (8), where I is the identity matrix and λ, the regularization (or ridging) parameter, indicates the relative weight of the penalty term. Similarities between the ridging and Bayesian approaches for determining the weights have been discussed by Hsiang (1976) and DelSole (2007). In the Bayesian view, (8) represents the posterior mean of α, based on a normal a priori parameter distribution with mean zero and covariance matrix (σ²/λ)I, where σ²I is the covariance matrix of the regression residual, assumed to be normal with mean zero.
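The corresponding sketch for the ridge solution (8); `lam` stands for the ridging parameter λ, and a larger value pulls the weights harder toward zero:

```python
import numpy as np

def ridge_regression_weights(Z, o, lam):
    """alpha = (Z^T Z + lambda I)^{-1} Z^T o."""
    K = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(K), Z.T @ o)
```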

  19. (DelSole 2007)

  20. [Figure slide: consolidation methods compared - RIW, RI, RIM, Climo, UR, MMA, COR]

  21. [Figure panels: SEC; SEC and CV; 3CVRE]

  22.
  Mdl 4  anomaly  Obs   anomaly  year  SEC
  25.5     .7     26.8    -.4    1981  2.45
  25.9    1.1     28.1     .9    1982  2.45
  23.8    -.9     27.1    -.1    1983  2.45
  23.5   -1.3     26.7    -.5    1984  2.45
  24.1    -.7     26.7    -.5    1985  2.45
  26.0    1.3     27.4     .2    1986  2.45
  26.6    1.9     28.8    1.6    1987  2.45
  23.6   -1.1     25.6   -1.6    1988  2.45
  26.2    1.5     26.7    -.5    1989  2.45
  25.8    1.1     27.3     .1    1990  2.45
  23.5   -1.2     27.9     .7    1991  2.45
  24.4    -.3     27.5     .4    1992  2.45
  24.4    -.3     27.6     .4    1993  2.45
  23.5   -1.2     27.3     .1    1994  2.45
  22.9   -1.8     27.0    -.2    1995  2.45
  25.6     .9     27.1    -.1    1996  2.45
  25.8    1.1     28.9    1.7    1997  2.45
  23.4   -1.3     25.9   -1.2    1998  2.45
  24.5    -.2     26.3    -.8    1999  2.45
  25.0     .3     26.7    -.5    2000  2.45
  25.2     .4     27.3     .1    2001  2.45
  24.7     .0     27.2     .0    all
  No CV

  23.
  Mdl 4  anomaly  Obs   anomaly  year  SEC
  25.5     .9     26.8    -.4    1981  2.62
  25.9    1.3     28.1     .9    1982  2.62
  23.8    -.9     27.1    -.1    1983  2.46
  23.5   -1.3     26.7    -.5    1984  2.44
  24.1    -.8     26.7    -.5    1985  2.32
  26.0    1.4     27.4     .2    1986  2.56
  26.6    2.0     28.8    1.6    1987  2.63
  23.6    -.8     25.6   -1.6    1988  2.73
  26.2    1.5     26.7    -.5    1989  2.48
  25.8    1.1     27.3     .1    1990  2.54
  23.5   -1.2     27.9     .7    1991  2.42
  24.4    -.3     27.5     .4    1992  2.49
  24.4    -.5     27.6     .4    1993  2.32
  23.5   -1.3     27.3     .1    1994  2.38
  22.9   -1.8     27.0    -.2    1995  2.48
  25.6     .9     27.1    -.1    1996  2.45
  25.8    1.0     28.9    1.7    1997  2.36
  23.4   -1.4     25.9   -1.2    1998  2.37
  24.5    -.3     26.3    -.8    1999  2.42
  25.0     .2     26.7    -.5    2000  2.41
  25.2     .5     27.3     .1    2001  2.50
  24.7     .0     27.2     .0    all
  3CVRE
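One plausible way to produce the SEC columns of the two tables above (a sketch under the assumption that SEC is the observed minus the model climatology, with `mdl` and `obs` as 21-element arrays for 1981-2001; the 'external' observed climatology is approximated here by the full-period mean):

```python
import numpy as np

def sec_no_cv(mdl, obs):
    """One constant correction from the full-period means (the single repeated value above)."""
    return np.full(len(mdl), obs.mean() - mdl.mean())

def sec_3cvre(mdl, obs, years, leftout_sets):
    """Recompute the model climatology for each test year with its 3 left-out years removed;
    the observed climatology stays external (full period), per the E in 3CVRE."""
    sec = np.empty(len(years))
    for i, triplet in enumerate(leftout_sets):
        keep = [j for j, y in enumerate(years) if y not in triplet]
        sec[i] = obs.mean() - mdl[keep].mean()
    return sec
```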

  24. Conclusions MME
  • MMA is an improvement over individual models.
  • It is hard to improve upon an equal-weight ensemble average (MMA). Only WestPac SST shows some improvement under ridge regression.
  • This is caused by (very) deficient data set length. We need 5000 years, not 25.
  • Pooling gridpoints, pooling various start times and leads, throwing out 'bad' models upfront, and using all ensemble members helps.
  • Equal treatment for very unequal methods is ….
  • RIW and COR make sense, because this is what CPC does subjectively.
  • As should have been expected: UR is really bad.

  25. [Figure panels: AC (raw); ACsc; ACsc plus CV]

  26. Why leave three out, as opposed to just one? Two very different reasons:
  • Anomaly Correlation does not change between 'raw' and CV-1-out. (This can be shown analytically.)
  • CV-1-out leads to serious 'degeneracy' problems when the forecast involves a regression (as it does for an MME with unequal weights) and skill is not that high to begin with (which unfortunately applies).

  27. Bayesian Multimodel Strategies
  • Linear regression leads to unstable weights for small sample sizes.
  • Methods for producing more stable estimates have been proposed by van den Dool and Rukhovets (1994), Kharin and Zwiers (2002), Yun et al. (2003), and Robertson et al. (2004).
  • These methods are special cases of a Bayesian method, each distinguished by a different set of prior assumptions (DelSole 2007).
  • Some reasonable prior assumptions (see the sketch after this list):
    • R:0    Weights centered about 0 and bounded in magnitude (ridge regression)
    • R:MM   Weights centered about 1/K (K = # of models) and bounded in magnitude
    • R:MM+R Weights centered about an optimal value and bounded in magnitude
    • R:S2N  Models with a small S2N (signal-to-noise) ratio tend to have small weights
    • LS     Weights are unconstrained (ordinary least squares)
  From Jim Kinter (Feb 2009)
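As an illustration of the R:MM idea above (my formulation, not code from the talk): ridge the weights toward the equal-weight vector 1/K instead of toward zero by penalizing (α - α0)^T (α - α0) rather than α^T α:

```python
import numpy as np

def ridge_toward_equal_weights(Z, o, lam):
    """Minimize |Z a - o|^2 + lam * |a - a0|^2 with prior mean a0 = 1/K (the MMA weights)."""
    K = Z.shape[1]
    a0 = np.full(K, 1.0 / K)
    # Setting the gradient to zero gives (Z^T Z + lam I) a = Z^T o + lam a0.
    return np.linalg.solve(Z.T @ Z + lam * np.eye(K), Z.T @ o + lam * a0)
```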

  28. If the multimodel strategy is carefully cross-validated, then the simple mean beats all other investigated multimodel strategies.
  • Since Bayesian methods involve additional empirical parameters, proper assessment requires a two-deep cross-validation procedure. This can change the conclusion about the efficacy of various Bayesian priors.
  • Traditional cross-validation procedures are biased and incorrectly indicate that Bayesian schemes beat a simple mean.
  From Jim Kinter (Feb 2009)
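A schematic sketch of what 'two-deep' cross validation means here: the ridging parameter is selected in an inner leave-out loop, so the outer skill estimate never sees the data used to tune it. This is simplified to leave-one-out at both levels for brevity, whereas the talk recommends 3CVRE for the outer loop:

```python
import numpy as np

def _ridge(Z, o, lam):
    K = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(K), Z.T @ o)

def two_deep_cv_mse(Z, o, lambdas):
    """Outer loop: hold out case i. Inner loop: pick lambda on the remaining cases only."""
    n = len(o)
    errors = []
    for i in range(n):
        train = [j for j in range(n) if j != i]
        inner_scores = []
        for lam in lambdas:
            err = 0.0
            for k in train:
                inner = [j for j in train if j != k]
                w = _ridge(Z[inner], o[inner], lam)
                err += (Z[k] @ w - o[k]) ** 2
            inner_scores.append(err)
        best_lam = lambdas[int(np.argmin(inner_scores))]
        w = _ridge(Z[train], o[train], best_lam)
        errors.append((Z[i] @ w - o[i]) ** 2)
    return float(np.mean(errors))
```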

  29. Concluding comments CV
  • CV is done because …….
  • Does CV lower skill???
  • CV procedures are quite complicated, full of traps. (The price we pay for impatience.)
  • Is there an all-purpose CV approach?
  • 1-out procedures may be problematic for several reasons.
  • 3CVRE appears appropriate for (our) MME study.

  30. --------- OUT TO 1.5 YEARS -------
