
Seasonal climate prediction using linear weighted multi model system W. T. Yun APCN/ Korea Meteorological Administration

Contents
• Introduction
• What is Multi Model Ensemble?
• Construction of Multi Model Ensemble System
  - Gauss-Jordan Elimination
  - Singular Value Decomposition (SVD)
  - Synthetic Multi Model Ensemble
  - Generating Synthetic Data Set
• Multi Model Ensemble Seasonal Forecast
• Skill of Multi Model Forecast
• Application

Introduction
Regional climate change and climate variability affect socio-economic activities in many ways, and the impacts grow as those activities become more complex and intensive. Seasonal climate prediction is one of the important and challenging tasks in meteorology. Advance seasonal prediction of droughts, monsoons, etc. is now scientifically feasible. It can be enormously beneficial in national planning, e.g. in water resources management, disaster management, agricultural planning, and food production.

What is Multi Model Ensemble?
• Multi Model Ensemble: an ensemble comprising different models
• Weighted Multi Model Ensemble: a weighted combination of multiple models

Multi-Model Ensemble Anomaly Forecast?
• Biased Ensemble Mean
• Bias Corrected Ensemble Mean
• Weighted Combination of Multi Models
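The formulas for these three combinations did not survive extraction. The block below is a sketch of how they are usually written, assuming the notation used on the superensemble slides: F_i is the ith model forecast, \bar{F}_i and \bar{O} are the forecast and observed means over the training period, a_i are the regression weights, and N is the number of models. The labels E_b, E_c and S are chosen here for illustration and are not taken from the slide.

```latex
\begin{align}
E_b &= \frac{1}{N}\sum_{i=1}^{N} F_i
      && \text{(biased ensemble mean)} \\
E_c &= \bar{O} + \frac{1}{N}\sum_{i=1}^{N}\left(F_i - \bar{F}_i\right)
      && \text{(bias corrected ensemble mean)} \\
S   &= \bar{O} + \sum_{i=1}^{N} a_i\left(F_i - \bar{F}_i\right)
      && \text{(weighted combination of multi models)}
\end{align}
```

Setting every a_i = 1/N in the weighted form gives back the bias corrected ensemble mean, so the weighted combination can be read as a generalization of it.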

Why Multi-Model Ensemble Forecast?
[Figure: superensemble forecast compared with AMIP model forecasts (Dec. 1988). Panels: Obs, ECMWF, Sup, GFDL, Sup-Obs, MPI.]

Construction of Multi-Model Ensemble Prediction
The climate system can be regarded as a dynamic nonlinear system.
• Linear statistical methods
• Nonlinear statistical methods, artificial neural network methods

Neural Network Model with Back-Propagation
[Figure: a feed-forward neural network with one hidden layer, where the jth neuron in this hidden layer is assigned the value hj. Input layer: x1, x2, x3, ..., xk. Hidden layer: hj. Output layer: yi (prediction). The connections carry weight indices jk and ij, and the error is propagated backward.]
The output is a linear combination of the neurons in the layer just before the output layer. The cost function is minimized by means of gradient descent, and the whole weight vector is updated according to the back-propagation learning rule. Learning can be made more efficient by including a momentum term, which refers to the previous update. Local minima in the network can be avoided by introducing noise into the gradient-descent updating rule, which in the case considered here follows the Manhattan updating rule.
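The training loop described on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the network used for the results shown: the function name `train_mlp`, the tanh hidden units, the layer size, the learning rate and the momentum value are all assumptions, and the update uses the plain gradient with momentum rather than the Manhattan rule.

```python
import numpy as np

def train_mlp(X, y, n_hidden=4, lr=0.1, momentum=0.5, epochs=2000, seed=0):
    """One-hidden-layer feed-forward network trained by back-propagation.

    X: (samples, inputs), y: (samples, 1). Returns (params, costs).
    All hyperparameters are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], n_hidden))  # input -> hidden
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.5, size=(n_hidden, 1))           # hidden -> output
    b2 = np.zeros(1)
    params = [W1, b1, W2, b2]
    velocity = [np.zeros_like(p) for p in params]
    costs = []
    for _ in range(epochs):
        # Forward pass: tanh hidden layer, linear output layer.
        h = np.tanh(X @ W1 + b1)
        y_hat = h @ W2 + b2
        err = y_hat - y
        costs.append(0.5 * float(np.mean(err ** 2)))
        # Backward pass: gradients of the mean squared-error cost.
        d_out = err / len(X)
        d_hid = (d_out @ W2.T) * (1.0 - h ** 2)
        grads = [X.T @ d_hid, d_hid.sum(axis=0), h.T @ d_out, d_out.sum(axis=0)]
        # Gradient descent with a momentum term (refers to the previous update).
        # A Manhattan-style update would use np.sign(g) instead of g; not done here.
        for p, v, g in zip(params, velocity, grads):
            v *= momentum
            v -= lr * g
            p += v
    return params, costs
```

In a multi-model setting the columns of `X` would be the member-model forecasts at one grid point and `y` the observed analysis, so the network plays the role of a nonlinear weighting of the models.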

Skill Score of Non-Linear Multi-Model Ensemble Forecast
[Figure: RMSE of global precipitation for 12 months (Jan.-Dec. 1988), ANN forecasts using AMIP data. Legend: Bias Corrected, Climatology, ANN. x-axis: Jan to Dec; y-axis: RMS.]

Construction of Weighted Linear Multi-Model Ensemble Prediction System

Multi-Model Ensemble Prediction System
[Figure: schematic of the training phase and the forecast phase, separated at t=0. Labels: Observed Analysis, MME Forecast.]

Weighted Multi-Model Ensemble Techniques
(1) MME with Pointwise Regression
(2) MME with Pattern Regression
(3) MME with Spatio-Temporal Regression
[Figure: one schematic per technique showing member models A, B, C, D, E, ... on a time axis labelled t=t-1 and t=0.]

Superensemble Based on Gauss-Jordan Elimination

Construction of Superensemble
Superensemble forecast:
$$S = \bar{O} + \sum_{i=1}^{N} a_i \left(F_i - \bar{F}_i\right)$$
where $F_i$ is the $i$th model forecast, $\bar{F}_i$ is the mean of the $i$th forecast over the training period, $\bar{O}$ is the observed mean over the training period, $a_i$ are regression coefficients obtained by a minimization procedure during the training period, and $N$ is the number of forecast models involved.
To obtain the weights, the covariance matrix is built with the seasonal cycle-removed anomaly ($F'$):
$$C_{i,j} = \sum_{t} F'_i(t)\, F'_j(t)$$
where $t$ denotes time over the training period and $i$, $j$ denote the $i$th and $j$th forecast models, respectively. After calculation of the covariance matrix $C$, we can construct the weighting component for each grid point of each model.
(T.N. Krishnamurti et al., 1999, Science)

Gauss-Jordan Elimination A·x=B
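As an illustration of solving A·x=B for the superensemble weights, the sketch below builds the covariance matrix from training anomalies at one grid point and solves the resulting system by Gauss-Jordan elimination with partial pivoting. The function names, the array layout (time by model) and the right-hand side (covariance between each model anomaly and the observed anomaly) are assumptions made for this example.

```python
import numpy as np

def gauss_jordan_solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    A = np.array(A, dtype=float)
    n = len(A)
    aug = np.hstack([A, np.array(b, dtype=float).reshape(n, -1)])
    for col in range(n):
        # Bring the largest remaining entry of this column onto the diagonal.
        pivot = col + int(np.argmax(np.abs(aug[col:, col])))
        if np.isclose(aug[pivot, col], 0.0):
            raise np.linalg.LinAlgError("matrix is singular to working precision")
        aug[[col, pivot]] = aug[[pivot, col]]
        aug[col] /= aug[col, col]
        # Eliminate the column from every other row (above and below).
        for row in range(n):
            if row != col:
                aug[row] -= aug[row, col] * aug[col]
    return aug[:, n:].squeeze()

def superensemble_weights(F_train, O_train):
    """Weights a_i at one grid point. F_train: (time, models), O_train: (time,)."""
    F_anom = F_train - F_train.mean(axis=0)
    O_anom = O_train - O_train.mean()
    C = F_anom.T @ F_anom          # covariance matrix between models
    rhs = F_anom.T @ O_anom        # covariance of each model with the observation
    return gauss_jordan_solve(C, rhs)
```

The explicit singularity check matters here: when member models are nearly collinear the covariance matrix becomes ill-conditioned, which is the situation the SVD-based construction is meant to handle.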

Superensemble Based on SVD

MME System based on SVD
$$S = \bar{O} + \sum_{i=1}^{N} a_i \left(F_i - \bar{F}_i\right)$$
where $F_i$ is the $i$th model forecast, $\bar{F}_i$ is the mean of the $i$th forecast over the training period, $\bar{O}$ is the observed mean over the training period, $a_i$ are regression coefficients obtained by a minimization procedure during the training period, and $N$ is the number of forecast models involved.
To obtain the weights, the covariance matrix is built with the seasonal cycle-removed anomaly ($F'$):
$$C_{i,j} = \sum_{t} F'_i(t)\, F'_j(t)$$
where $t$ denotes time over the training period and $i$, $j$ denote the $i$th and $j$th forecast models, respectively. After construction of the covariance matrix $C$, weights are computed for each grid point of each model.
Best Linear Unbiased Estimation (BLUE): SVD realizes a completely orthogonal decomposition for any matrix. The weights are the vector $x$ that minimizes $r \equiv \lVert C \cdot x - b \rVert$; among the minimizers, SVD returns the solution vector of smallest length $|x|^2$ in the least-squares sense.
(W.T. Yun et al., 2003, J. Climate)

SVD realizes a completely orthogonal decomposition for any matrix $A$:
$$A = U \cdot \mathrm{diag}(w_j) \cdot V^{T}$$
The solution
$$x = V \cdot \mathrm{diag}(1/w_j) \cdot \left(U^{T} \cdot b\right)$$
minimizes $r \equiv \lVert A \cdot x - b \rVert$ and is the solution vector of smallest length $|x|^2$ in the least-squares sense.
• Underdetermined system ($m < n$, fewer equations than unknowns): SVD produces the solution whose values are smallest in the least-squares sense.
• Overdetermined system ($m > n$, more equations than unknowns): SVD produces the solution that is the best approximation in the least-squares sense.
The SVD technique removes the singularity problem.
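The three cases on this slide (underdetermined, overdetermined, singular) can be checked with a short NumPy sketch. The function name `svd_solve` and the tolerance used to decide which singular values count as zero are assumptions for this example.

```python
import numpy as np

def svd_solve(A, b):
    """Minimum-norm least-squares solution of A x = b through the SVD."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    U, w, Vt = np.linalg.svd(A, full_matrices=False)
    # Treat singular values at round-off level as exactly zero (assumed tolerance).
    tol = w.max() * max(A.shape) * np.finfo(float).eps
    keep = w > tol
    w_inv = np.zeros_like(w)
    w_inv[keep] = 1.0 / w[keep]
    # x = V diag(1/w_j) U^T b, with 1/w_j replaced by 0 for the dropped values.
    return Vt.T @ (w_inv * (U.T @ b))

# Underdetermined: one equation x1 + x2 = 2, the shortest solution is (1, 1).
x_under = svd_solve([[1.0, 1.0]], [2.0])

# Singular square system: x1 + 2*x2 = 1 stated twice (second row doubled).
x_sing = svd_solve([[1.0, 2.0], [2.0, 4.0]], [1.0, 2.0])
```

For the overdetermined case the result agrees with an ordinary least-squares fit, so the same routine covers all three situations without a separate singularity check.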

[Figure: error covariance matrices (AMIP) for precipitation, Gauss-Jordan and SVD. Labels: Total Variance, Explained Variance, Unexplained Variance (%).]

Relative explained variance r2 (%) of regression models using Gauss-Jordan elimination and SVD with zeroing of the small singular values. All values are averaged. Relative unexplained variance = 1 - r2.

Variable        Total      Gauss-    SVD       SVD       SVD       SVD       SVD       SVD
                variance   Jordan    w(1)      w(1-2)    w(1-3)    w(1-4)    w(1-5)    w(1-6)
Precipitation    1.2496    85.0723   92.2724   90.4899   89.1822   87.8679   86.5430   85.2433
T850             2.3328    90.3815   97.4425   95.8743   94.4600   93.2157   91.9906   90.5030
u200            19.3385    87.1593   93.6309   92.3347   91.1363   89.8606   88.6386   87.3436
u850             4.2623    90.1600   96.2329   95.0812   93.7804   92.7039   91.5405   90.3010
v200            10.7958    92.5053   98.0942   96.9052   95.9238   94.8459   93.7451   92.6036
v850             2.3297    92.7304   98.6048   97.2439   96.1230   95.0179   93.9193   92.8188

[Figure: RMSE of MME based on SVD (global, precipitation) over the training and forecast phases. Labels: Simple Ensemble, Conventional Superensemble, SVD, Mean RMSE.]

Zeroing the Small Singular Values
$$x = V \cdot \mathrm{diag}(1/w_j) \cdot \left(U^{T} \cdot b\right) \qquad (1)$$
The solution vector $x$ obtained by zeroing the small $w_j$'s and then using equation (1) is better than the SVD solution in which the small $w_j$'s are left nonzero. It may seem paradoxical that this can be so, since zeroing a singular value corresponds to throwing away one linear combination of the set of equations that we are trying to solve. The resolution of the paradox is that we are throwing away precisely a combination of equations that is corrupted by roundoff error. Leaving the small $w_j$'s nonzero usually makes the residual larger. There is no exact rule for the threshold below which the small $w_j$'s should be zeroed.
The condition number of a matrix is defined as the ratio of the largest (in magnitude) of the $w_j$'s to the smallest of the $w_j$'s. A matrix is singular if its condition number is infinite, and it is ill-conditioned if its condition number is too large.
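A small sketch of the zeroing step and of the condition number, under stated assumptions: the function name `truncated_svd_solve`, the parameter `n_keep` (how many of the largest singular values to retain) and the relative threshold `rel_tol` are hypothetical choices for illustration, since the slides give no fixed threshold.

```python
import numpy as np

def truncated_svd_solve(C, b, n_keep=None, rel_tol=1e-6):
    """Solve C x = b by SVD, zeroing 1/w_j for the small singular values.

    n_keep: retain only the n_keep largest singular values (assumed option).
    rel_tol: otherwise zero every w_j below rel_tol * max(w) (assumed value).
    Returns (x, condition_number).
    """
    C = np.asarray(C, dtype=float)
    b = np.asarray(b, dtype=float)
    U, w, Vt = np.linalg.svd(C, full_matrices=False)  # w is sorted, largest first
    if n_keep is None:
        keep = w > rel_tol * w[0]
    else:
        keep = (np.arange(len(w)) < n_keep) & (w > 0.0)
    w_inv = np.zeros_like(w)
    w_inv[keep] = 1.0 / w[keep]
    x = Vt.T @ (w_inv * (U.T @ b))
    # Condition number: largest over smallest singular value (infinite if singular).
    cond = np.inf if w[-1] == 0.0 else w[0] / w[-1]
    return x, cond

# Nearly singular covariance matrix: two almost identical "models".
C = np.array([[1.0, 1.0], [1.0, 1.0 + 1e-10]])
x_trunc, cond = truncated_svd_solve(C, [2.0, 2.0])
```

Here the exact solution puts all the weight on one model, while the truncated solution splits it evenly between the two nearly identical ones, which is the more stable choice when the difference between them is at round-off level.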

[Figure: singular values w1, ..., w6 and variance in the estimate of xj (precipitation, for one grid point), shown for w(1), w(1-2), w(1-3), w(1-4), w(1-5), w(1-6). Source: Yun et al. (2003), J. Climate.]

Global Mean RMSE (with Zeroing the Small Singular Values)
[Figure: four panels of global mean RMSE: precipitation, T850, u200, v200. Bars in each panel: w(1), w(1-2), w(1-3), w(1-4), w(1-5), w(1-6), G, Clim.]

High Prediction Skill of Multi-Model Ensemble
• Cancellation of bias among different models
• Not directly influenced by the model's systematic errors
• Maximization of explained variance
• Removes singularity in matrix
• Best Linear Unbiased Estimator (BLUE)
• Zeroing the small singular values wj

Synthetic Multi Model Ensemble

Synthetic Multi Model Ensemble The MME prediction skill during the forecast phase could be degraded if the training was executed with either a poorer analysis or poorer forecasts. This means that the prediction skills are improved when higher quality training data sets are deployed for the evaluation of the multi model bias statistics.

Synthetic Multi Model Ensemble E(2) – Minimization Synthetic Ensemble Prediction Actual Data Set Synthetic Data Set Superensemble Prediction Schematic chart for the synthetic superensemble prediction system. The synthetic data are generated from the FSU coupled multi-model outputs by minimizing the residual error variance E(2). W.T. Yun, 2004, Tellus accepted

ε² Minimization
The residual error variance E(ε²) is minimized.

Generating Synthetic Data Set
[Figure: schematic chart of the multi model synthetic MME prediction. Labels: N Actual Data Sets, Observed Analysis, E(ε²) Minimization, N Synthetic Data Sets, Prediction.]
Estimating a consistent pattern: which spatial pattern in the forecast data, Fi(x,T), evolves according to the PC time series O(t) of the observation data, O(x,t)? The synthetic data set is generated from the actual data set.
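One way to picture the generation step is an EOF/PC regression: decompose the observed and forecast anomalies into principal components, regress the observed PC time series on the forecast PC time series by least squares (which minimizes the residual error variance), and rebuild a field from the regressed PCs and the observed spatial patterns. The sketch below follows that reading. It is an assumption about the procedure, not the published algorithm, and the function name, the array layout (time by grid point) and `n_modes` are hypothetical.

```python
import numpy as np

def synthetic_anomalies(F, O, n_modes=3):
    """Synthetic forecast anomalies for one model (assumed EOF/PC regression).

    F: forecast data (time, grid points); O: observed analysis (time, grid points).
    Returns an array with the same shape as O.
    """
    F_anom = F - F.mean(axis=0)
    O_anom = O - O.mean(axis=0)
    # EOF analysis through the SVD: rows of Vt are spatial patterns,
    # U * s gives the PC time series.
    Uo, so, Vto = np.linalg.svd(O_anom, full_matrices=False)
    Uf, sf, _ = np.linalg.svd(F_anom, full_matrices=False)
    pcs_obs = Uo[:, :n_modes] * so[:n_modes]
    pcs_fc = Uf[:, :n_modes] * sf[:n_modes]
    # Least-squares regression of the observed PCs on the forecast PCs;
    # this minimizes the variance of the residual.
    alpha = np.linalg.lstsq(pcs_fc, pcs_obs, rcond=None)[0]
    pcs_reg = pcs_fc @ alpha
    # Rebuild the field from regressed PCs and observed spatial patterns.
    return pcs_reg @ Vto[:n_modes]
```

In this sketch the regression and the reconstruction use the same period. A cross-validated setup would fit the regression coefficients on the training years only and apply them to the withheld forecast year.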

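Synthetic MME Prediction
[Figure: schematic of the training phase and the forecast phase, separated at t=0. Labels: N Synthetic Data Sets, Observed Analysis, Synthetic MME Forecast.]
The generated synthetic data set is separated into training and forecast phases. During the training phase, optimal weights are computed at each grid point by minimizing the function
$$G = \sum_{t=0}^{\mathrm{train}} \left(S_t - O_t\right)^2$$
where $S_t$ is the MME forecast and $O_t$ the observed analysis at training time $t$. These weights are then used to produce the synthetic MME forecast.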
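The split into a training phase and a forecast phase can be sketched as follows: weights are fitted by least squares at each grid point from the training data, then applied to a new set of member forecasts. The function name `mme_forecast` and the array layout (time, model, grid point) are assumptions for this example, and `np.linalg.lstsq` stands in for whichever solver (Gauss-Jordan or SVD) is actually used.

```python
import numpy as np

def mme_forecast(F_train, O_train, F_new):
    """Pointwise weighted multi-model forecast.

    F_train: (time, models, grid points) member forecasts in the training phase.
    O_train: (time, grid points) observed analysis in the training phase.
    F_new:   (models, grid points) member forecasts for the forecast phase.
    Returns (forecast, weights) with shapes (grid points,) and (models, grid points).
    """
    F_mean = F_train.mean(axis=0)     # training-period mean of each model
    O_mean = O_train.mean(axis=0)     # training-period observed mean
    F_anom = F_train - F_mean
    O_anom = O_train - O_mean
    weights = np.empty_like(F_mean)
    for g in range(O_train.shape[1]):
        # Least-squares weights at grid point g: minimizes the summed squared
        # difference between the weighted combination and the observation.
        weights[:, g] = np.linalg.lstsq(F_anom[:, :, g], O_anom[:, g], rcond=None)[0]
    # Forecast phase: observed mean plus weighted model anomalies.
    forecast = O_mean + np.sum(weights * (F_new - F_mean), axis=0)
    return forecast, weights
```

The same routine applies whether the member forecasts are the actual model outputs or the synthetic data set; only the input arrays change.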

FSU Unified Model Data Set
• Atmospheric Global Spectral Model (T63L14) + Hamburg Ocean Model HOPE
Starting from 31 December 1986 (to Dec. 2002), three-month forecasts were made every 15 days with four different versions of the coupled model. The multimodels are constructed using two cumulus parameterization schemes (a modified Kuo scheme following Krishnamurti and Bedi, 1988; and an Arakawa-Schubert type scheme following Grell, 1993) and two radiation parameterization schemes (an emissivity-absorptivity based radiative transfer algorithm following Chang, 1979; and a band model for radiative transfer following Lacis and Hansen, 1974), in the atmospheric model only.
• KOR – Kuo type convection with Chang radiation computations
• KNR – Kuo type convection with Lacis and Hansen radiation computations
• AOR – Arakawa-Schubert type convection with Chang radiation computations
• ANR – Arakawa-Schubert type convection with Lacis and Hansen radiation computations

DEMETER Model Data Set
• The DEMETER (Development of a European Multi-Model Ensemble System for Seasonal to Inter-Annual Prediction) system comprises 7 global coupled ocean-atmosphere models.
• Participating centres: CERFACS (European Centre for Research and Advanced Training in Scientific Computation, France), ECMWF (European Centre for Medium-Range Weather Forecasts, International Organization), INGV (Istituto Nazionale di Geofisica e Vulcanologia, Italy), LODYC (Laboratoire d’Océanographie Dynamique et de Climatologie, France), Météo-France (Centre National de Recherches Météorologiques, Météo-France, France), Met Office (The Met Office, UK), MPI (Max-Planck Institut für Meteorologie, Germany)
• The DEMETER hindcasts were started from 1 February, 1 May, 1 August, and 1 November initial conditions. Each hindcast was integrated for 6 months and comprises an ensemble of 9 members.
• The multi-model synthetic ensemble/superensemble is formed by merging the 15 yr (1987-2001) ensemble hindcasts of the seven models, thus comprising 7 x 9 = 63 ensemble members.

Quality of Data Set Actual data set Synthetic data set

ACC & RMS of the DEMETER Multi Model & Synthetic Data Set (Average over 2-4 months Global Precipitation Forecast, JJA)
[Figure: four panels for the models ECMWF, UKMO, Meteo France, MPI, LODYC, INGV, CERFACS: ACC of actual data set, RMS of actual data set, ACC of synthetic data set, RMS of synthetic data set. x-axis: years 1987-2001 and mean.]

ACC & RMS for FSU Unified Model Data Set & Synthetic Data Set (Average over 1-3 months Global Surface Temperature Forecast, JJA; ANR, AOR, KNR, KOR)
[Figure: four panels: ACC of actual data set, RMS of actual data set, ACC of synthetic data set, RMS of synthetic data set. x-axis: years 1987-2001 and mean.]
A: Arakawa-Schubert cumulus parameterization
K: FSU-modified Kuo cumulus parameterization algorithm
NR: band model radiation code (new radiation scheme)
OR: emissivity-absorptivity radiation code (old radiation scheme)

The Global Distribution of Weights for the DEMETER Multi Model & Synthetic Data Set (Average over 2-4 months Global JJA 2001 v-Wind at 850hPa Forecast)
[Figure: two columns of maps, weights of actual data set and weights of synthetic data set, with one row per model: ECMWF, UKMO, Meteo France, MPI, INGV, LODYC, CERFACS.]

The Synthetic Seasonal Forecasts

FSU Unified Model Synthetic Ensemble/Superensemble Prediction (Precipitation, 30S-30N, JJA 2001)
[Figure: four panels: Obs., EM, SEM, SSF.]

DEMETER Multi Model Synthetic Ensemble/Superensemble Prediction (Precipitation, 5N-40N 150W-50W, JJA 2001)
[Figure: four panels: Obs., EM, SEM, SSF.]

DEMETER Multi Model Synthetic Ensemble/Superensemble Prediction (Surface Temperature, 5N-40N 150W-50W, JJA 2001)
[Figure: four panels: Obs., EM, SEM, SSF.]

DEMETER Multi Model Synthetic Ensemble/Superensemble Prediction (Wind Speed at 850hPa, India 10SN-35N 50E-110E, JJA 2001)
[Figure: four panels: Obs., EM, SEM, SSF.]

The Skill Score of Synthetic Forecasts

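Skill Score Metrics
The Skill Metrics of Forecasts in a Deterministic Sense
Anomaly correlation (AC):
$$AC = \frac{\sum_{g=1}^{G} \left(F'_g - \overline{F'}\right)\left(O'_g - \overline{O'}\right)}{\sqrt{\sum_{g=1}^{G} \left(F'_g - \overline{F'}\right)^2 \, \sum_{g=1}^{G} \left(O'_g - \overline{O'}\right)^2}}$$
Root mean square error (RMSE):
$$RMSE = \sqrt{\frac{1}{G}\sum_{g=1}^{G} \left(F_g - O_g\right)^2}$$
Here $F$ and $O$ are the forecast and the observation, and primes denote anomalies. The AC is a measure of how well the phase of the forecast anomalies corresponds to the observed anomalies. The overbar denotes the mean, the summation can be either in space or in time, depending on whether the spatial or the temporal anomaly correlation is computed, and $G$ is the number of either grid points or time points. The RMSE is a measure of the average magnitude of the forecast error. AC is a good measure of phase error, but because it does not take bias into account, a forecast with large errors can still have a high correlation coefficient. It is therefore also necessary to evaluate the average magnitude of the forecast errors.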
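The point that a forecast with large errors can still correlate well is easy to demonstrate numerically. The sketch below computes both metrics with NumPy; the function names are chosen for this example, and the arrays may hold either grid-point values (spatial AC) or a time series (temporal AC).

```python
import numpy as np

def anomaly_correlation(f_anom, o_anom):
    """Correlation between forecast and observed anomalies (centered)."""
    f = np.asarray(f_anom, dtype=float).ravel()
    o = np.asarray(o_anom, dtype=float).ravel()
    f = f - f.mean()
    o = o - o.mean()
    return float(np.sum(f * o) / np.sqrt(np.sum(f ** 2) * np.sum(o ** 2)))

def rmse(forecast, observed):
    """Root mean square error: average magnitude of the forecast error."""
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((f - o) ** 2)))

# A forecast with a constant bias of 10 units: the phase is perfect,
# so the correlation is 1, yet every value is off by 10.
obs = np.array([1.0, 2.0, 3.0, 4.0])
biased = obs + 10.0
ac_biased = anomaly_correlation(biased, obs)
rmse_biased = rmse(biased, obs)
```

This is why the skill figures in this part of the presentation report both ACC and RMS rather than correlation alone.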

Cross Validated ACC for FSU Unified Model & Synthetic MME
[Figure: two bar-chart panels, JJA-TR and DJF-TR, for the years 1987-2002.]
The summer (JJA) and winter (DJF) precipitation anomaly correlation skill scores for the tropical domain (30S-30N). The bars in the diagram indicate, from left to right, the skill scores of the 4 FSU member models, the bias corrected ensemble mean (EM), the synthetic ensemble mean (SEM), the superensemble (SF), and the synthetic superensemble (SSF).

Cross-validated RMS & ACC for FSU Unified Model & Synthetic Superensemble (30S-30N JJA, Average over 1-3 months Precipitation Forecast)
[Figure: two panels, RMS and ACC. Series: ANR, AOR, KNR, KOR, FSU EM, SEM, SSF. x-axis: years 1987-2001 and mean.]
A: Arakawa-Schubert cumulus parameterization
K: FSU-modified Kuo cumulus parameterization algorithm
NR: band model radiation code (new radiation scheme)
OR: emissivity-absorptivity radiation code (old radiation scheme)

Cross-validated RMS & ACC of the DEMETER Multi Model & Synthetic Superensemble (30°S-30°N JJA, Average over 2-4 months Surface Temperature Forecast)
[Figure: two panels, RMS and ACC. Series: ECMWF, UKMO, Meteo France, MPI, LODYC, INGV, CERFACS, DEMETER EM, SEM, SSF. x-axis: years 1987-2001 and mean.]

Cross-validated RMS & ACC of the DEMETER Multi Model & Synthetic Superensemble (30°S-30°N JJA, Average over 2-4 months Precipitation Forecast)
[Figure: two panels, RMS and ACC. Series: ECMWF, UKMO, Meteo France, MPI, LODYC, INGV, CERFACS, DEMETER EM, SEM, SSF. x-axis: years 1987-2001 and mean.]

Statistics of seasonal precipitation categorical forecasts

Score   Forecast   MAM    JJA    SON    DJF
PODy    EM         0.55   0.55   0.55   0.56
PODy    SEM        0.58   0.58   0.58   0.58
PODy    SF         0.55   0.55   0.55   0.55
PODy    SSF        0.58   0.58   0.58   0.60
PODn    EM         0.60   0.61   0.60   0.61
PODn    SEM        0.61   0.60   0.61   0.61
PODn    SF         0.60   0.60   0.60   0.61
PODn    SSF        0.61   0.59   0.60   0.61
ETS     EM         0.11   0.11   0.10   0.11
ETS     SEM        0.13   0.13   0.13   0.12
ETS     SF         0.10   0.11   0.10   0.10
ETS     SSF        0.13   0.12   0.13   0.13
TSS     EM         0.16   0.16   0.15   0.16
TSS     SEM        0.19   0.18   0.19   0.19
TSS     SF         0.14   0.15   0.15   0.16
TSS     SSF        0.19   0.17   0.19   0.20

Overall average statistics of seasonal precipitation categorical forecast. Statistics are given for March-April-May (MAM), June-July-August (JJA), September-October-November (SON), and December-January-February (DJF). EM, SEM, SF, and SSF indicate unbiased ensemble mean, synthetic ensemble mean, superensemble based on SVD, and synthetic superensemble forecast, respectively.
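The four scores are not defined on the slide. The sketch below uses the standard 2x2 contingency-table definitions (probability of detection for events and for non-events, equitable threat score, true skill statistic), which is an assumption about what was computed. The tabulated TSS values are close to PODy + PODn - 1, as that definition requires, allowing for rounding. The event definition (anomaly above a threshold of 0) and the function name are also assumptions.

```python
import numpy as np

def categorical_scores(forecast, observed, threshold=0.0):
    """PODy, PODn, ETS and TSS from a 2x2 contingency table.

    An "event" is a value above `threshold` (assumed definition).
    Assumes both events and non-events occur in the observations.
    """
    f = np.asarray(forecast) > threshold
    o = np.asarray(observed) > threshold
    hits = int(np.sum(f & o))
    false_alarms = int(np.sum(f & ~o))
    misses = int(np.sum(~f & o))
    correct_neg = int(np.sum(~f & ~o))
    n = hits + false_alarms + misses + correct_neg
    pod_yes = hits / (hits + misses)                  # detected events
    pod_no = correct_neg / (correct_neg + false_alarms)  # detected non-events
    # Hits expected from a random forecast with the same event frequencies.
    hits_random = (hits + false_alarms) * (hits + misses) / n
    ets = (hits - hits_random) / (hits + false_alarms + misses - hits_random)
    tss = pod_yes + pod_no - 1.0
    return {"PODy": pod_yes, "PODn": pod_no, "ETS": ets, "TSS": tss}

# Ten cases: 3 hits, 2 misses, 1 false alarm, 4 correct negatives.
fc = [1, 1, 1, -1, -1, 1, -1, -1, -1, -1]
ob = [1, 1, 1, 1, 1, -1, -1, -1, -1, -1]
scores = categorical_scores(fc, ob)
```

For this small example the scores are PODy = 0.6, PODn = 0.8, ETS = 0.25 and TSS = 0.4.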

Averaged ACC for All Seasons (FSU Unified Model & Synthetic MME)
[Figure: bar charts for the domains and seasons GL-MAM, TR-MAM, NH-MAM, GL-JJA, TR-JJA, NH-JJA, GL-SON, TR-SON, NH-SON, GL-DJF, TR-DJF, NH-DJF.]
16-year (1987-2002) averaged (Fisher Z-transform) AC precipitation skill scores of all seasons (MAM, JJA, SON, DJF) for the global, tropical (30S-30N), and Northern Hemisphere (0-60N) domains. The bars in the diagram indicate the 4 member models, the unbiased ensemble mean (EM), the synthetic ensemble mean (SEM), the superensemble based on SVD (SF), and the synthetic superensemble (SSF) of the FSU model.
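Averaging correlations with the Fisher Z-transform means converting each yearly correlation r to z = arctanh(r), averaging the z values, and converting the mean back with tanh. A minimal sketch, with a function name chosen for this example:

```python
import numpy as np

def fisher_mean_correlation(correlations):
    """Average correlation coefficients through the Fisher Z-transform.

    Values must lie strictly between -1 and 1, since arctanh diverges at +/-1.
    """
    r = np.asarray(correlations, dtype=float)
    z = np.arctanh(r)                 # Fisher Z-transform of each correlation
    return float(np.tanh(z.mean()))   # back-transform of the mean z

# Two yearly scores, 0.0 and 0.8: the arithmetic mean is 0.4,
# while the Fisher-averaged correlation is 0.5.
r_mean = fisher_mean_correlation([0.0, 0.8])
```

The transform gives more weight to high correlations than a plain arithmetic mean does, because the sampling distribution of r is compressed near +/-1.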
