Nonlinear time series for modelling river flows and their extremes Péter Elek and László Márkus Dept. Probability Theory and Statistics Eötvös Loránd University Budapest, Hungary
River Tisza and its aquifer
Water discharge at Vásárosnamény (we have 5 more monitoring sites), 1901–2000
Indicators of long memory • Nonparametric statistics • Rescaled adjusted range (R/S) • Classical R/S statistic • Lo's modified R/S test • Taqqu's graphical (robust) method • Variance plot • Log-periodogram regression (Geweke–Porter-Hudak) • A minimal R/S sketch follows below
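A minimal sketch of the classical R/S approach, assuming a NumPy array of (deseasonalised) discharges; this is illustrative only and not the exact procedure used in the talk:

```python
import numpy as np

def rescaled_range(x):
    """Classical R/S statistic for one block of data."""
    x = np.asarray(x, dtype=float)
    y = np.cumsum(x - x.mean())          # cumulative deviations from the block mean
    r = y.max() - y.min()                # adjusted range
    s = x.std(ddof=1)                    # sample standard deviation
    return r / s

def hurst_rs(x, min_block=16):
    """Estimate H from the slope of log E[R/S] against log block size."""
    n = len(x)
    sizes, rs_means = [], []
    size = min_block
    while size <= n // 2:
        blocks = [x[i:i + size] for i in range(0, n - size + 1, size)]
        rs_means.append(np.mean([rescaled_range(b) for b in blocks]))
        sizes.append(size)
        size *= 2                        # double the block length each step
    slope, _ = np.polyfit(np.log(sizes), np.log(rs_means), 1)
    return slope                         # R/S grows like n^H, so the slope estimates H
```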
Linear long-memory model: fractional ARIMA process (Montanari et al., Lago Maggiore, 1997) • Fractional ARIMA(p, d, q) model: \(\Phi(B)(1-B)^d X_t = \Theta(B)\varepsilon_t\), with \(d = H - 1/2\) • Fitting is done by the Whittle estimator: • based on matching the empirical periodogram to the theoretical spectral density • quite robust: consistent and asymptotically normal for linear processes driven by innovations with finite fourth moments (Giraitis and Surgailis, 1990)
Results of the fractional ARIMA fit • H = 0.846 (standard error: 0.014) • p-value: 0.558 (indicates a good fit) • Innovations can be reconstructed using a linear filter (the inverse of the filter above); a sketch of the fractional part of this filter follows below
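A sketch of the inverse-filter idea, restricted to the fractional part: the truncated binomial weights of (1 − B)^d are applied to the (deseasonalised) series. The short-memory ARMA part would still have to be filtered out before the result can be treated as innovations; the function names are illustrative.

```python
import numpy as np

def frac_diff_weights(d, n):
    """Truncated binomial weights of (1 - B)^d: w_0 = 1, w_k = w_{k-1} * (k - 1 - d) / k."""
    w = np.empty(n)
    w[0] = 1.0
    for k in range(1, n):
        w[k] = w[k - 1] * (k - 1 - d) / k
    return w

def fractional_difference(x, d, n_weights=1000):
    """Apply (1 - B)^d to a series by truncated causal convolution."""
    x = np.asarray(x, dtype=float)
    w = frac_diff_weights(d, min(n_weights, len(x)))
    return np.convolve(x, w, mode='full')[:len(x)]

# With the reported H = 0.846 the differencing order would be d = H - 0.5 = 0.346.
```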
Reconstructed innovations are uncorrelated... But not independent
Simulations using i.i.d. innovations • If we assume that the innovations are i.i.d., we can generate synthetic series (a sketch follows below): • Use resampling to generate synthetic innovations • Then apply the linear filter • Add the seasonal components to get a synthetic streamflow series • But: these series do not approximate the high quantiles of the original series well
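A minimal sketch of this resampling loop, assuming hypothetical helpers `farima_filter`, `seasonal_mean` and `seasonal_sd` (day-of-year indexed arrays) produced by the fitting step; the seasonal treatment shown here is an assumption, not the authors' exact scheme.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_streamflow(innovations, farima_filter, seasonal_mean, seasonal_sd, n):
    """Bootstrap the innovations, pass them through the fitted linear filter
    and restore the seasonal components."""
    # i.i.d. resampling of the reconstructed innovations
    eps_star = rng.choice(np.asarray(innovations, dtype=float), size=n, replace=True)
    # apply the fitted FARIMA filter (hypothetical helper returning the deseasonalised series)
    z_star = farima_filter(eps_star)
    # add back seasonal standard deviation and mean (arrays of length 365, assumed)
    doy = np.arange(n) % 365
    return z_star * seasonal_sd[doy] + seasonal_mean[doy]
```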
But: they fail to reproduce the densities and underestimate the high quantiles of the original series
Logarithmic linear model • It is quite common to take the logarithm in order to get closer to the normal distribution • This is indeed the case here as well • Even the simulated quantiles from a fitted linear model seem to be "almost" acceptable
Let's have a closer look at the innovations • Innovations can be regarded as shocks to the linear system • A few properties: • Squared and absolute values are autocorrelated • Skewed and peaked marginal distribution • There are periods of high and low variance • All of these point to a GARCH-type model • But the classical GARCH is far too heavy-tailed for our purposes
Simulation from the GARCH process • Generate an i.i.d. series from the estimated GARCH residuals • Then simulate the GARCH(1,1) process using these residuals (a sketch of this step follows below) • Apply the linear filter and add the seasonalities • The simulated series are much heavier-tailed than the original series
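A sketch of the GARCH(1,1) simulation step, assuming the fitted parameters omega, alpha, beta and the estimated standardised residuals are available; the linear filter and seasonalities would then be applied as in the i.i.d. case.

```python
import numpy as np

def simulate_garch11(omega, alpha, beta, residuals, rng=np.random.default_rng(0)):
    """Simulate a GARCH(1,1) innovation series, drawing the standardised
    shocks Z_t by resampling the estimated GARCH residuals.
    Assumes alpha + beta < 1 so the unconditional variance exists."""
    residuals = np.asarray(residuals, dtype=float)
    n = len(residuals)
    z = rng.choice(residuals, size=n, replace=True)
    eps = np.empty(n)
    sigma2 = np.empty(n)
    sigma2[0] = omega / (1.0 - alpha - beta)   # start at the unconditional variance
    eps[0] = np.sqrt(sigma2[0]) * z[0]
    for t in range(1, n):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
        eps[t] = np.sqrt(sigma2[t]) * z[t]
    return eps
```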
General form of GARCH models • We need a model "between" • the i.i.d.-driven FARIMA series and • the GARCH(1,1)-driven FARIMA series • General form of GARCH models:
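The formula itself did not survive the slide conversion; the standard GARCH(p, q) specification it presumably displayed reads

\[
\varepsilon_t = \sigma_t Z_t, \qquad Z_t \sim \text{i.i.d.}(0,1),
\qquad
\sigma_t^2 = \omega + \sum_{i=1}^{p} \alpha_i \,\varepsilon_{t-i}^2
           + \sum_{j=1}^{q} \beta_j \,\sigma_{t-j}^2 .
\]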
Back to the original GARCH philosophy • The GARCH model described above is somewhat artificial, and it is hard to find a heuristic explanation for it: • why should the conditional variance depend on the innovations of the linear filter? • in the original GARCH context the variance depends on the lagged values of the process itself. • A possible solution: condition the variance on the lagged discharge process instead! • The fractional integration does not seem to be necessary • it yields almost the same innovations as an ARMA(3,1) model • in extreme value theory, long memory in linear models does not make a difference
Estimated variance of the innovations plotted against the lagged discharge • Spectacularly linear relationship • This supports the new modelling approach • Distorted at sites with damming along the Tisza River
The variance is conditional not on the lagged innovation but on the lagged water discharge.
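Assuming the linear variance–discharge relationship seen on the previous slide, one plausible way to write the resulting model is the following; the exact parametrisation on the slides is not recoverable, so the symbols below are illustrative:

\[
X_t - c_t = \sum_{i=1}^{p} \phi_i\,(X_{t-i} - c_{t-i}) + \varepsilon_t + \sum_{j=1}^{q} \theta_j\,\varepsilon_{t-j},
\qquad
\varepsilon_t = \sigma_t Z_t,
\qquad
\sigma_t^2 = a_0 + a_1 X_{t-1},
\]

with \(Z_t\) i.i.d., \(a_0, a_1 \ge 0\), and \(c_t\) a (possibly seasonal) location term.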
Theoretical problems arise in the new model • Existence of a stationary solution • Finiteness of all moments • Consistency and asymptotic normality of the quasi maximum likelihood estimators • A heuristically clearer explanation can be given • The discharge is indicative of the saturation of the watershed • In a saturated watershed, precipitation reaches the river more directly, hence the water supply increases. • A saturated watershed also releases water more quickly. • The possible changes are greater, and so is the uncertainty of the next discharge value.
Existence and moments of the stationary solution • We assume that c_t = constant • The model has a unique stationary solution if the corresponding ARMA model is stationary • i.e. all roots of the characteristic equation lie within the unit circle • Moreover, if the m-th moment of Z_t is finite, then the same holds for the stationary distribution of X_t, too. • This is in contrast to the usual, quadratic ARCH-type innovations, where the condition for stationarity is more complicated and not all moments of the stationary distribution are finite.
Sketch of the proof I. • The process can be embedded into a (p+q)-dimensional Markov chain: Y_t = A Y_{t-1} + E_t • where Y_t = (X_t − c, X_{t-1} − c, ..., X_{t-p+1} − c, ε_t, ε_{t-1}, ..., ε_{t-q+1}) and E_t = (ε_t, 0, ...). • Y_t is aperiodic and irreducible (under some technical conditions). • General condition for geometric ergodicity, and hence for the existence of a unique stationary distribution (Meyn and Tweedie, 1993): there exist a function V ≥ 1, constants 0 < δ < 1 and b < ∞, and a compact set C such that E(V(Y_1) | Y_0 = y) ≤ (1 − δ) V(y) + b 1_C(y). • In other words: V is bounded on a compact set and is a contraction outside it. • Moreover, E_π(V(Y_t)) is finite (π is the stationary distribution).
Sketch of the proof II. • In the given case: • if E(|Z_t|^m) is finite, • V(y) = 1 + ||QPy||_m^m will suffice, • where: • B = PAP^{-1} is the real-valued block Jordan decomposition of A • and Q is an appropriately chosen diagonal matrix with positive elements. • This also implies the finiteness of the m-th moment of X_t.
Estimation • Estimation of the ARMA filter can be carried out by least squares. • Essentially only the uncorrelatedness of the innovations is needed for consistency. • An additional moment condition is needed for asymptotic normality (e.g. Francq and Zakoian, 1998). • The ARCH equation is estimated by quasi maximum likelihood (assuming that Z_t is Gaussian), using the ε_t innovations calculated from the ARMA filter. • The QML estimator of the ARCH parameters is consistent and asymptotically normal under mild conditions (Z_t does not need to be Gaussian).
Estimation of the ARCH equation in the case of known ε_t innovations (along the lines of Kristensen and Rahbek, 2005) • Maximising the Gaussian log-likelihood (see below), we obtain the QML estimator of α. • For simplicity we assume that the conditional variance is bounded away from zero (σ_t^2 ≥ σ_min^2 > 0) over the whole parameter space.
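The maximised criterion was an image on the original slide; the standard Gaussian quasi log-likelihood it presumably displayed is

\[
L_n(\alpha) = -\frac{1}{n} \sum_{t=1}^{n}
      \left( \log \sigma_t^2(\alpha) + \frac{\varepsilon_t^2}{\sigma_t^2(\alpha)} \right).
\]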
Consistency of the estimator • By the ergodic theorem, the normalised log-likelihood converges almost surely to a deterministic limit L(α). • It is easy to check that L(α) < L(α_0) for all α ≠ α_0, where α_0 denotes the true parameter value. • All other conditions of the usual consistency result for QML (e.g. Pfanzagl, 1969) are satisfied, hence the estimator is consistent.
Asymptotic normality I. • Notation: the score is the derivative of the log-likelihood, V denotes the information matrix (the covariance matrix of the score terms) and H the expected Hessian.
Asymptotic normality II. • A standard Taylor expansion, finiteness of the fourth moment and a martingale central limit theorem yield asymptotic normality of the estimator (see the statement below). • Moreover, the asymptotic covariance matrix can be consistently estimated by the empirical counterparts of H and V.
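The limiting statement itself was an image on the slide; under the stated conditions it presumably takes the usual sandwich form

\[
\sqrt{n}\,\bigl(\hat{\alpha}_n - \alpha_0\bigr) \xrightarrow{\;d\;}
      N\!\left(0,\; H^{-1} V H^{-1}\right),
\]

with V the information matrix and H the expected Hessian introduced above.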
Estimation of the ARCH equation when ε_t is not known • In this case the innovations of the ARMA model are calculated using the estimated ARMA parameters. • If the ARMA parameter vector is estimated consistently, the mean difference of the squared innovations tends to zero. • If the ARMA parameter estimate is asymptotically normal, a stronger statement holds.
Consistency in the case of estimated innovations • Now the analogous criterion, with the estimated innovations in place of ε_t, is maximised. • But the difference between the two criteria tends to zero (uniformly over the parameter space), • which then yields consistency of the estimator of α.
Asymptotic normality in the case of estimated innovations • Under some moment conditions the least squares estimator of the ARMA parameters is asymptotically normal, hence • the differences between the first and second derivatives of the two criteria, respectively, both converge to zero. • As a result, all the arguments for asymptotic normality given above remain valid.
How to simulate the residuals of the new GARCH-type model • The residuals are highly skewed and peaked. • Simulation (a sketch follows below): • Use resampling to simulate from the central quantiles of the distribution • Use the generalized Pareto distribution to simulate from the upper and lower quantiles • Use periodic monthly densities
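A simplified sketch of the semi-parametric residual simulation (upper tail only, without the monthly densities), assuming the GARCH-type residuals are available as a NumPy array; the threshold choice and the helper name are illustrative.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(1)

def simulate_residuals(resid, n, tail_prob=0.05):
    """Resample the central part of the empirical distribution and draw the
    upper tail from a generalized Pareto distribution fitted to the exceedances."""
    resid = np.asarray(resid, dtype=float)
    u = np.quantile(resid, 1.0 - tail_prob)          # upper threshold
    exceed = resid[resid > u] - u
    c, _, scale = genpareto.fit(exceed, floc=0.0)    # fit GPD to the exceedances
    # resample the body of the distribution
    out = rng.choice(resid[resid <= u], size=n, replace=True)
    # replace roughly a tail_prob fraction of the draws with GPD exceedances above u
    tail_idx = rng.random(n) < tail_prob
    out[tail_idx] = u + genpareto.rvs(c, scale=scale, size=tail_idx.sum(),
                                      random_state=rng)
    return out
```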
The simulation process: resampling and GPD → Z_t → smoothed GARCH → ε_t → FARIMA filter → X_t → seasonal filter
Evaluating the model fit • Independence of the residual series (ACF, extremal clustering) • Fit of the probability density and high quantiles • Variance vs. lagged discharge relationship • Extremal index • Level exceedance times • Flood volume distribution
ACF of original and squared innovation series – residual series
Reestimated (from the fitted model) discharge-variance relationship
Seasonalities of extremes • The seasonal appearance of the highest values (upper 1%) of the simulated processes closely follows that of the observed series.
Extremal index to measure the clustering of high values • Estimated for the observed and simulated processes containing all seasonal components • Estimation by the blocks method (a sketch follows below): • The block length ranges from 0.1% to 1% of the series length. • The threshold ranges from the 95% to the 99.9% quantile.
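A minimal sketch of a blocks estimator of the extremal index, using the simple ratio version (number of blocks with an exceedance over the total number of exceedances); the exact estimator used in the talk is not specified here, so this is illustrative.

```python
import numpy as np

def extremal_index_blocks(x, threshold, block_length):
    """Simple blocks estimator of the extremal index:
    (number of blocks containing at least one exceedance) / (total exceedances)."""
    x = np.asarray(x, dtype=float)
    n_blocks = len(x) // block_length
    blocks = x[: n_blocks * block_length].reshape(n_blocks, block_length)
    exceed = blocks > threshold
    n_exceed = exceed.sum()
    if n_exceed == 0:
        return np.nan                        # no exceedances: estimator undefined
    k_blocks = exceed.any(axis=1).sum()      # blocks with at least one exceedance
    return k_blocks / n_exceed

# e.g. threshold at the empirical 99% quantile and block length of about 0.5% of
# the series length, in the spirit of the ranges quoted on the slide.
```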