1. Time Series Analysis: Introduction (Uma Kollamparambil)
2. Variables over a long time period usually exhibit a pattern
Regressing one time series variable on another often gives high t-values even though theoretically there may be no logical relationship between the two.
E.g. GDP and the height of a one-year-old child.
This is the problem of spurious or nonsense regression (a typical symptom is a high R² together with a Durbin-Watson d lower than R²)
The challenge in time series analysis is to identify long-run equilibrium relationships between variables to enable forecasting
Conditions for valid forecasting
3. A random/stochastic process is a collection of random variables ordered in time.
Forecasting is possible if we are able to identify the data generating process (stochastic process) of a variable.
Stochastic processes may be stationary or non-stationary
A stationary stochastic process has a constant mean and variance over time, and the covariance between two time periods depends only on the lag between them, not on the actual time at which the covariance is computed
If these conditions are not met, the variable is a non-stationary time series
4. A special type of stationary stochastic process is “Purely random” or “White noise”: with zero mean, constant variance and no serial correlation.
This is the assumed nature of error term in CLRM
Most time series are non-stationary: their mean, variance and autocovariance change over time
OLS requires stationarity, so using OLS on non-stationary variables can lead to spurious results
We first need to establish that the variables are stationary, or convert non-stationary variables into stationary ones, before applying OLS to time series
5. Tests of stationarity Graphical method: plot the series against time and inspect it for trend or changing variance
Autocorrelation function
ACF at lag k = covariance at lag k / variance
Consider lags up to about one-third of the length of the time series
Near-zero ACF at all lags indicates stationarity
Construct a confidence interval around zero from the standard error of the sample ACF (approximately 1/√n for a white-noise series) and the desired confidence level
The Box-Pierce Q statistic tests the joint hypothesis that the autocorrelations at all lags are simultaneously zero
The Ljung-Box statistic is a variant with better small-sample properties
Unit root tests: DF, ADF, PP (a code sketch of these checks follows below)
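A minimal sketch of the correlogram-based checks above, using statsmodels; `y` is an assumed stand-in for the variable being tested (white noise is simulated here purely for illustration).

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import acf
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(0)
y = pd.Series(rng.normal(size=200))      # replace with the series under study

n_lags = len(y) // 3                     # lags up to one-third of the series length
acf_vals = acf(y, nlags=n_lags)          # sample autocorrelation function
band = 1.96 / np.sqrt(len(y))            # approximate 95% band around zero

# Box-Pierce Q and Ljung-Box statistics: joint H0 that all autocorrelations are zero
print(acorr_ljungbox(y, lags=[10, 20], boxpierce=True))
```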
6. Dickey-Fuller test
Regress ΔYt on Yt-1; the 't' value of the coefficient of Yt-1 follows the 'tau' statistic, whose critical values were tabulated by Dickey and Fuller
The actual DF test needs to decide between the 3 RWM specifications. The sign of the coefficient is usually expected to be negative
The DF test assumes ut is uncorrelated; if it is correlated, the "Augmented DF" (ADF) test, which adds lagged ΔY terms, is to be used
Same hypothesis is tested and same critical values may be used
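A sketch of the DF/ADF test via statsmodels' adfuller, assuming `y` is the series from the previous sketch; the regression argument corresponds to the three random walk specifications discussed on the next slide.

```python
from statsmodels.tsa.stattools import adfuller

# regression: 'n' = no constant, 'c' = drift, 'ct' = drift + deterministic trend;
# autolag='AIC' chooses the number of lagged dY terms for the augmented (ADF) version
stat, pvalue, usedlag, nobs, crit, icbest = adfuller(y, regression="c", autolag="AIC")
print(stat, crit)    # compare the tau statistic with the DF critical values
# Failing to reject H0 (unit root) suggests the series is non-stationary
```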
7. Non-stationary stochastic process A common form of non-stationarity is “random walk model” (RWM)
Pure random walk, Yt=Yt-1+ut
Random walk with drift, Yt=b1+Yt-1+ut
Random walk with drift and deterministic trend, Yt=b1+b2t+Yt-1+ut
All three processes are non-stationary but can be converted to stationarity
8. Pure random walk In RWM without drift, value of Y at time 't' is equal to its value in previous time period (t-1) plus a random shock (ut) which is white noise. Yt=Yt-1+ut
Stock prices, exchange rates etc. are thought to follow this process and are therefore impossible to predict
Y1=Y0+u1
Y2=Y1+u2=Y0+u1+u2
Y3=Y2+u3=Y0+u1+u2+u3
RWM without drift is therefore non-stationary
Feature: the RWM is said to have infinite memory because random shocks persist indefinitely
The first-difference operator (ΔYt = Yt − Yt-1 = ut) converts the RWM without drift into a stationary series, as illustrated in the sketch below
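A small simulation (hypothetical data) illustrating slide 8: the variance of a pure random walk grows with time, and first-differencing recovers the white-noise shocks.

```python
import numpy as np

rng = np.random.default_rng(42)
u = rng.normal(size=500)                # white-noise shocks ut
y_rw = np.cumsum(u)                     # Yt = u1 + u2 + ... + ut (random walk, Y0 = 0)

print(y_rw[:250].var(), y_rw[250:].var())   # variance grows with t: non-stationary
dy = np.diff(y_rw)                          # first difference: dYt = ut
print(np.allclose(dy, u[1:]))               # True: differencing yields a stationary series
```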
9. Non-stationarity in the RWM with drift can also be eliminated by taking the first difference. Such a series is integrated of order 1, Y~I(1). If a variable has to be differenced twice to make it stationary, it is integrated of order 2, Y~I(2), and so on.
If the trend is deterministic (Yt=b1+b2t+ut), the series is a "trend stationary process": regress the variable on time, and the residuals from this regression are stationary. The residuals are the detrended time series.
A stationary series is integrated of order 0.
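A sketch of detrending for the trend-stationary case, under the assumption of a deterministic trend; the series is simulated here purely for illustration.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
t = np.arange(200)
y_tsp = 2.0 + 0.5 * t + rng.normal(size=200)    # Yt = b1 + b2*t + ut (trend stationary)

trend_fit = sm.OLS(y_tsp, sm.add_constant(t)).fit()
detrended = trend_fit.resid                     # stationary residuals = detrended series

# For a difference-stationary, I(1), series the appropriate transformation is
# first-differencing instead: np.diff(y)
```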
27. Cointegration and Error Correction Mechanism Cointegration conveys a long-run relationship between the variables
ECM is used to understand the short-run dynamics
If the coefficient of ut-1 is not significant, Y adjusts to X only in the short run; the coefficient of ΔX indicates the short-run impact on Y
The error correction term tells us the speed with which our model returns to equilibrium following an exogenous shock.
It should be negatively signed, indicating a move back towards equilibrium; a positive sign indicates movement away from equilibrium (Explain)
Its absolute value should lie between 0 and 1: 0 suggests no adjustment one time period later, 1 indicates full adjustment within the period (a code sketch follows)
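A hedged sketch of the Engle-Granger two-step procedure behind these ideas, assuming `y` and `x` are I(1) pandas Series with a common index (the names are illustrative); in practice statsmodels' coint() supplies the proper cointegration critical values.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller

# Step 1: long-run (cointegrating) regression; test the residuals for a unit root
long_run = sm.OLS(y, sm.add_constant(x)).fit()
u_hat = long_run.resid
print(adfuller(u_hat)[1])          # stationary residuals suggest cointegration

# Step 2: error correction model on first differences, with the lagged residual
df = pd.DataFrame({"dy": y.diff(), "dx": x.diff(), "ec": u_hat.shift(1)}).dropna()
ecm = sm.OLS(df["dy"], sm.add_constant(df[["dx", "ec"]])).fit()
print(ecm.params)   # 'dx': short-run impact; 'ec': speed of adjustment (expected negative)
```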
28. Time Series Analysis: Forecasting
29. Forecasting using time series ARIMA methodology: univariate/multivariate, single/multiple equation(s)
Vector Auto Regression: Multivariate, multiple equations model
Granger causality
ARCH/GARCH: univariate forecasting of volatility/variance
30. Atheoretic model: past data predicts the future. Univariate single-equation models: AR
MA
ARMA
31. ARIMA(p,d,q) p indicates the no. of autoregressive lags
d indicates the order of integration of variable
q indicates the no. of moving average lags
ARIMA(2,0,1)
ARIMA(2,1,2)
Box-Jenkins methodology helps us decide p and q.
32. Box Jenkins methodology Identification of p, d and q
Correlogram
Partial correlogram
Estimation
Diagnostic checking
Accept model if residuals are stationary, if not try another model
Forecasting
33. Identification using ACF & PACF The ACF plots the correlations between yt and yt-k against the lag k = 1, 2, 3, …: identifies possible MA terms
The PACF plots the coefficients in a regression of yt on yt-1, yt-2, …yt-k against k = 1, 2, 3, …: identifies possible AR terms
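A minimal identification sketch using statsmodels' correlogram plots; the AR(1) series below is simulated purely for illustration and stands in for the (differenced, if needed) series being identified.

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# hypothetical stationary series: an AR(1) simulated for illustration
rng = np.random.default_rng(3)
e = rng.normal(size=300)
y_stationary = np.zeros(300)
for i in range(1, 300):
    y_stationary[i] = 0.6 * y_stationary[i - 1] + e[i]

fig, axes = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(y_stationary, lags=24, ax=axes[0])    # a cut-off here suggests the MA order q
plot_pacf(y_stationary, lags=24, ax=axes[1])   # a cut-off here suggests the AR order p
plt.show()
```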
34. Estimation A large time series data set is required
First ensure the series is stationary; if not, make it stationary by differencing (DS) or detrending (TS)
Estimation can be using OLS in most cases
Regress the stationary series on its past lags and on past lags of error term.
Look at the ACF & PACF; try a few likely models
Select one with lowest AIC, SIC
Interpret R2, t and F as in CLRM
Estimate the residuals necessary for diagnostic testing
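A minimal sketch of the estimation and selection steps above, assuming `y` is the original (level) series and that the unit-root tests indicated d = 1; the candidate orders are illustrative.

```python
from statsmodels.tsa.arima.model import ARIMA

candidates = [(1, 1, 0), (0, 1, 1), (1, 1, 1), (2, 1, 1)]   # a few likely (p, d, q) models
fits = {order: ARIMA(y, order=order).fit() for order in candidates}

best = min(fits, key=lambda o: fits[o].aic)     # lowest AIC (compare BIC/SIC as well)
print(best, fits[best].aic, fits[best].bic)

resid = fits[best].resid                        # kept for diagnostic checking
```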
35. Diagnostic checking and Forecasting The diagnostic check is whether the residuals are stationary
Use ACF, PACF
Box-Pierce Q and Ljung-Box (LB) statistics
DF and ADF
If no autocorrelation found then, the current model may be used for forecasting.
Forecasting requires undoing of the “differences” taken to obtain stationarity
36. Forecasting To obtain forecast of level, rather than the first difference, we need to integrate the first differenced series
E.g. a forecasting model based on first-differenced quarterly GDP data up to 1991.IV; forecast for 1992.I
How will you integrate your model to forecast for 1992.II?
Using the same model above predict 1992.II
(Data in page 794)
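A short sketch of integrating first-difference forecasts back to levels, continuing from the ARIMA sketch above (the model was fitted on the level series with d = 1, so forecast() already returns levels).

```python
import numpy as np

level_fc = fits[best].forecast(steps=2)       # e.g. level forecasts for 1992.I and 1992.II

# Equivalent "by hand": cumulate the implied first-difference forecasts onto the
# last observed level, Y(t) = Y(t-1) + dY(t)
dy_fc = np.diff(np.asarray(level_fc), prepend=y.iloc[-1])
by_hand = y.iloc[-1] + np.cumsum(dy_fc)       # equals level_fc
```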
37. Vector Autoregression (VAR) VAR is similar to simultaneous equation model.
It is a system of equations (vector)
However VAR has no exogenous/ predetermined variables
In a simultaneous equation model, an equation with no predetermined variables would be "unidentified"
VAR equation includes lagged values of all the variables in the system
Each equation is estimated independently and OLS may be used
38. 3 variable VAR model The most critical part is deciding the number of lags
Too many lags lead to loss of degrees of freedom and multicollinearity
Too few lead to specification errors
AIC/SIC criterion may be used
Since OLS is used interpretation same as CLRM
However, because of the possibility of multicollinearity, individual t-tests should be relied on less; rather, consider the F test for the joint significance of each variable's lags
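A sketch of lag selection and estimation for a small VAR with statsmodels, assuming `data` is a DataFrame of stationary series (e.g. columns named Y1, X and R, the hypothetical variables used on the next slide).

```python
from statsmodels.tsa.api import VAR

model = VAR(data)                           # data: DataFrame of stationary variables
order = model.select_order(maxlags=8)       # AIC/SIC/HQ lag-length criteria
print(order.summary())

results = model.fit(order.aic)              # fit with the AIC-preferred lag length
print(results.summary())

# Joint (F-type) significance of one variable's lags in another's equation:
# results.test_causality('Y1', ['X'], kind='f')
```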
39. VAR:Impulse response function & ECM IRF traces out the response of the dependent variable in the VAR system to shocks in the error terms, such as u1, u2 and u3.
The logic: suppose u1 in the Y1 equation increases by one standard deviation. Such a shock will change Y1 in current and future periods; since lagged Y1 enters the X and R equations, the change in u1 affects X and R too.
IRF traces out this impact of such shocks for several periods in the future.
Impulse response functions are responses of all variables in the model to a 1 unit structural shock to 1 variable in the model.
VECM estimates the short run impact just as in standard ECM models.
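Continuing the VAR sketch above, the impulse responses described here can be traced from the fitted results object.

```python
irf = results.irf(10)              # responses traced 10 periods into the future
irf.plot(orth=True)                # orthogonalised IRFs: 1 s.d. shock to each error term
irf.plot_cum_effects(orth=True)    # cumulative (long-run) responses
```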
40. VAR Limitations VAR model is a-theoretic.
Less suited for policy analysis since the emphasis is on forecasting.
Problem of selecting the lag length.
In theory the variables should be stationary, but many researchers use levels for ease of interpretation.
Quite often look at impulse response function (IRF).
41. Measuring and Forecasting volatility Finance: Risk measurement
Auto Regressive Conditional Heteroscedasticity (ARCH): the variance is not constant; today's variance is influenced by past squared shocks.
GARCH: Generalised Auto Regressive Conditional Heteroscedasticity
Two ways of estimation of ARCH(p):
Time series univariate approach: AR of variance
ARCH(1)
Fundamentals-based theoretical approach: uses time series data but in a k-variable linear regression model
42. ARCH: k variable approach
If coefficients are jointly significant, we have the ARCH effect. There is clustering of volatility.
GARCH(p,q): a variation of the ARCH model. The conditional variance of u at time t depends not only on the squared error term in the previous time period (as in ARCH), but also on its conditional variance in the previous period.
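A hedged sketch of ARCH/GARCH estimation using the third-party `arch` package, assuming `returns` is a pandas Series of (percentage) returns whose volatility is being modelled.

```python
from arch import arch_model

am = arch_model(returns, vol="GARCH", p=1, q=1)   # GARCH(1,1); set q=0 for a pure ARCH(p)
res = am.fit(disp="off")
print(res.summary())      # jointly significant ARCH/GARCH terms indicate volatility clustering

fcast = res.forecast(horizon=5)
print(fcast.variance.iloc[-1])    # forecast of the conditional variance
```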