Stationarity Issues in Time Series Modeling
David A. Dickey
North Carolina State University
“Stationarity”: what is it?
• Example: stocks of silver in the NY Commodities Exchange
• Two forecasts:
• Nonstationary in yellow: no mean reversion, unbounded error bands
• Stationary in green: reverts to the mean, bounded error bands
“Stationarity”: what is it?
• Constant mean $\mu$
• Covariance between $Y_t$ and $Y_{t+h}$ is a function of $h$ only: $\gamma(h)$
• [Autocorrelation $\rho(h) = \gamma(h)/\gamma(0)$]
One Lag Model
• $Y_t - \mu = \rho (Y_{t-1} - \mu) + e_t$
• “Shocks” $e_t \sim N(0, \sigma^2)$
• Stationary: $|\rho| < 1$
• $Y_t = \mu(1-\rho) + \rho Y_{t-1} + e_t$
• Regress $Y_t$ on 1, $Y_{t-1}$
• Estimators are approximately normally distributed in large samples
• Use a t test for $H_0: \rho = 0$
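The link between the stationary one-lag model and the bounded forecast error bands can be made explicit. The block below is a sketch of standard AR(1) results rather than a transcription of a slide; it assumes $|\rho| < 1$, independent shocks with variance $\sigma^2$, and writes $\rho(h)$ for the autocorrelation function and $\rho$ without an argument for the lag coefficient.

```latex
\begin{align*}
\gamma(0) &= \frac{\sigma^2}{1-\rho^2}, \qquad
\gamma(h) = \rho^{|h|}\,\gamma(0), \qquad
\rho(h) = \rho^{|h|}, \\
E[Y_{t+h} \mid Y_t] &= \mu + \rho^{h}\,(Y_t - \mu), \\
\operatorname{Var}\!\left(Y_{t+h} - E[Y_{t+h} \mid Y_t]\right)
  &= \sigma^2 \sum_{j=0}^{h-1} \rho^{2j}
   = \sigma^2\,\frac{1-\rho^{2h}}{1-\rho^2}.
\end{align*}
```

As $h$ grows, the forecast returns to $\mu$ and the error variance levels off at $\gamma(0)$. Setting $\rho = 1$ in the sum instead gives $h\sigma^2$, which grows without bound.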
One Lag Model with $\rho = 1$
• $Y_t - \mu = 1(Y_{t-1} - \mu) + e_t$
• “Shocks” $e_t \sim N(0, \sigma^2)$
• $Y_t = Y_{t-1} + e_t$
• Best forecast of $Y_t$ is $Y_{t-1}$
• Nonstationary: $\rho = 1$
• Regress $Y_t$ on 1, $Y_{t-1}$
• Estimators are NOT normally distributed, even in large samples
• CANNOT use t tables to test $H_0: \rho = 1$
• The t test statistic does NOT have a t distribution!!!
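To see the two cases side by side, one could simulate a series from each model. The DATA step below is a minimal sketch, not code from the presentation: the data set name SIM, the variable names, the seed, the mean of 100, the coefficient 0.9 and the length 250 are all assumptions chosen for illustration.

```sas
/* Hypothetical simulation: all names and parameter values are illustrative. */
data sim;
   call streaminit(2718);          /* fix the seed so the run is repeatable  */
   mu  = 100;                      /* assumed mean                           */
   rho = 0.9;                      /* assumed coefficient with |rho| < 1     */
   y = mu;                         /* random walk starts at mu               */
   z = mu;                         /* stationary series starts at its mean   */
   do t = 1 to 250;
      e1 = rand('normal');         /* shocks e_t ~ N(0,1)                    */
      e2 = rand('normal');
      y = y + e1;                  /* rho = 1: Y_t = Y_{t-1} + e_t           */
      z = mu + rho*(z - mu) + e2;  /* Z_t - mu = rho (Z_{t-1} - mu) + e_t    */
      output;
   end;
   keep t y z;
run;
```

Plotting Y and Z against T from such a run typically shows Z wandering around 100 while Y drifts with no level to return to.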
Hypothesis Test
• Model: $Y_t - \mu = \rho (Y_{t-1} - \mu) + e_t$
• Test:
• $H_0: \rho = 1$ “Nonstationary, Unit Root”
• $H_1: |\rho| < 1$ “Stationary (mean reverting)”
• Compare the calculated t statistic to a new (nonstandard) distribution
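Two Tests
• Model: $Y_t - \mu = \rho (Y_{t-1} - \mu) + e_t$
• $Y_t - \mu - (Y_{t-1} - \mu) = (\rho - 1)(Y_{t-1} - \mu) + e_t$
• $Y_t - Y_{t-1} = \mu(1-\rho) + (\rho - 1) Y_{t-1} + e_t$
• Regress $Y_t - Y_{t-1}$ on 1, $Y_{t-1}$
• Tests:
• n(coefficient of $Y_{t-1}$): “Rho”
• calculated t statistic: “Tau”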
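The regression behind both statistics can be sketched in a few lines of SAS. The data set SIM and the names YLAG and DY are hypothetical; any data set holding a series Y in time order would do.

```sas
/* Sketch: assumes a data set SIM with a series Y in time order. */
data dfreg;
   set sim;
   ylag = lag(y);        /* Y_{t-1}          */
   dy   = dif(y);        /* Y_t - Y_{t-1}    */
run;

proc reg data=dfreg;
   /* The t value on YLAG is "Tau"; n times the YLAG estimate is "Rho". */
   /* The Pr > |t| that PROC REG prints for YLAG comes from t tables    */
   /* and should not be used for the unit root test.                    */
   model dy = ylag;      /* intercept is included by default            */
run;
quit;
```

The first observation drops out because its lag is missing, so n here is one less than the series length.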
Some math Above diagonal ->
More math
• $W(t)$ is a Wiener process on [0,1]
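The equations from these math slides did not survive, so the following is offered only as the standard textbook form of the limits, not as a reconstruction of what was shown. It assumes the simplest case: $Y_t = Y_{t-1} + e_t$ with $Y_0 = 0$, independent shocks, and a regression of $Y_t$ on $Y_{t-1}$ with no intercept.

```latex
\begin{align*}
n(\hat\rho - 1) &\;\Rightarrow\;
  \frac{\tfrac{1}{2}\left(W(1)^2 - 1\right)}{\int_0^1 W(t)^2\,dt}, \\[4pt]
\tau = \frac{\hat\rho - 1}{\operatorname{se}(\hat\rho)} &\;\Rightarrow\;
  \frac{\tfrac{1}{2}\left(W(1)^2 - 1\right)}{\left(\int_0^1 W(t)^2\,dt\right)^{1/2}}.
\end{align*}
```

Neither limit is normal, which is why the “Rho” and “Tau” statistics need their own tables. When an intercept or trend is fitted, $W(t)$ is replaced by its demeaned or detrended version, giving the other pairs of distributions.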
Two Series
SAS software: PROC ARIMA
proc gplot;
   plot (Y Z)*t / overlay;
run;
proc arima;
   identify var=Y nlag=10 stationarity=(adf);
   identify var=Z nlag=10 stationarity=(adf);
run;
quit;
Symptoms of Nonstationarity
• ACF dies down slowly
• ACF is Corr($Y_t$, $Y_{t-j}$), plotted against $j$
• Nonconstant level when plotted
• We saw the plot; the ACFs are coming up
Y series ACF
The ARIMA Procedure
Name of Variable = Y
Mean of Working Series    110.9728
Standard Deviation        5.286108
Number of Observations         250

Autocorrelations
Lag  Correlation  -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1   Std Error
  0      1.00000  | |********************|   0
  1      0.97219  | . |******************* |   0.063246
  2      0.94506  | . |******************* |   0.107523
  3      0.91741  | . |****************** |   0.136771
  4      0.89025  | . |****************** |   0.159498
  5      0.86479  | . |***************** |   0.178269
  6      0.84145  | . |***************** |   0.194326
  7      0.81771  | . |**************** |   0.208391
  8      0.79836  | . |**************** |   0.220853
  9      0.77912  | . |**************** |   0.232110
 10      0.75671  | . |*************** |   0.242346
Z series ACF
The ARIMA Procedure
Name of Variable = Z
Mean of Working Series    100.5022
Standard Deviation        2.402392
Number of Observations         250

Autocorrelations
Lag  Correlation  -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
  0      1.00000  | |********************|
  1      0.90796  | . |****************** |
  2      0.81755  | . |**************** |
  3      0.72228  | . |************** |
  4      0.63703  | . |************* |
  5      0.56707  | . |*********** |
  6      0.51964  | . |********** |
  7      0.47865  | . |********** |
  8      0.46026  | . |********* |
  9      0.44466  | . |********* |
 10      0.42313  | . |******** |
"." marks two standard errors
Tests on Y
The ARIMA Procedure
Augmented Dickey-Fuller Unit Root Tests
Type          Lags       Rho   Pr < Rho     Tau   Pr < Tau      F   Pr > F
Zero Mean        0    0.1014     0.7059    0.71     0.8675
                 1    0.0880     0.7027    0.59     0.8422
                 2    0.0719     0.6989    0.45     0.8101
Single Mean      0   -6.8507     0.2817   -2.30     0.1724   2.99   0.3095
                 1   -6.8539     0.2815   -2.16     0.2211   2.57   0.4147
                 2   -7.1478     0.2624   -2.07     0.2564   2.29   0.4861
Trend            0   -7.3468     0.6313   -2.46     0.3502   3.64   0.4500
                 1   -7.3273     0.6328   -2.30     0.4295   3.07   0.5636
                 2   -7.5909     0.6114   -2.19     0.4905   2.65   0.6489
Conclusion: nonstationary
Tests on Z
The ARIMA Procedure
Augmented Dickey-Fuller Unit Root Tests
Type          Lags       Rho   Pr < Rho     Tau   Pr < Tau      F   Pr > F
Zero Mean        0   -0.0087     0.6803   -0.05     0.6647
                 1   -0.0237     0.6769   -0.15     0.632
                 2   -0.0393     0.6733   -0.24     0.5997
Single Mean      0  -22.8511     0.0051   -3.45     0.0104   5.96   0.0136
                 1  -24.5443     0.0034   -3.48     0.0095   6.06   0.0114
                 2  -28.8542     0.0015   -3.69     0.0050   6.80   0.0010
Trend            0  -24.6119     0.0236   -3.61     0.0312   6.53   0.0449
                 1  -26.2971     0.0161   -3.60     0.0319   6.48   0.0461
                 2  -30.7682     0.0057   -3.77     0.0196   7.13   0.0283
Conclusion: stationary
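Higher Order Processes
• $Y_t - \mu = \alpha_1 (Y_{t-1} - \mu) + \alpha_2 (Y_{t-2} - \mu) + \alpha_3 (Y_{t-3} - \mu) + e_t$
• $\Delta Y_t = Y_t - Y_{t-1} = -(1 - \alpha_1 - \alpha_2 - \alpha_3)(Y_{t-1} - \mu) - (\alpha_2 + \alpha_3)\,\Delta Y_{t-1} - \alpha_3\,\Delta Y_{t-2} + e_t$
• The first term carries the [coefficient] being tested; the $\Delta Y_{t-1}$ and $\Delta Y_{t-2}$ terms are the augmenting lags
• ADF stands for Augmented Dickey-Fuller
• Testing for no mean reversion: $H_0: (1 - \alpha_1 - \alpha_2 - \alpha_3) = 0$
• Regress $Y_t - Y_{t-1}$ on 1, $Y_{t-1}$, $Y_{t-1} - Y_{t-2}$, $Y_{t-2} - Y_{t-3}$
• Distributions: the coefficient of $Y_{t-1}$ is nonstandard; the coefficients of the lagged differences are N(__, __)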
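The step from the third-order model to the differenced form is a substitution of lagged levels by lagged differences. The following worked version is a sketch, writing $\alpha_1, \alpha_2, \alpha_3$ for the three lag coefficients, $\mu$ for the mean and $\Delta$ for the first difference.

```latex
\begin{align*}
Y_{t-2} - \mu &= (Y_{t-1} - \mu) - \Delta Y_{t-1}, \\
Y_{t-3} - \mu &= (Y_{t-1} - \mu) - \Delta Y_{t-1} - \Delta Y_{t-2}, \\
\Delta Y_t &= (\alpha_1 - 1)(Y_{t-1} - \mu) + \alpha_2 (Y_{t-2} - \mu)
              + \alpha_3 (Y_{t-3} - \mu) + e_t \\
           &= -(1 - \alpha_1 - \alpha_2 - \alpha_3)(Y_{t-1} - \mu)
              - (\alpha_2 + \alpha_3)\,\Delta Y_{t-1}
              - \alpha_3\,\Delta Y_{t-2} + e_t .
\end{align*}
```

The third line subtracts $Y_{t-1} - \mu$ from both sides of the model; the fourth collects the three copies of $Y_{t-1} - \mu$ that the substitutions produce.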
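Higher Order Processes
Q1: How many lags???
Regress $\Delta Y_t$ on 1, $Y_{t-1}$, $\Delta Y_{t-1}$, $\Delta Y_{t-2}$, . . .
The coefficients of the augmenting lags $\Delta Y_{t-1}$, $\Delta Y_{t-2}$, . . . are N(__, __), so . . . just use the usual t tests and p-values!!!
Q2: Why “Unit Root” tests??
$B(Y_t) = Y_{t-1}$
$(1 - \alpha_1 B - \alpha_2 B^2 - \alpha_3 B^3)(Y_t - \mu) = e_t$
A root of $1 - \alpha_1 B - \alpha_2 B^2 - \alpha_3 B^3$ at $B = 1$ means $1 - \alpha_1 (1) - \alpha_2 (1)^2 - \alpha_3 (1)^3 = 0$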
Check Silver Series for Augmenting Lags
PROC REG;
   MODEL DEL = LSILVER DEL1 DEL2 DEL3 DEL4;
   TEST DEL2=0, DEL3=0, DEL4=0;

Source        DF   Mean Square   F Value   Pr > F
Numerator      3    4589.63459      1.31   0.2753
Denominator  133    3515.48242
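The regression variables are used here without showing how they are built. A DATA step along the following lines would create them; it is a sketch that assumes a data set SILVER containing the series SILVER in time order (the names used with PROC ARIMA in this deck), and the output name SILVER2 is hypothetical.

```sas
/* Sketch only: SILVER2 is a hypothetical output data set name. */
data silver2;
   set silver;
   lsilver = lag(silver);     /* lagged level Y_{t-1}             */
   del     = dif(silver);     /* first difference Y_t - Y_{t-1}   */
   del1    = lag1(del);       /* lagged differences, lags 1 to 4  */
   del2    = lag2(del);
   del3    = lag3(del);
   del4    = lag4(del);
run;
```

The LAG and DIF calls sit outside any IF or DO block on purpose: these functions keep a queue that only advances when the call executes, so conditional calls would return values from the wrong observation.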
Unit Root test in PROC REG
PROC REG;
   MODEL DEL = LSILVER DEL1;

Variable    DF   Parameter Estimate   t Value   Pr > |t|
Intercept    1             75.58073      2.76     0.0082
LSILVER      1             -0.11703     -2.78     0.0079
DEL1         1              0.67115      6.21     <.0001
Unit Root test in PROC ARIMA
PROC ARIMA DATA=SILVER;
   I VAR=SILVER STATIONARITY=(ADF=(1));

Augmented Dickey-Fuller Unit Root Tests
Type          Lags     Tau   Pr < Tau
Zero Mean        1   -0.28     0.5800
Single Mean      1   -2.78     0.0689
Trend            1   -2.63     0.2697
Type          Lags     Tau   Pr < Tau
Zero Mean     ?????                      (A)
Single Mean      1   -2.78     0.0689
Trend         ?????                      (B)
(A) Assumes the mean is 0 (or known and subtracted off). Has a different (pair of) distributions!!
(B) Allows for TREND under H1. Has a third (pair of) distributions!!!!
Silver: Need 2nd Difference?
$D_t = \Delta Y_t = Y_t - Y_{t-1}$
Q: Does $D$ (also) have a unit root?
Regress $\Delta D_t$ on $D_{t-1}$ using /NOINT (why?)
No augmenting lags (why?)
I VAR=Y(1) STATIONARITY = . . .

Type          Lags     Tau   Pr < Tau
Zero Mean        0   -3.42     0.0010
Single Mean      0   -3.39     0.0158
Trend            0   -3.62     0.0383
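A test on the differenced silver series could be run either way, as sketched below. The data set and variable SILVER are the names used with PROC ARIMA in this deck; SECOND, D, LAGD and DD are hypothetical, and ADF=(0) is an assumption matching the zero lags in the output.

```sas
/* PROC ARIMA route: silver(1) differences the series before testing. */
proc arima data=silver;
   identify var=silver(1) stationarity=(adf=(0));
run;
quit;

/* Regression route; SECOND, D, LAGD and DD are hypothetical names. */
data second;
   set silver;
   d    = dif(silver);    /* D_t = Y_t - Y_{t-1}    */
   lagd = lag(d);         /* D_{t-1}                */
   dd   = dif(d);         /* D_t - D_{t-1}          */
run;

proc reg data=second;
   model dd = lagd / noint;   /* zero-mean form: no intercept */
run;
quit;
```

One reading of the two “why?” prompts: an intercept in the model for the differences would amount to a trend in the levels, which the levels regression did not include, and since the levels needed only one lagged difference, differencing once more leaves no augmenting term. As with the levels, the p-value PROC REG prints for LAGD is not the one to use.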
Autocorrelations
Lag  Covariance  Correlation  -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
  0     7612550      1.00000  | |********************|
  1     7604217      0.99891  | .|********************|
  2     7595529      0.99776  | .|********************|
  3     7586855      0.99662  | . |********************|
  4     7578152      0.99548  | . |********************|
  5     7569481      0.99434  | . |********************|
  6     7560553      0.99317  | . |********************|
  7     7551925      0.99204  | . |********************|
  8     7543869      0.99098  | . |********************|
  9     7535957      0.98994  | . |********************|
 10     7528240      0.98892  | . |********************|
 11     7519890      0.98783  | . |********************|
 12     7511672      0.98675  | . |********************|
"." marks two standard errors
Output from SAS PROC ARIMA
Augmented Dickey-Fuller Unit Root Tests
Type          Lags       Rho   Pr < Rho
Zero Mean        0    1.3567     0.9565
                 1    1.3481     0.9557
Single Mean      0    0.4065     0.9744
                 1    0.3500     0.9725
Trend            0   -6.3073     0.7203
                 1   -6.5833     0.6981
Autocorrelations
Lag   Covariance  Correlation  -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
  0     4003.285      1.00000  | |********************|
  1      102.471      0.02560  | .|* |
  2     -117.368      -.02932  | *|. |
  3     -235.578      -.05885  | *|. |
  4   -26.946567      -.00673  | .|. |
  5   -46.750761      -.01168  | .|. |
  6   -77.100469      -.01926  | .|. |
  7     -224.055      -.05597  | *|. |
  8   -27.874814      -.00696  | .|. |
  9      132.415      0.03308  | .|* |
 10      316.534      0.07907  | .|** |
 11     -254.117      -.06348  | *|. |
 12      200.979      0.05020  | .|* |
"." marks two standard errors
Inverse Autocorrelation
• Ming Chang thesis
• Dual model: the dual of the AR(1) model $(1 - \alpha B) Y_t = e_t$ is the MA(1) model $Y_t = (1 - \alpha B) e_t$
• Chang shows that the IACF dies off slowly if you overdifference.
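Why overdifferencing shows up in the inverse autocorrelations can be seen from the simplest case. The sketch below assumes the original series is white noise $e_t$ that has been differenced once, and uses the definition of the IACF of a series as the ordinary ACF of its dual model.

```latex
\begin{align*}
\text{overdifferenced series:}\quad & Y_t = (1 - B)\,e_t
   && \text{(MA(1) with } \alpha = 1\text{)}, \\
\text{dual model:}\quad & (1 - B)\,Y_t = e_t
   && \text{(AR(1) with } \alpha = 1\text{, a random walk)}.
\end{align*}
```

The dual is a unit root process, and the sample ACF of such a process dies down slowly. The IACF of the overdifferenced series therefore inherits that slow decay.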
Differenced DJIA IACF
Inverse Autocorrelations
Lag  Correlation  -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
  1     -0.51119  | **********|. |
  2      0.01380  | .|. |
  3     -0.00533  | .|. |
  4      0.01061  | .|. |
  5     -0.02324  | .|. |
  6      0.00722  | .|. |
  7      0.02122  | .|. |
  8     -0.01617  | .|. |
  9      0.02831  | .|* |
 10     -0.04860  | *|. |
 11      0.02759  | .|* |
 12     -0.00422  | .|. |
2nd Differenced DJIA IACF
Just for illustration, here is the inverse autocorrelation you would get if you differenced these differences once more, that is, if you took the second difference of the original series. Note the roughly triangular appearance, suggesting that you should have stopped after the first difference.
Inverse Autocorrelations
Lag  Correlation  -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
  1      0.89720  | .|****************** |
  2      0.80302  | .|**************** |
  3      0.70785  | .|************** |
  4      0.60466  | .|************ |
  5      0.50498  | .|********** |
  6      0.41173  | .|******** |
  7      0.32523  | .|******* |
  8      0.23836  | .|***** |
  9      0.15871  | .|*** |
 10      0.09447  | .|** |
 11      0.05758  | .|* |
 12      0.01735  | .|. |
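Rho and F
$Y_t - \mu = \alpha_1 (Y_{t-1} - \mu) + \alpha_2 (Y_{t-2} - \mu) + e_t$
Factor: $(1 - \alpha_1 B - \alpha_2 B^2) = (1 - \rho B)(1 - \gamma B)$
• $\Delta Y_t = -(1-\rho)(1-\gamma)(Y_{t-1} - \mu) + \rho\gamma\,\Delta Y_{t-1} + e_t$
Rho
(1) Estimate $\rho\gamma = -\alpha_2$ (which equals $\gamma$ under $H_0$) by regression
(2) Divide $n \times$ [estimate of $(\rho - 1)(1 - \gamma)$, the coefficient of $Y_{t-1}$] by (1 - $\gamma$ estimate)
F
Regress $\Delta Y_t$ on 1, $t$, $Y_{t-1}$, $\Delta Y_{t-1}$
Test underlined items with F (3 numerator df)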
Trend is not Unit Root
$Y_t = \alpha + \beta t + Z_t$ with $Z_t$ stationary
$Y_{t-1} = \alpha + \beta (t-1) + Z_{t-1}$
$\Delta Y_t = \beta + \Delta Z_t$ with $\Delta Z_t$ an overdifferenced series!!
Example:
PROC REG;
   MODEL DV = DATE LAGV DV1-DV4;
   TEST DV3=0, DV4=0;

Variable    DF   Parameter Estimate   t Value   Pr > |t|   Type I SS
Intercept    1            -17.49220     -5.26     <.0001     0.00848
date         1              0.00147      5.41     <.0001     0.01395
LAGV         1             -0.21914     -5.80     <.0001    26.67803
DV1          1             -0.15446     -3.08     0.0022     0.94211
DV2          1             -0.18447     -3.72     0.0002     3.52898
DV3          1             -0.04433     -0.94     0.3477     0.07997
DV4          1             -0.05774     -1.31     0.1923     0.48763

Test 1 Results for Dependent Variable DV
Source        DF   Mean Square   F Value   Pr > F
Numerator      2       0.28380      0.99   0.3715
Denominator  497       0.28602
ACF Levels:
Lag  Covariance  Correlation  -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
  0    2.503910      1.00000  | |********************|
  1    2.327538      0.92956  | . |******************* |
  2    2.225324      0.88874  | . |****************** |
  3    2.193509      0.87603  | . |****************** |
  4    2.155492      0.86085  | . |***************** |
  5    2.127643      0.84973  | . |***************** |
  6    2.099292      0.83841  | . |***************** |
  7    2.069929      0.82668  | . |***************** |
  8    2.062194      0.82359  | . |**************** |
  9    2.051450      0.81930  | . |**************** |
 10    2.011864      0.80349  | . |**************** |
 11    2.006564      0.80137  | . |**************** |
 12    1.996735      0.79745  | . |**************** |
 13    1.960231      0.78287  | . |**************** |
 14    1.951272      0.77929  | . |**************** |
 15    1.940939      0.77516  | . |**************** |
 16    1.919167      0.76647  | . |*************** |
 17    1.906896      0.76157  | . |*************** |
 18    1.905406      0.76097  | . |*************** |
 19    1.892168      0.75569  | . |*************** |
 20    1.857199      0.74172  | . |*************** |
 21    1.846038      0.73726  | . |*************** |
 22    1.826167      0.72933  | . |*************** |
 23    1.816151      0.72533  | . |*************** |
 24    1.821228      0.72735  | . |*************** |
"." marks two standard errors
IACF - Differences
Lag  Correlation  -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
  1      0.48216  | . |********** |
  2      0.44816  | . |********* |
  3      0.34266  | . |******* |
  4      0.30682  | . |****** |
  5      0.25213  | . |***** |
  6      0.24854  | . |***** |
  7      0.23624  | . |***** |
  8      0.18675  | . |**** |
  9      0.14088  | . |*** |
 10      0.20330  | . |**** |
 11      0.13295  | . |*** |
 12      0.11437  | . |** |
 13      0.15524  | . |*** |
 14      0.11829  | . |** |
 15      0.09978  | . |** |
 16      0.10919  | . |** |
 17      0.09049  | . |** |
 18      0.06653  | . |*. |
 19      0.02886  | . |*. |
 20      0.09515  | . |** |
 21      0.05504  | . |*. |
 22      0.07104  | . |*. |
 23      0.06065  | . |*. |
 24      0.02284  | . | .
The ARIMA Procedure
Augmented Dickey-Fuller Unit Root Tests
Type          Lags       Rho   Pr < Rho     Tau   Pr < Tau       F   Pr > F
Zero Mean        2    0.0144     0.6861    0.02     0.6909
Single Mean      2  -14.2100     0.0474   -2.60     0.0944    3.42   0.1920
Trend            2  -85.7758     0.0007   -6.35     <.0001   20.18   0.0010

Do the test: Fit AR(3) plus trend. Diagnostics:

Autocorrelation Check of Residuals
To Lag   Chi-Square   DF   Pr > ChiSq   -----Autocorrelations-----
     6         1.59    3       0.6615   -0.015 . . . -0.000
    12        10.89    9       0.2835   -0.025 . . .  0.072
    18        12.43   15       0.6460   -0.036 . . .  0.031
    24        18.97   21       0.5872
    30        23.75   27       0.6439
    36        30.32   33       0.6014
    42        37.56   39       0.5358
    48        39.37   45       0.7087
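A fit of this kind, an AR(3) error model around a linear trend, might be requested as follows. This is a sketch, not the presenter's code: the data set name MYDATA and the level variable V are assumptions (V is inferred only from the names DV and LAGV in the PROC REG step), and DATE is taken to be the trend regressor used there.

```sas
/* Sketch: MYDATA and V are assumed names; DATE serves as the trend term. */
proc arima data=mydata;
   identify var=v crosscorr=(date) noprint;
   estimate p=3 input=(date) method=ml;   /* AR(3) errors plus linear trend */
run;
quit;
```

The ESTIMATE statement prints an Autocorrelation Check of Residuals table by default, which is the diagnostic shown above. Large p-values there indicate no leftover autocorrelation in the residuals.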
Extensions
• S. E. Said shows that models with lagged $e_t$ terms can still be tested by ADF tests.
• Nobel Prize “cointegration” idea: two or more unit root processes have a stationary linear combination. Compute, e.g., $Y_t = \ln(S_t / L_t)$ and test for stationarity.
• http://www4.stat.ncsu.edu/~dickey
• Click: SAS Code from Presentations
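The log-ratio check mentioned under cointegration could be sketched as below. The data set PAIR, its positive series S and L, and the lag choices in ADF=(0,1,2) are all hypothetical.

```sas
/* Hypothetical data set PAIR holding two positive series S and L. */
data ratio;
   set pair;
   y = log(s / l);            /* Y_t = ln(S_t / L_t) */
run;

proc arima data=ratio;
   identify var=y nlag=10 stationarity=(adf=(0,1,2));
run;
quit;
```

Because the combination ln(S) - ln(L) is specified in advance rather than estimated, the usual ADF p-values apply to Y. A rejection of the unit root would suggest that the two series move together in the long run.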
Thanks ! Questions ?