Regression with ARMA Errors
Example: Seat-belt legislation
Story: In February 1983, seat-belt legislation was introduced in the UK in the hope of reducing the number of deaths and serious injuries on the roads.
Goal: Check whether or not this law was effective.
1. Formulation
Data:
• Yt, t = 1,…,120: the monthly number of deaths and serious injuries on UK roads for 10 years beginning in January 1975 (SBL.TSM)
• ft, t = 1,…,120: indicator variable showing whether the law is in force at time t: ft = 0 for 1 ≤ t ≤ 98, ft = 1 for 99 ≤ t ≤ 120 (SBLIN.TSM)
Model: Yt = a + b·ft + Wt, or in matrix form Y = Xβ + W with β = (a, b)^T
If the estimated value of the coefficient b is significantly negative, the seat-belt legislation will be considered effective.
2. Try OLS regression
• Assume Wt ~ WN(0, σ²); then we can use OLS regression:
1. Estimate β = (a, b)^T by minimizing the sum of squares S(β) = (Y − Xβ)^T (Y − Xβ), which yields β̂ = (X^T X)^(−1) X^T Y.
2. Assess how good the OLS estimator is: E(β̂) = β and Cov(β̂) = σ² (X^T X)^(−1).
3. If Wt ~ N(0, σ²), we can calculate the 95% confidence interval for b, and therefore test whether b is significantly different from zero.
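The OLS steps above can be sketched in R as follows. This is a minimal sketch under a loud assumption: the SBL.TSM file is not available here, so the built-in `Seatbelts` "drivers" series (restricted to January 1975 to December 1984) is used as a stand-in and may not be identical to SBL.TSM. The names `y`, `f`, `X`, `beta_hat` and `ci_b` are illustrative.

```r
# Stand-in data (assumption): built-in Seatbelts "drivers" series, 1975-1984.
y <- as.numeric(window(datasets::Seatbelts[, "drivers"],
                       start = c(1975, 1), end = c(1984, 12)))
n <- length(y)                       # 120 monthly observations
f <- as.numeric(seq_len(n) >= 99)    # f_t = 0 for t <= 98, 1 for t >= 99

# 1. OLS estimate from the normal equations: (X'X)^(-1) X'Y
X <- cbind(1, f)
XtX <- t(X) %*% X
beta_hat <- solve(XtX, t(X) %*% y)

# 2. Precision of the estimator: sigma^2 (X'X)^(-1), with sigma^2 estimated
res <- y - X %*% beta_hat
sigma2_hat <- sum(res^2) / (n - ncol(X))
se <- sqrt(diag(sigma2_hat * solve(XtX)))

# 3. 95% confidence interval for b under Gaussian white-noise errors
ci_b <- beta_hat[2] + c(-1, 1) * qt(0.975, df = n - ncol(X)) * se[2]

# Cross-check against lm()
fit <- lm(y ~ f)
print(cbind(matrix = as.numeric(beta_hat), lm = as.numeric(coef(fit))))
print(ci_b)
```

If the interval `ci_b` lies entirely below zero, b is significantly negative under the white-noise assumption; the following slides check whether that assumption holds.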
Do it in ITSM:
1. File > Project > Open > Univariate, then select SBL.TSM
2. Click Regression > Specify:
• polynomial regression order = 1
• auxiliary variable = SBLIN.TSM
3. Click OK and press the GLS button
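Results:
========================================
ITSM::(Regression estimates)
========================================
Method: Generalized Least Squares
Y(t) = L(t) + W(t)
Trend Function:
L(t) = .16211443E+04 t^0 - .29944868E+03 f(t)
ARMA Model:
W(t) = Z(t)
WN Variance = 1.000000
Coeff   Value            Std Error
0        .16211443E+04   .10153462
1       -.29944868E+03   .23192141
Note: these standard errors are computed with the WN variance fixed at 1. Multiplying them by the residual standard error (about 224.9) gives approximately 22.8 and 52.2, in line with the standard errors reported by R.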
In R:
> summary(lm(SBL ~ SBLIN))
Call:
lm(formula = SBL ~ SBLIN)
Residuals:
     Min       1Q   Median       3Q      Max
 -312.14  -162.39   -68.14   104.97   652.86
Coefficients:
             Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)   1621.14       22.83   71.004   < 2e-16 ***
SBLIN         -299.45       52.15   -5.742  7.41e-08 ***
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
Residual standard error: 224.9 on 118 degrees of freedom
Multiple R-Squared: 0.2184, Adjusted R-squared: 0.2118
F-statistic: 32.97 on 1 and 118 DF, p-value: 7.411e-08
3. Check the residuals Wt
Is the assumption Wt ~ WN(0, σ²) correct?
• Check the residual plot and the ACF/PACF => there is a seasonal component with period 12, so the white-noise assumption does not hold.
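The residual check can be sketched in R as follows. As an assumption, the built-in `Seatbelts` "drivers" series (1975 to 1984) stands in for SBL.TSM, so the numbers need not match the ITSM plots. The names `w_hat`, `bound` and `rho12` are illustrative, and the Ljung-Box test is an extra diagnostic not used in the slides.

```r
# Stand-in data (assumption): built-in Seatbelts "drivers" series, 1975-1984.
y <- as.numeric(window(datasets::Seatbelts[, "drivers"],
                       start = c(1975, 1), end = c(1984, 12)))
f <- as.numeric(seq_along(y) >= 99)

# Residuals from the OLS fit Y_t = a + b f_t + W_t
w_hat <- resid(lm(y ~ f))

# Sample ACF up to lag 36; element 1 is lag 0, so element 13 is lag 12
a <- acf(w_hat, lag.max = 36, plot = FALSE)
bound <- 1.96 / sqrt(length(w_hat))        # approximate 95% white-noise bounds
lags_outside <- which(abs(a$acf[-1]) > bound)
rho12 <- a$acf[13]

# Portmanteau test of the white-noise hypothesis (extra diagnostic)
lb <- Box.test(w_hat, lag = 24, type = "Ljung-Box")

print(lags_outside)
print(c(rho12 = rho12, bound = bound, ljung_box_p = lb$p.value))
```

A sample autocorrelation at lag 12 well outside the bounds is the numerical counterpart of the seasonal pattern seen in the plots.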
4. Deseasonalizing
• Differencing at lag 12: from Yt = a + b·ft + Wt, define Mt = Yt − Yt−12. The intercept a cancels and we get
  Mt = b·gt + Nt, t = 13,…,120,
  where gt = ft − ft−12 (gt = 1 for 99 ≤ t ≤ 110, gt = 0 otherwise) and Nt = Wt − Wt−12.
  (In ITSM: Transform > Difference, input 12)
• Perform OLS regression of Mt (SBLD.TSM) on gt (SBLDIN.TSM) without an intercept term.
In ITSM:
  Trend Function: L(t) = - .34691667E+03 g(t)
  ARMA Model: N(t) = Z(t), WN Variance = 1.000000
  Coeff   Value            Std Error
  1       -.34691667E+03   .28867513
In R (summary(lm(SBLD ~ 0+SBLDIN))):
  Coefficients:
          Estimate  Std. Error  t value  Pr(>|t|)
  SBLDIN    -346.9        40.6   -8.545  9.76e-14 ***
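The differencing step and the regression without intercept can be sketched in R as follows. As an assumption, the built-in `Seatbelts` "drivers" series (1975 to 1984) stands in for SBL.TSM, and `m` and `g` play the roles of SBLD and SBLDIN; the estimates need not match the output above.

```r
# Stand-in data (assumption): built-in Seatbelts "drivers" series, 1975-1984.
y <- as.numeric(window(datasets::Seatbelts[, "drivers"],
                       start = c(1975, 1), end = c(1984, 12)))
f <- as.numeric(seq_along(y) >= 99)

# Lag-12 differences: M_t = Y_t - Y_{t-12}, g_t = f_t - f_{t-12}, t = 13..120
m <- diff(y, lag = 12)
g <- diff(f, lag = 12)
t_idx <- 13:120

# g_t equals 1 exactly for the first 12 months after the change in f_t
print(t_idx[g == 1])

# OLS regression of M_t on g_t without an intercept
fit_d <- lm(m ~ 0 + g)
print(summary(fit_d)$coefficients)
```

Because g is an indicator, the no-intercept OLS estimate of b is simply the mean of Mt over the 12 months where gt = 1.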
5. Check the residuals Nt
Is the assumption Nt ~ WN(0, θ²) correct?
6. Fit an ARMA(p,q) model for Nt
• The residual series looks stationary
• ACF/PACF suggest p ≤ 13, q ≤ 13
• => Model selection within 0 ≤ p ≤ 13 and 0 ≤ q ≤ 13 by minimizing AICC
To Do: Select Model > Estimation > Autofit to fit AR and MA models with orders up to 13 to the residuals, with no mean-correction
Results: MA(12)
Method: Maximum Likelihood
M(t) = L(t) + N(t), based on Trend Function: L(t) = - .34691667E+03 g(t)
ARMA Model:
N(t) = Z(t) + .2189 Z(t-1) + .09762 Z(t-2) + .03093 Z(t-3) + .06447 Z(t-4)
     + .06878 Z(t-5) + .1109 Z(t-6) + .08120 Z(t-7) + .05650 Z(t-8)
     + .09192 Z(t-9) - .02828 Z(t-10) + .1826 Z(t-11) - .6267 Z(t-12)
WN Variance = .125967E+05
MA Coefficients
  .218866   .097620   .030935   .064468
  .068780   .110918   .081204   .056495
  .091917  -.028275   .182628  -.626664
Standard Error of MA Coefficients
  .074987   .075880   .076411   .075956
  .076014   .075901   .075901   .076014
  .075956   .076411   .075880   .074987
(Residual SS)/N = .125967E+05
AICC = .136720E+04
AICC = .136984E+04 (Corrected for regression)
BIC = .135676E+04
-2Log(Likelihood) = .133733E+04
Accuracy parameter = .100000E-08
Number of iterations = 1
Number of function evaluations = 239136
Uncertain minimum.
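The two AICC values in this output can be checked by hand. The sketch below assumes the usual definition AICC = −2 log L + 2kn/(n − k − 1), with k = p + q + 1 parameters and n = 108 differenced observations, and assumes that "Corrected for regression" means counting the regression coefficient b as one extra parameter. The function name `aicc` is illustrative.

```r
# AICC = -2 log L + 2 k n / (n - k - 1), where k = p + q + 1
aicc <- function(neg2loglik, k, n) {
  neg2loglik + 2 * k * n / (n - k - 1)
}

neg2loglik <- 1337.33   # -2Log(Likelihood) from the output above
n <- 108                # M_t is observed for t = 13..120

aicc_arma <- aicc(neg2loglik, k = 12 + 1, n = n)       # MA(12) plus WN variance
aicc_reg  <- aicc(neg2loglik, k = 12 + 1 + 1, n = n)   # one more parameter for b

print(round(c(AICC = aicc_arma, AICC_regression = aicc_reg), 2))
```

Under these assumptions both values agree with the ITSM output to within rounding, which supports reading the second figure as the same criterion with one additional parameter.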
7. Nt is not white noise, so we improve our previous estimate of b by iteration
• Step 1. Starting from the OLS fit, fit an ARMA(p,q) model to Nt. This improved our model for Nt from white noise to MA(12).
• Step 2. With this new model for Nt, go back and improve the estimate of b by GLS, using the new covariance matrix Γ = E(N N^T) => b̂ = (g^T Γ^(−1) g)^(−1) g^T Γ^(−1) M
• Step 3. Compute the new residuals Nt using the new estimate of b, and fit an ARMA(p,q) model to them as before.
• Step 4. Repeat Step 2 and then Step 3 until the estimators have stabilized.
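One pass of Step 2 can be sketched in R as follows. The MA(12) coefficients and WN variance are copied from the ITSM output; everything else is an assumption for illustration: the built-in `Seatbelts` "drivers" series (1975 to 1984) stands in for SBL.TSM, and the names `acvf`, `Gamma`, `gls_b` and `se_b` are hypothetical. The result is not claimed to reproduce the ITSM estimates.

```r
# MA(12) coefficients and WN variance taken from the ITSM output
theta <- c(.218866, .097620, .030935, .064468, .068780, .110918,
           .081204, .056495, .091917, -.028275, .182628, -.626664)
sigma2 <- 12596.7
psi <- c(1, theta)
n <- 108                                     # t = 13..120

# Autocovariance of an MA(12): gamma(h) = sigma^2 * sum_j psi_j psi_{j+h}
acvf <- function(h) {
  if (h > 12) return(0)
  sigma2 * sum(psi[1:(13 - h)] * psi[(1 + h):13])
}
Gamma <- toeplitz(sapply(0:(n - 1), acvf))   # Gamma = E(N N^T)

# Regressor g_t = 1 for 99 <= t <= 110, 0 otherwise
g <- as.numeric((13:120) %in% 99:110)

# GLS estimate b = (g' Gamma^-1 g)^-1 g' Gamma^-1 M and its standard error
Gi_g <- solve(Gamma, g)
gls_b <- function(m) sum(Gi_g * m) / sum(g * Gi_g)
se_b <- sqrt(1 / sum(g * Gi_g))

# Stand-in data (assumption): lag-12 differences of the Seatbelts series
y <- as.numeric(window(datasets::Seatbelts[, "drivers"],
                       start = c(1975, 1), end = c(1984, 12)))
m <- diff(y, lag = 12)
print(c(b_gls = gls_b(m), se_b = se_b))
```

Steps 3 and 4 would then refit the ARMA model to the residuals `m - gls_b(m) * g`, rebuild `Gamma`, and repeat until the estimates stop changing.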
To Do:
• After fitting the ARMA(p,q) model for Nt, the model in the Regression estimates window is automatically updated to:
  M(t) = L(t) + N(t)
  L(t) = - 0.32844534E+03 g(t)
• Press the MLE button for a new round of iteration
• Finally we arrive at the model:
  Mt = b·gt + Nt, b = -328.45, SE(b) = 49.41
  N(t) = Z(t) + .2189 Z(t-1) + .09762 Z(t-2) + .03093 Z(t-3) + .06447 Z(t-4)
       + .06878 Z(t-5) + .1109 Z(t-6) + .08120 Z(t-7) + .05650 Z(t-8)
       + .09192 Z(t-9) - .02828 Z(t-10) + .1826 Z(t-11) - .6267 Z(t-12)
  Z(t) ~ WN(0, 12581)
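Conclusion
Mt = b·gt + Nt, b = -328.45, SE(b) = 49.41
=> b + 1.96 × SE(b) = -328.45 + 96.84 = -231.61 < 0, so the approximate 95% confidence interval (-425.29, -231.61) lies entirely below zero
=> b is significantly negative
=> the law had a beneficial effect
To Do: Regression > Show Fit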