DSCI 5340: Predictive Modeling and Business Forecasting Spring 2013 – Dr. Nick Evangelopoulos

Lecture 3: Time Series Regression (Ch. 6) Material based on: Bowerman-O’Connell-Koehler, Brooks/Cole

Review of Homework in Textbook • Ex 4.20 Page 210 • Ex 4.21 Page 212 • Ex 5.5 Page 266 • EX 5.10 Page 268

Insurance Innovation Data Ex 4.20 Page 210 • 4.20 part a. Since there are two parallel lines – one for Mutual and one for Stock, a dummy variable can show the difference in the intercepts of the models. • Y = b0 + b1X + b2DS + e

Ex 4.20 part b • Y = b0 + b1X + b2DS + e • μ_{a,m} = b0 + b1a + b2(0) for X = a assets and m = Mutual type of company (DS = 0) • μ_{a,s} = b0 + b1a + b2(1) for X = a assets and s = Stock type of company (DS = 1) • μ_{a,s} - μ_{a,m} = b2 • b2 is the difference between the mean number of months elapsed (y) for a stock type of company and for a mutual type of company with the same assets.

Since the p-value for b2 is 3.74187E-05, which is below both 0.05 and 0.01, reject H0: b2 = 0 and conclude that b2 is not equal to 0. • The 95% CI for b2 is 4.98 to 11.13, which does not include 0. • Type of Firm is a significant variable in predicting the number of months elapsed at both the 5% and 1% significance levels.

Ex 4.20 part d - Interaction • The data indicate that the lines for the two types of firms are parallel, so an interaction term should not be needed. • A p-value of .9821 is greater than any reasonable alpha level, so we fail to reject H0 that the coefficient of xDS is 0. The interaction term can be dropped from the model.

Ex 4.21 page 212 • Y = b0 + b1DM + b2DT + e • For Bottom Shelf, DM = 0 and DT = 0, which implies μB = b0 + b1(0) + b2(0) = b0. • For Middle Shelf, DM = 1 and DT = 0, which implies μM = b0 + b1(1) + b2(0) = b0 + b1. • For Top Shelf, DM = 0 and DT = 1, which implies μT = b0 + b1(0) + b2(1) = b0 + b2. • bM could be used to represent b1, and bT could be used to represent b2.
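The dummy-variable coding above can be illustrated with a minimal Python sketch. The sales figures are hypothetical (they are not the textbook data), and NumPy's least-squares solver stands in for whatever regression software is used in class. With Bottom Shelf as the reference group, the fitted intercept is the Bottom Shelf sample mean and each dummy coefficient is a difference from that mean.

```python
import numpy as np

# Hypothetical sales for three shelf heights (illustrative only, not textbook data).
bottom = np.array([50.0, 54.0, 52.0])
middle = np.array([76.0, 78.0, 80.0])
top = np.array([48.0, 50.0, 46.0])

y = np.concatenate([bottom, middle, top])

# DM = 1 for Middle Shelf, DT = 1 for Top Shelf; Bottom Shelf is the reference group.
dm = np.concatenate([np.zeros(3), np.ones(3), np.zeros(3)])
dt = np.concatenate([np.zeros(3), np.zeros(3), np.ones(3)])
X = np.column_stack([np.ones_like(y), dm, dt])

# Ordinary least squares fit of Y = b0 + bM*DM + bT*DT + e.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b_m, b_t = coef

print("b0 (Bottom mean):", b0)
print("bM (Middle minus Bottom):", b_m)
print("bT (Top minus Bottom):", b_t)
```

Changing which group has no dummy column changes the reference group, and with it the meaning of every coefficient.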

Ex 4.21 part b • Note that if bM and bT are both equal to zero, then μB = b0, μM = b0 + bM = b0 + 0 = b0, and μT = b0 + bT = b0 + 0 = b0. • Thus, H0: bM = 0 and bT = 0 implies H0: μB = μM = μT.

Ex 4.21 part c • Since μB = b0, μM = b0 + bM, and μT = b0 + bT, we can solve for b0, bM, and bT. • Therefore, bM = μM - μB, bT = μT - μB, and bM - bT = μM - μT. • Note that t(.025, df=15) = 2.131 (see table on page 593). • 95% CI for bM is 21.4 +/– 2.131*1.433, which is 18.35 to 24.45. • 95% CI for bT is -4.30 +/– 2.131*1.433, which is -7.35 to -1.25.

Ex 4.21 part d page 213 • Note that the Fit is 77.2, which corresponds to the mean of the Middle Shelf sales. Thus the output at the bottom of the Analysis of Variance gives a 95% CI for mean sales and a 95% PI for an individual sales value when using a middle display height.

Ex 4.21 part e • Note that in part c, we were not able to get a confidence interval on μM - μT since it was equal to bM - bT. However, if the following model is used: Y = b0 + b1DB + b2DM + e, then μM - μT is equal to bM since the Top Shelf is now the reference group. • Note that t(.025, df=15) = 2.131 (same as before). • 95% CI for μM - μT (note: equal to bM) is 25.7 +/– 2.131*1.433, which is 22.65 to 28.75.

Page 266 Ex 5.5 - outliers

EX 5.10 part a, Page 268 • Y* = b0 + b1X + e, where Y* = ln(Y). • For 7 desktop computers, the point prediction for Y* is 5.0206 and the 95% PI is 4.3402 to 5.7010. • Back-transforming, the point prediction for Y is exp(5.0206) = 151.5 and the 95% PI is exp(4.3402) = 76.72 to exp(5.7010) = 299.166. • Note that in SAS, adding an observation with X = 7 and a “.” (missing value) for Y* to the data will produce a predicted value and a prediction interval for that X.
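The back-transformation from the log scale can be reproduced with a few lines of Python. This is only a check of the arithmetic on the slide. The three log-scale values are copied from the slide, and the variable names are hypothetical.

```python
import math

# Point prediction and 95% PI endpoints for Y* = ln(Y), copied from the slide.
log_pred = 5.0206
log_lower = 4.3402
log_upper = 5.7010

# Exponentiate each value to return to the original scale of Y.
pred = math.exp(log_pred)
lower = math.exp(log_lower)
upper = math.exp(log_upper)

print("Point prediction for Y: %.1f" % pred)
print("95%% PI for Y: %.2f to %.2f" % (lower, upper))
```

Because exp() is monotone, the endpoints of the interval transform directly. The interval for Y is no longer symmetric about the point prediction.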

EX 5.10 part b, Page 268 • There are a couple of low (large negative) residuals at -.59979. • It may be possible to remove the corresponding observations one at a time, or to try adding a squared term to the model.

Chapter 6 Polynomial Fits • Use higher order terms when curvature exists in graph of y and x. Typically, x is time and square and cubic terms are added to increase the R square. • Interactions can also be formed with higher order terms.

Requirements for Fitting a pth-Order Polynomial Regression Model • 1. The number of levels of x must be greater than or equal to (p + 1). • 2. The sample size n must be greater than (p + 1) to allow sufficient degrees of freedom for estimating σ².
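The two requirements can be checked before fitting. The following is a minimal sketch using NumPy's `polyfit`. The helper `can_fit_polynomial` and the noise-free quadratic series are hypothetical illustrations, not material from the textbook.

```python
import numpy as np

def can_fit_polynomial(x, p):
    """Check the two requirements for a pth-order polynomial fit."""
    n_levels = len(np.unique(x))  # distinct values of x
    n = len(x)                    # sample size
    return bool(n_levels >= p + 1 and n > p + 1)

# Hypothetical series: time t = 1..10 with an exact quadratic trend (no noise).
t = np.arange(1, 11, dtype=float)
y = 2.0 + 0.5 * t + 0.3 * t ** 2

p = 2
if can_fit_polynomial(t, p):
    # polyfit returns coefficients from the highest power down to the constant.
    coefs = np.polyfit(t, y, deg=p)
    print("b2, b1, b0 =", coefs)
```

With only three observations a quadratic passes through every point exactly, leaving no degrees of freedom for the error variance. That is why the second requirement uses a strict inequality.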

Count the number of times that a curve changes direction. The highest-order term of a polynomial fit should be one plus the number of times the curve changes direction (for example, a curve that changes direction once calls for a quadratic). What degree polynomial would you use here?

Extrapolation • The use of a model outside its range is dangerous (although sometimes unavoidable). [Figure: GNP (y) versus Inflation Rate (x, %)]

Trend and coefficient sign Line Tending Upward: b1 > 0 Curve Tending Downward: b1 < 0

Curvature and coefficient sign Holds Water: b2 > 0 Does Not Hold Water: b2 < 0

Inverse relationship Curve Tending Upward: b1 < 0 Curve Tending Downward: b1 > 0

Exponential curve Curve Tending Upward: b1 < 0 Curve Tending Downward: b1 > 0

S Curve y = exp(b0 + b1(1/x) + e)
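Taking logs of the S curve gives ln(y) = b0 + b1(1/x) + e, which is linear in 1/x and can be fit by ordinary least squares. The sketch below assumes hypothetical coefficients and noise-free data so that the fit recovers them exactly. It is an illustration, not the textbook's example.

```python
import numpy as np

# Hypothetical "true" coefficients and x values (illustrative only).
b0_true, b1_true = 3.0, -2.0
x = np.array([1.0, 2.0, 4.0, 5.0, 10.0])

# S curve without an error term: y = exp(b0 + b1 * (1/x)).
y = np.exp(b0_true + b1_true / x)

# Linearize: regress ln(y) on an intercept and 1/x.
X = np.column_stack([np.ones_like(x), 1.0 / x])
coef, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
b0_hat, b1_hat = coef

print("b0 estimate:", b0_hat)
print("b1 estimate:", b1_hat)
```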

Logarithmic transformation for Y Line Tending Upward: b1 > 0 Curve Tending Downward: b1 < 0

Logarithmic transformation for X Curve Tending Upward: b1 > 0 Curve Tending Downward: b1 < 0

Examples of autocorrelation in residuals

Detecting Autocorrelation

Detecting Positive Autocorrelation

Detecting Negative Autocorrelation

Rules of thumb for DW • If DW is close to 2, there is no evidence of first-order autocorrelation. • If DW is close to 0, there is positive autocorrelation. • If DW is close to 4, there is negative autocorrelation.
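These rules of thumb follow from the definition of the Durbin-Watson statistic, DW = sum of (e_t - e_{t-1})^2 divided by the sum of e_t^2. A minimal Python sketch is shown below. The function name and the two residual series are hypothetical and chosen only to show the two extremes.

```python
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences over sum of squares."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Hypothetical residuals that stay on one side of zero for long runs (positive autocorrelation).
runs = [1, 1, 1, 1, -1, -1, -1, -1]

# Hypothetical residuals that flip sign every period (negative autocorrelation).
alternating = [1, -1, 1, -1, 1, -1, 1, -1]

print("DW for long runs:", durbin_watson(runs))            # 0.5, toward 0
print("DW for alternating:", durbin_watson(alternating))   # 3.5, toward 4
```

Formal testing compares DW with the tabulated lower and upper bounds rather than relying on closeness to 0, 2, or 4 alone.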

Modeling Seasonal Factor with Dummy Variables

Trigonometric Models The second model is for cyclical (seasonal) variation that increases over time.

Autoregressive errors Use Proc ARIMA for a First Order Autoregressive Process for the Error Term

Prediction Intervals for Autoregressive Models

Homework in Textbook Page 318 Ex 6.3 Page 318 Ex 6.4
