
DSCI 5340: Predictive Modeling and Business Forecasting Spring 2013 – Dr. Nick Evangelopoulos


Presentation Transcript


  1. DSCI 5340: Predictive Modeling and Business Forecasting, Spring 2013 – Dr. Nick Evangelopoulos Lecture 7: Box-Jenkins Models – Part II (Ch. 9) Material based on: Bowerman-O’Connell-Koehler, Brooks/Cole

  2. Homework in Textbook Page 438 Ex 9.2, Ex 9.3, Ex 9.4

  3. Ex 9.2 Page 438

  4. Ex 9.3 Page 438 Part a: Autocorrelations die down slowly – series is not stationary

  5. Ex 9.3b Page 438

  6. Ex 9.3c Page 438

  7. Ex 9.3d Page 438 Autocorrelations Cut off Quickly – Series is Stationary

  8. Ex 9.3e Page 438 Interpret SAC & SPAC

  9. Ex 9.3e Page 438 Interpret SAC & SPAC • SAC dies exponentially and • SPAC cuts off after Lag 1, therefore…

  10. Ex 9.4a Page 438 • …the series is AR(1)

  11. Ex 9.4b Page 438

  12. Ex 9.4c Page 439

  13. Ex 9.4d Page 439 part 1 • y3hat = 3.06464 + (.64774 + 1)y2 - .64774y1 • y3hat = 3.06464 + (.64774 + 1)*239 - .64774*235 • y3hat = 244.6556 • y3 - y3hat = 244.090 - 244.6556 = -.5656
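This hand calculation can be checked with a small SAS DATA step (a minimal sketch; the intercept and coefficient are the estimates quoted above, and the forecast equation is the ARIMA(1,1,0) form used on the slide):

   data _null_;
      y1 = 235;        /* observed value at t = 1 */
      y2 = 239;        /* observed value at t = 2 */
      y3 = 244.090;    /* observed value at t = 3 */
      /* one-step forecast: yhat(t) = 3.06464 + (1 + 0.64774)*y(t-1) - 0.64774*y(t-2) */
      y3hat = 3.06464 + (0.64774 + 1)*y2 - 0.64774*y1;
      resid = y3 - y3hat;
      put y3hat= resid=;   /* approximately y3hat=244.6556 resid=-0.5656 */
   run;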

  14. Ex 9.4d Page 439 part 2 • At time origin 90, • y91hat = 3.06464 + (.64774 + 1)y90 - .64774y89 • y91hat = 3.06464 + (.64774 + 1)*1029.480 - .64774*1018.420 • y91hat = 1039.708 • y92hat = 3.06464 + (.64774 + 1)y91hat - .64774y90 • y92hat = 3.06464 + (.64774 + 1)*1039.708 - .64774*1029.480 • y92hat = 1049.398 • y93hat = 3.06464 + (.64774 + 1)y92hat - .64774y91hat • y93hat = 3.06464 + (.64774 + 1)*1049.398 - .64774*1039.708 • y93hat = 1058.739
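In practice these recursive multi-step forecasts come from the FORECAST statement of PROC ARIMA rather than from hand calculation. A hedged sketch, assuming the exercise data sit in a data set named Ex94 with the series stored in a variable y (both names are placeholders, not from the textbook):

   proc arima data = Ex94;
      identify var = y(1) nlag = 15;     /* work with the first-differenced series */
      estimate p = 1 method = ml;        /* AR(1) on the differences, i.e. ARIMA(1,1,0) */
      forecast lead = 3 out = Ex94_fcst; /* 3-step-ahead forecasts from the last observation (origin 90) */
   run;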

  15. Ex 9.4d Page 439 part 3

  16. Chapter 9 General Nonseasonal Models

  17. Autoregressive Moving Average Models A time series that is a linear function of p past values plus a linear combination of q past errors is called an autoregressive moving average process of order (p,q), denoted ARMA(p,q). Also denoted ARIMA(p,0,q)
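Written out (one common sign convention, following Box-Jenkins, puts minus signs on the MA coefficients; some texts use plus signs), the ARMA(p,q) model is

\[
y_t = \delta + \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p}
      + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q}.
\]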

  18. Box-Jenkins ARIMAX Models • ARIMAX: AutoRegressive Integrated Moving Average with eXogenous variables • AR: Autoregressive – the time series is a function of its own past. • MA: Moving Average – the time series is a function of past shocks (deviations, innovations, errors, and so on). • I: Integrated – differencing removes stochastic trend and seasonal components, so forecasting requires integration (undifferencing). • X: Exogenous – the time series is influenced by external factors. (These input variables can actually be endogenous or exogenous.)

  19. Formulas for TACs

  20. Formulas for TACs

  21. Determine Whether the SAC or the SPAC is Cutting Off More Abruptly

  22. What if SAC and SPAC Are Not Significant for any Lags? • This could happen if the time series is white noise:
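For reference, white noise is the trivial model

\[
y_t = \varepsilon_t, \qquad \varepsilon_t \sim \text{iid}(0, \sigma^2),
\]

whose theoretical autocorrelations and partial autocorrelations are zero at every lag k ≥ 1, so the SAC and SPAC spikes should all fall inside the significance limits.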

  23. The Backshift Operator The backshift operator B^k (sometimes L^k is used) shifts a time series back by k time units (shifts by 1, 2, or k time units are shown below). The backshift operator notation is a convenient way to write ARMA models.
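In symbols,

\[
B\,y_t = y_{t-1}, \qquad B^2 y_t = y_{t-2}, \qquad B^k y_t = y_{t-k},
\]

so an ARMA(p,q) model can be written compactly as

\[
\phi_p(B)\,y_t = \delta + \theta_q(B)\,\varepsilon_t,
\qquad
\phi_p(B) = 1 - \phi_1 B - \cdots - \phi_p B^p,
\qquad
\theta_q(B) = 1 - \theta_1 B - \cdots - \theta_q B^q.
\]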

  24. ACF and PACF after 1-Lag Differencing Indication of MA(1) or MA(2), with a sharp cut-off after lag 2; the damping pattern rules out a pure AR model.

  25. Autocorrelation Plots for an AR(2) Time Series

  26. Classical Decomposition (Box-Jenkins) Procedure • Verify the presence of any seasonal or time-based trends. • Achieve stationarity using techniques such as differencing, where you difference consecutive data points up to lag N. • Use the sample Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) to see whether the data follow a Moving Average (MA) or an Autoregressive (AR) process, respectively. • Model orders: p – AR order, d – differencing order, q – MA order. • Run “goodness of fit” tests (e.g., the Akaike Information Criterion) on the selected model parameters to find model fits that are statistically significant. (A PROC ARIMA sketch of this workflow follows below.)
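These steps map onto the IDENTIFY, ESTIMATE, and FORECAST statements of PROC ARIMA. A minimal sketch, reusing the TowelSales data set that appears later in these slides; the order choices below are placeholders, not the textbook's final model:

   proc arima data = TowelSales;
      identify var = y(1) nlag = 15; /* difference once, inspect SAC/SPAC for tentative p and q */
      estimate p = 1 method = ml;    /* fit a tentative model; compare AIC/SBC across candidates */
      forecast lead = 12 out = fcst; /* forecast 12 periods ahead with the chosen model */
   run;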

  27. AIC and SBC/BIC – Information Criteria: Smaller is Better
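For a model with k estimated parameters, maximized log-likelihood ln L, and n observations,

\[
\mathrm{AIC} = -2\ln L + 2k,
\qquad
\mathrm{SBC} = \mathrm{BIC} = -2\ln L + k\,\ln n .
\]

Both criteria penalize extra parameters (SBC/BIC more heavily), and in each case the model with the smaller value is preferred.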

  28. Stationarity of the AR process • If an AR model is not stationary, previous values of the error term will have a non-declining effect on the current value of the dependent variable. • This implies that the coefficients of the corresponding (infinite) MA representation would not converge to zero as the lag length increases. • For an AR model to be stationary, the coefficients of the corresponding MA representation must decline with lag length, converging to 0.

  29. Not Stationary

  30. AR Process • The test for stationarity of an AR model (with p lags) is that the roots of the characteristic equation lie outside the unit circle (i.e. have modulus greater than 1), where the characteristic equation is:
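The formula referred to on this slide did not survive transcription; the standard AR(p) characteristic equation is

\[
1 - \phi_1 z - \phi_2 z^2 - \cdots - \phi_p z^p = 0,
\]

and stationarity requires every root z of this equation to satisfy |z| > 1.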

  31. Unit Root • When testing any variable for stationarity, we describe it as testing for a ‘unit root’; this is based on the same idea. • The most basic AR model is the AR(1) model, on which most tests for stationarity, such as the Dickey-Fuller test, are based.

  32. Unit Root Test • The AR(1) model written with the backshift operator L, and its characteristic equation:
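Reconstructing the two formulas this slide refers to, from the standard AR(1) setup:

\[
y_t = \phi_1 y_{t-1} + \varepsilon_t
\;\Longleftrightarrow\;
(1 - \phi_1 L)\,y_t = \varepsilon_t ,
\]

with characteristic equation

\[
1 - \phi_1 z = 0 .
\]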

  33. Unit Root Test • For the AR(1) model with ϕ1 = 1, the characteristic equation 1 - z = 0 has a root of z = 1. This lies on the unit circle rather than outside it, so we conclude that the process is non-stationary. • As we increase the number of lags in the AR model, the potential number of roots increases: with 2 lags the characteristic equation is quadratic and produces 2 roots, and for the model to be stationary both must lie outside the unit circle.

  34. One Example: The Dickey-Fuller Single Mean Test • Model: • Null Hypothesis: • Alternative Hypothesis:
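The slide's formulas were not transcribed; the standard single-mean Dickey-Fuller setup is:

Model: \( y_t = \mu + \phi_1 y_{t-1} + \varepsilon_t \), usually estimated as \( \Delta y_t = \mu + (\phi_1 - 1)\,y_{t-1} + \varepsilon_t \).
Null hypothesis: \( H_0: \phi_1 = 1 \) (unit root, non-stationary).
Alternative hypothesis: \( H_a: |\phi_1| < 1 \) (stationary around a non-zero mean).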

  35. Mean of an AR(1) Process • The unconditional mean for an AR(1) process with a constant (μ) is given by: • For ϕ1 = 1, the mean drifts to infinity and the process is non-stationary
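For the AR(1) model \( y_t = \mu + \phi_1 y_{t-1} + \varepsilon_t \), the unconditional mean is

\[
E(y_t) = \frac{\mu}{1 - \phi_1},
\]

which is finite only for \( \phi_1 \neq 1 \) and stable only when \( |\phi_1| < 1 \).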

  36. Variance of an AR(1) Process • The (unconditional) variance for an AR process of order 1 (excluding the constant) is: • For ϕ1 = 1, the variance drifts to infinity and the process is non-stationary
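With the constant suppressed, \( y_t = \phi_1 y_{t-1} + \varepsilon_t \) and \( \operatorname{Var}(\varepsilon_t) = \sigma^2 \) give

\[
\operatorname{Var}(y_t) = \frac{\sigma^2}{1 - \phi_1^2},
\]

which is finite and positive only when \( |\phi_1| < 1 \).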

  37. ADF – Augmented Dickey-Fuller Test for Unit Root
   proc arima data = TowelSales;
      identify var = y(1) nlag = 15 stationarity = (adf = (2));
      title "ARIMA Stationarity Analysis";
   run;
   Type          Lags       Rho   Pr < Rho     Tau   Pr < Tau        F    Pr > F
   Zero Mean        2  -112.754     0.0001   -6.09     <.0001
   Single Mean      2  -112.743     0.0001   -6.07     <.0001    18.40    0.0010
   Trend            2  -120.735     0.0001   -6.18     <.0001    19.12    0.0010
   Reject unit root – conclude AR(2) is stationary.

  38. Scan Procedure – Use for Preliminary Estimate
   proc arima data = TowelSales;
      identify var = y(1) nlag = 15 scan;
      title "ARIMA Analysis";
   run;
   In this example, ARIMA(2,2) is the simplest model for which the SCAN test statistics are insignificant. Model notation:
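The model notation on the slide was not transcribed. Assuming the (2,2) orders apply to the differenced series y(1), the corresponding ARIMA(2,1,2) model in backshift form would read

\[
(1 - \phi_1 B - \phi_2 B^2)(1 - B)\,y_t
= \delta + (1 - \theta_1 B - \theta_2 B^2)\,\varepsilon_t .
\]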

  39. Tentative Model from Output – MA(1) or ARIMA(0,0,1) • The simplest model that has a high probability is MA(1): • AR(1) has a low probability • AR(2) is more complex • ARMA(1,1) is more complex

  40. Forecast Model Building: Fit and Holdout Samples • Fit sample: used to estimate model parameters for accuracy evaluation; used to forecast values in the holdout sample. • Holdout sample: used to evaluate model accuracy; simulates a retrospective study. • Full = Fit + Holdout: the full data set is used to fit the deployment model.

  41. Homework in Textbook Page 443-445 Ex 9.5, Ex 9.6, Ex 9.7
