Lecture 22: Autoregression. April 9th, 2014
Question • Open the Exxon.xls dataset (~gasper/Regression/Data) • Just looking at the data, which smoothing method would you initially suggest to forecast Exxon Price? • Simple exponentially weighted average • An exponentially weighted average using Holt’s method • An exponentially weighted average using Winters’ method • I have no idea.
Administrative • Homework due Thursday @ 12pm • Quiz 5 on Monday
Last time • Seasonality • Simple vs. Holt's vs. Winters' methods for exponentially weighted moving averages • Polynomial Regression
Regression Forecasting Models Simple regression of trend: • Use time as an independent variable • Seasonality with regression: • Use dummy variables to represent the seasons (quarters, etc.); an example equation is sketched below.
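As an illustrative sketch (the variable names are assumptions, not copied from the slides), a trend regression with quarterly dummies, using Q1 as the baseline, looks like:

Y_t = β0 + β1·t + β2·Q2_t + β3·Q3_t + β4·Q4_t + ε_t

where t = 1, 2, 3, … indexes the period and Q2, Q3, Q4 are 0/1 indicators for the quarter.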
Non-linear Regression Polynomial regression: • Uses powers of time as independent variables • Example of a 4th degree polynomial (written out below):
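The example equation from the slide isn't reproduced above; a 4th degree polynomial trend has the standard form:

Y_t = β0 + β1·t + β2·t² + β3·t³ + β4·t⁴ + ε_t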
Polynomial Regression • Higher orders allow for better fits of the observed data: as the order of the polynomial increases, R² on the fitted sample goes up.
Polynomial Regression • But… outside of the observed data, very bad things can happen: high-order polynomials can swing wildly once you extrapolate beyond the fitted range. • In general, avoid fitting high-order polynomial models.
Polynomial Regression Sidebar: • You can fit a polynomial regression for non-time-series data. • I.e., you can include things like Income^3 or Income^4, but we've avoided higher-order polynomials in this class. • We've done things like Income^(1/2) or Income^(-2). • Sometimes it's completely fine to fit a higher-order polynomial regression equation, but ask yourself why. • Realize your regression model is probably "wrong" to some extent anyway. • What we're often after is a good, generalizable model. Don't make the model overly complicated.
Durbin-Watson • Recall the Durbin-Watson statistic • The correlation between adjacent residuals is known as autocorrelation. • Easy to calculate by hand, or you can use StatTools: StatDurbinWatson(<range_of_residuals>) • The first-order autocorrelation coefficient is denoted r1
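For reference, the statistic is computed from the residuals e_t as:

D = Σ_{t=2..n} (e_t − e_{t−1})² / Σ_{t=1..n} e_t²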
Durbin-Watson • Since D ≈ 2·(1 − r1), D ranges between 0 and 4. • Some quick rules of thumb: • If D < 1: positive autocorrelation • Errors are positively correlated: an increase in the error at time t tends to follow an increase in the previous period. • If D ≈ 2: no autocorrelation • If D > 3: negative autocorrelation • Errors are negatively correlated: an increase in the error at time t tends to follow a decrease in the previous period.
Autoregression • A regression that uses prior values of the response as a predictor • I.e., we'll use a "lagged" variable. • We'll refer to this as an AR(1) model, or first-order autoregression: • AR because it's an autoregressive process • (1) because we're only including 1 lagged variable as a predictor.
Lagged Variables • Constructing lagged variables is actually quite easy (a quick sketch follows below).
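The slides demonstrate this in Excel (copy the column down one row). A rough sketch of the same idea in Python, using a toy pandas DataFrame (the column name Sales and the numbers are made up for illustration):

```python
import pandas as pd

# Toy quarterly sales series (made-up numbers, just for illustration)
df = pd.DataFrame({"Sales": [100.0, 104.0, 98.0, 110.0, 115.0, 109.0]})

# A lagged variable is just the same column shifted down one period
df["Sales_lag1"] = df["Sales"].shift(1)

# The first row has no prior value (NaN), so drop it before regressing
df_ar = df.dropna()
print(df_ar)
```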
Example: AR(1) Predict quarterly Exxon sales using an AR(1) model. What is your estimate of the β coefficient on Lagged(price)? • 3.7 • 0.93 • 0.87 • 0.58 • I still have no idea.
Example AR(1) • Using the AR(1) model, what is your forecast of quarterly sales in Q3 1996? • 28125 • 27164 • 28561 • I have no idea.
Durbin-Watson with Autoregression • You can, and should, look for autocorrelation in autoregressive models. • BUT the Durbin-Watson statistic is not the right way to do it when you have a lagged dependent variable (an autoregressive model).
Forecasting t+k periods • So far we've only focused on predicting Y_{t+1} • As you move beyond t+1, the uncertainty blows up. • The book says it's hard to quantify. • Not really true… but it can be hard to write down in closed form. • IMHO the best way to quantify the uncertainty in future estimates from the models we've discussed is with simulation (a sketch follows below), but that type of simulation is cumbersome to do in Excel. • Other statistical models allow for more precise analytic statements (but rely on distributional assumptions). • For the time being we'll only focus on one-period forecasts: t+1.
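As a minimal sketch of what that simulation looks like (not the book's procedure; the AR(1) coefficients, error standard deviation, and starting value below are made up), propagate the fitted model forward many times with random errors and look at the spread of the simulated paths:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fitted AR(1): Y_t = b0 + b1*Y_{t-1} + e_t, with e_t ~ N(0, sigma^2)
b0, b1, sigma = 5.0, 0.8, 2.0
y_last = 50.0            # last observed value
horizon, n_sims = 8, 10_000

paths = np.empty((n_sims, horizon))
for s in range(n_sims):
    y = y_last
    for k in range(horizon):
        y = b0 + b1 * y + rng.normal(0.0, sigma)   # one simulated step ahead
        paths[s, k] = y

# The spread of simulated forecasts widens as the horizon k grows
for k in range(horizon):
    lo, hi = np.percentile(paths[:, k], [2.5, 97.5])
    print(f"t+{k+1}: 95% interval ({lo:.1f}, {hi:.1f})")
```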
Back to Autoregression • Recall that an AR(1) process is just a regression of Y_t on Y_{t-1}. • We can also include more than 1 lag. If we include p lags of the dependent variable, then we have an AR(p) model (both equations are written out below).
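Written out (these are the standard forms, not copied from the slide):

AR(1):  Y_t = β0 + β1·Y_{t−1} + ε_t
AR(p):  Y_t = β0 + β1·Y_{t−1} + β2·Y_{t−2} + … + βp·Y_{t−p} + ε_t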
Autoregression • How do you determine the appropriate order of the AR model? • I.e., how should you choose p? • This is similar to the inclusion/exclusion question we've addressed before. The difference is that it's related to the amount of autocorrelation in the process. • Look at the autocorrelation plot. • Would typically look at the partial autocorrelations as well (a sketch of both plots follows below). • Partial autocorrelation is the amount of autocorrelation between a lag and an observation not explained by the intermediate lags. • E.g., the partial autocorrelation between Y_{t-3} and Y_t is the autocorrelation not explained by Y_{t-1} and Y_{t-2}. • Unfortunately StatTools doesn't produce partial autocorrelation plots.
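StatTools doesn't make these plots, but other tools do. A rough sketch using Python's statsmodels (the series below is simulated just so the plots have something to show; in practice you'd pass your own data):

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Simulate an AR(2)-like series so the plots have some visible structure
rng = np.random.default_rng(1)
y = np.zeros(200)
for t in range(2, 200):
    y[t] = 0.6 * y[t - 1] + 0.25 * y[t - 2] + rng.normal()

plot_acf(y, lags=20)    # autocorrelation at each lag
plot_pacf(y, lags=20)   # partial autocorrelation: helps choose p for an AR(p)
plt.show()
```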
Differencing • The first difference is defined below. • Taking first differences is a very common and useful transformation for time series analysis. • When there is a large amount of autocorrelation in the residuals, differencing often reduces it. • Differencing can also transform a non-stationary series into a stationary one (although not always). • Stationary: the time series has a constant mean, variance, autocorrelation, etc. over time. Most regression models assume the process is approximately stationary. • Non-stationary: statistical properties of the series change over time.
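The definition referenced above:

ΔY_t = Y_t − Y_{t−1}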
First Differences • If we have a time series Y, we can transform it and look at the differences ΔY. • In particular, we can estimate a model for Δy_t. • You can also use lags of the differences, e.g., an AR(1) model of the differences (both forms are sketched below).
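The slide's equations aren't reproduced here; typical versions (standard forms, with assumed notation) are a simple model for the differences and an AR(1) model in the differences:

Δy_t = β0 + ε_t
Δy_t = β0 + β1·Δy_{t−1} + ε_t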
%-changes • A natural extension of first differences is a percentage change. • You normalize the change by dividing by the period t−1 value (formula below). • Stock returns are percentage changes.
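Written out, the percentage change is:

%Δy_t = (y_t − y_{t−1}) / y_{t−1}   (often multiplied by 100)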
Higher order differences As you might guess, differencing can be extended beyond first differences: the second difference is the difference of the first differences, and this generalizes to the i-th difference Δ^i.
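For example, the second difference is:

Δ²y_t = Δy_t − Δy_{t−1} = y_t − 2·y_{t−1} + y_{t−2}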