EC 827 Module 2 Forecasting a Single Variable from its own History
Regression Refresher (oh what fun?) • Consider a simple regression, e.g. one explaining the incomes of workers at a firm by their education, age, etc. • …for which a computer will calculate an optimal mathematical model: $\text{income} = b_0 + b_1\,\text{education} + b_2\,\text{age} + \dots + e$ • where the b’s are coefficients on education, age etc… and the e is a random error.
Coefficients • Let’s say that b2 (the coefficient on age) is 125. If we assume that the data for age is just some number of years (e.g. 32 years of age etc…), then the coefficient implies that, on average, a worker who is one year older will earn $125 more than a younger equivalent, all things being equal. • But… • is the relationship real? • does it matter at all?
Statistical Significance • To tell whether the coefficient should be believed, the computer will generally give us some measure of the reliability of the coefficient estimate, normally either: • the standard error • the t-stat • the p-value • Fortunately, they all tell the same story...
Statistical Significance II • …they all measure, given the natural variability of the data, how plausible it is that the actual (as opposed to estimated) coefficient is really zero (and hence, there’s no real relationship between, e.g. age and income). Strictly, the p-value is the probability of getting an estimate at least this far from zero if the actual coefficient were zero. • Generally, we use a 95% confidence level as our measure of “certainty”. If a coefficient passes one of three equivalent tests, we are reasonably confident there is a relationship:
Coefficient Tests I • If the coefficient is (in absolute value) at least approximately twice the size of the standard error or • The t-stat is greater (in absolute value) than 1.96 or • The p-value is less than 0.05 • …then we’re confident we’ve found a real relationship. Note: saying we’re confident there’s a relationship isn’t the same as saying we’re confident that our estimate of its size is accurate.
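The three tests can be checked by hand on a small example. The sketch below is illustrative only: the data are simulated (not from the course), the true age coefficient is set to 125 to echo the example above, the function name `simple_ols` is hypothetical, and the p-value uses a normal approximation rather than the exact t distribution.

```python
import math
import random

def simple_ols(x, y):
    """Fit y = b0 + b1*x by least squares.

    Returns the slope b1, its standard error, its t-stat, and an
    approximate two-sided p-value (normal approximation).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    residuals = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in residuals) / (n - 2)  # residual variance
    se = math.sqrt(s2 / sxx)                      # standard error of b1
    t_stat = b1 / se
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    return b1, se, t_stat, p_value

# Simulated (hypothetical) data: income rises by $125 per year of age.
random.seed(0)
ages = [random.uniform(20, 60) for _ in range(500)]
incomes = [30000 + 125 * a + random.gauss(0, 2000) for a in ages]

b1, se, t_stat, p_value = simple_ols(ages, incomes)
print(f"b1 = {b1:.1f}, standard error = {se:.1f}")
print(f"|b1| > 2 * se : {abs(b1) > 2 * se}")
print(f"|t| > 1.96    : {abs(t_stat) > 1.96}")
print(f"p < 0.05      : {p_value < 0.05}")
```

With the normal approximation, the t-stat test and the p-value test agree by construction, since a t-stat of 1.96 corresponds to a two-sided p-value of about 0.05.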
Coefficient Tests II • As a general rule, if a coefficient passes one of these tests, we think it’s important as an explanation of the dependent variable (in our example, income). If it fails the test, we may want to discard it. • We can also test whole sets of variables by using F-tests. Again, the computer will calculate these scores for you. Their meaning is the same: they measure whether or not it appears the variables are important determinants of whatever we’re modeling.
Economic Significance • Be careful: all these tests just look for a mathematically meaningful relationship, not a practically important one. • For example, it would be possible that the coefficient on age was 1.25, not 125, but was very statistically significant. That would imply that an extra year of age raised average salaries by $1.25 per year. Statistically measurable? Yes. Relevant or important? No.
Time Series Data • Data that we are interested in forecasting are time series: observations that have a well defined temporal ordering • Notation: $_{t-1}X_t$ = forecast constructed at time t-1 of the observation on X to be realized at time t. (There is no standard notation for this.) • By contrast, other data have no particular ordering: e.g. a sample of heights of people in this class
Forecasting a Single Variable I • Any time series can be considered as the sum of two components: • Deterministic Component: that part of the time series for which a perfect forecast of the future value can be constructed • examples: constant, time trends, constant seasonal factors • Stochastic Component: that part of the time series which is random (stochastic), so that predictions of its future value may turn out to be in error.
Forecasting a Single Variable II • The assumption is that a history of the variable is available: a time series of observations • We require that the information in that history have implications for current or future realizations of the variable. • For information available at present to be useful in forecasting the future, there must be correlation between events at different points in time.
Correlation and Causation • Correlation simply implies a link between two data series, not that the link is “cause and effect”. • In a perfect world, we’d like to find the effects of causes • In the real world, the best we can hope for is to find potential causes of effects
Forecasts from Own History: An Example • Consider a coin tossing experiment: • Toss a coin N times and record heads (1) or tails (0) for each replication • Generate a time series: 0,1,1,0, ... • No deterministic component to this time series • Does the information in the time series of outcomes provide any basis for forecasting the outcome of additional tosses? Why or why not?
Covariance Stationarity • Characteristic of the Stochastic Component of a time series • mean does not depend on time • variance does not depend on time and is finite • autocovariances (autocorrelations) depend only on the distance between observations and not the time of the observations.
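Sample Correlation Coefficients • For two series, Xi and Yi on which N observations are available: • correlation coefficient = $r = \dfrac{\sum_{i=1}^{N}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{N}(X_i - \bar{X})^2 \, \sum_{i=1}^{N}(Y_i - \bar{Y})^2}}$, where $\bar{X}$ and $\bar{Y}$ are the sample means.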
Autocorrelations: Definitions • Definition: An autocorrelation coefficient is the correlation between observations of a time series that are separated by a fixed time interval. • A First order autocorrelation is the correlation between observations in a time series and the same observations lagged one period. • A pth order autocorrelation is the correlation between observations in a time series and the same observations lagged p periods.
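A minimal sketch of the definition in code, assuming the usual sample estimator (lagged cross-products of deviations from the overall sample mean, divided by the total sum of squared deviations). The function name `sample_autocorrelation` and the toy series are illustrative, not from the course.

```python
def sample_autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag (lag >= 0)."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum((series[t] - mean) * (series[t - lag] - mean)
                for t in range(lag, n))
    return numer / denom

# A slowly cycling toy series: neighbouring observations look alike,
# so the first order autocorrelation is positive.
toy = [1, 2, 3, 4, 5, 4, 3, 2, 1, 2, 3, 4, 5]
for lag in range(1, 4):
    print(f"lag {lag}: {sample_autocorrelation(toy, lag):.2f}")

# A series that flips sign every period is strongly negatively
# autocorrelated at lag 1.
flipping = [1, -1] * 50
print(f"flipping series, lag 1: {sample_autocorrelation(flipping, 1):.2f}")
```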
Sample Autocorrelation Function I • [Figure: Coin Tossing Experiment; sample autocorrelations by lag (lags 1 to 9), vertical axis from -1.00 to 1.00]
Predictions of Outcomes of Future Coin Tosses • Outcome of any particular coin toss (head or tail) is not influenced by the outcome of any past toss (assuming a fair coin) • Coin tosses are independent events • Autocorrelations for time series of coin toss outcomes are zero (estimated autocorrelations are not significantly different from zero) • No useful forecasting info in time series
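The coin tossing experiment is easy to replicate. The sketch below simulates a fair coin (the seed and the number of tosses are arbitrary choices) and compares each sample autocorrelation with the rough 95% band of ±2/√N that is commonly used to judge whether an estimate is significantly different from zero.

```python
import math
import random

def sample_autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum((series[t] - mean) * (series[t - lag] - mean)
                for t in range(lag, n))
    return numer / denom

random.seed(827)
n_tosses = 1000
tosses = [random.randint(0, 1) for _ in range(n_tosses)]  # heads = 1, tails = 0

band = 2 / math.sqrt(n_tosses)  # approximate 95% band around zero
acf = [sample_autocorrelation(tosses, lag) for lag in range(1, 10)]
for lag, r in enumerate(acf, start=1):
    verdict = "outside band" if abs(r) > band else "inside band"
    print(f"lag {lag}: {r:+.3f} ({verdict}, band = +/-{band:.3f})")
```

Because the band is only a 95% band, an occasional estimate falling just outside it is expected even when the tosses are truly independent.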
Sample Partial Autocorrelation Coefficients • Construct a linear regression of a variable on a constant and lagged observations on the dependent variable up to order p. • The estimated coefficient on the p’th lag in this regression is the estimated partial autocorrelation coefficient of order p, i.e. the correlation between the event p periods ago and today’s event, given all of the events in between.
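A sketch of the regression approach, using only the standard library. The helper names (`solve_linear_system`, `partial_autocorrelation`) and the simulated AR(1) series are assumptions made for illustration; in practice a statistics package would do this. The order-p estimate is taken as the coefficient on the last lag in a regression with p lags.

```python
import random

def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def partial_autocorrelation(series, p):
    """Coefficient on lag p when regressing x_t on a constant and lags 1..p."""
    rows = [[1.0] + [series[t - j] for j in range(1, p + 1)]
            for t in range(p, len(series))]
    y = series[p:]
    k = p + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)[-1]

# Simulated series in which each value is 0.8 times the previous value
# plus a fresh random shock.
random.seed(1)
x = [0.0]
for _ in range(2999):
    x.append(0.8 * x[-1] + random.gauss(0, 1))

pacf = [partial_autocorrelation(x, p) for p in range(1, 4)]
for p, value in enumerate(pacf, start=1):
    print(f"order {p}: {value:+.3f}")
```

For this simulated series, only the first partial autocorrelation should be far from zero, because once yesterday's value is known, earlier values add no further information.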
White Noise Variables • white noise process (stochastic variable) = • zero mean • constant finite variance • not serially correlated (uncorrelated with observations at different points in time) • autocorrelation coefficients of order 1 and higher are all zero. • Coin tossing (measured as deviations from its mean of 0.5) is such a white noise process (actually stronger: independent white noise)
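Wold Representation Theorem • Any zero mean covariance stationary process Xt can be written as an infinite weighted sum of current and past values of a white noise process: $X_t = \sum_{j=0}^{\infty} c_j e_{t-j}$, with $c_0 = 1$ and $\sum_{j=0}^{\infty} c_j^2 < \infty$, where the $e_t$ are white noise.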
Wold Theorem: Implementation • What does that mean? It means that everything that happens is a function of an infinite series of all past random events. True, but… so what? • The problem is to estimate the terms • impossible since there are an infinite number • trick is to find some model that approximates the Wold representation
Forecasting Without Infinite Information • What will tomorrow look like? Generally, tomorrow will look like today. • What’s more important: • How much will tomorrow look like today? • How will tomorrow respond to today’s shocks? • How will tomorrow be different from today? • First, know what question to ask… then worry about answers.
Autocorrelation Patterns I • Autoregressive (AR) patterns: “Today looks like previous days” • at least three components • deterministic component (e.g. constant, trend, constant seasonal factors) • second component depends on observed values of previous periods • third component is a new shock, independent of anything that has happened in the past
Sample Autocorrelation Function II • [Figure: AR(1) Variable, a1 = .8; sample autocorrelations by lag (lags 1 to 9), vertical axis from -1.00 to 1.00]
Sample Autocorrelation II • Note that in an AR(1) process all observations prior to time = t are correlated with the outcome at t. • all previous observations have information that is useful in forecasting what will happen at t. • usefulness (size of autocorrelation coefficients) decreases as information becomes older (as we move to the more distant past) • This is the general result for AR processes, although the autocorrelation patterns can differ.
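The decay pattern can be reproduced by simulation. This sketch assumes an AR(1) with no deterministic component, a1 = 0.8, and standard normal shocks; the seed and sample size are arbitrary. For such a process the theoretical autocorrelation at lag k is a1 raised to the power k, which the sample values should roughly track.

```python
import random

def sample_autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum((series[t] - mean) * (series[t - lag] - mean)
                for t in range(lag, n))
    return numer / denom

random.seed(2)
a1 = 0.8
n_obs = 5000

# X_t = a1 * X_{t-1} + e_t, starting from zero.
x = [0.0]
for _ in range(n_obs - 1):
    x.append(a1 * x[-1] + random.gauss(0, 1))

for lag in range(1, 6):
    estimate = sample_autocorrelation(x, lag)
    print(f"lag {lag}: sample {estimate:.2f}, theoretical {a1 ** lag:.2f}")
```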
Autocorrelation Patterns II • Moving Average (MA) Process: “Today is determined by yesterday’s shocks” • at least three components • the deterministic component of the series • the effect of the current shock on the series • the effect of one or more previous shocks whose influence still persists in the current observation.
Sample Autocorrelation Function III • [Figure: MA(1) Variable, c1 = .8; sample autocorrelations by lag (lags 1 to 9), vertical axis from -1.00 to 1.00] • The shock occurs at t=1, with full strength, then persists with strength c1*et-1 = (.8*1) = .8 at t=2, then vanishes
Sample Autocorrelations III • Note that in an MA process only a limited number of past observations have information (nonzero autocorrelations) that is useful in forecasting the current outcome. • the number of past observations that have potentially useful forecasting information is determined by the length of the MA process (here only one past observation)
Sample Autocorrelations IV • Size of first order autocorrelation of MA(1) process determined by value of c1 (see Diebold, p. 158) • (don’t worry about remembering this. That’s what PCs are for)
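For reference, the standard result for an MA(1) process written as X_t = e_t + c1·e_{t-1} is that the first order autocorrelation equals c1 / (1 + c1²), and all higher order autocorrelations are zero. The sketch below checks this against a simulated series; c1 = 0.8, the seed, and the function names are illustrative assumptions.

```python
import random

def ma1_first_autocorrelation(c1):
    """Theoretical lag-1 autocorrelation of X_t = e_t + c1 * e_{t-1}."""
    return c1 / (1 + c1 ** 2)

def sample_autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum((series[t] - mean) * (series[t - lag] - mean)
                for t in range(lag, n))
    return numer / denom

random.seed(3)
c1 = 0.8
n_obs = 5000
shocks = [random.gauss(0, 1) for _ in range(n_obs + 1)]
x = [shocks[t] + c1 * shocks[t - 1] for t in range(1, n_obs + 1)]

print(f"theoretical lag 1: {ma1_first_autocorrelation(c1):.3f}")
for lag in range(1, 4):
    print(f"sample lag {lag}:    {sample_autocorrelation(x, lag):+.3f}")
```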
Forecasts from AR Models I • How does anything that occurs today (time = t) carry forward to influence future observations? • Assume shock et = 1.0 • AR(1) coefficient in (e.g.) Annual Inflation Model is 0.58 • Shock of 1.0 at time = t carries forward to generate an increase in Xt+1 of 0.58 • Increase of 0.58 in Xt+1 carries forward to generate an increase in Xt+2 of 0.58*0.58 = 0.34
Forecasts from AR Models II • Effect of shock at time = t persists to affect observations at t+3, t+4, etc. • Size of the effect on future observations becomes smaller as long as absolute value of autoregressive coefficient is < 1.0 • Contrast: • Shocks in AR model carry forward to affect future observations indefinitely; • Shocks in MA model carry forward only a limited number of periods
Forecasts from AR Models III • The effect on future observations at time t+j of a shock at a particular time t is measured by the impulse response function
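For an AR(1) model the impulse response is just the coefficient raised to successive powers. A minimal sketch, using the 0.58 coefficient from the annual inflation example; the function name `ar1_impulse_response` is hypothetical.

```python
def ar1_impulse_response(a1, horizon, shock=1.0):
    """Effect on X at t, t+1, ..., t+horizon of a one-time shock at time t."""
    return [shock * a1 ** j for j in range(horizon + 1)]

irf = ar1_impulse_response(0.58, 12)
for j, effect in enumerate(irf):
    print(f"t+{j}: {effect:.4f}")
```

The first three values are 1.0, 0.58 and 0.3364 (about 0.34), matching the hand calculation for Xt+1 and Xt+2.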
AR(1) Impulse Response Function • [Figure: Impulse Response Function, Annual Inflation AR(1) Model; horizontal axis from 0 to 12, vertical axis from 0.00 to 1.00]
Forecasts from MA Models • How does anything that occurs today (time = t) carry forward to influence future observations? • Assume that we see a shock et = 1.0. • What are implications for t+1, t+2, etc?
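One way to work through the question is to feed a single unit shock through an MA(1) model and trace the result. This sketch assumes the model X_t = e_t + c1·e_{t-1} with c1 = 0.8 and no deterministic component; the function name `ma1_path` is hypothetical.

```python
def ma1_path(c1, shocks):
    """Values of X_t = e_t + c1 * e_{t-1} for a given sequence of shocks."""
    path = []
    previous_shock = 0.0
    for shock in shocks:
        path.append(shock + c1 * previous_shock)
        previous_shock = shock
    return path

# A shock of 1.0 at time t, then no further shocks.
shocks = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
response = ma1_path(0.8, shocks)
for j, effect in enumerate(response):
    print(f"t+{j}: {effect}")
```

The shock shows up at full strength at t, at strength 0.8 at t+1, and not at all from t+2 onward.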
Transitory & Permanent Shocks • A transitory shock is one whose effect eventually dies off • go far enough out into the future and events at that time are not influenced by what is happening today. • A permanent shock is one whose effect continues to influence events, no matter how far off into the future. • all past events have a lasting effect on the present and the future
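A hedged illustration of the distinction, not taken from the slides: in an AR(1) model, a coefficient below 1.0 in absolute value gives a transitory shock, while a coefficient of exactly 1.0 gives a shock that never fades. The function name and the horizon of 50 periods are arbitrary choices.

```python
def ar1_impulse_response(a1, horizon, shock=1.0):
    """Effect on X at t, t+1, ..., t+horizon of a one-time shock at time t."""
    return [shock * a1 ** j for j in range(horizon + 1)]

transitory = ar1_impulse_response(0.58, 50)  # coefficient below 1.0
permanent = ar1_impulse_response(1.0, 50)    # coefficient equal to 1.0

for j in (0, 1, 5, 10, 50):
    print(f"t+{j}: transitory {transitory[j]:.6f}, permanent {permanent[j]:.6f}")
```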