Chapter 20 Time Series Analysis and Forecasting
Time Series Analysis… A variable measured over time (in sequential order) is called a time series. We analyze such data to detect patterns that will enable us to forecast future values of the variable. As you might expect, this technique has wide application: (1) Governments want to know future values of interest rates, unemployment rates, and percentage increases in the cost of living. (2) Housing industry economists must forecast mortgage interest rates, demand for housing, and the cost of building materials. (3) Many companies attempt to predict the demand for their product and their share of the market. (4) Universities and colleges often try to forecast the number of students who will be applying for acceptance at post-secondary-school institutions. …and so on!
Time Series Components… A time series can consist of four different components: (1) long-term trend, (2) cyclical variation, (3) seasonal variation, and (4) random variation. A trend is a long-term, relatively smooth pattern or direction that usually persists for more than one year.
Time Series Components… A cycle is a wavelike pattern describing long-term behavior (lasting more than one year). Cyclical patterns that are consistent and predictable are quite rare; hence, we will ignore this type of variation.
Time Series Components… The seasonal component of the time series exhibits short-term (less than one year), calendar-repetitive behavior.
Time Series Components… Random variation comprises the irregular, unpredictable changes in the time series. It tends to hide the other (more predictable) components. One of our objectives will be to remove random variation…
Smoothing Techniques… If we can determine which components actually exist in a time series, we can develop better forecasts. We can reduce random variation by smoothing the time series. Two methods to smooth the data are moving averages and exponential smoothing.
Moving Averages… A moving average for a time period is the arithmetic mean of the values in that time period and those close to it. This is what is meant by a phrase like “three-year moving average”. Example 20.1: [Xm20-01.xls] and so on…
Example 20.1… COMPUTE Again, calculating manually can be tedious and error-prone. In Excel: Data > Data Analysis > Moving Average. Note how the moving average is “smoother” than the raw data…
Moving Average… INTERPRET The averaging process removes some of the random variation. A 5-quarter moving average removes even more variation and gives a smoother line, but it also loses the seasonality that appears in the 3-quarter moving average.
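A minimal Python sketch of the odd-period moving average described above. The function name `moving_average` and the sample values are illustrative assumptions; they are not the data in Xm20-01.

```python
def moving_average(series, window):
    """Moving average for an odd window length.

    Each value is the mean of the period itself and its neighbours.
    Periods without enough neighbours on both sides get None.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    result = [None] * len(series)
    for t in range(half, len(series) - half):
        result[t] = sum(series[t - half : t + half + 1]) / window
    return result


# Hypothetical quarterly values, for illustration only.
sales = [39, 37, 61, 58, 18, 56, 82, 27]
ma3 = moving_average(sales, 3)  # loses one period at each end
ma5 = moving_average(sales, 5)  # smoother, but loses two periods at each end
print(ma3)
print(ma5)
```

The longer window averages over more neighbours, which is why it smooths more and also why it leaves more periods at the ends without a value.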
Centered Moving Average… We’ve considered moving averages for odd numbers of time periods. 3-period: $(y_{t-1} + y_t + y_{t+1})/3$. 5-period: $(y_{t-2} + y_{t-1} + y_t + y_{t+1} + y_{t+2})/5$. What happens when we use even numbers of periods to calculate moving averages?
Centered Moving Average… With an even number of observations included in the moving average, the average is placed between the two periods in the middle. To place the moving average in an actual time period, we need to center it. Two consecutive moving averages are centered by taking their average, and placing it in the middle between them.
Centered Moving Average… Consider this 6-period time series: 15, 27, 20, 14, 25, 11 Can we calculate its four-period moving average?
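One way to work through this question in Python, as a sketch. The function name `centered_moving_average` is an illustrative assumption; the series is the one given above.

```python
def centered_moving_average(series, window):
    """Centered moving average for an even window length.

    Returns (plain, centered): the plain moving averages, each of which
    falls between two periods, and the averages of consecutive pairs,
    each of which lands on an actual period.
    """
    if window < 2 or window % 2 != 0:
        raise ValueError("window must be a positive even number")
    plain = [
        sum(series[i : i + window]) / window
        for i in range(len(series) - window + 1)
    ]
    centered = [(a + b) / 2 for a, b in zip(plain, plain[1:])]
    return plain, centered


plain, centered = centered_moving_average([15, 27, 20, 14, 25, 11], 4)
print(plain)     # [19.0, 21.5, 17.5]
print(centered)  # [20.25, 19.5]
```

The three four-period averages fall at periods 2.5, 3.5, and 4.5. Averaging consecutive pairs gives 20.25 for period 3 and 19.5 for period 4.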
Exponential Smoothing… There are two drawbacks with the moving average method of smoothing: (1) There are no moving averages for the first and last sets of time periods. (2) The moving average “forgets” most of the previous time-series values (i.e., it only looks at those around it). Exponential smoothing addresses these issues…
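Exponentially Smoothed Time Series… An exponentially smoothed time series is given by $S_t = w y_t + (1 - w) S_{t-1}$ (for $t \ge 2$), where: $S_t$ = exponentially smoothed time series at time $t$; $y_t$ = time series at time period $t$; $S_{t-1}$ = exponentially smoothed time series at time $t-1$; $w$ = smoothing constant, where $0 \le w \le 1$. In general, starting from $S_1 = y_1$: $S_t = w y_t + w(1 - w) y_{t-1} + w(1 - w)^2 y_{t-2} + \dots + (1 - w)^{t-1} y_1$, so every smoothed value is a weighted combination of all the original data up to time $t$.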
Example 20.2… COMPUTE We can calculate these values manually… $S_1 = y_1$ and $S_t = w y_t + (1 - w) S_{t-1}$
Example 20.2… COMPUTE Excel > Data > Data Analysis > Exponential Smoothing [Xm20-02]. Note that the damping factor Excel asks for is 1 – w.
Example 20.2… INTERPRET With w = .7, we have very little smoothing. With w = .2, we have too much smoothing.
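The recursion S1 = y1, St = w·yt + (1 – w)·St-1 can be sketched in a few lines of Python. The function name `exponential_smoothing` and the sample series are illustrative assumptions; they are not the data in Xm20-02.

```python
def exponential_smoothing(series, w):
    """Exponentially smoothed series: S_1 = y_1, S_t = w*y_t + (1 - w)*S_{t-1}."""
    if not 0 <= w <= 1:
        raise ValueError("smoothing constant w must satisfy 0 <= w <= 1")
    if not series:
        return []
    smoothed = [series[0]]  # S_1 = y_1
    for y in series[1:]:
        smoothed.append(w * y + (1 - w) * smoothed[-1])
    return smoothed


# Hypothetical values, for illustration only.
y = [10, 20, 30, 15, 25]
print(exponential_smoothing(y, 0.2))  # small w: heavy smoothing
print(exponential_smoothing(y, 0.7))  # large w: follows the data closely
```

A small w keeps most of the previous smoothed value, so the result changes slowly. A large w puts most of the weight on the newest observation, so the result stays close to the raw series.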
Trend and Seasonal Effects… A trend can be linear or nonlinear (or, in fact, take any number of functional forms). The easiest way of measuring the long-term trend is by regression analysis, where the independent variable is time.
Seasonal Analysis… Seasonal variation may occur within a year or within shorter intervals, such as a month, week, or day. To measure the seasonal effect, we compute seasonal indexes, which gauge the degree to which the seasons differ from one another. Employment numbers, for example, are seasonally adjusted to account for summer jobs of students, etc. Was the change in employment numbers due to seasonality or a real change in the economy?
Computing Seasonal Indexes… We can use this procedure to compute seasonal indexes: (1) Compute the sample regression line: $\hat{y}_t = b_0 + b_1 t$. (2) For each time period, compute the ratio $y_t / \hat{y}_t$. (3) For each type of season, compute the average of the ratios from step 2. (4) Adjust the averages from step 3 so that the average of all seasons equals 1.
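The four steps can be sketched in plain Python as follows. The function names, the assumption that the first observation falls in season 1, and the sample occupancy values are all illustrative; they are not the Xm20-03 data.

```python
def fit_trend(series):
    """Step 1: least-squares line y-hat = b0 + b1*t, with t = 1, 2, ..., n."""
    n = len(series)
    times = range(1, n + 1)
    t_mean = sum(times) / n
    y_mean = sum(series) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, series))
    sxx = sum((t - t_mean) ** 2 for t in times)
    b1 = sxy / sxx
    b0 = y_mean - b1 * t_mean
    return b0, b1


def seasonal_indexes(series, n_seasons):
    """Steps 2 to 4; assumes the first observation belongs to season 1."""
    b0, b1 = fit_trend(series)
    ratios = [[] for _ in range(n_seasons)]
    for i, y in enumerate(series):
        t = i + 1
        ratios[i % n_seasons].append(y / (b0 + b1 * t))  # step 2: y_t / y-hat_t
    averages = [sum(r) / len(r) for r in ratios]         # step 3: mean per season
    overall = sum(averages) / n_seasons
    return [a / overall for a in averages]               # step 4: rescale to mean 1


# Hypothetical quarterly occupancy rates over two years, for illustration only.
occupancy = [0.56, 0.70, 0.80, 0.57, 0.58, 0.73, 0.82, 0.58]
indexes = seasonal_indexes(occupancy, 4)
print(indexes)
```

An index below 1 marks a season that tends to fall under the trend line, and an index above 1 marks a season that tends to exceed it.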
Example 20.3… Compute Calculate the seasonal indexes to account for variations in Bermuda hotel occupancy rates from Xm20-03: Add the independent variable, time… Compute the sample regression line: Regression Analysis… Add a new spreadsheet column…
Example 20.3… Compute For each time period, compute the ratio: Occ. Rate/Y-hat
Example 20.3… Compute For each type of season, compute the average of the ratios from step 2 (years 2003 to 2007).
Example 20.3… Compute Adjust the averages so the average of all seasons = 1. No adjustments are required, since the average of all seasons is already 1.
Example 20.3… COMPUTE Alternatively, you can set up the data in this fashion: [ yt ] [season code] and use the Seasonal Indexes tool from Data Analysis Plus
Example 20.3 INTERPRET The seasonal indexes tell us that, on average, the occupancy rates in the first and fourth quarters are below the annual average, and the occupancy rates in the second and third quarters are above the annual average. E.g., we expect the occupancy rate in the first quarter to be 12.2% (100% - 87.8%) below the annual rate, and 7.6% above for the second quarter, etc.
Time Series and Trend… Here is the time series data and the regression line together:
Deseasonalizing a Time Series… One application of seasonal indexes is to remove the seasonal variation in a time series by deseasonalizing it. The result is called a seasonally adjusted time series. This allows us to more easily compare the time series across seasons… Seasonally Adjusted Time Series = Actual Time Series / Seasonal Index
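As a sketch, the seasonal adjustment is a division of each observation by the index of its season. The function name `deseasonalize`, the assumption that the first observation falls in season 1, and the numbers below are illustrative only.

```python
def deseasonalize(series, indexes):
    """Divide each observation by the seasonal index of its season.

    Assumes the first observation belongs to season 1 and that the
    seasons repeat in order.
    """
    n_seasons = len(indexes)
    return [y / indexes[i % n_seasons] for i, y in enumerate(series)]


# Hypothetical two-season series and indexes, for illustration only.
actual = [8.0, 12.0, 9.0, 11.0]
adjusted = deseasonalize(actual, [0.8, 1.2])
print(adjusted)
```

After the adjustment, values from different seasons are on a common footing, so a change from one period to the next reflects trend and random variation rather than the season.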
Effects of “Deseasonalization”… Here we’re comparing the original occupancy rate time series data with the seasonally adjusted time series data:
Interpretation… Compared to a horizontal line, we can see that occupancy rates are rising over time…
Introduction to Forecasting… There are many different forecasting models available to us. One way to choose which method or model to use is to select the technique with the greatest forecast accuracy. Two measures of forecast accuracy are the Mean Absolute Deviation, $\mathrm{MAD} = \frac{1}{n} \sum_{t=1}^{n} |y_t - F_t|$, and the Sum of Squares for Forecast Error, $\mathrm{SSE} = \sum_{t=1}^{n} (y_t - F_t)^2$ ($y_t$ = actual value of time series at time $t$, $F_t$ = forecasted value, $n$ = number of time periods).
Which to use? SSE? MAD? MAD averages the absolute differences between the actual and forecast values. SSE is the sum of the squared differences. Which measure to use in judging forecast accuracy depends on the circumstances: if avoiding large errors is important, SSE should be used because it penalizes large deviations more heavily than MAD does. Otherwise, use MAD.
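A short Python sketch of both measures, with made-up numbers chosen so that the two measures disagree. The function names and the two forecast series (`steady`, `spiky`) are illustrative assumptions.

```python
def mean_absolute_deviation(actual, forecast):
    """MAD: the average of the absolute forecast errors."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("need two non-empty series of equal length")
    return sum(abs(y - f) for y, f in zip(actual, forecast)) / len(actual)


def sum_of_squared_errors(actual, forecast):
    """SSE: the sum of the squared forecast errors."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("need two non-empty series of equal length")
    return sum((y - f) ** 2 for y, f in zip(actual, forecast))


# Hypothetical actual values and two hypothetical forecast series.
actual = [10, 10, 10, 10]
steady = [12, 8, 12, 8]    # always off by 2
spiky = [10, 10, 10, 15]   # exact three times, off by 5 once

print(mean_absolute_deviation(actual, steady), sum_of_squared_errors(actual, steady))  # 2.0 16
print(mean_absolute_deviation(actual, spiky), sum_of_squared_errors(actual, spiky))    # 1.25 25
```

MAD favours the spiky forecasts (1.25 against 2.0), while SSE favours the steady ones (16 against 25), because squaring makes the single error of 5 count for more than four errors of 2.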
Model Selection… Here is a useful procedure for model selection: (1) Use some of the observations to develop several competing forecasting models. (2) Run the models on the rest of the observations. (3) Calculate the accuracy of each model. (4) Select the model with the best accuracy measure.
Example 20.4… We have developed three forecasting models; which model performed best? E.g. Actuals vs. Forecast #1 (2004 to 2007)…
Example 20.4… INTERPRET Model 2 is inferior to both models 1 and 3, so we drop it. Using MAD, model 3 is best, but using SSE, model 1 is most accurate. So? Which one to choose?!
Example 20.4… INTERPRET The choice between model 1 and model 3 should be made on the basis of whether we prefer a model that consistently produces moderately accurate forecasts (model 1) or one whose forecasts come quite close to most actual values but miss badly in a small number of time periods (model 3).
Forecasting Models… The choice of a forecasting technique depends on the components identified in the time series. Three techniques will be discussed… Exponential smoothing, Seasonal indexes, and Autoregressive models.
Forecasting with Exponential Smoothing… IF the time series (a) displays a gradual trend or no trend, and (b) shows no evidence of seasonal variation, THEN exponential smoothing can be effective as a forecasting method. The forecast for period t+k (k = 1, 2, 3, …) is given by $F_{t+k} = S_t$, where $S_t$ is the exponentially smoothed value computed using the techniques discussed earlier.
Forecasting with Exponential Smoothing… k=1: $F_{t+1} = S_t$; k=2: $F_{t+2} = S_t$; k=3: $F_{t+3} = S_t$. The forecast is the same value for every horizon, so we can produce a reasonably accurate prediction for the next time period (t+1), but the accuracy of the forecast decreases rapidly when we look more than one time period into the future (i.e. t+2, t+3, …).
Forecasting with Seasonal Indexes… IF the time series (a) is composed of seasonal variation, and (b) has a long-term trend, THEN we can use seasonal indexes and the regression equation to forecast. The forecast for time period t is: $F_t = [b_0 + b_1 t] \times SI_t$
Example 20.5… Forecast hotel occupancy rates for the next year in Example 20.3… We know… the regression line: and the Seasonal Indexes: Put them all together using: Ft = [b0 + b1t] x SIt
Example 20.5… Ft = [b0 + b1t] x SIt Continuing into the next year (through the first to fourth quarter, i.e. time periods 20+1, 20+2, 20+3, and 20+4): That is, we forecast the quarterly occupancy rates to be: .658, .812, .890, and .670
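The forecasting formula Ft = [b0 + b1t] x SIt can be sketched in Python as below. The function name `seasonal_forecast` and every number in the example (b0, b1, and the four indexes) are hypothetical placeholders chosen for illustration; they are not the regression coefficients or seasonal indexes of Example 20.5.

```python
def seasonal_forecast(b0, b1, indexes, t):
    """F_t = (b0 + b1*t) * SI_t, where period t = 1 falls in season 1."""
    season = (t - 1) % len(indexes)
    return (b0 + b1 * t) * indexes[season]


# Hypothetical trend line and quarterly indexes (NOT the Example 20.5 values).
b0, b1 = 0.60, 0.01
indexes = [0.9, 1.1, 1.2, 0.8]

# Forecast the four quarters that follow 20 observed periods.
forecasts = [seasonal_forecast(b0, b1, indexes, t) for t in range(21, 25)]
print([round(f, 3) for f in forecasts])
```

The trend line supplies the level for each future period, and the index for that period's season scales the level up or down.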
Autoregressive Model… IF the time series (a) has no obvious trend or seasonality, but (b) we believe that there is a correlation between consecutive residuals, THEN the autoregressive model may be most effective. The autoregressive forecasting model is given by $y_t = \beta_0 + \beta_1 y_{t-1} + \varepsilon_t$ and is estimated using the regression line $\hat{y}_t = b_0 + b_1 y_{t-1}$.
Example 20.6… The consumer price index (CPI) is a general measure of inflation and is widely used. Consider the annual percent increases in CPI collected over 29 years, and forecast next year’s change in the CPI…
Example 20.6… COMPUTE Remember, we’re trying to correlate the CPI in time period t with the previous time period, t–1, hence we modify the dataset from the list set-up in the first Excel snippet to the set-up in the second (which is how the dataset Xm20-06 is structured). Now we can run the Regression tool…
Example 20.6… COMPUTE Our regression analysis runs… Because the last CPI change is 3.2%, our forecast for 2007 is obtained by substituting 3.2 for the previous period's value in the estimated regression line. The autoregressive model forecasts a 2.44% increase in the CPI for the year 2007.
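A minimal Python sketch of the same workflow: pair each value with the one before it, fit a least-squares line, and plug the latest value in to forecast the next period. The function names and the sample series are illustrative assumptions; the series is not the CPI data in Xm20-06.

```python
def fit_autoregressive(series):
    """Least-squares fit of y_t = b0 + b1 * y_{t-1}."""
    previous = series[:-1]  # y_{t-1}: the lagged column
    current = series[1:]    # y_t
    n = len(previous)
    if n < 2:
        raise ValueError("need at least three observations")
    p_mean = sum(previous) / n
    c_mean = sum(current) / n
    sxy = sum((p - p_mean) * (c - c_mean) for p, c in zip(previous, current))
    sxx = sum((p - p_mean) ** 2 for p in previous)
    if sxx == 0:
        raise ValueError("lagged values are constant; slope is undefined")
    b1 = sxy / sxx
    b0 = c_mean - b1 * p_mean
    return b0, b1


def forecast_next(series):
    """One-step-ahead forecast: b0 + b1 * (last observed value)."""
    b0, b1 = fit_autoregressive(series)
    return b0 + b1 * series[-1]


# Hypothetical series that follows y_t = 1 + 0.5 * y_{t-1} exactly.
changes = [10.0, 6.0, 4.0, 3.0, 2.5]
print(fit_autoregressive(changes))
print(forecast_next(changes))
```

Because the sample series follows the rule exactly, the fit recovers b0 = 1 and b1 = 0.5 up to rounding, and the forecast is 1 + 0.5 × 2.5 = 2.25. Real data would leave residuals around the line.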