
FPP Chapter 2 (Cont.)




  1. FPP Chapter 2 (Cont.) Measuring Forecast Accuracy Analyzing Residuals Prediction Intervals

  2. Measuring Forecast Accuracy • How do we measure the suitability of a particular forecasting method for a given data set? • In most forecasting situations, accuracy is treated as the overriding criterion for selecting a forecasting method. • In many instances, the word “accuracy” refers to “goodness of fit,” which in turn refers to how well the forecasting model is able to reproduce the data that are already known. • To the consumer of forecasts, however, it is the accuracy of future forecasts that matters most.

  3. Measuring Forecast Accuracy Consider the Australian quarterly beer production figures. We’re going to produce forecasts based upon data through the end of 2005.
beer2 <- window(ausbeer, start=1992, end=2006-.1)
beerfit1 <- meanf(beer2, h=11)
beerfit2 <- rwf(beer2, h=11)
beerfit3 <- snaive(beer2, h=11)
plot(beerfit1, plot.conf=FALSE, main="Forecasts for quarterly beer production")
lines(beerfit2$mean, col=2)
lines(beerfit3$mean, col=3)
lines(ausbeer)
legend("topright", lty=1, col=c(4,2,3), legend=c("Mean method","Naive method","Seasonal naive method"))

  4. Measuring Forecast Accuracy • Let yi denote the ith observation and let ŷi denote a forecast of yi. • Scale-dependent errors • The forecast error is simply ei = yi − ŷi, which is on the same scale as the data. • Accuracy measures based on ei are therefore scale-dependent and cannot be used to make comparisons between series that are on different scales.

  5. Measuring Forecast Accuracy • The two most commonly used scale-dependent measures are based on the absolute errors or squared errors • Mean Error (ME) = mean of errors • Mean Absolute Error (MAE) = mean of absolute value of errors • Root Mean Square Error (RMSE) = square root of mean of squared errors
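As a quick illustration of these definitions, the three measures can be computed directly from a vector of forecast errors in R; the error values below are made up purely for the example.

e <- c(2, -1, 3, -2, 1)       # hypothetical forecast errors ei = yi - ŷi
ME <- mean(e)                 # Mean Error
MAE <- mean(abs(e))           # Mean Absolute Error
RMSE <- sqrt(mean(e^2))       # Root Mean Square Error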

  6. Measuring Forecast Accuracy • Percentage errors • The percentage error is given by pi=100ei/yi. Percentage errors have the advantage of being scale-independent, and so are frequently used to compare forecast performance between different data sets. • The most commonly used measure is mean absolute percentage error (MAPE), which is the mean of the absolute value of the percentage errors.
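A minimal sketch of the same idea for percentage errors; the vectors below are hypothetical observations and forecasts, not data from the slides.

y <- c(400, 420, 390, 410)      # hypothetical observations yi
yhat <- c(410, 405, 400, 400)   # hypothetical forecasts
p <- 100 * (y - yhat) / y       # percentage errors pi = 100*ei/yi
MAPE <- mean(abs(p))            # Mean Absolute Percentage Error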

  7. Measuring Forecast Accuracy • Scaled errors • As an alternative to percentage errors, errors can be scaled by the training MAE of a simple forecasting method (usually the naïve forecast for a time series), as sketched below. • A scaled error is less than one if it arises from a better forecast than the average naïve forecast computed on the training data.
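Here is a minimal sketch of a scaled error in the non-seasonal case, assuming a hypothetical training series and hypothetical test-set errors; the naïve forecast of each training observation is the previous observation, so its training errors are the first differences.

train <- c(100, 102, 101, 105, 107, 104)   # hypothetical training series
e <- c(3, -2, 4)                           # hypothetical test-set forecast errors
scale <- mean(abs(diff(train)))            # training MAE of the naive forecast
q <- e / scale                             # scaled errors
MASE <- mean(abs(q))                       # Mean Absolute Scaled Error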

  8. Measuring Forecast Accuracy
beer3 <- window(ausbeer, start=2006)
accuracy(beerfit1, beer3)
accuracy(beerfit2, beer3)
accuracy(beerfit3, beer3)
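The accuracy() function from the forecast package returns a matrix of these measures for the training set and, when a test set is supplied, for the test set as well. As a sketch (the exact row and column labels may vary by package version), a single measure can be pulled out like this:

acc1 <- accuracy(beerfit1, beer3)   # matrix of accuracy measures
acc1["Test set", "RMSE"]            # test-set RMSE; check rownames(acc1)/colnames(acc1) if the labels differ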

  9. Out-of-sample Accuracy Measurement • The summary statistics described thus far measure the goodness of fit of the model to historical data. Such fitting does not necessarily imply good forecasting. • As in the beer example above, it is prudent to divide the total data into an initialization/training set and a test/holdout set. • The initialization set is used to estimate any parameters and to initialize the method. • Forecasts are made for the test set, and accuracy measures are computed for the errors in the test set only.
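As a sanity check on the last point, the test-set RMSE can be recomputed by hand from the test-set errors alone; this sketch reuses beer3 and beerfit1 from the earlier slides.

n <- length(beer3)                                             # number of test observations
e_test <- as.numeric(beer3) - as.numeric(beerfit1$mean)[1:n]   # test-set errors only
sqrt(mean(e_test^2))                                           # should match the test-set RMSE reported by accuracy()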

  10. Residual Diagnostics • A residual is the difference between an observed value and its forecast based on other observations. • For time series, a residual is based on one-step-ahead forecasts. That is, the forecast of yt is based on yt-1,…,y1. • For cross-sectional forecasts, the residual is calculated based on forecasts using all observations other than the one being examined.
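For a concrete (if minimal) example of one-step residuals, the naive method forecasts each observation with the previous one, so its residuals are simply the differences between consecutive observations; the sketch below reuses beer2 from the earlier slides.

res <- residuals(naive(beer2))   # one-step residuals from the naive method
head(res)                        # the first residual is NA: there is no earlier observation to forecast from
head(diff(beer2))                # the naive residuals equal these first differences (offset by the initial NA)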

  11. Residual Diagnostics A good forecasting model will yield residuals with the following properties: • The residuals are uncorrelated. If there are correlations between residuals, then there is information left in the residuals which should be used in computing forecasts. • The residuals have zero mean. If the residuals have a mean other than zero, then the forecasts are biased. If either of these two properties is not satisfied, then the forecasting method can be modified to give better forecasts. Adjusting for bias is easy: if the residuals have mean m, then simply add m to all forecasts and the bias problem is solved. Fixing the correlation problem is harder.
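The bias adjustment mentioned above takes only a couple of lines; fc here stands for any forecast object and is purely illustrative.

m <- mean(residuals(fc), na.rm=TRUE)   # estimated bias m of the residuals
fc$mean + m                            # add m to all point forecasts to remove the bias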

  12. Residual Diagnostics In addition to these essential properties, it is useful (but not necessary) for the residuals to also have the following two properties. • The residuals have constant variance. • The residuals are normally distributed. These two properties make the calculation of prediction intervals easier.
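To see why these two properties help, here is a rough sketch of a 95% one-step prediction interval built from the point forecast and the residual standard deviation, assuming the residuals are roughly normal with constant variance; as before, fc stands for a hypothetical forecast object.

s <- sd(residuals(fc), na.rm=TRUE)   # residual standard deviation
fc$mean[1] + c(-1.96, 1.96) * s      # approximate 95% interval for the one-step forecast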

  13. Residual Diagnostics Example – Forecasting the DJIA: When forecasting equity indices, the best forecast is often the naïve one. In this case, the residual is simply the difference between consecutive observations.
dj2 <- window(dj, end=250)
plot(dj2, main="Dow Jones Index (daily ending 15 Jul 94)", ylab="", xlab="Day")
res <- residuals(naive(dj2))
plot(res, main="Residuals from naive method", ylab="", xlab="Day")
Acf(res, main="ACF of residuals")
hist(res, nclass="FD", main="Histogram of residuals")
qqnorm(res)
qqline(res)

  14. Residual Diagnostics • Portmanteau tests for autocorrelation: these test whether a whole group of residual autocorrelations is jointly distinguishable from white noise. This differs from the ACF/correlogram, where each autocorrelation coefficient is examined separately. • The Box-Pierce test and the Ljung-Box test are two examples; a small p-value indicates that the residuals are not consistent with white noise. The fitdf argument is the number of parameters estimated by the forecasting model (zero here, since the naive method estimates none), and type="Lj" selects the Ljung-Box version.
Box.test(res, lag=10, fitdf=0)
Box.test(res, lag=10, fitdf=0, type="Lj")
