Improving Forecasting Accuracy: Bootstrapping and Judgemental Adjustments

Bootstrapping judgemental adjustments to improve forecasting accuracy: judgemental bootstraps vs error bootstraps
Robert Fildes, Centre for Forecasting, Lancaster University, UK
Paul Goodwin, Bath University, UK
Research Grants GR/60181/01 and GR/60198/01

What is a bootstrap?
Based on the cues {X} we derive predictions of Y.
[Diagram: cues X1, X2, X3, ..., Xp; actual observations Y; expert's prediction F; Error = Y - F]
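A judgemental bootstrap in this sense can be sketched as a regression of the expert's predictions F on the cues {X}; the fitted model then stands in for the expert. The snippet below is only an illustrative sketch: the linear form, the intercept, the function names and the synthetic data are all assumptions and do not come from the slides.

```python
import numpy as np

def fit_judgemental_bootstrap(cues, expert_forecasts):
    """Least-squares fit of the expert's forecasts F on the cues X (intercept assumed)."""
    X = np.column_stack([np.ones(len(cues)), cues])
    coefs, *_ = np.linalg.lstsq(X, expert_forecasts, rcond=None)
    return coefs

def bootstrap_predict(coefs, cues):
    """Apply the fitted model of the expert to a set of cues."""
    X = np.column_stack([np.ones(len(cues)), cues])
    return X @ coefs

# Synthetic illustration (hypothetical data): an expert who weights
# three cues by 1.0, 0.5 and 0.0, with some inconsistency (noise).
rng = np.random.default_rng(0)
cues = rng.normal(size=(200, 3))
expert = 2.0 + cues @ np.array([1.0, 0.5, 0.0]) + rng.normal(scale=0.3, size=200)

coefs = fit_judgemental_bootstrap(cues, expert)
predictions = bootstrap_predict(coefs, cues)
```

The fitted coefficients describe how the expert appears to weight each cue. Any bias in F is inherited by such a model, since it is fitted to the forecasts and not to the actuals Y.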

The Bootstrapping Literature
Armstrong (2001): the bootstrap “provides more accurate forecasts than expert judgement”
• But the evidence is primarily cross-sectional
• Time series evidence mixed:
  - Armstrong summarizes early studies
  - Fildes & Fitzgerald, Economica, 1983 (on Balance of Payments forecasts)
  - Fildes, JoF, 1991 (on construction industry forecasts)
  - Lawrence & O’Connor, Omega, 1996 (experimental evidence)
Why the discrepancies?

Time Series Bootstrapping
• e.g.:
  - examination record
  - credit score card
  - Fildes & Fitzgerald: data history
  - Fildes: GDP and new orders
• In cross-sectional studies
  - Cues are constrained, i.e. experts have cue information + priors
  - Priors may well contain no information (e.g. Linda of Tversky and Kahneman)
• In time-series studies
  - Models are constrained to include only data-based cues
  - Other cues available from the environment: ‘news’, external info, internal organisational info
  - Knowledge of ‘unique’ future events

Bias and Inefficiencies
• Bias
  - If the expert forecast is biased, i.e. the mean error is non-zero, the bootstrap cannot be optimum
  - Though it can be better than an alternative
  - Evidence suggests time series expert forecasts are often biased
  - Optimism bias of analysts
  - Bias in sales forecasts (Mathews & Diamantopoulos, Lawrence et al)
• Inefficiencies
  - Where a cue variable (or missing variable) is mis-weighted in the judgement
  - A bootstrap model can never be optimum ex post
• Conclusion:
  - A time series bootstrap is unlikely to be optimum
  - Potential to improve on a bootstrap

The EPSRC Research Project: Company Evidence
Data (4 U.K. based companies)
Data collected on: Actuals, Statistical System forecast, and Final adjusted forecast
• 753 SKUs, Monthly
  - Company A: Major UK Manufacturer of Laundry, household cleaning and personal care products - 244 SKUs x 22 months -> 3012 triplets
  - Company B: Major International Pharmaceutical Manufacturer - 213 SKUs x 36 months -> 5428 triplets
  - Company C: Major International canned Food Manufacturer - 296 SKUs x 20 months -> 2856 triplets
• 783 SKUs, Weekly
  - Company D: Major UK Retailer (over 26000 SKUs) - 104 weeks -> 57688 triplets

Checking for bias in the forecasts
Statistical Issues
For unbiasedness
• Errors heteroscedastic with outliers
• Can firms be pooled?
• Solutions
  - Errors normalised by standard deviation of actuals and analysed by size of adjustment
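The normalisation step above can be illustrated roughly as follows. This is a minimal sketch under stated assumptions: the per-SKU scaling, the split by the sign of the adjustment, the t-statistic on the mean error and all the synthetic numbers are illustrative choices, not the study's actual procedure.

```python
import numpy as np

def normalised_errors(actuals, forecasts):
    """Errors (Y - F) for one SKU, scaled by the standard deviation of its actuals."""
    actuals = np.asarray(actuals, dtype=float)
    forecasts = np.asarray(forecasts, dtype=float)
    return (actuals - forecasts) / actuals.std(ddof=1)

def mean_error_t_stat(errors):
    """t-statistic for the hypothesis that the mean error is zero (unbiasedness)."""
    errors = np.asarray(errors, dtype=float)
    return errors.mean() / (errors.std(ddof=1) / np.sqrt(errors.size))

# Hypothetical data for a single SKU: the final forecast is optimistic by about 5 units.
rng = np.random.default_rng(1)
actuals = 100 + rng.normal(scale=10, size=60)
system = actuals + rng.normal(scale=8, size=60)
final = actuals + 5 + rng.normal(scale=5, size=60)
adjust = final - system

errors = normalised_errors(actuals, final)
t_all = mean_error_t_stat(errors)

# Analyse by direction of adjustment (only when a group has enough observations).
t_by_sign = {}
for label, mask in (("positive", adjust > 0), ("negative", adjust < 0)):
    if mask.sum() >= 2:
        t_by_sign[label] = mean_error_t_stat(errors[mask])
```

With errors defined as Y - F, a strongly negative t-statistic indicates forecasts that are too high on average. Outlier removal and pooling across SKUs or firms would need to be handled before relying on such a test.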

Final Forecasts are biased
But are the forecasts inefficient? The cues: past actuals, past errors and the adjustments
The error models (to overcome bias and inefficiencies):
Efficiency = all available information is being used effectively, i.e. the models have no explanatory power for the jth SKU in the ith company
• To estimate: normalise, pool across SKU, remove outliers, test for seasonality
• The result?
  - The forecasts are inefficient & different companies embody different inefficiencies; R2 low
  - Positively adjusted forecasts are more inefficient
  - Persistent optimism bias
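The efficiency idea can be sketched as a regression of the forecast error on the cues named above: if the forecasts were efficient, such a regression would have no explanatory power. The exact error models are not recoverable from this transcript, so the linear specification, the variable names and the synthetic data below are assumptions made only for illustration.

```python
import numpy as np

def efficiency_regression(error, past_actual, past_error, adjust):
    """Regress errors on the cues; returns (coefficients, R squared).

    Coefficient order: intercept, past actual, past error, adjustment.
    """
    error = np.asarray(error, dtype=float)
    X = np.column_stack([np.ones(len(error)), past_actual, past_error, adjust])
    coefs, *_ = np.linalg.lstsq(X, error, rcond=None)
    fitted = X @ coefs
    ss_res = np.sum((error - fitted) ** 2)
    ss_tot = np.sum((error - error.mean()) ** 2)
    return coefs, 1.0 - ss_res / ss_tot

# Hypothetical normalised data in which adjustments are over-weighted,
# so that the error (Y - F) is negatively related to the adjustment.
rng = np.random.default_rng(2)
n = 400
past_actual = rng.normal(size=n)
past_error = rng.normal(size=n)
adjust = rng.normal(size=n)
error = -0.4 * adjust + rng.normal(scale=0.5, size=n)

coefs, r_squared = efficiency_regression(error, past_actual, past_error, adjust)
```

A coefficient on the adjustment that differs clearly from zero signals that the adjustment information is mis-weighted in the final forecast.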

Can we model the error to ensure an efficient forecast? -> improved forecasts
The models: this last is the 50/50 model of Blattberg & Hoch
We can then use these models to predict the actual and compare with the final forecast = 1*(SysFor) + 1*Adjust
NB. Standard bootstrap without incorporating the information in ‘Adjust’ cannot perform well, given the efficiency evidence.
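One way to picture the comparison is to treat each candidate as a pair of weights on SysFor and Adjust: the final forecast uses (1, 1), the 50/50 model is taken here as (1, 0.5), and an estimated model uses weights fitted to the actuals. The sketch below assumes a no-intercept least-squares fit and synthetic data; the slides' own model equations are not in this transcript, so this should not be read as the authors' specification.

```python
import numpy as np

def fit_weights(sys_for, adjust, actual):
    """Least-squares weights (no intercept, an assumption) in actual ~ w1*SysFor + w2*Adjust."""
    X = np.column_stack([sys_for, adjust])
    w, *_ = np.linalg.lstsq(X, actual, rcond=None)
    return w

def combine(sys_for, adjust, w_sys, w_adj):
    """Weighted combination of the system forecast and the judgemental adjustment."""
    return w_sys * sys_for + w_adj * adjust

def mse(actual, forecast):
    return float(np.mean((actual - forecast) ** 2))

# Hypothetical data in which only 60% of each adjustment is warranted.
rng = np.random.default_rng(3)
sys_for = rng.uniform(50, 150, size=300)
adjust = rng.normal(scale=20, size=300)
actual = sys_for + 0.6 * adjust + rng.normal(scale=5, size=300)

w = fit_weights(sys_for, adjust, actual)
final_forecast = combine(sys_for, adjust, 1.0, 1.0)   # final forecast: weights (1, 1)
fifty_fifty = combine(sys_for, adjust, 1.0, 0.5)      # 50/50 model: weights (1, 0.5)
fitted = combine(sys_for, adjust, w[0], w[1])         # estimated weights

results = {
    "final": mse(actual, final_forecast),
    "50/50": mse(actual, fifty_fifty),
    "fitted": mse(actual, fitted),
}
```

In-sample, the fitted weights cannot do worse than either fixed pair on squared error, so a fair comparison needs a hold-out sample.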

Weighting the Information Sources
[Table of estimated weights; layout not recoverable. Surviving values: .985, .987, .989, .450]
• Note how close to (1, 1): the final forecast
• Note how close to (1, 0.5): the 50-50 man-model
• Major mis-weightings

Comparative Results
• Overall gains
• Major gains with some companies (particularly retailer)
• To test: split sample, with results reported on the test sample
• Accuracy measures: Trimmed MAPE & MdAPE + ranking of these measures for each company
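The two accuracy measures can be computed along the following lines. This is a sketch only: the trimming rule (dropping the largest fraction of absolute percentage errors) and the `trim` parameter are assumptions, since the transcript does not say how the MAPE was trimmed.

```python
import numpy as np

def absolute_percentage_errors(actual, forecast):
    """Absolute percentage errors, in percent; assumes no zero actuals."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.abs(actual - forecast) / np.abs(actual)

def mdape(actual, forecast):
    """Median absolute percentage error."""
    return float(np.median(absolute_percentage_errors(actual, forecast)))

def trimmed_mape(actual, forecast, trim=0.02):
    """Mean APE after dropping the largest `trim` fraction of APEs (assumed rule)."""
    ape = np.sort(absolute_percentage_errors(actual, forecast))
    keep = len(ape) - int(np.floor(trim * len(ape)))
    return float(ape[:keep].mean())

# Small hypothetical example with one extreme forecast.
actual = [100, 200, 50, 80, 100]
forecast = [110, 180, 55, 80, 300]

example_mdape = mdape(actual, forecast)
example_trimmed = trimmed_mape(actual, forecast, trim=0.2)
```

Both measures limit the influence of a few extreme percentage errors, which matters when some SKUs have very small actuals.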

The Results
• Consistency over estimation and validation samples
• Optimism bias in ‘final forecast’ ensures standard bootstrap inadequate for positive info
• Effective use of negative information implies Blattberg-Hoch fails
• Optimal bootstrap consistently effective
• Final forecast ‘good’ for manufacturers and negative info
• Different companies have different propensities for gain
Does multicollinearity affect interpretation of weights?
Overall, the adjustment models perform well
• Substantial improvements are possible
• Accuracy gains much larger (as high as 20%) than shown in statistical selection comparisons (M3

Conclusions
• Standard bootstrap models not a panacea
  - Need to eliminate likely biases and inconsistencies
  - Cue information not readily available (or even non-existent) to model
• Mis-weighting of information common
  - Different companies and different processes lead to differential mis-weightings
  - For the retailer, mis-weighting so extreme as to raise questions as to motivation
  - Asymmetric loss: a confusion between forecast and inventory decision
• Major accuracy improvements possible
• But implementation issues complex
  - How do you change the forecasting process to improve the cue weights?
See Feature talk tomorrow!
