Model averaging as an alternative method of variable selection
Matt VanLandeghem and Grant Sorensen
Problems with variable selection
• Too many parameters:
    • High variance in predicted values
• Too few parameters:
    • Important predictors are left out, which biases predictions
• Variance/bias tradeoff
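The tradeoff named on this slide can be written as the standard decomposition of expected squared prediction error at a new point $x_0$. This is a sketch in notation that is not on the slides: it assumes $y_0 = f(x_0) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$, $\mathrm{Var}(\varepsilon) = \sigma^2$, and $\varepsilon$ independent of the fitted model $\hat{f}$.

```latex
\mathbb{E}\big[(y_0 - \hat{f}(x_0))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x_0)] - f(x_0)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathrm{Var}\big(\hat{f}(x_0)\big)}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```

Adding parameters tends to shrink the first term and inflate the second; dropping parameters does the reverse. The third term does not depend on the model.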
SAS Demo
• See SAS website
• PROC GLMSELECT
• Version 9.3 documentation (not 9.2)
• http://support.sas.com/documentation/cdl/en/statug/63962/HTML/default/viewer.htm#statug_glmselect_sect037.htm
Benefits
• Variable importance is represented as a selection frequency
    • Instead of a p-value from an F test
• Estimates are based on several “good” models
• Distributions of parameter estimates are available
• All of these help us pick the most useful model
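A hedged sketch of the two quantities behind these benefits, for resampling-based averaging over $B$ samples. The symbols $B$, $M_b$ (the model selected on sample $b$), $\hat{\pi}_j$, and $\bar{\beta}_j$ are introduced here for illustration and do not appear on the slides. The convention of counting a coefficient as zero in samples where its effect is not selected is an assumption about how the averages are formed; check it against the PROC GLMSELECT documentation before relying on it.

```latex
% Selection frequency of predictor x_j across B resamples
\hat{\pi}_j = \frac{1}{B} \sum_{b=1}^{B} \mathbf{1}\{\, x_j \in M_b \,\}

% Averaged coefficient, with \hat{\beta}_j^{(b)} = 0 whenever x_j \notin M_b
\bar{\beta}_j = \frac{1}{B} \sum_{b=1}^{B} \hat{\beta}_j^{(b)}
```

Under that convention, rarely selected predictors are shrunk toward zero, and the $B$ values $\hat{\beta}_j^{(b)}$ give the distribution of each parameter estimate.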
Applications
• Any field where variable selection techniques are used
    • Biology (Burnham and Anderson 2002)
    • Atmospheric sciences (Sloughter et al. 2007)
    • Econometrics (LeSage and Parent 2007)
    • Finance (Pesaran et al. 2009)
    • Psychology (Wasserman 2000)
    • …and others
Pitfalls
• SAS implementation
    • GLMSELECT
    • Only GLMs
    • Experimental
• Sensitive to correlated predictors
    • e.g. Homework #4
• Extension of regression
    • Typical assumptions still apply
    • Not a “magic” solution
[Image labels: GLM, Correlation, Assumptions]
Alternatives
• Other SAS options
    • AIC or BIC from SAS procedure of choice
    • Model weights based on AIC or BIC
    • Averaged “by hand”
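One common way to do the "by hand" averaging is with Akaike weights, as described by Burnham and Anderson (2002). The sketch below uses notation that is not on the slides ($R$ candidate models, $\Delta_i$, $w_i$, $\hat{\theta}_i$), and the AIC values in the worked example are hypothetical. BIC can be substituted for AIC in the same formulas.

```latex
% AIC difference and Akaike weight for model i among R candidates
\Delta_i = \mathrm{AIC}_i - \min_{r} \mathrm{AIC}_r,
\qquad
w_i = \frac{\exp(-\Delta_i / 2)}{\sum_{r=1}^{R} \exp(-\Delta_r / 2)}

% Model-averaged estimate of a quantity \theta
\hat{\bar{\theta}} = \sum_{i=1}^{R} w_i \, \hat{\theta}_i

% Worked example with hypothetical AIC values 100, 102, 110:
% \Delta = (0, 2, 10)
% \exp(-\Delta/2) \approx (1, 0.3679, 0.0067), \quad \text{sum} \approx 1.3746
% w \approx (0.7275, 0.2676, 0.0049)
```

In this example the third model contributes almost nothing to the average, while the second model, only 2 AIC units behind the best, still carries about a quarter of the weight.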
References and Further Reading
• Burnham, K.P. and D.R. Anderson. 2002. Model selection and multimodel inference: a practical information-theoretic approach. Springer, New York.
• LeSage, J.P. and O. Parent. 2007. Bayesian model averaging for spatial econometric models. Geographical Analysis 39:241-267.
• Pesaran, M.H., C. Schleicher, and P. Zaffaroni. 2009. Model averaging in risk management with an application to futures markets. Journal of Empirical Finance 16:280-305.
• Sloughter, J.M., A.E. Raftery, T. Gneiting, and C. Fraley. 2007. Probabilistic quantitative precipitation forecasting using Bayesian model averaging. Monthly Weather Review 135:3209-3220.
• Wasserman, L. 2000. Bayesian model selection and model averaging. Journal of Mathematical Psychology 44:92-107.
• Whitney, M. and L. Ngo. 2004. Bayesian model averaging using SAS software. SUGI 29 Proceedings, Paper 203-29.
• Pitfall picture: http://www.retrogameoftheday.com/2009/10/retro-game-of-day-pitfall.html
• SAS model averaging webpage: http://support.sas.com/documentation/cdl/en/statug/63962/HTML/default/viewer.htm#statug_glmselect_sect026.htm
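SAS Code
    ods graphics on;

    /* seed= fixes the random number stream used for resampling */
    proc glmselect data=colstd seed=3 plots=all;
       /* stepwise selection; choose=cv picks the step by cross validation */
       model y = x1-x9 / selection=stepwise(choose=cv);
       /* average over resamples; report selection percentages and estimates */
       /* refit: second round using effects selected in at least 50% of samples */
       modelAverage tables=(EffectSelectPct(all) ParmEst(all))
                    refit(minpct=50 nsamples=100);
    run;

    ods graphics off;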