Some Model Selection Criteria for Regression
SSE, R², Adjusted R², and MSE • SSE never increases when a variable is added to a model • R² never decreases when a variable is added to a model • Adjusted R² and MSE take the number of predictors into account (K = number of predictors) • This makes them useful for comparing models with different numbers of variables
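The four quantities on this slide can be sketched in a few lines. The formulas did not survive in the slide text, so this sketch assumes the usual textbook definitions with an intercept in the model (p = K + 1 estimated coefficients); the function name `fit_summary` and the sample numbers are illustrative only.

```python
def fit_summary(y, y_hat, k):
    """Return SSE, R^2, adjusted R^2 and MSE for a fitted regression.

    y      : observed responses
    y_hat  : fitted values from a model with k predictors plus an intercept
    Assumes the usual definitions; the slide's own formulas were not recoverable.
    """
    n = len(y)
    p = k + 1  # number of estimated coefficients, including the intercept
    y_bar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # error sum of squares
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
    r2 = 1 - sse / sst
    mse = sse / (n - p)                                    # penalizes extra predictors
    adj_r2 = 1 - mse / (sst / (n - 1))                     # penalizes extra predictors
    return {"sse": sse, "r2": r2, "adj_r2": adj_r2, "mse": mse}


# Made-up data: five observations, one predictor.
summary = fit_summary([1, 2, 3, 4, 5], [1.2, 1.9, 3.1, 3.8, 5.0], k=1)
print(summary)
```

Because SSE can only go down (and R² only up) as predictors are added, only `adj_r2` and `mse` are meaningful for comparing models of different sizes.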
Cp Criterion • Mallows (1973) • Let p = K+1 • Choose a model whose Cp is low and close to p
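The Cp formula itself is missing from the slide text. A minimal sketch, assuming the commonly used form Cp = SSE_p / MSE_full - (n - 2p), where MSE_full comes from the model containing all candidate predictors; the function name and the numbers are hypothetical.

```python
def mallows_cp(sse_p, mse_full, n, p):
    """Mallows' Cp for a candidate model with p coefficients (assumed textbook form).

    sse_p    : error sum of squares of the candidate model
    mse_full : MSE of the full model with all candidate predictors
    n        : number of observations
    """
    return sse_p / mse_full - (n - 2 * p)


# Hypothetical numbers: n = 20, full model has 4 predictors (5 coefficients).
n = 20
sse_full = 30.0
mse_full = sse_full / (n - 5)

cp_small = mallows_cp(36.0, mse_full, n, p=3)     # two-predictor candidate
cp_full = mallows_cp(sse_full, mse_full, n, p=5)  # full model
print(cp_small, cp_full)
```

Under this form the full model always has Cp equal to its own p, so the criterion is only informative for the smaller candidates.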
Akaike Criterion • An information criterion for the normal regression model (Akaike 1973) • p = K+1 • Choose the model that minimizes AICp • Derived from information theory
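The AICp formula is not in the slide text. One common form for the normal regression model, with constants that do not depend on the model dropped, is AICp = n ln(SSE_p / n) + 2p; the sketch below assumes that form, and the function name `aic` is illustrative.

```python
import math


def aic(sse_p, n, p):
    """AIC for a normal regression model with p coefficients (assumed form).

    Uses n * ln(SSE_p / n) + 2p; additive constants common to all
    candidate models are omitted because they do not affect the ranking.
    """
    return n * math.log(sse_p / n) + 2 * p


# Hypothetical comparison: adding a third predictor lowers SSE only slightly.
print(aic(36.0, 20, 3))  # two predictors
print(aic(35.0, 20, 4))  # three predictors
```

Each extra coefficient costs 2 units, so a new variable is kept only if it reduces n ln(SSE_p / n) by more than 2.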
Schwarz Criterion • Bayesian information criterion for the normal regression model (Schwarz 1978) • Choose the model that minimizes BICp • Derived as an approximation to the Bayes factor • Also denoted SICp
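As with the other criteria, the BICp formula is missing from the slide text. The sketch below assumes the common form BICp = n ln(SSE_p / n) + p ln(n) and picks the minimizing model from a small table of made-up candidates; all names and numbers are hypothetical.

```python
import math


def bic(sse_p, n, p):
    """BIC (SIC) for a normal regression model with p coefficients (assumed form)."""
    return n * math.log(sse_p / n) + p * math.log(n)


# Hypothetical candidates: label -> (SSE, number of coefficients p = K + 1).
n = 20
candidates = {
    "x1": (50.0, 2),
    "x1,x2": (36.0, 3),
    "x1,x2,x3": (35.0, 4),
}

scores = {name: bic(sse, n, p) for name, (sse, p) in candidates.items()}
best = min(scores, key=scores.get)  # model with the smallest BIC
print(best, scores)
```

The per-coefficient penalty here is ln(n) rather than 2, so for n of 8 or more this form penalizes model size more heavily than the corresponding AIC form.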
Some Model Selection Procedures • All-subsets regression • Best-subset regression • Forward selection • Backward elimination • Stepwise regression • Out-of-sample prediction • Forecasting • Cross-validation
Prediction error • Validation of an estimated model by out-of-sample prediction (forecast) • Leave one out • Split sampling • New data • Prediction (forecast) error: error = actual - forecast
Leave one out • Leave one observation, i, out • Estimate the model using the remaining n-1 observations • Predict y_i from that model • Compute the prediction error (actual y_i minus its prediction) • Repeat for all i = 1, 2, ..., n
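The leave-one-out steps above can be sketched as a loop. To stay self-contained, this sketch assumes a simple regression with a single predictor fitted by ordinary least squares; the helper names `fit_line` and `loo_errors` and the data are made up for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x (one predictor)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def loo_errors(x, y):
    """Leave-one-out prediction errors; x and y must be Python lists."""
    errors = []
    for i in range(len(x)):
        # Drop observation i and refit on the remaining n - 1 points.
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        b0, b1 = fit_line(x_train, y_train)
        forecast = b0 + b1 * x[i]
        errors.append(y[i] - forecast)  # error = actual - forecast
    return errors


# Made-up data, roughly linear with some noise.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
print(loo_errors(x, y))
```

Each error comes from a model that never saw the observation being predicted, which is what makes these errors out-of-sample.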
Measures of overall prediction error • Prediction error sum of squares (PRESS) • Mean square prediction error • Root mean square prediction error
Measures of overall prediction error • Mean absolute prediction error • Relative (percent) mean absolute prediction error
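The formulas for these measures are not in the slide text. A minimal sketch, assuming the usual definitions: errors are actual minus forecast, averages divide by the number of predictions, and the relative measure averages |error| / |actual| and reports it as a percentage. The function name and data are hypothetical.

```python
import math


def prediction_error_measures(actual, forecast):
    """Summaries of out-of-sample prediction errors (assumed definitions)."""
    errors = [a - f for a, f in zip(actual, forecast)]  # error = actual - forecast
    n = len(errors)
    press = sum(e ** 2 for e in errors)        # prediction error sum of squares
    mspe = press / n                           # mean square prediction error
    rmspe = math.sqrt(mspe)                    # root mean square prediction error
    mape = sum(abs(e) for e in errors) / n     # mean absolute prediction error
    # Relative (percent) version; requires every actual value to be nonzero.
    rel_mape = 100 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    return {"press": press, "mspe": mspe, "rmspe": rmspe,
            "mape": mape, "rel_mape_pct": rel_mape}


# Made-up actual values and forecasts.
measures = prediction_error_measures([10, 20, 30, 40], [12, 18, 33, 40])
print(measures)
```

The squared measures weight large misses more heavily, while the absolute and relative measures are easier to read in the units (or percent) of the response.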
Split sampling • Divide the sample into two subsamples • Model estimation subsample (size n1) • Model validation subsample (size n2) • Cross-sectional data • select cases randomly • Time series data • Use period t = 1, …, T1 data for estimation • Use period t = T1+1, …, T for validation
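The two ways of splitting described above can be sketched as index helpers. This is an illustration only: the function names, the fixed seed, and the sizes are assumptions, and the model fitting itself is left out.

```python
import random


def random_split(n, n1, seed=0):
    """Cross-sectional data: pick n1 cases at random for estimation.

    Returns (estimation_indices, validation_indices), both sorted.
    The seed is fixed only so the illustration is reproducible.
    """
    rng = random.Random(seed)
    estimation = sorted(rng.sample(range(n), n1))
    chosen = set(estimation)
    validation = [i for i in range(n) if i not in chosen]
    return estimation, validation


def time_series_split(T, T1):
    """Time series data: periods 1..T1 for estimation, T1+1..T for validation."""
    estimation = list(range(1, T1 + 1))
    validation = list(range(T1 + 1, T + 1))
    return estimation, validation


est_idx, val_idx = random_split(n=10, n1=7)
est_t, val_t = time_series_split(T=12, T1=9)
print(est_idx, val_idx)
print(est_t, val_t)
```

The time series split keeps the periods in order so that the validation forecasts never use information from the future; the random split is appropriate only when the ordering of cases carries no information.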
Measures of overall prediction error • Mean square prediction error • Root mean square prediction error
Measures of overall prediction error • Mean absolute prediction error • Relative (percent) mean absolute prediction error