Statistics 350 Lecture 27
Today • Last Day: Start Chapter 9 (9.1-9.3)…please read 9.1 and 9.2 thoroughly • Today: More Chapter 9…stepwise regression
Comment on levels from last day • Stepwise Selection:
Model Validation • Can be viewed as the final step of the model building process • Up to now, you have built a model using • Decided which variables are in the final model using
Model Validation • It is important to note that the selected model reflects the properties of the data collected • If new data were collected at the same X’s, would we get the same model? • Want to be sure that the model captures the main features of the population of interest
Model Validation • Three basic approaches to validate the model
Model Validation • Collection of new data: • If a new set of data is available, you can compute the mean squared prediction error, MSPR (for each model if there is more than one): use the model to predict each observation in the new set, then compute the mean squared deviation between the observed and predicted values
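In symbols, the computation just described can be written as below. The notation is an assumption made here for illustration, not taken from the slides: n* is the number of cases in the new data set, Y_i is the i-th observed response in the new set, and Ŷ_i is the value predicted for that case by the model fitted to the original data.

```latex
\[
  \mathit{MSPR} \;=\; \frac{1}{n^{*}} \sum_{i=1}^{n^{*}} \bigl( Y_i - \hat{Y}_i \bigr)^{2}
\]
```

Note that the divisor is n*, the size of the new set, since no parameters are estimated from the new data.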
Model Validation • If the MSPR is much bigger than the MSE from the original model-building data set, then that means that the model was overfitting the data (chasing the errors) • If several models are being compared, the one with the smallest MSPR appears to be the best for the new data set • If the MSPRs are similar among all the candidate models, then the choice of a model can be made based on other (nonstatistical) criteria, such as simplicity or interpretability
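The comparison above could be sketched as follows. This is a minimal illustration, not the course's own code: the data are simulated, the helper names (`fit_ols`, `mse`, `mspr`) and the three candidate models are hypothetical, and ordinary least squares with an intercept is assumed.

```python
import numpy as np

def add_intercept(X):
    """Prepend a column of ones to the predictor matrix."""
    return np.column_stack([np.ones(X.shape[0]), X])

def fit_ols(X, y):
    """Least-squares coefficients for y regressed on X plus an intercept."""
    beta, *_ = np.linalg.lstsq(add_intercept(X), y, rcond=None)
    return beta

def mse(X, y, beta):
    """SSE / (n - p) on the model-building data; p counts the intercept."""
    resid = y - add_intercept(X) @ beta
    return float(resid @ resid / (X.shape[0] - beta.size))

def mspr(X_new, y_new, beta):
    """Mean squared deviation between new observations and predictions."""
    err = y_new - add_intercept(X_new) @ beta
    return float(np.mean(err ** 2))

# Simulated data (an assumption for illustration): only the first two
# of four predictors actually affect the response.
rng = np.random.default_rng(0)

def simulate(n):
    X = rng.normal(size=(n, 4))
    y = 2.0 + 1.5 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=n)
    return X, y

X_build, y_build = simulate(40)   # model-building set
X_new, y_new = simulate(40)       # newly collected validation set

# Hypothetical candidate models, given as column subsets.
candidates = {"x1": [0], "x1+x2": [0, 1], "all four": [0, 1, 2, 3]}

results = {}
for name, cols in candidates.items():
    beta = fit_ols(X_build[:, cols], y_build)
    results[name] = (mse(X_build[:, cols], y_build, beta),
                     mspr(X_new[:, cols], y_new, beta))

for name, (m, mp) in results.items():
    print(f"{name:10s} MSE={m:.3f}  MSPR={mp:.3f}  MSPR/MSE={mp / m:.2f}")

# The candidate with the smallest MSPR on the new data.
best = min(results, key=lambda k: results[k][1])
print("smallest MSPR:", best)
```

A ratio MSPR/MSE well above 1 for a candidate is the warning sign for overfitting described above; when the MSPRs are close, the sketch gives no statistical reason to prefer one candidate over another.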
Model Validation • How much bigger is much bigger than the MSE? • The modeling and validation sets ought to have the same population variance, because they are both (supposedly) drawn from the same population • Therefore, it is reasonable to treat the ratio MSPR/MSE as approximately
Model Validation • What to do if overfitting is indicated?
Model Validation • Comparison to theoretical expectations or earlier results
Model Validation • Data Splitting • What if no new dataset is available?
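One way to picture data splitting is the sketch below: the available cases are divided at random into a model-building set and a validation set, and the validation set then plays the role of the "new" data. The function name `split_data`, the 50/50 default, and the use of a random permutation are assumptions for illustration; the slides do not specify how the split should be made.

```python
import numpy as np

def split_data(X, y, build_fraction=0.5, seed=0):
    """Randomly split cases into model-building and validation sets.

    build_fraction is the (assumed) share of cases used to build the model;
    the remaining cases are held out for validation.
    """
    X = np.asarray(X)
    y = np.asarray(y)
    n = len(y)
    order = np.random.default_rng(seed).permutation(n)  # shuffled case indices
    n_build = int(round(build_fraction * n))
    build, valid = order[:n_build], order[n_build:]
    return (X[build], y[build]), (X[valid], y[valid])

# Usage with made-up data: 10 cases, 2 predictors.
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.arange(10, dtype=float)
(X_build, y_build), (X_valid, y_valid) = split_data(X, y)
print(len(y_build), len(y_valid))
```

The model would be selected and fitted on the model-building set only, and the MSPR computed on the held-out validation set.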