Statistics 350 Lecture 27



  1. Statistics 350 Lecture 27

  2. Today
  • Last Day: Started Chapter 9 (9.1-9.3)…please read 9.1 and 9.2 thoroughly
  • Today: More Chapter 9…stepwise regression

  3. Comment on α levels from last day
  • Stepwise Selection: uses an α-to-enter threshold for adding variables and an α-to-remove threshold for dropping them (sketched below)
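
Since the slide leaves the α levels themselves blank, here is a minimal sketch of stepwise selection in Python with statsmodels, assuming p-value-based thresholds. The 0.15 defaults are a common textbook choice, not values taken from the lecture.

import statsmodels.api as sm

def stepwise(X, y, alpha_enter=0.15, alpha_remove=0.15):
    # X: DataFrame of candidate predictors; y: response.
    # Keeping alpha_remove >= alpha_enter avoids adding and dropping
    # the same variable forever.
    selected, remaining = [], list(X.columns)
    while True:
        changed = False
        # Forward step: among variables not yet in the model, find the
        # one with the smallest p-value when added; enter it if that
        # p-value is below alpha-to-enter.
        pvals = {v: sm.OLS(y, sm.add_constant(X[selected + [v]])).fit().pvalues[v]
                 for v in remaining}
        if pvals:
            best = min(pvals, key=pvals.get)
            if pvals[best] < alpha_enter:
                selected.append(best)
                remaining.remove(best)
                changed = True
        # Backward step: drop the worst variable currently in the model
        # if its p-value exceeds alpha-to-remove.
        if selected:
            fit = sm.OLS(y, sm.add_constant(X[selected])).fit()
            worst = fit.pvalues[selected].idxmax()
            if fit.pvalues[worst] > alpha_remove:
                selected.remove(worst)
                remaining.append(worst)
                changed = True
        if not changed:
            return selected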

  4. Model Validation
  • Can be viewed as the final step of the model-building process
  • Up to now, you have built a model using …
  • Decided which variables are in the final model using …

  5. Model Validation
  • It is important to note that the model selected reflects the properties of the data collected
  • If data were collected at the same X’s, would we get the same model?
  • Want to be sure that the model is capturing the main features of the population of interest

  6. Model Validation
  • Three basic approaches to validate the model:
  • Collection of new data
  • Comparison to theoretical expectations or earlier results
  • Data splitting

  7. Model Validation
  • Collection of new data:
  • If a new set of data is available, you can compute the Mean Square Error of Prediction, MSPR (for each model, if there is more than one), by using the model to predict each observation in the new set and then computing the mean squared deviation between the observed and predicted values: MSPR = Σ(Yi − Ŷi)² / n*, where n* is the number of new observations (see the sketch below)
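
As a concrete illustration, here is a minimal sketch of the MSPR computation in Python with statsmodels; the synthetic arrays are stand-ins for the model-building and new data sets, and the coefficients are arbitrary.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Synthetic stand-ins for the model-building data and the new data set.
n, n_new = 50, 25
beta = np.array([1.0, 0.5, 0.0])
X_build = rng.normal(size=(n, 3))
y_build = 2 + X_build @ beta + rng.normal(size=n)
X_new = rng.normal(size=(n_new, 3))
y_new = 2 + X_new @ beta + rng.normal(size=n_new)

# Fit on the original (model-building) data only.
fit = sm.OLS(y_build, sm.add_constant(X_build)).fit()

# Predict each observation in the new set and average the squared
# deviations: MSPR = sum_i (Y_i - Yhat_i)^2 / n*.
y_hat = fit.predict(sm.add_constant(X_new))
mspr = np.mean((y_new - y_hat) ** 2)
print(f"MSE (build): {fit.mse_resid:.3f}   MSPR (new): {mspr:.3f}")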

  8. Model Validation
  • If the MSPR is much bigger than the MSE from the original model-building data set, then that means the model was overfitting the data (chasing the errors)
  • If several models are being compared, the one with the smallest MSPR appears to be the best for the new data set (see the comparison sketch below)
  • If the MSPRs are similar among all the candidate models, then the choice of a model can be made based on other (nonstatistical) criteria, such as simplicity or interpretability
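
Continuing the sketch above, the same MSPR computation extends directly to several candidate models; the predictor subsets below are purely illustrative.

# Candidate models = candidate predictor subsets (illustrative only).
candidates = {"X1 only": [0], "X1, X2": [0, 1], "X1, X2, X3": [0, 1, 2]}
for name, cols in candidates.items():
    f = sm.OLS(y_build, sm.add_constant(X_build[:, cols])).fit()
    pred = f.predict(sm.add_constant(X_new[:, cols]))
    print(name, "MSPR:", round(float(np.mean((y_new - pred) ** 2)), 3))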

  9. Model Validation
  • How much bigger is “much bigger” than the MSE?
  • The modeling and validation sets ought to have the same population variance, because they are both (supposedly) drawn from the same population
  • Therefore, it is reasonable to treat the ratio MSPR/MSE as approximately … (see the note below)
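
The slide leaves the comparison blank; one informal check (an assumption here, not the lecture's rule) is simply to see how far the ratio drifts above 1, continuing the sketch above.

# If the model is not overfitting, MSPR and MSE estimate the same error
# variance, so their ratio should be near 1.  The 1.3 cutoff below is an
# illustrative choice, not a value from the lecture.
ratio = mspr / fit.mse_resid
if ratio > 1.3:
    print(f"MSPR/MSE = {ratio:.2f}: possible overfitting")
else:
    print(f"MSPR/MSE = {ratio:.2f}: predictive ability looks consistent")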

  10. Model Validation
  • What to do if overfitting is indicated?

  11. Model Validation
  • Comparison to theoretical expectations or earlier results

  12. Model Validation
  • Data Splitting
  • What if no new data set is available? (see the sketch below)
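
A minimal sketch of data splitting, assuming a single synthetic data set and an illustrative 2:1 build/validation split (the text does not prescribe a ratio):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

# Stand-in for the one available data set.
n = 75
X = rng.normal(size=(n, 3))
y = 2 + X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)

# Randomly split: build the model on ~2/3, validate on the held-out ~1/3.
idx = rng.permutation(n)
cut = 2 * n // 3
build, hold = idx[:cut], idx[cut:]

fit = sm.OLS(y[build], sm.add_constant(X[build])).fit()
pred = fit.predict(sm.add_constant(X[hold]))
mspr = np.mean((y[hold] - pred) ** 2)
print(f"MSE (build): {fit.mse_resid:.3f}   MSPR (hold-out): {mspr:.3f}")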
