
Chapter 11

Chapter 11. Validation of Regression Models. 11.1 Introduction. What the regression equation was created for may not always be what it is used for.


Presentation Transcript


  1. Chapter 11: Validation of Regression Models. Linear Regression Analysis 5E, Montgomery, Peck & Vining

  2. 11.1 Introduction
     • What the regression equation was created for may not always be what it is used for.
     • Model adequacy checking: residual analysis, lack-of-fit testing, and determining influential observations. It checks the fit of the model to the available data.
     • Model validation: determining whether the model will behave or function as intended in the operating environment.

  3. 11.2 Validation Techniques
     1. Analysis of model coefficients and predicted values
        • Check for "inappropriate" signs on the coefficients.
        • Check for unusual magnitudes on the coefficients.
        • Check for stability in the coefficient estimates.
        • Check the predicted values (do they make sense for the nature of the data?).
     2. Collection of new data
        • Usually 15-20 new observations are adequate.

  4. Example 11.1 The Hald Cement Data. The coefficients of x1 are very similar; the coefficients of x2 and the intercept are moderately different. Difference in predicted values?

  5. Which model would you prefer?

  6. Example 11.2 The Delivery Time Data. Compare the residual mean square to the average squared prediction error.

  7. New data: Average squared prediction error
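The comparison on slides 6 and 7 can be sketched in a few lines of Python. This is a minimal illustration, assuming a single regressor so that the least-squares fit has a closed form; the function names and the numbers are made up for illustration and are not the delivery time data.

```python
def fit_line(x, y):
    """Least-squares fit of y = b0 + b1 * x. Returns (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def residual_mean_square(x, y, b0, b1, n_params=2):
    """SS_Res / (n - p), computed on the data used to fit the model."""
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return ss_res / (len(x) - n_params)


def average_squared_prediction_error(x_new, y_new, b0, b1):
    """Mean of the squared prediction errors on fresh observations."""
    errors = [yi - (b0 + b1 * xi) for xi, yi in zip(x_new, y_new)]
    return sum(e ** 2 for e in errors) / len(errors)


# Made-up numbers for illustration only (hypothetical, not from the text).
x_fit = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_fit = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
x_new = [2.5, 4.5, 7.0]
y_new = [5.2, 8.7, 14.5]

b0, b1 = fit_line(x_fit, y_fit)
ms_res = residual_mean_square(x_fit, y_fit, b0, b1)
aspe = average_squared_prediction_error(x_new, y_new, b0, b1)
print(f"MS_Res = {ms_res:.4f}, average squared prediction error = {aspe:.4f}")
```

An average squared prediction error much larger than the residual mean square suggests the model predicts new data less well than it fits the data it was built on.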

  8. How does this compare to the R² for prediction based on PRESS?

  9. 11.2 Validation Techniques
     3. Data splitting (aka cross validation)
        • Divide the data into two parts: estimation data and prediction data.
        • The PRESS statistic is an estimate of performance based on a form of data splitting: each observation is held out in turn and predicted from the remaining ones.
        • We can also use PRESS to compute an R²-type statistic for prediction: R²(prediction) = 1 - PRESS / SS_T, where SS_T is the total sum of squares.
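As a rough sketch of the PRESS idea on slide 9, the code below computes PRESS and the R² for prediction, 1 - PRESS / SS_T, without refitting the model n times. It assumes a single regressor, for which the leverage of point i is 1/n + (x_i - x̄)² / S_xx, and uses the standard identity that the leave-one-out residual equals e_i / (1 - h_ii). The function name is hypothetical.

```python
def press_and_r2_prediction(x, y):
    """Return (PRESS, R^2 for prediction) for the fit y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    press = 0.0
    for xi, yi in zip(x, y):
        residual = yi - (b0 + b1 * xi)
        # Leverage (hat-matrix diagonal) for simple linear regression.
        leverage = 1.0 / n + (xi - x_bar) ** 2 / sxx
        # PRESS residual: the error made when point i is left out of the fit.
        press += (residual / (1.0 - leverage)) ** 2

    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    return press, 1.0 - press / ss_total


# Made-up numbers for illustration only (hypothetical, not from the text).
press, r2_pred = press_and_r2_prediction(
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.1, 3.9, 6.2, 7.8, 10.1, 11.9],
)
print(f"PRESS = {press:.4f}, R2 prediction = {r2_pred:.4f}")
```

With several regressors the same identity holds, but the leverages come from the diagonal of the hat matrix X(X'X)⁻¹X' rather than from the closed form used here.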

  10. 11.2 Validation Techniques
     3. Data splitting (aka cross validation)
        • If the time sequence is known, data splitting can be done by time order (common in time series or forecasting).
        • Other characteristics of the data can also guide the split (are the data grouped by operator, machine, location, etc.?).
        • Double cross validation
        • Drawbacks?
        • A more formal approach?
        • The DUPLEX algorithm
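Double cross validation with a time-ordered split might look like the sketch below. It assumes the observations are already in time order, splits them into two halves, fits a model on each half, and uses each model to predict the other half. The single-regressor fit and all names are illustrative assumptions. This is not the DUPLEX algorithm, which selects the two sets by distances between points in the regressor space.

```python
def fit_line(x, y):
    """Least-squares fit of y = b0 + b1 * x. Returns (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def mean_squared_prediction_error(x, y, b0, b1):
    """Average squared error of the fitted line on the given points."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / len(x)


def double_cross_validation(x, y):
    """Fit on each time-ordered half and predict the other half."""
    half = len(x) // 2
    if half < 2 or len(x) - half < 2:
        raise ValueError("need at least two observations in each half")
    xa, ya = x[:half], y[:half]   # earlier observations
    xb, yb = x[half:], y[half:]   # later observations
    model_a = fit_line(xa, ya)
    model_b = fit_line(xb, yb)
    return {
        "coef_a": model_a,
        "coef_b": model_b,
        "error_a_on_b": mean_squared_prediction_error(xb, yb, *model_a),
        "error_b_on_a": mean_squared_prediction_error(xa, ya, *model_b),
    }


# Made-up numbers for illustration only (hypothetical, not from the text).
result = double_cross_validation(
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    [2.2, 3.8, 6.1, 8.0, 9.9, 12.3, 13.8, 16.1],
)
print(result)
```

Similar coefficients and similar prediction errors in both directions suggest a stable model. One drawback of any split is that each model is estimated from fewer observations than the full data set.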

  11. Example 11.3 The Delivery Time Data. A portion of Table 11.3 showing the prediction and estimation data determined with DUPLEX.


  13. A portion of Table 11.4 is reproduced here.


  15. Example 11.3 The Delivery Time Data
