Chapter 12: Multiple Regression and Model Building


Presentation Transcript


  1. Chapter 12: Multiple Regression and Model Building

  2. Where We’ve Been • Introduced the straight-line model relating a dependent variable y to an independent variable x • Estimated the parameters of the straight-line model using least squares • Assessed the model estimates • Used the model to estimate a value of y given x McClave: Statistics, 11th ed. Chapter 12: Multiple Regression and Model Building

  3. Where We’re Going • Introduce a multiple-regression model to relate a variable y to two or more x variables • Present multiple regression models with both quantitative and qualitative independent variables • Assess how well the multiple regression model fits the sample data • Show how analyzing the model residuals can help detect problems with the model and the necessary modifications

  4. 12.1: Multiple Regression Models

  5. 12.1: Multiple Regression Models • Analyzing a Multiple-Regression Model Step 1: Hypothesize the deterministic portion of the model by choosing the independent variables x1, x2, … , xk. Step 2: Estimate the unknown parameters β0, β1, β2, … , βk. Step 3: Specify the probability distribution of ε and estimate the standard deviation σ of this distribution.

  6. 12.1: Multiple Regression Models • Analyzing a Multiple-Regression Model Step 4: Check that the assumptions about ε are satisfied; if not, make the required modifications to the model. Step 5: Statistically evaluate the usefulness of the model. Step 6: If the model is useful, use it for prediction, estimation, and other purposes.

  7. 12.1: Multiple Regression Models • Assumptions about the Random Error ε • The mean is equal to 0. • The variance is equal to σ². • The probability distribution is a normal distribution. • Random errors are independent of one another.

  8. 12.2: The First-Order Model: Estimating and Making Inferences about the β Parameters A First-Order Model in Five Quantitative Independent Variables where x1, x2, … , xk are all quantitative variables that are not functions of other independent variables.
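The model equation itself does not appear in the transcript. Assuming the standard textbook form, a first-order model in k quantitative independent variables can be written as follows, where the deterministic part is linear in each x and the random error ε carries the assumptions listed in Section 12.1:

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon,
\qquad
E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k
```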

  9. 12.2: The First-Order Model: Estimating and Making Inferences about the β Parameters A First-Order Model in Five Quantitative Independent Variables The parameters are estimated by finding the values for the β’s that minimize SSE = Σ(y − ŷ)²

  10. 12.2: The First-Order Model: Estimating and Making Inferences about the β Parameters A First-Order Model in Five Quantitative Independent Variables The parameters are estimated by finding the values for the β’s that minimize SSE = Σ(y − ŷ)². Only a truly talented mathematician (or geek) would choose to solve the necessary system of simultaneous linear equations by hand. In practice, computers are left to do the complicated calculations required by multiple regression models.
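To make the "system of simultaneous linear equations" concrete, here is a minimal sketch in plain Python that builds the normal equations (X'X)b = X'y and solves them by Gaussian elimination. The function names and the small data set are hypothetical and chosen only for illustration; statistical software would normally do this work.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations act on both sides at once.
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_least_squares(rows, y):
    """Fit y = b0 + b1*x1 + ... + bk*xk; return (betas, sse, s)."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 for the intercept
    n, p = len(design), len(design[0])        # p = k + 1 parameters
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    betas = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(betas, d)) for d in design]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    s = math.sqrt(sse / (n - p))              # estimate of sigma
    return betas, sse, s


# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2 (no noise),
# so the fit should recover those coefficients with SSE close to 0.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
betas, sse, s = fit_least_squares(rows, y)
print([round(b, 6) for b in betas], round(sse, 6))
```

The divisor n − (k + 1) in the estimate of σ reflects the k + 1 parameters estimated from the n observations.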

  11. 12.2: The First-Order Model: Estimating and Making Inferences about the β Parameters • A collector of antique clocks hypothesizes that the auction price can be modeled as y = β0 + β1x1 + β2x2 + ε, where y is the auction price, x1 is the age of the clock, and x2 is the number of bidders.

  12. 12.2: The First-Order Model: Estimating and Making Inferences about the β Parameters • Based on the data in Table 12.1, the least squares prediction equation, the equation that minimizes SSE, is

  13. 12.2: The First-Order Model: Estimating and Making Inferences about the β Parameters • Based on the data in Table 12.1, the least squares prediction equation, the equation that minimizes SSE, is The estimate for β1 is interpreted as the expected change in y given a one-unit change in x1, holding x2 constant. The estimate for β2 is interpreted as the expected change in y given a one-unit change in x2, holding x1 constant.

  14. 12.2: The First-Order Model: Estimating and Making Inferences about the β Parameters • Based on the data in Table 12.1, the least squares prediction equation, the equation that minimizes SSE, is Since it makes no sense to sell a clock of age 0 at an auction with no bidders, the intercept term has no meaningful interpretation in this example.

  15. 12.2: The First-Order Model: Estimating and Making Inferences about the β Parameters Test of an Individual Parameter Coefficient in the Multiple Regression Model • One-Tailed Test • Two-Tailed Test

  16. 12.2: The First-Order Model: Estimating and Making Inferences about the β Parameters Test of the Parameter Coefficient on the Number of Bidders

  17. 12.2: The First-Order Model: Estimating and Making Inferences about the β Parameters Test of the Parameter Coefficient on the Number of Bidders Since t* > tα (the test statistic exceeds the critical value), reject the null hypothesis.
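The decision rule for a single coefficient can be sketched as a small helper. This is only an illustration: the function name and the numbers in the usage line are hypothetical, and the critical value is assumed to be looked up from a t table with n − (k + 1) degrees of freedom (use tα for a one-tailed test and tα/2 for a two-tailed test).

```python
def coefficient_t_test(estimate, std_error, t_critical, alternative="greater"):
    """Test H0: beta_i = 0 using t = estimate / std_error.

    t_critical is the positive table value (t_alpha for one-tailed tests,
    t_alpha/2 for a two-tailed test). Returns (t_stat, reject_h0).
    """
    t_stat = estimate / std_error
    if alternative == "greater":      # Ha: beta_i > 0
        reject = t_stat > t_critical
    elif alternative == "less":       # Ha: beta_i < 0
        reject = t_stat < -t_critical
    elif alternative == "two-sided":  # Ha: beta_i != 0
        reject = abs(t_stat) > t_critical
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return t_stat, reject


# Hypothetical values, not taken from the clock example.
t_stat, reject = coefficient_t_test(estimate=4.0, std_error=0.5, t_critical=1.7)
print(t_stat, reject)
```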

  18. 12.2: The First-Order Model: Estimating and Making Inferences about the β Parameters A 100(1 − α)% Confidence Interval for a β Parameter

  19. 12.2: The First-Order Model: Estimating and Making Inferences about the β Parameters A 100(1 − α)% Confidence Interval for β1

  20. 12.2: The First-Order Model: Estimating and Making Inferences about the β Parameters A 100(1 − α)% Confidence Interval for β1 Holding the number of bidders constant, the result above tells us that we can be 90% confident that the mean auction price rises by between $11.20 and $14.28 for each 1-year increase in age.
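A confidence interval for a β parameter has the form estimate ± (t critical value) × (standard error of the estimate). The sketch below applies that formula. Only the interval endpoints ($11.20 and $14.28) come from the slide; the point estimate 12.74 is their midpoint, and the standard error and critical value are illustrative assumptions chosen so that the margin matches the half-width of 1.54.

```python
def confidence_interval(estimate, std_error, t_critical):
    """Return (low, high) for estimate +/- t_critical * std_error."""
    margin = t_critical * std_error
    return estimate - margin, estimate + margin


# 12.74 is the midpoint of the interval reported on the slide.
# std_error=0.905 and t_critical=1.699 are ASSUMED values for illustration;
# the transcript does not state them.
low, high = confidence_interval(estimate=12.74, std_error=0.905, t_critical=1.699)
print(round(low, 2), round(high, 2))
```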

  21. 12.3: Evaluating Overall Model Utility Reject H0 for βi: • Evidence of a linear relationship between y and xi Do Not Reject H0 for βi: • There may be no relationship between y and xi • A Type II error occurred • The relationship between y and xi is more complex than a straight-line relationship

  22. 12.3: Evaluating Overall Model Utility • The multiple coefficient of determination, R², measures how much of the overall variation in y is explained by the least squares prediction equation.

  23. 12.3: Evaluating Overall Model Utility • High values of R² suggest a good model, but the usefulness of R² falls as the number of observations becomes close to the number of parameters estimated.

  24. 12.3: Evaluating Overall Model Utility Ra² adjusts for the number of observations and the number of parameter estimates. It will always have a value no greater than R².
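The relationship between R² and Ra² can be sketched from the usual definitions, R² = 1 − SSE/SSyy and Ra² = 1 − [(n − 1)/(n − (k + 1))](1 − R²). The function names and the numbers below are hypothetical and serve only to show that the adjusted value does not exceed R².

```python
def r_squared(sse, ss_yy):
    """Multiple coefficient of determination: 1 - SSE / SSyy."""
    return 1.0 - sse / ss_yy


def adjusted_r_squared(sse, ss_yy, n, k):
    """Adjusted R^2 for n observations and k independent variables."""
    return 1.0 - ((n - 1) / (n - (k + 1))) * (sse / ss_yy)


# Hypothetical sums of squares and sample size.
r2 = r_squared(sse=20.0, ss_yy=100.0)
ra2 = adjusted_r_squared(sse=20.0, ss_yy=100.0, n=12, k=2)
print(round(r2, 4), round(ra2, 4))
```

Because (n − 1)/(n − (k + 1)) is at least 1 whenever k ≥ 0, the penalty grows as the number of parameters approaches the number of observations.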

  25. 12.3: Evaluating Overall Model Utility

  26. 12.3: Evaluating Overall Model Utility Rejecting the null hypothesis means that something in your model helps explain variations in y, but it may be that another model provides more reliable estimates and predictions.
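The test statistic for this global test is not shown in the transcript. Assuming the standard form, F = (R²/k) / [(1 − R²)/(n − (k + 1))], it can be computed as in the sketch below. The function name and input values are hypothetical, and the resulting F would still need to be compared with a table value on k and n − (k + 1) degrees of freedom.

```python
def global_f_statistic(r2, n, k):
    """F statistic for H0: beta_1 = ... = beta_k = 0, computed from R^2."""
    numerator = r2 / k                          # explained variation per term
    denominator = (1.0 - r2) / (n - (k + 1))    # unexplained per error df
    return numerator / denominator


# Hypothetical inputs: R^2 = 0.8 from n = 12 observations and k = 2 predictors.
f_stat = global_f_statistic(r2=0.8, n=12, k=2)
print(round(f_stat, 2))
```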

  27. 12.3: Evaluating Overall Model Utility A collector of antique clocks hypothesizes that the auction price can be modeled as y = β0 + β1x1 + β2x2 + ε.

  28. 12.3: Evaluating Overall Model Utility A collector of antique clocks hypothesizes that the auction price can be modeled as y = β0 + β1x1 + β2x2 + ε. Something in the model is useful, but the F-test can’t tell us which x-variables are individually useful.

  29. 12.3: Evaluating Overall Model Utility • Checking the Utility of a Multiple-Regression Model • Use the F-test to conduct a test of the adequacy of the overall model. • Conduct t-tests on the “most important” β parameters. • Examine Ra² and 2s to evaluate how well the model fits the data.

  30. 12.4: Using the Model for Estimation and Prediction • The model of antique clock prices can be used to predict sale prices for clocks of a certain age with a particular number of bidders. • What is the mean sale price for all 150-year-old clocks with 10 bidders?

  31. 12.4: Using the Model for Estimation and Prediction The average value of all clocks with these characteristics can be estimated by using statistical software to generate a confidence interval (see Figure 12.7). • What is the auction sale price for a single 150-year-old clock with 10 bidders? For a single clock, the corresponding prediction interval indicates that we can be 95% confident that the price of a single 150-year-old clock sold at auction with 10 bidders will be between $1,154.10 and $1,709.30.

  32. 12.4: Using the Model for Estimation and Prediction

  33. 12.4: Using the Model for Estimation and Prediction • What is the sale price for a single 50-year-old clock with 2 bidders?

  34. 12.4: Using the Model for Estimation and Prediction • What is the sale price for a single 50-year-old clock with 2 bidders? Since 50 years of age and 2 bidders are both outside the range of values in our data set, any prediction using these values would be unreliable.
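A simple guard against this kind of extrapolation is to compare each new x value with the range observed in the sample. The sketch below does that. The function name and the five sample rows are hypothetical stand-ins and are not the values in Table 12.1.

```python
def outside_observed_range(observed_rows, new_point):
    """Return the indices of predictors in new_point that fall outside
    the [min, max] range seen in observed_rows."""
    flagged = []
    for j, value in enumerate(new_point):
        column = [row[j] for row in observed_rows]
        if value < min(column) or value > max(column):
            flagged.append(j)
    return flagged


# Hypothetical (age, bidders) pairs for illustration only.
sample = [(108, 6), (127, 13), (150, 9), (170, 14), (194, 5)]

print(outside_observed_range(sample, (150, 10)))  # inside both ranges
print(outside_observed_range(sample, (50, 2)))    # outside both ranges
```

Passing this check is necessary but not sufficient: a point can lie inside each variable's individual range and still be far from any observed combination of values.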

  35. 12.5: Model Building: Interaction Models • In some cases, the impact of an independent variable xi on y will depend on the value of some other independent variable xk. • Interaction models include the cross-products of independent variables as well as the first-order values.

  36. 12.5: Model Building: Interaction Models

  37. 12.5: Model Building: Interaction Models • In the antique clock auction example, assume the collector has reason to believe that the impact of age (x1) on price (y) varies with the number of bidders (x2). • The model is now y = β0 + β1x1 + β2x2 + β3x1x2 + ε.

  38. 12.5: Model Building: Interaction Models

  39. 12.5: Model Building: Interaction Models • In the antique clock auction example, assume the collector has reason to believe that the impact of age (x1) on price (y) varies with the number of bidders (x2). • The model is now y = β0 + β1x1 + β2x2 + β3x1x2 + ε.

  40. 12.5: Model Building: Interaction Models The MINITAB results are reported in Figure 12.11 in the text. • In the antique clock auction example, assume the collector has reason to believe that the impact of age (x1) on price (y) varies with the number of bidders (x2). • The model is now y = β0 + β1x1 + β2x2 + β3x1x2 + ε.

  41. 12.5: Model Building: Interaction Models • In the antique clock auction example, assume the collector has reason to believe that the impact of age (x1) on price (y) varies with the number of bidders (x2). • The model is now y = β0 + β1x1 + β2x2 + β3x1x2 + ε.

  42. 12.5: Model Building: Interaction Models Once the interaction term has passed the t-test, it is unnecessary to test the individual independent variables.
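Regrouping the interaction model by x1 shows why the effect of age depends on the number of bidders. This short derivation uses the same symbols as the clock example:

```latex
\begin{aligned}
E(y) &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 \\
     &= (\beta_0 + \beta_2 x_2) + (\beta_1 + \beta_3 x_2)\, x_1
\end{aligned}
```

For a fixed number of bidders x2, the expected change in price for each additional year of age is β1 + β3x2 rather than β1 alone, so β1 and β2 no longer describe a constant effect on their own once β3 ≠ 0.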

  43. 12.6: Model Building: Quadratic and Other Higher Order Models • A quadratic (second-order) model includes the square of an independent variable: y = β0 + β1x + β2x² + ε. This allows more complex relationships to be modeled.

  44. 12.6: Model Building: Quadratic and Other Higher Order Models • A quadratic (second-order) model includes the square of an independent variable: y = β0 + β1x + β2x² + ε. β1 is the shift parameter and β2 is the rate of curvature.

  45. 12.6: Model Building: Quadratic and Other Higher Order Models • Example 12.7 considers whether home size (x) impacts electrical usage (y) in a positive but decreasing way. • The MINITAB results are shown in Figure 12.13.

  46. 12.6: Model Building: Quadratic and Other Higher Order Models

  47. 12.6: Model Building: Quadratic and Other Higher Order Models • According to the results, the equation that minimizes SSE for the 10 observations is

  48. 12.6: Model Building: Quadratic and Other Higher Order Models

  49. 12.6: Model Building: Quadratic and Other Higher Order Models • Since 0 is not in the range of the independent variable (a house of 0 ft²?), the estimated intercept is not meaningful. • The positive estimate on β1 indicates a positive relationship, although the slope is not constant (we’ve estimated a curve, not a straight line). • The negative value on β2 indicates the rate of increase in power usage declines for larger homes.
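The "positive but decreasing" pattern follows from the slope of the quadratic model, which is β1 + 2β2x and therefore shrinks as x grows when β2 is negative. The sketch below illustrates this with made-up coefficients. They are hypothetical and are not the estimates from Figure 12.13.

```python
def quadratic_mean(beta0, beta1, beta2, x):
    """E(y) for the quadratic model beta0 + beta1*x + beta2*x^2."""
    return beta0 + beta1 * x + beta2 * x ** 2


def quadratic_slope(beta1, beta2, x):
    """Rate of change of E(y) with respect to x: beta1 + 2*beta2*x."""
    return beta1 + 2.0 * beta2 * x


# Hypothetical coefficients: positive beta1, negative beta2.
b0, b1, b2 = -1000.0, 2.0, -0.0003

slope_small_home = quadratic_slope(b1, b2, 1000.0)  # slope at x = 1000
slope_large_home = quadratic_slope(b1, b2, 2000.0)  # slope at x = 2000
print(round(slope_small_home, 4), round(slope_large_home, 4))
```

With these values the slope is still positive at both sizes but smaller for the larger home, which is the behavior the negative β2 estimate describes.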

  50. 12.6: Model Building: Quadratic and Other Higher Order Models • The Global F-Test • H0: β1 = β2 = 0 • Ha: At least one of the coefficients ≠ 0 • The test statistic is F = 189.71, p-value near 0. • Reject H0.
