
Ch. 14: Multiple Regression Model Building




Presentation Transcript


  1. Ch. 14: Multiple Regression Model Building Idea: examine the linear relationship between one dependent variable (Y) and two or more independent variables (Xi). Multiple regression model with k independent variables: Yi = β0 + β1X1i + β2X2i + … + βkXki + εi, where β0 is the Y-intercept, β1, …, βk are the population slopes, and εi is the random error.

  2. The estimated (predicted) value of Y is Ŷi = b0 + b1X1i + b2X2i + … + bkXki, where b0 is the estimated intercept and b1, …, bk are the estimated slope coefficients. • The coefficients of the multiple regression model are estimated using sample data with k independent variables. • Interpretation of the slopes (each is a net regression coefficient): b1 = the change in the mean of Y per unit change in X1, taking into account the effect of X2 (i.e., net of X2). • b0 = the Y-intercept, interpreted the same way as in simple regression.
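
As an illustration (not from the slides), the sketch below estimates such coefficients by ordinary least squares. It assumes Python with numpy and statsmodels installed; the data and the variable names (lot size, rooms) are synthetic and hypothetical.

    # Minimal OLS fit of Y on two predictors; data are synthetic, names hypothetical.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 50
    X1 = rng.uniform(5, 10, n)                        # e.g., lot size (hypothetical units)
    X2 = rng.integers(4, 10, n)                       # e.g., number of rooms (hypothetical)
    Y = 20 + 3 * X1 + 5 * X2 + rng.normal(0, 2, n)    # synthetic response

    X = sm.add_constant(np.column_stack([X1, X2]))    # adds the intercept column
    model = sm.OLS(Y, X).fit()
    b0, b1, b2 = model.params
    print(f"b0 = {b0:.2f}, b1 = {b1:.2f} (net of X2), b2 = {b2:.2f} (net of X1)")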

  3. Graph of a Two-Variable Model (figure): in three dimensions the fitted model is a plane, with Y on the vertical axis, one slope for variable X1 along the X1 axis, and another slope for variable X2 along the X2 axis.

  4. Example • Compare the simple regression results with the multiple regression results. • Check the size and significance level of the coefficients, the F-value, the R-square, etc. You will see what the "net of" effects are.
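
A hedged illustration of that comparison, assuming Python with numpy and statsmodels and synthetic data: when X1 and X2 are correlated, the simple-regression slope on X1 differs from its slope in the multiple regression, which is the "net of" effect.

    # Simple vs. multiple regression slope on X1 with correlated predictors.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    n = 200
    X1 = rng.normal(size=n)
    X2 = 0.7 * X1 + rng.normal(scale=0.5, size=n)     # correlated with X1
    Y = 1 + 2 * X1 + 3 * X2 + rng.normal(size=n)

    simple = sm.OLS(Y, sm.add_constant(X1)).fit()
    multiple = sm.OLS(Y, sm.add_constant(np.column_stack([X1, X2]))).fit()
    print(f"simple slope on X1:   {simple.params[1]:.2f}")    # absorbs part of X2's effect
    print(f"multiple slope on X1: {multiple.params[1]:.2f}")  # net of X2, close to 2
    print(f"overall F = {multiple.fvalue:.1f}, R-square = {multiple.rsquared:.3f}")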

  5. Using the Equation to Make Predictions • Predict the appraised value at the average lot size (7.24) and the average number of rooms (7.12). • What is the total effect of a 2,000 sq. ft. increase in lot size and 2 additional rooms?
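
The slides do not give the fitted coefficients, so the sketch below uses hypothetical values for b0, b1, b2 purely to show the arithmetic; it also assumes lot size is measured in thousands of square feet, so a 2,000 sq. ft. increase is +2 units.

    # Prediction and total-effect arithmetic with hypothetical coefficients.
    b0, b1, b2 = 20.0, 3.0, 5.0          # hypothetical intercept and slopes
    lot_size, rooms = 7.24, 7.12         # sample means quoted on the slide

    y_hat = b0 + b1 * lot_size + b2 * rooms
    print(f"Predicted appraised value at the means: {y_hat:.2f}")

    # Total effect of +2 (thousand sq. ft.) lot size and +2 rooms:
    total_effect = b1 * 2 + b2 * 2
    print(f"Estimated total effect: {total_effect:.2f}")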

  6. Coefficient of Multiple Determination (r2) and Adjusted r2 • r2 = SSR / SST reports the proportion of total variation in Y explained by all the X variables taken together (the model). • Adjusted r2 is needed because r2 never decreases when a new X variable is added to the model, which can be a disadvantage when comparing models.

  7. What is the net effect of adding a new variable? • We lose a degree of freedom when a new X variable is added. Did the new X variable add enough explanatory power to offset the loss of one degree of freedom? • Adjusted r2 = 1 − (1 − r2)(n − 1) / (n − k − 1) shows the proportion of variation in Y explained by all X variables, adjusted for the number of X variables used (where n = sample size, k = number of independent variables). • It penalizes excessive use of unimportant independent variables, is smaller than r2, and is useful for comparing models.
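
A quick arithmetic check of that formula in Python, with hypothetical values for r2, n, and k:

    # Adjusted r-square from r-square, sample size n, and k predictors (hypothetical values).
    r2, n, k = 0.80, 30, 2
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    print(f"adjusted r-square = {adj_r2:.3f}")   # slightly below r-square, as expected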

  8. Multiple Regression Assumptions • Assumptions: the errors are normally distributed, the errors have a constant variance, and the model errors are independent. • Errors (residuals) from the regression model: ei = Yi − Ŷi • These residual plots are used in multiple regression: residuals vs. Ŷi, residuals vs. X1i, residuals vs. X2i, and residuals vs. time (if time-series data).
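
A sketch of producing those plots, assuming Python with numpy, statsmodels, and matplotlib; the data are synthetic.

    # Residual plots against fitted values and each predictor.
    import numpy as np
    import statsmodels.api as sm
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(1)
    n = 60
    X1, X2 = rng.uniform(0, 10, n), rng.uniform(0, 5, n)
    Y = 2 + 1.5 * X1 - 0.8 * X2 + rng.normal(0, 1, n)

    fit = sm.OLS(Y, sm.add_constant(np.column_stack([X1, X2]))).fit()
    resid = fit.resid

    fig, axes = plt.subplots(1, 3, figsize=(12, 3))
    for ax, x, label in zip(axes, [fit.fittedvalues, X1, X2], ["fitted Y", "X1", "X2"]):
        ax.scatter(x, resid, s=12)
        ax.axhline(0, color="gray")          # residuals should scatter evenly around zero
        ax.set(xlabel=label, ylabel="residual")
    plt.tight_layout()
    plt.show()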

  9. Two-Variable Model (figure): a sample observation Yi lies above or below the fitted plane at (x1i, x2i); the residual is ei = Yi − Ŷi. The best-fit equation, Ŷ, is found by minimizing the sum of squared errors, Σe2.

  10. Are Individual Variables Significant? • Use t tests on the individual variable slopes to show whether there is a linear relationship between the variable Xi and Y. Hypotheses: H0: βi = 0 (no linear relationship); H1: βi ≠ 0 (a linear relationship does exist between Xi and Y). • Test statistic: t = bi / Sbi, where Sbi is the standard error of bi, with n − k − 1 degrees of freedom. • Confidence interval for the population slope βi: bi ± tα/2 Sbi, with n − k − 1 degrees of freedom.
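
A small numeric sketch of the slope t test and confidence interval, using hypothetical values for b1, its standard error, n, and k, and assuming Python with scipy installed:

    # t test and 95% confidence interval for one slope (hypothetical numbers).
    from scipy import stats

    b1, se_b1 = 3.0, 0.9       # hypothetical estimate and standard error
    n, k = 30, 2               # hypothetical sample size and number of predictors
    df = n - k - 1

    t_stat = b1 / se_b1
    p_value = 2 * stats.t.sf(abs(t_stat), df)
    t_crit = stats.t.ppf(0.975, df)
    ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")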

  11. Is the Overall Model Significant? • The F test for overall significance of the model shows whether there is a linear relationship between all of the X variables considered together and Y. Hypotheses: H0: β1 = β2 = … = βk = 0 (no linear relationship); H1: at least one βi ≠ 0 (at least one independent variable affects Y). • Test statistic: F = MSR / MSE = (SSR / k) / (SSE / (n − k − 1)).
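
A numeric sketch of that statistic, with hypothetical sums of squares and assuming Python with scipy:

    # Overall F test from hypothetical regression and error sums of squares.
    from scipy import stats

    SSR, SSE = 180.0, 60.0     # hypothetical sums of squares
    n, k = 30, 2               # hypothetical sample size and number of predictors

    MSR = SSR / k
    MSE = SSE / (n - k - 1)
    F = MSR / MSE
    p_value = stats.f.sf(F, k, n - k - 1)
    print(f"F = {F:.2f}, p = {p_value:.4g}")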

  12. Testing Portions of the Multiple Regression Model • Goal: find out whether including an individual Xj, or a set of Xs, significantly improves the model, given that the other independent variables are already included. • Two measures: the partial F-test criterion and the coefficient of partial determination.

  13. Contribution of a Single Independent Variable Xj: SSR(Xj | all variables except Xj) = SSR(all variables) − SSR(all variables except Xj) • This measures the contribution of Xj to explaining the total variation in Y (SST). • For a three-variable model: SSR(X1 | X2 and X3) = SSR(X1, X2, and X3) − SSR(X2 and X3), i.e., the SSR of the unrestricted (full) model minus the SSR of the restricted model.

  14. The Partial F-Test Statistic • Consider the hypothesis test: H0: variable Xj does not significantly improve the model after all other variables are included; H1: variable Xj significantly improves the model after all other variables are included. • Test statistic: F = SSR(Xj | all variables except Xj) / MSE, where the numerator is the contribution of Xj to the regression. • If the actual F statistic is greater than the critical F, the conclusion is: reject H0; adding Xj does improve the model.
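
A sketch of the partial F test for one variable (here X3), assuming Python with numpy and statsmodels; the data are synthetic and the names hypothetical. Note that statsmodels' .ssr attribute is the residual (error) sum of squares, not the regression sum of squares.

    # Partial F test for X3 given X1 and X2, via nested models.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    n = 80
    X1, X2, X3 = rng.normal(size=(3, n))
    Y = 1 + 2 * X1 + 0.5 * X2 + 1.5 * X3 + rng.normal(0, 1, n)

    full = sm.OLS(Y, sm.add_constant(np.column_stack([X1, X2, X3]))).fit()
    restricted = sm.OLS(Y, sm.add_constant(np.column_stack([X1, X2]))).fit()

    # Partial F by hand: SSR(X3 | X1, X2) / MSE of the full model.
    # (restricted.ssr - full.ssr equals the gain in the regression sum of squares.)
    ssr_gain = restricted.ssr - full.ssr
    partial_F = ssr_gain / full.mse_resid
    print(f"partial F = {partial_F:.2f}")

    # statsmodels performs the same nested-model comparison directly:
    print(full.compare_f_test(restricted))    # (F, p-value, df difference)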

  15. Coefficient of Partial Determination (for one variable or a set of variables) • Measures the proportion of total variation in the dependent variable (SST) that is explained by Xj while controlling for (holding constant) the other explanatory variables: r2 Yj.(all others) = SSR(Xj | all others) / [SST − SSR(all variables) + SSR(Xj | all others)].

  16. Using Dummy Variables • A dummy variable is a categorical explanatory variable with two levels (yes or no, on or off, male or female) coded as 0 or 1. • The regression intercepts differ if the dummy variable is significant. • This assumes equal slopes for the other variables. • If the categorical variable has more than two levels, the number of dummy variables needed is (number of levels − 1).
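
A sketch of a regression with one dummy variable, assuming Python with numpy and statsmodels; the data and the "fireplace" variable are synthetic and hypothetical.

    # Dummy-variable regression: same slope, different intercepts.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    n = 100
    size = rng.uniform(1.0, 3.0, n)                   # e.g., house size (hypothetical)
    fireplace = rng.integers(0, 2, n)                 # dummy: 1 = fireplace, 0 = none
    price = 50 + 40 * size + 15 * fireplace + rng.normal(0, 5, n)

    X = sm.add_constant(np.column_stack([size, fireplace]))
    fit = sm.OLS(price, X).fit()
    b0, b1, b2 = fit.params
    print(f"intercept without fireplace: {b0:.1f}")
    print(f"intercept with fireplace:    {b0 + b2:.1f}   (same slope {b1:.1f} for size)")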

  17. Fire Place vs. No Fire Place (figure): two regression lines for Y (sales) with different intercepts but the same slope; the intercept is b0 + b2 for houses with a fireplace (X2 = 1) and b0 for houses with no fireplace (X2 = 0). If H0: β2 = 0 is rejected, then "Fire Place" has a significant effect on values.

  18. Interaction Between Explanatory Variables • Hypothesizes an interaction between pairs of X variables: the response to one X variable may vary at different levels of another X variable. • The model contains two-way cross-product terms. • Effect of the interaction: without the interaction term, the effect of X1 on Y is measured by β1; with the interaction term, the effect of X1 on Y is measured by β1 + β3X2, so the effect changes as X2 changes.

  19. Example: Suppose X2 is a dummy variable and the estimated regression equation is Ŷ = 1 + 2X1 + 3X2 + 4X1X2. • X2 = 1: Ŷ = 1 + 2X1 + 3(1) + 4X1(1) = 4 + 6X1 • X2 = 0: Ŷ = 1 + 2X1 + 3(0) + 4X1(0) = 1 + 2X1 • The slopes are different (6 vs. 2) because the effect of X1 on Y depends on the value of X2.
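
A minimal Python check of that arithmetic (it only restates the equation on the slide):

    # Slope of X1 at each level of the dummy X2, from the estimated equation.
    def y_hat(x1, x2):
        return 1 + 2 * x1 + 3 * x2 + 4 * x1 * x2

    for x2 in (0, 1):
        slope = y_hat(1, x2) - y_hat(0, x2)      # change in Y-hat per unit of X1
        intercept = y_hat(0, x2)
        print(f"X2 = {x2}: Y-hat = {intercept} + {slope} * X1")
    # X2 = 0: Y-hat = 1 + 2 * X1;  X2 = 1: Y-hat = 4 + 6 * X1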
