
chapter eight


jase

Presentation Transcript


  1. Chapter Eight. Multiple Regression: Estimation and Hypothesis Testing

  2. Three-Variable Model • Yi = B1 + B2X2i + B3X3i + ui • Any individual Y value can be expressed as the sum of a systematic or deterministic component, B1 + B2X2i + B3X3i, and a nonsystematic or random component, ui • B2 and B3 are the partial regression coefficients

  3. Partial slope or regression coefficients • For example, B2 measures the change in the mean value of Y per unit change in X2, holding X3 constant. • This reflects the partial effect of one explanatory variable on the mean value of the dependent variable when the values of the other explanatory variables are held constant. • Regression can isolate the effect of each X variable on Y from all the other X variables.

  4. Assumptions of the Multiple LRM • The regression model is linear in the parameters • X2 and X3 are uncorrelated with u (always true for nonstochastic X’s) • E(ui) = 0 • Homoscedasticity: var(ui) = σ² • No autocorrelation: cov(ui, uj) = 0, i ≠ j • No exact collinearity between X2 and X3 • For hypothesis testing: ui ~ N(0, σ²)

  5. Multicollinearity • Two variables are collinear if one variable is an exact linear function of the other • For example, X2i = 3 + 2X3i or X2i = 4X3i • In this case, a model with two explanatory variables collapses to a model with one, because X2 and X3 do not vary independently • The individual effects of X2 and X3 cannot be isolated • B2 and B3 cannot be estimated • Multicollinearity refers to multiple cases of collinearity in models with more than 2 explanatory variables • Perfect collinearity is rare, but high or near-perfect collinearity is common.
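A minimal sketch of why exact collinearity prevents estimation, assuming the standard deviation-form OLS formulas in which b2 and b3 share the denominator (Σx2²)(Σx3²) – (Σx2x3)². The helper names and the numbers are made up for illustration; they are not data from the chapter.

```python
def centered(values):
    """Return the deviations of each value from the sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def common_denominator(x2, x3):
    """Denominator shared by the OLS formulas for b2 and b3 (deviation form)."""
    d2, d3 = centered(x2), centered(x3)
    s22 = sum(a * a for a in d2)               # sum of squared deviations of X2
    s33 = sum(b * b for b in d3)               # sum of squared deviations of X3
    s23 = sum(a * b for a, b in zip(d2, d3))   # sum of cross products
    return s22 * s33 - s23 ** 2


# Hypothetical values chosen only to illustrate the point.
x3 = [1.0, 2.0, 4.0, 7.0]
x2_exact = [3 + 2 * v for v in x3]   # exact collinearity: X2 = 3 + 2*X3
x2_near = [5.1, 7.0, 11.0, 17.0]     # first value nudged: near collinearity

exact_denominator = common_denominator(x2_exact, x3)
near_denominator = common_denominator(x2_near, x3)
print(exact_denominator, near_denominator)
```

With exact collinearity the denominator is zero, so the slope formulas are undefined. With near collinearity it is small but positive, so estimates exist but tend to be imprecise.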

  6. OLS Estimators • Choose values for unknown parameters so as to minimize the RSS • As in the two-variable case, calculus and some algebra yield the formulas for the intercept and slope parameters • Equations for b2 and b3 are symmetric with common denominators

  7. Variances and Standard Errors

  8. OLS Estimator of Population Variance
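The transcript keeps only the titles of the formula slides, so the following is a sketch under stated assumptions rather than the slides' own derivation. It uses the usual textbook deviation-form expressions for the three-variable model: symmetric formulas for b2 and b3 with a common denominator, the variance estimator RSS/(n – 3), and the corresponding standard errors of the slopes. The function name and the data are hypothetical.

```python
import math


def fit_three_variable(y, x2, x3):
    """OLS for Y = B1 + B2*X2 + B3*X3 + u using deviation-form formulas."""
    n = len(y)
    my, m2, m3 = sum(y) / n, sum(x2) / n, sum(x3) / n
    dy = [v - my for v in y]
    d2 = [v - m2 for v in x2]
    d3 = [v - m3 for v in x3]

    s22 = sum(a * a for a in d2)
    s33 = sum(a * a for a in d3)
    s23 = sum(a * b for a, b in zip(d2, d3))
    sy2 = sum(a * b for a, b in zip(dy, d2))
    sy3 = sum(a * b for a, b in zip(dy, d3))
    denom = s22 * s33 - s23 ** 2          # zero under exact collinearity

    b2 = (sy2 * s33 - sy3 * s23) / denom  # symmetric with b3
    b3 = (sy3 * s22 - sy2 * s23) / denom
    b1 = my - b2 * m2 - b3 * m3

    residuals = [yi - (b1 + b2 * a + b3 * b) for yi, a, b in zip(y, x2, x3)]
    rss = sum(e * e for e in residuals)
    sigma2_hat = rss / (n - 3)            # three estimated parameters

    return {
        "b1": b1, "b2": b2, "b3": b3,
        "rss": rss, "sigma2_hat": sigma2_hat,
        "se_b2": math.sqrt(sigma2_hat * s33 / denom),
        "se_b3": math.sqrt(sigma2_hat * s22 / denom),
    }


# Hypothetical data: y_exact is generated as 1 + 2*X2 + 3*X3 with no error term.
x2 = [1.0, 2.0, 3.0, 4.0, 5.0]
x3 = [2.0, 1.0, 4.0, 3.0, 6.0]
y_exact = [1 + 2 * a + 3 * b for a, b in zip(x2, x3)]
y_noisy = [9.1, 7.9, 19.0, 18.1, 28.9]   # same values with small made-up errors

exact = fit_three_variable(y_exact, x2, x3)
noisy = fit_three_variable(y_noisy, x2, x3)
print(exact["b1"], exact["b2"], exact["b3"])
print(noisy["b2"], noisy["se_b2"], noisy["sigma2_hat"])
```

Dividing a slope estimate by its standard error gives the t-statistic for the null hypothesis that the coefficient is zero, with n – 3 degrees of freedom.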

  9. Properties of OLS Estimators • OLS estimators for the multiple linear regression model are BLUE • Linear • Unbiased • Efficient: minimum variance among linear, unbiased estimators

  10. Goodness of Fit, R², and Hypothesis Testing • Multiple coefficient of determination, R² • TSS = ESS + RSS • R² = ESS/TSS = 1 – (RSS/TSS) • If ui ~ N(0, σ²), then the bi are normally distributed with means Bi, i = 1, 2, 3 • t = (bi – Bi)/se(bi) follows the t distribution with n – 3 d.f. • Example: Antique Clock Auction

  11. Testing B2 = B3 = 0, or R² = 0 • H0: B2 = B3 = 0 is equivalent to H0: R² = 0 • Test of the overall significance of the multiple regression • Degrees of freedom • TSS: n – 1 always • RSS: n – k • ESS: k – 1 • F = (ESS/d.f.)/(RSS/d.f.) ~ F with (k – 1, n – k) d.f. • (Variance explained by X2, X3)/(unexplained variance) • See Tables 8-1 and 8-2 (Antique Clock Example).

  12. Table 8-1 ANOVA table for the three-variable regression.

  13. Table 8-2 ANOVA table for the clock auction price example.

  14. Relationship between F and R² • F and R² are directly related • R² = 0, F = 0 • Larger R², larger F • R² = 1, F is infinite • H0: B2 = B3 = 0 is equivalent to H0: R² = 0 • See Table 8-3.
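The link between F and R² can be sketched as follows, assuming the usual identity F = [R²/(k – 1)] / [(1 – R²)/(n – k)], which follows from dividing ESS and RSS by TSS in the ANOVA form. The function names and the sums of squares are hypothetical, not values from Tables 8-1 to 8-3.

```python
def f_from_sums(ess, rss, n, k):
    """Overall F-statistic from the ANOVA decomposition."""
    return (ess / (k - 1)) / (rss / (n - k))


def f_from_r2(r2, n, k):
    """The same F-statistic written in terms of R-squared."""
    if r2 >= 1.0:
        return float("inf")   # perfect fit: RSS is zero, so F is unbounded
    return (r2 / (k - 1)) / ((1.0 - r2) / (n - k))


# Hypothetical values: 20 observations, 3 parameters, TSS split evenly.
n, k = 20, 3
tss, ess = 100.0, 50.0
rss = tss - ess

f_anova = f_from_sums(ess, rss, n, k)
f_r2 = f_from_r2(ess / tss, n, k)
print(f_anova, f_r2)
```

Both routes give the same number, which is why testing B2 = B3 = 0 and testing R² = 0 amount to the same test.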

  15. Table 8-3 ANOVA table in terms of R2.

  16. Specification Bias • Suppose we ran a regression of Antique Clock Auction prices against age and number of bidders separately. • How could we compare these regressions to the multiple regression using both age and number of bidders as explanatory variables? • Since both age and number of bidders contribute significantly to the explanation of prices in these regressions (by the t and F tests), the one- and two-variable models summarized in Table 8-4 are misspecified: each omits a relevant explanatory variable.

  17. Table 8-4 A comparison of four models of antique clock auction prices.

  18. Comparing R² Values: The Adjusted R² • Note two properties of R² • R² values from separate regressions on the same dependent variable but with different explanatory variables do not take into account the different degrees of freedom (k – 1, n – k, etc.) • R² never decreases, and almost always increases, when more explanatory variables are added to a regression equation. • Solution: the adjusted R² • Adj. R² = 1 – {(1 – R²)[(n – 1)/(n – k)]}

  19. Properties of the Adjusted R² • If k > 1, adj. R² ≤ unadj. R² • Unadjusted R² is never negative, but the adjusted R² can become negative (when R² is very small). • The adj. R² can be compared across regressions with the same dependent variable. • It is common practice to add explanatory variables as long as the adj. R² increases.
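A short sketch of the adjusted R² formula, Adj. R² = 1 – (1 – R²)(n – 1)/(n – k), and of the properties listed above. The function name and the input values are hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k estimated parameters."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)


# Hypothetical inputs chosen to show each property.
moderate = adjusted_r2(0.5, 20, 3)         # below the unadjusted 0.5
tiny = adjusted_r2(0.01, 10, 3)            # negative when R-squared is very small
intercept_only = adjusted_r2(0.5, 20, 1)   # k = 1: no adjustment at all
print(moderate, tiny, intercept_only)
```

The penalty factor (n – 1)/(n – k) grows with k, so an extra variable raises the adjusted value only if it lowers 1 – R² by enough to offset the lost degree of freedom.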

  20. A Problem with Adj. R² • The adj. R² increases as additional explanatory variables are added if |t| > 1 for the null hypothesis that the last added variable’s coefficient = 0. See Table 8-4. • Note that the square of the t-value for the age coefficient in row 2 is approximately the F-value. • This is because the square of a t variable with k d.f. follows the F distribution with (1, k) d.f. • In this example, both adj. and unadj. R² increase as explanatory variables are added. • Should both age and number of bidders be added? Yes, but to decide formally, use the F-test.

  21. Restricted Least Squares • Use Table 8-4 to test whether both age and number of bidders should be added. • Call the regression in row 1 the restricted regression • Call the regression in row 4 the unrestricted regression • Calculate F = [(R²ur – R²r)/m] / [(1 – R²ur)/(n – k)], where R²r and R²ur are the R² values of the restricted and unrestricted regressions, m is the number of restrictions, n is the number of observations, and k is the number of parameters in the unrestricted regression.

  22. Restricted Least Squares • Use the results in Table 8-4 to calculate F • P(F > 117) << 0.01 • Using both age and number of bidders as explanatory variables adds significantly to the explanatory power of the regression.
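A minimal sketch of the restricted least squares F-test, assuming the R² form of the statistic, F = [(R²ur – R²r)/m] / [(1 – R²ur)/(n – k)], which is valid when both regressions have the same dependent variable. The function name and the numbers are hypothetical; they are not the Table 8-4 values.

```python
def restricted_f(r2_restricted, r2_unrestricted, n, k, m):
    """F-statistic for m restrictions.

    n is the number of observations and k is the number of parameters
    in the unrestricted regression.
    """
    numerator = (r2_unrestricted - r2_restricted) / m
    denominator = (1.0 - r2_unrestricted) / (n - k)
    return numerator / denominator


# Hypothetical case: adding 2 variables lifts R-squared from 0.2 to 0.8,
# with 24 observations and 4 parameters in the unrestricted regression.
f_stat = restricted_f(0.2, 0.8, n=24, k=4, m=2)
print(f_stat)
```

The statistic is compared with the F distribution with (m, n – k) degrees of freedom; a large value means the added variables contribute jointly to the fit.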

  23. How to Decide Whether to Add Explanatory Variables • Single variable (one variable at a time), add if • t-test is significant (H0: coefficient = 0) • Adj. R² increases • For groups of variables • Restricted least squares F-test • “Unrestricted” regression has more explanatory variables • “Restricted” regression has fewer explanatory variables • The “number of restrictions” m is the difference in the number of parameters (coefficients) in the two equations

  24. Example from ElectricExcel2.xls • Calculate F • P(F(3, 31) > 0.86) > 0.25 • Not significant, even at the 10% level • Adding the 3 explanatory variables does not significantly increase explanatory power
