
Multiple Regression Analysis



Presentation Transcript


  1. Multiple Regression Analysis: y = b0 + b1x1 + b2x2 + . . . + bkxk + u. 1. Estimation. Slides by K. Clark, adapted from P. Anderson.

  2. Recap of Simple Regression Assumptions: • The population model is linear in parameters: y = b0 + b1x + u [SLR.1] • There is a random sample of size n, {(xi, yi): i=1, 2, …, n}, from the population model. [SLR.2] • Assume E(u|x) = 0 and thus E(ui|xi) = 0 [SLR.3] • Assume there is variation in the xi [SLR.4] • Var(u|x) = σ2 [SLR.5]

  3. Recap of Simple Regression • Applying the least squares method yields estimators of b0 and b1 • R2 = 1 - (SSR/SST) is a measure of goodness of fit • Under SLR.1-SLR.4 the OLS estimators are unbiased… • …and, with SLR.5, have variances which can be estimated from the sample…

  4. Recap of Simple Regression • … if we estimate σ2 with SSR/(n-2). • Adding SLR.6, Normality, gives normally distributed errors and allows us to state that … …which is the basis of statistical inference. Alternative, useful, functional forms are possible.

  5. Limitations of Simple Regression • In simple regression we explicitly control for only a single explanatory variable. • We deal with this by assuming SLR.3 • e.g. wage = β0 + β1educ + β2exper + u • Simple regression of wage on educ puts exper in u and assumes educ and u independent. • Simple regression puts a lot of weight on conditional mean independence.

  6. Multiple Regression Model In the population we assume y = b0 + b1x1 + b2x2 + . . . + bkxk + u • We are still explaining y. • There are k explanatory variables. • There are k+1 parameters. • k = 1 gets us back to simple regression.

  7. Parallels with Simple Regression • b0 is still the intercept • b1 to bk all called slope parameters • u is still the error term (or disturbance) • Still need to make a zero conditional mean assumption, so now assume that • E(u|x1,x2, …,xk) = 0 or E(u|x) = 0 • Still minimizing the sum of squared residuals, however...

  8. Applying Least Squares

  9. Some Important Notation • xij is the i’th observation on the j’th explanatory variable • e.g. x32 is the 3rd observation on explanatory variable 2 • Not such a problem when we use variable names e.g. educ3

  10. The First Order Conditions

  11. The First Order Conditions • There are k + 1 first order conditions, and solving them by hand is hard. • A matrix approach is “easier” but beyond the scope of ES5611. See Wooldridge, Appendix E. • In general each estimator is a function of all the xj and of y.
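The equations on the least squares and first order condition slides did not survive in the transcript. As a sketch, the standard textbook statement is given below; the hat notation for the OLS estimates is an assumption here, not taken from the slides. The first line is the minimisation problem, the second line gives the k + 1 first order conditions (one for the intercept and one for each slope).

```latex
\min_{\hat b_0,\dots,\hat b_k}\ \sum_{i=1}^{n}\bigl(y_i-\hat b_0-\hat b_1x_{i1}-\dots-\hat b_kx_{ik}\bigr)^2

\sum_{i=1}^{n}\bigl(y_i-\hat b_0-\hat b_1x_{i1}-\dots-\hat b_kx_{ik}\bigr)=0,
\qquad
\sum_{i=1}^{n}x_{ij}\bigl(y_i-\hat b_0-\hat b_1x_{i1}-\dots-\hat b_kx_{ik}\bigr)=0,\quad j=1,\dots,k.
```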

  12. Interpreting Multiple Regression

  13. Example Consider the multiple regression model: wage = β0 + β1educ + β2exper + u Using the data in wage3.dta…

  14. Example (Stata output)

      use http://www.ses.man.ac.uk/clark/es561/wage3
      . regress wage educ exper

            Source |       SS       df       MS              Number of obs =     526
      -------------+------------------------------           F(  2,   523) =   75.99
             Model |   1612.2545     2  806.127251           Prob > F      =  0.0000
          Residual |  5548.15979   523  10.6083361           R-squared     =  0.2252
      -------------+------------------------------           Adj R-squared =  0.2222
             Total |  7160.41429   525  13.6388844           Root MSE      =   3.257

      ------------------------------------------------------------------------------
              wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
              educ |   .6442721   .0538061    11.97   0.000     .5385695    .7499746
             exper |   .0700954   .0109776     6.39   0.000     .0485297    .0916611
             _cons |  -3.390539   .7665661    -4.42   0.000    -4.896466   -1.884613
      ------------------------------------------------------------------------------

  15. Example - interpretation The fitted regression line is: predicted wage = -3.39 + 0.644 educ + 0.070 exper • Holding experience fixed, a one-year increase in education increases the predicted hourly wage by 64 cents. • Holding education fixed, a one-year increase in experience increases the predicted hourly wage by 7 cents. • When education and experience are both zero, the predicted wage is -$3.39!
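The interpretation above can be checked with a few lines of arithmetic. The sketch below copies the coefficients from the Stata output; the worker with 12 years of education and 10 years of experience is a hypothetical illustration, not a case from wage3.dta.

```python
# Coefficients copied from the Stata output (regress wage educ exper).
B_CONS = -3.390539
B_EDUC = 0.6442721
B_EXPER = 0.0700954


def predicted_wage(educ, exper):
    """Fitted hourly wage for given years of education and experience."""
    return B_CONS + B_EDUC * educ + B_EXPER * exper


# Hypothetical worker: 12 years of education, 10 years of experience.
base = predicted_wage(12, 10)

# Ceteris paribus effects: change one regressor, hold the other fixed.
educ_effect = predicted_wage(13, 10) - base
exper_effect = predicted_wage(12, 11) - base

print(f"predicted wage:          {base:.2f}")
print(f"one more year of educ:   {educ_effect:.3f}")
print(f"one more year of exper:  {exper_effect:.3f}")
print(f"educ = exper = 0:        {predicted_wage(0, 0):.2f}")
```

Because the fitted line is linear, the effect of one extra year of education is the same whatever values educ and exper start from.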

  16. Simple vs Multiple Estimates

  17. Assumptions for Unbiasedness Population model is linear in parameters: y = b0 + b1x1 + b2x2 +…+ bkxk+ u [MLR.1] {(xi1, xi2,…, xik, yi): i=1, 2, …, n} is a random sample from the population model, so that yi = b0 + b1xi1 + b2xi2 +…+ bkxik+ ui [MLR.2] E(u|x1, x2,… xk) = 0, implying that all of the explanatory variables are uncorrelated with the error [MLR.3] None of the x’s is constant, and there are no exact linear relationships among them [MLR.4]

  18. Unbiasedness of OLS Under these assumptions: All of the OLS estimators of the parameters of the multiple regression model are unbiased estimators. This is not generally true if any one of MLR.1-MLR.4 is violated. Note that MLR.4 is more involved than SLR.4 and rules out perfect multicollinearity. Is MLR.3 more plausible than SLR.3?

  19. Too Many or Too Few Variables • What happens if we include variables in our specification that don’t belong? • OLS estimators remain unbiased • There will, however, be an impact on the variance of the estimators • What if we exclude a variable from our specification that does belong? • OLS will usually be biased

  20. Omitted Variable Bias

  21. Omitted Variable Bias (cont)

  22. Omitted Variable Bias (cont)

  23. Omitted Variable Bias (cont)

  24. Summary of Direction of Bias

  25. Omitted Variable Bias Summary • Two cases where bias is equal to zero • b2 = 0, that is, x2 doesn’t really belong in the model • x1 and x2 are uncorrelated in the sample • If the correlation between x2 and x1 and the correlation between x2 and y have the same sign, bias will be positive • If the two correlations have opposite signs, bias will be negative
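The summary above can be illustrated on simulated data. The sketch below is an assumption-laden illustration, not the slides' own derivation: the data are generated with made-up parameter values (b1 = 2, b2 = 3, and x2 positively related to x1), and the names b1_tilde, b1_hat, b2_hat and delta1 are chosen here. It checks the standard in-sample identity that the simple regression slope equals the multiple regression slope on x1 plus the slope on x2 times the slope from regressing x2 on x1. Only the standard library is used, so the two-regressor slopes are computed by partialling out rather than with a matrix library.

```python
import random


def simple_ols(x, y):
    """Return (intercept, slope) from regressing y on x with a constant."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def partial_slope(x_main, x_other, y):
    """Slope on x_main when y is regressed on x_main and x_other.

    Partialling out: regress x_main on x_other, keep the residuals,
    then regress y on those residuals.
    """
    a, d = simple_ols(x_other, x_main)
    resid = [xm - a - d * xo for xm, xo in zip(x_main, x_other)]
    return sum(r * yi for r, yi in zip(resid, y)) / sum(r * r for r in resid)


rng = random.Random(0)  # fixed seed; simulated data, not wage3.dta
n = 500
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [0.5 * a + rng.gauss(0, 1) for a in x1]  # x2 positively related to x1
y = [1 + 2 * a + 3 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]  # b2 = 3 > 0

b1_tilde = simple_ols(x1, y)[1]    # x2 omitted
b1_hat = partial_slope(x1, x2, y)  # x2 included
b2_hat = partial_slope(x2, x1, y)
delta1 = simple_ols(x1, x2)[1]     # slope from regressing x2 on x1

print("simple regression slope:  ", b1_tilde)
print("b1_hat + b2_hat * delta1: ", b1_hat + b2_hat * delta1)
print("multiple regression slope:", b1_hat)
```

With both b2 and the x1, x2 relationship positive, the slope from the regression that omits x2 comes out above the slope from the regression that includes it, which is the positive-bias case in the summary.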

  26. Omitted Variable Bias: Example Suppose the model log(wage) = b0 + b1educ + b2abil + u satisfies MLR.1-MLR.4 abil is typically hard to observe so we estimate log(wage) = b0 + b1educ + u

  27. The More General Case • Technically, can only sign the bias for the more general case if all of the included x’s are uncorrelated • Typically, then, we work through the bias assuming the x’s are uncorrelated, as a useful guide even if this assumption is not strictly true

  28. Variance of the OLS Estimators Assume Var(u|x1, x2,…, xk) = σ2 (Homoskedasticity) Let x stand for (x1, x2,…, xk) Assuming that Var(u|x) = σ2 also implies that Var(y|x) = σ2 The 4 assumptions for unbiasedness, plus this homoskedasticity assumption, are known as the Gauss-Markov assumptions

  29. Variance of OLS (cont)

  30. Components of OLS Variances • The error variance: a larger σ2 implies a larger variance for the OLS estimators • The total sample variation: a larger SSTj implies a smaller variance for the estimators • Linear relationships among the independent variables: a larger Rj2 implies a larger variance for the estimators (multicollinearity)
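The three components listed above enter through the standard sampling variance formula for a slope estimator under the Gauss-Markov assumptions. The equation on the preceding slide is missing from the transcript, so the statement below is the usual textbook form rather than a copy of the slide; the hat notation is an assumption.

```latex
\operatorname{Var}(\hat b_j)=\frac{\sigma^2}{SST_j\,\bigl(1-R_j^2\bigr)},
\qquad
SST_j=\sum_{i=1}^{n}\bigl(x_{ij}-\bar x_j\bigr)^2,
```

where R_j^2 is the R-squared from regressing x_j on all the other explanatory variables (including an intercept). As R_j^2 approaches 1 the denominator approaches zero, which is why strong multicollinearity inflates the variance.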

  31. Misspecified Models

  32. Misspecified Models (cont) • While the variance of the estimator is smaller for the misspecified model, unless b2 = 0 the misspecified model is biased • Corollary: including an extraneous or irrelevant variable cannot decrease the variance of the estimator • As the sample size grows, the variance of each estimator shrinks to zero, making the variance difference less important

  33. The Gauss-Markov Theorem • Given our 5 Gauss-Markov Assumptions it can be shown that OLS is “BLUE” • Best • Linear • Unbiased • Estimator • Thus, if the assumptions hold, use OLS

  34. Estimating the Error Variance • df = n – (k + 1), or df = n – k – 1 • df (i.e. degrees of freedom) is the (number of observations) – (number of estimated parameters)
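As a check on the degrees-of-freedom rule, the sketch below recomputes the error variance estimate SSR/(n - k - 1) from the figures in the Stata output for the wage example (Residual SS, number of observations, two regressors). The variable names are chosen here for illustration.

```python
import math

ssr = 5548.15979  # Residual SS from the Stata output
n = 526           # Number of obs
k = 2             # explanatory variables: educ, exper

df = n - (k + 1)               # degrees of freedom
sigma2_hat = ssr / df          # estimate of the error variance
root_mse = math.sqrt(sigma2_hat)

print("df:        ", df)
print("sigma2_hat:", round(sigma2_hat, 4))
print("Root MSE:  ", round(root_mse, 3))
```

The results line up with the df and MS columns of the Residual row and with the Root MSE reported by Stata.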

  35. Goodness-of-Fit R2 can also be used in the multiple regression context. R2 = SSE/SST = 1 – SSR/SST, with 0 ≤ R2 ≤ 1. R2 has the same interpretation: the proportion of the variation in y explained by the independent (x) variables

  36. More about R-squared • R2 can never decrease when another independent variable is added to a regression, and usually will increase • This is because SSR is non-increasing in k • Because R2 will usually increase with the number of independent variables, it is not a good way to compare models
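The claim that R2 cannot fall when a regressor is added can be seen on simulated data. In the sketch below, which uses made-up parameter values and the standard library only, x2 is pure noise that plays no part in generating y, yet adding it still does not lower R2. The helper names are chosen here for illustration.

```python
import random


def simple_ols(x, y):
    """Return (intercept, slope) from regressing y on x with a constant."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def partial_slope(x_main, x_other, y):
    """Slope on x_main in a regression of y on x_main and x_other."""
    a, d = simple_ols(x_other, x_main)
    resid = [xm - a - d * xo for xm, xo in zip(x_main, x_other)]
    return sum(r * yi for r, yi in zip(resid, y)) / sum(r * r for r in resid)


def r_squared(y, fitted):
    """R2 = 1 - SSR/SST."""
    ybar = sum(y) / len(y)
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - ybar) ** 2 for yi in y)
    return 1 - ssr / sst


rng = random.Random(1)  # fixed seed; simulated data
n = 200
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [rng.gauss(0, 1) for _ in range(n)]  # irrelevant: does not enter y
y = [1 + 2 * a + rng.gauss(0, 1) for a in x1]

# Model A: y on x1 only.
a0, a1 = simple_ols(x1, y)
r2_one = r_squared(y, [a0 + a1 * v for v in x1])

# Model B: y on x1 and x2.
b1 = partial_slope(x1, x2, y)
b2 = partial_slope(x2, x1, y)
b0 = sum(y) / n - b1 * sum(x1) / n - b2 * sum(x2) / n
r2_two = r_squared(y, [b0 + b1 * v + b2 * w for v, w in zip(x1, x2)])

print("R2 with x1 only:  ", r2_one)
print("R2 with x1 and x2:", r2_two)
```

Any increase here is mechanical rather than evidence that x2 belongs in the model.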

  37. Adjusted R-Squared (Section 6.3) An alternative measure of goodness of fit is sometimes used The adjusted R2 takes into account the number of variables in a model, and may decrease

  38. Adjusted R-Squared (cont) It’s easy to see that the adjusted R2 is just 1 – (1 – R2)(n – 1) / (n – k – 1), but Stata will give you both R2 and adj-R2 You can compare the fit of 2 models (with the same y) by comparing the adj-R2 You cannot use the adj-R2 to compare models with different y’s (e.g. y vs. ln(y))

  39. Stata output again

      use http://www.ses.man.ac.uk/clark/es561/wage3
      . regress wage educ exper

            Source |       SS       df       MS              Number of obs =     526
      -------------+------------------------------           F(  2,   523) =   75.99
             Model |   1612.2545     2  806.127251           Prob > F      =  0.0000
          Residual |  5548.15979   523  10.6083361           R-squared     =  0.2252
      -------------+------------------------------           Adj R-squared =  0.2222
             Total |  7160.41429   525  13.6388844           Root MSE      =   3.257

      ------------------------------------------------------------------------------
              wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
              educ |   .6442721   .0538061    11.97   0.000     .5385695    .7499746
             exper |   .0700954   .0109776     6.39   0.000     .0485297    .0916611
             _cons |  -3.390539   .7665661    -4.42   0.000    -4.896466   -1.884613
      ------------------------------------------------------------------------------
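The goodness-of-fit figures in this output can be reproduced from the sums of squares alone. The sketch below uses the Model, Residual and Total SS from the table, with n = 526 and k = 2, and applies the standard adjusted R2 formula 1 - (1 - R2)(n - 1)/(n - k - 1). The variable names are chosen here for illustration.

```python
ss_model = 1612.2545    # Model SS (explained)
ss_resid = 5548.15979   # Residual SS
ss_total = 7160.41429   # Total SS
n, k = 526, 2           # observations, explanatory variables

# Two equivalent ways of writing R2.
r2 = 1 - ss_resid / ss_total
r2_alt = ss_model / ss_total

# Adjusted R2 penalises the number of regressors.
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

print("R-squared:    ", round(r2, 4))
print("Adj R-squared:", round(adj_r2, 4))
```

Both rounded values agree with the R-squared and Adj R-squared lines printed by Stata.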

  40. Summary: Multiple Regression • Many of the principles the same as simple regression • Functional form results the same • Need to be aware of the role of the assumptions • Have only focused on estimation; consider inference in the next lecture

  41. Next Time • Remember: no lectures or classes in Reading Week (week beginning 1st November) • Read Wooldridge! • Next topic is inference in (multiple) regression (Chapter 4)
