
Linear Regression with Multiple Regressors

Linear Regression with Multiple Regressors: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$. Definition, estimation, properties. Outline: omitted variable bias; multiple regression and OLS; measures of fit; sampling distribution of the OLS estimator.


Presentation Transcript


  1. Linear Regression with Multiple Regressors: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$. Definition, Estimation, Properties. Economics: 332 - 11

  2. Outline • Omitted variable bias • Multiple regression and OLS • Measures of fit • Sampling distribution of the OLS estimator

  3. Omitted Variable Bias (SW Section 6.1) The error u arises because of factors, or variables, that influence Y but are not included in the regression function. There are always omitted variables. Sometimes, the omission of those variables can lead to bias in the OLS estimator.

  4. Omitted variable bias, ctd. The bias in the OLS estimator that occurs as a result of an omitted factor, or variable, is called omitted variable bias. For omitted variable bias to occur, the omitted variable “Z” must satisfy two conditions: • Z is a determinant of Y (i.e. Z is part of u); and • Z is correlated with the regressor X (i.e. corr(Z,X) ≠ 0). If both hold, then $\rho_{Xu} \neq 0$ and the OLS estimator is biased and inconsistent. Both conditions must hold for the omission of Z to result in omitted variable bias.
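The two conditions can be illustrated with a small simulation. This is a minimal sketch, not taken from the slides: the data-generating process, the coefficient values (0.8, 2.0, 1.0), the sample size, and the helper name `ols` are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is reproducible
n = 50_000                       # hypothetical sample size

z = rng.normal(size=n)               # the omitted variable Z
x = 0.8 * z + rng.normal(size=n)     # condition 2: corr(Z, X) != 0
u = 2.0 * z + rng.normal(size=n)     # condition 1: Z is part of the error u
y = 1.0 + 1.0 * x + u                # the true slope on X is 1.0


def ols(y, *regressors):
    """Return OLS coefficients (intercept first) via least squares."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


slope_short = ols(y, x)[1]      # Z omitted: the slope absorbs part of Z's effect
slope_long = ols(y, x, z)[1]    # Z included: the slope is close to 1.0

print(f"slope with Z omitted:  {slope_short:.3f}")
print(f"slope with Z included: {slope_long:.3f}")
```

In this assumed design the slope from the regression that omits Z converges to 1 + cov(X,u)/var(X) = 1 + 1.6/1.64, roughly 1.98, rather than 1, and a larger sample does not remove the gap. That is what "biased and inconsistent" means here.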

  5. Motivation for adding more independent variables ($x_i$) • Omitted Variable Bias • The assumption that has the largest impact is SLR.4: $E(u \mid X) = 0$. • u includes variables other than X that affect Y. In other words, u includes the variables that are omitted. • Assume that there is one omitted variable Z: • $u = \beta_2 Z + \varepsilon$

  6. Motivation for adding more independent variables ($x_i$) Omitted Variable Bias: the case of no bias

  7. Motivation for adding more independent variables ($x_i$) Omitted Variable Bias

  8. Motivation for adding more independent variables ($x_i$) • How to deal with Omitted Variable Bias? • We thus need regressions that have more than one regressor.

  9. Adding Many Regressors: Multiple Regression

  10. Estimation of Multiple Regression

  11. Estimation of Multiple Regression

  12. Estimation of Multiple Regression

  13. Estimation of Multiple Regression

  14. Estimation: A Partialling-Out Form

  15. Estimation: A Partialling-Out Form
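The body of the partialling-out slides did not survive in this transcript, so the following is only a hedged sketch of the standard result the title refers to: the multiple-regression slope on $x_1$ equals the slope from regressing y on the residuals of $x_1$ after $x_1$ has been regressed on the other regressors. The simulated data and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500                                   # hypothetical sample size
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)        # x1 and x2 are correlated
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

ones = np.ones(n)

# Full multiple regression of y on (1, x1, x2).
X_full = np.column_stack([ones, x1, x2])
b_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)

# Step 1: regress x1 on the other regressors (1, x2) and keep the residuals.
X_other = np.column_stack([ones, x2])
g, *_ = np.linalg.lstsq(X_other, x1, rcond=None)
r1 = x1 - X_other @ g

# Step 2: slope from regressing y on those residuals. No intercept is needed
# because r1 has mean zero by construction.
b1_partial = (r1 @ y) / (r1 @ r1)

print(f"slope on x1, multiple regression: {b_full[1]:.6f}")
print(f"slope on x1, partialling out:     {b1_partial:.6f}")
```

The two numbers agree up to floating-point error, which is the sense in which the coefficient on $x_1$ measures the effect of $x_1$ after the other regressors have been "partialled out".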

  16. Example: Interpreting Multiple Regression • An equation explaining log(wage). • educ: years of education. • exper: years of labor market experience. • tenure: years with the current employer. • 526 observations on workers. • The coefficient .092 on educ means that, holding exper and tenure fixed, another year of education is predicted to increase log(wage) by .092, which translates into an approximate 9.2 percent [100(.092)] increase in wage.
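The 9.2 percent figure is the usual log approximation. As a small worked check, the sketch below compares it with the exact percentage change implied by a .092 change in log(wage). The variable names are illustrative.

```python
import math

b_educ = 0.092                               # coefficient on educ from the slide

approx_pct = 100 * b_educ                    # log approximation: 100 * coefficient
exact_pct = 100 * (math.exp(b_educ) - 1)     # exact change implied by the log model

print(f"approximate: {approx_pct:.1f}%")
print(f"exact:       {exact_pct:.2f}%")
```

The exact figure is about 9.6 percent, so the approximation is close for a coefficient of this size. The gap widens as the coefficient grows.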

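  17. OLS Fitted Values and Residuals The fitted value for observation i is defined just as in the simple regression case: $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \dots + \hat{\beta}_k x_{ik}$. The residual for observation i is defined just as in the simple regression case: $\hat{u}_i = y_i - \hat{y}_i$.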
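A minimal sketch of computing fitted values (predicted y) and residuals (y minus predicted y) with NumPy. The six data points are made up for illustration and do not come from the slides.

```python
import numpy as np

# Hypothetical data: 6 observations, 2 regressors.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([3.1, 3.9, 7.2, 7.8, 11.1, 11.9])

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones_like(x1), x1, x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = X @ beta_hat     # fitted values
u_hat = y - y_hat        # residuals

print("coefficients:", np.round(beta_hat, 4))
print("sum of residuals:", u_hat.sum())
print("X'u_hat:", X.T @ u_hat)
```

By construction the residuals sum to zero and are uncorrelated in the sample with every regressor, so the last two printed quantities are zero up to floating-point error.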

  18. Simple vs Multiple Reg Estimate

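  19. Goodness-of-Fit The $R^2$ is the fraction of the variance explained, with the same definition as in regression with a single regressor: $R^2 = \frac{ESS}{TSS} = 1 - \frac{SSR}{TSS}$, where $ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$, $SSR = \sum_{i=1}^{n} \hat{u}_i^2$, $TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$. The $R^2$ never decreases when you add another regressor (why?), and in practice it almost always increases, which is a bit of a problem for a measure of “fit”: the over-fitting problem of R-squared.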

  20. Goodness-of-Fit The $\bar{R}^2$ (the “adjusted $R^2$”) corrects this problem by “penalizing” you for including another regressor: the $\bar{R}^2$ does not necessarily increase when you add another regressor. Adjusted $R^2$: $\bar{R}^2 = 1 - \frac{n-1}{n-k-1}\,\frac{SSR}{TSS}$. Note that $\bar{R}^2 < R^2$; however, if n is large the two will be very close. Adding more variables increases k and may decrease $\bar{R}^2$.
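The two measures of fit can be compared directly. The sketch below assumes the standard formulas, R-squared = 1 - SSR/TSS and adjusted R-squared = 1 - [(n-1)/(n-k-1)] SSR/TSS, and adds a pure-noise regressor to a simulated model. The function name `r2_and_adjusted` and the data are hypothetical.

```python
import numpy as np


def r2_and_adjusted(y, X):
    """Return (R^2, adjusted R^2). X must already include a column of ones."""
    n, p = X.shape
    k = p - 1                                   # number of slope regressors
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    ssr = resid @ resid                         # sum of squared residuals
    tss = np.sum((y - y.mean()) ** 2)           # total sum of squares
    r2 = 1.0 - ssr / tss
    adj = 1.0 - (n - 1) / (n - k - 1) * ssr / tss
    return r2, adj


rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 + rng.normal(size=n)
noise = rng.normal(size=n)                      # regressor unrelated to y

X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([X_small, noise])

r2_small, adj_small = r2_and_adjusted(y, X_small)
r2_big, adj_big = r2_and_adjusted(y, X_big)

print(f"without noise regressor: R2={r2_small:.4f}, adjusted={adj_small:.4f}")
print(f"with noise regressor:    R2={r2_big:.4f}, adjusted={adj_big:.4f}")
```

R-squared cannot fall when the noise regressor is added, because OLS could always set its coefficient to zero and reproduce the smaller model's SSR. The adjusted version may move in either direction.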

  21. Unbiasedness of OLS Estimators

  22. Unbiasedness of OLS Estimators

  23. Unbiasedness of OLS Estimators

  24. Examples of Perfect Collinearity
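The slide's own examples are not preserved in this transcript. As one illustrative case (an assumption, not necessarily the one the slide used), including a constant together with dummy variables for every category makes the regressors perfectly collinear, which shows up as a rank-deficient design matrix.

```python
import numpy as np

# Hypothetical data: a constant plus dummies for two exhaustive groups.
female = np.array([1.0, 0.0, 1.0, 1.0, 0.0, 0.0])
male = 1.0 - female                       # female + male equals the constant

X = np.column_stack([np.ones(6), female, male])
rank = np.linalg.matrix_rank(X)           # fewer than 3 independent columns

# Dropping one dummy restores full column rank.
X_ok = X[:, :2]
rank_ok = np.linalg.matrix_rank(X_ok)

print(f"columns: {X.shape[1]}, rank: {rank}")
print(f"columns: {X_ok.shape[1]}, rank: {rank_ok}")
```

With three columns but rank two, X'X is singular and the OLS coefficients are not uniquely determined.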

  25. Unbiasedness of OLS Estimators

  26. Unbiasedness of OLS Estimators

  27. Variance of the OLS Estimators • Now we know that the sampling distribution of our estimate is centered around the true parameter • Want to think about how spread out this distribution is • Much easier to think about this variance under an additional assumption

  28. An Additional Assumption

  29. Variance of OLS Estimators Assuming that $\mathrm{Var}(u \mid x) = \sigma^2$ also implies that $\mathrm{Var}(y \mid x) = \sigma^2$

  30. Estimating the Error Variance • We don’t know what the error variance, $\sigma^2$, is, because we don’t observe the errors, $u_i$ • What we observe are the residuals, $\hat{u}_i$ • We can use the residuals to form an estimate of the error variance

  31. Error Variance Estimate (cont) • df = n – (k + 1), or df = n – k – 1 • df (i.e. degrees of freedom) is the (number of observations) – (number of estimated parameters)
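A sketch of the degrees-of-freedom correction, assuming the standard estimator: the sum of squared residuals divided by df = n - k - 1. The simulated model, with a true error variance of 4.0, is hypothetical and serves only to show that the estimate lands near the truth.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 5_000, 2          # observations and slope regressors (hypothetical)
sigma = 2.0              # true error standard deviation, so sigma^2 = 4.0

X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta = np.array([1.0, 0.5, -0.3])
y = X @ beta + sigma * rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
u_hat = y - X @ beta_hat             # residuals stand in for the unobserved errors

df = n - k - 1                       # observations minus estimated parameters
sigma2_hat = (u_hat @ u_hat) / df    # SSR / df

print(f"df = {df}, estimated error variance = {sigma2_hat:.3f}")
```

Dividing by df rather than n accounts for the k + 1 parameters that were estimated before the residuals could be formed.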

  32. Components of OLS Variances • The error variance: a larger $\sigma^2$ implies a larger variance for the OLS estimators • The total sample variation: a larger $SST_j$ implies a smaller variance for the estimators • Linear relationships among the independent variables: a larger $R_j^2$ implies a larger variance for the estimators
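These three components correspond to the standard homoskedastic variance formula $\mathrm{Var}(\hat{\beta}_j) = \sigma^2 / [SST_j (1 - R_j^2)]$, where $R_j^2$ comes from regressing $x_j$ on the other regressors. The formula itself is not shown in this transcript, so treat the following as a sketch under that assumption. It checks the component form against the matrix form $\sigma^2 (X'X)^{-1}$ on hypothetical data.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
sigma2 = 1.5                              # error variance, treated as known here

x2 = rng.normal(size=n)
x1 = 0.7 * x2 + rng.normal(size=n)        # x1 is correlated with x2
X = np.column_stack([np.ones(n), x1, x2])

# Matrix form: Var(beta_hat | X) = sigma^2 (X'X)^{-1}.
var_matrix = sigma2 * np.linalg.inv(X.T @ X)

# Component form for j = 1.
sst_1 = np.sum((x1 - x1.mean()) ** 2)     # total sample variation in x1
Z = np.column_stack([np.ones(n), x2])     # the other regressors
g, *_ = np.linalg.lstsq(Z, x1, rcond=None)
resid = x1 - Z @ g
r2_1 = 1.0 - (resid @ resid) / sst_1      # R^2 from regressing x1 on x2
var_component = sigma2 / (sst_1 * (1.0 - r2_1))

print(f"matrix form:    {var_matrix[1, 1]:.6f}")
print(f"component form: {var_component:.6f}")
```

The two values coincide. Raising the correlation between x1 and x2 (the 0.7 above) pushes $R_1^2$ toward 1 and inflates the variance.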

  33. Gauss-Markov Theorem

  34. Irrelevant Variables and Omitted Variables • What happens if we include variables in our specification that don’t belong? • The estimates of the other parameters are not biased, and OLS remains unbiased, although the variance of the estimators may increase • What if we exclude a variable from our specification that does belong? • OLS will usually be biased

  35. Irrelevant Variable

  36. Irrelevant Variable

  37. Omitted Variable

  38. Summary of Direction of Bias

  39. Omitted Variable Bias Summary • Two cases where the bias is equal to zero (unbiased): • $\beta_2 = 0$, that is, $x_2$ doesn’t really belong in the model • $x_1$ and $x_2$ are uncorrelated in the sample • If the correlation between $x_2$ and $x_1$ and the effect of $x_2$ on y ($\beta_2$) have the same sign, the bias will be positive • If they have opposite signs, the bias will be negative
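The direction-of-bias rules follow from an in-sample identity: the slope from the short regression of y on $x_1$ equals the long-regression slope on $x_1$ plus the long-regression slope on $x_2$ times the slope from regressing $x_2$ on $x_1$. The sketch below checks this on simulated data. The coefficient values and the helper `ols` are assumptions for illustration.

```python
import numpy as np


def ols(y, *regressors):
    """Return OLS coefficients (intercept first) via least squares."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


rng = np.random.default_rng(5)
n = 2_000
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)                     # corr(x1, x2) > 0
y = 1.0 + 1.0 * x1 + 2.0 * x2 + rng.normal(size=n)     # beta_2 > 0

b_long = ols(y, x1, x2)      # y on x1 and x2
b_short = ols(y, x1)         # y on x1 only (x2 omitted)
delta = ols(x2, x1)          # x2 on x1

bias_term = b_long[2] * delta[1]

print(f"short slope:              {b_short[1]:.4f}")
print(f"long slope + bias term:   {b_long[1] + bias_term:.4f}")
print(f"bias term (positive here): {bias_term:.4f}")
```

Both factors of the bias term are positive in this design, so the short regression overstates the slope on $x_1$. Flipping the sign of either the 0.6 or the 2.0 makes the bias negative, and setting either to zero removes it.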
