This lecture covers advanced topics in OLS regression, including working with natural logs, quadratic functions, binary variables, heteroskedasticity, instrumental variables, multicollinearity, and handling irrelevant variables.
Belarusian Economic Research and Outreach Center. Econometrics I, February 2014. Instructor: Maksym Obrizan. Lecture notes III
# 2. Advanced topics in OLS regression
# 3. Working with natural logs. Suppose that we regress log(salary) of CEOs on log(sales) of their firms. The coefficient on log(sales) is then the elasticity of salary with respect to sales.
# 4. Quadratic functions are also used quite often in applied economics to capture decreasing or increasing marginal effects. For example, consider how wage y depends on experience x. When interpreting the effect of x we need two terms: the coefficient on x and the coefficient on x².
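The reason two terms are needed can be written out explicitly. The following is a sketch assuming the usual quadratic specification; the symbols β0, β1, β2 and u are notation introduced here, not taken from the slide.

```latex
\begin{aligned}
y &= \beta_0 + \beta_1 x + \beta_2 x^2 + u, \\
\frac{\partial y}{\partial x} &= \beta_1 + 2\beta_2 x .
\end{aligned}
```

With β1 > 0 and β2 < 0, the marginal effect of x falls as x grows, which is the diminishing-returns case.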
# 5. Consider the effect of experience on wage. The first year of experience is worth about 30¢. In going from 10 to 11 years of experience, wage is predicted to increase by a smaller amount. Thus, exper has diminishing returns on wage.
# 6.
# 7. Sometimes the partial effect of one explanatory variable on the dependent variable depends on the magnitude of another explanatory variable.
# 8. R-squared can never fall when a new independent variable is added to a regression. Thus, adjusted R-squared is often used, because it imposes a penalty for adding independent variables.
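The penalty in adjusted R-squared can be illustrated with a small calculation. This is a minimal sketch using the standard formula 1 − (1 − R²)(n − 1)/(n − k − 1); the function name and the numbers are hypothetical and are not taken from the lecture.

```python
def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared for n observations and k slope regressors (plus an intercept)."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Hypothetical numbers: adding a third regressor raises R-squared only slightly.
small = adjusted_r_squared(r2=0.300, n=50, k=2)
large = adjusted_r_squared(r2=0.301, n=50, k=3)
print(round(small, 4), round(large, 4))
```

In this made-up case R-squared rises from 0.300 to 0.301, yet adjusted R-squared falls, because the gain in fit is too small to offset the lost degree of freedom.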
# 9. Recall the F test for joint significance of a group of variables. But what if the models are nonnested, that is, neither equation is a special case of the other?
# 10. Some variables take only two values (for example, male or female); they are called binary variables. How do we incorporate binary variables into regression models?
# 11. Example: if we compare a woman and a man with the same levels of education, experience, and tenure, the woman earns, on average, $1.81 less per hour than the man.
# 12. Suppose we estimate a model that allows for wage differences among four groups: married men, married women, single men, and single women. How many dummies can we include (if we also have an intercept)?
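With an intercept, only three of the four group dummies can be included; the omitted group becomes the base. The sketch below illustrates why, using hypothetical group labels and an arbitrarily chosen base group (neither comes from the lecture).

```python
groups = ["married_male", "married_female", "single_male", "single_female"]
base = "single_male"  # base group: an arbitrary choice made for this sketch

def dummy_row(group: str) -> dict:
    """Return the intercept plus one 0/1 dummy per non-base group."""
    row = {"const": 1}
    for g in groups:
        if g != base:
            row[g] = 1 if group == g else 0
    return row

sample = ["married_male", "single_female", "single_male", "married_female"]
rows = [dummy_row(g) for g in sample]

# With all four dummies, the dummy columns would sum to the constant in every
# row (perfect collinearity), which is why one group must be left out.
all_four_sum = [sum(1 for g in groups if g == obs) for obs in sample]

print(rows[2])        # an observation from the base group
print(all_four_sum)   # row sums of the four dummies
```

Each coefficient on an included dummy is then read as the difference relative to the base group.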
# 13. The linear probability model (LPM). Sometimes the dependent variable is also binary (employed or unemployed). The LPM models the response probability as linear in the parameters.
# 14. Probability of “being in labor force”. For example, 10 more years of education increase the probability of being in the labor force by 0.038(10) = 0.38.
# 15. Heteroskedasticity. In the presence of heteroskedasticity, many OLS test statistics are no longer valid. Heteroskedasticity-robust procedures are used in this case.
# 16. Example (robust standard errors are in parentheses)
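Two standard tools for heteroskedasticity are robust standard errors and the Breusch-Pagan test. The sketch below computes both for a one-regressor model on simulated data. The data-generating process, the HC0 (White) variant of the robust standard error, and the LM form of the Breusch-Pagan statistic are choices made for this illustration; the lecture's own example may use different ones.

```python
import math
import random

rng = random.Random(0)
n = 2000
x = [rng.uniform(1, 10) for _ in range(n)]
# Simulated data: the error standard deviation grows with x (heteroskedasticity).
y = [1.0 + 2.0 * xi + xi * rng.gauss(0, 1) for xi in x]

def mean(v):
    return sum(v) / len(v)

def simple_ols(x, y):
    """Intercept, slope and residuals of y on x, plus the sum of squares of x."""
    xb, yb = mean(x), mean(y)
    sxx = sum((xi - xb) ** 2 for xi in x)
    slope = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y)) / sxx
    intercept = yb - slope * xb
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return intercept, slope, resid, sxx

b0, b1, u, sxx = simple_ols(x, y)
xb = mean(x)

# Usual (homoskedasticity-only) standard error of the slope.
se_usual = math.sqrt(sum(ui ** 2 for ui in u) / (n - 2) / sxx)
# Heteroskedasticity-robust (HC0) standard error of the slope.
se_robust = math.sqrt(
    sum((xi - xb) ** 2 * ui ** 2 for xi, ui in zip(x, u)) / sxx ** 2
)

# Breusch-Pagan LM statistic: n * R-squared from regressing squared residuals on x.
u2 = [ui ** 2 for ui in u]
_, _, e, _ = simple_ols(x, u2)
u2b = mean(u2)
r2_aux = 1.0 - sum(ei ** 2 for ei in e) / sum((v - u2b) ** 2 for v in u2)
lm = n * r2_aux  # compare with the chi-square(1) 5% critical value, 3.84

print(b1, se_usual, se_robust, lm)
```

The slope estimate itself is the same in both cases; only the standard error, and therefore the t statistic, changes.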
# 17. How to test for heteroskedasticity: the Breusch-Pagan test for heteroskedasticity (BP test)
# 18. Economists are often interested in policy analysis. Does job training improve the chances of becoming employed? Will the construction of an incinerator affect house prices?
# 19. Quote from Wooldridge: “Kiel and McClain (1995) studied the effect that a new garbage incinerator had on housing values in North Andover, Massachusetts. The rumor that a new incinerator would be built in North Andover began after 1978, and construction began in 1981. We will use data on prices of houses that sold in 1978 and another sample on those that sold in 1981.”
# 20. A naïve analysis would use only the data for 1981, where nearinc is a dummy (= 1 if a house is near the incinerator, 0 otherwise).
# 21. However, the data for 1978 (prior to the rumors about construction) show that houses near the site already differed in price before the incinerator.
# 22. Did the building of a new incinerator depress housing values? The key is how the coefficient on nearinc changed between 1978 and 1981. Use the difference-in-differences estimator with the data pooled over both years.
# 23. Interpreting the results
# 24. The parameter we are interested in is the coefficient on the interaction term y81·nearinc
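The difference-in-differences idea can be sketched with four group means. The prices below are invented for illustration and are not the Kiel and McClain data.

```python
# Hypothetical house prices (in $1000s), keyed by (year, nearinc).
prices = {
    (1978, 0): [82.0, 85.0, 80.0],
    (1978, 1): [64.0, 66.0, 62.0],
    (1981, 0): [101.0, 104.0, 98.0],
    (1981, 1): [70.0, 73.0, 67.0],
}

def mean(v):
    return sum(v) / len(v)

m = {key: mean(vals) for key, vals in prices.items()}

gap_1978 = m[(1978, 1)] - m[(1978, 0)]  # near-minus-far gap before the rumor
gap_1981 = m[(1981, 1)] - m[(1981, 0)]  # near-minus-far gap after construction began
did = gap_1981 - gap_1978               # difference-in-differences
print(gap_1978, gap_1981, did)
```

In a pooled regression of price on y81, nearinc and y81·nearinc with no other controls, the OLS coefficient on y81·nearinc equals this difference of the two gaps.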
# 25. Omitted variable bias. Suppose that the true relationship includes ability, but ability is not observed, so we estimate the model without it.
# 26. Thus, when ability is correlated with the included regressors, the estimator will be biased (its expected value is not equal to the true parameter) and inconsistent (it does not converge to the true value as the sample size increases).
# 27. Instrumental variables (IV). Suppose that education x is correlated with the error term u (because u contains ability). In addition, let z be a variable that is uncorrelated with u but correlated with x.
# 28. Stata example. Use the data on married working women in MROZ.RAW to estimate the return to education. OLS results first
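The contrast between OLS and IV under omitted ability can be illustrated with a small simulation. This is a sketch on made-up data, not the MROZ data: the coefficients, the variable names, and the single-instrument estimator cov(z, y) / cov(z, x) are all assumptions of the illustration.

```python
import random

rng = random.Random(1)
n = 5000
true_return = 0.10  # assumed true coefficient on education in this simulation

abil = [rng.gauss(0, 1) for _ in range(n)]  # unobserved ability
z = [rng.gauss(0, 1) for _ in range(n)]     # instrument: unrelated to ability
educ = [12 + zi + ai + rng.gauss(0, 1) for zi, ai in zip(z, abil)]
lwage = [1 + true_return * e + 0.5 * a + rng.gauss(0, 0.5)
         for e, a in zip(educ, abil)]

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

# OLS slope of lwage on educ: ability is left in the error term.
ols = cov(educ, lwage) / cov(educ, educ)
# Simple IV estimator with one instrument and one endogenous regressor.
iv = cov(z, lwage) / cov(z, educ)
print(ols, iv)
```

Because ability raises both education and wages in this setup, the OLS slope overstates the return to education, while the IV estimate stays close to the assumed value.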
# 29. IV estimation. Suppose that father's education is a good instrument for educ.
# 30.
# 31. Criticisms of IV estimation. Observe that the OLS estimate lies within the 95% confidence interval for the IV estimate. Thus, the difference is not statistically significant.
# 32. Multicollinearity. Example of perfect collinearity: constant + female + male. Example of multicollinearity
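One common way to quantify imperfect multicollinearity is the variance inflation factor (VIF). The lecture does not say which diagnostic it uses, so the VIF is an assumption of this sketch, as are the simulated regressors x1 and x2.

```python
import random

rng = random.Random(2)
n = 1000
x1 = [rng.gauss(0, 1) for _ in range(n)]
# x2 is almost a copy of x1, so the two regressors are highly collinear.
x2 = [a + 0.1 * rng.gauss(0, 1) for a in x1]

def corr(a, b):
    """Sample correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / (saa * sbb) ** 0.5

r = corr(x1, x2)
# With two regressors, the R-squared from regressing x1 on x2 is r**2.
vif = 1.0 / (1.0 - r ** 2)
print(r, vif)
```

A VIF above 10 is a frequently quoted rule of thumb for problematic collinearity, although the threshold is a convention rather than a formal test.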
# 33. Consequences of multicollinearity
• OLS estimators are still BLUE but have large variances and covariances
Detecting multicollinearity
# 34. What to do in the case of multicollinearity?
• Sometimes there is no choice (data deficiency), so do nothing
# 35. Micronumerosity: the problem of small sample size. This problem is related to multicollinearity.
# 36. Including irrelevant variables in the OLS regression. Including an irrelevant variable does not bias the intercept and other slope estimators, although it can increase their variances.
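The claim about irrelevant variables can be checked with a small Monte Carlo experiment. This is a sketch on simulated data: the true slope of 2.0, the sample size, and the helper `slope_on_x1` are hypothetical choices made for the illustration.

```python
import random

rng = random.Random(3)
true_b1 = 2.0

def slope_on_x1(x1, x2, y):
    """OLS coefficient on x1 when y is regressed on a constant, x1 and x2."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    return (s22 * s1y - s12 * s2y) / (s11 * s22 - s12 ** 2)

estimates = []
for _ in range(500):
    x1 = [rng.gauss(0, 1) for _ in range(100)]
    # x2 is correlated with x1 but does not enter the true model for y.
    x2 = [0.5 * a + rng.gauss(0, 1) for a in x1]
    y = [1.0 + true_b1 * a + rng.gauss(0, 1) for a in x1]
    estimates.append(slope_on_x1(x1, x2, y))

avg = sum(estimates) / len(estimates)
print(avg)
```

Across replications the average estimate stays close to the true slope even though x2 is included, which is what unbiasedness means here; the cost of the extra regressor shows up in the spread of the estimates rather than in their center.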