Marietta College Spring 2011 Econ 420: Applied Regression Analysis Dr. Jacqueline Khorassani Week 15
Tuesday, April 19 • Exam 3: Monday, April 25, 12:00-2:30 PM • Covers Chapters 7, 8, 9, and 10 • It is closed book and closed notes • Data set: ELECTION in Chapter 15 (variables are defined on pp. 514-515)
On Thursday • We will have two econ capstone presentations in class • Yolien Peeters, "Determinants of Individual Happiness: Do Social Capital Variables Matter?" • Jake Verdoorn, "The Determinants of the Probability of Enrollment at Marietta College: Does Financial Aid Matter?"
Return and discuss Asst 22 • Use the data on Soviet defense spending (page 335, data set DEFEND, Chapter 9) to regress SDH on SDL, UDS, and NR only. • Two of you estimated a double-log function; I did not ask for that.
1. Conduct a Durbin-Watson test for serial correlation at the 5% level of significance. • DW stat = 1.08; test for positive autocorrelation • H0: ρ ≤ 0 (no positive autocorrelation) • HA: ρ > 0 (positive autocorrelation) • Level of significance = 5% • Critical d-stat at K = 3 and N = 25: dL = 1.12, dU = 1.66 • DW stat < dL, so reject H0: evidence of positive autocorrelation
2. If you find evidence of autocorrelation, is it more likely to be pure or impure autocorrelation? Why? • The coefficient of UDS is negative (unexpected) • Omitted-variable bias is likely • The functional form may also be wrong • Impure autocorrelation is therefore likely
Chapter 10: Homoskedasticity • In this graph, there are 5 different observations on each X (each representing a different value of Y) • Note that the errors on each observation of X have a mean of zero and the same variance across all values of Xs
Heteroskedasticity • Suppose X is income and Y is consumption and we have cross-sectional data. • Note that at low levels of income there is not much variation in consumption, but at higher levels of income there is more variation in consumption. • The reason is that families with little income spend a large portion (maybe all) of their income, while families with a large amount of income vary greatly in how much of their income they spend. • That is, the error on each observation of X comes from a distribution with a mean of zero but a different variance. • In this case the variance of the error increases as the level of income increases. • Heteroskedasticity is a problem that is more common in cross-sectional data sets.
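The income/consumption story can be illustrated with a minimal simulation (not from the textbook; all numbers are made up): when the error's standard deviation grows with income, consumption is visibly more spread out at high incomes than at low incomes.

```python
import numpy as np

# Hypothetical heteroskedastic data: error standard deviation is
# proportional to income, so high-income consumption varies more.
rng = np.random.default_rng(42)
income = rng.uniform(10, 100, size=2000)
errors = rng.normal(0, 0.1 * income)       # variance rises with income
consumption = 5 + 0.8 * income + errors

low = consumption[income < 30]
high = consumption[income > 70]
print("spread at low income: ", round(low.std(), 2))
print("spread at high income:", round(high.std(), 2))
```

Plotting `consumption` against `income` would show the classic "fan" shape: a narrow band at the left widening to the right.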
Types of Heteroskedasticity 1. Impure • The theoretical (unobserved) equation does not have a problem, but our estimated equation does • Caused by wrong functional form, omitted variables, or data errors 2. Pure • The theoretical (unobserved) equation itself has a problem
Consequences of Heteroskedasticity • Pure • Unbiased estimates • but wrong standard errors • Impure • Biased estimates • Wrong standard errors
Heteroskedasticity: The White Test • Set the null and alternative hypotheses: H0: Homoskedasticity; HA: Heteroskedasticity • Estimate the original regression • Use the squared residuals as the dependent variable in a second equation that includes the Xs, the X2s, and the product of each pair of Xs • Find nR2 • Find the critical chi-squared on page 597 (df = number of independent variables in the second equation) • If nR2 > critical chi-squared, reject H0
Guess what? EViews conducts the test automatically! Nice! • Estimate the original regression as usual • Let's estimate VOL as a function of STU, FAC, and SAT • Data set = BOOKS, Chapter 10 • Definitions of variables on page 366 • On the regression output, click on "View" • Then on "Residual Tests" • Then choose "White Heteroskedasticity (cross terms included)" • At the top of your output, you will see n (the number of observations) times R2
Heteroskedasticity Test: White
F-statistic          21.00968   Prob. F(9,50)        0.0000
Obs*R-squared        47.45227   Prob. Chi-Square(9)  0.0000
Scaled explained SS  193.2631   Prob. Chi-Square(9)  0.0000

Test Equation:
Dependent Variable: RESID^2
Method: Least Squares
Sample: 1 60
Included observations: 60

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C            4161214.     6785281.     0.613271     0.5425
STU         -1250.614     449.6164    -2.781513     0.0076
STU^2       -0.004552     0.007027    -0.647796     0.5201
STU*FAC      0.381267     0.211974     1.798648     0.0781
STU*SAT      0.992653     0.441518     2.248272     0.0290
FAC          19587.06     7985.352     2.452873     0.0177
FAC^2       -3.955693     1.543409    -2.562958     0.0134
FAC*SAT     -15.58773     8.147749    -1.913133     0.0615
SAT         -7732.199     13634.09    -0.567123     0.5732
SAT^2        3.638079     6.787733     0.535979     0.5943

H0: Homoskedasticity
HA: Heteroskedasticity
Critical chi-squared (5%, df = 9) = 16.92
nR2 = 47.45
nR2 > critical chi-squared, so reject H0
Remedies for impure heteroskedasticity • Check/correct the variables, functional form, etc. • Test for heteroskedasticity again • If there is still heteroskedasticity, it is likely pure
Remember that pure heteroskedasticity • Causes no bias in the estimated coefficients, but • Generates wrong standard errors of coefficients, and hence wrong t-tests • So we need to find the heteroskedasticity-corrected standard errors
EViews does it automatically • All you do is • Quick • Estimate Equation • Type the variables • Click on options • Click on heteroskedasticity consistent covariance (White)
Dependent Variable: VOL
Method: Least Squares

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          -1842.007     810.0725    -2.273879      0.0268
FAC         1.727890     0.442034     3.908958      0.0003
STU         0.037850     0.026175     1.446019      0.1537
SAT         1.831702     0.819800     2.234328      0.0295

Look how the standard errors changed:

Dependent Variable: VOL
White Heteroskedasticity-Consistent Standard Errors & Covariance

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          -1842.007     683.6841    -2.694236      0.0093
FAC         1.727890     0.823568     2.098055      0.0404
STU         0.037850     0.057291     0.660662      0.5115
SAT         1.831702     0.738917     2.478902      0.0162