
Economics 105: Statistics



Presentation Transcript


  1. Economics 105: Statistics Unit 3 Review is due by 4:30 p.m. today

  2. Risks in Model Building: Samples Can Mislead • Remember: we are using sample data • About 5% of the time (at the 5% significance level), our sample will include random observations of the X’s that result in beta-hats that pass classical hypothesis tests even when the true effect is absent • Or the betas may be important, but the sample data will randomly include observations of X that do not pass the statistical tests • That’s why we rely on theory, prior hypotheses, and replication
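The first point can be seen in a small simulation (a sketch with made-up data, not from the slides): when the true slope is zero, a 5% two-sided test still "finds" a significant beta-hat in roughly 5% of samples.

```python
import numpy as np

# Hypothetical simulation: the true slope is zero, yet about 5% of samples
# pass a classical 5% significance test purely by chance.
rng = np.random.default_rng(0)
n, reps = 50, 2000
rejections = 0
for _ in range(reps):
    x = rng.normal(size=n)
    y = 1.0 + rng.normal(size=n)              # true beta1 = 0
    X = np.column_stack([np.ones(n), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    s2 = resid @ resid / (n - 2)              # error variance estimate
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    if abs(beta[1] / se) > 1.96:              # two-sided 5% test
        rejections += 1
print(rejections / reps)                      # close to 0.05
```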

  3. Violations of GM Assumptions (“Holy endogeneity, Batman!” “I know! We can save the model, but not until Eco 205.”) • Assumption (1) vs. its violations: wrong functional form; omitted relevant variable (or included irrelevant variable); errors in variables; sample selection bias; simultaneity bias • Assumption (2), homoskedastic errors, vs. heteroskedastic errors • Assumption (3), no serial correlation of errors, vs. serial correlation in errors • Assumption (4), model is linear in parameters, the betas • Assumption (5), i.i.d. sample of data

  4. Violation of (2): Causes of Heteroskedasticity • Learning • Discretion • Outliers • Model misspecification • should include a quadratic • incorrect functional form (ln-ln, ln-linear) • Skewed explanatory variables • income, wage, education

  5. Violation of (2): Causes of Heteroskedasticity • Learning

  6. Violation of (2): Causes of Heteroskedasticity • Discretion

  7. Violation of (2): Causes of Heteroskedasticity • Learning • Discretion • Outliers • Model misspecification • should include a quadratic • incorrect functional form (ln-ln, ln-linear) • Skewed explanatory variables • income, wage, education

  8. Example: Quadratic Model (continued) • Simple regression results: Purity-hat = −11.283 + 5.985 Time • The t statistic, F statistic, and r² are all high, but the residuals are not random

  9. Example: Quadratic Model (continued) • Quadratic regression results: Purity-hat = 1.539 + 1.565 Time + 0.245 Time² • The quadratic term is significant and improves the model: r² is higher and S_YX is lower, and the residuals are now random
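The linear-versus-quadratic comparison can be sketched in Python; the data below are synthetic stand-ins, since the purity series is not reproduced in the transcript. Because the quadratic model nests the linear one, its r² is at least as high, and here noticeably so.

```python
import numpy as np

# Synthetic curved data: purity grows nonlinearly with time.
rng = np.random.default_rng(1)
time = np.linspace(1, 10, 40)
purity = 2 + 1.5 * time + 0.25 * time**2 + rng.normal(scale=1.0, size=40)

def r2(X, y):
    """R-squared of an OLS fit of y on the columns of X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    return 1 - resid @ resid / np.sum((y - y.mean())**2)

X_lin  = np.column_stack([np.ones_like(time), time])
X_quad = np.column_stack([np.ones_like(time), time, time**2])
print(r2(X_lin, purity), r2(X_quad, purity))   # quadratic r^2 is higher
```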

  10. Consequences of Heteroskedasticity • OLS estimators remain unbiased when the errors are heteroskedastic • However, the standard errors are incorrect and OLS is no longer BLUE • Any hypothesis tests conducted could therefore yield erroneous results

  11. Weighted Least Squares • If OLS estimators are not BLUE in the presence of heteroskedasticity, what are the best estimators? • We can weight the observations so that more weight is put on observations associated with levels of X having a smaller error variance • Transform the model so that the errors no longer exhibit heteroskedasticity • The basic model with heteroskedasticity is Yi = β0 + β1Xi + εi with Var(εi) = σi²

  12. Weighted Least Squares • Dividing each observation by the associated standard deviation of the error transforms the model: Yi/σi = β0(1/σi) + β1(Xi/σi) + εi/σi • OLS estimators for this transformed model are BLUE in the presence of heteroskedasticity

  13. Weighted Least Squares • The error term in the transformed model, εi/σi, is no longer heteroskedastic: Var(εi/σi) = σi²/σi² = 1 • The transformed model is called a weighted least squares model • Each observation is now weighted by the inverse of the standard deviation of its error • Major difficulty in estimating weighted least squares: we don’t observe σi² • So we assume a functional form for σi² and transform the model accordingly
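A minimal WLS sketch on simulated data, assuming the error standard deviation is proportional to Xi so that Xi can stand in for the unobserved σi: dividing every term by Xi restores homoskedasticity.

```python
import numpy as np

# Assumed variance structure (not from the slides): sd(eps_i) = sigma * x_i,
# so dividing through by x_i removes the heteroskedasticity.
rng = np.random.default_rng(2)
n = 500
x = rng.uniform(1, 5, n)
y = 3 + 2 * x + rng.normal(scale=x)       # heteroskedastic: sd grows with x

# Transformed model: y/x = beta0*(1/x) + beta1*1 + u, with homoskedastic u
Xw = np.column_stack([1 / x, np.ones(n)])
beta_w = np.linalg.lstsq(Xw, y / x, rcond=None)[0]
print(beta_w)                             # roughly [3, 2]
```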

  14. Detection of Heteroskedasticity • Look for some association between the errors and some function of the explanatory variable(s) • By construction, the OLS residuals are uncorrelated with the explanatory variables, which suggests we work with some function of the residuals instead • One approach is to regress ei² on the explanatory variables and combinations of the explanatory variables • We are looking to see whether larger values of the squared error are associated with some function of the explanatory variables

  15. Detection of Heteroskedasticity • Informal “test”: graph the residuals ei versus the predicted values of Y (Y-hat), and ei versus each of the X’s (one at a time); look for patterns • Formal tests: White, Breusch-Pagan, Goldfeld-Quandt, Park

  16. Detection of Heteroskedasticity Source: Gujarati, 5th edition

  17. Detection of Heteroskedasticity Source: Gujarati, 5th edition

  18. Detection of Heteroskedasticity

  19. Detection of Heteroskedasticity

  20. Breusch-Pagan test • 1. Estimate the model by OLS • 2. Obtain the squared residuals, êi² • 3. Run the auxiliary regression of êi² on the explanatory variables • 4. Do the whole-model F-test; rejection indicates heteroskedasticity • Assumes the errors are normally distributed • Breusch, T.S. and A.R. Pagan (1979), “A Simple Test for Heteroscedasticity and Random Coefficient Variation,” Econometrica 47, pp. 1287–1294.

  21. Breusch-Pagan test (not needing normality) • 1. Estimate the model by OLS • 2. Obtain the squared residuals, êi² • 3. Run the auxiliary regression of êi² on the explanatory variables, keeping the R² from this regression • 4. Test statistic: LM = n·R², distributed under H0 as χ² with degrees of freedom equal to the number of slope coefficients in the auxiliary regression • Rejection indicates heteroskedasticity
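The LM version of the test can be computed by hand; this sketch uses simulated heteroskedastic data, and compares n·R² with the 5% χ²(1) critical value of 3.84.

```python
import numpy as np

# Simulated data with heteroskedastic errors (sd grows with x).
rng = np.random.default_rng(3)
n = 300
x = rng.uniform(1, 4, n)
y = 1 + 2 * x + rng.normal(scale=x)

# Step 1-2: OLS and squared residuals.
X = np.column_stack([np.ones(n), x])
resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
e2 = resid**2

# Step 3: auxiliary regression of squared residuals on the X's.
fitted = X @ np.linalg.lstsq(X, e2, rcond=None)[0]
R2 = 1 - np.sum((e2 - fitted)**2) / np.sum((e2 - e2.mean())**2)

# Step 4: LM = n * R^2, chi-square(1) under H0 here (one slope coefficient).
LM = n * R2
print(LM)   # typically well above the 5% critical value of 3.84
```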

  22. Breusch-Pagan tests

  23. White test • 1. Estimate the model by OLS • 2. Obtain the squared residuals, êi² • 3. Estimate the auxiliary regression of êi² on the X’s, their squares, and their cross products • 4. Do the whole-model F-test; rejection indicates heteroskedasticity • White, H. (1980), “A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity,” Econometrica 48, pp. 817–838.

  24. White test • Adds squares and cross products of all the X’s • Advantages: makes no assumptions about the nature of the heteroskedasticity • Disadvantages: rejection (a statistically significant White test statistic) may be caused by heteroskedasticity or by specification error; it is not a conclusive test • The number of covariates rises quickly, so one could instead regress the squared residuals on the predicted values and their squares, since the predicted values are functions of the X’s (and the estimated parameters), and do the F-test
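A hand-rolled sketch of the White auxiliary regression for a single regressor (levels plus squares; with one X there are no cross products), again using simulated data and the n·R² form of the statistic.

```python
import numpy as np

# Simulated heteroskedastic data, as in the Breusch-Pagan sketch.
rng = np.random.default_rng(4)
n = 300
x = rng.uniform(1, 4, n)
y = 1 + 2 * x + rng.normal(scale=x)

X = np.column_stack([np.ones(n), x])
resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
e2 = resid**2

# White auxiliary regression: constant, x, and x^2.
Z = np.column_stack([np.ones(n), x, x**2])
fitted = Z @ np.linalg.lstsq(Z, e2, rcond=None)[0]
R2 = 1 - np.sum((e2 - fitted)**2) / np.sum((e2 - e2.mean())**2)

# n * R^2 is chi-square(2) under H0 (two slope coefficients in Z).
print(n * R2)   # compare with the 5% chi-square(2) critical value, 5.99
```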

  25. White test

  26. Violations of GM Assumptions (“Holy endogeneity, Batman!” “I know! We can save the model, but not until Eco 205.”) • Assumption (1) vs. its violations: wrong functional form; omitted relevant variable (or included irrelevant variable); errors in variables; sample selection bias; simultaneity bias • Assumption (2), homoskedastic errors, vs. heteroskedastic errors • Assumption (3), no serial correlation of errors, vs. serial correlation in errors • Assumption (4), model is linear in parameters, the betas • Assumption (5), i.i.d. sample of data

  27. Nature of Serial Correlation • Violation of (3) • Error in period t is a function of the error in the prior period alone: εt = ρεt−1 + ut, first-order autocorrelation, denoted AR(1) for “autoregressive” process • The usual assumptions apply to the new error term ut • ρ > 0 is positive serial correlation • ρ < 0 is negative serial correlation

  28. Nature of Serial Correlation • Error in period t can be a function of the error in more than one prior period • Second-order serial correlation: εt = ρ1εt−1 + ρ2εt−2 + ut • Higher orders are generated analogously • Seasonally-based serial correlation, e.g., εt = ρεt−4 + ut with quarterly data

  29. Causes of Serial Correlation • The error term in the regression captures measurement error and omitted variables that are (hopefully) uncorrelated with the included explanatory variables • Frequently, factors omitted from the model are correlated over time • 1. Persistence of shocks • Effects of random shocks (e.g., earthquake, war, labor strike) often carry over through more than one time period • 2. Inertia • Time series for GNP, (un)employment, output, prices, interest rates, etc. follow cycles, so successive observations are related

  30. Causes of Serial Correlation (continued) • 3. Lags • Past actions have a strong effect on current ones • Consumption last period predicts consumption this period • 4. Misspecified model, incorrect functional form • 5. Spatial serial correlation • In cross-sectional data on regions, a random shock in one region can cause the outcome of interest to change in adjacent regions • “Keeping up with the Joneses”

  31. Consequences for OLS Estimates • Using an OLS estimator when the errors are autocorrelated still yields unbiased estimators • However, the standard errors are estimated incorrectly • Whether the standard errors are overstated or understated depends on the nature of the autocorrelation • For positive AR(1), the standard errors are too small! • Any hypothesis tests conducted could yield erroneous results • For positive AR(1), we may conclude estimated coefficients ARE significantly different from 0 when we shouldn’t! • OLS is no longer BLUE • A pattern exists in the errors, suggesting that an estimator that exploited this pattern would be more efficient

  32. Graphical Detection of Serial Correlation

  33. Detection of Serial Correlation • Graphical: in this plot there is no obvious pattern; the errors seem random • Sometimes, however, the errors follow a pattern: they are correlated across observations, creating a situation in which the observations are not independent of one another

  34. Detection of Serial Correlation Here the residuals do not seem random, but rather seem to follow a pattern.

  35. Detection: The Durbin-Watson Test • Provides a way to test H0: ρ = 0 • It is a test for the presence of first-order serial correlation • The alternative hypothesis can be • ρ ≠ 0 • ρ > 0: positive serial correlation (most likely alternative in economics) • ρ < 0: negative serial correlation • The DW test statistic is d

  36. Detection: The Durbin-Watson Test • To test for positive serial correlation with the Durbin-Watson statistic, under the null we expect d to be near 2 • The smaller d, the more likely the alternative hypothesis The sampling distribution of d depends on the values of the explanatory variables. Since every problem has a different set of explanatory variables, Durbin and Watson derived upper and lower limits for the critical value of the test.

  37. Detection: The Durbin-Watson Test • Durbin and Watson derived upper and lower limits such that dL ≤ d* ≤ dU • They developed the following decision rule

  38. Detection: The Durbin-Watson Test • To test for negative serial correlation the decision rule is • Can use a two-tailed test if there is no strong prior belief about whether there is positive or negative serial correlation—the decision rule is
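The statistic itself, d = Σ(et − et−1)² / Σet², is easy to compute directly. A sketch on simulated residuals (all values here are synthetic): for positively autocorrelated residuals, d falls well below 2, roughly 2(1 − ρ̂), while white-noise residuals give d near 2.

```python
import numpy as np

def durbin_watson(e):
    """d = sum of squared successive differences over sum of squares."""
    return np.sum(np.diff(e)**2) / np.sum(e**2)

rng = np.random.default_rng(6)
T, rho = 400, 0.7

# AR(1) residuals with rho = 0.7.
u = rng.normal(size=T)
e = np.zeros(T)
for t in range(1, T):
    e[t] = rho * e[t - 1] + u[t]

d_ar = durbin_watson(e)
d_wn = durbin_watson(rng.normal(size=T))   # white-noise residuals
print(d_ar)   # near 2 * (1 - 0.7) = 0.6
print(d_wn)   # near 2
```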

  39. Serial Correlation Table of critical values for Durbin-Watson statistic (table E11, page 833 in BLK textbook) http://hadm.sph.sc.edu/courses/J716/Dw.html

  40. Serial Correlation Example What is the effect of the price of oil on the number of wells drilled in the U.S.?

  41. Serial Correlation Example What is the effect of the price of oil on the number of wells drilled in the U.S.?

  42. Serial Correlation Example Analyze residual plots … but be careful …

  43. Serial Correlation Example Remember what serial correlation is … • This plot only “works” if the observation number is in the same order as the unit of time

  44. Serial Correlation Example Same graph when plot versus “year” • Graphical evidence of serial correlation

  45. Serial Correlation Example • Calculate the DW test statistic: d = 0.192 • Compare to the critical value at the chosen significance level • dlower and dupper for 1 X-variable and n = 62 are not in the table • dlower for 1 X-variable and n = 60 is 1.55, dupper = 1.62 • Since 0.192 < 1.55, reject H0: ρ = 0 in favor of H1: ρ > 0 at α = 5%
