
Marietta College



Presentation Transcript


  1. Marietta College Spring 2011 Econ 420: Applied Regression Analysis Dr. Jacqueline Khorassani Week 14

  2. Tuesday, April 12 Exam 3: Monday, April 25, 12- 2:30PM Bring your laptops to class on Thursday too

  3. Collect Asst 21 • Use the data set FISH in Chapter 8 (P 274) to run the following regression equation: F = f (PF, PB, Yd, P, N) • Conduct all 3 tests for an imperfect multicollinearity problem and report your results. • If you find evidence of an imperfect multicollinearity problem, suggest and implement a reasonable solution.

  4. Use EViews • Open FISH in Chapter 8 • Run P = f (PF, PB, Yd, N) • Click on View in the regression output • Click on Actual, Fitted, Residual • Click on Residual Graph • Do you suspect that the residuals are autocorrelated?

  5. This is what you should have gotten: a positive residual tends to be followed by another positive residual → possible positive autocorrelation

  6. Causes of Impure Serial Correlation • Wrong functional form • Example: effect of age of the house on its price • Omitted variables • Example: not including wealth in the consumption equation • Data error

  7. Cause of Pure Serial Correlation • Lingering shock over time • War • Natural disaster • Stock market crash

  8. Consequences of Pure Autocorrelation • Unbiased estimates but wrong standard errors • In the case of positive autocorrelation, the standard errors of the estimated coefficients drop • Consequences for the t-test of significance?

  9. Consequences of Impure Autocorrelation • Biased estimates • Plus wrong standard errors

  10. Let’s look at first order serial correlation: єt = ρєt-1 + ut • ρ (rho) is the first order autocorrelation coefficient • It takes a value between -1 and +1 • ut is a normally distributed error with a mean of zero and constant variance
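The lingering-shock idea behind this error process can be sketched in a few lines of Python (a hedged illustration only; the function name and the normal shocks are my assumptions, not part of the course materials):

```python
import random

def simulate_ar1_errors(rho, n, seed=42):
    """Simulate the first order process e_t = rho * e_{t-1} + u_t,
    where each u_t is an independent normal shock with mean zero."""
    rng = random.Random(seed)
    errors = [rng.gauss(0, 1)]                 # initial shock
    for _ in range(n - 1):
        u_t = rng.gauss(0, 1)                  # fresh, uncorrelated innovation
        errors.append(rho * errors[-1] + u_t)  # previous error lingers via rho
    return errors

# With rho close to +1, each error carries most of the previous one,
# so adjacent errors tend to share the same sign (positive autocorrelation).
e = simulate_ar1_errors(0.9, 200)
```

With ρ = 0 the loop reduces to drawing independent shocks, which is exactly the "no autocorrelation" case.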

  11. A Formal Test For First Order Autocorrelation • Durbin-Watson test • Estimate the regression equation • Save the residuals, e • Then calculate the Durbin-Watson stat (d stat) • d stat ≈ 2(1 − ρ) • What is the d stat under perfect positive correlation? ρ = +1 → d = 0 • What is the d stat under perfect negative correlation? ρ = -1 → d = 4 • What is the d stat under no autocorrelation? ρ = 0 → d = 2 • What is the range of values for the d stat? 0 to 4
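The d stat on this slide can be computed directly from the saved residuals; a minimal sketch (the function name is mine, not EViews'):

```python
def durbin_watson(residuals):
    """d = sum_t (e_t - e_{t-1})^2 / sum_t e_t^2, which is roughly 2(1 - rho)."""
    num = sum((e1 - e0) ** 2 for e0, e1 in zip(residuals, residuals[1:]))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Residuals that never change sign or size (rho = +1) give d = 0 ...
print(durbin_watson([1.0, 1.0, 1.0, 1.0]))    # 0.0
# ... perfectly alternating residuals (rho = -1) push d toward 4.
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))  # 3.0 here; approaches 4 as n grows
```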

  12. If 0 < d stat < 2, suspect (test for) positive autocorrelation • If 2 < d stat < 4, suspect (test for) negative autocorrelation • d stat = 0: perfect positive autocorrelation • d stat = 2: no autocorrelation • d stat = 4: perfect negative autocorrelation

  13. EViews calculates d-stat automatically • It is included in your regression output • Run P = f (PF, PB, Yd, N) • Do you see the d-stat?

  14. What type of serial correlation shall we test for? Positive (d stat < 2).

Dependent Variable: P
Method: Least Squares
Date: 04/12/11   Time: 08:59
Sample: 1946 1970
Included observations: 25

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          -2.083188     0.271658     -7.668417     0.0000
PF          0.027143     0.017355      1.563934     0.1335
PB         -0.012571     0.011620     -1.081865     0.2922
YD          0.001597     0.000387      4.132263     0.0005
N          -5.54E-05     1.27E-05     -4.376214     0.0003

R-squared           0.801154   Mean dependent var     0.160000
Adjusted R-squared  0.761384   S.D. dependent var     0.374166
S.E. of regression  0.182774   Akaike info criterion -0.384281
Sum squared resid   0.668123   Schwarz criterion     -0.140506
Log likelihood      9.803514   Hannan-Quinn criter.  -0.316668
F-statistic         20.14505   Durbin-Watson stat     1.498086
Prob(F-statistic)   0.000001

  15. If d stat < 2, test for positive autocorrelation. • Null and alternative hypotheses • H0: ρ ≤ 0 (no positive auto) • HA: ρ > 0 (positive auto) • Choose the level of significance (say 5%) • Look up the critical d stats (pp. 591-593) • Decision rule • If d stat < dL → reject H0 → there is significant positive first order autocorrelation • If d stat > dU → don’t reject H0 → there is no evidence of significant autocorrelation • If d stat is between dL and dU → the test is inconclusive.

  16. N = 25, K = 4. At the 5% level, dL = 1.04, dU = 1.77. The d stat (1.498086) is between dL and dU → the test is inconclusive. (Same regression output as slide 14.)

  17. H0: ρ ≤ 0 (no positive auto) • HA: ρ > 0 (positive auto) • Level of significance = 5% • Critical d stats: dL = 1.04, dU = 1.77 • Decision: the d stat (≈1.5) is between dL and dU → the test is inconclusive [Number line from d = 0 (perfect positive autocorrelation) to d = 4 (perfect negative autocorrelation), with d = 2 meaning no autocorrelation: reject H0 below 1.04, inconclusive between 1.04 and 1.77 (our 1.5 falls here), fail to reject H0 above 1.77]

  18. If d stat > 2, you will test for negative autocorrelation. • Null and alternative hypotheses • H0: ρ ≥ 0 (no negative auto) • HA: ρ < 0 (negative auto) • Choose the level of significance (1% or 5%) • Look up the critical d stats (pp. 591-593) • Decision rule • If d stat > 4 − dL → reject H0 → there is significant negative first order autocorrelation • If d stat < 4 − dU → don’t reject H0 → there is no evidence of significant autocorrelation • If d stat is between 4 − dU and 4 − dL → the test is inconclusive.
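Slides 15 and 18 together give a complete decision rule, which can be sketched as one function (an illustration only; the function name and return messages are mine):

```python
def dw_decision(d_stat, d_L, d_U):
    """Durbin-Watson decision rule: test for positive autocorrelation
    when d stat < 2, and for negative autocorrelation when d stat > 2."""
    if d_stat < 2:                         # positive-autocorrelation test
        if d_stat < d_L:
            return "reject H0: positive autocorrelation"
        if d_stat > d_U:
            return "do not reject H0: no evidence of autocorrelation"
        return "inconclusive"
    if d_stat > 4 - d_L:                   # negative-autocorrelation test
        return "reject H0: negative autocorrelation"
    if d_stat < 4 - d_U:
        return "do not reject H0: no evidence of autocorrelation"
    return "inconclusive"

# Slide 16: d stat = 1.498 with dL = 1.04, dU = 1.77
print(dw_decision(1.498086, 1.04, 1.77))   # inconclusive
# Slide 20: d stat = 2.212 with dL = 1.07, dU = 1.34, so 4 - dU = 2.66
print(dw_decision(2.211726, 1.07, 1.34))   # do not reject H0: no evidence of autocorrelation
```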

  19. Example: d stat > 2 → test for negative autocorrelation

Dependent Variable: CONSUMPTION
Method: Least Squares
Date: 11/09/08   Time: 20:11
Sample: 1 30
Included observations: 30

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          16222.97      5436.061     2.984324      0.0060
INCOME     0.641166      0.166878     3.842131      0.0007
WEALTH     0.148788      0.041327     3.600281      0.0013

R-squared           0.847738   Mean dependent var    52347.37
Adjusted R-squared  0.836459   S.D. dependent var    31306.54
S.E. of regression  12660.43   Akaike info criterion 21.82499
Sum squared resid   4.33E+09   Schwarz criterion     21.96511
Log likelihood     -324.3748   Hannan-Quinn criter.  21.86982
F-statistic         75.16274   Durbin-Watson stat    2.211726
Prob(F-statistic)   0.000000

  20. Let’s test for negative autocorrelation at the 1% level in our example • H0: ρ ≥ 0 (no negative auto) • HA: ρ < 0 (negative auto) • 1% level of significance, K = 2, N = 30 • dL = 1.07, dU = 1.34 • 4 − dL = 2.93, 4 − dU = 2.66 • d stat < 4 − dU → don’t reject H0 → no evidence of negative autocorrelation

  21. Asst 22: Due Thursday • Use the data on Soviet defense spending (page 335 – data set: DEFEND, Chapter 9) to regress SDH on SDL, USD and NR only. • Conduct a Durbin-Watson test for serial correlation at the 5% level of significance • If you find evidence of autocorrelation, is it more likely to be pure or impure autocorrelation? Why?

  22. Thursday, April 14 • Exam 3: Monday, April 25, 12-2:30 PM • Bring your laptops to class next Tuesday

  23. Collect Asst 22 • Use the data on Soviet defense spending (page 335 – data set: DEFEND, Chapter 9) to regress SDH on SDL, USD and NR only. • Conduct a Durbin-Watson test for serial correlation at the 5% level of significance • If you find evidence of autocorrelation, is it more likely to be pure or impure autocorrelation? Why?

  24. Solutions for the Autocorrelation Problem • If the D-W test indicates an autocorrelation problem, what should you do?

  25. 1. Adjust the functional form • Sometimes autocorrelation arises because we used a linear form when we should have used a non-linear one • With a linear fit, the errors form a pattern: the first 3 observations have positive errors and the last 2 have negative errors • The revenue curve is not linear (it is bell shaped) → what functional form should we use? [Scatter diagram: revenue against price, five observations tracing a bell-shaped curve around a straight fitted line]

  26. 2. Add other relevant (missing) variables • Sometimes autocorrelation is caused by omitted variables • Suppose we forget to include wealth in our consumption model • In year one (obs. 1) wealth rises drastically → a big positive error • The effect of that year-1 increase in wealth lingers for 3 years → the errors form a pattern • We should include wealth in our model [Scatter diagram: consumption against income, five observations whose positive errors fade over the years after the shock]

  27. 3. Examine the data • Any systematic error in the collection or recording of data may result in autocorrelation.

  28. After you make adjustments 1, 2 and 3 • Test for autocorrelation again • If autocorrelation is still a problem then suspect pure autocorrelation • Follow the Cochrane-Orcutt procedure • Say what?????

  29. Suppose our model is
Yt = β0 + β1Xt + єt (1)
and the error terms in Equation 1 are correlated:
єt = ρєt-1 + ut (2)
where ut is not autocorrelated. Rearranging 2 we get 3:
єt − ρєt-1 = ut (3)
Let’s lag Equation 1:
Yt-1 = β0 + β1Xt-1 + єt-1 (4)

  30. Now multiply Equation 4 by ρ:
ρYt-1 = ρβ0 + ρβ1Xt-1 + ρєt-1 (5)
Now subtract 5 from 1 to get 6:
Yt − ρYt-1 = β0 − ρβ0 + β1Xt − ρβ1Xt-1 + єt − ρєt-1 (6)
Note that the last two terms in Equation 6 are equal to ut, so 6 becomes
Yt − ρYt-1 = β0 − ρβ0 + β1(Xt − ρXt-1) + ut (7)

  31. Yt − ρYt-1 = β0 − ρβ0 + β1(Xt − ρXt-1) + ut (7) • What is so special about the error term in Equation 7? It is not autocorrelated • So, instead of Equation 1 we can estimate Equation 7 • Define Zt = Yt − ρYt-1 and Wt = Xt − ρXt-1 • Then 7 becomes Zt = M + β1Wt + ut (8), where M is a constant equal to β0(1 − ρ) • Notice that the slope coefficient of Equation 8 is the same as the slope coefficient of our original Equation 1.

  32. The Cochrane-Orcutt Method: so our job will be • Step 1: Apply OLS to the original model (Equation 1) and save the residuals, et • Step 2: Use the ets to estimate Equation 2 and find ρ̂ (note: this equation does not have an intercept) • Step 3: Multiply ρ̂ by Yt-1 and Xt-1 and find Zt and Wt • Step 4: Estimate Equation 8
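Steps 1-4 can be sketched for the bivariate case in plain Python (a rough illustration on made-up data; the helper names are mine, and EViews' AR(1) routine is more sophisticated, e.g. it iterates on ρ̂):

```python
import random

def ols(x, y):
    """Bivariate OLS: returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def cochrane_orcutt(x, y):
    """One pass of Cochrane-Orcutt for Y_t = b0 + b1*X_t + e_t."""
    b0, b1 = ols(x, y)                                   # Step 1: OLS, save residuals e_t
    e = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    rho = (sum(a * b for a, b in zip(e, e[1:]))          # Step 2: e_t = rho * e_{t-1},
           / sum(a * a for a in e[:-1]))                 #         regression with no intercept
    z = [y1 - rho * y0 for y0, y1 in zip(y, y[1:])]      # Step 3: Z_t = Y_t - rho*Y_{t-1}
    w = [x1 - rho * x0 for x0, x1 in zip(x, x[1:])]      #         W_t = X_t - rho*X_{t-1}
    m, b1_new = ols(w, z)                                # Step 4: estimate Equation 8
    return rho, m / (1 - rho), b1_new                    # recover b0 = M / (1 - rho)

# Made-up data: Y = 2 + 3X plus AR(1) errors with true rho = 0.8
rng = random.Random(0)
x = list(range(40))
err, y = 0.0, []
for xi in x:
    err = 0.8 * err + rng.gauss(0, 1)
    y.append(2 + 3 * xi + err)
rho_hat, b0_hat, b1_hat = cochrane_orcutt(x, y)
```

The slope estimate b1_hat should land close to the true value 3, and rho_hat should be clearly positive, mirroring the AR(1) coefficient EViews reports.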

  33. Luckily • EViews does this (steps 1-4) automatically • All you need to do is add AR(1) to the set of your independent variables • The estimated coefficient on AR(1) is ρ̂ • Let’s apply this procedure to Asst 22

  34. Original OLS (Asst 22):
Dependent Variable: SDH
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          8.83          2.50         3.52          0.0020
SDL        0.97          0.04         22.18         0.0000
USD        -0.005        0.008        -0.60         0.5553
NR         0.002         0.0002       9.30          0.0000
R-squared 0.996792   Adjusted R-squared 0.996334   Durbin-Watson stat 1.076364

Re-estimated with AR(1) (Cochrane-Orcutt correction):
Dependent Variable: SDH
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          -9.11         8.4          -1.08         0.2940
SDL        1.38          0.17         8.10          0.0000
USD        6.71E-05      0.013        0.005         0.9959
NR         0.0005        0.0004       1.46          0.1608
AR(1)      0.82          0.10         8.002         0.0000
R-squared 0.997927   Adjusted R-squared 0.997490   Durbin-Watson stat 2.463339

What happened to the standard errors as we corrected for serial correlation? They went up (positive autocorrelation understates the standard errors). The estimated coefficient on AR(1), 0.82, is ρ̂.

  35. Return and discuss Asst 21 • Use the data set FISH in Chapter 8 (P 274) to run the following regression equation: F = f (PF, PB, Yd, P, N) • Conduct all 3 tests for an imperfect multicollinearity problem and report your results. • If you find evidence of an imperfect multicollinearity problem, suggest and implement a reasonable solution.

  36. Correlation Matrix • First test: compare each independent variable’s correlation with F to its correlations with the other independent variables • PF is more correlated with PB than with F → PF is a problem • Yd is more correlated with PB and PF than with F → Yd is a problem • N is more correlated with PB, PF and Yd than with F → N is a problem • PB is more correlated with PF than with F → PB is a problem • P is more correlated with almost everything else than with F → P is a problem

       F      P      PB     PF     YD     N
F      1      0.58   0.82   0.85   0.79   0.74
P             1      0.66   0.73   0.78   0.57
PB                   1      0.96   0.82   0.78
PF                          1      0.92   0.88
YD                                 1      0.93
N                                         1

  37. Correlation Matrix • Second test: look for high pairwise correlations between the independent variables • Problem areas: PF and PB (0.96), PF and Yd (0.92), PF and N (0.88), PB and Yd (0.82), Yd and N (0.93) • (Same correlation matrix as slide 36.) • Note: F being highly correlated with the independent variables is a good thing, not a bad thing.

  38. Test 3 • Need 5 auxiliary regression equations • PF = f (P, Yd, PB, N) • P = f (PF, Yd, PB, N) • Yd = f (P, PF, PB, N) • PB = f (PF, Yd, P, N) • N = f (PF, Yd, PB, P) • For each, find R² and then VIF = 1/(1 − R²) • All VIFs > 5 → each independent variable is highly correlated with the rest
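A numeric illustration of the VIF rule (a sketch only: the auxiliary regression here uses a single regressor, so R² is just the squared pairwise correlation, whereas slide 38's test regresses each variable on all the others):

```python
def pearson_r(a, b):
    """Pairwise (Pearson) correlation coefficient."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def vif(r_squared):
    """Variance inflation factor from an auxiliary regression's R^2."""
    return 1.0 / (1.0 - r_squared)

# From the correlation matrix on slide 36, corr(PB, PF) = 0.96.
# If PF were regressed on PB alone, R^2 = 0.96^2 = 0.9216, so:
print(round(vif(0.96 ** 2), 2))   # 12.76, far above the VIF > 5 threshold
```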

  39. Solutions • Increase the sample size • Note: we want at least df = 30; we have df = 19 • Do we have an irrelevant variable? • Seth argued N is not needed • What is N? (P 273) • Seth, what was your argument? • Generate a new variable that measures the ratio of prices • Makes sense, but doesn’t solve the high correlation between Yd and N • Note: make sure your transformed variable makes sense • That is, the estimated coefficient has a meaning that people can understand • The ratio PF/Yd makes no sense
