Econometric Analysis
Week 7: Instrumental variables estimation and simultaneous equation models
Lecture outline
• Endogenous explanatory variables (regressors) and bias
• Other sources of bias
• Instrumental variables estimation
• Valid instruments
• IV estimation: manually as Two Stage Least Squares, and automatically in PcGive
• Two practical examples
Bias
• In a single equation regression model we require E(u|X) = E(u), which is 0 by assumption: the disturbance u must be (mean-)independent of each X.
• If this condition is not satisfied then the OLS estimators will be biased (proof on the next two slides).
• There are several ways in which this assumption might be violated:
  • endogenous regressors: the equation is part of a simultaneous equation model and one of the regressors is endogenous (jointly determined), which gives simultaneous equation bias
  • an omitted variable that is correlated with one of the included variables
  • systematic measurement error in one or more of the X variables, such that the observed values are not independent of the disturbance
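The omitted-variable case can be illustrated with a small simulation. This is only a sketch: the data-generating process, the variable names (`w`, `x`, `u`, `y`) and the helper `ols_slope` are hypothetical and are not taken from the lecture. Because `w` drives both the regressor and the disturbance, E(u|x) is not zero and OLS does not recover the true slope.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
n = 100_000                      # large sample, so sampling noise is small

# Hypothetical data-generating process: w is an omitted variable that
# drives both the regressor x and the disturbance u.
w = rng.normal(size=n)
x = w + rng.normal(size=n)       # var(x) = 2
u = w + rng.normal(size=n)       # cov(x, u) = 1, so E(u|x) != 0
y = 1.0 + 2.0 * x + u            # true slope is 2


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with a constant)."""
    xd = x - x.mean()
    return float(xd @ (y - y.mean()) / (xd @ xd))


b_ols = ols_slope(x, y)
print(f"OLS slope: {b_ols:.3f} (true value 2.0)")
```

In this design the OLS slope converges to 2 + cov(x, u)/var(x) = 2.5 rather than 2, and increasing `n` does not remove the gap.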
Proof continued
But if E(u|X) ≠ 0 then the bias is not zero.
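The argument can be sketched for the simple regression case. The notation is an assumption made for illustration (model $Y_i = \beta_0 + \beta_1 X_i + u_i$ with OLS slope $b_1$) and may differ from the notation used on the proof slides.

```latex
\begin{align*}
b_1 &= \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_i (X_i - \bar{X})^2}
     = \beta_1 + \frac{\sum_i (X_i - \bar{X})\,u_i}{\sum_i (X_i - \bar{X})^2} \\[4pt]
E(b_1 \mid X) - \beta_1
    &= \frac{\sum_i (X_i - \bar{X})\,E(u_i \mid X)}{\sum_i (X_i - \bar{X})^2}
\end{align*}
```

The right-hand side of the second line is zero when $E(u_i \mid X) = 0$ for every $i$, so OLS is unbiased in that case. When the conditional mean of the disturbance varies with $X$, the term does not in general vanish and $b_1$ is biased.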
Instrumental variables
• Instrumental variables estimation can be used to obtain estimators that, although still biased in finite samples, are consistent.
• We look for an instrumental variable, Z, that satisfies two conditions:
  • Exogeneity: E(u|Z) = 0
  • Relevance: corr(Z, X) ≠ 0
• Suppose for now that we have found such a variable. We can estimate our model in two stages.
  • First estimate the model Xi = π0 + π1Zi + vi by OLS and save the fitted values.
  • Now use the fitted values in the original equation instead of X. In that equation the part of X that is linked to the disturbance term will have been removed.
• IV estimation is asymptotically unbiased (the bias disappears as n tends to infinity).
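The two stages can be sketched in Python on simulated data. Everything here is an assumption for illustration: the data-generating process, the variable names and the helper `ols` are hypothetical. The instrument `z` is constructed to be relevant (it enters `x`) and exogenous (it does not enter `u`).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Hypothetical data: z is a valid instrument, w is the source of endogeneity.
z = rng.normal(size=n)
w = rng.normal(size=n)
x = 0.8 * z + w + rng.normal(size=n)   # relevance: x depends on z
u = w + rng.normal(size=n)             # exogeneity: u does not depend on z
y = 1.0 + 2.0 * x + u                  # true slope is 2


def ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(coef[0]), float(coef[1])


# Stage 1: regress x on z and form the fitted values.
p0, p1 = ols(z, x)
x_hat = p0 + p1 * z

# Stage 2: regress y on the fitted values instead of x.
a_2sls, b_2sls = ols(x_hat, y)

# Plain OLS of y on x, for comparison.
_, b_ols = ols(x, y)

print(f"OLS slope:  {b_ols:.3f}")
print(f"2SLS slope: {b_2sls:.3f} (true value 2.0)")
```

In this design the 2SLS slope should sit close to 2 while the OLS slope stays well above it. The standard errors from a hand-run second stage are not the correct IV standard errors, so only the coefficients are compared here.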
Simple example
I am using the simple example from Dougherty's book, pp. 288-290. He has cross-section data for C = consumption per capita, I = gross investment per capita and Y = GDP per capita (all in US$, 1998) for 33 countries: Australia, Austria, Belgium, Canada, China_PR, China_HK, Denmark, Finland, France, Germany, Greece, Iceland, India, Indonesia, Ireland, Italy, Japan, South Korea, Luxembourg, Malaysia, Mexico, Netherlands, New Zealand, Norway, Pakistan, Philippines, Portugal, Spain, Sweden, Switzerland, Thailand, UK, USA.
He first estimates a simple linear consumption function of the form C = a + bY by OLS. He then uses investment (I) as an instrumental variable in a two-stage process: in the first stage a reduced form equation gives the prediction of Y (= Yhat), and in the second stage C is estimated as a function of Yhat.
I have attempted to replicate this in PcGive. My results are slightly different for both regressions; maybe there is a misprint in the table printed in Dougherty, p. 289?
First the simple OLS estimation of the consumption function

EQ( 1) Modelling C by OLS-CS (using dougherty_p289)
       The estimation sample is: 1 to 33

                  Coefficient  Std.Error  t-value  t-prob  Part.R^2
Constant              369.173      452.2    0.816   0.420    0.0210
Y                    0.731595    0.02040     35.9   0.000    0.9765

sigma                 1415.64   RSS                  62125252.9
R^2                  0.976466   F(1,31) =      1286 [0.000]**
log-likelihood        -285.22   DW                         1.38
no. of observations        33   no. of parameters             2
mean(C)               13966.2   var(C)             7.99929e+007

Note: These results are slightly different from those shown in Dougherty. A misprint in his table?

Next I get the 2SLS results using I as an instrument to obtain Yhat. First manually, in two separate stages: running the reduced form regression, saving Yhat and then regressing C on Yhat. Then automatically in PcGive.
Reduced form equation for Y

EQ( 2) Modelling Y by OLS-CS (using dougherty_p289)
       The estimation sample is: 1 to 33

                  Coefficient  Std.Error  t-value  t-prob  Part.R^2
Constant              1219.40      982.8     1.24   0.224    0.0473
I                     4.20140     0.1973     21.3   0.000    0.9360

sigma                  3152.3   RSS                   308047016
R^2                  0.936036   F(1,31) =     453.6 [0.000]**
log-likelihood       -311.638   DW                         1.71
no. of observations        33   no. of parameters             2
mean(Y)               18585.4   var(Y)             1.45937e+008

fitted [1 to 33] saved to dougherty_p289, renamed Yhat

EQ( 3) Modelling C by OLS-CS (using dougherty_p289)
       The estimation sample is: 1 to 33

                  Coefficient  Std.Error  t-value  t-prob  Part.R^2
Constant              595.823      1029.    0.579   0.567    0.0107
Yhat                 0.719400    0.04685     15.4   0.000    0.8838

sigma                 3145.75   RSS                   306768831
R^2                  0.883789   F(1,31) =     235.8 [0.000]**
log-likelihood       -311.569   DW                         1.52
no. of observations        33   no. of parameters             2
mean(C)               13966.2   var(C)             7.99929e+007
Here are the results from using the 2SLS procedure in PcGive

EQ( 4) Modelling C by IVE-CS (using dougherty_p289)
       The estimation sample is: 1 to 33

                  Coefficient  Std.Error  t-value  t-prob
Y          Y         0.719400    0.02121     33.9   0.000
Constant              595.823      465.6     1.28   0.210

sigma                   1423.78   RSS                  62841477.8
Reduced form sigma       3145.8
no. of observations          33   no. of parameters             2
no. endogenous variables      2   no. of instruments            2
mean(C)                 13966.2   var(C)             7.99929e+007

Additional instruments:
[0] = I
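The manual second stage (EQ 3) and the IVE procedure (EQ 4) give the same coefficients but different standard errors. A likely explanation, offered here as a sketch rather than a description of PcGive's internals, is the standard 2SLS result: the manual regression computes sigma from residuals based on Yhat, whereas the correct IV sigma uses residuals based on the actual Y. With the same degrees of freedom, the manual standard errors can then be corrected by the ratio of the two sigmas. The check below uses only figures printed in the outputs above; the variable names are hypothetical.

```python
# Figures copied from the PcGive output for EQ(3) and EQ(4).
sigma_manual = 3145.75   # EQ(3): sigma from regressing C on Yhat
sigma_iv = 1423.78       # EQ(4): sigma reported by the IVE procedure
se_manual = {"Constant": 1029.0, "Yhat": 0.04685}   # EQ(3) standard errors

# Rescale the manual standard errors by the ratio of the two sigmas.
scale = sigma_iv / sigma_manual
se_rescaled = {name: se * scale for name, se in se_manual.items()}

for name, se in se_rescaled.items():
    print(f"{name}: rescaled std. error = {se:.5g}")
```

The rescaled values come out close to the 0.02121 and 465.6 reported in EQ(4), allowing for rounding in the printed figures.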
Instrumental variables estimation in PcGive
• PcGive automates the two stage least squares (instrumental variables) estimation procedure and provides appropriate estimates of the second stage standard errors.
• The screen grab shows how I formulated the IV estimation of my equation. The first endogenous variable listed is used on the LHS (i.e. C). The second endogenous variable (Y) is fitted by the "reduced form" equation, with I marked as an instrument.
References and recommended reading
• Wooldridge, J M (2006) Introductory Econometrics: A Modern Approach. Chapters 15 and 16
• Stock, J H and Watson, M (2007) Introduction to Econometrics (Second Edition). Chapter 12
• Kennedy, P (2003) A Guide to Econometrics. pp. 157-163
• Dougherty, C (2006) Introduction to Econometrics. pp. 288-290
• Doornik, J A and Hendry, D F (2006) Empirical Econometric Modelling: PcGive Vol 1. pp. 63-66, 165-166, 241-242