1 / 17

Lecture 20 Preview: Omitted Variables and the Instrumental Variables Estimation Procedure

Lecture 20 Preview: Omitted Variables and the Instrumental Variables Estimation Procedure. Revisit Omitted Explanatory Variable Bias. Review of Our Previous Explanation of Omitted Explanatory Variable Bias.

linore
Download Presentation

Lecture 20 Preview: Omitted Variables and the Instrumental Variables Estimation Procedure

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Lecture 20 Preview: Omitted Variables and the Instrumental Variables Estimation Procedure Revisit Omitted Explanatory Variable Bias Review of Our Previous Explanation of Omitted Explanatory Variable Bias Omitted Explanatory Variable Bias and the Explanatory Variable/Error Term Independence Premise The Ordinary Least Squares Estimation Procedure, Omitted Explanatory Variable Bias, and Consistency Instrumental Variable (IV) Estimation Procedure: A Two Regression Procedure The Mechanics The Two “Good” Instrument Conditions Omitted Explanatory Variables Example: 2008 Presidential Election Instrument Variable (IV) Application: 2008 Presidential Election The Mechanics The Two “Good” Instrument Conditions Revisited Justifying the Instrumental variable (IV) Estimation Procedure

  2. Revisit Omitted Variables Phenomenon Review: Omitting an explanatory variable from a regression will bias the OLS estimation procedure whenever two conditions are met. Bias results if the omitted explanatory variable: influences the dependent variable; is correlated with an included explanatory variable. When these two conditions are met, the coefficient estimate of the included variable is a composite of two effects; the estimate of the included variable reflects the influence that the Direct Effect:The effect that the included explanatory variable actually has on the dependent variable. This is the effect we want regression analysis to isolate. Proxy Effect:The effect that the omitted explanatory variable has on the dependent variable because the included variable is acting as a proxy for the omitted explanatory variable. Model:yt = Const + x1x1t + x2x2t + et To illustrate the effect of omitted variables, assume that Question: What happens when we omit the explanatory variable x2 from the regression? x1 > 0 and x2 > 0 x1t and x2t are positively correlated. Effect of omitting the explanatory variable x2: Intuitive Approach Proxy effect causes upward bias. x1t Up  x1 and x2 positively correlated  Typically, x2t Up  yt Up  yt Up x2 > 0 x1 > 0 The direct effect is what we want the coefficient estimate to capture, the influence that x1t has on yt. Direct Effect Proxy Effect We will now see how we can approach the omitted variable phenomenon in a different way.

  3. Regression Model yt = Dependent variable xt = Explanatory variable et = Error term yt = Const + xxt + et Const and x are the parameters t = 1, 2, …, T The error term is a random variable representing random influences: Mean[et] = 0 Standard Ordinary Least Squares (OLS) Premises Error Term Equal Variance Premise: The variance of the error term’s probability distribution for each observation is the same. Error Term/Error Term Independence Premise: The error terms are independent. Explanatory Variable/Error Term Independence Premise: The explanatory variables, the xt’s, and the error terms, the et’s, are not correlated. OLS Estimation Procedure Includes Three Estimation Procedures Value of the parameters,Const and x: bx = bConst = Claim: The omitted variable phenomenon violates the explanatory variable/error term independence premise. SSR Variance of the error term’s probability distribution,Var[e]: EstVar[e] = Degrees of Freedom Variance of the coefficient estimate’s probability distribution,Var[bx]: EstVar[bx] = Good News: When the standard premises are satisfied each of these procedures is unbiased. Good News: When the standard premises are satisfied the OLS estimation procedure for the coefficient value is the best linear unbiased estimation procedure (BLUE). Crucial Point: When the ordinary least squares (OLS) estimation procedure performs its calculations, it implicitly assumes that the three standard (OLS) premises are satisfied.

  4. Effect of omitting the explanatory variable x2: Error term approach yt = Const + x1x1t + x2x2t + et x1 > 0 and x2 > 0 = Const + x1x1t + (x2x2t + et) x1t and x2t are positively correlated. = Const + x1x1t + t when x2t is omitted, the error term becomes:t = x2x2t + et x1t Up  x1 and x2 positively correlated  Typically, x2t Up t = x2x2t + et and x2 > 0 t Up x1t is positively correlated with t OLS estimation procedure for the value of x1’s coefficient is biased upward Question: From before we know that the ordinary least squares (OLS) estimation procedure for the included coefficient’s value is biased in this case, but might the ordinary least squares (OLS) estimation procedure be consistent?

  5. Review: Unbiased and Consistent Estimation Procedures Unbiased: Small Sample Property. The estimation procedure does not systematically underestimate or overestimate the actual value. Formally, the mean of the estimate’s probability distribution equals the actual value. Mean[Est] = Actual Value When the estimate’s probability distribution is symmetric, the chances that the estimate is greater than the actual value equal the chances that it is less. Unbiasedness is called a small sample property because it does not depend on the sample size. Unbiasedness depends only of the mean of the estimate’s probability distribution. Consistent: Large Sample Property. Both the mean and variance of the estimate’s probability distribution are important for consistency: Mean of the estimate’s probability distribution: Either The estimation procedure is unbiased: Mean[Est] = Actual Value or The estimation procedure is biased, but the magnitude of the bias diminishes as the sample size becomes larger. Formally, as the sample size approaches infinity the mean approaches the actual value: As Sample Size : Mean[Est]  Actual Value Variance of the estimate’s probability distribution: The variance diminishes as the sample size becomes larger Formally, as the sample size approaches infinity the variance approaches 0: As Sample Size : Variance[Est]  0

  6. Question: Might the ordinary least squares (OLS) estimation procedure be consistent? Estimation Corr Actual Sample Actual Mean of Magnitude Variance of Procedure X1&X2 Coef2 Size Coef Coef1 Ests of Bias Coef1 Ests OLS .60 4.0 50 2.0 OLS .60 4.0 100 2.0 OLS .60 4.0 150 2.0 4.4 2.4 6.6 4.4 2.4 3.2 4.4 2.4 2.1 Question: Is the OLS estimation procedure unbiased? No. Lab 20.1 Question: Is the OLS estimation procedure consistent? No. Summary: When an explanatory variable is omitted that influences the dependent variable Question: Why is there a problem? is correlated with an included explanatory variable the ordinary least squares (OLS) estimation procedure for the value of the included variable’s coefficient is: yt = Const+ x1x1t + x2x2t + et yt= Const+ x1x1t + t t = x2x2t + et biased. Whenx1t and tare correlated not consistent.  x1t is a “problem” explanatory variable “Problem” Explanatory Variable:xt is the “problem” explanatory variable: The explanatory variable, x1t, is correlated with the error term, t. Consequently, the explanatory variable/error term independence premise is violated. The ordinary least squares (OLS) estimation procedure for the coefficient value is biased.

  7. Addressing the “Problem” Explanatory Variable Using Instrumental Variables Choose an Instrument: A “good” instrument, zt, must meet two conditions. The instrument, zt, must be: Correlated with the “problem” explanatory variable, xt. Uncorrelated with the error term, t. Instrumental Variables (IV) Regression 1: Use the instrument, zt, to provide an “estimate” of the problem explanatory variable, xt. Dependent Variable:“Problem” explanatory variable, xt. Explanatory Variable: Instrument, zt. Estimate of the problem explanatory variable: Estxt = aConst + azzt where aConst and az are the estimates of the constant and coefficient in this regression, IV Regression 1. Instrumental Variables (IV) Regression 2:In the original model, replace the “problem” explanatory variable, xt, with its surrogate, Estxt, the estimate of the “problem” explanatory variable provided by the instrument, zt, from IV Regression 1. Dependent Variable: Original dependent variable, yt. Explanatory Variable:Estimate of the “problem” explanatory variable based on the results from IV Regression 1, Estxt.

  8. Justifying the Instrument Conditions Instrument/”Problem” Explanatory Variable Correlation: The instrument, zt, must be correlated with the “problem” explanatory variable, xt. Focus on Instrumental Variables (IV) Regression 1: Use the instrument, zt, to provide an estimate of the “problem” explanatory variable, xt. Dependent Variable: “Problem” Explanatory Variable, xt. Explanatory Variable: Instrument, zt We are using the instrument to create a surrogate for the “problem” explanatory variable: Estxt = aConst + azzt The estimate, Estxt, will be a “good” surrogate only if the instrument is correlated with the problem explanatory variable. This will occur only if the instrument, zt, is correlated with the “problem” explanatory variable, xt.

  9. Instrument/Error Term Independence: The instrument, zt, must be independent of the error term, t. Focus on Instrumental Variables (IV) Regression 2 In the original model, replace the “problem” explanatory variable, xt, with its surrogate, Estxt, the estimate of the “problem” explanatory variable provided by the instrument, zt, from IV Regression 1. Original Model: yt = Const+ xxt + t Replace the “problem” explanatory variable with its surrogate yt = Const+ xEstxt + t Estxt = aConst + azzt zt and t must be independent. To avoid violating the explanatory variable/error term independence premise  Estxt and t must be independent

  10. Instrumental Variable Example: 2008 Presidential Election AdvDegtPercent adults who have advanced degrees in state t ColltPercent adults who graduated from college in state t HStPercent adults who graduated from high school in state t PopDentPopulation density of state t (persons per square mile) VoteDemPartyTwotPercent of the vote received in 2008 received by the Democratic party in state t based on the two major parties (percent) Model: VoteDemPartyTwot = Const + PopDenPopDent + LibLiberal+ et Population Density Theory: States with high population densities have large urban areas which are more likely to vote for the Democratic candidate, Obama. PopDen > 0 “Liberalness” Theory: Since the Democratic party is more liberal than the Republican party, a high “liberalness” value would increase the vote of the Democratic candidate, Obama. Lib > 0 But we have no data for a state’s “liberalness”; hence, we have no choice – we must omit it. EViews

  11. Question: Might there be a serious econometric problem when Liberal is omitted? Model: VoteDemPartyTwot= Const + PopDenPopDent + LibLiberal+ et =Const + PopDenPopDent + (LibLiberal+ et) =Const + PopDenPopDent+ t where t = LibLiberal + et PopDent and Liberaltpositively correlated Typically, Liberalt Up PopDent Up t = LibLiberal + et and Lib> 0 t Up PopDent is the “problem” explanatory variable. PopDent is positively correlated with t OLS estimation procedure for PopDen’s coefficient is biased Summary: When Liberal is omitted PopDen is a “problem” explanatory variable. It is correlated with the error term, t. The explanatory variable/error term independence premise is violated.

  12. Applying the Instrumental Variable (IV) Approach Choose an Instrument: We shall choose Collt as our instrument. By doing so, we believe that it satisfies the two “good” instrument conditions; that is, we believe that the percent of college graduates, Collt, is: positively correlated with the “problem” explanatory variable, PopDent uncorrelated with the error term, t = LibLiberalt + et. Consequently, we believe that the instrument , Collt, is uncorrelated with the omitted variable, Liberalt. EViews Instrumental Variables (IV) Regression 1 Dependent variable: “Problem” explanatory variable, PopDen Explanatory variable: Instrument, Coll. Estimated Equation: EstPopDen = 3,676.3 + 149.3Coll

  13. Instrumental Variables (IV) Regression 2 Dependent Variable: Original dependent variable, VoteDemPartyTwo. Explanatory Variable:Estimate of the “problem” explanatory variable based on the results from Regression 1, EstPopDen. EstPopDen = 3,676.3 + 149.3Coll EViews Estimated Equation: VoteDemPartyTwo = 48.5 + .01EstPopDen

  14. Good Instrument Conditions Revisited Instrument/”Problem” Explanatory Variable Correlation: The instrument, Collt, must be correlated with the “problem” explanatory variable, PopDent . Their correlation coefficient equals .60 Instrumental Variables (IV) Regression 1 The coefficient estimate is significant at the 1 percent level. Consequently, it is not unreasonable to judge that the instrument meets the first condition.

  15. Instrument/Error Term Independence: The instrument, Collt, and the error term, t, must be independent. VoteDemPartyTwot = Const + PopDenEstPopDent + t Question: Are EstPopDent and tindependent? t = LibLiberal + et EstPopDen = 3,676.3 + 149.3Coll Answer: Only if Collt and Liberalt are independent. The explanatory variable/error term independence premise will be satisfied only if the instrument, Collt, and the omitted variable, Liberalt, are independent. Unfortunately, there is no way to confirm this empirically with our data. This can be the “Achilles heel” of the instrumental variable (IV) estimation procedure.

  16. Justifying the Instrumental Variable (IV) Approach: A Simulation Model: yt = Const + x1x1t + x2x2t + et Defaults The Only X1 button is selected. Hence, second explanatory variable, x2, is omitted. yt = Const + x1x1t + (x2x2t + et) = Const + x1x1t + t where t = x2x2t + et The actual coefficient for the omitted explanatory variable, x2t, equals 4. Corr X1&Z Corr X2&Z Corr X1&X2 .50 .75 .00 .10 .00 .30 .60 In the CorrX1&Z list .50 is selected. Hence, the included explanatory variable and the instrument are correlated. In the CorrX2&Z list .00 is selected. Hence, the instrument and the omitted variable are uncorrelated; consequently, the instrument and the error term are independent. In the CorrX1&X2 list .60 is selected. Hence, the included and omitted explanatory variables are correlated.

  17. Estimation Correlation Coefficients Sample Actual Mean of Magnitude Var of Procedure X1&Z X2&Z X1&X2 Size Coef1 Coef1 Ests of Bias Coef1 Ests IV .50 .00 .60 50 2.0 IV .50 .00 .60 100 2.0 IV .50 .00 .60 150 2.0 1.82 .18 32.8 1.89 .11 13.7 1.95 .05 9.2 Lab 20.2 Question: Is the IV estimation procedure: Unbiased? No. Consistent? Yes. Instrument/”Problem” Explanatory Variable Correlation: Suppose that the instrument is more highly correlated with the “problem” explanatory variable: Estimation Correlation Coefficients Sample Actual Mean of Magnitude Var of Procedure X1&Z X2&Z X1&X2 Size Coef1 Coef1 Ests of Bias Coef1 Ests IV .75 .00 .60 150 2.0 1.97 .03 3.9 Question: Is the instrument better? Yes. Both the magnitude of the bias and the variance of the estimates is less. Lab 20.2 Instrument/Error Term Independence: Suppose that the instrument is correlated with the error term? Estimation Correlation Coefficients Sample Actual Mean of Magnitude Var of Procedure X1&Z X2&Z X1&X2 Size Coef1 Coef1 Ests of Bias Coef1 Ests IV .50 .10 .60 50 2.0 IV .50 .10 .60 100 2.0 IV .50 .10 .60 150 2.0 2.67 .67 31.1 2.70 .70 13.8 2.73 .73 8.7 Lab 20.3 Question: Is the IV estimation procedure: Unbiased? No. Consistent? No.

More Related