Instrumental Variables: Introduction Methods of Economic Investigation Lecture 14
Last Time • Review of Causal Effects • Defining our types of estimates: • ATE (hypothetical) • TOT (can get this if SB = 0) • ITT (use this if we’ve got compliance problems) • Methods • Experiment (gold standard but can’t always get it) • Fixed Effects (assumption on within group variation) • Difference-in-Differences (assumption on parallel trends) • Propensity Score Matching (assumption on relationship between observables, unobservables, and treatment)
Today’s Class • Introduction to Instrumental Variables • What are they • How do we estimate IV • Tests for specification/fit
Recap of the problem • There is some part of the error that we don’t observe (maybe behavioral parameters, maybe simultaneously determined component, etc.) • This component might not be: • Fixed within a group • Fixed over time/space • Related to observables • BUT…this component IS correlated with the treatment/variable of interest
Our Treatment Effects Model • Consider the following model to estimate the effect of treatment S on some outcome Y: Y = αX + ρS + η • Our Treatment here is S • Think of the example of schooling • How much more will you earn if you go to college? • Can’t observe true underlying ability which is correlated with college attendance decision and future earnings
What’s correlated and what’s not • The model we want to estimate: Yi = αX + ρsi + γAi + vi • That is, the error in our treatment effects model is ηi = γAi + vi • We have that: • E[sv] = 0 (by assumption) • E[Av] = 0 (by construction) • The idea: if A could be observed, we’d just include it in the regression and be done
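To see why leaving A out matters, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the parameter values, the normal distributions, the helper `ols`, and the added intercept are not from the lecture. It compares the regression that omits ability with the (infeasible) one that includes it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
rho, gamma = 0.10, 0.50                  # hypothetical return to schooling and ability effect

A = rng.normal(size=n)                   # ability: unobserved in real data
s = 12 + 2.0 * A + rng.normal(size=n)    # schooling choice depends on ability
v = rng.normal(size=n)                   # error uncorrelated with s and A
y = 1.0 + rho * s + gamma * A + v        # outcome (e.g. log earnings)

def ols(y, *cols):
    """OLS coefficients of y on a constant plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(X, y, rcond=None)[0]

rho_short = ols(y, s)[1]                 # omits A, so it picks up gamma * Cov(A, s) / Var(s)
rho_long = ols(y, s, A)[1]               # includes A, so it recovers rho

print(f"omitting A: {rho_short:.3f}   including A: {rho_long:.3f}   truth: {rho}")
```

Under these assumed values the omitted-variable term is γ·Cov(A, s)/Var(s) = 0.5 × 2/5 = 0.2, so the short regression should land near 0.30 rather than 0.10.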
The Instrument… • [Figure: two groups, A = Not Assigned Treatment (S=0) and B = Assigned to Treatment (S=1), each containing AH = 1 and AL = 0 individuals] • ITT compares all of A to all of B: this mixes up the compliers (AH=S=1; AL=S=0) and the non-compliers (AH=1, S=0; AL=0, S=1)
Introducing Instruments • The problem: How to estimate ρ when • A is not observed • A is related to Y • Cov(A, S) ≠ 0 • The solution: find something that is • Correlated with S [“Relevance”, i.e. a first stage exists] • Uncorrelated with any other determinant of the outcome variable Y [“Exclusion Restriction”]
How does IV work • Call our instrument z • Our two instrument characteristics can be re-written as • Cov(z, S) ≠ 0 • Cov(z, η) = 0 • Then from our equations we can write our population estimate of ρ as: ρ = Cov(Y, z) / Cov(s, z)
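The ratio Cov(Y, z) / Cov(s, z) can be checked on simulated data. The sketch below assumes a hypothetical data-generating process (a coin-flip instrument, normal errors, made-up coefficients, no covariates) and compares the covariance ratio with the OLS slope of y on s.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
rho = 0.10                                         # hypothetical true effect of s on y

A = rng.normal(size=n)                             # unobserved ability
z = rng.binomial(1, 0.5, size=n)                   # instrument: independent of A and the errors
s = 12 + 1.0 * z + 2.0 * A + rng.normal(size=n)    # z shifts s (relevance)
y = 1.0 + rho * s + 0.5 * A + rng.normal(size=n)   # z affects y only through s (exclusion)

rho_iv = np.cov(y, z)[0, 1] / np.cov(s, z)[0, 1]   # Cov(Y, z) / Cov(s, z)
rho_ols = np.cov(y, s)[0, 1] / np.var(s, ddof=1)   # Cov(Y, s) / Var(s)

print(f"IV: {rho_iv:.3f}   OLS: {rho_ols:.3f}   truth: {rho}")
```

Because A enters both s and y, the OLS slope is pulled away from ρ, while the covariance ratio uses only the variation in s that z generates.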
The Instrument… • [Figure: two groups, A = Not Assigned Treatment (S=0) and B = Assigned to Treatment (S=1), each containing AH = 1 and AL = 0 individuals] • Using the instrument, we can determine where the partition is: then we can compare the part of A which was “randomly assigned” (AH=S=1) to the part of B that is “randomly assigned” (AL=S=0)
Simplest case for IV • Homogeneous treatment effects (same ρ for all i ) • Dummy Variable for instrument • z= 1 with probability q • Can break-up continuous instruments into sets of dummy variables or use GLS to generalize • For now—don’t worry about covariates • Simple extension: just include these in both stages • Simplify our notation later…
Return to LATE • Using z as a dummy that’s 1 with probability q • Cov(Y, z) = {E[Y | z = 1] – E[Y | z = 0]}q(1 – q) • Cov(s, z) = {E[s | z = 1] – E[s | z = 0]}q(1 – q) • Can rewrite ρ as: ρ = {E[Y | z = 1] – E[Y | z = 0]} / {E[s | z = 1] – E[s | z = 0]} • Should look familiar: it’s our LATE estimate
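The covariance expressions on this slide follow in a few lines once z is a dummy with P(z = 1) = q. A short derivation, using only the slide's symbols and the law of iterated expectations:

```latex
\begin{align*}
\operatorname{Cov}(Y, z) &= E[Yz] - E[Y]\,E[z] \\
  &= q\,E[Y \mid z = 1] - q\,\bigl(q\,E[Y \mid z = 1] + (1 - q)\,E[Y \mid z = 0]\bigr) \\
  &= q(1 - q)\,\bigl\{E[Y \mid z = 1] - E[Y \mid z = 0]\bigr\}
\end{align*}
```

The same steps with s in place of Y give Cov(s, z), so the factor q(1 – q) cancels in the ratio Cov(Y, z) / Cov(s, z).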
Another type of intuition • Remember that E[η | S] ≠ 0 (that’s why we’re in this mess) • So E[Y | S] ≠ ρS • Can condition on z, rather than S • By the “exclusion restriction” property of our instrument E[η | z] = 0 • So now can estimate ρ because E[Y | z] = ρE[S | z] • If z is binary, then this simplifies to our Wald estimator
IV estimate intuition • The only reason for a relationship between z and Y is the relationship between z and S • In dummy variable specification: this is just rescaling the reduced form difference in means (E[Y | z=1] – E[Y | z=0]) by the first stage difference in means (E[S | z=1] – E[S | z=0])
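The rescaling described above is short to write down. In this sketch the function name `wald_estimate` and the eight-observation toy data set are invented for illustration; the data are built so that y = 1 + 2s exactly.

```python
import numpy as np

def wald_estimate(y, s, z):
    """Reduced-form difference in means divided by first-stage difference in means."""
    y, s, z = (np.asarray(a, dtype=float) for a in (y, s, z))
    on = z == 1
    reduced_form = y[on].mean() - y[~on].mean()   # E[Y | z=1] - E[Y | z=0]
    first_stage = s[on].mean() - s[~on].mean()    # E[S | z=1] - E[S | z=0]
    return reduced_form / first_stage

# Toy data (hypothetical): y = 1 + 2*s, and z raises the share treated from 1/4 to 3/4.
z = np.array([0, 0, 0, 0, 1, 1, 1, 1])
s = np.array([0, 0, 0, 1, 0, 1, 1, 1])
y = 1.0 + 2.0 * s

print(wald_estimate(y, s, z))
```

Here the reduced form is 2.5 – 1.5 = 1.0 and the first stage is 0.75 – 0.25 = 0.5, so the ratio recovers the slope of 2 built into the toy data. It also matches Cov(Y, z) / Cov(s, z) computed on the same arrays.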
How does IV work: Regression Intuition • To see why this is, think about our “structural equation” Yi = αX + ρsi + ηi • We can estimate ρ by getting the ratio of two different coefficients • First stage: si = π10X + π11zi + ξ1i • Reduced form: yi = π20X + π21zi + ξ2i • Here si is the endogenous variable, zi is the exogenous instrument, and X are the exogenous covariates
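Rewriting the Structural equation • Plug in the values from the first stage:
Yi = αX + ρsi + ηi
= αX + ρ[π10X + π11zi + ξ1i] + ηi
= [α + ρπ10]X + ρπ11zi + [ρξ1i + ηi]
= π20X + π21zi + ξ2i
• Matching terms: π20 = α + ρπ10, π21 = ρπ11, and ξ2i = ρξ1i + ηi, so ρ = π21 / π11 • Equivalently, Yi = αX + ρ[π10X + π11zi] + ξ2i, where [π10X + π11zi] is the fitted value in the population regression of s on z (and X) • So ρ is the coefficient on s in the structural equation, and also the coefficient in the population regression of y on the fitted value of s (and the X’s)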
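The ratio of the reduced-form coefficient on z to the first-stage coefficient on z can be computed directly. The sketch below assumes a hypothetical data-generating process and adds an intercept, which the slide equations leave implicit. It also backs out α from π20 = α + ρπ10.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
alpha, rho = 0.30, 0.10                      # hypothetical structural parameters

X = rng.normal(size=n)                       # exogenous covariate
A = rng.normal(size=n)                       # unobserved confounder
z = rng.binomial(1, 0.5, size=n)             # exogenous instrument
s = 0.5 * X + 1.0 * z + 2.0 * A + rng.normal(size=n)
y = alpha * X + rho * s + 0.5 * A + rng.normal(size=n)

W = np.column_stack([np.ones(n), X, z])      # regressors shared by both equations
pi1 = np.linalg.lstsq(W, s, rcond=None)[0]   # first stage:  [const, pi10, pi11]
pi2 = np.linalg.lstsq(W, y, rcond=None)[0]   # reduced form: [const, pi20, pi21]

rho_ils = pi2[2] / pi1[2]                    # pi21 / pi11
alpha_ils = pi2[1] - rho_ils * pi1[1]        # alpha = pi20 - rho * pi10

print(f"rho: {rho_ils:.3f} (truth {rho})   alpha: {alpha_ils:.3f} (truth {alpha})")
```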
Population vs. Estimates • If we had the entire population, we could measure the relationship between z and S and obtain the true π’s • Using these π’s we could then obtain the true ρ • Unfortunately, most of the time, we have finite samples
Estimating 2SLS • In practice, use finite samples to obtain the fitted value • OLS gives consistent estimates of the first-stage parameters • Use these parameters to construct the fitted value • Then use this fitted value in place of s in the second-stage estimating equation • Can get consistent estimates because covariates and fitted values are • uncorrelated with η (by assumption) • uncorrelated with the first-stage residual ξ1i (by construction)
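The two steps can be sketched by hand with two OLS fits. The function name `two_stage_least_squares`, the simulated data, and the added intercept are assumptions for illustration only.

```python
import numpy as np

def two_stage_least_squares(y, s, z, X):
    """Manual 2SLS: regress s on (1, X, z), then regress y on (1, X, fitted s)."""
    const = np.ones(len(y))
    W = np.column_stack([const, X, z])                 # first-stage regressors
    pi_hat = np.linalg.lstsq(W, s, rcond=None)[0]      # estimated first stage
    s_hat = W @ pi_hat                                 # fitted value of s
    second = np.column_stack([const, X, s_hat])        # second-stage regressors
    beta = np.linalg.lstsq(second, y, rcond=None)[0]
    return beta[-1], s_hat                             # coefficient on s_hat, and s_hat

# Hypothetical data with an unobserved confounder A.
rng = np.random.default_rng(3)
n = 200_000
rho = 0.10
X = rng.normal(size=n)
A = rng.normal(size=n)
z = rng.binomial(1, 0.5, size=n)
s = 0.5 * X + 1.0 * z + 2.0 * A + rng.normal(size=n)
y = 0.3 * X + rho * s + 0.5 * A + rng.normal(size=n)

rho_2sls, s_hat = two_stage_least_squares(y, s, z, X)
resid_corr = np.mean(s_hat * (s - s_hat))              # zero by construction of OLS

print(f"2SLS: {rho_2sls:.3f} (truth {rho})   fitted-residual moment: {resid_corr:.2e}")
```

The point estimate from this procedure matches 2SLS, but the standard errors printed by an ordinary second-stage OLS routine would be wrong, because they use residuals computed with the fitted value instead of s. Dedicated IV routines correct for this.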
Bias in 2SLS • 2SLS is biased; we’ll talk about this in detail next time, but the general idea is: • We must estimate the first stage from the sample (we do not observe the true π’s) • In practice, the first-stage estimates reflect some of the randomness in the endogenous variable (e.g. S) • This randomness generates finite-sample correlations between first-stage fitted values and second-stage errors • The endogenous variable is correlated with the second-stage errors • Some of that correlation is left in the first-stage fitted value • Asymptotically this bias goes to zero, but in finite samples it might not
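A small Monte Carlo can make the finite-sample point concrete. This sketch is not the lecture's setup: it assumes 20 weak instruments, correlated normal errors, a true ρ of zero, and no intercept or covariates, chosen so that the bias is easy to see. The function name `simulate_mean_estimates` is hypothetical.

```python
import numpy as np

def simulate_mean_estimates(n, n_instruments=20, reps=500, rho=0.0, seed=0):
    """Average 2SLS and OLS estimates over repeated samples of size n."""
    rng = np.random.default_rng(seed)
    pi = np.full(n_instruments, 0.1)              # weak first-stage coefficients (assumed)
    est_2sls = np.empty(reps)
    est_ols = np.empty(reps)
    for r in range(reps):
        Z = rng.normal(size=(n, n_instruments))   # instruments
        u = rng.normal(size=n)                    # first-stage error
        eta = 0.8 * u + 0.6 * rng.normal(size=n)  # second-stage error, Cov(u, eta) = 0.8
        s = Z @ pi + u                            # endogenous variable
        y = rho * s + eta
        s_hat = Z @ np.linalg.lstsq(Z, s, rcond=None)[0]  # estimated first-stage fit
        est_2sls[r] = (s_hat @ y) / (s_hat @ s)   # 2SLS with no covariates
        est_ols[r] = (s @ y) / (s @ s)            # OLS slope without intercept
    return est_2sls.mean(), est_ols.mean()

small_2sls, small_ols = simulate_mean_estimates(n=100)
large_2sls, large_ols = simulate_mean_estimates(n=5000)

print(f"n=100:  mean 2SLS {small_2sls:.3f}, mean OLS {small_ols:.3f}")
print(f"n=5000: mean 2SLS {large_2sls:.3f}, mean OLS {large_ols:.3f}")
```

With the true ρ set to zero, the average estimate is itself the bias. In the small sample the fitted value absorbs part of u, so 2SLS is pulled toward OLS; in the large sample the first stage is estimated precisely and the 2SLS average moves close to zero while OLS stays biased.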
Next time: • Issues with IV estimates • Return to Consistency: what about bias? • Weak instruments • Heterogeneous Treatment Effects