ENDOGENEITY - SIMULTANEITY Development Workshop
What is endogeneity and why do we not like it? [REPETITION] • Three causes: • X influences Y, but Y reinforces X too • Z causes both X and Y fairly contemporaneously • X causes Y, but we cannot observe X, and Z (which we do observe) is influenced by both X and Y • Consequences: • No matter how many observations we have, the estimators stay biased (this is called inconsistency) • Ergo: whatever point estimates we find, we cannot even tell whether they are positive/negative/significant, because we do not know the size of the bias and have no way to estimate it
The magic of „ceteris paribus” • Each regression coefficient is in fact a ceteris paribus statement • Problem: the data may be at odds with ceteris paribus • Examples?
Problems with Inferring Causal Effects from Regressions • Regressions tell us about correlations, but ‘correlation is not causation’ • Example: a regression of whether you currently have a health problem on whether you have been in hospital in the past year:

     HEALTHPROB |    Coef.   Std. Err.      t
    ------------+--------------------------------
        PATIENT |  .262982   .0095126    27.65
          _cons |  .153447   .003092     49.63

• Do hospitals make you sick? – is this really a causal effect?
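A minimal simulated sketch (Python; all variable names and numbers are illustrative, not the data behind the regression above) of how an unobserved confounder produces exactly this kind of coefficient: hospital stays have zero true effect on health problems, yet OLS finds a clearly positive one because underlying sickness drives both.

  # Sketch: PATIENT has NO causal effect on HEALTHPROB, but unobserved
  # sickness raises both, so OLS still returns a positive coefficient.
  import numpy as np
  import statsmodels.api as sm

  rng = np.random.default_rng(0)
  n = 10_000
  sickness = rng.normal(size=n)                                      # unobserved confounder
  patient = (sickness + rng.normal(size=n) > 1.0).astype(float)      # hospital stay last year
  healthprob = (sickness + rng.normal(size=n) > 1.0).astype(float)   # current health problem

  ols = sm.OLS(healthprob, sm.add_constant(patient)).fit()
  print(ols.params)   # slope on "patient" is clearly positive despite no causal link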
The problem in causal inference in case of simultaneity [Diagram: an unobserved Confounding Influence drives both the observed Treatment and the observed Outcome, alongside the Treatment → Outcome link]
Any solutions? [Diagram repeated: the unobserved Confounding Influence still drives both the observed Treatment and Outcome]
Instrumental Variables solution… [Diagram: an observed Instrumental Variable(s) affects the Treatment, but has no direct link to the Outcome or to the unobserved Confounding Influence]
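A hand-rolled two-stage least squares sketch on simulated data (the names and the data-generating process are assumptions for illustration; a real application would use a dedicated IV routine, since the second-stage standard errors below are not valid):

  # Sketch of 2SLS: x is endogenous (shares the unobservable u with y),
  # z is a valid instrument (moves x, enters y only through x).
  import numpy as np
  import statsmodels.api as sm

  rng = np.random.default_rng(1)
  n = 50_000
  u = rng.normal(size=n)                      # unobserved confounding influence
  z = rng.normal(size=n)                      # instrument
  x = 0.8 * z + u + rng.normal(size=n)        # endogenous treatment
  y = 1.0 * x - 2.0 * u + rng.normal(size=n)  # true causal effect of x is 1.0

  naive = sm.OLS(y, sm.add_constant(x)).fit()                # biased: picks up u
  x_hat = sm.OLS(x, sm.add_constant(z)).fit().fittedvalues   # first stage
  iv = sm.OLS(y, sm.add_constant(x_hat)).fit()               # second stage
  print(naive.params[1], iv.params[1])        # OLS far from 1.0, 2SLS close to it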
Fixed Effects solution… (DiD does pretty much the same) [Diagram: the unobserved part is split into time-invariant Fixed Influences and a remaining Confounding Influence, both affecting the observed Treatment and Outcome]
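A sketch of the within (fixed-effects) transformation on a simulated panel (names and numbers are illustrative): demeaning each unit's data removes any time-invariant confounder before OLS is run.

  # Sketch: a time-invariant unit characteristic 'a' confounds x and y;
  # demeaning within each unit wipes it out (the within/FE transformation).
  import numpy as np
  import pandas as pd
  import statsmodels.api as sm

  rng = np.random.default_rng(2)
  units, periods = 500, 6
  a = np.repeat(rng.normal(size=units), periods)             # fixed unit effect (confounder)
  unit = np.repeat(np.arange(units), periods)
  x = a + rng.normal(size=units * periods)                   # treatment depends on a
  y = 1.0 * x + 2.0 * a + rng.normal(size=units * periods)   # true effect of x is 1.0

  df = pd.DataFrame({"unit": unit, "x": x, "y": y})
  within = df[["x", "y"]] - df.groupby("unit")[["x", "y"]].transform("mean")
  fe = sm.OLS(within["y"], within["x"]).fit()                # no constant needed after demeaning
  print(fe.params["x"])                                      # close to the true 1.0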
Short motivating story – ALMPs in Poland • Basic statement: 50% of the unemployed have found employment because of ALMPs • Facts: • 50% of whom? – only of those who were treated (only they were monitored) • only 90% of the treated completed the programmes • of those who completed, indeed 50% work, but only 60% of those who work say it was because of the programme • So how many were actually employed because of the programme? (see the quick calculation below)
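Taking the rounded shares quoted above at face value, a quick product gives the answer as a sketch: 0.90 (completed) × 0.50 (of completers employed) × 0.60 (of the employed crediting the programme) ≈ 0.27, i.e. roughly 27% of the treated – far below the headline 50%.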
Short motivating story – ALMPs in Poland

                                                   %    Product
  Completed training                               …    90
  … found employment … (gross effectiveness)       …    52
  … thanks to programme … (net effectiveness)      …    30
  Net efficiency?                                       ???
Basic problems in causal inference • Compare somebody „before” and „after” • If they were already different before, the differential will be wrongly attributed to the „treatment” • can we measure/capture this inherent difference? • does it stay unchanged „before” and „after”? • what if we only know „after”? • If the difference stays the same => DiD estimator (see the sketch below) => an assumption that cannot be tested • What if the difference cannot be believed to stay the same?
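A minimal difference-in-differences sketch on simulated data (all numbers are illustrative): the treated group starts at a different level and both groups share a common trend, yet differencing twice recovers the treatment effect, provided the common-trend assumption holds.

  # Sketch of the DiD estimator on a simulated 2x2 design: a permanent group
  # gap and a common time trend both cancel out of the double difference.
  import numpy as np

  rng = np.random.default_rng(3)
  n = 20_000
  treated = rng.integers(0, 2, size=n)           # group indicator
  post = rng.integers(0, 2, size=n)              # before/after indicator
  y = (1.5 * treated                             # pre-existing group difference
       + 0.7 * post                              # common trend
       + 2.0 * treated * post                    # true treatment effect = 2.0
       + rng.normal(size=n))

  did = ((y[(treated == 1) & (post == 1)].mean() - y[(treated == 1) & (post == 0)].mean())
         - (y[(treated == 0) & (post == 1)].mean() - y[(treated == 0) & (post == 0)].mean()))
  print(did)   # close to 2.0; the group gap and the common trend drop out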
Faked counterfactual, or generating a parallel world • MEDICINE: take control groups – people who are just as sick but get a different treatment or a placebo => experimenting • What if an experiment is impossible?
What if an experiment is impossible?

  Only cross-sectional data            | Panel data
  -------------------------------------+---------------------------------------------
  Instrumental variables               | „Propensity Score Matching” + DiD
  „Propensity Score Matching”          | Before-After estimators
  „Regression Discontinuity Design”    | Difference-in-Differences estimators (DiD)
Propensity Score Matching [Diagram: the Treatment → Outcome link with the unobserved Confounding Influence, now with a second (matched, untreated) Treatment group added for comparison]
Propensity score matching • Average treatment effect (ATE): E(Y1 - Y0) = E(Y1) - E(Y0) • Average treatment effect for the untreated: E(Y1 - Y0 | D=0) = E(Y1 | D=0) - E(Y0 | D=0) • Average treatment effect for the treated (ATT): E(Y1 - Y0 | D=1) = E(Y1 | D=1) - E(Y0 | D=1)
Propensity Score Matching • Idea • Compares outcomes of similar units where the only difference is the treatment; discards the rest • Example • Low-ability students will have lower future achievement, and are also more likely to be retained in class • A naïve comparison of untreated and treated students creates bias, because the untreated do better in the post period • Matching methods make the proper comparison • Problems • If similar units do not exist, this estimator cannot be used
How to get the PSM estimator? • First stage: regress „treatment” status on observable characteristics • Second stage: use this model to estimate each unit's probability of „treatment” (the propensity score) • Third stage: compare the outcomes of the „treated” with those of similar non-treated units („statistical twins”) • The less similar the units are, the less they should be compared with one another (a sketch of these stages follows below)
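A minimal sketch of the three stages on simulated data (all variable names and the data-generating process are assumptions for illustration, not a recommended production implementation):

  # Sketch of 1:1 nearest-neighbour propensity score matching.
  import numpy as np
  from sklearn.linear_model import LogisticRegression

  rng = np.random.default_rng(4)
  n = 5_000
  x = rng.normal(size=(n, 2))                                         # observable characteristics
  d = rng.binomial(1, 1 / (1 + np.exp(-(x[:, 0] + 0.5 * x[:, 1]))))   # treatment depends on x
  y = 1.0 * d + x[:, 0] + 0.5 * x[:, 1] + rng.normal(size=n)          # true treatment effect = 1.0

  # Stages 1-2: model treatment on observables, predict the propensity score.
  pscore = LogisticRegression().fit(x, d).predict_proba(x)[:, 1]

  # Stage 3: match each treated unit to the untreated unit with the closest score.
  ps_t, ps_c = pscore[d == 1], pscore[d == 0]
  y_t, y_c = y[d == 1], y[d == 0]
  nearest = np.abs(ps_t[:, None] - ps_c[None, :]).argmin(axis=1)
  att = (y_t - y_c[nearest]).mean()
  print(att)   # close to 1.0; the naive y_t.mean() - y_c.mean() is biased upward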
The exact value of the obtained propensity score is irrelevant (as long as it is estimated consistently) NEAREST NEIGHBOR (NN) • Pros => the so-called 1:1 match • Cons => if a close 1:1 match does not exist, the comparison is completely senseless
The exact value of the obtained propensity score is irrelevant (as long as it is estimated consistently) CALIPER/RADIUS MATCHING • Pros => more flexible than NN • Cons => who specifies the radius/caliper?
The exact value of the obtained propensity score is irrelevant (as long as it is estimated consistently) STRATIFICATION AND INTERVAL MATCHING • Pros => eliminates discretion in the radius/caliper choice • Cons => within a stratum/interval, units do not have to be „similar” (some people say 10 strata are enough)
The exact value of the obtained propensity score is irrelevant (as long as it is estimated consistently) KERNEL MATCHING (KM) • Pros => always uses all observations • Cons => need to remember about common support
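A kernel-matching sketch (the Gaussian kernel and the bandwidth value are assumptions; variable names follow the matching sketch above): each treated unit's counterfactual is a weighted average of all untreated outcomes, with weights falling in the propensity-score distance.

  # Sketch of kernel matching: weight every untreated outcome by a Gaussian
  # kernel of the propensity-score distance, then average per treated unit.
  import numpy as np

  def kernel_att(y, d, pscore, h=0.05):
      ps_t, ps_c = pscore[d == 1], pscore[d == 0]
      y_t, y_c = y[d == 1], y[d == 0]
      w = np.exp(-0.5 * ((ps_t[:, None] - ps_c[None, :]) / h) ** 2)   # kernel weights
      w /= w.sum(axis=1, keepdims=True)                               # normalise per treated unit
      return (y_t - w @ y_c).mean()                                   # ATT

With the simulated y, d and pscore from the matching sketch above, kernel_att(y, d, pscore) gives an ATT close to the nearest-neighbour estimate.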
What is „common support”? • The distributions of the pscore may differ substantially between treated and untreated units • Comparisons are only sensible where the two distributions overlap! (see the sketch below)
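A simple common-support restriction as a sketch (the min/max overlap rule is only one of several options; names follow the sketches above): keep only units whose propensity score lies inside the overlap of the treated and untreated score ranges.

  # Sketch: drop units whose propensity score falls outside the region where
  # both treated and untreated scores are observed (the common support).
  import numpy as np

  def common_support_mask(pscore, d):
      lo = max(pscore[d == 1].min(), pscore[d == 0].min())   # larger of the two minima
      hi = min(pscore[d == 1].max(), pscore[d == 0].max())   # smaller of the two maxima
      return (pscore >= lo) & (pscore <= hi)                 # True = inside common support

Matching or kernel estimates would then be computed only on y[mask], d[mask] and pscore[mask], where mask = common_support_mask(pscore, d).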
Next week – practical exercise • Read the papers posted on the web • I will post the one that we will replicate soon…