
Impact Evaluation Using Non-Experimental Methods

Learn methods for establishing causation in impact evaluation, comparing treatment and control groups, dealing with selection bias, and analyzing outcomes over time. Discover tools like difference-in-differences and regression discontinuity design.


Presentation Transcript


  1. Measuring Impact: Non-Experimental Methods • Bilal Siddiqi • Istanbul, May 12, 2015

  2. Motivation

  3. Motivation

  4. Lesson Number 1: Correlation does not imply causation • Correlation: two things move together • Causation: one thing causes the other

  5. Impact evaluation is all about causation! Does the intervention (project/policy) CAUSE (good/bad) impacts on the beneficiaries?

  6. How do we establish causation in an IE? Compare WHAT HAPPENED with the intervention against WHAT WOULD HAVE HAPPENED in its absence. We need to find the counterfactual so we can compare.

  7. Counterfactual criteria • Need a “control group” to compare with our “treatment group” • Treatment and control groups have similar initial characteristics • on average • observed and unobserved • The only difference is that one group received the treatment • The only reason observed outcomes are different is due to the treatment

  8. In search of a counterfactual: Which tools?

  9. In search of a counterfactual: Non-experimental tools

  10. What is counterfactual analysis? • Compare (statistically) identical groups of individuals • with & without intervention • at the same point in time What can non-experimental methods do? • Compare similar groups • trying to make them as close to identical as possible

  11. Case study: Returns to capital in microenterprises • Problem: small firms are credit constrained • Intervention: one-time increases to capital stock – $100 and $200 • Main outcome: profit rates • Some figures: • 800 firms at the baseline (2007) • More than 50% of the sampled firms invest less than $200 • 300 firms applied for and received financing

  12. How can we evaluate this? Participants-nonparticipants • Idea: compare profit rates of firms that applied for and received credit with those that did not.

  13. Participants-nonparticipants • Problem: selection bias. Why did only 300 firms opt in? • Better performers anyway (observable) • Better entrepreneurs, better informed (unobservable) (Parts of this presentation build on material from Impact Evaluation in Practice, www.worldbank.org/ieinpractice)

  14. How can we evaluate this? Before-after • Idea: compare real profits of treated firms before and after the subsidized credit policy.

  15. Before-after (2007 vs 2008) • Problem: time difference. Other things may have changed over time: • an alternative program may have served untreated firms • untreated firms may have done much worse because they did not use the credit

  16. These two tools are wrong for IE • Before-after: compares the same subjects before and after they receive an intervention. Problem: other things may have happened over time. • Participants-nonparticipants: compares the group that is treated (participants) with the group that chooses not to be treated (nonparticipants). Problem: selection bias; we do not know why they are participating. • Both tools lead to biased estimates of the counterfactual and the impact.

  17. Before-after and Monitoring (Legovini) • Monitoring tracks indicators over time • among participants only • It is descriptive before-after analysis • It tells us whether things are moving in the right direction • It does not tell us why things happen or how to make more happen

  18. Impact Evaluation (Legovini) • Tracks average outcomes over time in the treatment group relative to the control group • Compares what DID happen with what WOULD HAVE happened (the counterfactual) • Identifies a causal effect, controlling for ALL other time-varying factors

  19. Non-Experimental Methods 1. Difference-in-differences (Diff-in-Diff) 2. Diff-in-Diff with matching 3. Regression discontinuity design (RDD)

  20. Non-Experimental Methods 1. Difference-in-differences (Diff-in-Diff) 2. Diff-in-Diff with matching 3. Regression discontinuity design (RDD)

  21. How can we evaluate this? Difference-in-differences • Idea: combine the time dimension (before-after) with the participation choice (participants-nonparticipants) • (under some assumptions) this deals with the problems above: • Time differences: other things that happened over time affect both participants and nonparticipants • Selection bias: we do not know why they are participating, but if the reason does not change over time…

  22. Before-after Impact = P2008 − P2007 = 2.1 − 1.5 = +0.6 pp

  23. Before-after + Participants-nonparticipants = Diff-in-Diff Impact = (P2008 − P2007) − (NP2008 − NP2007) = 0.6 − 0.2 = +0.4 pp (where NP2008 − NP2007 = 0.2)
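
A minimal sketch of this diff-in-diff computation in Python with statsmodels (our choice of tooling; the slides do not prescribe any software). The toy data are illustrative: only the treated change (1.5 to 2.1) and the untreated change (+0.2) come from the slides, while the untreated levels, group sizes, and noise are made up.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
# Cell means chosen to reproduce the slide's 2x2: treated 1.5 -> 2.1,
# untreated up by 0.2 (the untreated *levels* here are hypothetical).
means = {(0, 0): 1.0, (0, 1): 1.2, (1, 0): 1.5, (1, 1): 2.1}
for (treated, post), mean in means.items():
    rows.append(pd.DataFrame({
        "profit": rng.normal(mean, 0.3, 400),  # 400 hypothetical firms per cell
        "treated": treated,
        "post": post,
    }))
df = pd.concat(rows, ignore_index=True)

# The interaction coefficient is the diff-in-diff estimate (about +0.4 pp):
# it nets the control group's time trend out of the treated group's change.
m = smf.ols("profit ~ treated * post", data=df).fit()
print(m.params["treated:post"])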

  24. You can use a table instead…

  25. Assumption of common time-trend • Impact = +0.4 pp

  26. Conclusion • The program had a positive effect on profits for firms that used subsidized credit. • But is the “common time-trend” assumption plausible?

  27. If we have historical data, we can use this to 'test' the assumption
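
A sketch of such a pre-trend check, assuming a firm panel with several pre-program years (the 2008 program year follows the case study; the column names are hypothetical). If treated and untreated firms share a common trend, the treated-by-year interactions in the pre-period should be close to zero.

import pandas as pd
import statsmodels.formula.api as smf

def pretrend_check(df: pd.DataFrame) -> pd.Series:
    """df: columns 'profit', 'treated' (0/1), 'year'; several years before 2008."""
    pre = df[df["year"] < 2008]
    # C(year) treats year as categorical; the treated:year interactions
    # trace out any divergence between the two groups before the program.
    m = smf.ols("profit ~ treated * C(year)", data=pre).fit()
    return m.params.filter(like="treated:")  # all ~0 under a common trend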

  28. Difference-in-Differences • Combines Participants-nonparticipants with Before-after • Deals with the problems of the previous methods under the fundamental assumption: trends are the same in treatments and controls • Possible to test this if you have pre-treatment data • You can improve diff-in-diff by matching groups on observable characteristics at baseline (propensity score matching) • Deals with unobserved characteristics only if they are constant over time

  29. Non-Experimental Methods 1. Difference-in-differences (Diff-in-Diff) 2. Diff-in-Diff with matching 3. Regression discontinuity design (RDD)

  30. Diff-in-Diff with matching What is the intuition of matching techniques? • The intervention targets firms with characteristics we can observe • We can use these characteristics to find firms similar to the ones that participated • These firms could be a good comparison group • In practice we use an index (“propensity score”) of characteristics and compare groups with similar values of the index

  31. Matching… • Challenge: finding nonparticipants that compare with all participants • Example: [figure: distributions of the propensity score index for participants and nonparticipants, showing the region of common support]
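
A sketch of the matching step in Python with scikit-learn (our choice; the slides do not specify an implementation, and the column names are hypothetical): fit a participation model, keep participants inside the region of common support, then match each one to the nearest nonparticipant on the score.

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def psm_impact(df: pd.DataFrame, covariates: list) -> float:
    """df: one row per firm, columns 'treated' (0/1), 'dprofit'
    (change in profit rate), plus baseline covariates."""
    # Propensity score: probability of participating given observables.
    score = (LogisticRegression(max_iter=1000)
             .fit(df[covariates], df["treated"])
             .predict_proba(df[covariates])[:, 1])
    df = df.assign(pscore=score)
    t, c = df[df["treated"] == 1], df[df["treated"] == 0]
    # Common support: drop participants whose score no nonparticipant reaches.
    t = t[t["pscore"].between(c["pscore"].min(), c["pscore"].max())]
    # Nearest-neighbour match (with replacement) on the score.
    nearest = np.abs(t["pscore"].values[:, None]
                     - c["pscore"].values[None, :]).argmin(axis=1)
    # Matched diff-in-diff: treated change minus matched control change.
    return float((t["dprofit"].values - c["dprofit"].values[nearest]).mean())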

  32. It is a bit complicated in practice! Source: Caliendo, 2008: 33

  33. Don't worry, there is an easy way of avoiding all this! Source: Caliendo, 2008: 33

  34. Summary of impacts so far • If the method is weak, it can lead to incorrect impact estimates and wrong policy conclusions • Participants-nonparticipants and Before-after are not good methods for causal impact • Difference-in-differences is valid under some (often strong) assumptions

  35. Non-Experimental Methods 1. Difference-in-differences (Diff-in-Diff) 2. Diff-in-Diff with matching 3. Regression discontinuity design (RDD)

  36. How can we evaluate this? Regression Discontinuity Design • Case: subsidies offered on the basis of a credit-constraint score • All firms that apply are scored on age, revenue, profitability, number of employees, and access to different sources of credit • The score ranges from 0 to 100, where 100 means no credit constraint and 0 means highly credit constrained • The program aims to help the neediest firms, so it is targeted to firms with score ≤ 50 • Idea: compare profits of firms with scores just below 50 (eligible for subsidized credit) with firms with scores just above 50 (ineligible for subsidized credit)
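
A minimal RDD sketch for this setup (cutoff at 50, eligible if the score is at most 50): local linear regression with separate slopes on each side of the cutoff, where the jump at the cutoff estimates the effect. The bandwidth and the column names are hypothetical choices, not from the slides.

import pandas as pd
import statsmodels.formula.api as smf

def rdd_late(df: pd.DataFrame, cutoff: float = 50.0, bw: float = 10.0) -> float:
    """df: columns 'score' (0-100 credit-constraint index) and 'profit'."""
    d = df[(df["score"] - cutoff).abs() <= bw].copy()  # keep firms near the cutoff
    d["x"] = d["score"] - cutoff                       # center the running variable
    d["eligible"] = (d["score"] <= cutoff).astype(int)
    # Separate intercepts and slopes on each side; the coefficient on
    # 'eligible' is the jump in profit rate at the cutoff (the LATE).
    m = smf.ols("profit ~ eligible * x", data=d).fit()
    return float(m.params["eligible"])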

  37. [figure: baseline profit rate (axis 1.5% to 3%) across the credit-constraint score, for eligible and non-eligible firms] Source: WB – Human Development Network.

  38. Regression Discontinuity Design (post-intervention) • RDD identifies the Local Average Treatment Effect (LATE) • [figure: post-intervention profit rate (axis 1.5% to 3%) across the score; the discontinuity at the cutoff is the treatment effect] Source: WB – Human Development Network.

  39. Regression discontinuity • Important: valid only for subjects close to the cut-off point that defines who is eligible for the program. Is this the group you want to know about? • Powerful method if you have: • a continuous eligibility index • a clearly defined eligibility cut-off • It gives a causal impact, but with a local interpretation

  40. Summary of impacts so far • Weak methods can lead to very misleading results • The RDD estimate (the causal impact) is only around half of the impact estimated with the other, weaker methods • An IE gives valid results only if you use rigorous methods

  41. Hopefully, you are now questioning everything...

  42. Other names: Randomized Controlled Trials (RCTs) or randomization • Assignment to treatment and control is based on chance (like flipping a coin) • Treatment and control groups will have identical characteristics (balanced) at baseline • The only difference is that the treatment group receives the intervention and the control group does not
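
As a preview of the next session, the assignment itself is trivial to implement; a one-line coin flip per firm balances observed and unobserved characteristics in expectation. A sketch in numpy (the 800 firms echo the case-study baseline; everything else is hypothetical):

import numpy as np

rng = np.random.default_rng(2015)
firm_ids = np.arange(800)                 # e.g., the 800 firms at baseline
# Coin flip per firm: a random half is assigned to treatment.
treated = rng.permutation(len(firm_ids)) < len(firm_ids) // 2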

  43. Experiments: plan for next session

  44. WEB: http://dime.worldbank.org • facebook.com/ieKnow • #impacteval • blogs.worldbank.org/impactevaluations • microdata.worldbank.org/index.php/catalog/impact_evaluation • Thank you!
