
The counterfactual logic for public policy evaluation Alberto Martini



  1. The counterfactual logic for public policy evaluation: hard at first, natural later. Alberto Martini

  2. Everybody likes “impacts” (politicians, funders, managing authorities, eurocrats). Impact is the most used and misused term in evaluation.

  3. Impacts differ in a fundamental way from outputs and results. Outputs and results are observable quantities.

  4. Can we observe an impact? No, we can’t. This is a major point of departure between this and other paradigms.

  5. As output indicators measure outputs, as result indicators measure results, so supposedly impact indicators measure impacts. Sorry, they don’t.

  6. Almost everything about programmes can be observed (at least in principle):
      - outputs (beneficiaries served, activities done, training courses offered, km of roads built, sewers cleaned)
      - outcomes/results (income levels, inequality, well-being of the population, pollution, congestion, inflation, unemployment, birth rate)

  7. Unlike outputs and results, to measure impacts one needs to deal with unobservables

  8. To measure impacts, it is not enough to “count” something, or to compare results with targets, or to check progress from baseline. It is necessary to deal with causality.

  9. “Causality is in the mind” (J.J. Heckman, Nobel Prize in Economics, 2000)

  10. How would you define impact/effect? “The difference between the situation observed after an intervention has been implemented and the situation that … ?” Answer: the situation that would have occurred without the intervention.

  11. There is just one tiny problem with this definition: the situation that would have occurred without the intervention cannot be observed.

  12. The social science community has developed the notion of potential outcomes: “given a treatment, the potential outcomes are what we would observe for the same individual for different values of the treatment”

  13. Hollywood’s version of potential outcomes

  14. A priori there are only potential outcomes of the intervention, but later one becomes an observed outcome, while the other becomes the counterfactual outcome
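The distinction between potential, observed and counterfactual outcomes can be written compactly. The notation below is an assumption (the slides use no symbols): $D_i$ indicates whether unit $i$ receives the intervention, and $Y_i(1)$, $Y_i(0)$ are its two potential outcomes.

```latex
% Assumed notation: D_i = 1 if unit i receives the intervention, 0 otherwise.
\begin{align*}
\text{potential outcomes:}\quad & Y_i(1),\; Y_i(0) \\
\text{impact for unit } i:\quad & \tau_i = Y_i(1) - Y_i(0) \\
\text{observed outcome:}\quad & Y_i^{\text{obs}} = D_i\,Y_i(1) + (1 - D_i)\,Y_i(0)
\end{align*}
```

Only one of the two terms in $\tau_i$ is ever observed for a given unit; the other is the counterfactual outcome, which is why an impact cannot be read directly off the data.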

  15. A very intuitive example of the role of counterfactual analysis in producing credible evidence for policy decisions

  16. Does learning and playing chess have a positive impact on achievement in mathematics?

  17. Policy-relevant question: Should we make chess part of the regular curriculum in elementary schools, to improve mathematics achievement? Or would it be a waste of time? Which kind of evidence do we need to make this decision in an informed way?

  18. Let us assume we have a crystal ball and we know the “truth”: for all pupils we know both potential outcomes, that is, the math score they would obtain if they practiced chess and the score they would obtain if they did not practice chess.

  19. General rule: what we observe can be very different from what is true.

  20. Types of pupils and what happens to them:
      - High ability (1/3): practice chess at home and do not gain much if taught in school
      - Mid ability (1/3): practice chess only if taught in school, otherwise they do not learn chess
      - Low ability (1/3): unable to play chess effectively, even if taught in school

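  21. Potential outcomes (math test scores):
      - High ability: 70 if they do play chess at school, 70 if they do NOT play at school, difference 0
      - Mid ability: 50 if they do play chess at school, 40 if they do NOT play at school, difference 10
      - Low ability: 20 if they do play chess at school, 20 if they do NOT play at school, difference 0
      Try to memorize these numbers: 70, 50, 40, 20, 10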

  22. SO WE KNOW THAT: (1) there is a true impact, but it is small; (2) the only ones to benefit are mid-ability students, and for them the impact is 10 points.
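The "crystal ball" numbers make the true impact a matter of simple arithmetic. The following is a minimal sketch, assuming the three ability groups are exactly one third of pupils each, as on slide 20; the variable names are illustrative and not from the slides.

```python
from fractions import Fraction

# Potential outcomes from the crystal-ball slide:
# (math score if taught chess at school, math score if not).
potential_outcomes = {
    "high": (70, 70),
    "mid": (50, 40),
    "low": (20, 20),
}

# Assumption taken from slide 20: each ability group is one third of all pupils.
population_share = {group: Fraction(1, 3) for group in potential_outcomes}

# Individual-level impact is the difference between the two potential outcomes.
impact_by_group = {g: y1 - y0 for g, (y1, y0) in potential_outcomes.items()}

# Population-average impact weights each group's impact by its share.
average_impact = sum(population_share[g] * impact_by_group[g] for g in impact_by_group)

print(impact_by_group)
print(float(average_impact))
```

Averaged over all pupils, the true impact is 10/3 of a point, which is what "true but small" means in this example.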

  23. The naive evidence: observe the differences between chess players and non-players and infer something about the impact of chess. “The difference between players and non-players measures the effect of playing chess.” DO YOU AGREE?

  24. The usefulness of the potential outcome way of reasoning is to make clear what we observe and what we do not observe, what we can and cannot learn from the data, and how mistakes are made.

  25. What we observe:
      - High ability (play chess at home): 70
      - Mid ability: 40
      - Low ability: 20
      Average of mid and low ability = 30. DO YOU SEE THE POINT?

  26. Results of the direct comparison:
      - Pupils who play chess: average score = 70 points
      - Pupils who do not play chess: average score = 30 points
      Difference = 40 points. Is this the impact of playing chess?

  27. Can we attribute the difference of 40 points to playing chess alone? OBVIOUSLY NOT. There are many more factors at play that influence math scores.
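The gap between the naive comparison and the truth can be reproduced from the slide numbers. This is a small sketch under the example's own assumptions (no school programme yet, only high-ability pupils play chess, three equally sized groups); the function and variable names are hypothetical.

```python
# What can be observed without a school programme (slides 20, 21 and 25):
# only high-ability pupils play chess (at home).
observed = [
    # (ability, plays_chess, observed_math_score)
    ("high", True, 70),
    ("mid", False, 40),
    ("low", False, 20),
]

def mean_score(rows):
    """Unweighted mean, valid here because the three groups are equally large."""
    scores = [score for _, _, score in rows]
    return sum(scores) / len(scores)

players_mean = mean_score([row for row in observed if row[1]])
non_players_mean = mean_score([row for row in observed if not row[1]])
naive_difference = players_mean - non_players_mean

# True average impact from the potential outcomes:
# only mid-ability pupils gain, and they gain 10 points.
true_average_impact = (0 + 10 + 0) / 3

print(players_mean, non_players_mean, naive_difference)
print(round(true_average_impact, 2))
```

The naive difference is an order of magnitude larger than the true average impact because it mostly reflects who chooses to play chess, not what chess does.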

  28. Diagram: does playing chess have an impact on math test scores? Math ability affects both: it drives who plays chess (SELECTION PROCESS) and it raises math test scores directly (DIRECT INFLUENCE). Ignoring math ability could severely mislead us, if we intend to interpret the difference in test scores as a causal effect of chess.

  29. First (obvious) lesson we learn: most observed differences tell us nothing about causality. We should be careful in general about making causal claims based on the data we observe.

  30. However, comparing math test scores for kids who have learned chess by themselves and kids who have not is pretty silly, isn’t it?

  31. Almost as silly as:
      - comparing participants of training courses with non-participants and calling the difference in subsequent earnings “the impact of training”
      - comparing enterprises applying for subsidies with those not applying and calling the difference in subsequent investment “the impact of the subsidy”

  32. The raw difference between self-selected participants and non-participants is a silly way to apply the counterfactual approach. The problem is selection bias (pre-existing differences).

  33. Now we decide to teach pupils how to play chess in school. Schools can participate or not.

  34. We compare pupils in schools that participated in the program and pupils in schools which did not, in order to get an estimate of the impact of teaching chess in school.

  35. We get the following results:
      - Pupils in the treated schools: average score = 53 points
      - Pupils in the non-treated schools: average score = 29 points
      Difference = 24 points. Is this the TRUE impact?

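  36. Composition of the two types of schools:
      - High ability: 30% in schools that participated, 10% in schools that did NOT participate
      - Mid ability: 60% in schools that participated, 20% in schools that did NOT participate
      - Low ability: 10% in schools that participated, 70% in schools that did NOT participate
      There is an evident difference in composition between the two types of schools.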

  37. Composition and weighted averages:
      - Schools that participated: 30% high ability, 60% mid ability, 10% low ability. WEIGHTED average of 70, 50 and 20 = 53
      - Schools that did NOT participate: 10% high ability, 20% mid ability, 70% low ability. WEIGHTED average of 70, 40 and 20 = 29
      Average impact = 53 - 29 = 24
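The two weighted averages on slide 37 follow directly from the school compositions and the potential-outcome scores; written out as worked arithmetic:

```latex
\begin{align*}
\text{participating schools:}\quad & 0.30 \times 70 + 0.60 \times 50 + 0.10 \times 20 = 21 + 30 + 2 = 53 \\
\text{non-participating schools:}\quad & 0.10 \times 70 + 0.20 \times 40 + 0.70 \times 20 = 7 + 8 + 14 = 29 \\
\text{observed difference:}\quad & 53 - 29 = 24
\end{align*}
```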

  38. The difference of 24 points is a combination of the true impact and of the difference in composition. If we did not know the truth, we might take 24 as the true impact on math scores and, since it is a large impact, make the wrong decision.
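With the crystal ball, the 24 points can be split into the two components named on this slide. The sketch below is one way to do that split, assuming the potential-outcome scores and school compositions from the earlier slides; the decomposition into "impact on participants" plus "composition effect" is a standard device, not something the slides spell out, and all names are illustrative.

```python
# Potential-outcome scores by ability group (crystal-ball slide).
SCORE_WITH_CHESS = {"high": 70, "mid": 50, "low": 20}
SCORE_WITHOUT_CHESS = {"high": 70, "mid": 40, "low": 20}

# School composition in percent (slides 36 and 37).
PARTICIPATING = {"high": 30, "mid": 60, "low": 10}
NON_PARTICIPATING = {"high": 10, "mid": 20, "low": 70}

def weighted_mean(scores, percent_shares):
    """Average score for a population with the given percentage composition."""
    return sum(scores[g] * percent_shares[g] for g in scores) / 100

# What we actually observe in each type of school.
observed_treated = weighted_mean(SCORE_WITH_CHESS, PARTICIPATING)
observed_untreated = weighted_mean(SCORE_WITHOUT_CHESS, NON_PARTICIPATING)

# Counterfactual: what participating schools would have scored without chess.
# Knowable only with the crystal ball.
counterfactual_treated = weighted_mean(SCORE_WITHOUT_CHESS, PARTICIPATING)

observed_difference = observed_treated - observed_untreated
true_impact_on_participants = observed_treated - counterfactual_treated
composition_effect = counterfactual_treated - observed_untreated

print(observed_difference, true_impact_on_participants, composition_effect)
```

Under these numbers, three quarters of the observed difference comes from the schools being different to begin with, not from the chess lessons.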

  39. We have two alternatives: statistically adjusting the data or conducting an experiment. The most t…

  40. Any form of adjustment assumes we have a model in mind: we know that ability influences math scores and we know how to measure ability.
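One simple form of statistical adjustment is stratification: compare treated and untreated pupils within each ability level, then average the within-level differences. The sketch below is only an illustration, and it leans on exactly the assumptions this slide names: that ability is the only confounder and that it is measured without error. The table of stratum means is taken from the potential-outcome slide; the names are hypothetical.

```python
# Mean observed score by (ability stratum, school participated?).
# Under the example's assumptions these equal the potential-outcome scores.
observed_mean = {
    ("high", True): 70, ("high", False): 70,
    ("mid", True): 50, ("mid", False): 40,
    ("low", True): 20, ("low", False): 20,
}

# Composition of participating schools, in percent (slide 36).
participating_pct = {"high": 30, "mid": 60, "low": 10}

# Step 1: compare like with like, i.e. within each ability stratum.
within_stratum_difference = {
    g: observed_mean[(g, True)] - observed_mean[(g, False)]
    for g in participating_pct
}

# Step 2: average the stratum differences using the composition of the
# participating schools, giving the adjusted impact on participants.
adjusted_impact = sum(
    within_stratum_difference[g] * participating_pct[g] for g in participating_pct
) / 100

print(within_stratum_difference)
print(adjusted_impact)
```

Weighting by the composition of participating schools estimates the impact on pupils who were actually taught chess; weighting by equal thirds would instead give the population-wide average of about 3.3 points.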

  41. But even if we do not have all this information, we can conduct a randomized experiment. The schools that participate get free instructors to teach chess, provided they agree to exclude one classroom at random.

  42. Results of the randomized experiment:
      - Pupils in the treated classes in the volunteer schools: average score = 53 points
      - Pupils in the excluded classes in the volunteer schools: average score = 47 points
      Difference = 6 points. This is very close to the TRUE impact.

  43. Schools that did NOT volunteer are left out. Schools that volunteered: 30% high ability, 60% mid ability, 10% low ability. Random assignment within them: flip a coin.
      - EXPERIMENTALS: weighted average of 70, 50 and 20 = 53
      - CONTROLS: weighted average of 70, 40 and 20 = 47
      Impact = 53 - 47 = 6
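A small simulation shows why the coin flip works: it gives treated and control groups the same ability mix on average, so their difference isolates the effect of chess. This is a sketch under simplifying assumptions that are not in the slides: assignment is simulated pupil by pupil rather than by classroom, scores carry no noise beyond ability, and the sample size and seed are arbitrary.

```python
import random

# Potential-outcome scores by ability group (crystal-ball slide).
SCORE_WITH_CHESS = {"high": 70, "mid": 50, "low": 20}
SCORE_WITHOUT_CHESS = {"high": 70, "mid": 40, "low": 20}

# Ability mix in the volunteer schools (slide 43).
VOLUNTEER_SHARES = {"high": 0.3, "mid": 0.6, "low": 0.1}

def simulate_experiment(n_pupils=20_000, seed=0):
    """Difference in mean scores between randomly treated and control pupils."""
    rng = random.Random(seed)
    groups = list(VOLUNTEER_SHARES)
    weights = [VOLUNTEER_SHARES[g] for g in groups]
    treated, control = [], []
    for _ in range(n_pupils):
        # Draw a pupil from the volunteer schools' ability distribution.
        ability = rng.choices(groups, weights=weights)[0]
        # Coin flip: assignment is independent of ability by construction.
        if rng.random() < 0.5:
            treated.append(SCORE_WITH_CHESS[ability])
        else:
            control.append(SCORE_WITHOUT_CHESS[ability])
    return sum(treated) / len(treated) - sum(control) / len(control)

estimate = simulate_experiment()
print(round(estimate, 2))
```

The estimate fluctuates around 6 from seed to seed, because chance still leaves small differences in composition between the two groups, but it has no systematic tendency to drift toward 24 or 40.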

  44. Experiments are not the only way to identify impacts. However, it is very unlikely that an experiment will generate grossly mistaken estimates. If anything, they tend to be biased toward zero. On the other hand, some wrong comparisons can produce wildly mistaken estimates.
