Counterfactual impact evaluation: what it can (and cannot) do for cohesion policy
Alberto Martini, Progetto Valutazione, Torino, Italy
amartini@prova.org
Share. • Play fair. • Don't hit people. • Clean up your own mess. • Wash your hands before you eat. • Flush. ALL I REALLY NEED TO KNOW I LEARNED IN KINDERGARTEN by Robert Fulghum
It’s nice to have an impact. • Not all we obtain is due to our actions. • Some things happen without our help. • To improve things we must understand them. • We must separate what we caused from what would have happened anyway. • Flush. ALL THAT REALLY MATTERS IN IMPACT EVALUATION COMES FROM COMMON SENSE
Do we need counterfactuals? What would have happened anyway = the counterfactual • The answer is simple: it depends on what we need (can, want) to know and for which purpose • I’ll follow the COSCE approach (Common Sensical Counterfactual Evaluation) • [not to be confused with CSCE, the Conference On Security and Cooperation in Europe]
If your purpose is to be accountable, don’t worry too much about counterfactuals • Your main worry is to show that the money was spent • Maybe you want to show how well it was spent • Maybe you want to show for whom it was spent • You might go further by showing your contribution to objectives; e.g. to the Lisbon strategy • To impress DG-Regio, use a macro-model COSCE rule n.1
If your purpose is to improve policy, macro models will not do • If your purpose is to improve policy, probably indicators will not do • If your purpose is to improve policy, you need to learn: • What works and, if it does, why it works • What does not work and, if it doesn’t, why it doesn’t work COSCE rule n.2
Learning “what works” logically precedes learning “why it works” • Otherwise we do not know what to explain • Learning why it works (or doesn’t) is: • More important • More interesting • More difficult than learning what works • This is why it should be done later COSCE rule n.3
Counterfactual Impact Evaluation tries to learn something about “what works” on average (not very interesting) and for whom it works (data permitting) • It produces numbers • It requires good data and large samples • It imposes non-testable assumptions • Its results are NOT the truth, are NOT universal laws, are NOT scientific • It is (should be) a fallible, improvable, intellectually honest human enterprise COSCE rule n.4
Theory-based Impact Evaluation tries to learn something about “why it works” by identifying the mechanisms that make a policy produce its effects (or fail to do so) • It produces narratives and insights • It collects its data through qualitative methods and doesn’t need large samples • It develops a theory of change and then observes policies as they are implemented, to learn which elements of the theory are verified COSCE rule n. 5
To learn something about “what works” one needs to clarify: • Effects (impacts) on what? Which outcomes Y • Effects (impacts) of what? Which treatment T COSCE curse n. 1: effects and impacts are the same thing, the best example of a distinction without a difference COSCE rule n.6
The heart of CIE is to answer the question: what is the direction, size and significance of the effect of treatment T on outcome Y? AN EXAMPLE A program providing subsidies to increase R&D expenditures among small and medium enterprises: “subsidizing SME to do more R&D” COSCE rule n. 7
A MULTIPLE CHOICE TEST: What is the effect of the subsidies? • the number of R&D projects funded and completed • the take-up rate of the subsidy among eligible SME • the increase in R&D expenditures among subsidized SME • the difference in R&D expenditure between subsidized and non-subsidized SME • none of the above
the number of R&D projects funded and completed • the take-up rate of the subsidy among eligible SME The number can be very high, the take-up rate can be 100%, and the effect can still be zero COSCE curse n. 2: the number of R&D projects is not a “gross impact”. It is not an impact at all. It is a measure of activity. There is no such thing as a gross impact
the increase in R&D expenditures among subsidized SME The increase is not an effect: the subsidies might have gone to firms whose R&D expenditures were already growing COSCE curse n. 3: the deadweight (DW) is nothing else than the counterfactual. The only special thing about it is that it is used when money is clearly wasted. Demonstrable Waste (DW) is a better name for it
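The deadweight point can be made with a toy calculation. A minimal sketch, assuming a subsidized firm whose R&D was already on a growth trend; all figures are hypothetical, chosen only to show how the raw before-after increase mixes the trend with the true effect:

```python
# Before-after comparison pitfall: the subsidy went to a firm whose R&D
# was already growing, so the raw "increase" mixes the pre-existing
# trend with the true effect. All numbers are hypothetical.

pre_subsidy_rnd = 100.0   # R&D spending (thousand EUR) before the subsidy
trend_growth = 20.0       # growth that would have happened anyway (the counterfactual)
true_effect = 5.0         # extra R&D actually caused by the subsidy

post_subsidy_rnd = pre_subsidy_rnd + trend_growth + true_effect

raw_increase = post_subsidy_rnd - pre_subsidy_rnd  # mixes trend and effect
deadweight = trend_growth                          # spending that happens anyway

print(f"raw increase: {raw_increase}")                                   # 25.0
print(f"effect net of the counterfactual: {raw_increase - deadweight}")  # 5.0
```

Here the raw increase (25) overstates the true effect (5) fivefold; the difference is exactly the deadweight, i.e. the counterfactual growth.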
the difference in R&D expenditure between subsidized and non-subsidized SME The post-treatment difference in outcomes does not identify any effect: the difference might be entirely due to initial differences (selection bias) COSCE curse n. 4: the Commission is stuck on the decomposition “gross impact = net effect + deadweight”, while the world literature focuses on the decomposition “observed difference = effect + selection bias”
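The decomposition “observed difference = effect + selection bias” can be illustrated with a small simulation. This is a sketch under invented assumptions: firms with higher baseline R&D self-select into the subsidy, and the true effect is fixed at 10 (thousand EUR); none of the numbers come from real data:

```python
import random

random.seed(0)

TRUE_EFFECT = 10.0  # hypothetical causal effect of the subsidy on R&D spending

# Simulate 10,000 SMEs: firms with higher baseline R&D are the ones
# that take up the subsidy (self-selection).
firms = []
for _ in range(10_000):
    baseline = random.gauss(100, 20)   # pre-existing R&D level
    treated = baseline > 110           # already-strong firms apply for the subsidy
    outcome = baseline + (TRUE_EFFECT if treated else 0.0)
    firms.append((treated, outcome))

treated_y = [y for t, y in firms if t]
control_y = [y for t, y in firms if not t]

observed_diff = sum(treated_y) / len(treated_y) - sum(control_y) / len(control_y)
selection_bias = observed_diff - TRUE_EFFECT

print(f"observed difference: {observed_diff:.1f}")  # far larger than the true effect
print(f"true effect:         {TRUE_EFFECT:.1f}")
print(f"selection bias:      {selection_bias:.1f}")
```

The naive treated-vs-untreated comparison comes out several times larger than the true effect, and the whole excess is selection bias, not impact.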
The worldwide social science literature has made substantial advances in reducing, preventing or eliminating selection bias, estimating effects by comparing treated and non-treated units • exploiting random assignment when feasible • and a variety of (ever developing) non-experimental methods: • matching • double difference (difference-in-differences) • regression discontinuity • instrumental variables
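One of the methods listed, the double difference (difference-in-differences), can be sketched in a few lines: compare the *change* in the outcome for treated vs. untreated units, so that stable initial differences cancel out. All figures below are made up for illustration:

```python
# Double difference (difference-in-differences): compare the *change*
# in R&D expenditure for treated vs. untreated firms, so that
# time-invariant initial differences cancel out. Hypothetical figures.

treated_before, treated_after = 120.0, 145.0  # subsidized SMEs
control_before, control_after = 100.0, 115.0  # comparable non-subsidized SMEs

change_treated = treated_after - treated_before  # 25.0
change_control = control_after - control_before  # 15.0 = what happens anyway

did_effect = change_treated - change_control     # 10.0 = estimated effect

# The naive post-treatment comparison is contaminated by selection bias:
naive_diff = treated_after - control_after       # 30.0

print(f"naive post-treatment difference: {naive_diff}")
print(f"double-difference estimate: {did_effect}")
```

The design choice is the identifying assumption: double difference only recovers the effect if treated and untreated firms would have followed parallel trends absent the subsidy.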
What does COSCE have to say about the limitations of counterfactual impact evaluation? In some quarters, CIE is seen as a universal approach, able to solve all inferential problems through the use of ever more sophisticated methods. COSCE disagrees and views CIE as an important contribution with important limitations in its applicability to Structural Funds, both in terms of relevance and compatibility.
Relevance and compatibility across different types of cohesion policies. Types of policy: • Human capital investment • Transport infrastructure • Investment support • Urban renewal • Support for R&D projects • Renewable energy. Criteria: • Behavioral (vs. redistributive) motive • Replicable nature (vs. idiosyncratic) • Large numbers of eligible units • Homogeneous treatment (vs. composite)
What timing for counterfactual impact evaluation? When it is prospective, i.e. designed together with the intervention, impact evaluation can have a strong disciplinary effect. First, it can help focus the attention of both policy-makers and beneficiaries on objectives. Secondly, it creates an incentive to assemble the information necessary to assess results. Thirdly, it brings to light the criteria by which beneficiaries are selected. BARCA DIXIT
More than timing, more than relevance, more than compatibility, the most important determinant of the diffusion of counterfactual impact evaluation is the interest and willingness, on the part of some influential stakeholder, to truly learn about what works and why.