Learn about counterfactual methods in impact evaluation, from basic indicators to complex models. Compare strategies for training the long-term unemployed. Discover the strengths and limitations of different approaches.
First steps in practice
Daniel Mouqué, Evaluation Unit, DG REGIO
The story so far…
• Indicators useful for management, accountability, but do not give impacts
• For impacts, need to estimate a counterfactual
Notice that « classic » methods often imply counterfactuals
• Indicators – before vs after
• Indicators – with « treatment » vs without
• Qualitative methods – expert opinion
• Beneficiary surveys – beneficiary opinion
• Macromodels – model includes a baseline
But all of these have strong assumptions, often implicit
How to weaken the assumptions…
… and improve the estimation of impacts
Comparison of similar assisted and non-assisted units (finding « twins »)
There are various ways to do this - let’s start with a simple example
Training for the long-term unemployed
• Innovative training for those who have been out of work for >12 months
• « Classic » evaluation: for those trained, a pre-post comparison of employment status and income
What’s wrong with this?
• So we combine it with a beneficiary survey
Is this much better?
A simple counterfactual (random assignment)
• 10,000 candidates for the training, randomly assign 5,000 to training / 5,000 to traditional support
• Compare employment status and earnings one year after training
• What’s useful about this?
• Can you see any potential problems?
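To make the comparison concrete, here is a minimal sketch in Python of what this analysis could look like, assuming follow-up data on employment and earnings exist for both groups. All data are simulated and the column names (treated, employed, earnings) are illustrative assumptions, not results from any real programme.

```python
# Minimal sketch: random assignment of 10,000 hypothetical candidates,
# then a simple comparison of outcomes one year later. All data simulated.
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(42)
n = 10_000

candidates = pd.DataFrame({"candidate_id": np.arange(n)})

# Randomly assign half to the new training, half to traditional support.
candidates["treated"] = rng.permutation(np.repeat([1, 0], n // 2))

# Placeholder outcomes one year after training (in reality: follow-up data).
candidates["employed"] = rng.binomial(1, np.where(candidates["treated"] == 1, 0.55, 0.45))
candidates["earnings"] = rng.normal(np.where(candidates["treated"] == 1, 14_000, 12_500), 3_000)

# Because assignment was random, a simple difference in means estimates the impact.
treated = candidates[candidates["treated"] == 1]
control = candidates[candidates["treated"] == 0]

print("Employment rate difference:", treated["employed"].mean() - control["employed"].mean())
print("Earnings difference:", treated["earnings"].mean() - control["earnings"].mean())
print(stats.ttest_ind(treated["earnings"], control["earnings"]))
```

The point is only that random assignment makes the estimation itself trivial; the cost lies in interfering with who gets the training.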
Let’s try again (« discontinuity design »)
• Offer the training to all
• For evaluation, compare a subset of these with a similar, but non-eligible group:
• Unemployed for 12-15 months (eligible)
• Unemployed for 9-12 months (not eligible)
• What’s better about this than the previous evaluation example?
• What’s worse?
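A possible sketch of the discontinuity comparison, assuming months of unemployment and a later employment outcome can be observed for everyone near the 12-month eligibility cutoff. The dataset, column names and effect sizes below are simulated purely for illustration.

```python
# Minimal sketch of a discontinuity design around the 12-month eligibility cutoff.
# Simulated data; a real evaluation would use administrative records.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 4_000

df = pd.DataFrame({"months_unemployed": rng.uniform(9, 15, n)})
df["eligible"] = (df["months_unemployed"] >= 12).astype(int)  # offered the training
df["employed_after"] = rng.binomial(
    1, 0.40 + 0.08 * df["eligible"] - 0.01 * (df["months_unemployed"] - 12)
)

# Local linear regression allowing a jump at the cutoff: the coefficient on
# `eligible` estimates the impact for people close to the 12-month threshold.
model = smf.ols("employed_after ~ eligible + I(months_unemployed - 12)", data=df).fit()
print(model.params["eligible"])
```

The estimate is only credible close to the cutoff: people unemployed for 11 and 13 months are plausibly similar, while those far from the threshold are not.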
3rd time lucky (« pipeline »)
• This time we stagger the training over 2 years
• 5,000 are randomly chosen to take the training this year, 5,000 next year
• Next year’s treatment group is this year’s control group
• What’s good about this?
• What limitations can you see?
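The analysis for the pipeline design is essentially the same difference in means, except that the comparison group only exists until the second cohort starts its training. A minimal sketch, again on simulated and purely illustrative data:

```python
# Minimal sketch of the pipeline design: the cohort scheduled for next year
# serves as this year's comparison group. Simulated, illustrative data.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 10_000

df = pd.DataFrame({"person_id": np.arange(n)})
df["cohort"] = rng.permutation(np.repeat(["year1", "year2"], n // 2))  # random staggering

# Outcome measured at the end of year 1, before the second cohort is trained.
df["employed_end_y1"] = rng.binomial(1, np.where(df["cohort"] == "year1", 0.55, 0.45))

effect = (
    df.loc[df["cohort"] == "year1", "employed_end_y1"].mean()
    - df.loc[df["cohort"] == "year2", "employed_end_y1"].mean()
)
print("Estimated impact after one year:", effect)
# Once the year-2 cohort is trained, the comparison group disappears,
# so only short-term impacts can be estimated this way.
```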
Some observations
Notice:
• This is not just one method, but a family of methods
• Two families in fact - we’ll come back to this
• Different possibilities have different strengths & weaknesses, therefore different applications
• Varies from simple to very complicated
• We’ll look at common features and requirements now (with Kai)
What do we need?
Kai Stryczynski, Evaluation Unit, DG REGIO
The methods require
• Large « n », ie a large number of similar units (to avoid random differences)
• Good data for treated and non-treated units:
  • Basic data (who are the beneficiaries?)
  • Target variables (what is policy trying to change?)
  • Descriptive variables (eg to help us find matches)
• Ability to match the various datasets
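In practice, the last requirement often comes down to linking several registers on a common identifier. A hypothetical sketch, assuming three files (beneficiaries.csv, outcomes.csv, descriptives.csv) keyed on a firm_id column; all file and column names are invented for illustration.

```python
# Hypothetical example of linking a beneficiary register, an outcome register
# and descriptive variables on a common identifier. File and column names are
# assumptions, not a real data model.
import pandas as pd

beneficiaries = pd.read_csv("beneficiaries.csv")  # who was treated: firm_id, treated
outcomes = pd.read_csv("outcomes.csv")            # target variables: firm_id, employment, turnover
descriptives = pd.read_csv("descriptives.csv")    # matching variables: firm_id, sector, size, region

data = (
    beneficiaries
    .merge(outcomes, on="firm_id", how="left")
    .merge(descriptives, on="firm_id", how="left")
)

# Quick checks: how many treated vs non-treated units, and where data are missing.
print(data["treated"].value_counts())
print(data.isna().mean())
```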
Sectoral applicability
Good candidates (large « n ») include:
• Enterprise support (including R&D)
• Labour market and training measures
• Other support to individuals (eg social)
But… only where good data exist
Bad candidates (small « n ») include:
• Large infrastructure (transport, waste water etc)
• Networks (eg regional innovation systems)
Rule of thumb
< 50% of cases applicable, of which < 50% have enough data
And even then, be selective. It’s a powerful learning tool, but can be hard work & expensive.
A pragmatic strategy
• 2-pronged approach (monitor all, evaluate a selection)
• CIE (counterfactual impact evaluation) where we can, classic methods where we can’t (a survey is better than nothing)
• Mix methods (triangulate, use qualitative work to explain CIE results)
• Be honest and humble about what we know… and don’t know
• Use working hypotheses, build a picture over time
Let’s get started
Daniel Mouqué, Evaluation Unit, DG REGIO
The options
There are many options…
… but two broad families of counterfactual impact evaluation
A « rule of thumb »
• Randomised/experimental methods most likely to be useful for:
  • Pilot projects
  • Different treatment options (especially genuine policy choices, such as grants vs loans)
• Quasi-experimental – more generally applicable
• However, randomised simpler, so a good introduction
(Quasi-experimental methods in depth tomorrow)
New friends part 1: Experimental (« randomised ») methods
Some experimental/randomised options for your exercises in the group work:
• Random assignment
• Pipeline (delaying treatment for some)
• Random encouragement
Tip: most costly (mess with selection process), but most reliable
New friends part 2: Quasi-experimental methods
• You don’t need to know all these yet (tomorrow will treat them in depth)
• Intuition: treat as usual, compare with similar, but not quite comparable, non-treated units
• Difference-in-difference (a sketch follows below)
• Discontinuity design (comparing « just qualified for treatment » with « just missed it »)
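For intuition, here is a minimal difference-in-differences sketch on a tiny, made-up panel. The coefficient on the interaction term is the change over time for treated units minus the change for comparison units; all numbers and names are illustrative.

```python
# Minimal difference-in-differences sketch on a tiny, made-up panel.
import pandas as pd
import statsmodels.formula.api as smf

# Long format: one row per unit and period (post = 0 before, 1 after treatment).
panel = pd.DataFrame({
    "unit":    [1, 1, 2, 2, 3, 3, 4, 4],
    "treated": [1, 1, 1, 1, 0, 0, 0, 0],
    "post":    [0, 1, 0, 1, 0, 1, 0, 1],
    "outcome": [10, 16, 11, 18, 10, 12, 12, 13],
})

# The treated:post coefficient is the difference-in-differences estimate.
did = smf.ols("outcome ~ treated + post + treated:post", data=panel).fit()
print(did.params["treated:post"])
```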
In your group work, we want you to start thinking
• What are policy/impact questions in my field(s)?
• Can I randomise from the beginning, to get an insight into these results?
• Random or not (and often the answer will be not!), can I get outcome data for similar non-treated units?
The set-up
• Eastern Germany
• Investment and R&D grants to firms
• Do they really increase investment and employment?
• Could not randomise (too late, too political)
• Clever matching procedures (we’ll tell you more later in the course) to compare similar assisted/non-assisted firms
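The exact procedure used in that evaluation is covered later in the course; as a flavour of what « matching » can mean, here is a simplified nearest-neighbour propensity score sketch on simulated firm data. Every variable and figure is invented for illustration and this is not the method actually applied in the Eastern Germany study.

```python
# Simplified propensity score matching sketch: find a non-assisted "twin" for
# each assisted firm, then compare outcomes. All data simulated and illustrative.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(7)
n = 2_000

firms = pd.DataFrame({
    "treated":        rng.binomial(1, 0.3, n),
    "employees":      rng.integers(5, 500, n),
    "sector":         rng.integers(0, 10, n),
    "pre_investment": rng.normal(100, 30, n),
})
firms["post_investment"] = firms["pre_investment"] + 10 * firms["treated"] + rng.normal(0, 10, n)

covariates = ["employees", "sector", "pre_investment"]

# Step 1: estimate each firm's probability of receiving a grant (propensity score).
ps_model = LogisticRegression(max_iter=1000).fit(firms[covariates], firms["treated"])
firms["pscore"] = ps_model.predict_proba(firms[covariates])[:, 1]

# Step 2: for each assisted firm, find the non-assisted firm with the closest score.
treated = firms[firms["treated"] == 1]
control = firms[firms["treated"] == 0]
nn = NearestNeighbors(n_neighbors=1).fit(control[["pscore"]])
_, idx = nn.kneighbors(treated[["pscore"]])
matched_control = control.iloc[idx.ravel()]

# Step 3: compare outcomes between assisted firms and their matched "twins".
effect = treated["post_investment"].mean() - matched_control["post_investment"].mean()
print("Estimated impact on investment:", effect)
```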
The results
• Investment grants of €8k/employee led to estimated extra investment of €11-12k
• The same grants led to an estimated 25,000-30,000 extra jobs
• R&D grants of €8k/employee led to €8k extra investment
What does this tell us?
This gives comfort to the views that:
• Enterprise and R&D grants work in lagging regions (at the very least, they generate private investment)
• Grants have a bigger effect on productivity than on jobs
• Gross jobs - especially jobs safeguarded - overstate the case
What does this not tell us?
We still do not know for certain:
• If the same pattern would hold outside E. Germany (specific situation, specific selection process)
• If investment will translate into long-term growth and R&D (but we have weakened the assumption)
• If other instruments are better than grants
• Whether there is crowding out in other enterprises
• How to cure cancer (astonishingly, 1 study did not crack all the secrets of the universe)
But we know more than before, and this is not the last evaluation we will ever do
Potential benefits - motivation for the coming days
• Learning what works, by how much (typically)
• Learning what instrument is appropriate in a given situation (eg grants or advice to enterprise)
• Learning on whom to target assistance (« stratification », eg target training measure on the group most likely to benefit)
• Building up a picture over time