Learn about counterfactual methods in impact evaluation, from basic indicators to complex models. Compare strategies for training the long-term unemployed. Discover the strengths and limitations of different approaches.
First steps in practice
Daniel Mouqué, Evaluation Unit, DG REGIO
The story so far…
• Indicators are useful for management and accountability, but do not give impacts
• For impacts, we need to estimate a counterfactual
Notice that « classic » methods often imply counterfactuals
• Indicators – before vs after
• Indicators – with « treatment » vs without
• Qualitative methods – expert opinion
• Beneficiary surveys – beneficiary opinion
• Macromodels – model includes a baseline
But all of these rest on strong assumptions, often implicit
How to weaken the assumptions…
… and improve the estimation of impacts
Comparison of similar assisted and non-assisted units (finding « twins »)
There are various ways to do this - let’s start with a simple example
Training for the long-term unemployed
• Innovative training for those who have been out of work for >12 months
• « Classic » evaluation: for those trained, pre-post comparison of employment status and income
What’s wrong with this?
• So we combine with a beneficiary survey
Is this much better?
A simple counterfactual (random assignment)
• 10,000 candidates for the training: randomly assign 5,000 to training and 5,000 to traditional support
• Compare employment status and earnings one year after training
• What’s useful about this?
• Can you see any potential problems?
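The random assignment idea can be sketched as a small simulation. This is only an illustration under invented assumptions: the baseline employment chances, the `true_effect` of 10 percentage points and the function name `simulate_random_assignment` are hypothetical and do not come from any real programme.

```python
import random
from statistics import mean


def simulate_random_assignment(n=10_000, true_effect=0.10, seed=0):
    """Randomly assign half of n candidates to training, half to traditional support.

    Hypothetical set-up: every candidate has an unobserved baseline chance of
    being employed one year later; training adds `true_effect` to that chance.
    Returns the difference in employment rates (trained minus not trained).
    """
    rng = random.Random(seed)
    candidates = list(range(n))
    rng.shuffle(candidates)
    treated = set(candidates[: n // 2])  # 5,000 trained if n = 10,000

    outcomes_treated, outcomes_control = [], []
    for i in range(n):
        baseline = rng.uniform(0.2, 0.6)  # unobserved "employability"
        p_employed = baseline + (true_effect if i in treated else 0.0)
        employed = 1 if rng.random() < p_employed else 0
        if i in treated:
            outcomes_treated.append(employed)
        else:
            outcomes_control.append(employed)

    return mean(outcomes_treated) - mean(outcomes_control)


estimate = simulate_random_assignment()
print(f"Estimated impact on the employment rate: {estimate:.3f}")
```

Because assignment is random, the two groups have the same mix of « employability » on average, so the simple difference in employment rates should land close to the assumed effect.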
Let’s try again (« discontinuity design »)
• Offer the training to all
• For evaluation, compare a subset of these with a similar, but non-eligible group:
  • Unemployed for 12-15 months (eligible)
  • Unemployed for 9-12 months (not eligible)
• What’s better about this than the previous evaluation example?
• What’s worse?
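The comparison of « just eligible » with « just missed it » can be sketched as follows. This is a deliberately naive version (a plain difference in means either side of the cut-off, with no adjustment for trends in unemployment duration). The function name, the toy records and the choice to treat exactly 12 months as not eligible (following the « >12 months » rule) are assumptions for illustration.

```python
from statistics import mean


def discontinuity_estimate(records, cutoff=12, bandwidth=3):
    """Compare mean outcomes just above and just below the eligibility cut-off.

    records: list of (months_unemployed, outcome) pairs, where outcome is
    1 if employed one year later and 0 otherwise (hypothetical data layout).
    """
    just_eligible = [y for months, y in records
                     if cutoff < months <= cutoff + bandwidth]
    just_missed = [y for months, y in records
                   if cutoff - bandwidth <= months <= cutoff]
    return mean(just_eligible) - mean(just_missed)


# Invented toy data, purely to show the mechanics
records = [
    (9, 0), (10, 0), (11, 1), (12, 1),      # 9-12 months: not eligible
    (13, 1), (13.5, 1), (14, 1), (15, 0),   # 12-15 months: eligible
    (3, 0), (20, 1),                        # outside both bands: ignored
]
rd_estimate = discontinuity_estimate(records)
print(rd_estimate)  # 0.75 - 0.5 = 0.25
```

A narrower `bandwidth` makes the two groups more alike but leaves fewer units to compare, which is one of the trade-offs of this design.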
3rd time lucky (« pipeline »)
• This time we stagger the training over 2 years
• 5,000 are randomly chosen to take the training this year, 5,000 next year
• Next year’s treatment group is this year’s control group
• What’s good about this?
• What limitations can you see?
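A minimal sketch of the staggered (« pipeline ») assignment, assuming candidates are identified by simple integer ids. The function name and the dictionary keys `year_1` and `year_2` are hypothetical.

```python
import random


def assign_pipeline_cohorts(candidate_ids, seed=0):
    """Randomly split candidates into this year's and next year's training cohort.

    Until the second cohort starts training, it serves as the control group
    for the first cohort.
    """
    rng = random.Random(seed)
    shuffled = list(candidate_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"year_1": shuffled[:half], "year_2": shuffled[half:]}


cohorts = assign_pipeline_cohorts(range(10_000))
print(len(cohorts["year_1"]), len(cohorts["year_2"]))  # 5000 5000
```

Note that outcomes can only be compared in the window before the second cohort is trained, so in this set-up the design measures effects over at most about one year.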
Some observations
Notice:
• This is not just one method, but a family of methods
• Two families in fact - we’ll come back to this
• Different possibilities have different strengths & weaknesses, and therefore different applications
• They vary from simple to very complicated
• We’ll look at common features and requirements now (with Kai)
What do we need?
Kai Stryczynski, Evaluation Unit, DG REGIO
The methods require
• Large « n », ie a large number of similar units (so that random differences between groups average out)
• Good data for treated and non-treated units
  • Basic data (who are the beneficiaries?)
  • Target variables (what is policy trying to change?)
  • Descriptive variables (eg to help us find matches)
• Ability to match the various datasets
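Why a large « n » matters can be shown with the textbook standard error of a difference between two proportions. The employment rates of 50% and 40% below are invented, and the function name is hypothetical. The point is only how the uncertainty shrinks as the groups grow.

```python
import math


def se_difference(p_treated, p_control, n_per_group):
    """Standard error of the difference between two independent proportions."""
    return math.sqrt(
        p_treated * (1 - p_treated) / n_per_group
        + p_control * (1 - p_control) / n_per_group
    )


# Hypothetical employment rates: 50% for treated units, 40% for non-treated
for n in (50, 500, 5000):
    se = se_difference(0.5, 0.4, n)
    print(f"n per group = {n:>4}: standard error = {se:.3f}, "
          f"approx. 95% margin = +/- {1.96 * se:.3f}")
```

With 50 units per group the margin of error is larger than the 10-point difference itself. With 5,000 per group it is around 2 points.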
Sectoral applicability
Good candidates (large « n ») include:
• Enterprise support (including R&D)
• Labour market and training measures
• Other support to individuals (eg social)
But… only where good data exist
Bad candidates (small « n ») include:
• Large infrastructure (transport, waste water etc)
• Networks (eg regional innovation systems)
Rule of thumb
< 50% of cases applicable, of which < 50% have enough data (so the methods are feasible in fewer than 25% of cases overall)
And even then, be selective. It’s a powerful learning tool, but can be hard work & expensive.
A pragmatic strategy
• Two-pronged approach (monitor everything, evaluate a selection)
• Counterfactual impact evaluation (CIE) where we can, classic methods where we can’t (a survey is better than nothing)
• Mix methods (triangulate, use qualitative methods to explain CIE results)
• Be honest and humble about what we know... and don’t know
• Use working hypotheses, build up the picture over time
Let’s get started
Daniel Mouqué, Evaluation Unit, DG REGIO
The options
There are many options…
… but two broad families of counterfactual impact evaluation
A « rule of thumb »
• Randomised/experimental methods are most likely to be useful for:
  • Pilot projects
  • Different treatment options (especially genuine policy choices, such as grants vs loans)
• Quasi-experimental methods are more generally applicable
• However, randomised methods are simpler, so a good introduction
(Quasi-experimental methods in depth tomorrow)
New friends part 1: Experimental (« randomised ») methods
Some experimental/randomised options for your exercises in the group work:
• Random assignment
• Pipeline (delaying treatment for some)
• Random encouragement
Tip: most costly (they interfere with the selection process), but most reliable
New friends part 2: Quasi-experimental methods
• You don’t need to know all these yet (tomorrow will treat them in depth)
• Intuition: treat as usual, then compare treated units with similar, but not quite comparable, non-treated units
• Difference-in-difference
• Discontinuity design (comparing « just qualified for treatment » with « just missed it »)
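As a first intuition for difference-in-difference: take the change over time for treated units and subtract the change over the same period for similar non-treated units. A minimal sketch, with invented figures for average jobs per firm and a hypothetical function name:

```python
def difference_in_differences(treated_before, treated_after,
                              control_before, control_after):
    """Change in the treated group minus change in the non-treated group."""
    return (treated_after - treated_before) - (control_after - control_before)


# Invented example: average number of jobs per firm
impact = difference_in_differences(
    treated_before=20, treated_after=26,   # assisted firms: +6
    control_before=21, control_after=24,   # non-assisted firms: +3
)
print(impact)  # 3
```

A simple before vs after comparison would credit the grant with all 6 extra jobs. The non-treated firms suggest that 3 of them would have appeared anyway, provided both groups would otherwise have followed the same trend.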
In your group work, we want you to start thinking
• What are the policy/impact questions in my field(s)?
• Can I randomise from the beginning, to get an insight into these results?
• Random or not (and often the answer will be not!), can I get outcome data for similar non-treated units?
The set-up
• Eastern Germany
• Investment and R&D grants to firms
• Do they really increase investment and employment?
• Could not randomise (too late, too political)
• Clever matching procedures (we’ll tell you more later in the course) to compare similar assisted/non-assisted firms
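To give a flavour of what « finding twins » means, here is a deliberately simple nearest-neighbour match on a single characteristic (firm size). This is not the procedure used in the Eastern Germany study, which the slides do not describe. The firm data, the matching variable and the function name are all invented for illustration.

```python
from statistics import mean


def nearest_neighbour_effect(assisted, non_assisted):
    """Match each assisted firm to the non-assisted firm closest in size.

    Each firm is a (size, outcome) pair. Matching is done with replacement,
    so one non-assisted firm may serve as the twin of several assisted firms.
    Returns the average outcome gap between assisted firms and their twins.
    """
    gaps = []
    for size, outcome in assisted:
        twin = min(non_assisted, key=lambda firm: abs(firm[0] - size))
        gaps.append(outcome - twin[1])
    return mean(gaps)


# Invented data: (employees, investment in thousand euro)
assisted = [(10, 120), (50, 400), (200, 1500)]
non_assisted = [(12, 100), (45, 380), (210, 1450), (1000, 9000)]

print(nearest_neighbour_effect(assisted, non_assisted))  # (20 + 20 + 50) / 3 = 30
```

The very large non-assisted firm is never used as a twin, which is the point of matching: only comparable units enter the comparison. Real applications match on many characteristics at once.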
The results
• Investment grants of €8k/employee led to estimated extra investment of €11-12k
• The same grants led to 25,000-30,000 extra jobs
• R&D grants of €8k/employee led to €8k extra investment
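One way to read these figures is as extra investment per euro of grant. The sketch below assumes both amounts are per employee. The slide does not say whether the extra investment includes the grant itself, so the ratios are indicative only, and the function name is hypothetical.

```python
def extra_investment_per_grant_euro(extra_investment_k, grant_k):
    """Extra investment generated per euro of grant (both in thousand euro)."""
    return extra_investment_k / grant_k


GRANT_K = 8  # grant of EUR 8k per employee, from the results above

investment_low = extra_investment_per_grant_euro(11, GRANT_K)   # 11 / 8 = 1.375
investment_high = extra_investment_per_grant_euro(12, GRANT_K)  # 12 / 8 = 1.5
rd_ratio = extra_investment_per_grant_euro(8, GRANT_K)          # 8 / 8 = 1.0

print(f"Investment grants: {investment_low:.3f} to {investment_high:.3f} per euro of grant")
print(f"R&D grants: {rd_ratio:.3f} per euro of grant")
```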
What does this tell us?
This gives comfort to the views:
• Enterprise and R&D grants work in lagging regions (at the very least, generate private investment)
• Grants have a bigger effect on productivity than on jobs
• Gross jobs - especially jobs safeguarded - overstate the case
What does this not tell us?
We still do not know for certain:
• If the same pattern would hold outside E. Germany (specific situation, specific selection process)
• If investment will translate into long-term growth and R&D (but we have weakened the assumption)
• If other instruments are better than grants
• Crowding out in other enterprises
• How to cure cancer (astonishingly, 1 study did not crack all the secrets of the universe)
But we know more than before, and this is not the last evaluation we will ever do
Potential benefits - motivation for the coming days
• Learning what works, and by how much (typically)
• Learning what instrument is appropriate in a given situation (eg grants or advice to enterprise)
• Learning on whom to target assistance (« stratification », eg target a training measure on the group most likely to benefit)
• Building up a picture over time