Impact Evaluation for Evidence-Based Policy Making Arianna Legovini Lead, Africa Impact Evaluation Initiative
Answer Three Questions • Why is evaluation valuable? • What makes a good impact evaluation? • How to implement evaluation?
Why Evaluate? • Need evidence on what works • Allocate limited budget • Fiscal accountability • Improve program/policy over time • Operational research • Managing by results • Information is key to sustainability • Negotiating budgets • Informing constituents and managing the press • Informing donors
Traditional M&E and Impact Evaluation • Monitoring tracks implementation efficiency along the input-output link of the results chain • Impact evaluation measures effectiveness, i.e. the behavioral link from outputs to outcomes • [Diagram: $$$ → INPUTS → OUTPUTS → OUTCOMES, with "monitor efficiency" over the input-output link and "evaluate effectiveness" over the output-outcome link]
Question types and methods • Process Evaluation / Monitoring: • Is the program being implemented efficiently? • Is the program targeting the right population? • Are outcomes moving in the right direction? → Descriptive analysis • Impact Evaluation: • What was the effect of the program on outcomes? • How would outcomes change under alternative program designs? • Does the program impact people differently (e.g. females, the poor, minorities)? • Is the program cost-effective? → Causal analysis
Which can be answered by traditional M&E and which by IE? • Are books being delivered as planned? → M&E • Does de-worming increase school attendance? → IE • What is the correlation between enrollment and school quality? → M&E • Does decentralized school management lead to an increase in learning achievement? → IE
Types of Impact Evaluation • Efficacy: • Proof of Concept • Pilot under ideal conditions • Effectiveness: • At scale • Normal circumstances & capabilities • Lower or higher impact? • Higher or lower costs?
So, use impact evaluation to… • Test innovations • Scale up what works (e.g. de-worming) • Cut or change what does not (e.g. HIV counseling) • Measure the effectiveness of programs (e.g. JTPA) • Find the best tactics to change people's behavior (e.g. getting people to come to the clinic) • Manage expectations, e.g. PROGRESA/OPORTUNIDADES (Mexico): • Transition across presidential terms • Expansion to 5 million households • Change in benefits • Battle with the press
Next question please • Why is evaluation valuable? • What makes a good impact evaluation? • How to implement evaluation?
Assessing impact: examples • How much do girl scholarships increase school enrollment? • What is beneficiaries' learning achievement with the program compared to without it? • Ideal: compare the same individual with and without the program at the same point in time • Problem: we can never observe the same individual with and without the program at the same point in time
Solving the evaluation problem • Counterfactual: what would have happened without the program • Need to estimate the counterfactual, i.e. find a control or comparison group • Counterfactual criteria: • Treated and counterfactual groups have identical initial characteristics on average • The only reason for the difference in outcomes is the intervention
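The criteria can be stated in potential-outcomes notation, a standard formalization added here for precision (the notation is not from the slides):

```latex
% Potential outcomes: Y_i(1) = outcome of unit i with the program,
% Y_i(0) = outcome without it. The impact on unit i is
\[
  \Delta_i = Y_i(1) - Y_i(0),
\]
% but only one of the two is ever observed for any given unit.
% An evaluation therefore targets the average effect on the treated,
\[
  \mathrm{ATT} = \mathbb{E}[Y(1) \mid T=1] - \mathbb{E}[Y(0) \mid T=1],
\]
% replacing the unobservable second term with a comparison group.
% The counterfactual criteria amount to requiring
\[
  \mathbb{E}[Y(0) \mid T=1] = \mathbb{E}[Y(0) \mid T=0],
\]
% i.e. no selection bias.
```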
2 “Counterfeit” Counterfactuals • Before and after: • Same individual before the treatment • Non-Participants: • Those who choose not to enroll in program • Those who were not offered the program
Before and After Example • Food aid • Compare mortality before and after the program • Find an increase in mortality • Did the program fail? • "Before" was a normal year, but "after" was a famine year • Cannot separate (identify) the effect of food aid from the effect of the drought
Before and After • Compare Y before and after the intervention • B: the before-after counterfactual; A − B: estimated impact • Controlling for time-varying factors gives C, the true counterfactual; A − C is the true impact • So A − B under-estimates the true impact • [Figure: outcome Y over time from t−1 (treatment) to t, showing the observed outcome A, the before-after counterfactual B, and the true counterfactual C]
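A minimal simulation of this pitfall, using hypothetical numbers for the food-aid example (the effect sizes below are invented for illustration):

```python
# Before-after pitfall: a drought hits in the "after" period, so the
# simple before-after difference mixes the drought with the aid effect.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

mortality_before = rng.normal(10, 2, n)   # deaths per 1,000 in a normal year
drought_effect = 6                        # famine pushes mortality up (assumed)
aid_effect = -4                           # food aid pushes it down (assumed)

# Observed "after" outcome mixes both effects:
mortality_after = mortality_before + drought_effect + aid_effect

before_after = mortality_after.mean() - mortality_before.mean()
print(f"Before-after estimate: {before_after:+.1f}")  # ~ +2: aid looks harmful

# True counterfactual: the famine year without aid
counterfactual = mortality_before + drought_effect
true_impact = mortality_after.mean() - counterfactual.mean()
print(f"True impact:           {true_impact:+.1f}")   # ~ -4: aid saved lives
```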
Non-Participants… • Compare non-participants to participants • Counterfactual: non-participant outcomes • Problem: why did they not participate?
Exercise: why do participants and non-participants differ? • Children who come to school and children who do not? • Communities that applied for funds for a new classroom and communities that did not? • Children who received scholarships and children who did not? • Candidate differences: access to school, poverty, unmet demand, how organized the community is, achievement, gender
Literacy program example • Treatment is offered • Who signs up? • Those who are illiterate • They have lower education than those who do not sign up • Educated people are a poor estimate of the counterfactual
What's wrong? • Selection bias: people choose to participate for specific reasons • Often those reasons are directly related to the outcome of interest • Cannot separately identify the impact of the program from these other factors
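A sketch of this selection bias with invented numbers, where the less educated are the ones who enroll:

```python
# Selection bias in the literacy example (hypothetical data): the less
# educated sign up, so participants start from a lower baseline.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

schooling = rng.normal(6, 2, n)            # years of schooling (assumed)
literacy0 = 30 + 5 * schooling             # literacy score without the program
signs_up = schooling < 5                   # only the less educated enroll

program_effect = 10                        # true effect (assumed)
literacy = literacy0 + program_effect * signs_up

naive = literacy[signs_up].mean() - literacy[~signs_up].mean()
print(f"Participants minus non-participants: {naive:+.1f}")  # strongly biased
print(f"True program effect:                 {program_effect:+.1f}")
```

The naive comparison is biased downward because it confounds the program effect with the pre-existing schooling gap between the two groups.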
Program placement example • Government offers a school-inputs program to schools with low infrastructure • Compare achievement in schools offered the program to achievement in schools not offered it • The program was targeted based on lack of inputs, so: • treatment schools have low achievement • comparison schools have high achievement • Cannot separately identify the program impact from the school targeting criteria
Need to know… • Why some get the program and others do not • How some get into the treatment group and others into the control group • If those reasons are correlated with the outcome, we cannot identify (separate) the program impact from other explanations of differences in outcomes • In short: the process by which the data are generated
Possible Solutions… • Guarantee comparability of treatment and control groups • ONLY remaining difference is intervention • In this workshop we will consider • Experimental design/randomization • Quasi-experiments • Regression Discontinuity • Double differences • Instrumental Variables
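Of the methods just listed, double differences is the quickest to sketch; the numbers below are hypothetical and reuse the food-aid illustration from above:

```python
# Double differences (diff-in-diff): subtract the control group's
# before-after change (the drought) from the treated group's change,
# leaving the program effect. All numbers are made up for illustration.
treated_before, treated_after = 10.0, 12.0   # mortality per 1,000, aid villages
control_before, control_after = 10.0, 16.0   # no aid: drought alone

diff_treated = treated_after - treated_before     # +2 = drought + aid
diff_control = control_after - control_before     # +6 = drought only
double_difference = diff_treated - diff_control   # -4 = aid effect
print(f"Double-difference estimate: {double_difference:+.1f}")
```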
These solutions all involve… • Randomization: give everyone an equal chance of being in the control or treatment group (sketched below) • This guarantees that all factors/characteristics will be equal on average between the groups • The only difference is the intervention • If randomization is not possible, we need transparent and observable criteria for who is offered the program
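Rerunning the literacy sketch with random assignment shows the point: schooling balances across groups, so a simple difference in means recovers the true effect (numbers again invented):

```python
# Same hypothetical population as before, but the program is now
# randomly assigned, so baseline schooling is balanced across groups.
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

schooling = rng.normal(6, 2, n)
literacy0 = 30 + 5 * schooling
treated = rng.random(n) < 0.5              # coin flip: equal chance for all

program_effect = 10
literacy = literacy0 + program_effect * treated

print(f"Mean schooling, treated: {schooling[treated].mean():.2f}")   # ~6.0
print(f"Mean schooling, control: {schooling[~treated].mean():.2f}")  # ~6.0
estimate = literacy[treated].mean() - literacy[~treated].mean()
print(f"Estimated impact: {estimate:+.1f} (true: {program_effect:+.1f})")
```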
The Last Question • Why is evaluation valuable? • What makes a good impact evaluation? • How to implement evaluation?
Implementation Issues • Political economy • Policy context • Finding a good control • Retrospective versus prospective designs • Making the design compatible with operations • Ethical Issues • Relationship to “results” monitoring
Political Economy • What is the policy purpose? • In the USA: test innovations to national policy, defend the budget • In RSA (South Africa): answer to the electorate • In Mexico: allocate the budget across poverty programs • In an IDA country: pressure to demonstrate aid effectiveness and scale up • In a poor country: hard constraints and ambitious targets, so how to reach those targets?
Evidence culture and incentives for change • Cultural shift • From retrospective evaluation: look back and judge • To prospective evaluation: decide what we need to learn, experiment with alternatives, measure and inform, adopt better alternatives over time • Change in incentives • Rewards for changing programs that do not work • Rewards for generating knowledge • Separating job performance from knowledge generation
The Policy Context • Address policy-relevant questions: • What policy questions need answers? • What outcomes answer those questions? • What indicators measure those outcomes? • How much of a change in the outcomes would constitute success? • Example: teacher performance-based pay • Scale up the pilot? • Criterion: need at least a 10% increase in test scores with no change in unit costs
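A toy version of that scale-up decision rule, with made-up pilot results (the numbers and variable names are illustrative only):

```python
# Hypothetical scale-up check for the performance-pay example, using the
# slides' criterion: >= 10% test-score gain with no increase in unit cost.
baseline_score, pilot_score = 52.0, 58.5   # average test scores (made up)
baseline_cost, pilot_cost = 100.0, 101.0   # unit costs (made up)

score_gain = (pilot_score - baseline_score) / baseline_score
scale_up = score_gain >= 0.10 and pilot_cost <= baseline_cost

print(f"Test-score gain: {score_gain:.1%}")  # 12.5%
print(f"Scale up? {scale_up}")               # False: unit cost rose
```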
Opportunities for good designs • Use opportunities to generate good control groups • Most programs cannot deliver benefits to all those eligible • Budgetary limitations: • Eligible who get it are potential treatments • Eligible who do not are potential controls • Logistical limitations: • Those who go first are potential treatments • Those who go later are potential controls
Who gets the program? • Eligibility criteria: • Are benefits targeted? • How are they targeted? • Can we rank eligible beneficiaries by priority? • Are the measures good enough for fine rankings? • Who goes first? • Roll-out: does everyone get an equal chance to go first, second, third?
Ethical Considerations • Do not delay benefits: roll out based on budget/administrative constraints • Equity: equally deserving beneficiaries should get an equal chance of going first • Use a transparent and accountable method • Give everyone eligible an equal chance • If ranking is based on some criteria, the criteria should be quantitative and public
Retrospective Designs • Hard to find good control groups • Must live with arbitrary or unobservable allocation rules • Administrative data must be good enough to show the program was implemented as described • Need a pre-intervention baseline survey • Covering both controls and treatments • With covariates to control for initial differences • Without a baseline it is difficult to use quasi-experimental methods
Manage for results • Retrospective evaluation cannot be used to manage for results • Use resources wisely: invest in a prospective evaluation design • Better methods • More tailored policy questions • Precise estimates • Timely feedback and program changes • Improved results on the ground
Monitoring Systems • Projects/programs regularly collect data for management purposes • Typical content: • Lists of beneficiaries • Distribution of benefits • Expenditures • Outputs • Ongoing process evaluation • This information is also needed for impact evaluation
Evaluation uses administrative information to verify: • Who the beneficiaries are • When they started • What benefits were actually delivered • This checks a necessary condition for the program to have an impact: benefits must reach the targeted beneficiaries
Improve the use of administrative data for IE • Program monitoring data are usually collected only in areas where the program is active • Collect a baseline for control areas as well • Very cost-effective, since there is little need for additional special surveys • Add a couple of outcome indicators • Most IEs use only monitoring data
Overall Messages • Impact evaluation is useful for • Validating program design • Adjusting program structure • Communicating to the finance ministry and civil society • A good evaluation design requires estimating the counterfactual • What would have happened to beneficiaries if they had not received the program • Need to know all the reasons why beneficiaries got the program and others did not
Design Messages • Address policy questions • What is interesting is what government needs and will use • Stakeholder buy-in • It is easiest to use prospective designs • Good monitoring systems and administrative data can improve IE and lower costs