Impact Evaluation for Evidence-Based Policy Making Arianna Legovini Lead, Africa Impact Evaluation Initiative AFTRL
Answer Three Questions • Why is evaluation valuable? • What makes a good impact evaluation? • How to implement evaluation?
Why Evaluate? • Need evidence on what works • Allocate limited budget • Fiscal accountability • Improve programs/policies over time • Operational research • Managing by results • Information is key to sustainability • Negotiating budgets • Informing constituents and managing the press • Informing donors
What is different between traditional M&E and Impact Evaluation? • Monitoring tracks implementation efficiency (input → output) • Impact evaluation measures effectiveness (output → outcome) [Diagram: $$$ → INPUTS → OUTPUTS → OUTCOMES → BEHAVIOR, with "monitor efficiency" spanning inputs to outputs and "evaluate effectiveness" spanning outputs to outcomes]
Question types and methods • Process Evaluation / Monitoring: • Is the program being implemented efficiently? • Is the program targeting the right population? • Are outcomes moving in the right direction? → Descriptive analysis • Impact Evaluation: • What was the effect of the program on outcomes? • How would outcomes change under alternative program designs? • Does the program impact people differently (e.g. females, the poor, minorities)? • Is the program cost-effective? → Causal analysis
Which can be answered by traditional M&E and which by IE? • Are ITNs being delivered as planned? → M&E • Does school-based delivery of malaria treatment increase school attendance? → IE • What is the correlation between health coverage and under-fives receiving treatment within 24 hours of the onset of fever? → M&E • Does the house-to-house approach increase the share of under-fives sleeping under ITNs relative to communities with other community-based approaches? → IE
Types of Impact Evaluation • Efficacy: • Proof of Concept • Pilot under ideal conditions • Effectiveness: • At scale • Normal circumstances & capabilities • Lower or higher impact? • Higher or lower costs?
So, use impact evaluation to… • Test innovations • Scale up what works (e.g. de-worming) • Cut or change what does not (e.g. HIV counseling) • Measure the effectiveness of programs (e.g. JTPA) • Find the best tactics to change people's behavior (e.g. getting patients to come to the clinic) • Manage expectations, e.g. PROGRESA/OPORTUNIDADES (Mexico): • Transition across presidential terms • Expansion to 5 million households • Change in benefits • Battle with the press
Next question please • Why is evaluation valuable? • What makes a good impact evaluation? • How to implement evaluation?
Assessing impact: examples • How much does an anti-malaria program lower under-five mortality? • What is the beneficiary's health status with the program compared to without it? • Ideal: compare the same individual with & without the program at the same point in time • Problem: we can never observe the same individual with and without the program at the same point in time
Solving the evaluation problem • Counterfactual: what would have happened without the program • Need to estimate the counterfactual, i.e. find a control or comparison group • Counterfactual criteria: • Treated & counterfactual groups have identical initial characteristics on average • The only reason outcomes differ is the intervention
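These criteria can be stated compactly in potential-outcomes notation (a standard formalization added here for clarity; the symbols Y_1, Y_0 and T do not appear in the original deck):

```latex
% Y_1: outcome with the program; Y_0: outcome without; T = 1 if treated.
% Average impact of the program on the treated:
\alpha = \mathbb{E}[Y_1 \mid T = 1] - \mathbb{E}[Y_0 \mid T = 1]
% The second term is the unobservable counterfactual. A comparison group
% identifies \alpha only if it satisfies the criteria above, i.e.
\mathbb{E}[Y_0 \mid T = 1] = \mathbb{E}[Y_0 \mid T = 0]
```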
2 “Counterfeit” Counterfactuals • Before and after: the same individuals, observed before the treatment • Non-participants: • Those who chose not to enroll in the program • Those who were not offered the program
Before and After Example • Food aid: compare mortality before and after • Find an increase in mortality; did the program fail? • "Before" was a normal year, but "after" was a famine year • Cannot separate (identify) the effect of food aid from the effect of the drought or epidemic
Before and After • Compare Y before (t−1) and after (t) the intervention • B = outcome before; using B as the counterfactual, the estimated impact is A − B • Controlling for time-varying factors gives the true counterfactual C, and the true impact A − C • Since C here lies below B, the before-after estimate A − B under-estimates the true impact [Figure: Y plotted against time, treatment at t; observed points B (at t−1) and A (at t), true counterfactual C below B]
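Written out with the slide's labels (A = observed outcome after, B = outcome before, C = the true counterfactual at the same date), the before-after estimator decomposes as:

```latex
\underbrace{A - B}_{\text{before-after estimate}}
  = \underbrace{(A - C)}_{\text{true impact}}
  + \underbrace{(C - B)}_{\text{change due to time-varying factors}}
```

The estimate is unbiased only when C = B, i.e. when nothing other than the program changed between t−1 and t.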
Non-Participants…. • Compare non-participants to participants • Counterfactual: non-participant outcomes • Problem: why did they not participate?
Exercise: why might participants and non-participants differ? • Mothers who came to the health unit for ORT vs. mothers who did not → child had diarrhea; access to the clinic • Communities that applied for IRT funds vs. communities that did not → coastal vs. mountain; epidemic vs. non-epidemic • Children who received ACT vs. children who did not → child had fever; access to the clinic
Health program example • Treatment is offered; who signs up? • Those who are sick • Areas with epidemics • Sign-ups have lower health status than those who do not sign up • Healthy people/communities are a poor estimate of the counterfactual
Health insurance example • Health insurance is offered; who buys it, and who does not? • Compare health care utilization of those who bought insurance to those who did not • Cannot separately identify the impact of insurance on utilization from the effect of the underlying health status that drove the purchase
What's wrong? • Selection bias: people choose to participate for specific reasons • Often those reasons are directly related to the outcome of interest • Health insurance: health status and medical expenditures • Cannot separately identify the impact of the program from these other factors/reasons
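A minimal simulation makes the insurance example concrete (an illustrative sketch, not from the original deck; all numbers are invented):

```python
import numpy as np

# Sicker people are more likely to buy insurance; insurance truly adds 1 visit.
rng = np.random.default_rng(0)
n = 100_000
sickness = rng.normal(0.0, 1.0, n)               # latent health status (higher = sicker)
buys = sickness + rng.normal(0.0, 1.0, n) > 0    # self-selection into insurance
true_effect = 1.0
visits = 2 + 2 * sickness + true_effect * buys + rng.normal(0.0, 1.0, n)

naive = visits[buys].mean() - visits[~buys].mean()
print(f"true effect: {true_effect:.2f}, naive insured-vs-uninsured gap: {naive:.2f}")
```

The naive gap comes out near 3.3, not 1.0: it bundles the insurance effect with the extra sickness of those who chose to buy, which is exactly the selection bias described above.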
Program placement example • Government offers a family planning program to villages with high fertility • Compare fertility in villages offered the program to fertility in villages not offered it • The program is targeted based on fertility, so: • Treatment villages have high fertility • Counterfactual villages have low fertility • Cannot separately identify the program impact from the geographic targeting criteria
Need to know… • Why some get the program and others do not • How some end up in the treatment group and others in the control group • If those reasons are correlated with the outcome, we cannot identify/separate the program impact from other explanations for differences in outcomes • In short: the process by which the data are generated
Possible Solutions… • Guarantee comparability of treatment and control groups • ONLY remaining difference is intervention • In this workshop we will consider • Experimental design/randomization • Quasi-experiments • Regression Discontinuity • Double differences • Instrumental Variables
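As a preview of one of these methods, here is a minimal double-differences (difference-in-differences) sketch on simulated data, assuming the treated and comparison groups would have followed parallel trends absent the program (illustrative only, not from the original deck):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
treated = rng.random(n) < 0.5
# Baseline gap of +1 for treated units, a common time trend of +0.5,
# and a true program effect of +2 on treated units after the intervention.
y_before = 10 + 1.0 * treated + rng.normal(0, 1, n)
y_after = 10 + 1.0 * treated + 0.5 + 2.0 * treated + rng.normal(0, 1, n)

did = ((y_after[treated].mean() - y_before[treated].mean())
       - (y_after[~treated].mean() - y_before[~treated].mean()))
print(f"difference-in-differences estimate: {did:.2f}")  # close to 2.0
```

The first difference removes each group's baseline level; the second removes the common time trend, leaving only the program effect.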
These solutions all involve… • Randomization: • Give everyone an equal chance of being in the control or treatment group • Guarantees that all factors/characteristics will be equal on average between the groups • The only difference is the intervention • If randomization is not possible, we need transparent & observable criteria for who is offered the program
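A short simulation illustrates why equal chances balance characteristics on average (hypothetical data, added for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
treat = rng.permutation(np.repeat([True, False], n // 2))  # equal chance for every unit
baseline = rng.normal(5.0, 1.0, n)                         # any pre-program characteristic
print(f"treatment mean: {baseline[treat].mean():.2f}, "
      f"control mean: {baseline[~treat].mean():.2f}")
```

The two means differ only by chance, and the same holds for every other characteristic, measured or not; that is what lets us attribute post-program differences to the intervention.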
The Last Question • Why is evaluation valuable? • What makes a good impact evaluation? • How to implement evaluation?
Implementation Issues • Political economy • Policy context • Finding a good control • Retrospective versus prospective designs • Making the design compatible with operations • Ethical Issues • Relationship to “results” monitoring
Political Economy • What is the policy purpose? • In the USA: derail from national policy, defend budgets • In RSA: answer to the electorate • In Mexico: allocate budget across poverty programs • In IDA countries: pressure to demonstrate aid effectiveness and scale up • In poor countries: hard budget constraints and ambitious targets
Political Economy • Cultural shift: • From retrospective evaluation: look back and judge • To prospective evaluation: decide what we need to learn, experiment with alternatives, measure and inform, adopt better alternatives over time • Change in incentives: • Reward changing programs that do not work • Reward generating knowledge • Separate job performance from knowledge generation
The Policy Context • Address policy-relevant questions: • What policy questions need to be answered? • What outcomes answer those questions? • What indicators measure those outcomes? • How much of a change in the outcomes would constitute success? • Example: teacher performance-based pay • Scale up the pilot? • Criterion: at least a 10% increase in test scores with no change in unit costs
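A success criterion like this also pins down the evaluation design, because the minimum effect worth detecting drives the sample size. A back-of-envelope sketch (the test-score mean and SD below are invented assumptions, not from the deck):

```python
from statistics import NormalDist

alpha, power = 0.05, 0.80
effect = 0.10 * 50       # the 10% threshold, on a test with an assumed mean of 50
sd = 15                  # assumed standard deviation of test scores
z = NormalDist().inv_cdf
n_per_arm = 2 * ((z(1 - alpha / 2) + z(power)) * sd / effect) ** 2
print(f"roughly {n_per_arm:.0f} students per arm")  # ~141
```

This is the simplest two-sample formula; clustering of students within schools would raise the required sample.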
Prospective designs • Use opportunities to generate good control groups • Most programs cannot deliver benefits to all those eligible • Budgetary limitations: • Eligibles who get the program are potential treatments • Eligibles who do not are potential controls • Logistical limitations: • Those who go first are potential treatments • Those who go later are potential controls
Who gets the program? • Eligibility criteria: • Are benefits targeted? How are they targeted? • Can we rank eligibles by priority? • Are the measures good enough for fine rankings? • Roll-out: who goes first? • Can everyone have an equal chance to go first, second, third? (a sketch follows below)
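A minimal sketch of such a randomized phase-in (the district names are hypothetical):

```python
import random

random.seed(3)
districts = [f"district_{i}" for i in range(1, 13)]  # hypothetical eligible districts
random.shuffle(districts)                            # equal chance of any position
phases = [districts[i::3] for i in range(3)]         # three rollout phases of four
for phase, group in enumerate(phases, start=1):
    print(f"phase {phase}: {group}")
```

While phase 1 receives the program, phases 2 and 3 serve as controls; the comparison group shrinks as later phases are treated.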
Ethical Considerations • Do not delay benefits: Rollout based on budget/administrative constraints • Equity: equally deserving beneficiaries deserve an equal chance of going first • Transparent & accountable method • Give everyone eligible an equal chance • If rank based on some criteria, then criteria should be quantitative and public
Retrospective Designs • Hard to find good control groups • Must live with arbitrary or unobservable allocation rules • Administrative data must be good enough to show the program was implemented as described • Need a pre-intervention baseline survey • On both controls and treatments • With covariates to control for initial differences • Without a baseline it is difficult to use quasi-experimental methods
Manage for results • Retrospective evaluation cannot be used to manage for results • Use resources wisely: invest in a prospective evaluation design • Better methods • More tailored policy questions • Precise estimates • Timely feedback and program changes • Improve results on the ground
Monitoring Systems • Projects/programs regularly collect data for management purposes • Typical content: • Lists of beneficiaries • Distribution of benefits • Expenditures • Outcomes • Ongoing process evaluation • This information is also needed for impact evaluation
Evaluation uses this information to verify: • Who the beneficiaries are • When benefits started • What benefits were actually delivered • A necessary condition for the program to have an impact: benefits must reach the targeted beneficiaries
Improve the use of monitoring data for IE • Program monitoring data are usually collected only in areas where the program is active • Collect a baseline for control areas as well • Very cost-effective, since little need remains for additional special surveys • Add a couple of outcome indicators • Most IEs use only monitoring data
Overall Messages • Impact evaluation is useful for: • Validating program design • Adjusting program structure • Communicating to the finance ministry & civil society • A good evaluation design requires estimating the counterfactual: what would have happened to beneficiaries had they not received the program • Need to know all the reasons why some got the program & others did not
Design Messages • Address policy questions: what is interesting is what the government needs and will use • Secure stakeholder buy-in • Prospective designs are the easiest to use • Good monitoring systems & administrative data can improve IE and lower costs