Getting what we pay for: impact evaluation for better planning and budgeting. Regional conference on public sector management in support of the MDGs, Bangkok, June 2012. Howard White, International Initiative for Impact Evaluation.
Impact evaluation: an example. The case of the Bangladesh Integrated Nutrition Project (BINP). Why did BINP fail?
The theory of change: right target group for nutritional counselling. Participation rates were up to 30% lower for women living with their mother-in-law.
The theory of change: knowledge acquired and used
The theory of change: the right children are enrolled in the programme
The theory of change: supplementary feeding is supplementary
Lessons from BINP • Apparent successes can turn out to be failures • Outcome monitoring does not tell us impact and can be misleading • A theory-based impact evaluation shows if something is working and why • The quality of the match matters for a rigorous study • The independent study reached different findings from the project-commissioned study
Stipends in rural China • Enrolments rose from 40 to 92 percent in project areas • So stipends “caused” growing enrolments amongst girls
“Results reporting” Results… cannot as a rule be attributed specifically, either wholly or in part, to the Netherlands. (Results report 2005-06)
Development effectiveness = how effective are development programmes = what difference did they make • To measure this we need impact evaluation • Results are what we achieved, not what would have happened anyway • So outcome monitoring is not enough
Take away message number 1: Results means impact, so only impact evaluation can tell us if we are achieving results. Results are not captured by outcome monitoring
What is impact evaluation? Impact evaluations answer the question of to what extent the intervention being evaluated altered the state of the world, that is, the (outcome) indicator with the intervention compared to what it would have been in the absence of the intervention: impact = $Y_t(1) - Y_t(0)$. We can see $Y_t(1)$, but we can’t see $Y_t(0)$, so we use a comparison group.
What do we need to measure impact? Girls’ secondary enrolment in rural China. The majority of evaluations have just this information … which means we can say absolutely nothing about impact
Before versus after single difference comparison. Before versus after = 92 – 40 = 52. “scholarships have led to rising schooling of young girls in the project villages” This ‘before versus after’ approach is outcome monitoring, which has become popular recently. Outcome monitoring has its place, but it is not impact evaluation
[Figure: rates of elementary school completion for male and female students in all of rural China’s poor areas; vertical axis: share of rural children; years shown: 1993 and 2008]
Post-treatment comparison. Single difference = 92 – 84 = 8. But we don’t know if they were similar before… though there are ways of addressing this (statistical matching = quasi-experimental approaches)
Double difference = (92 – 40) – (84 – 26) = 52 – 58 = -6. Conclusion: longitudinal (panel) data, with a comparison group, allow for the strongest impact evaluation design (though matching is still needed). SO WE NEED BASELINE DATA FROM PROJECT AND COMPARISON AREAS
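The three estimates on the preceding slides can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the four enrolment rates are the ones quoted on the slides, while the function and variable names are assumptions introduced here, and no matching or statistical inference is attempted.

```python
# Enrolment rates (percent) as quoted on the slides.
# Variable and function names are illustrative, not from the source.
project_before, project_after = 40, 92
comparison_before, comparison_after = 26, 84


def before_after(before, after):
    """Single difference over time in the project areas (outcome monitoring)."""
    return after - before


def post_treatment(project_after, comparison_after):
    """Single difference between project and comparison areas after the project."""
    return project_after - comparison_after


def double_difference(p_before, p_after, c_before, c_after):
    """Change in project areas minus change in comparison areas."""
    return (p_after - p_before) - (c_after - c_before)


print("Before versus after:", before_after(project_before, project_after))
print("Post-treatment comparison:", post_treatment(project_after, comparison_after))
print(
    "Double difference:",
    double_difference(
        project_before, project_after, comparison_before, comparison_after
    ),
)
```

With the slide figures this prints 52, 8 and -6. The sign flips once the comparison areas' own growth (58 points) is netted out, which is why the baseline data matter.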
Take away message number 2: Impact evaluation requires a valid comparison group, and baseline data really help. So ex ante design is best
Comparison group: an identical group of individuals, or households, or firms, or sub-districts, but NOT subject to the programme. Where do we get the comparison group from?
RANDOMIZATION
Random assignment of the intervention… Not the same as taking a random sample of the ‘treated’. Some examples…
Voter education (Rajasthan and Delhi) • Outcomes: voter turnout, vote share of incumbent, politician behavior, service delivery • Intervention: pre-election voter awareness campaigns (report cards) • Unit of assignment: 375 gram panchayats (GPs), half to get intervention
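Random assignment means a chance mechanism decides which units receive the intervention. It is not the same as drawing a sample from units that are already treated. A minimal sketch of unit-level assignment follows, assuming 375 units with hypothetical identifiers. The function name, the seed, and the rule of rounding the treated half down are assumptions for illustration, not details of the study.

```python
import random


def assign_treatment(unit_ids, seed=0):
    """Randomly assign half of the units (rounded down) to the intervention.

    Returns a dict mapping each unit id to True (intervention) or
    False (comparison). A fixed seed makes the assignment reproducible.
    """
    units = list(unit_ids)
    rng = random.Random(seed)
    shuffled = units[:]           # copy so the input order is left untouched
    rng.shuffle(shuffled)
    n_treated = len(shuffled) // 2
    treated = set(shuffled[:n_treated])
    return {unit: unit in treated for unit in units}


# Hypothetical identifiers for the 375 units of assignment.
gp_ids = [f"GP-{i:03d}" for i in range(1, 376)]
assignment = assign_treatment(gp_ids, seed=2012)

n_treated = sum(assignment.values())
print(n_treated, "intervention units,", len(assignment) - n_treated, "comparison units")
```

With 375 units this gives 187 intervention and 188 comparison units. Because the assignment is by chance, the two groups should be similar on average before the intervention, which is what makes the comparison group valid.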
Schooling and early marriage • Outcome: marriage, school attendance and attainment • Intervention: in-kind transfer for girls who remain in education and unmarried • Unit of assignment: village
Health-based education programs: eyeglasses, vitamin pills
Some different ways to randomize: pipeline, raised threshold
Overcoming resistance to randomization • There is probably an untreated population anyway • Need not randomly allocate the whole programme, just a part of it • Exploit: roll out, raised threshold • Encouragement designs • Don’t need a ‘no treatment’ control • RCTs are not unethical; spending money on programmes that don’t work is unethical
Take away message number 3: RCTs are possible in a large range of settings… though they are not the only way to conduct IE
Well designed IEs lead to more nuanced questions • E.g. conditional cash transfer second generation questions: • Conditions or not? • What sort of conditions? • Who to give money to? • How to give the money? • When and how often to give money?
Second generation questions: computer-assisted learning, CAL • Most cost effective number of children per computer? • What sort of software? • How much teacher training required? • What technological back up needed? • What age groups to target?
So conduct studies to inform design to get better results
Take away message number 4: Impact evaluation is not just about what works, but why, where and at what cost, and offers insights on intervention design, and so delivers better results
Implications for results-based budgeting In principle can identify priority outcomes, and what interventions are most cost effective in achieving these outcomes, and so allocate budget to things that work This IS being done in some countries…
But it’s not happening in most • “Evaluation is not systematically embedded in the GoU’s management practices…Because evaluation addresses issues such as actual progress in attainment of program objectives, cost effectiveness, and value for money, it responds to some of the aspects of Uganda’s M&E system that are most critically lacking.” • “There has been a general tendency to monitor rather than evaluate.” (Sri Lanka) • “…the distortion found in most countries of an excess of monitoring and a dearth of genuine evaluation.” (World Bank)
And attribution is not addressed • “…M&E is not geared toward understanding causality and attribution between the stages of development change.” (Uganda) • “Furthermore, while national and provincial treasuries have emphasized an approach to collecting information that is based on logical framework (log-frame) results chain, they have not focused on attribution or causality.” (South Africa)
Recommendations • Review current M&E systems and how they align with requirements for “results” • Identify some priority areas for impact evaluation, and commission a small number of studies (both ex post and ex ante) • Start development of a national framework to build systematic impact evaluation into M&E, and budgeting to ‘performance’ meaning results, meaning impact
Thank you Visit www.3ieimpact.org