Rigorous Impact Evaluation: What It Is About and How It Can Be Done In Practice. Alexandra Caspari, Frankfurt/Main, Germany. Conference »Perspectives on Impact Evaluation: Approaches to Assessing Development Effectiveness«, 31st March – 2nd April 2009, Cairo.
Historical Review – The Evaluation Gap
• MDGs (2000), ‘Paris Declaration on Aid Effectiveness’ (2005), and ‘Agenda for Action’ (Accra, 2008):
  • increasing attention to impact evaluations
  • lack of knowledge about the effectiveness of projects and programs
• 2006: report “When Will We Ever Learn?” by the CGD ‘Evaluation Gap Working Group’
  • gap in the quantity and quality of impact evaluations:
    • too few impact evaluations are being carried out, and
    • those conducted are often unable to assess impact properly because of methodological shortcomings
  • recommendation: ‘Collective Action’
• International initiatives (NONIE, 3IE, …)
What is Impact Evaluation?
• OECD/DAC (2002): “positive and negative, primary and secondary long-term effects produced by a development intervention, directly or indirectly, intended or unintended”
• the emphasis is on ‘produced by’:
  • measures impact with clear causation (causal attribution)
  • considers the counterfactual, i.e. the questions “What difference did this program make?” and “What would have happened without the intervention?”
Rigorous Impact Evaluation (RIE):
• distinguished from more “usual evaluations” by the addition of “rigorous”
• focus on clear causation
• use of adequate methods (to address methodological shortcomings)
• most important point: selecting an evaluation design that considers the counterfactual
The Counterfactual
• Causal effect: the actual effect $\delta_i$ caused by a treatment T (a program) is the outcome $Y_{i1}$ under the treatment (T=1), i.e. as a program participant, minus the alternative outcome $Y_{i0}$ that would have happened without the treatment (T=0), i.e. as a non-participant:
  $\delta_i = Y_{i1} - Y_{i0}$
• Impact is not directly observable:
  • any given individual can be observed either as a treated person (participant) or as an untreated person (non-participant), but never in both states
  • if individual i participates in a program (T=1), the outcome $Y_{i0}$ is unobservable
  • this unobservable outcome $Y_{i0}$ is called the counterfactual
• The task is to analyze the difference between the observed outcome and the unobserved potential outcome by choosing the best evaluation design
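The following is a minimal simulation sketch of the point made on this slide, not part of the original presentation. It assumes hypothetical individuals whose outcome without the program is drawn from a normal distribution and a constant program effect of 5.0. All names and numbers are illustrative. Both potential outcomes can be inspected only because the data are simulated. In a real evaluation only one of the two is ever observed per person.

```python
import random

def simulate_potential_outcomes(n, true_effect, seed=0):
    """Create hypothetical individuals with BOTH potential outcomes (Y0, Y1)."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        y0 = rng.gauss(50.0, 10.0)  # outcome without the program (T=0)
        y1 = y0 + true_effect       # outcome with the program (T=1)
        people.append((y0, y1))
    return people

def observe(people, seed=1):
    """Reveal only one potential outcome per person, chosen at random."""
    rng = random.Random(seed)
    observed = []
    for y0, y1 in people:
        t = rng.randint(0, 1)  # 1 = participant, 0 = non-participant
        observed.append((t, y1 if t == 1 else y0))
    return observed

def difference_in_means(observed):
    """Mean observed outcome of participants minus that of non-participants."""
    treated = [y for t, y in observed if t == 1]
    control = [y for t, y in observed if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

people = simulate_potential_outcomes(n=10_000, true_effect=5.0)
# Individual effects delta_i = Y_i1 - Y_i0: computable here only because the data are simulated.
individual_effects = [y1 - y0 for y0, y1 in people]
# What an evaluator could actually compute from observed data.
estimate = difference_in_means(observe(people))
```

Under random assignment the difference in group means approximates the average effect, even though no individual effect is observable. This is the reason the choice of evaluation design matters.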
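Considering the Counterfactual
[Figure: one-group pre-test post-test design (a). The impact indicator of the participants P is plotted against time, with observations at t1 and t2; the change between the two observations is labelled “measured impact”. Legend: ●: observation, P: participants (treated), t: time (first, second observation), X: project intervention]
• often-used non-experimental designs:
  • measured impact = change in the impact indicator of P between t1 and t2
  • the counterfactual is not considered!
• with non-experimental designs, causal attribution is not possible!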
Considering the Counterfactual
• necessary: experimental or quasi-experimental designs with an adequate comparison group (‘with-and-without comparison’)
• “Real” experiments / Randomized Controlled Trials (RCTs):
  • (Laboratory) experiments:
    • random assignment of individuals to treatment (P) and control group (C), so the groups differ solely due to chance
    • treatment and conditions are known/checkable
  • Field experiments:
    • take place in real-world settings
    • treatment and control groups are nevertheless assigned at random
• Quasi-experiments:
  • no random assignment
  • rely on a source of variation in who is treated that is “as if” randomly assigned
  • the control group is often reconstructed ex post
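Random assignment, as used in laboratory and field experiments, can be sketched in a few lines. This is an illustrative sketch only: the function name, the unit IDs and the fixed seed are assumptions, and a real trial would document and audit its assignment procedure.

```python
import random

def assign_at_random(unit_ids, seed=None):
    """Randomly split units into a treatment group (P) and a control group (C)."""
    rng = random.Random(seed)
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)           # chance alone decides the order
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Twenty hypothetical individuals, identified by the numbers 1 to 20.
treatment, control = assign_at_random(range(1, 21), seed=42)
```

Because only chance decides who ends up in which group, the two groups differ in expectation only by the treatment itself.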
Considering the Counterfactual
[Figure: three panels, each plotting the impact indicator against time (t1, t2).
 • one-group pre-test post-test design (a): only the participants P are observed, at t1 and t2
 • static group comparison (4): P and C are compared at t2; measured impact = Dt2 (single difference)
 • pre-test post-test control group design (1)/(2): P and C are observed at t1 and t2; measured impact = Dt2 – Dt1 (double difference)
 A further label in the figure marks an “over-estimated impact”.
 Legend: ●: observation, P: participants (treated), C: control group (non-treated), D: difference, t: time (first, second observation), X: project intervention]
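The three ways of measuring impact named in the figure can be written out as simple arithmetic. The sketch below uses invented group means for the impact indicator; the function names and numbers are illustrative assumptions, not data from the presentation.

```python
def before_after(p_t1, p_t2):
    """One-group pre-test post-test design: change among participants only."""
    return p_t2 - p_t1

def single_difference(p_t2, c_t2):
    """Static group comparison: D_t2, participants vs. control at t2."""
    return p_t2 - c_t2

def double_difference(p_t1, p_t2, c_t1, c_t2):
    """Pre-test post-test control group design: D_t2 - D_t1."""
    return (p_t2 - c_t2) - (p_t1 - c_t1)

# Hypothetical group means of the impact indicator.
p_t1, p_t2 = 40.0, 60.0   # participants at first and second observation
c_t1, c_t2 = 35.0, 45.0   # control group at first and second observation

naive = before_after(p_t1, p_t2)                  # 20.0
sd = single_difference(p_t2, c_t2)                # 15.0
dd = double_difference(p_t1, p_t2, c_t1, c_t2)    # 10.0
```

With these numbers the before-after comparison attributes the general trend to the program, and the single difference still includes the gap that already existed at t1. Only the double difference removes both.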
Approaches to Impact Evaluation
• appropriate impact evaluation designs are often rejected as unnecessarily sophisticated or because of ethical concerns
• there are various realistic ways in which quasi-experimental designs can be introduced in an ethically and politically acceptable manner:
  • Matching on Observables
  • Regression Discontinuity
  • Propensity Score Matching (PSM)
  • Pipeline Approach
  • Multiple Comparison Group Design
Possible Approaches in Practice
• Matching on Observables:
  • the characteristics (access to services, economic level, type of housing, etc.) on which the comparison group should match the program group (individuals, households or areas) are identified carefully
  • these are often easily observable or identifiable characteristics
  • unobservable differences have to be kept in mind
  • the control group is built out of those individuals, households or areas that match best
  • quasi-experimental design “pre-test post-test comparison with post-test non-equivalent control group” (3) or at least “static group comparison” (4) is possible
  • single difference (SD) possible
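One simple way to build such a control group is nearest-neighbour matching, sketched below. This is only one of several possible matching rules and is chosen here for illustration. The field names, the data, and the assumption that the characteristics are already scaled to a comparable 0 to 1 range are all hypothetical.

```python
def nearest_match(participant, candidates, keys):
    """Return the candidate closest to the participant on the observed characteristics."""
    def squared_distance(a, b):
        return sum((a[k] - b[k]) ** 2 for k in keys)
    return min(candidates, key=lambda c: squared_distance(participant, c))

def mean(values):
    return sum(values) / len(values)

# Hypothetical households; characteristics are assumed to be pre-scaled to 0..1.
participants = [
    {"economic_level": 0.2, "housing": 0.3, "outcome": 60},
    {"economic_level": 0.5, "housing": 0.6, "outcome": 70},
]
non_participants = [
    {"economic_level": 0.25, "housing": 0.3, "outcome": 50},
    {"economic_level": 0.9, "housing": 0.9, "outcome": 90},
    {"economic_level": 0.5, "housing": 0.55, "outcome": 62},
]

keys = ["economic_level", "housing"]
# Matching with replacement: one best match per participant.
control_group = [nearest_match(p, non_participants, keys) for p in participants]

# Single difference (SD) at the time of the post-test.
sd = mean([p["outcome"] for p in participants]) - mean([c["outcome"] for c in control_group])
```

The match uses only observable characteristics, so unobservable differences between the groups can still bias the single difference.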
Possible Approaches in Practice
• Regression Discontinuity:
  • applicable if a program is assigned using a clear eligibility threshold based on one or more criteria (age, income less than…)
  • the control group is built out of those just above the threshold and hence not eligible for the program
  • those individuals will have characteristics comparable to those of the participants just below the threshold
  • quasi-experimental design “pre-test post-test non-equivalent control group design” (2) possible!
  • double difference (DD) possible!
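The selection of a comparison group around the threshold can be sketched as follows. This is a simplified illustration, not a full regression discontinuity estimator, which would typically fit a regression on each side of the cut-off. The income threshold of 100, the bandwidth of 5 and all records are hypothetical.

```python
def split_near_threshold(records, threshold, bandwidth, score="income"):
    """Keep only units close to the eligibility threshold.

    Units with a score below the threshold are eligible (participants);
    units at or just above it are not eligible and form the comparison group.
    """
    eligible = [r for r in records if threshold - bandwidth <= r[score] < threshold]
    comparison = [r for r in records if threshold <= r[score] <= threshold + bandwidth]
    return eligible, comparison

def mean_change(group):
    """Average change in the impact indicator between pre-test and post-test."""
    return sum(r["post"] - r["pre"] for r in group) / len(group)

# Hypothetical individuals with an income score and two observations.
records = [
    {"income": 60, "pre": 30, "post": 50},    # far below the threshold: excluded
    {"income": 95, "pre": 40, "post": 55},
    {"income": 98, "pre": 42, "post": 57},
    {"income": 101, "pre": 43, "post": 48},
    {"income": 104, "pre": 45, "post": 50},
    {"income": 150, "pre": 70, "post": 72},   # far above the threshold: excluded
]

eligible, comparison = split_near_threshold(records, threshold=100, bandwidth=5)
# Double difference (DD): change among participants minus change in the comparison group.
dd = mean_change(eligible) - mean_change(comparison)
```

A narrower bandwidth makes the two groups more comparable but leaves fewer observations, so the choice is a trade-off.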
Possible Approaches in Practice
• Pipeline Approach:
  • applicable if large programs (housing or community infrastructure, immunization, …) are introduced in phases over several years,
  • when there are no major differences between the characteristics of the families or communities scheduled for each phase, and
  • when there are no selection criteria for participants of the first phase (e.g. the poorest families, communities, …)
  • participants of phases 2 and 3 = control group for participants of phase 1
  • quasi-experimental design “pre-test post-test non-equivalent control group design” (2) possible!
  • double difference (DD) possible!
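A rough sketch of the pipeline logic is given below: communities whose phase has not yet started serve as the control group for those already covered. The community names, start years and indicator values are invented for illustration, and the baseline comparison is only a crude stand-in for a proper check that the phases do not differ systematically.

```python
def pipeline_groups(communities, evaluation_year):
    """Split communities into those already covered and those still waiting."""
    treated = [c for c in communities if c["start_year"] <= evaluation_year]
    waiting = [c for c in communities if c["start_year"] > evaluation_year]
    return treated, waiting

def mean(values):
    return sum(values) / len(values)

# Hypothetical communities in a program rolled out in phases.
communities = [
    {"name": "A", "start_year": 2006, "baseline": 40, "followup": 58},  # phase 1
    {"name": "B", "start_year": 2006, "baseline": 44, "followup": 60},  # phase 1
    {"name": "C", "start_year": 2008, "baseline": 41, "followup": 47},  # phase 2
    {"name": "D", "start_year": 2009, "baseline": 43, "followup": 49},  # phase 3
]

treated, waiting = pipeline_groups(communities, evaluation_year=2007)

# Crude comparability check: difference in mean baseline values between the groups.
baseline_gap = mean([c["baseline"] for c in treated]) - mean([c["baseline"] for c in waiting])

# Double difference (DD): change among phase 1 minus change among later phases.
dd = (mean([c["followup"] - c["baseline"] for c in treated])
      - mean([c["followup"] - c["baseline"] for c in waiting]))
```

If the first phase had been reserved for the poorest communities, the baseline gap would be large and the later phases would no longer be a credible control group.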
Important Remarks
• The international discussion about RIE refers to just one small aspect of evaluation: the causal attribution of impact
• Impact is measured at the level of target groups/participants; because target groups are typically large, quantitative methods are necessary for this evaluation step (representativeness vs. profundity)
• other evaluation methods are not condemned!
• causal attribution is necessary but not sufficient: the ‘black box’ remains, i.e. why a program has an impact (or does not)
• comprehensive, meaningful and reliable impact evaluations require mixed methods, i.e. the use of both quantitative and qualitative methods
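Reference: Caspari, Alexandra/Barbu, Ragnhild (2008): Wirkungsevaluierungen. Zum Stand der internationalen Diskussion und dessen Relevanz für die Evaluierung der deutschen Entwicklungszusammenarbeit [Impact evaluations: on the state of the international discussion and its relevance for the evaluation of German development cooperation] • http://www.fh-frankfurt.de/de/.media/~caspari/2008bmzwpwirkungsevaluation.pdf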