What could go wrong? Deon Filmer, Development Research Group, The World Bank. Evidence-Based Decision-Making in Education Workshop, Africa Program for Education Impact Evaluation (APEIE), Accra, Ghana, May 10–14, 2010
What could go wrong? • A lot! • Threats to internal and external validity • Randomization is undermined • Hawthorne effect • Spillover effects • Attrition • Non-compliance • Pilot phase “startup” problems • … Let’s just focus on a few areas
Just a reminder • Internal validity • Extent to which the evaluation is indeed estimating the parameter of interest • E.g. Impact of a scholarship program on attendance • External validity • Extent to which the evaluation provides relevant information about the likely effectiveness of a program in a different setting, or if implemented at a different scale
Internal validity • Internal validity • Extent to which the evaluation is indeed estimating the parameter of interest. Are you estimating what you think you are? • Some issues • Attrition • Spillover effects • Partial compliance and sample selection bias
Internal validity: Attrition • Attrition is a generic problem in data analysis • Some people drop out of the analysis or the sample • Is this a problem in impact evaluation? Yes, if the attrition is related to the intervention, or to its likely impact.
Attrition bias: An Example • Study: Impact of a scholarship program on learning achievement • Baseline = students in grade 5 • Intervention = offer of a scholarship for grade 6 • Follow-up = test scores of students at end of grade 6 • Impact evaluation question: • Did the scholarship program improve test scores? • Can you see a problem with this setup? • How might you go about addressing any potential bias?
Addressing attrition bias • Check that attrition rates are not different between treatment and control groups • Also check that attrition is not correlated with observables • Try to bound the extent of the bias (see the sketch below) • For example: • Recalculate the impact assuming everyone who dropped out of the treatment group got the lowest score that anyone got • Recalculate the impact assuming everyone who dropped out of the control group got the highest score that anyone got
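A minimal sketch, in Python, of the checks and bounding exercise above. The DataFrame df and its columns (treat, followed_up, score) are hypothetical placeholders for the evaluation data, not the actual study files.

```python
import pandas as pd

def attrition_rates(df):
    """Compare follow-up rates; a large gap flags differential attrition."""
    return (df.loc[df.treat == 1, "followed_up"].mean(),
            df.loc[df.treat == 0, "followed_up"].mean())

def attrition_bounds(df):
    """Naive difference in means plus worst-case bounds on the impact."""
    observed = df[df.followed_up == 1]
    naive = (observed.loc[observed.treat == 1, "score"].mean()
             - observed.loc[observed.treat == 0, "score"].mean())

    lo, hi = df["score"].min(), df["score"].max()

    # Lower bound: impute the worst observed score to treatment attriters
    # and the best observed score to control attriters.
    pess = df.copy()
    pess.loc[(pess.treat == 1) & (pess.followed_up == 0), "score"] = lo
    pess.loc[(pess.treat == 0) & (pess.followed_up == 0), "score"] = hi
    lower = (pess.loc[pess.treat == 1, "score"].mean()
             - pess.loc[pess.treat == 0, "score"].mean())

    # Upper bound: the reverse imputation.
    opt = df.copy()
    opt.loc[(opt.treat == 1) & (opt.followed_up == 0), "score"] = hi
    opt.loc[(opt.treat == 0) & (opt.followed_up == 0), "score"] = lo
    upper = (opt.loc[opt.treat == 1, "score"].mean()
             - opt.loc[opt.treat == 0, "score"].mean())

    return naive, lower, upper
```

If the bounds still bracket a meaningful effect, attrition is unlikely to overturn the conclusion; if they straddle zero, the result is fragile to attrition.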
Internal validity: Spillover effects/externalities • Spillover effects or externalities occur when people other than the target population are affected by the intervention • How is this a problem?
Spillover effects/externalities: Example • A teacher training program affects all teachers in a school, not just those who were randomly selected to go through the training
Addressing spillover effects/externalities • The main way to address the problem is to • Randomize in such a way as to encompass the externalities • E.g. randomize the training to all teachers in a school (see the sketch below)
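A minimal sketch of cluster-level assignment, assuming a hypothetical DataFrame teachers with a school_id column; randomizing whole schools keeps within-school spillovers inside the treatment arm.

```python
import numpy as np

def assign_by_school(teachers, seed=0):
    """Randomly assign half of the schools (not individual teachers) to treatment."""
    rng = np.random.default_rng(seed)
    schools = teachers["school_id"].drop_duplicates().to_numpy()
    rng.shuffle(schools)
    treated = set(schools[: len(schools) // 2])
    out = teachers.copy()
    out["treat"] = out["school_id"].isin(treated).astype(int)
    return out
```

With assignment at the school level, the analysis should also cluster standard errors at the school level.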
Internal validity: Partial compliance and sample selection bias • Occurs when the treatment and control groups are not comparable, either • Because the randomization “didn’t work” • Or because the study population’s behavior “undermined” the randomization
Randomization “didn’t work” • The validity of randomization as a way to ensure comparable characteristics depends on sufficiently large samples • In any single application the samples could differ • That is, average characteristics might differ between treatment and control groups • How might you address this in the analysis? (One standard check is sketched below)
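A minimal sketch of a baseline balance check, assuming a hypothetical DataFrame df with a 0/1 treat column and numeric baseline covariates; a common response to any imbalance is to control for the imbalanced covariates in the impact regression.

```python
import pandas as pd
from scipy import stats

def balance_table(df, covariates):
    """Compare baseline means between treatment and control for each covariate."""
    rows = []
    for var in covariates:
        t = df.loc[df.treat == 1, var].dropna()
        c = df.loc[df.treat == 0, var].dropna()
        _, p = stats.ttest_ind(t, c, equal_var=False)  # Welch t-test
        rows.append({"variable": var,
                     "treat_mean": t.mean(),
                     "control_mean": c.mean(),
                     "p_value": p})
    return pd.DataFrame(rows)
```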
Behavior “undermined” the randomization • For example, • Example 1: Students who were offered a scholarship drop out after 1 year (and could therefore be considered “control” children) • Example 2: Students from non-recipient schools move to schools that were randomly chosen to receive a grant • How might you address this in the analysis?
Addressing partial compliance and sample selection bias • Use the “Intent To Treat” (ITT) approach • Frame the evaluation in terms of the original design • For example • Example 1: Study the impact of being offered a scholarship on outcomes • Example 2: Study the impact of the grants program in terms of the school students were originally enrolled in
Addressing partial compliance and sample selection bias • The “Intent To Treat” approach is powerful, but does raise issues of its own • It often captures what is in the control of those implementing the intervention • E.g. offering a scholarship • But it may not reflect what is likely to happen if the program goes to scale • E.g. if all schools were in the grants program then students wouldn’t switch • One can use the results to estimate the average effect of “Treatment on the Treated” (TOT), but this requires further modeling (a simple version is sketched below)
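A minimal sketch of the ITT estimate and the simplest “further modeling” step: the Wald/instrumental-variables rescaling of the ITT by the difference in take-up rates. It assumes a hypothetical pandas DataFrame df with columns assigned (0/1 random offer), took_up (0/1 actually received the program) and outcome; the column names are illustrative.

```python
def itt_and_tot(df):
    """Intent-to-treat effect and its Wald/IV rescaling."""
    y1 = df.loc[df.assigned == 1, "outcome"].mean()
    y0 = df.loc[df.assigned == 0, "outcome"].mean()
    itt = y1 - y0                      # effect of the *offer*

    d1 = df.loc[df.assigned == 1, "took_up"].mean()
    d0 = df.loc[df.assigned == 0, "took_up"].mean()
    take_up_gap = d1 - d0              # "first stage": difference in take-up

    tot = itt / take_up_gap            # Wald / IV estimate
    return itt, tot
```

With two-sided non-compliance this ratio is better read as a local average treatment effect for compliers rather than the effect on all of the treated; recovering the TOT in that case requires further assumptions.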
External validity • Extent to which the evaluation provides relevant information about the likely effectiveness of a program in a different setting, or if implemented at a different scale. Will the findings be meaningful for policy?
External validity: Behavioral responses • The behavior of the study population may be affected by the study itself (as opposed to the intervention) • Treatment group behavior changes: Hawthorne effect • People in the treatment group are being closely observed and studied, so they alter their behavior (e.g. teachers being observed) • Control group behavior changes: John Henry effect • People in the control group view themselves as being in competition with the treatment group and so change their behavior (e.g. students denied a scholarship)
External validity: Generalizability • Is the program, as evaluated, truly possible to replicate at scale? • Did the intervention require a lot of careful attention in order to make it work? • Was the evaluation carried out on a truly representative population? • Was it restricted to a province that was “ready” for the intervention, and therefore not like other places the program might be carried out in?
Conclusion • Many “threats” to both internal and external validity • Some may come from pressure • From the study population • From higher levels of government • From donors • The “technocratic” approach becomes something of an “art” in needing to balance these various goals