150 likes | 274 Views
Pre-analysis plans. Module 8.3. Recap on statistics. If we find a result is significant at the 5% level, what does this mean? there is a 5% or less probability it is the result of chance
E N D
Pre-analysis plans Module 8.3
Recap on statistics • If we find a result is significant at the 5% level, what does this mean? • there is a 5% or less probability it is the result of chance • If we test 10 independent hypotheses there is a 40% chance we will fail to reject the null at the 5% level at least once • i.e. 40% chance find one hypothesis is significant at the 5% level
Publication bias: what is it? • Evaluations with positive or negative significant impacts more likely to get published • If enough evaluations are run on a given type of program, some will give positive result by chance • Those with a vested interest may even deliberately run many studies and publicize only some of the results • Published results will suggest the program is effective even if it isn’t
Study registers as a solution • All studies are registered before the results are known • Shows how many studies examined a type of program • If 20 studies are started on a given question and only one positive study is published, we may worry the others found no result • Incentives for registration • Some medical journals require registration before study started for publication • New registries for RCTs in social sciences • E.g by the American Economic Association
Coverage of registries • Q: what are the advantages of having a registry that only covers RCTs? • Q: Is publication bias only a problem for RCTs? • If not why not have a registry for non RCTs? • Q: Should registration only be allowed before the start of a study?
Data mining: what is it? • Looking at the data many different ways, trying to find the result you want • If test impact of program on many different outcomes, some will show positive (or negative) impact by chance • If test impact of program on many different subgroups, some will show positive (or negative) impact by chance • We may be falsely accused of data mining • E.g. we test one subgroup and report the results but readers think maybe we tested many and only reported the one that was significant
Preanalysis plans as a solution • Write down in advance how the data will be analyzed • What outcomes are of primary interest • What subgroups will be examined • Register the plan with some external organization • When presenting results, show all those covered in the PAP • highlight any deviations from or additions to the PAP
When is a PAP most useful? • When a study has a large number of outcomes with no obvious hierarchy of which are the most important • When researchers know they are interested in differential impact on different subgroups (heterogeneous treatment effects) • When researchers are concerned others will push them to find positive impacts • When want to adjust statistical tests for multiple hypothesis tests
Incentives for immunization (no PAP) • Program to increase immunization in rural India with two arms • Predictable and reliable immunization camps • Camps plus 1kg lentils for each shot and set of plates on completion of full immunization • One main outcome (immunization), more than one indicator • Ever received any immunization • % with full immunization • Showing two indicators illustrated why the program worked • Regular camps increased children with at least one shot • Incentive increased % with full immunization • Suggests unwillingness to immunize not main barrier to full immunization, instead incentive helped maintain persistence
Results: number of immunizations Source: Banerjee et al 2010
Results fully immunized Source: Banerjee et al 2010
Community driven development (with PAP) • Evaluation of the impact of a CDD program including: • Quality and quantity of public goods • Social capital • Trust • Participation • Each outcome area had many indicators, eg many public goods • >300outcome indicators in total • Several important subgroups to examine: • Women and youth primary targets • Program implemented in two very different regions
When should a PAP be written? • Before the baseline? • Prevents us from learning about what questions have low response rates or inconsistent answers • Before the program starts? • prevents researchers taking advantage of random shocks which happen to mainly impact treatment or comparison • Also prevents researchers thinking of new hypotheses, e.g. unthought-of of negative consequences • Before looking at any data? • Can be useful to look at comparison data to determine appropriate control variables • drop variables where little chance of improvement (e.g. 95% of control already do)
What should be covered in a PAP? • Main outcome measures • Which outcome measures are primary and which secondary • The precise composition • Subgroups to be analyzed • Direction of expected impact if we want to use a one-sided test • Primary specification to be used • What control variables to include • How is the outcome variable specified, in logs, changes etc
Disadvantages of a PAP • Any analysis that is not included in the PAP has less credibility • Only do a PAP if you have the time to think it through carefully • Sometimes patterns in the data tell consistent stories we never thought of • We might want the flexibility to pursue these • With complex evaluations it can be hard to think through all outcome combinations and how analysis should proceed with each • We may do different secondary analysis if the impact is positive vs. negative • One option is to do PAP in stages, look at some data, then write another PAP