
Designing an impact evaluation


lucien


Presentation Transcript


  1. Designing an impact evaluation: Randomization, statistical power, and some more fun…

  2. Designing a (simple) RCT in a couple of steps • You want to evaluate the impact of something (a program, a technology, a piece of information, etc.) on an outcome. Example: Evaluate the impact of free school meals on pupils' schooling outcomes. • You decide to do it through a randomized controlled trial. • Why? • The questions that follow: • Type of randomization – What is most appropriate? • Unit of randomization – What do we need to think about? • Sample size > These are the things we will talk about now.

  3. I. Where to start • You have a HYPOTHESIS. Example: Free meals => increased school attendance => increased amount of schooling => improved test scores. Or could it go the other way? • To test your hypothesis, you want to estimate the impact of a variable T on an outcome Y for an individual i. In a simple regression framework: Y_i = α + βT_i + ε_i • How could you do this? • Compare schools with free meals to schools with no free meals? • Compare test scores before the free meal program was implemented to test scores after?

  4. II. Randomization basics • You decided to use a randomized design. Why? • Randomization removes the selection bias > Trick question: Does the sample need to be randomly sampled from the entire population? • Randomization solves the causal inference issue by providing a counterfactual = comparison group. While we can't observe Y_i^T and Y_i^C at the same time, we can measure the average treatment effect by computing the difference in mean outcome between two a priori comparable groups. We measure: ATE = E[Y^T] - E[Y^C]
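
To make the selection-bias point concrete, here is a minimal simulation sketch. Everything in it is a hypothetical assumption for illustration (the "motivation" variable, the score scale, the true effect of 5 points); it is not data from the presentation. It compares a naive difference in means under self-selection with the same difference under random assignment.

```python
import random
import statistics

def simulate(n=10_000, true_effect=5.0, seed=0):
    """Compare the difference in means under self-selection vs. randomization."""
    rng = random.Random(seed)
    # Hypothetical unobserved trait that raises test scores on its own.
    motivation = [rng.gauss(0, 1) for _ in range(n)]
    noise = [rng.gauss(0, 1) for _ in range(n)]

    def outcome(i, treated):
        # Y_i = alpha + (effect of motivation) + beta * T_i + epsilon_i
        return 50 + 10 * motivation[i] + true_effect * treated + noise[i]

    # Self-selection: only the more motivated pupils take up the program.
    selected = [m > 0 for m in motivation]
    # Randomization: a coin flip decides who is treated.
    randomized = [rng.random() < 0.5 for _ in range(n)]

    def diff_in_means(assign):
        t = [outcome(i, 1) for i in range(n) if assign[i]]
        c = [outcome(i, 0) for i in range(n) if not assign[i]]
        return statistics.fmean(t) - statistics.fmean(c)

    return diff_in_means(selected), diff_in_means(randomized)

naive, rct = simulate()
print(f"self-selected: {naive:.2f}  randomized: {rct:.2f}  truth: 5.00")
```

Under self-selection the comparison mixes the program effect with the pre-existing difference in motivation; under randomization the two groups are comparable on average, so the difference in means estimates the ATE.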

  5. II. Randomization basics • What to think of when deciding on your design? • Types of randomization / unit of randomization • Block design • Phase-in • Encouragement design • Stratification? The decision should come from (1) your hypothesis, (2) your partner's implementation plans, (3) the type of intervention! Example: What would you do? • Next step: How many units? = SAMPLE SIZE. Intuition --> Why do we need many observations?
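
As one possible illustration of these design choices, the sketch below randomizes at the school level and stratifies by region, so that each stratum contributes the same share of treated units. The school names, the two regions, and the function name `stratified_assignment` are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_assignment(units, stratum_of, share_treated=0.5, seed=0):
    """Randomly assign units to T or C separately within each stratum."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for unit in units:
        by_stratum[stratum_of[unit]].append(unit)

    assignment = {}
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum][:]
        rng.shuffle(members)  # random order within the stratum
        n_treated = round(len(members) * share_treated)
        for rank, unit in enumerate(members):
            assignment[unit] = "T" if rank < n_treated else "C"
    return assignment

# Hypothetical sample: 20 schools, 10 in each of two regions.
schools = [f"school_{i:02d}" for i in range(20)]
region = {s: ("north" if i < 10 else "south") for i, s in enumerate(schools)}
assignment = stratified_assignment(schools, region)

for name in ("north", "south"):
    treated = sum(1 for s in schools if region[s] == name and assignment[s] == "T")
    print(name, "treated schools:", treated)
```

Here the school is the unit of randomization; randomizing pupils within schools instead would only require passing pupils as `units`.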

  6. Remember, we're interested in Mean(T) - Mean(C). We measure scores in 1 treatment school and 1 control school > Can I say anything?

  7. Now 50 schools:

  8. Now 500 schools:
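
The intuition of the last three slides can be reproduced with a small simulation. This is a sketch under assumed numbers (a true effect of 2 points, a standard deviation of 10 across schools), not the data behind the original figures: it repeats the experiment many times and reports how much the estimated Mean(T) - Mean(C) varies from one repetition to the next.

```python
import random
import statistics

def spread_of_estimate(n_per_arm, true_effect=2.0, sd=10.0, reps=500, seed=0):
    """Standard deviation of Mean(T) - Mean(C) across repeated experiments."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        treated = [rng.gauss(50 + true_effect, sd) for _ in range(n_per_arm)]
        control = [rng.gauss(50, sd) for _ in range(n_per_arm)]
        estimates.append(statistics.fmean(treated) - statistics.fmean(control))
    return statistics.stdev(estimates)

s1 = spread_of_estimate(1)
s50 = spread_of_estimate(50)
s500 = spread_of_estimate(500)
print(f"1 school per arm:    spread = {s1:.2f}")
print(f"50 schools per arm:  spread = {s50:.2f}")
print(f"500 schools per arm: spread = {s500:.2f}")
```

With one school per arm the estimate is dominated by noise; the spread shrinks roughly with the square root of the number of schools, which is why many observations are needed to detect a small effect.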

  9. III. Sample size • But how to pick the optimal size? -> It all depends on the minimum effect size you'd want to be able to detect. Note: Standardized effect sizes. • POWER CALCULATIONS link minimum effect size to design. • They depend on several factors: • The effect size you want • Your randomization choices • The baseline characteristics of your sample • The statistical power you want • The significance you want for your estimates We'll look into these factors one by one, starting from the end…

  10. III. Power calculations (1) Hypothesis testing • When trying to test a hypothesis, one actually tests the null hypothesis H0 against the alternative hypothesis Ha, and tries to reject the null. H0: Effect size = 0 Ha: Effect size ≠ 0 • Two types of error are to be feared: Type I (rejecting H0 when it is true) and Type II (failing to reject H0 when it is false).

  11. III. Power calculations (1) Significance • SIGNIFICANCE = Probability that you'd conclude that T has an effect when in fact it doesn't. It tells you how confident you can be in your answer. (Denoted α) • Classical values: 1, 5, 10% • Hypothesis testing basically comes down to testing equality of means between T and C using a t-test. For the effect to be significant, the t-statistic obtained must be greater than the critical value for the chosen significance level. Or again: the estimate of β divided by its standard error (in absolute value) must be greater than or equal to t_α = 1.96, the critical value for a two-sided test at the 5% level.
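
A minimal sketch of this test, using only the Python standard library. The simulated scores are hypothetical, and the critical value uses the large-sample normal approximation (which gives the 1.96 quoted on the slide) rather than an exact t distribution.

```python
import math
import random
import statistics
from statistics import NormalDist

def t_statistic(treated, control):
    """Difference in means divided by its (unequal-variance) standard error."""
    diff = statistics.fmean(treated) - statistics.fmean(control)
    se = math.sqrt(statistics.variance(treated) / len(treated)
                   + statistics.variance(control) / len(control))
    return diff / se

def critical_value(alpha=0.05):
    """Two-sided critical value under the normal approximation."""
    return NormalDist().inv_cdf(1 - alpha / 2)

# Hypothetical test scores: 500 units per group, true effect of 2 points.
rng = random.Random(1)
control = [rng.gauss(50, 10) for _ in range(500)]
treated = [rng.gauss(52, 10) for _ in range(500)]

t = t_statistic(treated, control)
print(f"t = {t:.2f}, critical value = {critical_value():.2f}")
print("significant at 5%:", abs(t) >= critical_value())
```

Choosing a stricter significance level (for example `critical_value(0.01)`) raises the threshold the t-statistic has to clear.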

  12. III. Power calculations (2) Power • POWER = Probability that, if an effect truly exists, you will detect it for a given sample size. (Denoted κ) • Classical values: 80, 90% • To achieve a power κ, it must be that: • Or graphically… • In short: To have a high chance of detecting an effect, one needs enough power, which depends on the standard error of the estimate of β.
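
The condition shown on the slide did not survive in this transcript. In the usual textbook framework, which is an assumption here rather than something the transcript confirms, power κ is reached when the true effect satisfies β ≥ (t_{1-κ} + t_α) · SE(β̂), where t_{1-κ} is about 0.84 for 80% power. The sketch below computes the power of a two-sided test under the normal approximation; the function name `power` and its arguments are illustrative.

```python
from statistics import NormalDist

def power(effect, se, alpha=0.05):
    """Probability of rejecting H0 when the true effect is `effect`.

    Normal approximation, two-sided test at level `alpha`;
    `se` is the standard error of the estimate of beta.
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)  # e.g. 1.96 for alpha = 5%
    shift = effect / se              # true effect in standard-error units
    # Rejection in the upper tail plus the (usually tiny) lower tail.
    return (1 - z.cdf(crit - shift)) + z.cdf(-crit - shift)

for se in (2.0, 1.0, 0.5):
    print(f"effect = 2, SE = {se}: power = {power(2.0, se):.2f}")
```

For a fixed effect, halving the standard error raises power sharply, which is the sense in which power "depends on the standard error of the estimate".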

  13. III. Power calculations (3) Standard error of the estimate • Intuition = the higher the standard error, the less precise the estimate, the trickier it is to identify an effect, the higher the need for power! • Demonstration: How does the spread of a variable affect the precision of a mean comparison test? • We saw that power depends on the SE of the estimate of β. But what does this standard error depend on? • Standard deviation of the error (how heterogeneous the sample is) • The proportion of the population treated (randomization choices) • The sample size

  14. III. Power calculations (4) Calculations • We now have all the ingredients of the equation. The minimum detectable effect (MDE) is: • As you can see: • The higher the heterogeneity of the sample, the higher the MDE, • The lower N, the higher the MDE, • The higher the power you require, the higher the MDE for a given design • Power calculations in practice correspond to playing with all these ingredients to find the optimal design to satisfy your MDE. • Optimal sample size? • Optimal portion treated?
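
The MDE formula itself is missing from this transcript. A commonly used expression for individual-level randomization, assumed here and not confirmed by the source, is MDE = (t_{1-κ} + t_α) · sqrt(1 / (P(1-P))) · sqrt(σ² / N), with P the proportion treated, σ the standard deviation of the error and N the sample size. The sketch below implements it with the normal approximation; `mde` and `required_n` are hypothetical helper names.

```python
import math
from statistics import NormalDist

def mde(sigma, n, p=0.5, power=0.80, alpha=0.05):
    """Minimum detectable effect for a two-sided test (normal approximation)."""
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return multiplier * math.sqrt(1 / (p * (1 - p))) * sigma / math.sqrt(n)

def required_n(target_mde, sigma, p=0.5, power=0.80, alpha=0.05):
    """Smallest total sample size whose MDE is at most `target_mde`."""
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return math.ceil((multiplier * sigma / target_mde) ** 2 / (p * (1 - p)))

# Standardized outcome (sigma = 1), half of the sample treated.
print(f"MDE with N = 400:         {mde(1.0, 400):.3f}")
print(f"MDE with N = 1600:        {mde(1.0, 1600):.3f}")
print(f"MDE with 20% treated:     {mde(1.0, 400, p=0.2):.3f}")
print(f"MDE with 90% power:       {mde(1.0, 400, power=0.90):.3f}")
print(f"N needed for MDE of 0.2:  {required_n(0.2, 1.0)}")
```

Playing with `sigma`, `n`, `p` and `power` in this way is the "playing with the ingredients" the slide describes: under this formula, an even split between treatment and control (p = 0.5) minimizes the MDE for a given N.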

  15. III. Power calculations (5) More complicated frameworks • Several treatments? • What happens when there is more than one treatment? • It all depends on what you want to compare! • Stratification? • Reduces the standard deviation • Clustered (block) design? • When using clusters, the outcomes of the observations within a cluster can be correlated. What does this mean? • Intra-cluster correlation rho (ρ), the portion of the total variance explained by between-cluster variance, implies an increase in the overall variance of the estimate. • Impact on MDE? • In short: the higher rho, the higher the MDE (the increase can be large)
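
To give a sense of how large the increase can be, the sketch below uses the standard design-effect approximation for equal-sized clusters, 1 + (m - 1)ρ, where m is the number of observations per cluster. This formula is an assumption brought in for illustration; the slide does not state which expression it uses. The MDE of an individually randomized design is inflated by the square root of the design effect.

```python
import math

def design_effect(cluster_size, rho):
    """Variance inflation from clustering, assuming equal-sized clusters."""
    return 1 + (cluster_size - 1) * rho

def clustered_mde(individual_mde, cluster_size, rho):
    """MDE under clustered randomization, given the individual-level MDE."""
    return individual_mde * math.sqrt(design_effect(cluster_size, rho))

# Hypothetical design: 50 pupils per school, individual-level MDE of 0.10.
for rho in (0.0, 0.05, 0.1, 0.3):
    print(f"rho = {rho:<4}  design effect = {design_effect(50, rho):5.2f}  "
          f"MDE = {clustered_mde(0.10, 50, rho):.3f}")
```

Even a modest ρ matters when clusters are large: adding pupils within the same schools buys little precision compared with adding schools.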

  16. Summary • When thinking of designing an experiment: • What is your hypothesis? • How many treatment groups? • What unit of randomization? • What is the minimum effect size of interest? • What optimal sample size considering power/budget? => Power calculations!
