
Sampling and Power

Slides by Jishnu Das.


Presentation Transcript


  1. Sampling and Power
     • Slides by Jishnu Das

  2. Sample Selection in Evaluation
     • Population-based representative surveys:
       • Sample is representative of the whole population
       • Good for learning about the population
       • Not always the most efficient design for impact evaluation
     • Sampling for impact evaluation:
       • Balance between treatment and control groups
       • Power: statistical inference for groups of interest
       • Concentrate the sample strategically
     • Survey budget as a major consideration:
       • In practice, sample size is often set by the budget
       • Concentrate the sample on key populations to increase power

  3. Purposive Sampling
     • Risk: we systematically bias our sample, so results do not generalize to the rest of the population or to other sub-groups
     • Trade-off between power within the population of interest and representativeness of the whole population
     • Results are internally valid, but not generalizable

  4. Type I and Type II errors
     • Type I error: reject the null hypothesis when it is true
       • Significance level: probability of rejecting the null when it is true (probability of a Type I error)
     • Type II error: accept (fail to reject) the null hypothesis when it is false
       • Power: probability of rejecting the null when a specific alternative hypothesis is true (1 - probability of a Type II error)
     • We want to minimize both types of errors
       • Increase sample size
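
As a rough illustration of the two error rates on this slide, the sketch below simulates a two-arm comparison of means many times and counts how often a two-sided z-test rejects the null. The function name `rejection_rate`, the normally distributed outcomes with a known standard deviation, and every numeric setting are assumptions made for the example; none of them come from the slides.

```python
import random
from statistics import NormalDist


def rejection_rate(true_diff, n_per_arm, sd=1.0, alpha=0.05, n_sims=2000, seed=0):
    """Share of simulated studies in which a two-sided z-test rejects H0: no difference."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    se = sd * (2 / n_per_arm) ** 0.5               # SE of the difference in means (sd assumed known)
    rejections = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_arm)]
        treated = [rng.gauss(true_diff, sd) for _ in range(n_per_arm)]
        diff = sum(treated) / n_per_arm - sum(control) / n_per_arm
        if abs(diff / se) > z_crit:
            rejections += 1
    return rejections / n_sims


# No true effect: the rejection rate estimates the Type I error rate (close to alpha).
type_i_rate = rejection_rate(true_diff=0.0, n_per_arm=50)
# True effect of 0.5 SD: the rejection rate estimates power (1 - Type II error rate).
power_small_n = rejection_rate(true_diff=0.5, n_per_arm=50)
power_large_n = rejection_rate(true_diff=0.5, n_per_arm=200)
print(type_i_rate, power_small_n, power_large_n)
```

Holding the significance level fixed, the Type I rate stays near alpha while the larger sample pushes power toward one, which is the sense in which a bigger sample reduces Type II errors.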

  5. Survey - Sampling
     • Population: all cases of interest
     • Sampling frame: list of all potential cases
     • Sample: cases selected for analysis
     • Sampling method: technique for selecting cases from the sampling frame
     • Sampling fraction: proportion of cases from the population selected for the sample (n/N)

  6. Sampling Frame
     • Simple random sampling: almost never practical unless the universe of interest is geographically concentrated
     • Cluster sampling: randomly choose clusters, then randomly choose units within each cluster. The effective sample size is less than the actual number of observations. This is the design, or cluster, effect
     • The design effect implies that, for a sample of a given size, the variance increases by a factor of [1 + (E - 1)ρ], where E is the number of elements in each cluster and ρ is the intra-class correlation, a measure of how much the observations within a cluster resemble each other
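
The design effect formula on this slide can be sketched in a few lines. The function names and the example values (50 clusters of 20 units, intra-class correlation of 0.05) are hypothetical and chosen only to show the arithmetic.

```python
def design_effect(cluster_size, icc):
    """Variance inflation factor 1 + (E - 1) * rho for clusters of equal size E."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent observations that would give the same precision."""
    return n_clusters * cluster_size / design_effect(cluster_size, icc)


# Hypothetical survey: 50 clusters of 20 units each, intra-class correlation 0.05.
deff = design_effect(20, 0.05)
n_eff = effective_sample_size(50, 20, 0.05)
print(f"design effect = {deff:.2f}, effective sample size = {n_eff:.0f} of 1000")
```

With an intra-class correlation of zero the design effect is 1 and nothing is lost; with a correlation of one, each cluster counts as a single observation.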

  7. Using Power Calculations to Estimate Sample Sizes
     • What sample size is needed to detect a difference in means at a given level of statistical significance?
     • Need an idea of what difference is a plausible expectation for the intervention
     • Fixing the confidence level, we observe two things when increasing the sample size:
       • the rejection region gets larger, and
       • the power increases
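
A minimal sketch of such a calculation, assuming a two-sided test, equal-sized arms, independent observations, and a normal approximation with a common standard deviation. The function name `n_per_arm` and the default values (5% significance, 80% power) are choices made for this example, not values given in the slides.

```python
import math
from statistics import NormalDist


def n_per_arm(effect, sd=1.0, alpha=0.05, power=0.80):
    """Units per arm needed to detect a difference in means of size `effect`.

    Uses n = 2 * (sd * (z_{1-alpha/2} + z_{power}) / effect)^2, rounded up.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value of the two-sided test
    z_power = z.inv_cdf(power)           # quantile that delivers the target power
    return math.ceil(2 * (sd * (z_alpha + z_power) / effect) ** 2)


# Required sample per arm for a few plausible effect sizes (in SD units).
for effect in (0.1, 0.2, 0.5):
    print(effect, n_per_arm(effect))
```

Because the effect size enters squared in the denominator, halving the plausible effect roughly quadruples the required sample, which is why a realistic expectation for the intervention matters so much.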

  8. In Practice - I
     • Many sample patterns are possible, especially when one can vary the number of clusters and the cluster sizes
     • May use simulations in Stata or a similar package; they easily account for complicated designs
     • Panel and difference-in-differences calculations need to be based on the ability to find significant changes, not differences in levels. This requires an estimate of the correlation over rounds
     • The sample needed to find a difference between alternative treatments is different from that needed to compare a treatment to control
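
To see why the correlation over rounds matters, the sketch below sizes a two-round panel on the change in the outcome. It assumes equal variance in both rounds, so that the variance of one unit's change is 2σ²(1 - r) with r the correlation between rounds, and it ignores clustering. The function name and all numbers are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_arm_for_change(effect, sd, round_corr, alpha=0.05, power=0.80):
    """Units per arm to detect a difference in mean changes between two arms."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # Variance of (after - before) for one unit, assuming equal variance in both rounds.
    var_change = 2 * sd ** 2 * (1 - round_corr)
    return math.ceil(2 * var_change * (z_sum / effect) ** 2)


# Same 0.2 SD effect, different assumed correlations between survey rounds.
for r in (0.0, 0.5, 0.75):
    print(r, n_per_arm_for_change(effect=0.2, sd=1.0, round_corr=r))
```

Under these assumptions a correlation of 0.5 gives the same sample as a single-round comparison of levels; higher correlations make the panel more efficient and lower ones make it less efficient.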

  9. In Practice - II
     • The number of clusters improves precision and is especially important in randomized designs
     • It is not strictly necessary that treatment and control are equal in size or in number of clusters, but the analysis is complicated if the probability of selection differs
     • Importance of transparency in the randomization process
     • Many medical journals require registering trials prior to analysis (to avoid reporting only 'favorable' results)

  10. An Example
      • Does information improve child performance in schools? (Pakistan)
      • Randomized design
      • Interested in villages where there are private schooling options
      • Which villages should we work in?
        • Stratification: North, Central, South
        • Random sample: villages chosen randomly from the list of all villages with a private school
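
The selection step described here can be sketched as a stratified random draw from a sampling frame. The village identifiers, the frame sizes, the choice of 10 villages per stratum, and the even split into treatment and control within each stratum are all invented for illustration; the slides do not report these details.

```python
import random


def stratified_sample(frame, per_stratum, seed=0):
    """Draw `per_stratum` villages at random, without replacement, from each stratum."""
    rng = random.Random(seed)
    return {stratum: rng.sample(villages, per_stratum)
            for stratum, villages in frame.items()}


def assign_treatment(sample, seed=1):
    """Randomly assign half of the sampled villages in each stratum to treatment."""
    rng = random.Random(seed)
    assignment = {}
    for villages in sample.values():
        shuffled = villages[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for i, village in enumerate(shuffled):
            assignment[village] = "treatment" if i < half else "control"
    return assignment


# Hypothetical frame: villages with a private school, listed by region.
frame = {
    "North": [f"N{i:03d}" for i in range(120)],
    "Central": [f"C{i:03d}" for i in range(200)],
    "South": [f"S{i:03d}" for i in range(80)],
}
sample = stratified_sample(frame, per_stratum=10)
assignment = assign_treatment(sample)
print({s: v[:3] for s, v in sample.items()})
```

Fixing the seeds makes the draw reproducible, which supports a transparent randomization process.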

  11. In Practice: An Example
      • How many villages should we choose?
      • Depends on:
        • How many children there are in every village
        • How big we think the treatment effect will be
        • What the overall variability in the outcome variable will be
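
One way to explore these three inputs is a small simulation of a village-randomized design. The sketch below is not the authors' simulation: it assumes normally distributed village and child components, analyzes village means with a normal-approximation test (a t critical value would be more appropriate with few villages), and draws each village's average child-level noise directly as a shortcut. All parameter values are hypothetical.

```python
import random
from statistics import NormalDist


def _mean_and_var(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def simulated_power(villages_per_arm, children_per_village, effect,
                    sd=1.0, icc=0.2, alpha=0.05, n_sims=1000, seed=0):
    """Share of simulated experiments in which the treatment effect is significant."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    sd_between = sd * icc ** 0.5                                    # village-level component
    sd_mean_within = sd * ((1 - icc) / children_per_village) ** 0.5  # noise in a village mean

    def village_means(shift):
        return [shift + rng.gauss(0, sd_between) + rng.gauss(0, sd_mean_within)
                for _ in range(villages_per_arm)]

    rejections = 0
    for _ in range(n_sims):
        m_c, v_c = _mean_and_var(village_means(0.0))
        m_t, v_t = _mean_and_var(village_means(effect))
        se = ((v_c + v_t) / villages_per_arm) ** 0.5
        if abs(m_t - m_c) / se > z_crit:
            rejections += 1
    return rejections / n_sims


# Hypothetical inputs: 20 children per village, 0.3 SD effect, intra-class correlation 0.2.
power_20 = simulated_power(villages_per_arm=20, children_per_village=20, effect=0.3)
power_80 = simulated_power(villages_per_arm=80, children_per_village=20, effect=0.3)
print(power_20, power_80)
```

Rerunning the function over a grid of village counts, effect sizes, and variability assumptions produces a table of the share of significant simulations for each combination.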

  12. In Practice: An Example
      • Simulation tables
        • Table 1 assumes very high variability in test scores
        • X,Y: X is for an intervention with a small effect size; Y is for a larger effect size
        • N: significant in < 1% of simulations
        • S: significant in < 10% of simulations
        • A: significant in > 99% of simulations

  13. In Practice: An Example
      • Simulation tables
        • Table 2 assumes lower variability in test scores
        • X,Y: X is for an intervention with a small effect size; Y is for a larger effect size
        • N: significant in < 1% of simulations
        • S: significant in < 10% of simulations
        • A: significant in > 99% of simulations

  14. When do we really worry about this?
      • IF very small samples at the unit of treatment!
        • Suppose treatment in 20 schools and control in 20 schools
        • But there are 400 children in every school
        • This is still a small sample
      • IF interested in sub-groups (blocks)
        • Sample size requirements increase exponentially
      • IF using Regression Discontinuity Designs
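
The 20-schools example can be made concrete by computing a minimum detectable effect with and without the design effect. The intra-class correlation of 0.10, the 5% significance level, and the 80% power target are assumptions for this sketch; the slides give only the school and child counts.

```python
from statistics import NormalDist


def minimum_detectable_effect(n_clusters_per_arm, cluster_size, icc,
                              sd=1.0, alpha=0.05, power=0.80):
    """Smallest true difference in means detectable with the given power (normal approximation)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    deff = 1 + (cluster_size - 1) * icc          # design effect 1 + (E - 1) * rho
    n_per_arm = n_clusters_per_arm * cluster_size
    return z_sum * sd * (2 * deff / n_per_arm) ** 0.5


# 20 treatment and 20 control schools, 400 children each (8,000 children per arm).
mde_clustered = minimum_detectable_effect(20, 400, icc=0.10)    # assumed ICC of 0.10
mde_if_independent = minimum_detectable_effect(20, 400, icc=0.0)
print(f"{mde_clustered:.3f} SD vs {mde_if_independent:.3f} SD")
```

Under these assumptions the 8,000 children per arm behave like roughly 200 independent observations, so the detectable effect is more than six times larger than a naive count of children would suggest.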
