Sample Design, Part 2 • Produced in collaboration between the World Bank Institute and the Development Data Group (DECDG)
Sample Size • Sample size must be selected to meet: • reliability criteria • cost constraints • Steps to determine sample size: 1. Set reliability criteria and budget constraints. 2. Develop an equation to express the relationship between reliability and sample size. 3. Develop an equation to express the relationship between cost and sample size. 4. Select a sample size that satisfies both equations.
CV, a Measure of Reliability • CV: Coefficient of Variation • Relative measure of an estimate’s precision: the standard error divided by the estimate, CV = se(x)/x • Generally expressed in %; e.g., a 10% CV • Smaller CVs indicate more reliable estimates
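As a small illustration of the CV definition, the sketch below divides a standard error by its estimate. The function name and the numbers are hypothetical and are not taken from the slides.

```python
def coefficient_of_variation(estimate, standard_error):
    """Return the CV as a fraction (multiply by 100 for a percentage)."""
    if estimate == 0:
        raise ValueError("CV is undefined when the estimate is zero")
    return standard_error / abs(estimate)


# Hypothetical example: an estimated proportion of 0.08 with se = 0.008.
cv = coefficient_of_variation(estimate=0.08, standard_error=0.008)
print(f"CV = {cv:.1%}")
```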
Estimated Standard Error se(x) • Square root of the estimated variance • Often benchmarked against a simple random sample of size n drawn from a population of size N • The design effect de accounts for how a particular sample design departs from the “ideal” of simple random sampling
Estimated Standard Error se(x) • For a proportion p, the parameter s² takes the simple form s² = np(1-p)/(n-1) • A “safe” design effect of de = 4 can be useful
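The slide's own se(x) formula did not survive in this transcript, so the sketch below assumes the standard form for simple random sampling without replacement, inflated by the design effect: se(x) = sqrt(de · (1 - n/N) · s²/n), with s² = np(1-p)/(n-1) for a proportion. The function name and the example values are hypothetical.

```python
import math


def se_proportion(p, n, N, de=1.0):
    """Estimated standard error of a sample proportion p.

    Assumption (not shown in the transcript): variance = de * (1 - n/N) * s2 / n,
    where s2 = n * p * (1 - p) / (n - 1) as given on the slide.
    """
    if not (1 < n <= N):
        raise ValueError("need 1 < n <= N")
    s2 = n * p * (1 - p) / (n - 1)
    fpc = 1 - n / N  # finite population correction
    return math.sqrt(de * fpc * s2 / n)


# Hypothetical example: p = 0.08 from n = 2,000 out of N = 1,000,000.
se_srs = se_proportion(p=0.08, n=2000, N=1_000_000)        # simple random sample
se_de4 = se_proportion(p=0.08, n=2000, N=1_000_000, de=4)  # "safe" design effect
print(f"se (SRS) = {se_srs:.4f}, se (de=4) = {se_de4:.4f}")
```

Because the design effect multiplies the variance, de = 4 doubles the standard error relative to simple random sampling.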
Factors Affecting Sample Size • National and subnational reliability • Response rates • Clustering (households are sampled, not persons; clusters of households can be sampled) • Type of sampling (Stratification? Two-stage? Are first-stage areas small or large?) • Advanced techniques: sample rotation; population controls in estimation
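To illustrate one of these factors, response rates, a common adjustment is to divide the number of completed interviews needed by the expected response rate. This is a minimal sketch under that assumption. The function name and the numbers are hypothetical, and the slides do not state which adjustment they use.

```python
import math


def inflate_for_nonresponse(n_respondents, response_rate):
    """Units to select so that about n_respondents actually respond."""
    if not (0 < response_rate <= 1):
        raise ValueError("response_rate must be in (0, 1]")
    return math.ceil(n_respondents / response_rate)


# Hypothetical example: 8,000 completed interviews needed, 85% expected response.
n_selected = inflate_for_nonresponse(8000, 0.85)
print(n_selected)
```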
Sample Size Formulas • In the formulas on the previous slide, capital letters stand for assumed “known” population values and replace the sample estimates • S² = NP(1-P)/(N-1) in the second formula, for a proportion P • P is a proportion with respect to N, for example: • UE/N = unemployed / (adult population) • NOT the unemployment rate UER • UER = UE / (labor force)
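The formulas on the previous slide are not reproduced in this transcript. The sketch below therefore assumes the standard relationship CV² = de · (1 - n/N) · S² / (n · P²), solved for n, with S² = NP(1-P)/(N-1) as stated above. All population figures are hypothetical and chosen only to show the difference between UE/N and UER.

```python
import math


def sample_size_for_proportion(P, N, target_cv, de=1.0):
    """Smallest n whose planned CV for the proportion P is at most target_cv.

    Assumption: CV^2 = de * (1 - n/N) * S2 / (n * P^2), which rearranges to
    n = de * S2 / (target_cv^2 * P^2 + de * S2 / N).
    """
    if not (0 < P < 1) or N < 2 or target_cv <= 0:
        raise ValueError("need 0 < P < 1, N >= 2 and target_cv > 0")
    S2 = N * P * (1 - P) / (N - 1)
    n = de * S2 / (target_cv ** 2 * P ** 2 + de * S2 / N)
    return math.ceil(n)


# Hypothetical population figures.
adult_population = 1_000_000
labor_force = 600_000
unemployed = 48_000

P = unemployed / adult_population  # proportion with respect to N (use this one)
UER = unemployed / labor_force     # unemployment rate (NOT the P in the formula)

n_correct = sample_size_for_proportion(P, adult_population, target_cv=0.10, de=4)
n_from_uer = sample_size_for_proportion(UER, adult_population, target_cv=0.10, de=4)
print(n_correct, n_from_uer)
```

With these made-up figures, plugging in UER instead of UE/N gives a noticeably smaller sample size, which is why the distinction on the slide matters.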
Final Sample Size Determination • Several variables of interest? • National and subnational reliability criteria? • If the sample size is too costly, then rethink: • Relax reliability criteria? • Increase allowed cost?
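One way to sketch this final check: take the largest sample size required by any variable of interest and compare it with what the budget allows. The linear cost model (fixed cost plus a cost per sampled unit), the function names, and all figures below are assumptions for illustration. The slides do not specify a cost equation.

```python
import math


def max_affordable_n(budget, fixed_cost, cost_per_unit):
    """Largest n with fixed_cost + cost_per_unit * n <= budget (assumed linear cost)."""
    if cost_per_unit <= 0:
        raise ValueError("cost_per_unit must be positive")
    return max(0, math.floor((budget - fixed_cost) / cost_per_unit))


def final_sample_size(required_by_variable, budget, fixed_cost, cost_per_unit):
    """Return (n needed for all variables, n affordable, whether the plan is feasible)."""
    n_needed = max(required_by_variable.values())
    n_afford = max_affordable_n(budget, fixed_cost, cost_per_unit)
    return n_needed, n_afford, n_needed <= n_afford


# Hypothetical required sizes, one per variable of interest.
required = {"unemployed / adult population": 7900, "school enrolment": 5200}

n_needed, n_afford, feasible = final_sample_size(
    required, budget=250_000, fixed_cost=50_000, cost_per_unit=20
)
if feasible:
    print(f"Use n = {n_needed} (budget allows up to {n_afford})")
else:
    print(f"Need {n_needed} but can afford {n_afford}: relax criteria or raise budget")
```

Subnational criteria can be handled the same way by adding one entry per domain to the dictionary of required sizes, under the assumption that each domain's requirement has already been translated into a total sample size.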