Introduction to Sampling for the Implementation of PATs Materials Developed by The IRIS Center at the University of Maryland
Advantages of Sampling In most cases, we do not want to survey EVERYONE. Why? • Too costly • Too time consuming • Too many resources needed
Advantages of Sampling • To make our work more cost-effective: • Interview the minimum number needed • Reduce: • time • cost • human error
Survey Sampling According to sampling theory we can get valid results from studying only a fraction (a sample) of our clients, provided: • the sample is REPRESENTATIVE of the qualities of our client POPULATION, and • of sufficient SIZE to satisfy the assumptions of the statistical techniques used in our analysis
Simple Random Sampling For the sample to be representative, it must be obtained randomly. It is a simple random sample if each item in the population has an equal chance of being selected.
Types of Bias in Survey Process Poor randomization is not the only cause of biased samples. Bias and error are more often introduced by: • poor group definition • interviewer error • inadequate records (incomplete or outdated client lists).
Longitudinal Design Longitudinal studies follow the same clients at two or more points in time. Often there is a baseline (when the client began the program) and an endline (two years later, for example).
Cross Sectional Design Cross sectional studies compare multiple clients in the program at one point in time. Ex: On October 1, 2005, program looks at: • Incoming clients • 2-year clients • 4-year clients
Calculating Sample Size: How Big is Big Enough? • Sample results are almost never identical to those for the entire population • The larger the sample of clients, the greater the likelihood that the statistical analysis will yield “significant” results that closely resemble the entire client population.
Calculating Sample Size Different Views: • Statistician – maximalist – at least 500 • Field researcher – minimalist – at least 35 to 50 for each subgroup we want to analyze and compare • USAID PAT – at least 300
Trade-off: A larger sample is more accurate, but costs more in time and money. To make generalizations about the entire population, you need a total sample size of 200-400 (depending on the total population and the confidence level desired).
Sample Size Calculator • Creative Research Systems: www.surveysystem.com/sscalc.htm
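A calculator like the one linked above can be approximated with the standard sample-size formula for a proportion plus a finite population correction. This is a minimal sketch, assuming a 95% confidence level (z = 1.96), a 5% margin of error, and the most conservative proportion p = 0.5. The function name `sample_size` and its defaults are illustrative and are not taken from the calculator itself.

```python
import math


def sample_size(population, margin=0.05, z=1.96, p=0.5):
    """Estimate how many clients to interview (illustrative sketch).

    population: total number of clients
    margin:     acceptable margin of error (0.05 = plus or minus 5%)
    z:          z-score for the confidence level (1.96 is about 95%)
    p:          assumed proportion; 0.5 gives the largest (safest) sample
    """
    # Sample size for a very large (effectively infinite) population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Finite population correction: smaller populations need fewer interviews.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


for total_clients in (500, 1000, 10000):
    print(total_clients, sample_size(total_clients))
```

Under these assumptions, a population of 500 clients needs 218 interviews and a population of 10,000 needs 370, which is consistent with the 200-400 range quoted earlier.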
How to Sample Randomly? • RANDOM = giving each client an equal chance to be selected • This is done by: (1) drawing numbers, as in a lottery; (2) numbering all clients and selecting numbers from a random number table; or (3) systematically, by selecting every ‘nth’ case from a complete list of clients • DANGER! The list may be biased by who is left out. Is the list up-to-date?
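Selecting every ‘nth’ case can be sketched as follows. The random starting point is an assumption added here so that every client has a chance of selection; the function name `systematic_sample` and the client IDs are hypothetical.

```python
import random


def systematic_sample(clients, sample_size, rng=None):
    """Pick every nth client from a complete list, starting at a random offset."""
    rng = rng or random.Random()
    interval = len(clients) // sample_size  # the 'n' in "every nth case"
    if interval < 1:
        raise ValueError("sample size is larger than the client list")
    start = rng.randrange(interval)  # random start within the first interval
    # Take every nth client, then trim in case the list is not an exact multiple.
    return clients[start::interval][:sample_size]


client_ids = list(range(1, 101))  # hypothetical list of 100 numbered clients
print(systematic_sample(client_ids, 10))
```

The result is only as good as the list: if the list is ordered in a way that repeats every n clients, or if clients are missing from it, the sample inherits that bias.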
Steps in Taking a Simple Random Sample • Number a copy of the complete client list, and note the total number of clients (the last number) • Decide on your sample size • Create a list of random numbers • Use Excel or a random number table to select the sample, matching the numbers from the table with those on your numbered client list.
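The steps above can be sketched with Python's standard `random` module in place of Excel or a printed table. The function name `simple_random_sample` and the client names are hypothetical; the optional `seed` is only there so a draw can be repeated and documented.

```python
import random


def simple_random_sample(client_list, sample_size, seed=None):
    """Number the client list, draw random numbers, and match them to clients."""
    rng = random.Random(seed)
    # Step 1: number the complete list, starting from 1.
    numbered = list(enumerate(client_list, start=1))
    total = len(numbered)
    # Steps 2-3: draw `sample_size` distinct random numbers between 1 and total.
    chosen_numbers = sorted(rng.sample(range(1, total + 1), sample_size))
    # Step 4: match the random numbers with the numbered client list.
    return [numbered[n - 1] for n in chosen_numbers]


clients = [f"Client {i}" for i in range(1, 251)]  # hypothetical client list
for number, name in simple_random_sample(clients, 5, seed=42):
    print(number, name)
```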
Stratified Sampling To focus on specific subgroups, first classify the population into several subpopulations, called “strata,” then randomly sample from each stratum (subgroup).
Cluster Sampling A way of selecting randomly when you have a geographically dispersed population and time is limited. This method can help reduce the time and cost of data collection. Group the clients into clusters (these could be branches or loan groups). Randomly choose some of the clusters. Then randomly sample individuals from only the chosen clusters.
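A two-stage cluster sample along these lines might look like the sketch below. The branch names, the cluster sizes, and the function name `cluster_sample` are hypothetical, and taking the same number of clients from every chosen cluster is a simplifying assumption.

```python
import random


def cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly choose clusters. Stage 2: randomly choose clients in each."""
    rng = random.Random(seed)
    # Stage 1: pick which clusters (branches, loan groups) to visit.
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for name in chosen:
        members = clusters[name]
        # Stage 2: pick clients only inside the chosen clusters.
        k = min(n_per_cluster, len(members))
        sample[name] = rng.sample(members, k)
    return sample


branches = {  # hypothetical clusters: branch name -> client IDs
    "Branch A": list(range(1, 41)),
    "Branch B": list(range(41, 101)),
    "Branch C": list(range(101, 131)),
    "Branch D": list(range(131, 201)),
}
print(cluster_sample(branches, n_clusters=2, n_per_cluster=10, seed=1))
```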
Stratified Sampling • Stratified survey sampling enables you to focus on specific groups (for example, women or rural people), ensuring that they will be represented in the sample. Although random survey sampling, done correctly, will give the researcher roughly proportional samples of all groups, disproportional stratified sampling will guarantee that a certain group is adequately represented.
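Disproportional stratified sampling can be sketched by drawing the same fixed number of clients from every stratum, however large or small the stratum is. The client records, the `area` field, and the function name `stratified_sample` are hypothetical.

```python
import random


def stratified_sample(clients, stratum_of, n_per_stratum, seed=None):
    """Split clients into strata, then draw a random sample inside each stratum."""
    rng = random.Random(seed)
    strata = {}
    for client in clients:
        # `stratum_of` returns the stratum label (for example "rural") for a client.
        strata.setdefault(stratum_of(client), []).append(client)
    return {
        label: rng.sample(members, min(n_per_stratum, len(members)))
        for label, members in strata.items()
    }


# Hypothetical client list: 400 urban clients and only 60 rural clients.
clients = [{"id": i, "area": "urban"} for i in range(400)]
clients += [{"id": i, "area": "rural"} for i in range(400, 460)]

sample = stratified_sample(clients, lambda c: c["area"], n_per_stratum=35, seed=3)
print({label: len(members) for label, members in sample.items()})
```

Because rural clients are deliberately over-represented in such a sample, results for the whole population would need to be weighted back to the true proportions.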
Parametric Statistics • Assume that the values of your variables are normally distributed (bell curve) and that the distributions are relatively similar to each other. • In parametric statistics, thirty is a “magic minimum number”, meaning that it is generally accepted as the minimum cell size for each stratum or subgroup of a simple sample.
Minimum for Each Subgroup • 30 = ‘minimum magic number’ for each subgroup • To do any statistical analysis between subgroups, need a minimum of 30 in each subgroup in order to have any chance at all of finding ‘significant’ differences. BUT, 30 is NOT enough for your total sample.
If you want to compare between subgroups, you need 35 in each cell • Since the magic minimum number is 30, and you may have some missing values in some of your interview forms, for practical purposes, you need to always have a minimum number of 35 completed surveys for each cell of the sampling frame.
Handling Sampling Problems in the Field • If you cannot interview the client who is sampled (not available, refusal, etc.): sample at least an extra 40% and have alternates available to be interviewed in each area (subgroup) • This helps ensure that you complete 35 questionnaires for each subgroup (if you plan to do additional analysis and compare subgroups) • It also makes better use of the interviewers’ time
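The arithmetic behind the 40% rule is small but easy to get wrong in the field. A minimal sketch, assuming the extra names are rounded up to a whole client; the function name `names_to_draw` is hypothetical.

```python
def names_to_draw(completed_target=35, extra_percent=40):
    """Return (primary names, alternates, total) to draw for one subgroup."""
    # Ceiling division on integers, so a fraction of a client rounds up.
    alternates = -(-completed_target * extra_percent // 100)
    return completed_target, alternates, completed_target + alternates


primary, alternates, total = names_to_draw()
print(primary, alternates, total)
```

With the defaults this gives 35 primary names plus 14 alternates, 49 names per subgroup in total.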
What if there are not enough with the 40% extra? A. Check with the sample tracking coordinator to give you new names B. If there is not time, the field supervisor must adjust in the field: 1) Use a random number table and select clients from the master list who have not already been selected 2) If you do not have a random number table, you can ask someone to pick a number between # and ## at random. Do NOT introduce bias 3) Write down the changes that you made and how you did it
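Drawing replacements from the master list without re-selecting anyone can be sketched as below. The function name `draw_replacements` and the client numbers are hypothetical; recording the seed is one possible way to document how the change was made.

```python
import random


def draw_replacements(master_list, already_selected, n_needed, seed=None):
    """Randomly pick new clients from those not already selected."""
    rng = random.Random(seed)
    selected = set(already_selected)
    # Only clients who have not been sampled before are eligible.
    remaining = [client for client in master_list if client not in selected]
    if n_needed > len(remaining):
        raise ValueError("not enough unselected clients left on the master list")
    return rng.sample(remaining, n_needed)


master = list(range(1, 201))     # hypothetical master list of 200 clients
used = [3, 17, 42, 88, 150]      # hypothetical clients already sampled
print(draw_replacements(master, used, n_needed=3, seed=2024))
```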
An excerpt from a Random Number Table 32 50 92 46 24 69 48 93 77 87 47 17 29 36 55 81 34 70 46 99 27 95 04 69 59 71 30 74 42 36 45 11 49 20 50 86 16 75 80 55 33 98 93 66 76 13 56 08 38 43 12 11 01 21 41 13 87 08 47 98 64 61 65 94 30 17 51 54 45 85 41 22 96 26 64 38 09 93 01 49 43 06 09 24 42 23 23 21 65 14 95 76 09 00 24 54 15 04 34 41 58 61 05 09 82 97 30 78 89 23 44 66 18 71 83 08 21 74 18 91 Can use: www.random.org/nform.html
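Reading a random number table can be sketched as follows, using the first ten numbers of the excerpt above. Reading left to right, skipping numbers outside the client list and skipping repeats are common conventions assumed here; the population of 60 clients and the function name `select_from_table` are hypothetical.

```python
def select_from_table(table_numbers, population_size, sample_size):
    """Walk through table numbers, keeping valid, not-yet-chosen client numbers."""
    chosen = []
    for number in table_numbers:
        # Skip numbers that match no client (0 or above the list) and repeats.
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == sample_size:
            return chosen
    raise ValueError("ran out of table numbers before the sample was complete")


first_numbers = [32, 50, 92, 46, 24, 69, 48, 93, 77, 87]  # from the excerpt
print(select_from_table(first_numbers, population_size=60, sample_size=5))
# prints [32, 50, 46, 24, 48]
```

Here 92 and 69 are skipped because the hypothetical list has only 60 clients.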
IF YOU DON’T HAVE A CLIENT LIST • Random walk sampling -- less expensive but more prone to bias • Watch out for “tarmac bias”: selecting only houses that are easily accessible from the road
Example of BDS (Business Development Services) Sampling • Investigative emphasis: final beneficiaries. • Will use three subsectors (irrigation, cashews, potable water). • Will focus only on end users of the technologies. • Will focus on region surrounding Ziguinchor.
Example of Business Development Services Sampling • Sample size = 200 • Casamance region is the focus • Program has three sub-sectors • Sample in each sub-sector is stratified according to major differences between types of clients: • Irrigation – individual owners and group owners • Cashew processing – shellers and peelers • Potable water – tubewells and rope pumps (rural and peri-urban)
Example of BDS Sampling • Generate a list of the direct clients and divide by subgroup • The number of clients per stratum or subgroup depends on the percentage the stratum constitutes in the sector • Select clients using a random number list • Each direct client will provide information to the interviewers so that they can create a list of end users, from which some will be chosen according to a predetermined random number list.
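Allocating the sample in proportion to each stratum's share can be sketched as below. The client counts are invented for illustration (the source gives only the total sample size of 200), the function name `proportional_allocation` is hypothetical, and the largest-remainder rounding is one possible choice for making the parts add up to the total.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Split a total sample across strata in proportion to stratum size."""
    population = sum(stratum_sizes.values())
    exact = {k: total_sample * v / population for k, v in stratum_sizes.items()}
    # Start from the whole-number part of each exact share.
    allocation = {k: int(x) for k, x in exact.items()}
    leftover = total_sample - sum(allocation.values())
    # Hand out remaining interviews to the strata with the largest fractions.
    by_fraction = sorted(exact, key=lambda k: exact[k] - allocation[k], reverse=True)
    for k in by_fraction[:leftover]:
        allocation[k] += 1
    return allocation


# Hypothetical numbers of direct clients per sub-sector (not from the source).
direct_clients = {"irrigation": 500, "cashew processing": 300, "potable water": 200}
print(proportional_allocation(direct_clients, total_sample=200))
```

A stratum that ends up with fewer than 35 interviews under proportional allocation would be too small for subgroup comparisons, in which case a disproportional allocation may be preferable.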