OVERVIEW OF SAMPLE SURVEYS. Mehdi Nassirpour, Ph.D., Illinois Department of Transportation. This presentation was part of the Applied Sampling Workshop at the Annual TRB Conference in Washington, DC, in January 2004.
HOW GOOD MUST THE SAMPLE BE? • There is no uniform standard of quality that must be reached by every sample. • The quality of the sample depends entirely on the stage of the research and how the information will be used. Division of Traffic Safety at IDOT
CURRENT POPULATION SURVEY • The CPS is a monthly survey of households. • It provides data on the labor force, employment, unemployment, and persons not in the labor force. • The sample is precise and tightly controlled because it is the only source of monthly estimates of total employment and unemployment. • The sampling error for this kind of sample is about 0.1 percent.
PUBLIC PERCEPTION OF ILLINOIS SAFETY BELT USE • A sample of 500 Illinois residents over 18 years of age was selected. • To achieve equal reliability, the sample for a state or local geographic area would need to be virtually as large as a national sample of the US, yet local samples are generally smaller. • That is, although public attitudes toward safety belt issues are just as important at the state level, less research funding is available for a state study than for a national one.
INAPPROPRIATE SAMPLE DESIGN • Whether a sample design is appropriate depends on how it is used and on the resources available. • Often the fair criticism is not of the design itself but that the generalizations made from the sample go too far.
WHAT IS THE APPROPRIATE SAMPLE DESIGN? • DEGREE OF ACCURACY • RESOURCES • TIME • ADVANCE KNOWLEDGE OF THE POPULATION • NATIONAL VERSUS LOCAL • NEED FOR STATISTICAL ANALYSIS
SMALL-SCALE SAMPLE WITH LIMITED RESOURCES • Generalizability • Sample size • Too small for a meaningful analysis • Adequate for some but not all major analyses • Adequate for the purpose of study • Sample Execution • Poor response rate • Careless field work • Use of resources
Stages in the Selection of a Sample • Define the target population • Select a sampling frame • Determine whether probability or non-probability sampling will be used • Plan procedures for selecting sampling units • Determine sample size • Select actual sampling units • Conduct field work
TARGET POPULATION • RELEVANT POPULATION • OPERATIONALLY DEFINE
DEFINING POPULATION 1. DEFINITION OF TARGET POPULATION • Complete set of individuals from which information is collected • TARGET AREA • Entire region or set of locations from which information is collected • Example: define the population for a study of the elderly in Springfield, IL • How will you distinguish the elderly from the non-elderly? • Will the elderly be defined by occupational categories? Do you want retired people? Or do you want persons who are over 65 and retired?
SAMPLING FRAME • A LIST OF ELEMENTS FROM WHICH SAMPLE MAY BE DRAWN • WORKING POPULATION • MAILING LIST--DATABASE • SAMPLING FRAME ERROR
SAMPLING FRAME (Examples) CONSTRUCTION OF OPERATIONAL SAMPLING FRAME • List of all subjects in the population • Specific definition of population • Ideally the sampling frame is almost or exactly identical to the entire population • Example: telephone surveys of voter preferences for political parties • Population of interest: all voters • Sampling frame: all voters who have a telephone and answer it • SAMPLED POPULATION: the set of all individuals contained in the sampling frame, from which the sample is actually taken • SAMPLED AREA: the set of all locations within the study area boundary that delimits the spatial sampling frame, from which the sample is drawn
SAMPLING UNITS • GROUP SELECTED FOR THE SAMPLE • PRIMARY SAMPLING UNIT (PSU) • SECONDARY SAMPLING UNIT • TERTIARY SAMPLING UNIT
SAMPLING ERRORS • SAMPLING FRAME ERROR (STUDY DESIGN) • RANDOM SAMPLING ERROR (SAMPLING VARIABILITY) • NONRESPONSE ERROR (MEASUREMENT BIASES)
RANDOM SAMPLING ERROR • DIFFERENCE BETWEEN THE SAMPLE RESULT AND THE RESULT OF A CENSUS CONDUCTED USING IDENTICAL PROCEDURES • STATISTICAL FLUCTUATION DUE TO CHANCE VARIATIONS
SYSTEMATIC ERRORS • NONSAMPLING ERRORS • UNREPRESENTATIVE SAMPLE RESULTS • NOT DUE TO CHANCE • DUE TO STUDY DESIGN OR IMPERFECTIONS IN EXECUTION
SOURCES OF NON-SAMPLING ERRORS
• Under-representation
  - the poor, the homeless, prison inmates
  - telephone opinion polls miss the 6% of the population that does not have a phone
• Non-response
  - occurs when selected individuals are not contacted or do not respond
  - usually 30%
  - results in bias
• Interviewing skills (important not to introduce bias)
  - types of questions asked
  - attitude during interviewing
  - wording of questions: confusing, misleading, intimidating
SOURCES OF SAMPLING ERROR • Inadequate sample size • The smaller the sample, the harder it is for that sample to capture the true characteristics of the population • Imprecise sample/results • The larger the sample, the better • But collecting large samples costs money and resources • In practice, a balance must be struck between collecting an extensive sample at high cost and saving money but lacking enough data to draw conclusions
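The trade-off between sample size and precision can be illustrated with a rough calculation. The sketch below is not from the presentation; it assumes a simple random sample, a proportion near 50 percent (the worst case), and a 95 percent confidence level (z = 1.96). The function name `margin_of_error` is chosen here for illustration.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion p estimated
    from a simple random sample of size n (assumed design)."""
    return z * math.sqrt(p * (1 - p) / n)

# Quadrupling the sample size only halves the margin of error.
for n in (100, 500, 2000):
    print(n, round(100 * margin_of_error(n), 1), "percentage points")
```

Under these assumptions a sample of 500 gives a margin of roughly 4.4 percentage points, which is why each further gain in precision becomes more expensive.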
Relationship Between Total Error and Sampling and Non-Sampling Errors [Figure: diagram relating Sampling Error, Non-sampling Error, and Total Error]
TWO TYPES OF SAMPLING • PROBABILITY SAMPLING • NONPROBABILITY SAMPLING
NONPROBABILITY SAMPLING • CONVENIENCE • JUDGMENT • QUOTA • SNOWBALL
PROBABILITY SAMPLING • SIMPLE RANDOM SAMPLE • SYSTEMATIC RANDOM SAMPLE • STRATIFIED SAMPLE • CLUSTER SAMPLE • MULTISTAGE RANDOM SAMPLE
CONVENIENCE SAMPLING • Obtaining a sample of the people or units that are most convenient.
JUDGMENT SAMPLING • Selecting a sample based on an individual's judgment about some appropriate characteristic required of the sample members.
QUOTA SAMPLING • Requires that the various subgroups in a population be represented. • It should not be confused with stratified sampling.
SNOWBALL SAMPLING • Additional respondents are obtained from information provided by the initial sample of respondents.
SIMPLE RANDOM SAMPLE • A sampling procedure that ensures that each element in the population will have an equal chance of being included in the sample.
HOW TO CHOOSE RANDOM SAMPLE • Assign each element within the sampling frame a unique number (1-99). • Identify a random start from the random number table. • Determine how the digits in the random number table will be assigned to the sampling frame. • Select the sample elements from the sampling frame.
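In practice a pseudo-random number generator usually stands in for the printed random number table. The following is a minimal sketch of that substitution, not the procedure from the presentation; the frame of 99 hypothetical `driver_NN` entries and the seed value are assumptions made for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the sampling frame so that every
    element has the same chance of selection (sampling without replacement)."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical sampling frame: 99 elements numbered 01-99.
frame = [f"driver_{i:02d}" for i in range(1, 100)]
sample = simple_random_sample(frame, 10, seed=2004)
print(sample)
```

Fixing the seed documents the draw so that the selection can be audited later.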
SYSTEMATIC RANDOM SAMPLE • Identify the total number of elements in the population. • Compute the sampling interval k = N/n (N = total population size, n = desired sample size). • Identify a random start between 1 and k. • Draw the sample by choosing every kth entry from the random start.
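The steps above can be sketched in a few lines. This is an illustrative sketch only: it assumes the frame is an ordered list, uses integer division for the interval (so a few elements at the end of the frame cannot be selected when N is not a multiple of n), and uses 0-based positions; the function name is hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select n elements by taking every k-th entry after a random start."""
    N = len(frame)
    k = N // n                    # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)      # random start among the first k positions
    return [frame[start + i * k] for i in range(n)]

# Hypothetical frame of 1,000 numbered records, sample of 50 (k = 20).
frame = list(range(1, 1001))
sample = systematic_sample(frame, 50, seed=1)
print(sample[:5])
```

If the frame is ordered in a way that cycles with period k, this design can produce a biased sample, so the ordering of the list deserves a check first.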
EXAMPLE OF SYSTEMATIC RANDOM SAMPLE
STRATIFIED RANDOM SAMPLE • Sub-samples are drawn within different strata. • Each stratum is more or less homogeneous on some characteristic.
REASONS FOR STRATIFIED RANDOM SAMPLE • Make the sample more efficient, since variance differs between the strata. • Reduce sampling error between strata. • Reduce the number of cases required to achieve a given degree of accuracy.
TYPES OF STRATIFIED RANDOM SAMPLE • Proportionate Stratified Random Sample • Disproportionate Stratified Random Sample
PROPORTIONATE STRATIFIED RANDOM SAMPLE • It is used to get a more representative sample than might be expected under simple random sampling (SRS). • It reduces sampling error between strata with respect to the relative numbers selected. This holds when the groups are homogeneous. • The population size of each stratum must be known in order to draw a proportionate stratified sample.
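Proportionate allocation gives each stratum the same share of the sample as it has of the population. The sketch below shows one way to compute the allocation; the stratum names and sizes are hypothetical, and the largest-remainder rounding rule is an assumption added so that the allocations sum exactly to n.

```python
def proportionate_allocation(stratum_sizes, n):
    """Split a total sample of n across strata in proportion to their
    population sizes, using largest-remainder rounding."""
    total = sum(stratum_sizes.values())
    raw = {h: n * size / total for h, size in stratum_sizes.items()}
    alloc = {h: int(r) for h, r in raw.items()}
    leftover = n - sum(alloc.values())
    # Hand any remaining units to the strata with the largest fractional parts.
    by_remainder = sorted(raw, key=lambda h: raw[h] - alloc[h], reverse=True)
    for h in by_remainder[:leftover]:
        alloc[h] += 1
    return alloc

# Hypothetical population counts by stratum.
sizes = {"urban": 6000, "suburban": 3000, "rural": 1000}
alloc = proportionate_allocation(sizes, 500)
print(alloc)
```

Within each stratum the allocated number of cases would then be drawn by simple random or systematic sampling.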
DISPROPORTIONATE STRATIFIED RANDOM SAMPLE • It is used to manipulate the number of cases selected in order to improve the efficiency of the design. • The main interest is in studying the separate sub-populations represented by the strata rather than the entire population.
TYPICAL EXAMPLES OF STRATIFIED RANDOM SAMPLE • Common stratification variables are demographic and geographic: age, gender, race, region, road type, urban/rural.
WEIGHTING THE SAMPLE • The reason for weighting is to correct problems associated with sample bias (sampling and non-sampling). • Some sampling biases are known in advance; for example, a household with more than one phone number has a higher chance of selection under random digit dialing.
WEIGHTING PROCESS • Assign each sample element a weight equal to the inverse of its probability of selection. • When all sample elements have had the same chance of selection, each is given the same weight, 1. (This is called a self-weighting sample.)
WEIGHTING EXAMPLE • Nonwhite Female: weight = 7.2 / 12.3 = 0.59 • Nonwhite Male: weight = 3.8 / 9.8 = 0.39 • White Female: weight = 57.7 / 56.7 = 1.02 • White Male: weight = 31.2 / 21.2 = 1.47
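The four weights above can be reproduced with a short calculation. The sketch assumes that each numerator is the group's percentage share of the population and each denominator its percentage share of the sample; the slide gives only the numbers, so that reading is an interpretation, and the variable names are chosen here for illustration.

```python
# Assumed interpretation: population share (%) divided by sample share (%).
population_pct = {
    "Nonwhite Female": 7.2,
    "Nonwhite Male": 3.8,
    "White Female": 57.7,
    "White Male": 31.2,
}
sample_pct = {
    "Nonwhite Female": 12.3,
    "Nonwhite Male": 9.8,
    "White Female": 56.7,
    "White Male": 21.2,
}

weights = {
    group: round(population_pct[group] / sample_pct[group], 2)
    for group in population_pct
}
for group, w in weights.items():
    print(f"{group}: {w}")
```

On this reading, a weight below 1 scales down a group that is over-represented in the sample, and a weight above 1 scales up an under-represented group.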
Computation (Estimates of Means and Standard Errors) for Stratified Sample • Compute values for each stratum and then weight them based on the relative size of the stratum in the population.
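A minimal sketch of this computation follows. It uses the standard textbook estimator for a stratified simple random sample, which may differ in detail from the formulas on the original slides: the overall mean is the sum of W_h times the stratum mean, and its variance is the sum of W_h squared times (1 - n_h/N_h) times s_h squared over n_h, where W_h = N_h/N. The three county records are hypothetical values invented for illustration.

```python
import math

def stratified_mean_and_se(strata):
    """Each stratum is a dict with N (population size), n (sample size),
    mean (sample mean) and var (sample variance)."""
    N_total = sum(s["N"] for s in strata)
    mean = sum(s["N"] / N_total * s["mean"] for s in strata)
    variance = sum(
        (s["N"] / N_total) ** 2      # squared stratum weight W_h
        * (1 - s["n"] / s["N"])      # finite population correction
        * s["var"] / s["n"]          # variance of the stratum mean
        for s in strata
    )
    return mean, math.sqrt(variance)

# Hypothetical data for three counties (not taken from the presentation).
counties = [
    {"N": 5000, "n": 100, "mean": 0.72, "var": 0.20},
    {"N": 3000, "n": 60, "mean": 0.65, "var": 0.23},
    {"N": 2000, "n": 40, "mean": 0.58, "var": 0.24},
]
mean, se = stratified_mean_and_se(counties)
print(round(mean, 3), round(se, 4))
```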
WEIGHTING FORMULA [formula not preserved in this transcript]
Data for Computing Parameter Estimates from Stratified Samples [table not preserved in this transcript]
Estimated Standard Errors [values for County 1, County 2, and County 3 not preserved in this transcript]
Estimated Mean and Variance [formulas not preserved in this transcript]
CLUSTER SAMPLING • Divide the population into a large number of groups, called clusters, and then sample among the clusters. Finally, select all individuals within the sampled clusters. • The main reason for cluster sampling is to sample economically while retaining the characteristics of a probability sample.
TYPES OF CLUSTER SAMPLING • Single-Stage Cluster Sampling: Divide the population into several hundred census tracts and select 40 tracts as a sample. Then select every individual within the selected census tracts. • Multistage Cluster Sampling: Take a random sample of census tracts within a city. Then, within each selected census tract, take a simple random sample of blocks (smaller clusters). Finally, we might select every third house and interview every second adult within each of these households.
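The two designs can be contrasted in code. The sketch below is a simplified illustration, not the procedure from the presentation: the multistage version stops at two stages and draws a simple random sample of households inside each selected tract instead of blocks, then houses, then adults. The tract and household identifiers are hypothetical.

```python
import random

def single_stage_cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select clusters, then keep every unit in each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {c: list(clusters[c]) for c in chosen}

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Randomly select clusters, then randomly subsample units within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        c: rng.sample(clusters[c], min(n_per_cluster, len(clusters[c])))
        for c in chosen
    }

# Hypothetical frame: 200 census tracts with 20 households each.
tracts = {
    f"tract_{i:03d}": [f"t{i:03d}_h{j:02d}" for j in range(20)]
    for i in range(200)
}
single = single_stage_cluster_sample(tracts, 40, seed=1)
two_stage = two_stage_cluster_sample(tracts, 40, 5, seed=1)
print(sum(len(v) for v in single.values()), sum(len(v) for v in two_stage.values()))
```

With 40 tracts selected in both designs, the single-stage sample keeps all 800 households while the two-stage sample keeps 200, which shows where the cost saving comes from.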
CLUSTER SAMPLING: Probability Proportionate to Size (PPS) • Arrange the clusters in a desired order (not necessarily by size) • Obtain the size data • Sum up the size measures over clusters • Determine the sampling interval • Select a random start
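The steps above describe systematic PPS selection, which can be sketched as follows. This is an illustrative sketch under stated assumptions: the interval is the total size divided by the number of clusters to select, selection points are laid along the cumulative size scale, and a cluster larger than the interval may be selected more than once. The cluster names and sizes are hypothetical.

```python
import random
from itertools import accumulate

def pps_systematic(sizes, n, seed=None):
    """sizes: list of (cluster_id, size) pairs in the chosen order.
    Returns the ids of n clusters selected with probability
    proportionate to size."""
    total = sum(size for _, size in sizes)
    interval = total / n                       # sampling interval
    rng = random.Random(seed)
    start = rng.random() * interval            # random start in [0, interval)
    cumulative = list(accumulate(size for _, size in sizes))
    selected, idx = [], 0
    for i in range(n):
        point = start + i * interval
        # Advance to the cluster whose cumulative range contains the point.
        while idx < len(cumulative) - 1 and cumulative[idx] < point:
            idx += 1
        selected.append(sizes[idx][0])
    return selected

# Hypothetical clusters with their size measures.
clusters = [("A", 100), ("B", 300), ("C", 50), ("D", 550)]
print(pps_systematic(clusters, 2, seed=7))
```

In this example cluster D holds more than half of the total size, so the second selection point always falls inside it.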
Difference Between Cluster Sampling and Stratified Sampling • Although both types of sample involve dividing the population into groups, they involve opposite sampling operations. • In a stratified sample, we sample individuals within every stratum. The sampling error comes from variability within strata. Strata should be as homogeneous as possible internally and as different as possible from each other. • In (single-stage) cluster sampling, there is no source of sampling error within the clusters because every case is used. The variability is between the clusters.