Analysis of complex sampling designs: A brief primer

Clyde Dent, PhD. PSU quantitative interest group Brown Bag, December 2009

What makes a survey design ‘complex’? • unequal probability of selection: some groups may be intentionally sampled at higher rates • clustering of observations: elements are intentionally sampled in intact groups

The Major Problems with analysis of complex survey designs • Observations are not independent in complex surveys. • Standard statistical analysis generally overestimates the precision of estimates. • Unequal selection probabilities require weighted analysis.

Some definitions in survey sampling: Complex survey: A survey in which sample elements are not drawn by simple random sampling. Population: The entire set of individuals to which the findings of a survey refer. Sample: A subset of the population selected for a study. Sample Design: The scheme by which items are chosen for the sample.

Identifying the Sample Frame • Sampling units or elements: the “things” about which data are collected • A sampling frame is a list of units or elements that defines the target population. • A sampling frame should cover all of the target population.

Strategies in sample selection • Census • Probability Sample • Non-Probability samples

Probability Sampling • All members of a population have some chance (or non-zero probability) of being sampled • Employs random selection

Survey Designs Three Basic Designs: Simple random sampling Stratified sampling Cluster sampling Two methods of conducting surveys: Single-stage sampling plans Multi-stage sampling plans

Simple Random Sampling • In addition to having a non-zero chance of being selected, each element of the population has an equal chance of being selected • Each element is selected independently

Simple Random Sampling • Ease of analysis • Need an exhaustive sample frame • Typically need large samples to get a desired precision in estimates • Can be high cost • Can take more time

Stratified Random Sampling • Divide population into various “strata” or subgroups • Randomly sample within these strata • Strata may be geographical areas, races or ethnicities, socioeconomic classes, etc…

Stratified Random Sampling • Can increase precision of estimates • Allows for differential sampling for distinct sub-pops (over-sampling) • May decrease costs • Accomplishes two tasks: • Makes sample more representative of population • Controls for confounding effects of the stratification criteria
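A minimal sketch of stratified random sampling with over-sampling of a small stratum. The frame, the stratum labels, the sampling rates and the function name `stratified_sample` are all hypothetical and chosen only for illustration.

```python
import random

def stratified_sample(frame, stratum_of, rates, seed=0):
    """Draw a simple random sample within each stratum.

    frame      -- list of population elements (the sampling frame)
    stratum_of -- function mapping an element to its stratum label
    rates      -- dict: stratum label -> sampling fraction in (0, 1]
    """
    rng = random.Random(seed)
    # Divide the frame into strata.
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    # Randomly sample within each stratum at that stratum's own rate.
    sample = {}
    for label, units in strata.items():
        n_h = round(rates[label] * len(units))  # stratum sample size
        sample[label] = rng.sample(units, n_h)
    return sample

# Hypothetical frame: 900 urban and 100 rural residents.
frame = [("urban", i) for i in range(900)] + [("rural", i) for i in range(100)]
# Over-sample the small rural stratum (50%) relative to urban (10%).
sample = stratified_sample(frame, lambda u: u[0], {"urban": 0.10, "rural": 0.50})
sizes = {label: len(units) for label, units in sample.items()}
```

Because the two strata are sampled at different rates, any estimate for the whole population from this sample would need weights.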

Cluster Sampling • Identify “clusters” within a population • Counties, nursing homes, factories, etc… • Randomly sample these clusters • Survey a census of individuals in each sampled cluster (single-stage) • The major difference in this technique is that the primary sampling element is the cluster, not the individual

Cluster Sampling • Useful when a sampling frame of individuals is difficult to get or does not exist • Lowers cost • Travel • Supervision • Decreases time • Good when elements w/in cluster are heterogeneous • Loss of precision in estimates

Multi-Stage Cluster Sampling • Break population into clusters • Take random sample of clusters (stage 1) • From this sample take random sample of individuals (stage 2)
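The two stages can be sketched as follows. This is an illustration under simple assumptions (equal-probability selection at both stages, a made-up population of counties); the function name `two_stage_sample` is hypothetical.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=0):
    """clusters: dict mapping cluster id -> list of individuals."""
    rng = random.Random(seed)
    # Stage 1: simple random sample of clusters.
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for cid in chosen:
        members = clusters[cid]
        # Stage 2: simple random sample of individuals within each chosen cluster.
        k = min(n_per_cluster, len(members))
        sample[cid] = rng.sample(members, k)
    return sample

# Hypothetical population: 20 counties with 50 residents each.
clusters = {f"county_{c}": [f"c{c}_p{i}" for i in range(50)] for c in range(20)}
sample = two_stage_sample(clusters, n_clusters=5, n_per_cluster=10)

# Overall selection probability is the product of the stage probabilities.
p_select = (5 / 20) * (10 / 50)
```

Here every person has the same overall selection probability only because all clusters are the same size; with unequal cluster sizes the probabilities, and hence the weights, would differ.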

Multi-Stage Cluster Sampling – Terms and Examples • Primary Sampling Unit (PSU) • The first set of clusters identified and sampled • Secondary Sampling Unit (SSU) • The second or sub-set of clusters identified and sampled • Example: Population is the entire adult U.S. population • PSU’s may be all U.S. counties • SSU’s may be ZIP codes within selected counties • A sample of individuals from within selected ZIP codes of selected counties from within the U.S. is taken as the final sampling unit (FSU)

Stratified Multi-Stage Cluster Sampling • Same as multi-stage cluster sampling, except… • Stratify PSU’s prior to initial sampling • In the last example: stratify counties into three strata – Urban, Suburban and Rural

Weighting

What is a Survey Weight? • A value assigned to each case in the data file. • Normally used to make estimates computed from the data more representative of the population. • I.e., the value indicates how much each case will count in a statistical procedure. • Examples: • A weight of 2 means that the case counts in the dataset as two identical cases. • A weight of 1 means that the case only counts as one case in the dataset. • Weights can be (and often are) fractions, but are always positive and non-zero.
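A small sketch of what "a weight of 2 counts as two identical cases" means for a mean. The values and weights are made up, and `weighted_mean` is a hypothetical helper, not a function from any survey package.

```python
def weighted_mean(values, weights):
    """Weighted mean; weights must be strictly positive."""
    if any(w <= 0 for w in weights):
        raise ValueError("survey weights must be positive")
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)

values = [10.0, 20.0, 30.0]
weights = [2, 1, 1]  # the first case counts twice

wm = weighted_mean(values, weights)

# Same result as physically duplicating the first case and weighting equally.
duplicated = [10.0, 10.0, 20.0, 30.0]
dup_mean = sum(duplicated) / len(duplicated)
```

This equivalence holds for the point estimate only; duplicating cases does not give correct standard errors.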

Types of Survey Weights • Two most common types: • Design Weight: • Normally used to compensate for over- or under-sampling of specific cases or for disproportionate stratification. • Post-Stratification Weight: • This type is used to compensate for fluctuations in sampling on important non-design characteristics (e.g., age).

Calculating Design Weights If we know the sampling fraction for each case, the weight is the inverse of the sampling fraction. Design Weight = 1/(sampling fraction). The sampling “fraction” could be over 1.0. Example: If we over-sampled African Americans at 4 times the rate for Whites, then the design weight for an African American respondent would be ¼ of that for a White respondent.
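The over-sampling example can be checked numerically. The sampling fractions below (1% and 4%) are assumed values chosen only to give a 4-to-1 ratio; `design_weight` is a hypothetical helper.

```python
def design_weight(sampling_fraction):
    """Design weight = 1 / (sampling fraction)."""
    if sampling_fraction <= 0:
        raise ValueError("sampling fraction must be positive")
    return 1.0 / sampling_fraction

f_base = 0.01          # assumed sampling fraction for the group sampled at the base rate
f_over = 4 * f_base    # group over-sampled at 4 times that rate

w_base = design_weight(f_base)   # each respondent represents about 100 people
w_over = design_weight(f_over)   # each respondent represents about 25 people
ratio = w_over / w_base          # about 1/4, as in the example
```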

Calculating Post-Stratification Weights • This is normally more difficult than calculating design weights. • It requires the use of auxiliary information about the population • It may take a number of different variables into account. • Information usually needed: • Population estimates of the distribution of a set of demographic characteristics that have also been measured in the sample • For example, information found in the Census such as: • Gender, Age, Educational attainment, • Household size, Residence (e.g., rural, urban), Region
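A minimal sketch of post-stratification on a single variable: each group's weight is its population share divided by its sample share. The age groups, sample counts and population shares are invented for illustration (they are not real Census figures), and `post_strat_weights` is a hypothetical helper. Real applications often adjust on several variables at once, which this sketch does not attempt.

```python
def post_strat_weights(sample_counts, population_shares):
    """Weight per group = population share / sample share."""
    n = sum(sample_counts.values())
    weights = {}
    for group, count in sample_counts.items():
        sample_share = count / n
        weights[group] = population_shares[group] / sample_share
    return weights

# Hypothetical sample of 1,000 respondents by age group.
sample_counts = {"18-34": 200, "35-64": 500, "65+": 300}
# Hypothetical population distribution for the same groups.
population_shares = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}

weights = post_strat_weights(sample_counts, population_shares)

# After weighting, the sample reproduces the population distribution
# and the weighted total still equals the sample size.
weighted_n = sum(sample_counts[g] * weights[g] for g in sample_counts)
```

The under-represented young group gets a weight above 1 (1.5) and the over-represented older group a weight below 1 (about 0.67).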

Problems with Weights • Weights primarily adjust means and proportions. This is OK for descriptive estimates, but it may adversely affect inferential statistics and standard errors. • Weights almost always increase the standard errors of your estimates, i.e., they introduce imprecision. • Very large weights (or very small ones) can also introduce imprecision.

Variance estimation

Variance Estimation • Sampling variation depends on the estimator, sample design and sample size • Many researchers believe it depends only on sample size • Standard variance formulae available for most analysis methods • Typically assume SRS • However these formulae do not work for the sample designs used in most complex surveys

Classical Approaches • Variance formulae have also been developed for some estimators under a wide range of sample designs • See books by authors such as Cochran and Kish • 1950’s to 1970’s • Design effect • Ratio of actual variance to variance assuming SRS of same size • Typically varies from one item to the next • Usually under 2 for well-designed surveys, but sometimes more • Can be much higher for other surveys – e.g. 25 for some items in AK BRFSS

Design Effect (deff) Vc = variance of an estimator from a complex design Vsrs = variance of an estimator from a simple random sample of the same size Deff = Vc / Vsrs

Effective Sample Size n = actual sample size n’ = ‘effective sample size’, i.e., the size of a simple random sample with the same variance n’ = n/deff Example: n = 10,000, deff = 2, n’ = 10,000/2 = 5,000
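The two definitions translate directly into code. The sketch below reproduces the n = 10,000, deff = 2 example and, as an added illustration not taken from the slides, shows the effect on the standard error of a proportion of 0.5. Function names are hypothetical.

```python
import math

def design_effect(var_complex, var_srs):
    """deff = Vc / Vsrs."""
    return var_complex / var_srs

def effective_sample_size(n, deff):
    """n' = n / deff."""
    return n / deff

n = 10_000
deff = 2.0
n_eff = effective_sample_size(n, deff)

# Illustration: standard error of a proportion p = 0.5.
p = 0.5
var_srs = p * (1 - p) / n          # variance assuming SRS
var_complex = deff * var_srs       # variance under the complex design
se_srs = math.sqrt(var_srs)
se_complex = math.sqrt(var_complex)  # larger by a factor of sqrt(deff)
```

A deff of 2 halves the effective sample size but inflates the standard error by only sqrt(2), roughly 41%.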

What Affects Design Effects? • Stratification – can reduce deffs • Clustering – generally increases deffs • Unequal probabilities of selection and post-stratification leading to unequal weights – increases deff • In a complex design, the total deff is the product of these
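A rough sketch of how the components combine. The two component formulas are the standard approximations usually attributed to Kish; they are not given in the slides and are used here as assumptions: deff from clustering ≈ 1 + (m − 1)·rho for average cluster size m and intraclass correlation rho, and deff from unequal weights ≈ n·Σw² / (Σw)². All input values are made up.

```python
def deff_clustering(avg_cluster_size, rho):
    """Approximate clustering design effect: 1 + (m - 1) * rho."""
    return 1.0 + (avg_cluster_size - 1) * rho

def deff_weighting(weights):
    """Approximate weighting design effect: n * sum(w^2) / (sum(w))^2."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2

# Hypothetical inputs.
d_cluster = deff_clustering(avg_cluster_size=20, rho=0.05)  # 1 + 19 * 0.05
d_weight = deff_weighting([1, 1, 2, 2])                      # 4 * 10 / 36

# Total design effect taken as the product of the components.
d_total = d_cluster * d_weight
```

Equal weights give a weighting deff of exactly 1, so only variation in the weights costs precision.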

General Methods for Variance Estimation • Variance formulae may already be available • If not, there are several general methods for variance estimation for complex surveys • Linearization • Resampling methods • Balanced repeated replication (BRR) • Jackknife • Bootstrap
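As one example of a resampling method, here is a minimal delete-one-cluster jackknife for a weighted mean. It assumes a single stratum and drops one whole cluster at a time; stratified designs need a per-stratum version. The data are invented and the function names are hypothetical.

```python
import statistics

def weighted_mean(rows):
    """rows: list of (value, weight) pairs."""
    return sum(v * w for v, w in rows) / sum(w for _, w in rows)

def jackknife_variance(clusters):
    """Delete-one-cluster jackknife; clusters: dict id -> list of (value, weight)."""
    ids = list(clusters)
    k = len(ids)
    full = weighted_mean([r for cid in ids for r in clusters[cid]])
    # Re-estimate the mean with each cluster left out in turn.
    replicates = []
    for left_out in ids:
        rows = [r for cid in ids if cid != left_out for r in clusters[cid]]
        replicates.append(weighted_mean(rows))
    var = (k - 1) / k * sum((t - full) ** 2 for t in replicates)
    return full, var

# Hypothetical data: 4 clusters, 2 identical responses per cluster,
# i.e. observations within a cluster are perfectly correlated.
clusters = {
    "a": [(1.0, 1.0), (1.0, 1.0)],
    "b": [(2.0, 1.0), (2.0, 1.0)],
    "c": [(3.0, 1.0), (3.0, 1.0)],
    "d": [(4.0, 1.0), (4.0, 1.0)],
}
estimate, jk_var = jackknife_variance(clusters)

# Naive variance of the mean, treating all 8 observations as independent.
values = [v for rows in clusters.values() for v, _ in rows]
naive_var = statistics.variance(values) / len(values)
```

In this toy data the naive variance is less than half the jackknife variance, which is the overstatement of precision that ignoring clustering produces.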

Test adjustments • Pearson Chi-Square • Difference between observed and expected frequencies • Based on SRS, too liberal, especially if the observed n is used • May be OK if using effective n • Rao-Scott Chi-Square • Adjusts Pearson for design effects • Uses observed cell proportions to adjust • Modified Rao-Scott / Wald • Design correction uses the null hypothesis proportions to adjust • Adjusted F / Wald log-linear • Corrects for test instability with a small number of clusters
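A sketch of the "Pearson with effective n" idea: scale the observed cell counts by 1/deff before computing the usual statistic, which is the same as dividing the Pearson chi-square by deff. This is a simplified illustration that assumes one common design effect for every cell; it is not the Rao-Scott procedure, which estimates its correction from the data. The table and the deff of 2 are made up.

```python
def pearson_chi_square(table):
    """Pearson chi-square for a two-way table of counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2

def effective_n_chi_square(table, deff):
    """Pearson chi-square on counts rescaled to the effective sample size."""
    scaled = [[count / deff for count in row] for row in table]
    return pearson_chi_square(scaled)

observed = [[30, 20], [20, 30]]   # hypothetical 2x2 table, n = 100
naive = pearson_chi_square(observed)
adjusted = effective_n_chi_square(observed, deff=2.0)

CRITICAL_DF1_05 = 3.841  # chi-square critical value, 1 df, alpha = 0.05
naive_rejects = naive > CRITICAL_DF1_05
adjusted_rejects = adjusted > CRITICAL_DF1_05
```

With these numbers the naive test rejects at the 5% level while the adjusted test does not, which is what "too liberal" means in practice.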
