Journalism 614: Sampling
Sampling • Probability Sampling • Based on random selection • Non-probability sampling • Based on convenience
Sampling Miscues: Alf Landon for President (1936) • Literary Digest: post cards to voters in 6 states • Correctly predicted elections from 1920-1932 • Names selected from telephone directories and automobile registrations • In 1936, they sent out 10 million post cards • Results picked Landon 57% to Roosevelt 43% • Election: Roosevelt won in one of the largest landslides in U.S. history • Roosevelt took 61% of the vote and won 523-8 in the Electoral College • Why so inaccurate?: Poor sampling frame • Led to selection of wealthy respondents
Sampling Miscues: Thomas E. Dewey for President (1948) • Gallup picked the winner 1936-1944 • Used quota sampling: • Matches sample characteristics to population • Gallup quota sampled on the basis of income • In 1948, Gallup picked Dewey to defeat Truman • Reasons: • 1. Most pollsters quit polling in October • 2. Undecided voters went for Truman • 3. Unrepresentative samples: WWII had changed society since the census
Non-probability Sampling • In situations where sampling frame for randomization doesn’t exist • Types of non-probability samples: • 1. Reliance on available subjects • convenience sampling • 2. Purposive or judgmental sampling • 3. Snowball sampling • 4. Quota sampling
Reliance on Available Subjects • Person on the street, easily accessible • Examples: • Mall intercepts, college students, e-polls • Frequently used, but usually biased • Notoriously inaccurate • Especially in making inferences about larger population, even with many respondents
Purposive or Judgmental Sampling • Dictated by the purpose of the study • Situational judgments about what individuals should be surveyed to make for a useful or representative sample • E.g., Using college students to study third-person effects regarding rap and metal music • 3pe: Others are more affected by exposure than self • Assessing effects on self and others • Using college students makes for homogeneity of self
Snowball Sampling • Used when population of interest is difficult to locate • E.g., homeless people, meth addicts • Researcher collects data from a few people in the targeted group • Initially surveyed individuals are asked to name other people to contact • Good for exploration • Bad for generalizability
Quota Sampling • Begins with a table of relevant characteristics of the population • Proportions of Gender, Age, Education, Ethnicity from census data • Selecting a sample to match those proportions • Problems: • 1. Quota frame must be accurate • 2. Selection within each quota cell is not random, so the sample can match the quotas and still be biased
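The quota procedure above can be sketched in a few lines. This is a minimal illustration, not a description of how Gallup or any pollster actually worked: the function name `fill_quotas`, the cell labels, and the respondent IDs are all hypothetical. It accepts people in the order they happen to arrive until each quota cell is full, which shows why the proportions can match while selection stays non-random.

```python
def fill_quotas(respondents, targets):
    """Accept respondents in arrival order until each quota cell is full.

    respondents: iterable of (respondent_id, cell) pairs (hypothetical data)
    targets: dict mapping cell -> number of respondents wanted in that cell
    """
    counts = {cell: 0 for cell in targets}
    accepted = []
    for respondent_id, cell in respondents:
        # Take the person only if their cell still has room.
        if cell in counts and counts[cell] < targets[cell]:
            counts[cell] += 1
            accepted.append(respondent_id)
        # Stop as soon as every cell has reached its target.
        if counts == targets:
            break
    return accepted


# Hypothetical arrival order: whoever is easiest to reach comes first.
arrivals = [(1, "F"), (2, "F"), (3, "F"), (4, "M"), (5, "M")]
print(fill_quotas(arrivals, {"F": 2, "M": 1}))
```

The accepted list hits the quota exactly, yet it is made up of whoever showed up first, so any bias in who is easy to reach carries into the sample.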
Probability Sampling • Goal: Representativeness • Sample resembles larger population • Random selection • Enhancing likelihood of representative sample • Each unit of the population has a known, nonzero chance of being selected into the sample (an equal chance in simple random sampling)
Population Parameters • Parameter: Summary statistic for the population • E.g., Mean age of the population • Sample allows parameter estimates • E.g., Mean age of the sample • Used as an estimate of the population parameter
Sampling Error • Every time you draw a sample from the population, the parameter estimate will fluctuate slightly • E.g.: • Sample 1: Mean age = 37.2 • Sample 2: Mean age = 36.4 • Sample 3: Mean age = 38.1 • If you draw lots of samples, you would get a normal curve of values
Normal Curve of Sample Estimates • [Figure: normal curve of estimated means from multiple samples. Vertical axis: frequency. Horizontal axis: estimated mean. The peak marks the likely population parameter.]
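A small simulation can make the idea of sampling error concrete. The sketch below assumes a made-up population of 100,000 ages spread evenly between 18 and 90 (not data from the lecture) and a hypothetical helper named `sample_means`. It draws many samples of 400 people and records each sample's mean age.

```python
import random
import statistics


def sample_means(population, sample_size, n_samples, seed=0):
    """Draw repeated simple random samples and return each sample's mean."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical population: ages drawn uniformly from 18 to 90.
pop_rng = random.Random(42)
population = [pop_rng.randint(18, 90) for _ in range(100_000)]

means = sample_means(population, sample_size=400, n_samples=1_000)

print("population mean age:", round(statistics.mean(population), 2))
print("mean of the sample means:", round(statistics.mean(means), 2))
print("spread of the sample means:", round(statistics.stdev(means), 2))
```

Each sample mean differs a little from the others, but the estimates pile up around the population mean. A histogram of `means` would show the bell shape described on the slide.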
Error and Sample Size • As the sample size increases: • The error decreases • In other words, large sample estimate is likely to be closer to the population parameter • As the sample size increases, we get more confident in our parameter estimate
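The link between sample size and error can be shown with the usual standard error formula for a mean, the standard deviation divided by the square root of the sample size. The sketch below assumes a simple random sample and a hypothetical standard deviation of 20 years; neither figure comes from the lecture.

```python
import math


def standard_error(std_dev, sample_size):
    """Standard error of a sample mean under simple random sampling."""
    return std_dev / math.sqrt(sample_size)


# Hypothetical standard deviation of age: 20 years.
for n in (100, 400, 1600, 6400):
    print(f"n = {n:>4}: standard error = {standard_error(20.0, n):.2f} years")
```

Under these assumptions, quadrupling the sample size cuts the standard error in half.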
Confidence Interval • Range around the sample estimate within which we are 95% confident the population parameter falls • For example, we predict that Candidate X will receive 45% of the vote, plus or minus 3 percentage points • We are 95% sure the parameter is between 42% and 48% • This is the “margin of error” in a poll • Confidence interval shrinks as: • Sampling error is smaller • Sample size is larger
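The Candidate X example can be worked through with the standard normal-approximation formula for a proportion. This is a sketch under stated assumptions: a simple random sample, a 95% confidence level (z = 1.96), and a sample of 1,000 respondents. The sample size is an assumption chosen for illustration; the slide does not give one.

```python
import math

Z_95 = 1.96  # z-score for a 95% confidence level


def margin_of_error(p, n, z=Z_95):
    """Margin of error for a sample proportion p from a sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


def confidence_interval(p, n, z=Z_95):
    """Lower and upper bounds of the confidence interval around p."""
    moe = margin_of_error(p, n, z)
    return p - moe, p + moe


# Candidate X polls at 45% among an assumed 1,000 respondents.
low, high = confidence_interval(0.45, 1000)
print(f"margin of error: {margin_of_error(0.45, 1000):.3f}")
print(f"95% confidence interval: {low:.3f} to {high:.3f}")
```

With these inputs the margin of error comes out at roughly 3 percentage points, in line with the 42% to 48% range on the slide.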
Sample Size & Confidence Interval • How precise does the estimate have to be? • More precise: larger sample size • Larger samples increase precision • But at a diminishing rate • Each unit you add to your sample contributes to the accuracy of your estimate • But the amount it adds shrinks with each additional unit
95% Confidence Intervals • [Figure: 95% confidence intervals by sample size]
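The diminishing return from larger samples shows up when the margin-of-error formula is turned around to ask how many respondents a given precision requires. The sketch assumes a simple random sample, a 95% confidence level, and the most conservative case of a 50/50 split; the function name `required_sample_size` is hypothetical.

```python
import math


def required_sample_size(margin, p=0.5, z=1.96):
    """Smallest simple random sample giving the requested margin of error."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


for margin in (0.10, 0.05, 0.03, 0.02):
    n = required_sample_size(margin)
    print(f"plus or minus {margin:.0%}: about {n} respondents")
```

Moving from 5 points to 3 points of precision takes several hundred more respondents, and each further point costs more than the last.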
Describe Sampling Frame • List of units from which sample is drawn • Defines your population • E.g., List of members of population • Ideally you’d like to list all members of your population as your sampling frame • Randomly select your sample from that list • Often impractical to list entire population
Sampling Frames for Surveys • Limitations of the telephone book: • Misses unlisted numbers/mobile numbers • SES and age bias: • Poor people may not have phone • Less likely to have multiple phone lines • Young people have mobile phone numbers • Most studies use a technique such as Random Digit Dialing as a way around this
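Random Digit Dialing can be sketched as picking a working area code and exchange, then filling in the last four digits at random, so that unlisted numbers have the same chance of selection as listed ones. This is a simplified illustration, not the procedure any particular survey organization uses. The prefixes are made up, and the function name `random_digit_dial` is hypothetical.

```python
import random


def random_digit_dial(prefixes, n, seed=0):
    """Generate n phone numbers by adding four random digits to a prefix.

    prefixes: list of "area-exchange" strings (hypothetical values here)
    """
    rng = random.Random(seed)
    numbers = []
    for _ in range(n):
        prefix = rng.choice(prefixes)
        suffix = rng.randrange(10_000)  # 0000 through 9999
        numbers.append(f"{prefix}-{suffix:04d}")
    return numbers


# Made-up prefixes; a real study would use the exchanges in its target area.
for number in random_digit_dial(["608-555", "414-555"], n=5):
    print(number)
```

This simple version can generate the same number twice or produce numbers that are not in service, so a real design would need to screen the output.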
Types of Sampling Designs • Simple Random Sampling • Systematic Sampling • Stratified Sampling • Multi-stage Cluster Sampling
Simple Random Sampling • Establish a sampling frame • A number is assigned to each element • Elements randomly selected into the sample • Use a random number generator
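The steps on this slide map directly onto a few lines of code. The sketch below uses a hypothetical frame of 1,000 names. It treats each element's position in the list as its assigned number and lets a random number generator pick which numbers enter the sample.

```python
import random


def simple_random_sample(frame, n, seed=0):
    """Select n elements from the frame, each with an equal chance."""
    rng = random.Random(seed)
    # Each element's assigned number is its position in the frame.
    chosen_numbers = rng.sample(range(len(frame)), n)
    return [frame[i] for i in chosen_numbers]


# Hypothetical sampling frame of 1,000 people.
frame = [f"person_{i}" for i in range(1, 1001)]
sample = simple_random_sample(frame, 50)
print(sample[:5])
```

`random.sample` draws without replacement, so no one can be selected twice.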
Systematic Sampling • Establish sampling frame • Select every kth element with random start • E.g., 1000 on the list, choosing every 5th name yields a sample size of 200 • Sampling interval: standard distance between selected units on the sampling frame • Sampling interval = pop. size / sample size (here 1000 / 200 = 5) • Sampling ratio: proportion of pop. selected • Sampling ratio = sample size / population size (here 200 / 1000 = 1/5)
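The slide's example of 1,000 names and a sample of 200 can be reproduced in code. This is a minimal sketch in which the frame contents are made up and the function name `systematic_sample` is hypothetical. The random start is drawn from within the first interval.

```python
import random


def systematic_sample(frame, n, seed=0):
    """Select every kth element of the frame after a random start."""
    if not 0 < n <= len(frame):
        raise ValueError("sample size must be between 1 and the frame size")
    rng = random.Random(seed)
    interval = len(frame) // n       # sampling interval k
    start = rng.randrange(interval)  # random start within the first interval
    return frame[start::interval][:n]


# Hypothetical frame of 1,000 names, as in the slide's example.
frame = [f"name_{i}" for i in range(1, 1001)]
sample = systematic_sample(frame, 200)
print(len(sample), sample[:3])
```

With 1,000 names and a sample of 200, the interval works out to 5 and the sampling ratio to 1/5.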
Stratified Sampling • Modification used to reduce potential for sampling error • Researcher ensures that certain groups are represented proportionately in the sample • E.g., If the population is 60% female, stratified sample selects 60% females into the sample • E.g., Stratifying by region of the country to make sure that each region is proportionately represented
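Proportionate stratified sampling can be sketched as grouping the frame by stratum and then drawing a random sample from each group in proportion to its size. The population below is made up to match the slide's 60% female example, and the function name `stratified_sample` is hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, n, seed=0):
    """Draw a proportionate random sample from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        # Each stratum's share of the sample equals its population share.
        # Rounding can make the total differ slightly from n.
        k = round(n * len(members) / len(units))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 600 women and 400 men.
population = [("F", i) for i in range(600)] + [("M", i) for i in range(400)]
sample = stratified_sample(population, stratum_of=lambda unit: unit[0], n=100)
print(sum(1 for sex, _ in sample if sex == "F"), "women,",
      sum(1 for sex, _ in sample if sex == "M"), "men")
```

Selection inside each stratum is still random; only the group proportions are fixed in advance.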
Cluster Sampling • Frequently, there is no convenient way of listing the population for sampling • E.g., Sample of Dane County or Wisconsin • Hard to get a list of the population members • Cluster sample • Sample of census blocks • List of census blocks, list people for selected blocks • Select sub-sample of people living on each block
Multi-stage Cluster Sample • Cluster sampling done in a series of stages: • List, then sample within • Example: • Stage 1: Listing zip codes • Randomly selecting zip codes • Stage 2: List census blocks within selected zip codes • Randomly select census blocks • Stage 3: List households on selected census blocks • Randomly select households • Stage 4: List residents of selected households • Randomly select person to interview
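The four stages above can be sketched as nested random draws. The frame below is entirely made up (8 zip codes, 5 blocks each, 10 households per block, 3 residents per household), and the function name `multistage_sample` is hypothetical. For convenience the whole frame is built up front, whereas a real study would list only the units inside the clusters selected at the previous stage.

```python
import random


def multistage_sample(frame, n_zips, n_blocks, n_households, seed=0):
    """Sample zip codes, then blocks, then households, then one resident."""
    rng = random.Random(seed)
    interviews = []
    # Stage 1: randomly select zip codes.
    for zip_code in rng.sample(sorted(frame), n_zips):
        blocks = frame[zip_code]
        # Stage 2: randomly select census blocks within each zip code.
        for block in rng.sample(sorted(blocks), n_blocks):
            households = blocks[block]
            # Stage 3: randomly select households on each block.
            for household in rng.sample(sorted(households), n_households):
                # Stage 4: randomly select one resident to interview.
                person = rng.choice(households[household])
                interviews.append((zip_code, block, household, person))
    return interviews


# Hypothetical nested frame: zip code -> block -> household -> residents.
frame = {
    f"zip{z}": {
        f"block{b}": {
            f"house{h}": [f"z{z}-b{b}-h{h}-person{p}" for p in range(1, 4)]
            for h in range(1, 11)
        }
        for b in range(1, 6)
    }
    for z in range(1, 9)
}

sample = multistage_sample(frame, n_zips=3, n_blocks=2, n_households=4)
print(len(sample), "interviews; first:", sample[0])
```

The number of interviews is the product of the units chosen at each stage: 3 zip codes times 2 blocks times 4 households, with one person per household.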