Sampling: The Statistical Adventure Begins
Populations Def: Census Sample Which is better? census? sample?
Step 1: Define the Target Population. Must be very specific: What is a user? What demographics matter? Are there geographic boundaries? What is the relevant time period? What is an element?
Step 2: Specify a Sampling Frame Def: Where can you get a sampling frame?
Sampling Frame Error. The list may not match the target population: over-registration, under-registration.
Step 3: Selecting a Sampling Method Probability samples example: Non-probability samples example:
What’s the Big Deal? Probability samples let us estimate _________. We can calculate a confidence interval. So, probability samples are more representative than non-probability samples: true or false?
Simple Random Sampling. A probability sample. Number each unit in the sampling frame. Pick ___ units using a random number table. NOT haphazard.
Take a Simple Random Sample (SRS) of n=3
Element    Attitude toward Motel 6
Natasha    6
Scotty     7
Kalie      4
Lynn       2
Gregory    8
Paul       4
John       7
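The SRS exercise above can be sketched in a few lines of Python. This is only an illustration: the function name `simple_random_sample` and the seed value are assumptions made here, and a pseudo-random generator stands in for the random number table mentioned on the previous slide.

```python
import random

# Sampling frame from the slide: element -> attitude toward Motel 6
frame = {
    "Natasha": 6, "Scotty": 7, "Kalie": 4, "Lynn": 2,
    "Gregory": 8, "Paul": 4, "John": 7,
}

def simple_random_sample(units, n, seed=None):
    """Draw n units without replacement; every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(list(units), n)

# The seed is arbitrary (hypothetical); it only makes the draw repeatable.
chosen = simple_random_sample(frame, n=3, seed=42)
sample_scores = [frame[name] for name in chosen]
sample_mean = sum(sample_scores) / len(sample_scores)
print(chosen, sample_scores, round(sample_mean, 2))
```

Changing the seed gives a different sample and usually a different sample mean, which is the point of the exercise: the statistic varies from sample to sample.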
Stratified Sample. Decide on a stratification variable: you want homogeneity with respect to the dependent variable within each group. Divide the population into a few mutually exclusive and exhaustive strata. Take an SRS from each stratum.
Proportionate Stratified Sample. Choose the sample from the strata in the same proportion as they are in the population. NOTE: Use when you have equal variance within the strata.
Strata    Population proportion    Sample proportion (n=200)
Fresh
Soph
Junior
Senior
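A minimal sketch of proportionate allocation for a sample of n=200 follows. The stratum sizes are hypothetical numbers invented for illustration (the slide leaves the table blank), and `proportionate_allocation` is a name chosen here, not something from the deck.

```python
def proportionate_allocation(stratum_sizes, n):
    """Split a total sample of n across strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    exact = {s: n * size / total for s, size in stratum_sizes.items()}
    alloc = {s: int(v) for s, v in exact.items()}  # round down first
    leftover = n - sum(alloc.values())
    # Give any remaining units to the strata with the largest fractional parts,
    # so the allocation always sums to n.
    by_remainder = sorted(exact, key=lambda s: exact[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc

# Hypothetical enrollment counts, NOT from the slides.
enrollment = {"Fresh": 3000, "Soph": 2500, "Junior": 2500, "Senior": 2000}
allocation = proportionate_allocation(enrollment, n=200)
print(allocation)
```

With these assumed counts, each class year makes up the same share of the sample as it does of the population. An SRS of the allocated size would then be drawn within each stratum.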
Disproportionate Stratified Sample. Take a larger sample from the strata with ________ variance. What is variance? Exercise: Develop two populations with 8 elements each. Population 1: high variance, low mean. Population 2: low variance, high mean.
Disproportionate Stratified Sample
Strata    Variance    Population proportion    Sample proportion
Fresh
Soph
Junior
Senior
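One common rule for disproportionate allocation is Neyman allocation, which sets each stratum's sample size proportional to its population size times its standard deviation. The slides do not name a rule, so treat this as one possible approach rather than the method intended here. The stratum sizes and standard deviations below are hypothetical.

```python
def neyman_allocation(strata, n):
    """strata maps name -> (population size N_h, standard deviation S_h).

    Returns n_h proportional to N_h * S_h. Simple rounding is used, so in
    general the allocations may not sum to exactly n.
    """
    weights = {name: size * sd for name, (size, sd) in strata.items()}
    total = sum(weights.values())
    return {name: round(n * w / total) for name, w in weights.items()}

# Hypothetical values, NOT from the slides: equal-sized strata, unequal spread.
strata = {
    "Fresh": (2500, 4.0),
    "Soph": (2500, 3.0),
    "Junior": (2500, 2.0),
    "Senior": (2500, 1.0),
}
allocation = neyman_allocation(strata, n=200)
print(allocation)
```

Because the strata are the same size in this made-up example, the differences in allocation come entirely from the differences in variability.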
Why use Stratified Samples? To make sure that you include certain subgroups. More precise, IF we use the right stratification variable: the margin of error is ___________, the sampling distribution is __________, confidence intervals are __________. What is the right variable?
Cluster Sampling. Divide the population into lots of heterogeneous clusters. Take an SRS of clusters. Then either: single stage, sample all elements in the selected clusters; OR multi-stage, take an SRS of elements in the selected clusters.
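The single-stage and multi-stage variants differ only in what happens after the clusters are drawn, which a short sketch can make concrete. The cluster data (city blocks containing houses) and the function name `cluster_sample` are hypothetical, chosen only for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster=None, seed=None):
    """Stage 1: SRS of clusters. Stage 2 (optional): SRS of elements within each.

    With n_per_cluster=None every element of each selected cluster is kept
    (single stage); otherwise an SRS of that size is taken inside each
    selected cluster (multi-stage).
    """
    rng = random.Random(seed)
    picked = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in picked:
        elements = clusters[name]
        if n_per_cluster is None:
            sample.extend(elements)
        else:
            k = min(n_per_cluster, len(elements))
            sample.extend(rng.sample(elements, k))
    return picked, sample

# Hypothetical frame: 10 city blocks with 5 houses each.
blocks = {
    f"block_{b}": [f"block_{b}_house_{h}" for h in range(1, 6)]
    for b in range(1, 11)
}

picked_single, single_stage = cluster_sample(blocks, n_clusters=3, seed=1)
picked_multi, multi_stage = cluster_sample(blocks, n_clusters=3, n_per_cluster=2, seed=1)
print(len(single_stage), len(multi_stage))
```

Note that only a list of clusters is needed up front; a list of elements is needed only for the clusters that were actually selected.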
Why use Cluster Samples? Cheap. Easy. Likely to be the way the sampling frame is set up. Problem: not precise, lacks statistical efficiency.
Non-probability Sample: Cannot estimate margin of error. Convenience or accidental sample: select subjects because they are the most convenient or readily available. If the sample size is really large, we know we have a representative sample: true or false?
Judgment or Purposive Sample. Elements are selected because they can serve the research purpose; they are believed to be representative. Snowball sample.
Quota Sample. Attempts to be representative by sampling characteristics in the same proportion as the population. The interviewer chooses the sample. Are these representative? _____
Step 4: Determine the Sample Size. Must take into consideration: cost, time, industry standards, statistical precision. We discuss this in detail in the next chapter.
Step 5: Select Elements. Actually collect the data. Clean up the data (editing). Put the data into the computer.
Characteristics of Interest
                      Population                    Sample
# of elements         N                             n
Mean                  $\mu$ (mu)                    $\bar{X}$ (x bar)
Variance              $\sigma^2$ (sigma squared)    $s_x^2$
Standard deviation    $\sigma$ (sigma)              $s_x$
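Step 6: Estimate the Characteristics of Interest
Sample mean: $\bar{X} = \frac{\text{sum of the sample elements}}{\text{number of elements in sample}} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Sample variance: $s_x^2 = \frac{\text{sum of squared deviations around the mean}}{\text{sample size minus 1}} = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{X})^2$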
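As a worked example of these two estimates, the sketch below computes them by hand for one arbitrary sample of three Motel 6 scores (Natasha 6, Lynn 2, Gregory 8) and cross-checks the variance against Python's `statistics.variance`, which also divides by n - 1. The choice of these three elements is an assumption made for illustration, not a sample drawn in the slides.

```python
from math import sqrt
from statistics import variance

# Illustrative sample of n=3 attitude scores (Natasha, Lynn, Gregory).
sample = [6, 2, 8]
n = len(sample)

# Sample mean: sum of the sample elements / number of elements in the sample.
x_bar = sum(sample) / n

# Sample variance: sum of squared deviations around the mean / (n - 1).
s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)

# Sample standard deviation: square root of the sample variance.
s = sqrt(s2)

# The standard library agrees with the hand computation.
assert abs(s2 - variance(sample)) < 1e-9
print(round(x_bar, 2), round(s2, 2), round(s, 2))
```

Dividing by n - 1 rather than n is what distinguishes the sample variance from the population variance formula.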
Sample Standard Deviation. The square root of the sample variance = $s_x$. Has a specific meaning: think Chebyshev’s Theorem.
Sampling Error. The difference between the population parameter and the sample statistic. We look at confidence intervals to estimate this, but not until the next chapter.
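Because the Motel 6 population from the earlier slide has only seven elements, the sampling error of the mean can be listed for every possible SRS of n=3. The sketch below treats those seven scores as the whole population, which is an assumption made for illustration; in practice the population parameter is unknown.

```python
from itertools import combinations
from statistics import mean

# The seven Motel 6 attitude scores, treated here as the full population.
population = {
    "Natasha": 6, "Scotty": 7, "Kalie": 4, "Lynn": 2,
    "Gregory": 8, "Paul": 4, "John": 7,
}
mu = mean(population.values())  # population parameter

# Sampling error (sample mean minus population mean) for every possible SRS of size 3.
errors = {}
for names in combinations(population, 3):
    x_bar = mean(population[name] for name in names)
    errors[names] = x_bar - mu

worst = max(errors.values(), key=abs)
average_error = mean(errors.values())
print(len(errors), round(mu, 3), round(worst, 3), round(average_error, 6))
```

Individual samples can miss the population mean by a fair amount, but across all possible samples the errors average out to zero, which is why the sample mean is called an unbiased estimator.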
Non-sampling Error (i.e., all other kinds of errors except for sampling error!)
Types of Non-Sampling Error: sampling frame, poor questions, poor branching, item non-response, non-response, interviewer bias, interviewer cheating, coding and editing problems.
Which is the Larger Problem (and why)? Sampling error or non-sampling error?