You want to survey a school

Presentation Transcript


  1. You want to survey a school • You draw your sample from the first day of school student enrollment list • This list would be your ____???____ • Which students are not on this list? • A phenomenon known as? • Potentially problematic because? • (Hint: Dillman, p. 196)

  2. Some reminders… • Population: The group about whom we want to draw our inference • Sample Frame: Members of the population who could potentially be in our sample • Coverage Error: The extent to which members of population are excluded from sample frame (not good)
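
  A minimal sketch of the reminder above, using made-up ID numbers: the population is everyone enrolled at some point in the year, the sample frame is a hypothetical first-day list, and coverage error is the share of the population that the frame leaves out.

```python
# Hypothetical IDs (illustrative only): 1000 students in the population,
# of whom 950 appear on the list used as the sample frame.
population = set(range(1, 1001))
frame = set(range(1, 951))

# Members of the population who can never be drawn into the sample.
not_covered = population - frame
undercoverage_rate = len(not_covered) / len(population)

print(f"{undercoverage_rate:.1%} of the population cannot be sampled")
```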

  3. Welcome… • …to a hopefully productive lesson on SAMPLING METHODOLOGY! • What’s ideal? • Nifty tricks?? • Common misconceptions??? • Limitations of our methods????????? • P.S. We are going to do (some) math and it is going to be FUN!!!

  4. Simple Random Sampling (what’s ideal) • Members of a sample frame, which hopefully includes our entire population, are selected one at a time, independently & without replacement (like drawing names out of a hat) • Sample is equal in expectation to population on all outcomes, but no guarantees
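
  The "names out of a hat" idea can be sketched in a few lines of Python. The frame of student IDs and the helper name `simple_random_sample` are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct members of the frame, each equally likely."""
    rng = random.Random(seed)
    # random.sample draws without replacement, like names out of a hat.
    return rng.sample(frame, n)

# Hypothetical sample frame: 500 student IDs from an enrollment list.
frame = [f"student_{i:03d}" for i in range(500)]
sample = simple_random_sample(frame, 50, seed=1)

print(len(sample), "students drawn, all distinct:", len(set(sample)) == 50)
```

  Fixing the seed only makes the draw reproducible; a different seed gives a different, equally valid sample.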

  5. Stratified Random Sampling (possibly even more ideal) • Use a criterion to divide the sample frame by group membership (e.g. racial category) • Randomly sample within each group • What is the advantage of this procedure?
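
  A sketch of the two steps above (divide the frame by group, then sample at random within each group). The frame, the group labels "A" and "B", and the function name are assumptions made up for this example.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, n_per_stratum, seed=None):
    """Split the frame into strata, then draw a simple random sample in each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in frame:
        strata[stratum_of(member)].append(member)
    # Never ask for more members than a stratum actually contains.
    return {
        label: rng.sample(members, min(n_per_stratum, len(members)))
        for label, members in strata.items()
    }

# Hypothetical frame of (id, group) pairs: 900 in group "A", 100 in group "B".
frame = [(i, "A" if i % 10 else "B") for i in range(1000)]
sample = stratified_sample(frame, stratum_of=lambda m: m[1], n_per_stratum=50, seed=1)

for label, members in sorted(sample.items()):
    print(label, len(members))
```

  Both groups end up with 50 members even though group "B" is only a tenth of the frame, which is the point of stratifying.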

  6. Scenario… • We want to know what percentage of Americans support Obama for president • We need 1100 members from each racial group to be confident about group means (more on this later) • American Indians / Alaskan Natives comprise 1% of our population. • Through simple random sampling, how large of a sample would we theoretically need to reach n = 1100 for this subgroup?
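
  One rough way to work the arithmetic in this scenario: if a subgroup makes up a given share of the population, a simple random sample contains that share of the subgroup on average. The function name is hypothetical, and the result is an expected value, so any single draw could land above or below the target.

```python
def expected_srs_size(target_n, subgroup_share):
    """Total sample size at which the subgroup is expected to reach target_n."""
    return target_n / subgroup_share

# Numbers from the scenario: 1100 needed, subgroup is 1% of the population.
total = expected_srs_size(1100, 0.01)
print(f"about {round(total):,} people in total")
```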

  7. Scenario cont’d… • OR, we could use stratified random sampling and draw 1100 from each subgroup without all this trouble. • BUT, now we have oversampled from American Indians--they are over-represented in our sample! • Implications? • Solutions?

  8. (This data is very fake) • Proportion supporting B.O. • African American: .50 • Asian American: .50 • Latino: .50 • White: .50 • American Indian: 0 • Unweighted avg: ??

  9. Weighting (nifty trick) • Now, let’s do a weighted average instead… • 99% × .50 + 1% × 0 = 49.50% • What’s going on here? • Big difference, eh?
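
  The two averages can be compared side by side. The support figures are the (very fake) ones from the slides; the 1% share for American Indians comes from the scenario, while the split of the remaining 99% across the other four groups is an arbitrary assumption that does not change the result, because those groups all have the same support.

```python
# Fake support proportions from the slides.
support = {
    "African American": 0.50,
    "Asian American": 0.50,
    "Latino": 0.50,
    "White": 0.50,
    "American Indian": 0.00,
}

# Population shares: 1% is from the scenario; the rest is a made-up split of 99%.
share = {
    "African American": 0.13,
    "Asian American": 0.05,
    "Latino": 0.16,
    "White": 0.65,
    "American Indian": 0.01,
}

# Unweighted: every group counts equally, however small it is.
unweighted = sum(support.values()) / len(support)
# Weighted: every group counts in proportion to its population share.
weighted = sum(support[g] * share[g] for g in support) / sum(share.values())

print(f"unweighted: {unweighted:.3f}   weighted: {weighted:.3f}")
```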

  10. So, why was 1100 an ideal subgroup number? • Because no matter how large your population, a sample of 1100 will get you very close to the true population value if your outcome is binary (e.g. Obama: Yes or No) • How come?

  11. Because this man said so • William Sealy Gosset (1876-1937) • Chemist, “math person”, Guinness Brewery worker • A patient man

  12. Yes, a patient man • Using barley (somehow), spent two years empirically studying relationship between sample means and population means. • “The Probable (Standard) Error of a Mean” (1908) • Standard errors are what we use to estimate sampling error

  13. Sampling error • Describes how closely our sample mean allows us to estimate our population mean • Conceptually similar to a confidence interval (Dillman, p. 207; http://www.researchsolutions.co.nz/sample_sizes.htm) • Depends on: population variance (“spread”), estimated by sample variance; sample size; population size (to a point)

  14. Sampling error: big picture • Larger variances and (to a point) larger population sizes require larger samples to estimate the population mean at a given level of precision • Increasing sample size reduces sampling error, BUT there are diminishing returns to increasing our sample size
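
  The "(to a point)" can be illustrated with the standard finite population correction for a simple random sample drawn without replacement. This is a sketch under stated assumptions: a binary outcome at its worst case p = 0.5, a 95% interval (z = 1.96), and population sizes picked only for illustration.

```python
import math

def margin_of_error(n, N=None, p=0.5, z=1.96):
    """95% margin of error for a proportion; N=None treats the population as infinite."""
    se = math.sqrt(p * (1 - p) / n)
    if N is not None:
        # Finite population correction: shrinks the error when n is a
        # noticeable fraction of the population.
        se *= math.sqrt((N - n) / (N - 1))
    return z * se

n = 1100
for N in (2_000, 20_000, 2_000_000, 300_000_000):
    print(f"N = {N:>11,}: +/- {100 * margin_of_error(n, N):.2f} points")
print(f"N =    infinite: +/- {100 * margin_of_error(n):.2f} points")
```

  With these inputs the margin grows as the population gets larger but flattens out quickly, so a city of two million and a country of 300 million need practically the same sample.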

  15. Sampling error: big picture • Diminishing returns? For large populations… • Increasing “n” from 100 to 200 is helpful • Increasing from 500 to 600 is less helpful • Increasing from 1200 to 1300 helps very little (no matter how large the population)
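
  The three comparisons above can be checked numerically. This sketch assumes a binary outcome at its worst case (p = 0.5), a large population, and a 95% interval; those choices are assumptions, not something the slides specify.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion in a large population."""
    return z * math.sqrt(p * (1 - p) / n)

# Each pair adds 100 respondents; compare how much precision that buys.
for small, large in ((100, 200), (500, 600), (1200, 1300)):
    before = 100 * margin_of_error(small)
    after = 100 * margin_of_error(large)
    print(f"n {small:>4} -> {large:>4}: +/- {before:.1f} -> +/- {after:.1f} points "
          f"(gain {before - after:.1f})")
```

  Under these assumptions the first 100 extra respondents cut the margin by roughly three points, while the last 100 cut it by about a tenth of a point.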

  16. Why Diminishing Returns? • Because there is an upper bound (“ceiling”) on the variance of any sample. • For binary (Yes/no, “1” or “0”) outcomes, max variance is .25 • Thus, it’s only a matter of time till more “n” in the denominator makes our standard error very low
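
  The ceiling can be written out explicitly. The symbols here are assumptions introduced for this note: Y is the binary outcome, p the population proportion answering "yes", and \hat{p} the sample proportion from a simple random sample of size n drawn from a large population.

```latex
\operatorname{Var}(Y) = p(1-p) \le \tfrac{1}{4}
\quad (\text{maximum at } p = \tfrac{1}{2}),
\qquad
\operatorname{SE}(\hat{p}) = \sqrt{\frac{p(1-p)}{n}} \le \frac{1}{2\sqrt{n}}

n = 1100: \quad \frac{1}{2\sqrt{1100}} \approx 0.0151,
\qquad 1.96 \times 0.0151 \approx 0.030
```

  So with n = 1100 the worst-case 95% margin of error is about three percentage points, whatever the true value of p.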

  17. Why Diminishing Returns? • Even for continuous outcomes, there is still an upper bound on variance unless scale is infinite • Thus, there are still diminishing returns on increasing “n” • For more on this topic: take S-012; look up confidence intervals in stats books; see “You don't need a large sample of users to obtain meaningful data: Continuous Data (e.g. Task Time)”, http://www.measuringusability.com/sample_continuous.htm

  18. Limitations of sampling error calculations • Do not take coverage error into account! • Assume you have drawn a simple random sample (e.g. do not take “clustering” into account)

  19. Clustering??? • There are 20,000 students in a city with 40 schools. We want a sample of 1100 • Ideally, we would draw students at random from every school. • But, it would be cheaper and easier if we drew a few schools at random and obtained information from every student • Implications?

  20. Clustering??? • If there is a lot of school-level variation in our outcome, the few schools we happen to draw may not be representative, and our sample estimate can land far from the population value. • The sampling error formula does not account for this possibility, so it understates the true uncertainty
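
  A small simulation can suggest how far a whole-school sample may stray compared with a student-level sample of the same size. Everything here is invented for illustration: 40 schools of 500 students each (matching the 20,000 total), school support rates drawn between 0.2 and 0.8, and samples of 1000 (two whole schools) rather than 1100 so that the sizes match exactly.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical city: each school has its own underlying rate for a yes/no outcome.
schools = []
for _ in range(40):
    rate = rng.uniform(0.2, 0.8)
    schools.append([1 if rng.random() < rate else 0 for _ in range(500)])
students = [y for school in schools for y in school]

def srs_estimate(n=1000):
    """Estimate from n students drawn at random across all schools."""
    drawn = rng.sample(students, n)
    return sum(drawn) / len(drawn)

def cluster_estimate(k=2):
    """Estimate from every student in k schools drawn at random."""
    drawn = [y for school in rng.sample(schools, k) for y in school]
    return sum(drawn) / len(drawn)

# Repeat each design many times and compare how much the estimates bounce around.
srs_sd = statistics.pstdev([srs_estimate() for _ in range(200)])
cluster_sd = statistics.pstdev([cluster_estimate() for _ in range(200)])

print(f"spread of estimates, student-level sample: {srs_sd:.3f}")
print(f"spread of estimates, whole-school sample:  {cluster_sd:.3f}")
```

  With this much school-level variation the whole-school design gives far more variable estimates, even though both designs survey the same number of students.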

  21. One more limitation of sampling error formula • Non-response bias • Even if you have drawn a beautifully random sample, your sample estimate will be biased if those who do not return your survey are different on your outcome of interest. • That’s why Dillman’s advice on getting high response rates is so important!
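
  The size of this bias can be sketched with simple arithmetic. The response rates below are hypothetical: the example assumes true support of 50%, with supporters returning the survey twice as often as non-supporters.

```python
def respondent_estimate(p_true, response_rate_yes, response_rate_no):
    """Share of 'yes' answers among those who actually return the survey."""
    yes_returned = p_true * response_rate_yes
    no_returned = (1 - p_true) * response_rate_no
    return yes_returned / (yes_returned + no_returned)

# Hypothetical case: responders differ on the outcome of interest.
biased = respondent_estimate(0.5, response_rate_yes=0.6, response_rate_no=0.3)
# Comparison case: same response rate in both groups, so no bias.
unbiased = respondent_estimate(0.5, response_rate_yes=0.4, response_rate_no=0.4)

print(f"true support 50%, survey says {biased:.1%} (differential response)")
print(f"true support 50%, survey says {unbiased:.1%} (equal response)")
```

  A larger sample does not fix the first case: the estimate converges on the wrong number, which is why the response rate matters.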
