Lecture 2

Lecture 2. Surveys and Sampling. Dr. Richard Bußmann. MIP & lecture materials. You will be able to find an MIP and lecture materials at this link: http://www.uwcentre.ac.cn/hhu/?p=4743

Presentation Transcript


  1. Lecture 2 • Surveys and Sampling • Dr Richard Bußmann

  2. MIP & lecture materials • You will be able to find an MIP and lecture materials at this link: http://www.uwcentre.ac.cn/hhu/?p=4743 • Alternatively, you may obtain them from the desktop of this computer. Just bring your memory stick, plug it in and copy the files.

  3. Effective sampling • We’d like to know about an entire collection of individuals, called a population, but examining all of them is usually impractical, if not impossible. • So we select a smaller group of individuals, a sample, from the population. • Hence, instead of examining everybody, we examine a part of the whole: a fraction, a sample.

  4. Effective sampling • A sample survey is designed to query a small group of people in the hope of learning something about the entire population. • Samples that over- or underemphasize some characteristics of the population are said to be biased. • Suppose we want to survey this class on their opinion of Zhumadian (驻马店), DOTA or beer. • How can we get a biased sample? • How can we get an unbiased sample?

  5. Effective sampling • When a sample is biased, the characteristics of the sample differ from the characteristics of the population it is trying to represent. • For example, if we only ask non-smokers about smoking, we will likely receive a biased survey result. • Therefore, to make a sample as representative as possible, select individuals for the sample at random.

  6. Randomizing • Randomizing protects us by giving us a representative sample even for effects we were unaware of. • Randomizing protects us from the influences of all the features of our population by making sure that on average the sample looks like the rest of the population. • The essential feature of randomness that we need is that the selection is “fair.”

  7. Randomizing • How well can a sample represent the population from which it was selected? • The table below gives results from two samples, each of 8000 individuals drawn at random from the population. • Notice how the means (averages) and proportions are similar on all seven variables.

  8. Randomizing • If a survey is given to multiple random samples, the samples will differ from each other, and so will the responses. • These sample-to-sample differences are referred to as sampling error. • Let’s be clear about what this means. • Don’t take sampling error literally. There’s no true error. What is meant by sampling error is the variation in the survey results between different random samples from one population.

  9. Sampling error • Suppose I survey you on your opinion of DOTA or beer or military training. • Instead of investigating everyone’s opinion, I query 3 groups of 10 randomly selected people. The amount by which the groups’ approval or disapproval of DOTA, beer and military training differs from group to group is the sampling error. • I have chosen groups of 10. How big do you think a representative sample needs to be to survey this class’s opinion of DOTA, beer, or their preference for Wanglaoji (王老吉) over Jiaduobao (加多宝)?
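The three-groups-of-ten idea can be tried out in a few lines. This is only a sketch under stated assumptions: the class size of 150 and the 60% approval rate are made up for illustration, and `approval_rates` is a hypothetical helper, not something from the lecture. Each group is drawn independently, so the same student may appear in more than one group.

```python
import random

def approval_rates(population, group_size=10, n_groups=3, seed=42):
    """Draw n_groups independent random groups and return each group's approval rate."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_groups):
        group = rng.sample(population, group_size)  # one group, drawn without replacement
        rates.append(sum(group) / group_size)
    return rates

# Assumed class of 150 students: 1 = approves of DOTA, 0 = does not (60% approve).
class_opinions = [1] * 90 + [0] * 60

rates = approval_rates(class_opinions)
print("Approval rate per group:", rates)
print("Largest gap between groups:", max(rates) - min(rates))
```

The gap between the groups is the sample-to-sample difference the slide calls sampling error; changing `seed` or `group_size` shows how it moves.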

  10. Sample Size • The size of the sample determines what we can conclude from the data. • This is almost regardless of the size of the population. What fraction of the population you sample doesn’t matter, as long as the sample is large enough and drawn at random.
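A small simulation can make this claim tangible. The sketch below assumes a population in which 60% answer "yes" and measures how much the sample proportion varies across repeated random samples; the function name, the 60% share and the population sizes are all illustrative assumptions.

```python
import random
import statistics

def spread_of_sample_proportions(pop_size, sample_size, share=0.6, n_samples=500, seed=1):
    """Standard deviation of the sample proportion over repeated random samples."""
    rng = random.Random(seed)
    n_yes = int(pop_size * share)  # individuals 0 .. n_yes-1 are the "yes" answers
    proportions = []
    for _ in range(n_samples):
        drawn = rng.sample(range(pop_size), sample_size)  # without replacement
        proportions.append(sum(i < n_yes for i in drawn) / sample_size)
    return statistics.stdev(proportions)

small_pop = spread_of_sample_proportions(pop_size=10_000, sample_size=100)
large_pop = spread_of_sample_proportions(pop_size=1_000_000, sample_size=100)
bigger_sample = spread_of_sample_proportions(pop_size=1_000_000, sample_size=400)
print(small_pop, large_pop, bigger_sample)
```

The first two values should come out close to each other even though one population is a hundred times larger, while the third, with its larger sample, should be clearly smaller.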

  11. Census – Entire population sample • A “sample” that includes the entire population is called a census. • As far as I am aware it was the Romans who started this practice in Europe (to better collect tax). • A census does not always provide the best possible information about the population: • It can be impractical, difficult and expensive to complete a census. • The population we’re studying may change between censuses, which occur only every 10-15 years or so.

  12. Parameters • Models use abstractions, i.e. formulas (mathematics) to represent a reality. • We call the key abstractions in those models parameters. • A parameter used in a model for a population is called a population parameter.

  13. DEFINITIONS • Any summary found from the data is a statistic. • Sometimes, especially when we match statistics with the parameters they estimate, we use the term sample statistic. • A sample that estimates the corresponding parameters accurately is said to be representative.

  14. Sample designs • Simple Random Sample (SRS) • A sample drawn so that every possible sample of the size we plan to draw has an equal chance of being selected is called a simple random sample, usually abbreviated SRS. • With this method each combination of individuals has an equal chance of being selected as well.

  15. Sample designs • Simple Random Sample (SRS) • A sampling frame is a list of individuals from which the sample will be drawn. • Once we have a sampling frame, we can assign a sequential number to each individual in the sampling frame and draw random numbers to identify those to be sampled. • This is akin to me picking 10 of your student IDs from a list and asking you to be my sample.
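The numbering procedure on this slide can be sketched as follows. The student IDs are invented placeholders and `simple_random_sample` is a hypothetical helper written for this illustration; it relies on Python's standard `random.Random.sample`, which draws without replacement.

```python
import random

def simple_random_sample(frame, k, seed=None):
    """Draw an SRS of size k from a sampling frame via sequential numbering."""
    rng = random.Random(seed)
    # Step 1: give every individual in the frame a sequential number 1..N.
    numbered = {number: individual for number, individual in enumerate(frame, start=1)}
    # Step 2: draw k distinct random numbers from 1..N.
    drawn_numbers = rng.sample(range(1, len(frame) + 1), k)
    # Step 3: look up which individuals those numbers belong to.
    return [numbered[n] for n in drawn_numbers]

# Assumed sampling frame: placeholder IDs for a class of 150 students.
student_ids = [f"S{n:03d}" for n in range(1, 151)]
sample = simple_random_sample(student_ids, k=10, seed=7)
print(sample)
```

Passing a `seed` only makes the draw repeatable for teaching purposes; leaving it as `None` gives a fresh sample each run.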

  16. Sample designs • Simple Random Sample (SRS) • An alternative method is to assign a random number to each member of the sampling frame, sort by the random numbers (carrying along the identities of the individuals in the sampling frame), and then pick a random sample of any size off the top of the sorted list. • Sample-to-sample differences in the values for the variables we measure are called sampling variability.
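The sort-based alternative might look like the sketch below. As before, the IDs are placeholders and `srs_by_sorting` is a hypothetical name chosen for this example.

```python
import random

def srs_by_sorting(frame, k, seed=None):
    """Attach a random number to each individual, sort by it, take the top k."""
    rng = random.Random(seed)
    # Pair each individual with its own random number.
    keyed = [(rng.random(), individual) for individual in frame]
    # Sort by the random number, carrying the identities along.
    keyed.sort(key=lambda pair: pair[0])
    # The first k entries of the shuffled list form the sample.
    return [individual for _, individual in keyed[:k]]

# Assumed sampling frame: placeholder IDs for a class of 150 students.
student_ids = [f"S{n:03d}" for n in range(1, 151)]
top_ten = srs_by_sorting(student_ids, k=10, seed=7)
top_twenty = srs_by_sorting(student_ids, k=20, seed=7)
print(top_ten)
```

One practical consequence of this design is that a sample "of any size" comes off the same sorted list: with the same random numbers, the sample of 20 simply extends the sample of 10.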

  17. Sample designs • Stratified Sampling • First we slice the population into homogeneous (均匀, jūnyún) groups, called strata. We then use simple random sampling within each stratum and combine the results at the end. This is called stratified random sampling. • The plural of “stratum” is “strata”, just like the plural of “equilibrium” is “equilibria”. • Reduced sampling variability is the most important benefit of stratifying.

  18. Stratifying you • Let’s stratify you. • Suppose we want to investigate the gaokao (高考, national college entrance exam) results of this class (150 students), but know that students from different provinces need different marks to attend Huanghuai University (黄淮学院). • If our sample of 31 students happened (by accident) to include only students from Beijing, Shanghai and Tianjin, it might not be representative of our student population. • Hence we stratify our sampling by province and then sample within each province.

  19. Stratifying you • We stratify our sampling geographically and then pick one observation from each province to get 31 observation points. • Hence we can be sure that scores from every region are represented in our sample. • This accounts for geographic variation, so our results should be fairly similar, i.e. less diverse, in repeated sampling. • Therefore our sampling variation (and hence sampling error) should fall.
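The one-student-per-province scheme can be sketched in code. Everything concrete here is an assumption made for illustration: the province labels are placeholders, students are spread over the 31 provinces artificially, and `stratified_sample` is a hypothetical helper.

```python
import random
from collections import defaultdict

def stratified_sample(individuals, stratum_of, per_stratum=1, seed=None):
    """Group individuals into strata, then draw an SRS inside each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for individual in individuals:
        strata[stratum_of(individual)].append(individual)
    sample = []
    for name in sorted(strata):  # fixed order keeps the result reproducible
        members = strata[name]
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample

# Assumed class: 150 students spread over 31 provinces (labels are placeholders).
students = [{"id": f"S{n:03d}", "province": f"province_{n % 31:02d}"} for n in range(150)]
picked = stratified_sample(students, stratum_of=lambda s: s["province"], per_stratum=1, seed=3)
print(len(picked), "students from", len({s["province"] for s in picked}), "provinces")
```

Because the draw happens inside each stratum, no province can be left out by chance, which is the property the slide relies on.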

  20. More sample designs • Cluster and Multistage Sampling • Splitting the population into parts or clusters that each represent the population and performing a census within one or a few clusters at random is called cluster sampling. • This is the UK’s method of conducting a population census. • Sampling schemes that combine several methods are called multistage samples.
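A minimal sketch of cluster sampling, assuming the clusters are already known: pick a few clusters at random, then take everyone in them. The dormitory data is invented for illustration and `cluster_sample` is a hypothetical helper.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick n_clusters clusters at random and take a census within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # random choice of cluster names
    return {name: list(clusters[name]) for name in chosen}  # everyone in each chosen cluster

# Assumed clusters: six dormitories with five placeholder students each.
dorms = {f"dorm_{d}": [f"dorm_{d}_student_{i}" for i in range(1, 6)] for d in "ABCDEF"}
surveyed = cluster_sample(dorms, n_clusters=2, seed=5)
for name, members in surveyed.items():
    print(name, members)
```

Note the contrast with stratified sampling: there every group contributes a few individuals, whereas here a few groups contribute all of their individuals.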

  21. More sample designs • Systematic Samples • A systematic sample is created by selecting individuals systematically. • For example, we might select every tenth person on an alphabetical list of employees. • To make sure our sample is random, we still must start the systematic selection with a randomly selected individual.
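The every-tenth-person example, including the random starting point, can be sketched as below. The employee list is a placeholder and `systematic_sample` is a hypothetical helper.

```python
import random

def systematic_sample(frame, step, seed=None):
    """Take every step-th individual, starting from a random position in the first step."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # random start among positions 0 .. step-1
    return frame[start::step]

# Assumed frame: an alphabetical list of 100 placeholder employee names.
employees = sorted(f"employee_{n:03d}" for n in range(100))
every_tenth = systematic_sample(employees, step=10, seed=11)
print(every_tenth)
```

Without the random start the same individuals would be chosen every time, so nobody outside positions 0, 10, 20, and so on would ever have a chance of selection.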

  22. Sampling in REALITY • When gathering real data, things can be a bit messier than in this somewhat idealized setting presented so far. • Here are some things to consider: • The population may not be as well-defined as it seems. • Even when the population is clear, it may not be possible to establish an appropriate sampling frame.

  23. Sampling in REALITY • Usually, the practical sampling frame is not the group you really want to know about. • You won’t get responses from everyone your design selects. • The who of our study keeps changing, and each constraint can introduce biases.

  24. Bar Statistics • Researchers wait outside a bar they randomly select from a list of bars. They stop every 10th person who comes out of the bar and ask whether he or she thinks drinking and driving is a serious problem. • Identify the population of interest, population parameter, sampling frame and method. • Population of interest – • Population parameter – • Sampling frame – • Method –

  25. Bar Statistics • Population of interest – Adults • Population parameter – Proportion who think drinking and driving is a serious problem • Sampling frame – Bar patrons • Method – Systematic sampling • What problems can you foresee with this sample setup?

  26. Written surveys • A business magazine mails a questionnaire to the HR (human resources) directors of all Fortune 500 companies, and receives responses from 23% of them. Those responding report that they do not think that such surveys intrude significantly on their workday. • Identify the population of interest, population parameter, method, and any biases.

  27. Written surveys • Population of interest – Fortune 500 HR directors • Population parameter – Proportion who don’t feel surveys intrude on their workday • Method – Non-random • Bias – Nonresponse. Also hard to generalize, because who responded is directly related to the question.

  28. Survey Design • Example: Amusement Park Riders • An amusement park has opened a new roller coaster. It is so popular that people are waiting for up to 3 hours for a 2-minute ride. Concerned about how patrons feel about this, the park surveys every 10th person in the line for the roller coaster, starting from a randomly selected individual. Identify the sampling frame. Is the sample likely to be representative? Is it biased?

  29. Survey design • Sampling frame – Patrons in line on that day at that time. • Representative – No. Only those who think it worth the wait are likely to be in line. Also, those who don’t like roller coasters aren’t in the sampling frame, so the poll will not get a fair picture of how park patrons feel about long lines for roller coaster rides.

  30. Valid surveys • A survey that can yield the information you need about the population in which you are interested is a valid survey. • To help ensure a valid survey, you need to ask four questions: • What do I want to know? • Who are the right respondents? • What are the right questions? • What will be done with the results?

  31. Valid surveys • Know what you want to know. • You must be clear about what you hope to learn and about whom you hope to learn it. • Perhaps the most common error is to ask unnecessary questions. • Don’t bore people to death. They will strike back!

  32. Valid surveys • Use the right sampling frame. • A valid survey obtains responses from appropriate respondents. • It is important to be sure that your respondents actually know the information you hope to discover. • There is no point in asking Muslims about the taste of pork.

  33. Valid Surveys • Ask specific rather than general questions. • Watch for biases. • If individuals who don’t respond have common characteristics, your sample will suffer from nonresponse bias and will no longer represent the population. • When respondents volunteer to participate, individuals with the strongest feelings on either side of an issue are more likely to respond; those who don’t care may not bother, creating a voluntary response bias.

  34. Be careful with question phrasing! • A respondent may not understand the question the way the researcher intended it. • A respondent may be offended by the question as it was posed. • A respondent may be intimidated by the survey or the method of surveying. • If I put a pistol to your head, you may not provide me with an objective answer to my question.

  35. Valid Surveys • Be careful with answer phrasing, too! • Inaccurate responses, known as measurement errors, occur when the question does not take into account all possible answers. • Some answers may be beyond the consideration of the survey author, due to the author’s own bias. • The best way to prevent measurement errors is a pilot test, in which a small sample is drawn from the sampling frame, and a draft form of the survey instrument is administered.

  36. Valid surveys • Example: Biased Questions • The following question appears on a sample survey. Is the question biased? If so, how? Suggest changes. • “Should companies that pollute the environment be compelled to pay the costs of cleanup?” • Question Biased – • Suggested changes –

  37. Framing questions • “Should companies that pollute the environment be compelled to pay the costs of clean-up?” • Question biased – Yes, because of the word “pollute”. The question also lacks specifics. • Possible alternatives: • “Should companies be responsible for the cost of environmental clean-up?” • “Should companies shoulder the cost of environmental pollution that occurs during their normal course of business?”

  38. Biased Questions • The following questions appear on a sample survey. Are they biased? If so, suggest changes. • “Do you think that price or quality is more important in selecting an MP3 player?” • “Do you think that price or quality is more important in selecting an iPad?” • “What is your favourite car brand?” • “Do you like my class?”

  39. Bias • “Do you think that price or quality is more important in selecting an MP3 player?” – Seems like a fair question to me. • “Do you think that price or quality is more important in selecting an iPad?” – This question only considers iPads and no tablets by competing manufacturers. • “What is your favourite car brand?” – The BMW logo reminds you of BMW and influences your decision. • “Do you like my class?” – That may be the first question I ask. How about if I ask for your student ID next?

  40. Bias in wordings • The following question appears on a sample survey. Is the question biased? If so, suggest changes. • “Should a company enforce a strict dress code?”

  41. Bias in wordings • “Should a company enforce a strict dress code?” • Question bias – Words such as “enforce” and “strict” bias the question toward the answer “no”. • Suggested changes – “Should companies have (formal) dress codes?” is a better phrasing.

  42. Rather Poor sampling techniques • Voluntary Response Sample • In a voluntary response sample, a large group of individuals is invited to respond, and all who do respond are counted. • Voluntary response samples are almost always biased, and so conclusions drawn from them are almost always wrong.
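A toy simulation shows how voluntary response can distort a result. All numbers are assumptions chosen for illustration: 70% of the population is satisfied, but dissatisfied people are assumed to respond five times as often as satisfied ones. `voluntary_response_share` is a hypothetical helper.

```python
import random

def voluntary_response_share(population, respond_prob, seed=0):
    """Share of 'satisfied' (1) among those who choose to respond."""
    rng = random.Random(seed)
    # Each person decides independently whether to respond, based on their opinion.
    respondents = [opinion for opinion in population if rng.random() < respond_prob[opinion]]
    return sum(respondents) / len(respondents)

# Assumed population of 10,000: 1 = satisfied (70%), 0 = dissatisfied (30%).
population = [1] * 7000 + [0] * 3000
true_share = sum(population) / len(population)

# Assumed response behaviour: 10% of the satisfied respond, 50% of the dissatisfied.
observed_share = voluntary_response_share(population, respond_prob={1: 0.1, 0: 0.5})
print("True share satisfied:", true_share)
print("Share among volunteers:", observed_share)
```

Under these assumptions the volunteers look far less satisfied than the population really is, even though thousands of people were invited. A larger invitation list does not fix this; only a different selection mechanism does.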

  43. Rather Poor sampling techniques • Convenience Sampling • In convenience sampling we simply include the individuals who are convenient to sample. • For example, those willing to fill in lengthy (and boring) surveys. • Unfortunately, this group may not be representative of the population.

  44. Rather Poor sampling techniques • Bad Sampling Frame • An SRS from an incomplete sampling frame introduces bias, because the individuals included may differ from the ones not in the frame.

  45. Rather Poor sampling techniques • Undercoverage • Many survey designs suffer from undercoverage. • That is, some portion of the population is not sampled at all or has a smaller representation in the sample than it has in the population. • Rather than sending out a large number of surveys for which the response rate will be low, it is often better to design a smaller, randomized survey for which you have the resources to ensure a high response rate.

  46. Rather Poor sampling techniques • Undercoverage • Suppose we sample your grandparents on how they perceive your education. • Suppose half of them are sent a written survey. • The other half is visited and interviewed by me. • I can imagine a couple of problems, can you?

  47. Rather Poor sampling techniques • Example: Sampling Methods • We want to know what percentage of local doctors accept red envelopes (红包, cash gifts) from patients. We call the offices of 50 doctors who advertised in a local newspaper. • What is the sampling method? • Is this sampling method appropriate? If not, identify the problem. • Do you think the question is any good (to ask via the phone)? How would you tackle it?

  48. Rather Poor sampling techniques • Method appropriate – Doctors aren’t selected from all registered doctors, but only from the advertising population. These doctors can’t represent the population of doctors. • Question – Doctors would have to be really stupid to admit to corruption and tax dodging on the phone. Trial-and-error doctor visits may be more effective.

  49. Rather Poor sampling techniques • Example: Sampling Methods • We want to know what percentage of local businesses anticipate hiring additional employees in the upcoming months. We randomly select a page in the local Yellow Pages and call every business listed there. • Is this sampling method appropriate? If not, identify the problem.

  50. Rather Poor sampling techniques • Example: Sampling Methods • We want to know what percentage of local businesses anticipate hiring additional employees in the upcoming months. We randomly select a page in the local Yellow Pages and call every business listed there. • Method appropriate – Not appropriate. This cluster sample will probably contain listings for only one or two business types.
