Behave how you know perfectly well you are expected to. - Sign in Room 218
Collecting Samples Chapter 2.3 – In Search of Good Data Mathematics of Data Management (Nelson) MDM 4U
Why Sampling? • sampling is done because a census is too expensive or time consuming • the difficulty is being confident that the sample represents the population accurately • convenience sampling occurs when you simply take data from the most convenient place (for example collecting data by walking around the hallways at school) • convenience sampling is not representative
Random Sampling • representative sampling almost always uses random samples • random numbers are described as numbers that occur without pattern • random events are events that are considered to occur by chance • random numbers can be generated using a calculator, computer or random number table • random choice is used as a method of selecting members of a population without introducing bias
Simple Random Sampling • this method requires that every member of the population be equally likely to be selected and that every possible combination of members be equally likely • the sample is likely to be representative of the population • if it isn’t, this is due to chance alone • example: put the entire population’s names in a hat and draw the required number
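The "names in a hat" idea can be sketched in Python. This is a minimal illustration, not part of the original slides: the roster of 800 names and the helper name `simple_random_sample` are made up for the example, and the seed is only there so the draw can be repeated.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size distinct members without replacement.

    random.sample gives every member, and every possible group of
    that size, the same chance of being chosen.
    """
    rng = random.Random(seed)
    return rng.sample(population, sample_size)

# Hypothetical roster of 800 students (names invented for illustration)
roster = [f"student_{i:03d}" for i in range(1, 801)]
chosen = simple_random_sample(roster, 200, seed=1)
print(len(chosen), "students selected")
```

Changing or removing the seed gives a different, equally valid sample.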
Systematic Random Sampling • you sample a fixed percent of the population by choosing a random starting point and then selecting every nth individual • n is the sampling interval (population size divided by sample size) • example: you decide to sample 10% of 800 people, so the sample size is 80 and n = 800 / 80 = 10. Generate a random number between 1 and 10, start at that person and sample every 10th person after that
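The 10% of 800 example can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, people are represented by their position on a list, and it assumes the population size divides evenly by the sample size.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Pick a random start, then take every n-th member of an ordered list."""
    rng = random.Random(seed)
    # Sampling interval n = population size / sample size (assumed to divide evenly)
    interval = len(population) // sample_size
    # Random 1-based starting position between 1 and n
    start = rng.randint(1, interval)
    return population[start - 1::interval][:sample_size]

# People numbered 1..800 in list order (hypothetical list)
people = list(range(1, 801))
sample = systematic_sample(people, 80, seed=2)
print("first person:", sample[0], "sample size:", len(sample))
```

Only the starting point is random; once it is chosen, the rest of the sample is fixed by the interval.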
Stratified Random Sampling • the population is divided into groups called strata (which could be MSIPs or grades) • a simple random sample is taken from each stratum, with the size of each sample proportional to the size of the stratum • example: sample CPHS students by MSIP, with samples randomly drawn from each MSIP (the number drawn is proportional to the size of the MSIP)
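Proportional allocation across strata might look like the sketch below. The grade sizes are invented for illustration (they are not from the slides), and the function name is hypothetical. Rounding each share can make the total differ slightly from the target when the proportions do not work out evenly.

```python
import random

def stratified_sample(strata, total_sample_size, seed=None):
    """Take a simple random sample from each stratum, sized in proportion to it."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Stratum share = total sample size * (stratum size / population size)
        share = round(total_sample_size * len(members) / population_size)
        sample[name] = rng.sample(members, share)
    return sample

# Hypothetical school of 800 students split by grade
sizes = {"grade 9": 220, "grade 10": 200, "grade 11": 200, "grade 12": 180}
grades = {g: [f"{g} student {i}" for i in range(1, n + 1)] for g, n in sizes.items()}

picked = stratified_sample(grades, 200, seed=3)
for grade, students in picked.items():
    print(grade, len(students))
```

With these assumed sizes, a grade of 200 students out of 800 contributes a quarter of the 200-person sample.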
Cluster Random Sampling • the population is organized into groups (like MSIPs or schools) • groups are randomly chosen for sampling and then all members of the chosen groups are surveyed • example: student attitudes could be measured by randomly choosing schools from across Ontario and then surveying all students in these schools
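A rough sketch of cluster sampling, assuming a hypothetical school with 32 MSIP rooms of 25 students each (these numbers and names are illustrative only): whole rooms are chosen at random and everyone in a chosen room is surveyed.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly choose whole clusters, then survey every member of each one."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), num_clusters)
    surveyed = [member for name in chosen_names for member in clusters[name]]
    return chosen_names, surveyed

# Hypothetical school: 32 MSIP rooms with 25 students each
msips = {
    f"MSIP_{room:02d}": [f"MSIP_{room:02d}_student_{s:02d}" for s in range(1, 26)]
    for room in range(1, 33)
}

rooms, students = cluster_sample(msips, 8, seed=4)
print(len(rooms), "rooms chosen,", len(students), "students surveyed")
```

The randomness is applied to the groups, not to the individuals inside them.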
Multistage Random Sampling • groups are randomly chosen from a population and then individuals within these groups are randomly chosen to be surveyed • example: to understand student attitudes a school might randomly choose MSIPs, and then randomly choose students from within these MSIPs
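Multistage sampling adds a second random draw inside each chosen group. The sketch below assumes the same kind of hypothetical school (32 MSIP rooms of 25 students, names invented) and a made-up helper name; it is one possible way to code the idea, not a prescribed method.

```python
import random

def multistage_sample(groups, num_groups, per_group, seed=None):
    """Stage 1: randomly choose groups. Stage 2: randomly choose members within each."""
    rng = random.Random(seed)
    chosen_groups = rng.sample(sorted(groups), num_groups)
    return {name: rng.sample(groups[name], per_group) for name in chosen_groups}

# Hypothetical school: 32 MSIP rooms with 25 students each
msip_rooms = {
    f"MSIP_{room:02d}": [f"MSIP_{room:02d}_student_{s:02d}" for s in range(1, 26)]
    for room in range(1, 33)
}

selection = multistage_sample(msip_rooms, num_groups=20, per_group=10, seed=5)
total = sum(len(members) for members in selection.values())
print(len(selection), "rooms,", total, "students")
```

Compared with cluster sampling, more groups are visited but only some members of each are surveyed.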
Destructive Sampling • sometimes the act of sampling will restrict the ability of a surveyor to return the element to the population • example: cars used in crash tests cannot be used again for the same purpose • example: individuals may acquire learning during sampling that would introduce bias if they were used again (like taking a test twice)
Example: do students at CPHS want a longer lunch? • Simple Random Sampling • have a computer randomly select 200 names and interview each of these students • Systematic Random Sampling • sampling interval = 800 / 200 = 4 • generate a random number from 1 to 4 • start with that number on the list and interview every 4th person after that
Example: do students at CPHS want a longer lunch? • Stratified Random Sampling • group students by grade and have a computer generate a random group of names from each grade to interview • the number of students interviewed from each grade is not equal; rather, it is proportional to the size of the grade • if there were 200 Grade 10s among the 800 students, we would need to interview 50 of them (200 / 800 of the 200-student sample)
Example: do students at CPHS want a longer lunch? • Cluster Random Sampling • randomly choose enough MSIPs to sample 200 students • if there are 25 students per MSIP, we would need 8 MSIPs (8 x 25 = 200) • interview every student in each of these rooms
Example: do students at CPHS want a longer lunch? • Multistage Random Sampling • group students by MSIP • randomly choose 20 MSIPs • randomly choose 10 students from each MSIP (20 x 10 = 200) • interview each of these students
Sample Size • the size of the sample will have an effect on the reliability of the results • the larger the better • factors: • variability in the population (the more variation, the larger the sample required to capture that variation) • degree of precision required for the survey • the sampling method chosen
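One way to see why larger samples are more reliable is the standard error of a sample proportion, sqrt(p(1 - p) / n). This formula is not in the slides; it is added here as an illustration and assumes a simple random sample from a population much larger than the sample. The value of p (the true proportion answering "yes") is a made-up input.

```python
import math

def standard_error(p, n):
    """Standard error of a sample proportion for a simple random sample of size n."""
    return math.sqrt(p * (1 - p) / n)

# Larger samples shrink the typical error of the estimate
for n in (50, 200, 800):
    print(f"n = {n:3d}: standard error = {standard_error(0.5, n):.4f}")

# Less variable populations (p far from 0.5) need fewer people for the same precision
print(f"p = 0.1, n = 200: standard error = {standard_error(0.1, 200):.4f}")
```

Quadrupling the sample size halves the standard error, so extra precision becomes increasingly expensive.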
Techniques for Experimental Studies • Experimental studies are different from studies where a population is sampled as it exists • in experimental studies some treatment is applied to some part of the population • however, the effect of the treatment can only be known in comparison to some part of the population that has not received the treatment
Vocabulary • treatment group • the part of the experimental group that receives the treatment • control group • the part of the experimental group that does not receive the treatment
Vocabulary • placebo • a treatment with no active effect, given to the control group to reduce bias in the experiment • subjects do not know whether they are receiving the real treatment or not (why?) • double-blind test • in this case, neither the subjects nor the researchers doing the testing know who has received the treatment (why?)
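Splitting subjects into a treatment group and a control group is commonly done at random, so that neither the subjects nor the researchers choose who gets the treatment. The slides do not state how the groups are formed, so the sketch below is an assumption; the subject IDs and the function name are hypothetical.

```python
import random

def assign_groups(subjects, seed=None):
    """Randomly split subjects into a treatment group and a control group."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the original list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical subject IDs; in a double-blind test only these codes would be visible
subject_ids = [f"S{i:02d}" for i in range(1, 41)]
treatment, control = assign_groups(subject_ids, seed=6)
print(len(treatment), "receive the treatment;", len(control), "receive the placebo")
```

In a double-blind test, the list mapping IDs to groups would be kept by someone not involved in giving the treatment or measuring the results.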
Exercises • try page 99 #1,5,6,10,11 • for 6b, see example 1 on page 95
Creating Questions Chapter 2.4 – In Search of Good Data Mathematics of Data Management (Nelson) MDM 4U
Surveys • these are commonly used in data collection • can be conducted by interview, mail-in, telephone, internet • they are a series of carefully designed questions • bad questions lead to bad data • good questions may create good data
Question Styles Open Questions • respondents answer in own words • gives a wide variety of answers • may be difficult to interpret • offer the possibility of gaining data you did not know existed • sometimes used in preliminary collection of information, to gain a sense of what is going on and possibly define the categories of data you will end up studying
Question Styles Closed Questions • questions that require the respondent to select from pre-defined categories of responses • responses may be easily analyzed • the options presented may bias the result • the options may not represent the population, so the researcher may miss what is going on • sometimes used after an initial open-ended survey, once the researcher has identified data categories
Types of Survey Questions • Information • ex: circle the correct response • Gender M F • Checklist • ex: Subjects currently being taken (check all that apply): □ Math □ Computer Science □ Music
Types of Survey Questions • Ranking Questions • ex: rank the following in order of importance (1 = most important, 3 = least important) • __ Health Care __ Security __ Tax Relief • Rating Questions • ex: How would you rate your teacher? □ Great □ Fabulous □ Incredible □ Outstanding
Questions should… • Be simple, relevant, specific, readable • Be written without jargon/slang, abbreviations, acronyms, etc. • Not lead the respondents • Allow for all possible responses on closed Qs • Be sensitive to the respondents • Not be open to interpretation • Be as brief as possible
Exercises • try page 105 #1, 2 abc, 4, 5, 8, 9, 12