5.1 Objectives • Define population and sample • Explain how sampling differs from a census • Explain what is meant by a voluntary response sample • Give an example of a voluntary response sample • Explain what is meant by convenience sampling • Define what it means for a sampling method to be biased • Define, carefully, a simple random sample • List the four steps involved in choosing an SRS • Explain what is meant by systematic random sampling • Use a table of random digits to select a simple random sample
5.1 Objectives • Define a probability sample • Given a population, determine the strata of interest, and select a stratified random sample • Define a cluster sample • Define undercoverage and nonresponse as sources of bias in sample surveys • Give an example of response bias in a survey question • Write a survey question in which the wording of the question is likely to influence the response • Identify the major advantage of large random samples
Chapter 5 Producing Data
To prepare for today…. • Read Pages 329-332.
Observational Study versus Experiment • Observational Study • Observe individuals and measure variables of interest but do not attempt to influence the response. • Experiment • Deliberately impose some treatment on individuals in order to observe their responses. • A response variable measures an outcome of a study. • An explanatory variable helps explain or influences changes in a response variable.
Population versus Sample • Population • The entire group of individuals that we want information about. • Sample • A part of the population that we actually examine in order to gather information
Census…Sampling • Census • This attempts to contact every individual in the entire population • Sampling • This involves studying part in order to gain information about the whole
Why sample? • What are some reasons why we use samples instead of a census?
Example • Students as customers. • A committee on community relations in a college town plans to survey local businesses about the importance of students as customers. From telephone book listings, the committee chooses 150 businesses at random. Of these, 73 return the questionnaire mailed by the committee. • What is the population for this sample survey? • What is the sample? • What proportion of the businesses chosen did not respond?
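The nonresponse question comes down to simple arithmetic on the two counts given in the example. A quick check in Python (the variable names are ours, chosen for illustration):

```python
# Counts taken from the example above.
businesses_sampled = 150        # businesses chosen at random from the listings
questionnaires_returned = 73    # businesses that mailed the questionnaire back

nonrespondents = businesses_sampled - questionnaires_returned
nonresponse_rate = nonrespondents / businesses_sampled

print(nonrespondents)               # 77
print(round(nonresponse_rate, 3))   # 0.513
```

So roughly half of the businesses chosen did not respond, which is worth keeping in mind when the slides turn to nonresponse bias.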
Types of Samples • Voluntary Response Sample • Consists of people who choose themselves by responding to a general appeal. • Biased because people with strong opinions (especially negative opinions) are most likely to respond.
Types of Samples • Convenience Sampling • Choosing individuals that are easiest to reach.
Bias • A sampling method is biased if it systematically favors certain outcomes. • A statistic is said to be unbiased if the mean of its sampling distribution equals the true value of the parameter being estimated.
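To make the idea of an unbiased statistic concrete, here is a small sketch. It lists every possible simple random sample of size 2 from a tiny made-up population and checks that the sample means average out to the population mean. The population values are invented for illustration only.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population of five values (made up for illustration).
population = [2, 4, 6, 8, 10]
n = 2

# Under simple random sampling every set of n individuals is equally
# likely, so the sampling distribution is just the list of all such sets.
all_samples = list(combinations(population, n))
sample_means = [mean(s) for s in all_samples]

population_mean = mean(population)
mean_of_sample_means = mean(sample_means)

print(len(all_samples))                          # 10
print(population_mean == mean_of_sample_means)   # True
```

Individual samples miss the population mean in both directions, but the misses cancel on average. A biased method would miss in one direction systematically.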
Bad Sampling Techniques • Call in to voice your opinion about gun control. Each call costs 50 cents. • Vote for the top album of the year by texting 3334. Text messaging fees apply. • National Health Care Program
Assignment • Assignment: Pg. 333 - 334: 5.1, 5.3, 5.5, 5.7, 5.8 • Read 334-341, 343-349
Sampling Methods • Simple Random Sample (SRS) • An SRS of size n consists of n individuals from the population chosen in such a way that every set of n individuals has an equal chance to be the sample actually selected. • Think of putting everyone’s name in a hat and drawing.
Random Digits • A table of random digits is a long string of the digits 0 – 9 with these two properties • Each entry in the table is equally likely to be any of the 10 digits • The entries are independent of each other
Using the table to select a SRS • Step 1: Label • Assign a numerical label to every individual in the population • Step 2: Table • Use Table B (back cover) to select numbers at random • Step 3: Stopping Rule • Indicate when you should stop sampling • Step 4: Identify Sample • Use the labels to identify the subjects selected in the sample
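The four steps above can be sketched in Python. This is a minimal illustration, not the textbook's procedure verbatim: the function name `srs_from_digits` is ours, labels are assumed to run from 1 up to the population size, and the digit row is made up rather than copied from Table B.

```python
def srs_from_digits(digits, population_size, sample_size):
    """Select labels for an SRS by reading a string of random digits.

    Labels run from 1 to population_size (Step 1). Digits are read in
    groups as wide as the largest label (Step 2). Groups that fall outside
    the label range, or that repeat an earlier pick, are skipped.
    """
    width = len(str(population_size))
    cleaned = "".join(ch for ch in digits if ch.isdigit())
    chosen = []
    for i in range(0, len(cleaned) - width + 1, width):
        label = int(cleaned[i:i + width])
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:   # Step 3: stopping rule
            return chosen                # Step 4: labels identify the sample
    raise ValueError("ran out of digits before the sample was complete")


# Made-up row of digits, for illustration only (not a line of Table B).
row = "60413 21785 09962 14530"
labels = srs_from_digits(row, population_size=28, sample_size=3)
print(labels)   # [17, 9, 21]
```

Reading the row two digits at a time gives 60, 41, 32, 17, 85, 09, 96, 21, and so on. Only 17, 09, and 21 are valid labels for a population of 28, so those three individuals form the sample.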
Example • Spring Break Destinations. • A campus newspaper plans a major article on spring break destinations. The authors intend to call a few randomly chosen resorts at each destination to ask about their attitudes toward groups of students as guests. • The table contains the resorts listed in one city. The first step is to label this population as shown. • Enter Table B at line 140, and choose three resorts.
Now Using Technology Demonstration of Generating Random Integers
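One way to run such a demonstration in software is with Python's standard `random` module. This is a sketch under assumptions: the slides do not say which technology was used, and the population size of 28 and the seed are arbitrary choices made here.

```python
import random

# Hypothetical population: 28 individuals labeled 1 through 28.
labels = list(range(1, 29))

# A fixed, arbitrary seed makes the run repeatable; drop it for a fresh draw.
rng = random.Random(140)

# sample() draws without replacement, so no label can be picked twice.
sample = rng.sample(labels, 3)

print(sorted(sample))
```

Because `sample` draws without replacement and treats every label alike, each set of three labels has the same chance of being chosen, which matches the definition of an SRS.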
Sampling Methods • Probability Sample • Sample chosen by chance. • We must know what samples are possible and what chance, or probability, each possible sample has.
Sampling Methods • Stratified Random Sample • First divide the population into groups of individuals (called strata). • Individuals in each stratum are similar in some way that is important to the response. • Then choose a separate SRS in each stratum and combine these SRSs to form the full sample.
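A stratified random sample can be sketched as one SRS per stratum, combined at the end. The function name, the strata, and the sample sizes below are all hypothetical and chosen only to illustrate the idea.

```python
import random


def stratified_sample(strata, sizes, seed=None):
    """Draw a separate SRS from each stratum and combine the results.

    strata: dict mapping stratum name -> list of individuals
    sizes:  dict mapping stratum name -> number to draw from that stratum
    """
    rng = random.Random(seed)
    sample = []
    for name, members in strata.items():
        sample.extend(rng.sample(members, sizes[name]))
    return sample


# Hypothetical population split by class year (labels are placeholders).
strata = {
    "freshman": [f"F{i}" for i in range(1, 41)],   # 40 freshmen
    "senior": [f"S{i}" for i in range(1, 21)],     # 20 seniors
}
sizes = {"freshman": 4, "senior": 2}

result = stratified_sample(strata, sizes, seed=1)
print(result)
```

Unlike a single SRS of six students, this design guarantees that both class years are represented in the sample.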
Sampling Methods • Cluster Sampling • Divide the population into groups, or clusters. • Some of these clusters are randomly selected. • All individuals in the chosen clusters are selected to be in the sample.
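Cluster sampling randomizes at the group level and then keeps everyone in the chosen groups. A minimal sketch follows. The function name and the dormitory clusters are invented for illustration.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every individual in them.

    clusters: dict mapping cluster name -> list of individuals
    Returns the chosen cluster names and the combined sample.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        sample.extend(clusters[name])   # all members of a chosen cluster
    return chosen, sample


# Hypothetical clusters: students grouped by dormitory.
clusters = {
    "Dorm A": ["A1", "A2", "A3"],
    "Dorm B": ["B1", "B2"],
    "Dorm C": ["C1", "C2", "C3", "C4"],
    "Dorm D": ["D1", "D2"],
}

chosen, sample = cluster_sample(clusters, n_clusters=2, seed=0)
print(chosen)
print(sample)
```

Compare this with stratified sampling: there, every group contributes a few randomly chosen individuals; here, only some groups are chosen, and each chosen group contributes all of its members.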
Cautions about Sample Surveys • Undercoverage Bias • Occurs when some groups in the population are left out of the process of choosing the sample.
Cautions about Sample Surveys • Nonresponse Bias • Occurs when an individual chosen for the sample can’t be contacted or does not cooperate. Let’s consider variables that may affect nonresponses. How do these variables affect the makeup of the sample? (e.g. calling Nebraskans between 7 p.m. – 9 p.m. during June) Is this class a good sample to gather information about the number of children in a family? What if we asked in Colorado City, AZ?
Cautions about Sample Surveys • Response Bias • Influences caused by the behavior of the respondent or of the interviewer • Are people always honest if it’s about illegal or unpopular behavior? Example: Have you visited the dentist in the last 6 months? (maybe it was 8 months?)
GIGO • Garbage In Garbage Out • Need to learn about producing data that can be used for making valid statements. • Need to recognize techniques that produce biased samples.
Question Wording • The wording of the question is the most important influence on the answers given in a sample survey. • Confusing questions • Leading Questions • Choice of words
Questioning It is estimated that disposable diapers account for less than 2% of the trash in today’s landfills. Given this, in your opinion, would it be fair to ban disposable diapers? or…. It is estimated that disposable diapers account for less than 2% of the trash in today’s landfills. In contrast, beverage containers, third class mail and yard waste are estimated to account for 21% of the trash in landfills. Given this, in your opinion, would it be fair to ban disposable diapers?
Ask • Insist on knowing the following before believing a survey. • The exact question asked • The rate of nonresponse • The date of the survey • The method of the survey
Assignment Pg. 341: 5.10 Pg. 347: 5.16, 5.17, 5.19 Pg. 349: 5.22, 5.28 Read Section 5.2: Pg. 353-371
5.2 Designing Experiments Pg. 357: 5.34, 5.35, 5.37, 5.39, 5.40
5.2 Objectives • Define experimental units, subjects and treatments. • Define factor and level. • Given the number of factors and the number of levels for each factor, determine the number of treatments. • Explain the major advantage of an experiment over an observational study. • Give an example of the placebo effect. • Explain the purpose of a control group. • Explain the difference between control and a control group. • Discuss the purpose of replication, and give an example of replication in the design of an experiment. • Discuss the purpose of randomization in the design of an experiment. • Given a list of subjects, use a table of random numbers to assign individuals to treatment and control groups.
5.2 Objectives • List the three main principles of experimental design. • Explain what it means to say that an observed effect is statistically significant. • Define a completely randomized design. • For an experiment, generate an outline of a completely randomized design. • Define a block. • Give an example of a block design in an experiment. • Explain how a block design may be better than a completely randomized design. • Give an example of a matched pairs design, and explain why matched pairs are an example of block designs. • Give an example in which lack of realism negatively affects our ability to generalize the results of a study.
Vocabulary • Experimental Units • The individuals on which the experiment is done. • Subjects • When the units are humans they are called subjects. • Treatment • A specific experimental condition applied to the units is called a treatment.
Response variables measure an outcome of a study • These are also called dependent variables. • Explanatory variables help explain or influence changes in a response variable. • These are also called factors or independent variables.
Principles of Experimental Design • Control • The purpose of control is to try to eliminate the confounding effects of lurking variables. • Replication • Randomization
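Randomization means letting chance decide which units receive which treatment. Here is a minimal sketch of a completely randomized assignment, assuming groups of (nearly) equal size. The function name, the subject labels, and the group names are placeholders, not part of the course materials.

```python
import random


def randomize(subjects, group_names, seed=None):
    """Shuffle the subjects, then deal them out to the groups in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)     # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    groups = {name: [] for name in group_names}
    for i, subject in enumerate(shuffled):
        groups[group_names[i % len(group_names)]].append(subject)
    return groups


# Hypothetical subjects and groups.
subjects = [f"subject_{i}" for i in range(1, 11)]
groups = randomize(subjects, ["treatment", "control"], seed=5)

for name, members in groups.items():
    print(name, members)
```

Because the shuffle ignores everything about the subjects, lurking variables tend to balance out across the groups, and using several subjects per group provides the replication that the principles call for.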
Control • Placebo Effect • Control Group
Example Improving Response Rate • How can we reduce the rate of refusals in telephone surveys? Most people who answer at all listen to the interviewer’s introductory remarks and then decide whether to continue. • One study made telephone calls to randomly selected households to ask opinions about the next election. In some calls the interviewer gave her name, in others she identified the university she was representing, and in still others she identified both herself and the university. • For each type of call the interviewer either did or did not offer to send a copy of the final survey results to the person interviewed. Do these differences in the introduction affect whether the interview is completed? • Identify the experimental units or subjects, the factors, the treatments and the response variables.
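When an experiment has more than one factor, the treatments are the combinations of levels, so their count is the product of the numbers of levels. The sketch below enumerates them for one reading of this example, with two factors: the type of introduction and whether a copy of the results is offered. The short labels are ours.

```python
from itertools import product

# Factors and levels as read from the example; the labels are ours.
factors = {
    "introduction": ["name only", "university only", "name and university"],
    "offer_results": ["offer copy of results", "no offer"],
}

# Each treatment pairs one level of every factor.
treatments = list(product(*factors.values()))

for treatment in treatments:
    print(treatment)
print(len(treatments))   # 6
```

Three levels of introduction times two levels of the offer gives 3 × 2 = 6 treatments.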
Identify the experimental units or subjects, the factors, the treatments, and the response variables. Yesterday’s cookie taste testing experiment.
Identify the experimental units or subjects, the factors, the treatments, and the response variables. Ability to grow in shade may help pines found in the dry forests of Arizona to resist drought. How well do these pines grow in shade? Investigators planted pine seedlings in a greenhouse in either full light, light reduced to 25% of normal by shade cloth, or light reduced to 5% of normal. At the end of the study, they dried the young trees and weighed them.
Quick Quiz 1. A professor decides to make his class notes available in electronic form on the Internet. At the end of the quarter, several students mention in the course evaluation that having the notes readily available helped them do well in the class. This is an example of A. an observational study. B. an experiment. C. neither of the above.
2. Researchers in Britain randomly divided a large number of preterm babies into three groups. One received donated breast milk, one received infant formula made for preterm babies, and the third received regular infant formula. Each diet was used for one month as a sole food or as a supplement to mother’s milk. Sixteen years later, the children returned and had their blood pressure measured. It was found that diastolic and systolic blood pressure both tended to be lower in the children who were fed breast milk than in the children who were fed formula. This study is an example of A. an experiment. B. an observational study. C. a census.
3. Suppose you would like to determine which age groups in the United States (18-29, 30-49, 50-64, 65 or older) currently identify watching television as their favorite way to spend an evening. The most appropriate statistical study to answer this question would be A. a survey. B. an observational study that is not a survey. C. an experiment.
4. When ordering vinyl replacement windows, the following variables are specified for each window. Which of these variables is quantitative? A. window style: double-hung, casement, or awning B. area of the window opening in square inches C. window style: single-pane or double-pane
5. A zoologist studying adult bears measures a number of different variables. Which of the following possible variables is categorical? A. the weight in pounds of an adult bear B. the level of aggression (low, moderate, high) displayed by an adult bear C. the number of fish an adult bear eats in a particular day
6. A poll of American adults’ opinions about efforts to reform Social Security was conducted in 2004-2005 by the AARP, the nation’s largest organization for retired people. The poll results were criticized in some quarters because they included no respondents under the age of 30, even though voters aged 18 to 29 made up 17% of the 2004 electorate. By contrast, respondents aged 60 and above made up 34% of the sample but were only 24% of the electorate. This poll is most likely subject to which of the following types of bias? A. undercoverage B. nonresponse C. response bias
7. You would like to compare the level of mathematical knowledge among 15-year-olds in the United States and Japan. To do this, you plan to give a mathematics achievement test to random samples of 1000 15-year-olds in each of the two countries. To ensure that the samples will include individuals from all different socioeconomic groups and educational backgrounds, you will randomly select 200 students from low-income families, 400 students from middle-income families, and 400 students from high-income families in each country. The sampling procedure being used here is A. simple random sampling. B. voluntary response sampling. C. stratified sampling.
8. You want to know the opinions of American high school teachers on the issue of establishing a national proficiency test as a prerequisite for graduation from high school. You obtain a list of all high school teachers belonging to the National Education Association (the country’s largest teachers’ union) and mail a survey to a random sample of 2500 teachers. In all, 1347 of the teachers return the survey. Which of the following statements about this situation is true? A. The sampling frame is the set of all high school teachers who are members of the NEA. B. The population is the set of all high school teachers who are members of the NEA. C. The sample is the set of 2500 teachers to whom you send the survey.
9. A sociologist wants to study the attitudes of American male college students toward marriage and husband-wife relations. She gives a questionnaire to 25 of the men enrolled in Sociology 101 at her college. All 25 men complete and return the questionnaire. The sample in this situation is A. all men taking a comparable sociology class. B. the 25 men who received and returned the questionnaire. C. all men in the Sociology 101 class.