570 likes | 689 Views
Chapter 5. Producing Data. Objectives:. Distinguish between, and discuss the advantages of, observational studies and experiments. Identify and give examples of different types of sampling methods including a clear definition of simple random sample.
E N D
Chapter 5 Producing Data
Objectives: • Distinguish between, and discuss the advantages of, observational studies and experiments. • Identify and give examples of different types of sampling methods including a clear definition of simple random sample. • Identify and give examples of sources of bias in sample surveys. • Identify and explain the three basic principles of experimental design. • Explain what is meant by a completely randomized design. • Distinguish between the purposes of randomization and blocking in an experimental design.
Types of Statistical Studies: • Simulation: A physical or computer model is used to study phenomena and collect data • Used when its dangerous or impractical to use other methods
Types of Statistical Studies: • Census: collecting data from ALL members of a population • Extremely difficult to accomplish unless the population is small or data is already stored somewhere and easy to access • Time consuming
Types of Statistical Studies: • Experiment: A treatment is applied to a population and their responses are recorded. • Treatment does NOT mean drugs! • Best way to determine cause and effect
Types of Statistical Studies: • Observational Study: we observe individuals and measure variables of interest but do not attempt to influence the responses • Can interview the subjects but are not changing any conditions
Types of Statistical Studies: • Sampling: Studying a portion (sample) of the population to learn about the entire population
Example 1:Consider the following statistical studies. What method of data collection would you use to collect data for each study? Explain why. • A study of the effect of changing flight patterns on the number of airplane accidents • A study on the effects of aspirin on preventing heart attacks • A study of the weights of all linemen in the NFL • A study of US residents’ approval rating of the US President • 5. A study of how fourth grade students solve a puzzle
Observational study vs. experiment • An experiment CHANGES a condition where an observational study uses information that is already there.
Example 2:Is this an observational study or an experiment? Justify your choice. • A study of child care enrolled 1364 infants in 1991 and planned to follow them through their sixth year in school. In 2003, the researchers published an article finding that “the more time children spent in child care from birth to age four-and-a-half, the more adults tended to rate them, both at age four-and-a-half and at kindergarten, as less likely to get along with others, as more assertive, as disobedient, and as aggressive.”
Population: • The entire group of individuals that we wish to collect information about. • ALWAYS define the population of interest BEFORE you start the study. • Populations can be small such as HHS Basketball players or large such as adults in the USA
Sample: • A sample is part of the population that we actually examine in order to gather information about the POPULATION. • Should be representative of the ENTIRE population is we wish to generalize our results to the population • Should be randomly chosen
Sampling vs. Census • Sampling involves studying a part to gain information about the whole • Census asks everyone in the population
Voluntary response Sample: • Consists of people who CHOOSE to respond to a general appeal • Extremely biased as the people who usually respond feel very strongly about the issue
Convenience Sample: • Choosing individuals for a sample who are easily reached • Again, very biased!
Example 3: Which type of sample is used in each of the following? What are potential problems with each sample? • A manufacturer has representatives at the mall asking shoppers their opinions of various products. • You are on your favorite website and are asked to complete an online survey. • You leave a pile of surveys in the office, lunch room, gym, and student store. Interested people can pick one up, fill it out and return it to a box. • You select students who are in your statistics class to represent the HHS population.
Bias: • Systematically favoring certain outcomes • Examples: leaving out a segment of a population, poor survey questions, etc
Simple Random Sample(SRS): • Consists of n individuals from the population chosen in such a way that: • Each individual has an equal chance of being chosen • All samples of size n have an equal chance of being selected
Random Digits: • A table of random digits is a long string if the digits 0 – 9 with 2 conditions: • 1. each entry in the table is equally likely to be any of the digits 0-9. • 2. the entries are independent of each other; knowing one part of the table doesn’t help you know another part of the table.
4 Steps for drawing a sample • Label each individual: Use as few digits as possible, order doesn’t matter, try to do alphabetical if possible • Enter the random digit table anywhere. Standard practice read across the rows and down like a book; break the digits into chunks, drop numbers that are too large or repeated
4 Steps for drawing a sample • Stop when you have required number of individuals for your sample. • Identify the sample- list the members by name if possible
Other Sampling methods: • Probability Sample: • A sample chosen by chance. We must know what samples are possible and what the chances are of each sample being drawn.
Stratified Random Sample: • 1. Divide the population into groups (strata) based on some common characteristic that is important to the response variable. • 2. Choose a separate SRS from EACH strata and combine to form the final sample.
Cluster Sample: • 1. Divide the population into groups (clusters) usually occurs naturally • 2. Choose a SRS of clusters and survey ALL members of the selected clusters.
Systematic Sample: • Organize the population such as an alphabetical list or a line at lunch, etc.) • Choose every kth person to interview
Multi-Stage sample: • Combines 2 or more methods to draw final sample
Undercoverage: • Occurs when some groups in the population are left out of the process of choosing the sample
Nonresponse: • Occurs when an individual chosen for the sample can’t be contacted or does not cooperate
Response Bias: • The behavior of the researcher or the individual being interviewed affects the responses.
Question Wording Bias: • The way the question is worded affects how individuals respond. • The wording of the questions is the most important influence on an individual’s response. • Confusing or leading questions can introduce STRONG bias
Example 6: • A journalism class prints a survey in their school newspaper about the school charging students for parking. Readers are asked to clip the survey from the paper, complete it, and return it to a drop box in the cafeteria. Why is this sample biased?
Example 6: • A journalism class stations pollsters in front of the stadium during a football game. They ask each student who enters his or her opinion of the quality of the schools’ athletic program. Why is this sample biased?
Example 6: • A media research company is conducting a poll on the upcoming election for city council. The firm obtains a list of all 15,000 registered voters in the council ward under consideration and a simple random sample is chosen. Over a 24-hour period, telephone calls are placed to the voters, with follow-up calls made to those who did answer the initial phone call. Pollsters could not reach 37 of the 500 voters selected. What sources of bias exist in this survey?
Example 6: • A journalism class conducts a simple random sample of students at their school. They ask each student “given the fact that our school has won seven championships in the last five years, do you favor or oppose reducing funding for athletic programs?” What sources of bias exist in this survey?
Inferences about the population: • Despite the many practical difficulties in carrying out a sample survey, using chance to choose a sample does eliminate bias in the actual selection of the sample from the list of available individuals. But it is unlikely that results from a sample are exactly the same as for the entire population
Inferences about the population: • Properly designed samples avoid systematic bias, but their results are rarely exactly correct and they vary from sample to sample. • This is natural variation not a problem!
Inferences about the population: • We can improve our results by knowing that larger random samples give more accurate results than smaller samples. By taking a very large sample, you can be confident that the sample result is very close to the truth about the population.
Inferences about the population: • Using a probability sampling method and taking care to deal with practical difficulties reduce bias in a sample. The size of the sample then determines how close to the population truth the sample result is likely to fall.
Basic Vocabulary: • Experimental Units: the individuals on which the experiment is done • Subjects: experimental units that are humans • Treatment: A specific condition applied to the experimental units • Factors: The explanatory variables in an experiment • Levels: The various values of a factor
Example 7 • A pharmaceutical company wishes to test a new medication it thinks will reduce cholesterol. A group of 20 volunteers is formed and each has his or her cholesterol level measured. Half is randomly assigned to take the new drug and the other half is given a placebo. After 6 months the volunteers’ cholesterol levels are measured again and any change from the beginning of the study recorded. Identify: • A. The experimental units • b. the factors and their levels • c. the treatments • d. the response variable
Example 7: • An agricultural researcher is interested in determining how much water and fertilizer are optimum for growing a certain plant. Twenty-four plots of land are available to grow the plant. The researcher will apply three different amounts of fertilizer (low, medium, and heavy) and two different amounts of water (light and heavy). These will be applied at random in equal combination to each of four plots. Identify: • A. The experimental units • b. the factors and their levels • c. the treatments • d. the response variable
Key Elements of a Valid Experiment • Control: the principle that the potential sources of variation due to variables not under consideration must be reduced or eliminated. Control is achieved by making the conditions of the experiment as identical as possible for all experimental units.
Control Group: • the group of experimental units that receive the placebo or no treatment; they are important since we need something to compare the responses of the treatment group to in order to compare the effects of the treatment.
Placebo Effect: • subjects show a response to a treatment simply because they are receiving an imposed treatment regardless of its actual effect. I.e they think they are feeling better because they took “medicine”
Blinding: • a type of control in which the subjects do not know if they are receiving a treatment or a placebo. Blinding helps to reduce or hopefully eliminate the placebo effect.
Double-blinding: • Another control technique in which BOTH researcher and the subjects do not know who has the treatment and who has the placebo. This helps to eliminate any bias that occurs during the measurement of the response variable. In other words, the person conducting the “interview” cannot influence the subjects’ answers. This is preferred method but is impossible sometimes due to the nature of the experiment.
Hawthorne Effect: • Experimental Units change their behavior simply because they know they are participating in a study.