Preliminaries: Introduction to Statistical Investigations
Statistics vs. Anecdotal Evidence Smoking causes cancer. Seat belts save lives.
Do Vaccines Cause Autism? Nelson says it wasn't long after her son Parker's shots at 15 months that she noticed something was wrong. "He had run a slight fever after the vaccinations, but I didn't think anything of it," said Nelson. “… about a week after that he just completely stopped talking." After months of worrying, wondering, and going back and forth with doctors, an official diagnosis was made: autism. Nelson believes it started with the vaccines. "Gradually, I started piecing it together. He got sick after his vaccinations and about a week later everything changed. He was a completely different little boy then," said Nelson. http://www.wsaz.com/charleston/headlines/19376044.html
Statistics • Scientific conclusions cannot be based on anecdotal evidence. We need evidence from data. • Statistics is the science of producing useful data to address a research question, analyzing the resulting data, and drawing appropriate conclusions from the data.
Six-Step Statistical Investigation Method 1. Ask a research question (research hypothesis) 2. Design a study and collect data 3. Explore the data 4. Draw inferences (logic of inference: significance, estimation) 5. Formulate conclusions (scope of inference: generalization, causation) 6. Look back and ahead
Example P.1: Organ Donations • While a majority of people approve of organ donation in principle, far fewer actually sign up when getting a driver’s license. • Different states have different recruiting methods. • Do these different methods result in different sign-up rates?
Recruiting Organ Donors Step 1: Ask a research question • In general: Is there a method that will increase the likelihood that a person agrees to become an organ donor? • More specifically: Does the default option presented to driver’s license applicants influence the likelihood of someone becoming an organ donor?
Recruiting Organ Donors Step 2: Design a study and collect data • The researchers decided to recruit various participants and ask them to pretend to apply for a new driver’s license. • The participants did not know in advance that different options were given for the donor question, or even that this issue was the main focus of the study. • They offered an incentive of $4.00 for completing an online survey. After the results were collected, the researchers removed data arising from multiple responses from the same IP address, surveys completed in less than five seconds, and respondents whose residential address could not be verified.
Recruiting Organ Donors Step 2: Design a study and collect data • Some of the participants were forced to make a choice of becoming a donor or not, without being given a default option (the “neutral” group). • Other participants were told that the default option was not to be a donor but that they could choose to become a donor if they wished (the “opt-in” group). • The remaining participants were told that the default option was to be a donor but that they could choose not to become a donor if they wished (the “opt-out” group).
Recruiting Organ Donors Step 3: Explore the data. • 44 of the 56 (78.6%) participants in the neutral group agreed to become organ donors, • 23 of the 55 (41.8%) participants in the opt-in group agreed to become organ donors, and • 41 of the 50 (82.0%) participants in the opt-out group agreed to become organ donors.
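The percentages on this slide follow directly from the counts. A minimal Python sketch that recomputes them is given here; the variable and function names are illustrative and not part of the study.

```python
# Counts reported on the slide: (number who agreed to donate, group size)
counts = {
    "neutral": (44, 56),
    "opt-in": (23, 55),
    "opt-out": (41, 50),
}

def percent_agreeing(agreed, total):
    """Return the percentage of a group that agreed to become donors."""
    return 100 * agreed / total

# Round to one decimal place, as on the slide
percents = {
    group: round(percent_agreeing(agreed, total), 1)
    for group, (agreed, total) in counts.items()
}

for group, pct in percents.items():
    print(f"{group}: {pct}%")
```

Each group has a different size, which is why the comparison is made on percentages rather than on raw counts.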
Recruiting Organ Donors Step 4: Draw inferences beyond the data. • Using methods that you will learn in this course, the researchers analyzed whether the observed differences between the groups were large enough to indicate that the default option had a genuine effect. • In particular, they reported strong evidence that the neutral and opt-out versions do lead to a higher chance of agreeing to become a donor, as compared to the opt-in version currently used in many states. • In fact, they could be quite confident that the neutral version increases the chances that a person agrees to become a donor by between 20 and 54 percentage points, a difference large enough to save thousands of lives per year in the United States.
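The slide does not say how the 20-to-54-point interval was computed. As one illustration, assuming a standard large-sample (Wald) 95% interval for a difference of two proportions, the neutral (44 of 56) and opt-in (23 of 55) counts give an interval close to the one quoted. The choice of method, the multiplier 1.96, and the function name are assumptions made for this sketch, not the researchers' stated procedure.

```python
import math

def wald_diff_interval(x1, n1, x2, n2, z=1.96):
    """Large-sample interval for p1 - p2 (assumed method, for illustration).

    x1, x2: number of "yes" responses in each group
    n1, n2: group sizes
    z: normal multiplier (1.96 corresponds to roughly 95% confidence)
    """
    p1, p2 = x1 / n1, x2 / n2
    # Standard error of the difference between two independent proportions
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Neutral group versus opt-in group, counts from Step 3
lower, upper = wald_diff_interval(44, 56, 23, 55)
print(f"{100 * lower:.1f} to {100 * upper:.1f} percentage points")
```

Rounded to whole percentage points, this sketch gives roughly 20 to 54, in line with the interval reported on the slide. The course itself may introduce other approaches, such as simulation-based methods.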
Recruiting Organ Donors Step 5: Formulate conclusions. • Based on the analysis of the data and the design of the study, it is reasonable for these researchers to conclude that the neutral version causes an increase in the proportion who agree to become donors. • But because the participants in the study were volunteers recruited from internet bulletin boards, generalizing conclusions beyond these participants is only legitimate if they are representative of a larger group of people.
Recruiting Organ Donors Step 6: Look back and ahead. • One limitation of the study is that participants were asked to imagine how they would respond, which might not mirror how people would actually respond in such a situation. • A new study might look at people’s actual responses to questions about organ donation or could monitor donor rates for states that adopt a new policy. • Researchers could also examine whether presenting educational material on organ donation might increase people’s willingness to donate. • Another improvement would be to include participants from wider demographic groups than these volunteers.
Terminology • The individual entities on which data are recorded are called observational units. • The recorded characteristics of the observational units are the variables of interest. • Variables can be quantitative or categorical. • Quantitative: you can add, subtract, etc. with the values (height, weight, distance, time…). • Categorical: labels for which arithmetic does not make sense (sex, ethnicity, eye color…). • What are the observational units and variables in the Organ Donation Study?
More Terminology • The distribution of a variable describes the pattern of value/category outcomes. • For the organ donation study, the bar chart shown displays the distribution of responses.
Example P.2: Old Faithful
Old Faithful • How faithful is Old Faithful? • Can the time of the next eruption be accurately predicted?
Old Faithful • Researchers collected data on 222 eruptions taken over a number of days in the summers of 1978 and 1979. • The results are shown in a dotplot.
Old Faithful • What are the observational units and variable in this study? • Is the variable quantitative or categorical? • We can see from the dotplot that Old Faithful is not perfectly predictable. • The time until the next eruption varies from eruption to eruption. • This variability is the most fundamental property in studying Statistics. Without variability, we wouldn’t need statistics.
Old Faithful • Let’s take another look at the dotplot and describe the distribution. • What could be some explanations for the variability?
Old Faithful • One explanation could be the duration of the previous eruption (short: < 3.5 min; long: > 3.5 min).
Old Faithful Summer 2005
Old Faithful • One way to measure the center of a distribution is with the average, also called the mean. • One way to measure variability is with the standard deviation, which is roughly the average distance between a data value in the distribution and the mean of the distribution.
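To make the two summaries concrete, here is a small Python sketch using the standard library. The six times are made up for illustration and are NOT the Old Faithful data; the variable names are likewise hypothetical. The sketch also computes the plain average distance from the mean, to show in what sense the standard deviation is "roughly" that quantity.

```python
import statistics

# Hypothetical times between eruptions, in minutes (NOT the actual data)
times = [55, 60, 65, 75, 80, 85]

# Center: the mean (average) of the values
mean_time = statistics.mean(times)

# Variability: the sample standard deviation
sd_time = statistics.stdev(times)

# For comparison: the literal average distance of each value from the mean
avg_distance = statistics.mean([abs(t - mean_time) for t in times])

print(f"mean = {mean_time}")
print(f"standard deviation = {sd_time:.2f}")
print(f"average distance from mean = {avg_distance}")
```

For these made-up values, the standard deviation comes out somewhat larger than the average distance from the mean. The two are similar in size but not identical, because the standard deviation squares the distances before averaging them.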
Old Faithful Basic Terminology • Some aspects to look for in a distribution of a quantitative variable are: • Shape: Is the distribution symmetric? Mound-shaped? Are there several peaks or clusters? • Center: Where is the distribution centered? What is a typical value? • Variability: How spread out are the data? Are most within a certain range of values? • Unusual observations: Are there outliers that deviate markedly from the overall pattern of the other data values? Are there other unusual features in the distribution?
Exploration P.3: Cars or Goats Pages P-13 to P-17