Causation: Can We Say What Caused the Effect?

Causation: Can We Say What Caused the Effect? Sections 4.1 and 4.2

Big Idea of Chapter 4 • Previously, research questions focused on one proportion • What proportion of the time did Buzz guess the right button? • What proportion of the time did Marine guess the right bag? • We will now start to focus on research questions comparing two groups. • Are smokers more likely than nonsmokers to have lung cancer? • Are children who used night lights as infants more likely to need glasses than those who did not use night lights?

Types of Variables • When two variables are involved in a study, they are often classified as explanatory and response • Explanatory variable (Independent, Predictor) • The variable we think is “explaining” the change in the response variable. • Response variable (Dependent) • The variable we think is being impacted or changed by the explanatory variable.

Roles of Variables • Choose the explanatory and response variable: • Smoking and lung cancer • Heart disease and diet • Hair color and eye color • Sometimes there is a clear distinction between explanatory and response variables and sometimes there isn’t.

November 7, 2008 12:55 p.m. To the Hope College Campus Community: Within the past few minutes the College Administrators have met with county health officials who issued the letter below with an emergency order. This order is mandatory under Michigan law. Because of the growing number of reported illnesses today, health officials have changed their earlier advice to follow good hygiene practices to a cancellation of all campus activities effectively immediately. It is strongly recommended by health officials that students refrain from travel and meeting in large groups. President James E. Bultman

Two-Way Table • Let’s look to see if there is a difference in the incidence of norovirus in males versus females. • Note that we are now comparing one variable against another instead of comparing one variable against some standard. • Which is the explanatory variable? Table: Number of students with norovirus by sex

Explanatory and Response Explanatory (columns) Response (rows)

Counts versus Proportions • We can’t use counts to compare the prevalence of norovirus between the two sexes; we need to use conditional proportions or percentages. • When the explanatory variable creates the columns, look at column percentages. • By looking at the percentages from our sample, who was more likely to get the norovirus?

Observational Studies • The norovirus study is an example of an observational study. • In observational studies researchers observe individuals and measure variables of interest • Examples: • A significantly higher proportion of individuals with lung cancer smoked compared to same-age individuals who don’t have lung cancer • College students who spend more time on Facebook tend to have lower GPAs

Observational Studies • Do these studies prove that smoking causes lung cancer or Facebook causes lower GPAs? • Many people who see these types of studies think so… • It depends on the study design

Nightlights and Near-Sightedness • Near-sightedness often develops in childhood • Recent studies looked to see if there is an association between near-sightedness and nightlight use with infants • Researchers interviewed parents of 479 children who were outpatients in a pediatric ophthalmology clinic • Asked whether the child slept with the room light on, with a night light on, or in darkness before age 2 • Children were also separated into two groups: near-sighted or not near-sighted based on the child’s recent eye examination

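Night-lights and near-sightedness • The largest group of near-sighted kids slept in rooms with night lights. • What is a better way to look at the data? Conditional proportions (near-sighted children / all children in each lighting group): • Darkness: 18/172 ≈ 0.105 • Night light: 78/232 ≈ 0.336 • Room light: 41/75 ≈ 0.547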
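The conditional proportions on this slide can be reproduced with a short calculation. This is a minimal sketch: the counts come from the slide, while the group labels (darkness, night light, room light) are inferred from the study description and the function name is only illustrative.

```python
# Counts from the slide: (near-sighted children, all children) per lighting group.
# Group labels are an assumption inferred from the study description.
counts = {
    "darkness": (18, 172),
    "night light": (78, 232),
    "room light": (41, 75),
}


def conditional_proportions(table):
    """Return the proportion of near-sighted children within each group."""
    return {group: near / total for group, (near, total) in table.items()}


proportions = conditional_proportions(counts)
for group, p in proportions.items():
    print(f"{group:12s} {p:.3f}")
```

Dividing within each lighting group removes the effect of unequal group sizes, which is why the night-light group's large raw count does not by itself mean anything.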

Night-lights and near-sightedness Notice that as the light level increases, the percentage of near-sighted children also increases.

Nightlights and near-sightedness • There is an association between near-sightedness and nightlights • Can we claim that nightlights and room lights caused the increase in near-sightedness? • Might there be other reasons for this association?

Night-lights and near-sightedness • Could parents’ eyesight be another explanation? • Maybe parents with poor eyesight tend to use more light to make it easier to navigate the room at night, and parents with poor eyesight also tend to have children with poor eyesight. • Now we have a third variable: parents’ eyesight • Parents’ eyesight is considered a confounding variable.

Confounding Variables • Confounding variables are related to both the explanatory and response variables • Because of this, we can’t draw cause-and-effect conclusions when confounding variables are present. • Since confounding variables can be present in observational studies, we can’t conclude causation from these kinds of studies. • This doesn’t mean the explanatory variable isn’t influencing the response variable. Association may not imply causation, but it can be a pretty big hint.

Observational Studies vs. Experiments • Observational studies may have confounding variables present that prevent us from determining cause and effect. • Well-designed experiments can control for confounding variables so we can determine cause and effect.

Experiments: Control for Confounding Variables • Physicians’ Health Study I (studied aspirin’s effect on reducing heart attacks). • Started in 1982 with 22,071 male physicians. • The physicians were randomly assigned to one of two groups. • Half took a 325 mg aspirin every other day and half took a placebo.

Results • Intended to go until 1995, the aspirin study was stopped in 1988 after finding significant results. • 189 (1.7%) heart attacks occurred in the placebo group and 104 (0.9%) in the aspirin group. (45% reduction in heart attacks for the aspirin group.) • What about confounding variables? Could the aspirin group be different from the placebo group in some other ways? • Did they have a better diet? • Did they exercise more? • Were they genetically less likely to have heart attacks? • Were they younger?
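The percentages quoted on this slide can be checked with a few lines of arithmetic. This sketch assumes the two groups were exactly equal halves of the 22,071 physicians (the slide only says "half"), so the group size below is an approximation and not the study's exact enrollment figure.

```python
# Heart attack counts from the slide.
placebo_attacks = 189
aspirin_attacks = 104

# Assumption: each group is exactly half of the 22,071 physicians.
group_size = 22071 / 2

placebo_rate = placebo_attacks / group_size
aspirin_rate = aspirin_attacks / group_size

# Relative reduction: how much lower the aspirin rate is, as a share of the placebo rate.
relative_reduction = 1 - aspirin_rate / placebo_rate

print(f"placebo rate:       {placebo_rate:.1%}")
print(f"aspirin rate:       {aspirin_rate:.1%}")
print(f"relative reduction: {relative_reduction:.0%}")
```

With equal group sizes the relative reduction depends only on the two counts (1 - 104/189), which is where the 45% figure comes from.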

The big idea • Confounding variables are controlled in experiments due to the random assignment of subjects to treatment groups. • Randomly assigning people to groups tends to balance out all other variables between the groups. • So variables that could have an effect on the response should be equalized between the two groups and therefore should not be confounding. • Thus, cause-and-effect conclusions are possible in experiments through random assignment. (It must be a well-run experiment.)
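To see why random assignment tends to balance other variables, it may help to simulate it. The sketch below uses made-up ages for 1,000 hypothetical subjects (not data from the aspirin study) and repeatedly splits them at random into two equal groups, recording the gap in mean age each time. The function name and all numbers are illustrative assumptions.

```python
import random
import statistics


def mean_age_gap(ages, rng):
    """Randomly split subjects into two equal groups; return the gap in mean age."""
    shuffled = ages[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return statistics.mean(shuffled[:half]) - statistics.mean(shuffled[half:])


rng = random.Random(1)

# Hypothetical subjects: 1,000 made-up ages between 40 and 84.
ages = [rng.randint(40, 84) for _ in range(1000)]

# Repeat the random assignment 500 times and record the gap each time.
gaps = [mean_age_gap(ages, rng) for _ in range(500)]
typical_gap = statistics.mean([abs(g) for g in gaps])

print(f"age range in the sample: {min(ages)} to {max(ages)}")
print(f"typical gap in mean age between groups: {typical_gap:.2f} years")
```

Even though individual ages span decades, the two groups' mean ages usually differ by only a small amount, and the same balancing applies to variables nobody measured.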

Random vs. Random • With observational studies, random sampling is often done. This allows us to make inferences from the sample to the population from which the sample was drawn. • With experiments, random assignment is done. This allows us to conclude causation.

Blocking and Random Assignment • The goal in random assignment is to make the two groups as similar as possible. • Sometimes there are characteristics (or variables) of subjects that you can see, and you can block on these variables. • For example, if our subjects consist of 60% females and 40% males, we can force our two groups to both consist of 60% females and 40% males. • Let’s look at an applet to see what blocking and random assignment do to help keep both groups as similar as possible.
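One way to picture blocking is as random assignment carried out separately inside each block. The sketch below is an illustration under stated assumptions, not the applet's method: the 100 subjects are hypothetical (60 female, 40 male, matching the slide's percentages) and `blocked_assignment` is a made-up helper name.

```python
import random


def blocked_assignment(subjects, block_of, rng):
    """Split subjects into two groups by randomizing within each block."""
    blocks = {}
    for subject in subjects:
        blocks.setdefault(block_of(subject), []).append(subject)

    group_a, group_b = [], []
    for members in blocks.values():
        rng.shuffle(members)           # random assignment inside the block
        half = len(members) // 2
        group_a.extend(members[:half])
        group_b.extend(members[half:])
    return group_a, group_b


rng = random.Random(0)

# Hypothetical subjects: 60 females and 40 males, each tagged with an id.
subjects = [("F", i) for i in range(60)] + [("M", i) for i in range(40)]

group_a, group_b = blocked_assignment(subjects, lambda s: s[0], rng)
share_f_a = sum(s[0] == "F" for s in group_a) / len(group_a)
share_f_b = sum(s[0] == "F" for s in group_b) / len(group_b)

print(f"group A: {len(group_a)} subjects, {share_f_a:.0%} female")
print(f"group B: {len(group_b)} subjects, {share_f_b:.0%} female")
```

Because the split happens within each sex, both groups are forced to match the 60/40 composition, while who ends up in which group is still left to chance.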

Exploration 4.1 • In Exploration 4.1 we will see whether playing in front of a large crowd (at home) is a disadvantage for the Oklahoma City Thunder compared to playing in front of a smaller crowd. • HW: Do exercises 4.1.4, 4.2.1-4, 4.2.6.

Paired Designs Section 4.3

Introduction • Variability in quantitative variables impacts the distribution of sample statistics like the sample mean x̄. • Reducing variability in data improves inferences: • Narrower confidence intervals • Smaller p-values when the null hypothesis is false • We can sometimes reduce variability in the response variable by using an alternative study design called a paired design.

Can You Study With Music Blaring? Example 4.3

Studying with Music • Many students study while listening to music. • Does it hurt their ability to focus? • In “Checking It Out: Does music interfere with studying?” Stanford Prof. Clifford Nass claims the human brain listens to song lyrics with the same part that does word processing. • Instrumental music is, for the most part, processed on the other side of the brain, and Nass claims that reading and listening to instrumental music have virtually no interference.

Studying with Music Consider the experimental designs: • Experiment 1—Random assignment to 2 groups • 27 students were randomly assigned to 1 of 2 groups: • One group listens to music with lyrics • One group listens to music without lyrics • Students play a memorization game while listening to the music. • Experiment 2—Paired design • All students play the memorization game twice: • Once while listening to music with lyrics • Once while listening to music without lyrics.

Studying with Music • What if everyone could remember exactly 2 more words when they listened to a song without lyrics? • There could be a lot of overlap between the two sets of scores and it would be difficult to detect a difference, as shown here. [Figure: distributions of scores, Without Lyrics vs. With Lyrics]

Studying with Music • Variability in people’s memorization abilities may make it difficult to see differences between the songs in the first experiment. • The paired design focuses on the difference in the number of words memorized, instead of the number of words memorized. • By looking at this difference, the variability in general memorization ability is taken away and we may have reduced variability.

Studying with Music • While there is lots of variability in the number of words memorized between students, there would be no variability in the difference in the number of words memorized in our hypothetical example. • All values would be exactly 2. • Hence we would have extremely strong evidence of a difference in ability to memorize words between the two types of music.
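The hypothetical example can be made concrete with a small simulation. The scores below are made up (27 hypothetical students with randomly chosen memorization scores), and every student is constructed to remember exactly 2 more words without lyrics, as in the slide's thought experiment.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical scores: words memorized by 27 students while hearing lyrics.
with_lyrics = [rng.randint(5, 20) for _ in range(27)]

# Thought experiment from the slide: everyone remembers exactly 2 more words.
without_lyrics = [score + 2 for score in with_lyrics]

# Paired analysis looks at each student's own difference.
differences = [wo - w for wo, w in zip(without_lyrics, with_lyrics)]

sd_scores = statistics.stdev(with_lyrics)
sd_diffs = statistics.stdev(differences)

print(f"SD of scores between students: {sd_scores:.2f}")
print(f"mean difference:               {statistics.mean(differences)}")
print(f"SD of paired differences:      {sd_diffs}")
```

The scores themselves vary a good deal from student to student, but the paired differences do not vary at all, so the 2-word effect stands out clearly once each student is compared with themselves.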

Pairing and Random Assignment • Pairing often makes it easier to detect statistical significance • Can we still make cause-and-effect conclusions in a paired design? • Can we still have random assignment?

Pairing and Random Assignment • Imagine that we ran the following experiment: • All students play the game listening to the song with lyrics • Then play the game a second time while listening to the song without lyrics. • If we see significant improvement in performance, is it attributable to the song? • What about experience? Could that have made the difference? • What is a better design? • Randomly assign which song each person hears first: with lyrics or without. • This cancels out an “experience” effect.
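Randomizing the order within a paired design could be sketched as follows. The student ids and the `assign_order` helper are hypothetical; the only idea taken from the slide is that chance, not the researcher, decides which song each person hears first.

```python
import random


def assign_order(student_ids, rng):
    """Randomly decide which song each student hears first."""
    conditions = ["with lyrics", "without lyrics"]
    orders = {}
    for student_id in student_ids:
        first = rng.choice(conditions)
        second = conditions[1] if first == conditions[0] else conditions[0]
        orders[student_id] = (first, second)
    return orders


rng = random.Random(7)

# Hypothetical ids for 27 students.
orders = assign_order(range(1, 28), rng)

lyrics_first = sum(order[0] == "with lyrics" for order in orders.values())
print(f"{lyrics_first} students hear the song with lyrics first")
print(f"{len(orders) - lyrics_first} students hear the song without lyrics first")
```

Since roughly half the students get each order, any benefit from playing the game a second time is spread across both songs instead of favoring one of them.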

Pairing and Observational Studies • We can use pairing in observational studies. • If you are interested in which test was more difficult in a course, the first or the second, compare the average difference in scores for each individual. • Use a pretest and a posttest.

Exploration 4.3 • Complete questions 1 – 9 on Exploration 4.3: Rounding First Base. HW: Do exercises 4.3.1, 4.CE.2-4.
