2.6 The Question of Causation
The goal in many studies is to establish a causal link between a change in the explanatory variable and a change in the response variable. We learned about lurking variables in the previous section. When the observed association between the variables x and y is explained by a lurking variable z that influences both, this is known as common response. There are other ways in which a high correlation may not imply causation.
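Common response can be illustrated with a small simulation. The sketch below is only an illustration under assumed conditions: the lurking variable z drives both x and y, x has no effect on y at all, and the noise levels, sample size, and the helper names `pearson_r` and `simulate_common_response` are made up for this example.

```python
import random

def pearson_r(xs, ys):
    """Sample correlation coefficient r between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate_common_response(n=2000, seed=0):
    """Hypothetical data: z causes both x and y; x never influences y."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [zi + rng.gauss(0, 0.5) for zi in z]  # x = z + independent noise
    y = [zi + rng.gauss(0, 0.5) for zi in z]  # y = z + independent noise
    return x, y, z

x, y, z = simulate_common_response()
r_xy = pearson_r(x, y)

# Subtract the lurking variable's contribution and correlate what is left.
x_without_z = [xi - zi for xi, zi in zip(x, z)]
y_without_z = [yi - zi for yi, zi in zip(y, z)]
r_without_z = pearson_r(x_without_z, y_without_z)

print(f"r between x and y:            {r_xy:.2f}")
print(f"r after removing z from both: {r_without_z:.2f}")
```

In this setup x and y are strongly correlated, yet the association essentially disappears once z is accounted for, which is what common response predicts.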
Reverse Causation Reverse causation occurs when the roles of the explanatory and response variables are reversed; that is, the supposed effect is actually the cause. Here is a controversial real-world example of what some claim is reverse causation from Al Gore’s 2006 documentary “An Inconvenient Truth.” Taken from http://www.youtube.com/watch?v=lWRqQ_iI7qQ&feature=related
Skeptics of Gore’s position claim that he has committed the reverse causation fallacy, i.e., that it is actually a rise in temperature that increases CO2 emissions and not vice versa. Indeed, while Gore’s conclusion may be correct, his reasoning (at least as it is presented in the short clip) is faulty.
This leads to another important point: look with a critical eye at any study, even one whose conclusions you agree with. If you don’t find the faults, someone on the “other” side will!
Coincidence Continuing with the example of the earth warming, the following is an example of a correlation/causation fallacy known as coincidence: There has been an increase in the earth’s temperature and a decrease in the number of pirates over the same time period. Therefore, the earth’s warming is caused by a lack of pirates.
In the above example, it is obvious that we have a coincidence. But in general, it can be difficult to say with any sort of confidence that a study does not commit some kind of correlation/causation fallacy. “The best evidence for causation comes from experiments that actually change x while holding all other factors fixed. If y changes, we have good reason to think that x caused the change in y.” The textbook, p. 155
Confounding Another problem in establishing causation is the problem of confounding; that is, when the effects of two variables on a response variable cannot be distinguished from each other. For example, consider Kobe Bryant, arguably the best basketball player in the world right now. He is also known as one of the hardest working students of the game. Is he good because of his raw talent? Or because he works so hard? Or some combination? In general, the idea behind confounding is that it may not be possible to separate the effects of the two variables.
Establishing Causation We have seen many potential pitfalls in attempting to establish causation between two variables. Ideally, we would like to set up a careful experiment in which lurking variables are controlled. We will look more at designing an experiment in Chapter 3.
However, setting up an experiment may be impossible, unethical, or both. What should we do about immigration? We can’t set up several regions of the country, one where any kind of immigration is legal, one where it is completely illegal, and several others in between. Or consider the possible health risks of smoking. It would be unethical to make a whole group smoke.
Some criteria to establish causation without experiments • The association is strong. • The association is consistent. • The alleged cause precedes the effect in time. • Higher doses (if applicable) are associated with stronger responses. • The alleged cause is plausible.
Suppose we wish to study the effectiveness of a new workout supplement which claims to increase endurance. We could recruit two groups of people: one group which would like to try the supplement and one which would not. We may then compare the average distance the two groups cover in a 30-minute run.
Some terms • The individuals being studied are called subjects. (Non-human subjects are called experimental units.) • The treatment is that which is being studied. • The treatment group is the group receiving the treatment. • The control group is the group not receiving the treatment (if such a group exists). E.g. if you are studying sleep patterns and you have three groups sleep 4, 6, and 8 hours, they are all treatment groups.
In the above example, the supplement is the treatment, the group who took the supplement is the treatment group, and the group who did not take the supplement is the control group.
Problems • Those who took the supplement may have a psychological advantage over those who did not. Or they might assume that the supplement will give them an extra boost and not try as hard as they normally would. Or those who take the supplement may be taking other supplements which could make a difference in overall performance.
Solutions • We should divide the participants up into groups and decide which group gets the supplement and which does not. This is a controlled experiment because we have control over which subjects receive the treatment. When we do not have control, it is known as an observational study. • One way to divide the participants into groups is by random assignment. This is called a randomized controlled experiment.
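Random assignment can be sketched in a few lines. This is one possible approach, not a prescribed procedure: the subject labels, the even split into two groups, and the helper name `randomly_assign` are assumptions made for illustration.

```python
import random

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into two groups by chance alone."""
    rng = random.Random(seed)      # seed only makes the example repeatable
    shuffled = list(subjects)      # copy so the caller's list is not changed
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participants in the supplement study.
subjects = [f"runner_{i}" for i in range(1, 21)]
treatment_group, control_group = randomly_assign(subjects, seed=42)

print("Treatment (supplement):", treatment_group)
print("Control (no supplement):", control_group)
```

Because chance decides the groups, neither the subjects' preferences nor the researcher's judgment can systematically favor one group.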
More Solutions • Even if we choose who takes the supplement, those who take the supplement KNOW that they take it, and those who don’t take it KNOW that they don’t, which could be a confounding factor. • To get around this problem, we introduce a placebo; that is, we give the group who is not taking the supplement a pill or shot or powder which does nothing. • When the participants do not know who is taking the treatment and who is taking the placebo, this is known as a blind study. When neither the participants nor the researchers know, this is known as a double-blind study.
Principles of Experimental Design • Compare two or more treatments. • Randomize. • Repeat. We hope to see a difference in the response so large that it is unlikely to happen by chance variation. An observed effect so large that it would rarely occur by chance is statistically significant.
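One way to make "would rarely occur by chance" concrete is a permutation test, sketched below. The distances are invented for illustration and are not results from any study; the function name `permutation_p_value` and the choice of a one-sided comparison are also assumptions of this sketch.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(treatment, control, n_shuffles=10000, seed=0):
    """Estimate how often chance alone gives a difference in means at least
    as large as the observed one (one-sided: treatment minus control)."""
    rng = random.Random(seed)
    observed = mean(treatment) - mean(control)
    pooled = list(treatment) + list(control)
    n_treatment = len(treatment)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # reassign the group labels at random
        diff = mean(pooled[:n_treatment]) - mean(pooled[n_treatment:])
        if diff >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_shuffles

# Made-up distances (km) covered in a 30-minute run.
treatment_km = [5.9, 6.1, 6.4, 5.8, 6.3, 6.0, 6.2, 6.5]
control_km = [5.2, 5.6, 5.4, 5.9, 5.1, 5.5, 5.3, 5.7]

observed_diff, p_value = permutation_p_value(treatment_km, control_km)
print(f"Observed difference in means: {observed_diff:.3f} km")
print(f"Estimated chance of a difference this large: {p_value:.4f}")
```

A small estimated chance suggests the observed difference is statistically significant; a large one suggests chance variation could easily explain it.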
Random Samples • A random sample is a sample obtained from the population in such a manner that all samples of the same size have equal likelihood of being selected. The use of chance to divide experimental units into groups is called randomization. • E.g. Lottery method, random number method • When sampling is done in such a way that there are no repetitions in the sample, the result is a simple random sample.
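The random number method can be sketched with Python's standard `random.sample`, which draws without repetition. The population of 100 numbered individuals, the sample size of 10, and the wrapper name `simple_random_sample` are assumptions chosen for illustration.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k distinct individuals; every set of k is equally likely."""
    rng = random.Random(seed)  # seed only makes the example repeatable
    return rng.sample(population, k)

# Hypothetical population: individuals labeled 1 through 100.
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=1)
print("Selected individuals:", sorted(sample))
```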
More Ways to Design A matched pairs design compares two treatments. In our example above, we may pair subjects up based on age, sex, or some measure of athletic ability. More generally, a block is a group of experimental units or subjects that are known before the experiment to be similar in some way that is expected to affect the response to the treatments. In a block design, the random assignment of units to treatments is carried out separately within each block.
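A block design can be sketched by running the random assignment separately inside each block. The subject names, the two age-group blocks, the treatment labels, and the helper name `block_randomize` below are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict

def block_randomize(subjects, block_of, seed=None):
    """Randomly assign treatments separately within each block."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for subject in subjects:
        blocks[block_of(subject)].append(subject)  # group similar subjects

    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)                       # chance decides within the block
        half = len(members) // 2
        for subject in members[:half]:
            assignment[subject] = "supplement"
        for subject in members[half:]:
            assignment[subject] = "placebo"
    return assignment

# Hypothetical subjects as (name, age group); the age group defines the block.
subjects = [
    ("Ana", "under_30"), ("Ben", "under_30"), ("Cy", "under_30"), ("Di", "under_30"),
    ("Ed", "30_plus"), ("Flo", "30_plus"), ("Gus", "30_plus"), ("Hal", "30_plus"),
]
assignment = block_randomize(subjects, block_of=lambda s: s[1], seed=7)

for (name, age_group), treatment in sorted(assignment.items()):
    print(f"{name:4s} {age_group:9s} -> {treatment}")
```

With blocks of size two, the same idea gives a matched pairs design: one member of each pair gets each treatment, chosen at random.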