The Question of Causation • 4.2: Establishing Causation • AP Statistics
Beware the post-hoc fallacy • “Post hoc, ergo propter hoc.” (“After this, therefore because of this.”) • The post-hoc fallacy is assuming that an observed correlation is due to causation. To avoid it, put any claimed relationship through sharp inspection. • Causation cannot be established “after the fact.” It can only be established through well-designed experiments. {see Ch 5}
Explaining Association • Strong associations can generally be explained by one of three relationships. • Causation: x causes y • Common Response: x and y are reacting to a lurking variable z • Confounding: x may cause y, but y may instead be caused by a confounding variable z
Causation • Causation is not easily established. • The best evidence for causation comes from experiments that change x while holding all other factors fixed. • Even when direct causation is present, it is rarely a complete explanation of an association between two variables. • Even well-established causal relations may not generalize to other settings.
Common Response • “Beware the Lurking Variable” • The observed association between two variables may be due to a third variable. • Both x and y may be changing in response to changes in z. • Consider the “Presbyterian Minister and Barrels of Rum” example...do ministers actually cause people to drink more rum?
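As a rough illustration of common response, the Python sketch below simulates a lurking variable z that drives both x and y, with no link at all between x and y themselves. The function names, noise levels, and sample size are assumptions chosen for the demonstration and do not come from the slides. For these noise levels the x-y correlation is strong (about 0.8 in theory), yet it should shrink toward zero once the linear effect of z is removed from both variables.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_common_response(n=5000, seed=0):
    """x and y each respond to z; neither has any effect on the other."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [zi + rng.gauss(0, 0.5) for zi in z]  # x = z + noise (assumed model)
    y = [zi + rng.gauss(0, 0.5) for zi in z]  # y = z + independent noise
    return z, x, y


def residuals(values, z):
    """What is left of `values` after subtracting its least-squares line on z."""
    n = len(z)
    mean_v = sum(values) / n
    mean_z = sum(z) / n
    slope = (sum((zi - mean_z) * (vi - mean_v) for zi, vi in zip(z, values))
             / sum((zi - mean_z) ** 2 for zi in z))
    return [vi - mean_v - slope * (zi - mean_z) for zi, vi in zip(z, values)]


if __name__ == "__main__":
    z, x, y = simulate_common_response()
    print("r(x, y):", round(pearson_r(x, y), 2))
    print("r(x, y) with z removed:",
          round(pearson_r(residuals(x, z), residuals(y, z)), 2))
```

In the rum example, z would play the role of whatever third factor moves both the number of ministers and the barrels of rum.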
Confounding • Two variables are confounded when their effects on a response variable cannot be distinguished from each other. • Confounding prevents us from drawing conclusions about causation. • We can help reduce the chances of confounding by designing a well-controlled experiment.
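A minimal simulation can suggest why a controlled experiment helps with confounding. In the sketch below, x has an assumed true effect of 1.0 on y, while a second variable z also affects y. In the observational setting z also influences who receives x, so the two effects are mixed together. In the randomized setting a coin flip decides x. The model, the effect sizes, and the name `difference_in_means` are hypothetical choices for illustration only.

```python
import random

TRUE_EFFECT = 1.0  # assumed causal effect of x on y (illustrative value)


def difference_in_means(n=20000, randomized=False, seed=1):
    """Mean of y for units with x minus mean of y for units without x."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)  # the confounding variable
        if randomized:
            has_x = rng.random() < 0.5       # coin flip, unrelated to z
        else:
            has_x = z + rng.gauss(0, 1) > 0  # high-z units tend to get x
        y = (TRUE_EFFECT if has_x else 0.0) + 2.0 * z + rng.gauss(0, 1)
        (treated if has_x else control).append(y)
    return sum(treated) / len(treated) - sum(control) / len(control)


if __name__ == "__main__":
    print("observational estimate:", round(difference_in_means(), 2))
    print("randomized estimate:   ",
          round(difference_in_means(randomized=True), 2))
```

Under these assumptions the observational comparison should overstate the effect of x considerably, because it also picks up the effect of z, while the randomized comparison should land near the assumed value of 1.0.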
Exercises • For Exercises 4.41 – 4.47, carry out the instructions. • Then state whether the relationship between the two variables involves causation, common response, or confounding. • Identify possible lurking variable(s). • Draw a diagram of the relationship in which each circle represents a variable. • Write a brief description of the variable by each circle.
Exercises • 4.41: There is a high positive correlation: nations with many TV sets have higher life expectancies. Could we lengthen the life of people in Rwanda by shipping them TVs? • 4.42: People who use artificial sweeteners in place of sugar tend to be heavier than people who use sugar. Does artificial sweetener use cause weight gain? • 4.43: Women who work in the production of computer chips have abnormally high numbers of miscarriages. The union claimed chemicals cause the miscarriages. Another explanation may be the fact these workers spend a lot of time on their feet.
Exercises, cont’d • 4.44: People with two cars tend to live longer than people who own only one car. Owning three cars is even better, and so on. What might explain the association? • 4.45: Children who watch many hours of TV get lower grades on average than those who watch less TV. Why does this fact not show that watching TV causes low grades?
Exercises, cont’d • 4.46: Data show that married men (and men who are divorced or widowed) earn more than men who have never been married. If you want to make more money, should you get married? • 4.47: High school students who take the SAT, enroll in an SAT coaching course, and take the SAT again raise their mathematics score from an average of 521 to 561. Can this increase be attributed entirely to taking the course?
Half Sheet Quiz! • A study of elementary school children, ages 6-11, finds a high positive correlation between shoe size x and score y on a test of reading comprehension. • Then, in your own words, explain the meaning of causation, common response, and confounding.
Other Concerns • Extrapolation: The use of a regression line or curve for a prediction outside the domain of the explanatory variable. • These predictions cannot be trusted. • Averaged Data: Many regression or correlation studies work with averages or other measures that combine information from many individuals. • These results cannot be applied to individuals; they can only be applied to the averaged units. • One of the advantages of averaging data is that it takes care of some lurking variables. • Correlations based on averages are typically too high when applied to individuals.
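The point about averaged data can be checked with a small simulation. The sketch below builds hypothetical groups of individuals, then compares the correlation computed on all individuals with the correlation computed on the group averages. The group structure, noise levels, and function names are assumptions made for the demonstration, not data from any study.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_groups(n_groups=40, group_size=100, seed=2):
    """Return (x, y) pairs for every individual and for every group average."""
    rng = random.Random(seed)
    individuals, averages = [], []
    for _ in range(n_groups):
        center = rng.uniform(0, 10)  # each group has its own typical x
        xs = [center + rng.gauss(0, 1) for _ in range(group_size)]
        ys = [x + rng.gauss(0, 5) for x in xs]  # large individual scatter in y
        individuals.extend(zip(xs, ys))
        averages.append((sum(xs) / group_size, sum(ys) / group_size))
    return individuals, averages


def correlation_of_pairs(pairs):
    """Pearson r for a list of (x, y) pairs."""
    xs, ys = zip(*pairs)
    return pearson_r(xs, ys)


if __name__ == "__main__":
    individuals, averages = simulate_groups()
    print("r for individuals:   ", round(correlation_of_pairs(individuals), 2))
    print("r for group averages:", round(correlation_of_pairs(averages), 2))
```

Averaging cancels most of the individual scatter in y, so the averaged points hug the line much more tightly than the individuals do. That is why a correlation based on averages overstates how well x predicts y for any one person.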
REMEMBER! • Correlation does NOT imply causation!