
4.2 Cautions about Correlation and Regression


darice

Presentation Transcript


  1. 4.2 Cautions about Correlation and Regression

  2. Correlation and regression are powerful tools, but they have limitations. • Correlation and regression describe only linear relationships. • The correlation r and the least-squares regression line are not resistant: a few outlying or influential observations can change them substantially.
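Both limitations can be checked numerically. The sketch below is only an illustration: the data are made up, and `pearson_r` is a hypothetical helper written for this example, not something from the slides. It computes r for a perfectly curved pattern, then for a nearly straight-line pattern with and without one outlying point.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient r for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: y is completely determined by x, but the pattern is curved.
x_curved = [-3, -2, -1, 0, 1, 2, 3]
y_curved = [x ** 2 for x in x_curved]
r_curved = pearson_r(x_curved, y_curved)

# Made-up data: a nearly perfect straight line.
x_line = [1, 2, 3, 4, 5, 6, 7, 8]
y_line = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.8, 16.1]
r_clean = pearson_r(x_line, y_line)

# The same data with a single outlying observation appended.
r_with_outlier = pearson_r(x_line + [9], y_line + [0.0])

print(f"curved pattern:       r = {r_curved:.3f}")
print(f"straight line:        r = {r_clean:.3f}")
print(f"line plus one outlier: r = {r_with_outlier:.3f}")
```

With these particular numbers, r is essentially zero for the curved pattern even though x determines y exactly, and one added point pulls r for the straight-line data far below its original value.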

  3. Extrapolation • The use of a regression line for prediction far outside the range of values of the explanatory variable x that you used to obtain the line or curve.

  4. Such predictions are often inaccurate • Suppose that you have data on a child’s growth between 3 and 8 years of age. You find a strong linear relationship between age x and height y. If you fit a regression line to these data and use it to predict height at age 25 years, you will predict that the child will be 8 feet tall.
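The growth example can be reproduced with a short calculation. The heights below are invented for illustration (the slides give no data), and `fit_line` is a hypothetical helper that computes the least-squares slope and intercept by hand.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical heights (inches) for one child at ages 3 through 8.
ages = [3, 4, 5, 6, 7, 8]
heights_in = [36.5, 39.4, 42.0, 44.6, 47.5, 50.1]

slope, intercept = fit_line(ages, heights_in)

# Prediction inside the observed age range (interpolation).
height_at_6 = intercept + slope * 6

# Prediction far outside the observed age range (extrapolation).
height_at_25_in = intercept + slope * 25
height_at_25_ft = height_at_25_in / 12

print(f"slope: {slope:.2f} inches per year")
print(f"predicted height at age 6:  {height_at_6:.1f} in")
print(f"predicted height at age 25: {height_at_25_ft:.1f} ft")
```

The line fits well between ages 3 and 8, but it assumes the child keeps growing at the same rate forever, which is why the prediction at age 25 comes out near 8 feet.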

  5. Lurking Variable • A variable that is not among the explanatory or response variables in a study and yet may influence the interpretation of relationships among those variables.

  6. Remember the link between cancer and dental plaque? It could be that poor oral hygiene is an indicator of other lifestyle factors associated with cancer.

  7. Lurking variables continued • Lurking variables are often unrecognized and unmeasured. Detecting their effect is challenging. • Many lurking variables change systematically over time. • One useful method of detecting lurking variables is to plot both the response variable and the regression residuals against the time order of the observations. (See Example 4.12 on pg 228)
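A rough numerical stand-in for the residuals-against-time plot is sketched below. Everything here is an assumption made for illustration: the data are invented, the hidden drift of 0.8 units per time step plays the role of the lurking variable, and the correlation between residuals and time order is used in place of an actual plot.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient r for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Observations listed in the order they were collected (time order 1..10).
time_order = list(range(1, 11))
x = [5, 1, 8, 3, 9, 2, 7, 4, 6, 10]

# Invented response: depends on x and also drifts upward over time.
y = [2 * xi + 0.8 * t for xi, t in zip(x, time_order)]

# Fit y on x only; the time drift is not in the model.
slope, intercept = fit_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Residuals are uncorrelated with x by construction of least squares,
# but a trend against time order exposes the unmodeled drift.
r_resid_vs_x = pearson_r(x, residuals)
r_resid_vs_time = pearson_r(time_order, residuals)

for t, res in zip(time_order, residuals):
    print(f"t={t:2d}  residual={res:+.2f}")
print(f"r(residuals, x)    = {r_resid_vs_x:.3f}")
print(f"r(residuals, time) = {r_resid_vs_time:.3f}")
```

Residuals that rise or fall steadily with time order, as they do here, suggest that something changing over time is affecting the response and is missing from the regression.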

  8. Explaining Association

  9. Causation • The best evidence for causation comes from experiments that actually change x while holding all other factors fixed. If y changes, then we have a good reason to think that x caused the change in y. • Even well-established causal relations may not generalize to other settings. • A sugar substitute caused bladder tumors in rats. Should we avoid this particular sugar substitute?

  10. Common Response • The observed association between the variables x and y is explained by a lurking variable z. Both x and y change in response to changes in z. This common response creates an association even though there may be no direct causal link between x and y. • Students who are smart and who have learned a lot tend to have both high SAT scores and high college grades. The positive correlation is explained by this common response to students’ ability and knowledge.
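Common response is easy to simulate. In the sketch below, all numbers are assumptions chosen for illustration: a lurking variable z (standing in for ability and knowledge) drives both x and y, and neither x nor y has any effect on the other. The variable names and noise levels are hypothetical.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient r for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(42)  # fixed seed so the run is repeatable
n = 2000

# Lurking variable z, e.g. a student's ability and knowledge.
z = [rng.gauss(0, 1) for _ in range(n)]

# x and y each respond to z plus independent noise.
# Neither one appears in the formula for the other.
x = [zi + rng.gauss(0, 0.5) for zi in z]  # e.g. standardized SAT score
y = [zi + rng.gauss(0, 0.5) for zi in z]  # e.g. standardized college grades

r_xy = pearson_r(x, y)

# Remove the part of x and y that comes from z; what is left is noise.
x_given_z = [xi - zi for xi, zi in zip(x, z)]
y_given_z = [yi - zi for yi, zi in zip(y, z)]
r_xy_given_z = pearson_r(x_given_z, y_given_z)

print(f"r(x, y)               = {r_xy:.3f}")
print(f"r(x, y) with z removed = {r_xy_given_z:.3f}")
```

In this simulation the strong association between x and y comes entirely from z: once the contribution of z is subtracted, the remaining correlation is close to zero. In real data z is usually unmeasured, so this subtraction is not available.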

  11. Confounding • In short, “mixing of influences.” • Two variables are confounded when their effects on a response variable cannot be distinguished from each other.

  12. Example of Confounding • It is likely that more education is a cause of higher income—many highly paid professions require advanced education. However, confounding is also present. People who have high ability and come from prosperous homes are more likely to get many years of education than people who are less able or poorer. Of course, people who start out able and rich are more likely to have high earnings even without much education. We can’t say how much of the higher income of well-educated people is actually caused by their education.

  13. Establishing Causation without Experiments • The association is strong. • The association is consistent. • Higher doses are associated with stronger responses. • The alleged cause precedes the effect in time. • The alleged cause is plausible. • See Example 4.18 Does Smoking Cause Lung Cancer (pg. 236)

  14. 4.33 FIGHTING FIRES Someone says, “There is a strong positive correlation between the number of firefighters at a fire and the amount of damage the fire does. So sending lots of firefighters just causes more damage.” Why is this reasoning wrong?

  15. 4.36 BETTER READERS A study of elementary school children, ages 6 to 11, finds a high positive correlation between shoe size x and score y on a test of reading comprehension. What explains this correlation?

  16. Try this at home • Exercises 4.38, 4.41, 4.43, 4.45
