Chapter 7 Correlation
Suppose we found the age and weight of a sample of 10 adults. Create a scatterplot of the data below. Is there any relationship between the age and weight of these adults?
Suppose we found the height and weight of a sample of 10 adults. Create a scatterplot of the data below. Is there any relationship between the height and weight of these adults? Is it positive or negative? Weak or strong?
The closer the points in a scatterplot are to a straight line, the stronger the relationship. The farther they are from a straight line, the weaker the relationship.
Identify each pair as having a positive association, a negative association, or no association.
• Heights of mothers & heights of their adult daughters: positive (+)
• Age of a car in years and its current value: negative (-)
• Weight of a person and calories consumed: positive (+)
• Height of a person and the person's birth month: no association
• Number of hours spent in safety training and the number of accidents that occur: negative (-)
Correlation Coefficient (r)
• A quantitative assessment of the strength & direction of the linear relationship between bivariate, quantitative data
• Pearson's sample correlation is used most
• parameter: ρ (rho)
• statistic: r
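The slides do not show a formula for r, so the following is only a minimal sketch of the standard definition of Pearson's sample correlation. The function name `pearson_r` and the tiny data set are illustrative assumptions, not taken from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data lying exactly on a rising line, so r should be 1
r_line = pearson_r([1, 2, 3], [2, 4, 6])
print(r_line)
```

The sign of r gives the direction of the linear relationship and its magnitude gives the strength.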
Calculate r. Interpret r in context. There is a strong, positive, linear relationship between speed limit and average number of accidents per week.
Properties of r (correlation coefficient)
• Legitimate values of r lie in [-1, 1]
[Figure: scale of r with regions labeled strong, moderate, weak, and no correlation]
• The value of r does not depend on the unit of measurement for either variable.
x (in mm)   12   15   21   32   26   19   24
y            4    7   10   14    9    8   12
Find r. Change to cm & find r. The correlations are the same.
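As a rough check of the unit-of-measurement property, this sketch computes r for the data above with x in mm and again with x converted to cm. The helper `pearson_r` is an illustrative implementation of the standard formula, not something given in the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x_mm = [12, 15, 21, 32, 26, 19, 24]
y = [4, 7, 10, 14, 9, 8, 12]

# Convert x from millimetres to centimetres
x_cm = [x / 10 for x in x_mm]

r_mm = pearson_r(x_mm, y)
r_cm = pearson_r(x_cm, y)
print(round(r_mm, 4), round(r_cm, 4))  # the two rounded values agree
```

Rescaling multiplies every deviation in x by the same constant, which cancels between the numerator and the denominator of r.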
• The value of r does not depend on which of the two variables is labeled x.
x   12   15   21   32   26   19   24
y    4    7   10   14    9    8   12
Switch x & y & find r. The correlations are the same.
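The same kind of check works for swapping the labels. This is a small sketch using the data above; `pearson_r` is again an illustrative helper rather than anything defined in the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [12, 15, 21, 32, 26, 19, 24]
y = [4, 7, 10, 14, 9, 8, 12]

r_xy = pearson_r(x, y)  # x as the first variable
r_yx = pearson_r(y, x)  # labels switched
print(round(r_xy, 4), round(r_yx, 4))  # the two rounded values agree
```

The formula treats the two variables symmetrically, so the labels can be exchanged without changing the result.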
• The value of r is non-resistant.
x   12   15   21   32   26   19   24
y    4    7   10   14    9    8   22
Find r. Outliers affect the correlation coefficient.
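To see how much a single point can matter, this sketch compares r for the earlier data (last y value 12) with r for the data above (last y value 22). The helper `pearson_r` is an illustrative implementation, not part of the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [12, 15, 21, 32, 26, 19, 24]
y_original = [4, 7, 10, 14, 9, 8, 12]
y_outlier = [4, 7, 10, 14, 9, 8, 22]  # only the last value differs

r_original = pearson_r(x, y_original)
r_outlier = pearson_r(x, y_outlier)
print(round(r_original, 3), round(r_outlier, 3))
```

Changing one of the seven y values lowers r noticeably, which is what "non-resistant" means here.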
• The value of r is a measure of the extent to which x & y are linearly related.
Find the correlation for these points:
x   -3   -1    1    3    5    7    9
y   40   20    8    4    8   20   40
Sketch the scatterplot. r = 0, but the points have a definite (nonlinear) relationship!
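The r = 0 result for these points can be verified with a short sketch. As before, `pearson_r` is an illustrative helper based on the standard formula.

```python
import math

def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [-3, -1, 1, 3, 5, 7, 9]
y = [40, 20, 8, 4, 8, 20, 40]

r_curve = pearson_r(x, y)
print(r_curve)
```

The y values are symmetric about x = 3, so the positive and negative products of deviations cancel and r is zero even though y clearly depends on x.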
Association vs. Causation
• In a famous example of a correlation study, the following results were obtained.

Year    Number of Methodist Ministers    Cuban Rum Imported to Boston
        in New England                   (in barrels)
----------------------------------------------------------------------
1860     63                               8,376
1865     48                               6,406
1870     53                               7,005
1875     64                               8,486
1880     72                               9,595
1885     80                              10,643
1890     85                              11,265
1895     76                              10,071
1900     80                              10,547
1905     83                              11,008
1910    105                              13,885
1915    140                              18,559
1920    175                              23,024
1925    183                              24,185
1930    192                              25,434
1935    221                              29,238
1940    262                              34,705

• The correlation coefficient for this relationship is r = .999986.
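For readers who want to recompute the coefficient from the table, here is a minimal sketch. The variable names and the `pearson_r` helper are illustrative assumptions, and no particular printed value is claimed beyond r being very close to 1.

```python
import math

def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Values copied from the table, 1860 through 1940 in 5-year steps
ministers = [63, 48, 53, 64, 72, 80, 85, 76, 80, 83,
             105, 140, 175, 183, 192, 221, 262]
rum_barrels = [8376, 6406, 7005, 8486, 9595, 10643, 11265, 10071, 10547,
               11008, 13885, 18559, 23024, 24185, 25434, 29238, 34705]

r_rum = pearson_r(ministers, rum_barrels)
print(r_rum)
```

A value this close to 1 describes how tightly the two columns move together; by itself it says nothing about why they do.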
r = .9999. So does an increase in ministers cause an increase in consumption of rum?
Correlation does not imply causation.