Chapter 5 Summarizing Bivariate Data: Correlation
Variables:
• Response variable (y): measures an outcome (dependent)
• Explanatory variable (x): helps explain or influence changes in a response variable (independent)
Suppose we found the age and weight of a sample of 10 adults. Create a scatterplot of the data below. Is there any relationship between the age and weight of these adults?
Does there seem to be a relationship between age and weight of these adults?
Suppose we found the height and weight of a sample of 10 adults. Create a scatterplot of the data below. Is there any relationship between the height and weight of these adults? Is it positive or negative? Weak or strong?
Does there seem to be a relationship between height and weight of these adults?
When describing relationships between two variables, you should address:
• Direction (positive, negative, or neither)
• Strength of the relationship (how much scattering?)
• Form (linear or some other pattern)
• Unusual features (outliers or influential points)
And ALWAYS in context of the problem!
Identify each pair as having a positive association, a negative association, or no association.
• Heights of mothers & heights of their adult daughters: positive (+)
• Age of a car in years and its current value: negative (-)
• Weight of a person and calories consumed: positive (+)
• Height of a person and the person's birth month: no association
• Number of hours spent in safety training and the number of accidents that occur: negative (-)
The closer the points in a scatterplot are to a straight line, the stronger the linear relationship. The farther they are from a straight line, the weaker the relationship.
Correlation measures the direction and the strength of a linear relationship between 2 quantitative variables.
Find the mean and standard deviation of the heights and weights of the 10 students:
Correlation Coefficient (r)
• A quantitative assessment of the strength & direction of the linear relationship between bivariate, quantitative data
• Pearson's sample correlation is used most
• parameter: ρ (rho)
• statistic: r
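A minimal sketch of how r can be computed from the means and standard deviations found above. The slides do not show the formula, so the z-score form of Pearson's sample correlation used here is an assumption, and the function name pearson_r and the toy data are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's sample correlation via z-scores (assumed form, not shown in the slides)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # Sum the products of paired z-scores, then divide by n - 1.
    return sum(((x - mean_x) / s_x) * ((y - mean_y) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical toy data: a perfectly linear, increasing pattern.
toy_x = [1, 2, 3, 4]
toy_y = [3, 5, 7, 9]
r_toy = pearson_r(toy_x, toy_y)
print(r_toy)
```

Because the toy points lie exactly on an increasing line, r_toy should come out at 1 up to floating-point rounding.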
Calculate r. Interpret r in context. r = .9964 There is a strong, positive, linear relationship between speed limit and average number of accidents per week.
[Scatterplot panels: strong correlation, no correlation, moderate correlation, weak correlation]
Properties of r (correlation coefficient)
• Legitimate values of r are in [-1, 1]
x (in mm): 12 15 21 32 26 19 24
y: 4 7 10 14 9 8 12
Find r. Interpret r in context.
r = .9181. There is a strong, positive, linear relationship between x and y.
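The value r = .9181 can be checked by hand with the computational (sums of squares) form of the correlation. This is a sketch under the assumption that this form is acceptable in place of a calculator's LinReg output; the variable names are illustrative.

```python
from math import sqrt

# Data from the slide.
x = [12, 15, 21, 32, 26, 19, 24]
y = [4, 7, 10, 14, 9, 8, 12]
n = len(x)

# Corrected sums of products and squares.
s_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
s_xx = sum(a * a for a in x) - sum(x) ** 2 / n
s_yy = sum(b * b for b in y) - sum(y) ** 2 / n

r = s_xy / sqrt(s_xx * s_yy)
print(round(r, 4))  # 0.9181
```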
Value of r is not changed by linear transformations (changing units, adding a constant, or multiplying by a positive constant)
x (in mm): 12 15 21 32 26 19 24
y: 4 7 10 14 9 8 12
Find r: .9181
Change x to cm and find r: .9181
Do the following transformations and calculate r: 1) 5(x + 14) 2) (y + 30) ÷ 4
The correlations are the same: r is STILL .9181
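The transformations on this slide can be tried directly. The following is a sketch, with a hypothetical helper pearson_r standing in for the calculator, that recomputes r after converting x to cm and after applying 5(x + 14) and (y + 30) ÷ 4.

```python
from math import sqrt, isclose

def pearson_r(xs, ys):
    """Hypothetical helper: Pearson's r from deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)

x_mm = [12, 15, 21, 32, 26, 19, 24]
y = [4, 7, 10, 14, 9, 8, 12]

r_original = pearson_r(x_mm, y)

x_cm = [v / 10 for v in x_mm]            # mm to cm
x_shifted = [5 * (v + 14) for v in x_mm]  # transformation 1) 5(x + 14)
y_shifted = [(v + 30) / 4 for v in y]     # transformation 2) (y + 30) / 4

r_cm = pearson_r(x_cm, y)
r_transformed = pearson_r(x_shifted, y_shifted)

print(round(r_original, 4), round(r_cm, 4), round(r_transformed, 4))
```

All three values should agree. Multiplying one variable by a negative constant would flip the sign of r, which is why the sketch only uses positive scale factors.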
Value of r does not depend on which of the two variables is labeled x
Switch x & y and find r. Type: LinReg L2, L1
The correlations are the same.
Value of r is non-resistant
x: 12 15 21 32 26 19 24
y: 4 7 10 14 9 8 22
Find r.
Outliers affect the correlation coefficient.
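To see how much a single outlier moves r, the sketch below compares the earlier data set (last y value 12) with this one (last y value 22). The helper pearson_r is a hypothetical stand-in for the calculator.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Hypothetical helper: Pearson's r from deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)

x = [12, 15, 21, 32, 26, 19, 24]
y_original = [4, 7, 10, 14, 9, 8, 12]
y_outlier = [4, 7, 10, 14, 9, 8, 22]  # only the last value differs

r_before = pearson_r(x, y_original)
r_after = pearson_r(x, y_outlier)
print(round(r_before, 4), round(r_after, 4))
```

Changing one point out of seven drops r from .9181 to roughly .63.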
Value of r is a measure of the extent to which x & y are linearly related
Find the correlation for these points:
x: -3 -1 1 3 5 7 9
y: 40 20 8 4 8 20 40
What does this correlation mean? Sketch the scatterplot.
r = 0, but there is a definite relationship!
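The r = 0 result for these points can be confirmed numerically. In this sketch (pearson_r is again a hypothetical helper) the y values are symmetric about x = 3, so the positive and negative cross-products cancel.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Hypothetical helper: Pearson's r from deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)

x = [-3, -1, 1, 3, 5, 7, 9]
y = [40, 20, 8, 4, 8, 20, 40]  # U-shaped pattern, symmetric about x = 3

r_curved = pearson_r(x, y)
print(r_curved)
```

A correlation of zero here says only that there is no linear trend; the curved pattern is still a strong relationship.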
• 1) Correlation makes no distinction between explanatory and response variables. It is unitless.
• 2) Correlation does not change when we change the units of measurement of x, y, or both.
• 3) Correlation requires both variables to be quantitative.
• 4) Correlation does not describe curved relationships between variables, no matter how strong. It describes only the linear relationship between variables.
• 5) Like the mean and standard deviation, the correlation is not resistant: r is strongly affected by a few outlying observations.
Correlation does not imply causation Correlation does not imply causation Correlation does not imply causation