3.1b Correlation
Target Goal: I can determine the strength of a linear relationship using the correlation.
D2 h.w: p. 160, 14-18, 21, 26
Scatterplot
Recall: A scatterplot reveals the strength, direction, and form of the relationship between two quantitative variables.
Two scatterplots of the same data:
• The straight-line pattern in the lower plot appears stronger because of the surrounding white space.
• Our eyes are not good judges of how strong a linear relationship is. We need a numerical measure to supplement graphs.
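Correlation (r)
• Measures the direction and strength of the linear relationship between two variables.
• The formula for the correlation r between x and y is:
$$ r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right) $$
• In words: r is the average (dividing by n - 1) of the products of the standardized x and y values for the n individuals.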
Exercise: Classifying Fossils
• The data give the lengths of two bones in five fossil specimens of the extinct beast Archaeopteryx.
Femur: 38 56 59 64 74
Humerus: 41 63 70 72 84
• Enter the data into L1 (femur) and L2 (humerus).
Find the correlation r step by step.
• Find the mean and the standard deviation of the femur and humerus lengths.
• Then find the five standardized values of each variable and use the formula for r.
• Use STAT: CALC; 2-VAR Stats L1, L2 to find the following:
Use the formula to find the correlation r step by step. Refer to the formula.
• x bar = 58.2
• Sx = 13.2
• y bar = 66.0
• Sy = 15.89
Calculate r by hand.
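The hand calculation can be checked with a short script. This is only a sketch using Python's standard library, not the calculator procedure from the slides; the function name `correlation` and the variable names are chosen here for illustration. It standardizes each value with the sample standard deviation (dividing by n - 1), multiplies the paired standardized values, and averages the products with n - 1.

```python
from statistics import mean, stdev

# Archaeopteryx bone lengths from the exercise
femur = [38, 56, 59, 64, 74]      # x, calculator list L1
humerus = [41, 63, 70, 72, 84]    # y, calculator list L2

def correlation(x, y):
    """Correlation r as the average product of standardized values."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)              # sample standard deviations (n - 1)
    z_x = [(xi - x_bar) / s_x for xi in x]     # standardized x values
    z_y = [(yi - y_bar) / s_y for yi in y]     # standardized y values
    return sum(zx * zy for zx, zy in zip(z_x, z_y)) / (n - 1)

r = correlation(femur, humerus)

print(f"x bar = {mean(femur):.1f}, Sx = {stdev(femur):.1f}")        # x bar = 58.2, Sx = 13.2
print(f"y bar = {mean(humerus):.1f}, Sy = {stdev(humerus):.2f}")    # y bar = 66.0, Sy = 15.89
print(f"r = {r:.3f}")                                               # r = 0.994
```

The printed summary values should agree with the 2-VAR Stats output listed above.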
Interpreting Correlation
1. Correlation makes no distinction between explanatory and response variables.
2. Correlation requires that both variables be quantitative.
3. Because r uses the standardized values of the observations, r does not change when we change the units of measurement of x, y, or both. The correlation r itself has no unit of measurement.
4. Positive r indicates a positive association between the variables. Negative r indicates a negative association between the variables.
5. Correlation r is always a number between -1 and 1.
• Values of r near 0 indicate a very weak linear relationship.
• As r moves away from 0 toward either -1 or 1, the strength of the linear relationship increases.
• Values of r close to -1 or 1 indicate that the points in a scatterplot lie close to a straight line.
• The extreme values r = -1 and r = 1 occur only in the case of a perfect linear relationship.
6. Correlation measures the strength of only a linear relationship between two variables (not a curve).
7. Like the mean and standard deviation, the correlation r is not resistant (use r with caution when outliers appear).
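Several of the facts listed above can be checked numerically. The sketch below uses only Python's standard library. Apart from the fossil data, every data set in it (the perfect line, the curve, and the added outlier) is made up for illustration and does not come from the slides; `correlation` is a helper name chosen here.

```python
from statistics import mean, stdev

def correlation(x, y):
    """Correlation r as the average product of standardized values."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)
    return sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
               for xi, yi in zip(x, y)) / (n - 1)

femur = [38, 56, 59, 64, 74]
humerus = [41, 63, 70, 72, 84]

# Fact 1: swapping explanatory and response variables leaves r unchanged.
r_xy = correlation(femur, humerus)
r_yx = correlation(humerus, femur)

# Fact 5: a perfect (hypothetical) decreasing line y = -3x + 10 gives r = -1,
# up to floating-point rounding.
x_line = [1, 2, 3, 4, 5]
y_line = [-3 * xi + 10 for xi in x_line]
r_line = correlation(x_line, y_line)

# Fact 6: a perfect (hypothetical) curve y = x^2 on symmetric x values has
# r of essentially 0, even though y is completely determined by x.
x_curve = [-2, -1, 0, 1, 2]
y_curve = [xi ** 2 for xi in x_curve]
r_curve = correlation(x_curve, y_curve)

# Fact 7: one hypothetical outlier (femur 80, humerus 40) pulls r far below
# the value for the original five specimens.
r_outlier = correlation(femur + [80], humerus + [40])

print(f"r(x, y) = {r_xy:.3f}, r(y, x) = {r_yx:.3f}")
print(f"perfect line: r = {r_line:.3f}")
print(f"perfect curve: |r| = {abs(r_curve):.3f}")
print(f"with outlier: r = {r_outlier:.3f}")
```

The curve example is the reason to plot the data first: an r near 0 rules out a linear pattern, not a relationship.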
Remember: correlation is not a complete description of two-variable data. Also include the means and standard deviations of both x and y.
Exercise: More Archaeopteryx
The data give the lengths of two bones in five fossil specimens of the extinct beast. You found the correlation r in the earlier exercise.
• r = 0.994
Make a scatterplot if you did not do so earlier. Explain why the value of r matches the scatterplot. (3 min)
r = 0.994. The plot shows a strong positive linear relationship with little scatter, so we expect r to be close to 1.
The lengths were measured in centimeters. If we changed to inches, how would r change? (There are 2.54 centimeters in an inch.) r would not change – it is computed from standardized values.
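The unit-change answer can be confirmed with a quick calculation. The following is a sketch using Python's standard library, with `correlation` and the variable names chosen here for illustration. It converts the lengths from centimeters to inches by dividing by 2.54 and recomputes r.

```python
from statistics import mean, stdev

def correlation(x, y):
    """Correlation r as the average product of standardized values."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)
    return sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
               for xi, yi in zip(x, y)) / (n - 1)

CM_PER_INCH = 2.54

femur_cm = [38, 56, 59, 64, 74]
humerus_cm = [41, 63, 70, 72, 84]

# Convert both variables to inches
femur_in = [v / CM_PER_INCH for v in femur_cm]
humerus_in = [v / CM_PER_INCH for v in humerus_cm]

r_cm = correlation(femur_cm, humerus_cm)
r_in = correlation(femur_in, humerus_in)
# Changing the units of only one variable also leaves r unchanged
r_mixed = correlation(femur_in, humerus_cm)

print(f"r in cm: {r_cm:.3f}")          # r in cm: 0.994
print(f"r in inches: {r_in:.3f}")      # r in inches: 0.994
print(f"r, mixed units: {r_mixed:.3f}")  # r, mixed units: 0.994
```

Dividing every value by 2.54 also divides the mean and the standard deviation by 2.54, so each standardized value, and therefore r, stays the same apart from floating-point rounding.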