4.2 Correlation. The Correlation Coefficient r Properties of r. Correlation.
4.2 Correlation • The Correlation Coefficient r • Properties of r
Correlation • We can often see the strength of the relationship between two quantitative variables in a scatterplot, but be careful. The two figures here are scatterplots of the same data on different scales, yet the second appears to show a stronger association. • So we need a measure of association that does not depend on how the graph is drawn.
Measuring Linear Association A scatterplot displays the strength, direction, and form of the relationship between two quantitative variables. Linear relations are important because a straight line is a simple pattern that is quite common. Our eyes are not good judges of how strong a relationship is. Therefore, we use a numerical measure to supplement our scatterplot and help us interpret the strength of the linear relationship. The correlation r measures the strength of the linear relationship between two quantitative variables.
Measuring Linear Association We say a linear relationship is strong if the points lie close to a straight line and weak if they are widely scattered about a line. The following facts about r help us further interpret the strength of the linear relationship.
Properties of Correlation
• r is always a number between –1 and 1.
• r > 0 indicates a positive association.
• r < 0 indicates a negative association.
• Values of r near 0 indicate a very weak linear relationship.
• The strength of the linear relationship increases as r moves away from 0 toward –1 or 1.
• The extreme values r = –1 and r = 1 occur only in the case of a perfect linear relationship.
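The properties above can be checked numerically. The sketch below is only an illustration: it assumes the usual textbook definition of r as the sum of the products of the z-scores of x and y divided by n - 1, and the helper name `correlation` and all data values are made up for the example.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation r: sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    products = (((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)

# Illustrative (made-up) data
xs = [1.0, 2.0, 3.0, 4.0, 5.0]

# Points exactly on a rising line: r should be 1 (up to rounding)
r_perfect_up = correlation(xs, [2 * x + 1 for x in xs])

# Points exactly on a falling line: r should be -1 (up to rounding)
r_perfect_down = correlation(xs, [10 - 3 * x for x in xs])

# Points scattered about a rising line: r is positive but below 1
r_scattered = correlation(xs, [2.0, 1.0, 4.0, 3.0, 5.0])

print(r_perfect_up, r_perfect_down, r_scattered)
```

The first two values illustrate the extreme cases r = 1 and r = –1, and the third shows a positive association that is not perfectly linear.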
The correlation coefficient r Time to swim: x̄ = 35, sx = 0.7. Pulse rate: ȳ = 140, sy = 9.5.
The correlation coefficient r treats x and y symmetrically. [Figure: two scatterplots of the same data with the axes swapped; r = -0.75 in both.] "Time to swim" is the explanatory variable here and belongs on the x axis. However, r is the same in either plot (r = -0.75), because r does not distinguish between x and y.
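A short sketch of the symmetry claim follows. The numbers are hypothetical stand-ins, not the swim data behind the slide, and `correlation` is an illustrative helper that assumes the z-score definition of r.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation r: sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    products = (((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)

# Hypothetical values, NOT the data from the slide
swim_time = [34.1, 35.0, 35.6, 34.6, 36.0, 34.9]
pulse = [152, 140, 131, 148, 128, 139]

r_xy = correlation(swim_time, pulse)  # time on the x axis
r_yx = correlation(pulse, swim_time)  # axes swapped

print(r_xy, r_yx)  # the two values agree
```

Swapping the arguments only swaps the two factors in each product, so the result cannot change.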
r has no unit of measure (unlike x and y). Changing the units of measure of the variables does not change the correlation coefficient r, because we "standardize out" the units when getting z-scores. [Figure: the z-score plot (z for time against z for pulse) is the same for both plots; r = -0.75.]
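The effect of standardizing can be sketched as follows. The data and the unit conversions (minutes to seconds, beats per minute to beats per second) are assumptions chosen for illustration, since the slide does not state the units; `z_scores` and `correlation` are hypothetical helper names.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize: subtract the mean, divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def correlation(xs, ys):
    """Sample correlation r: sum of z-score products divided by n - 1."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

# Hypothetical values, NOT the data from the slide
time_min = [34.1, 35.0, 35.6, 34.6, 36.0, 34.9]
pulse_per_min = [152, 140, 131, 148, 128, 139]

# Assumed unit changes, for illustration only
time_sec = [t * 60 for t in time_min]
pulse_per_sec = [p / 60 for p in pulse_per_min]

r_original = correlation(time_min, pulse_per_min)
r_converted = correlation(time_sec, pulse_per_sec)

# The z-scores themselves are unchanged by the unit conversion
max_z_shift = max(abs(a - b) for a, b in zip(z_scores(time_min), z_scores(time_sec)))

print(r_original, r_converted, max_z_shift)
```

Multiplying a variable by a positive constant scales its mean and standard deviation by the same constant, so the z-scores, and therefore r, stay the same apart from rounding error.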
Cautions: • Correlation requires that both variables be quantitative. • Correlation does not describe curved relationships between variables, no matter how strong the relationship is. • Correlation is not resistant. r is strongly affected by a few outlying observations. • Correlation is not a complete summary of two-variable data.
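Two of these cautions can be illustrated with small made-up data sets. This is a sketch under the same assumption as before (r computed from z-scores); none of the values come from the slides.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation r: sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    products = (((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)

# Caution: a perfect but curved relationship (y = x squared) can give r near 0
curve_x = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
curve_y = [x ** 2 for x in curve_x]
r_curved = correlation(curve_x, curve_y)

# Caution: r is not resistant; one outlying point can change it drastically
clean_x = [1.0, 2.0, 3.0, 4.0, 5.0]
clean_y = [2.0, 1.0, 4.0, 3.0, 5.0]
r_clean = correlation(clean_x, clean_y)

# Same data plus a single made-up outlier at (6, -10)
r_with_outlier = correlation(clean_x + [6.0], clean_y + [-10.0])

print(r_curved, r_clean, r_with_outlier)
```

In the first case y is completely determined by x, yet r reports essentially no linear association. In the second, a single added point turns a clearly positive correlation into a negative one.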
HW: Read Section 4.2 on the correlation coefficient. Pay particular attention to Figure 4.12… Work the following exercises: #4.36-4.38, 4.41-4.44, 4.47-4.49