Scatterplots, Association, and Correlation (Ch. 7)
Scatterplot
• When to use:
  • Number of variables: 2
  • Data type: quantitative
  • Purpose: investigate the relationship between the variables
• The x-axis shows the explanatory variable
• The y-axis shows the response variable
Relationship
• Association: the variables are somehow (statistically) linked
• Correlation: the underlying form of the association between the two variables is linear
Describing the Relationship
• Direction: positive or negative
• Form: linear, curved, or something else
• Strength: strong (little scatter), moderate (some scatter), or weak (much scatter)
• Unusual features: outliers, clumps, etc.
(Potential) Outliers
• An outlier is a point that lies away from the rest of the data
• There is no fixed rule for deciding whether a point is an outlier
• Outliers can have a large impact on the analysis:
  • They can make a nonexistent relationship look strong
  • They can make a strong relationship look weak
• Do the analysis both with and without the outliers
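To see how much a single point can matter, here is a minimal Python sketch. The data sets and the helper name `pearson_r` are made up for illustration and are not from the slides. It computes the correlation with and without one far-away point, once for scattered data and once for data lying on a line.

```python
import math

def pearson_r(points):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)

# Made-up data with almost no linear relationship.
scattered = [(1, 3), (2, 1), (3, 4), (4, 2), (5, 3)]
r_without = pearson_r(scattered)
r_with = pearson_r(scattered + [(20, 20)])   # one far-away point added

# Made-up data lying exactly on a line.
on_line = [(1, 2), (2, 4), (3, 6), (4, 8), (5, 10)]
r_line = pearson_r(on_line)
r_line_with = pearson_r(on_line + [(6, 0)])  # one point far off the line

print(f"scattered: without outlier {r_without:.2f}, with outlier {r_with:.2f}")
print(f"on a line: without outlier {r_line:.2f}, with outlier {r_line_with:.2f}")
```

In the first case one added point pushes a correlation near 0 up close to 1. In the second, one added point pulls a perfect correlation down close to 0. This is why the analysis is reported both ways.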
Correlation Conditions
• Quantitative Variables Condition: both variables must be quantitative
• Straight Enough Condition: the form of the scatterplot must be straight enough that a linear relationship makes sense
• Outlier Condition: report the correlation both with and without the outlier
Correlation Coefficient
• The strength of a linear relationship is given by the correlation coefficient
• Don’t compute it by hand
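The transcript does not show a formula, so the sketch below assumes the commonly taught z-score form: standardize both variables, multiply the z-scores pairwise, add the products, and divide by n - 1. The function name `correlation` and the data are hypothetical, chosen only to show how software can do the arithmetic for you.

```python
import math

def correlation(xs, ys):
    """Pearson correlation r via the z-score formulation."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    # Sum of the products of paired z-scores, divided by n - 1.
    return sum(((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
               for x, y in zip(xs, ys)) / (n - 1)

# Made-up illustration data: hours studied vs. exam score.
hours = [1, 2, 3, 4, 5]
score = [52, 60, 61, 70, 77]
r = correlation(hours, score)
print(round(r, 3))
```

In practice a statistics package or calculator would replace this function. The sketch only makes the steps visible.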
Correlation Properties
• Always between -1 and +1
• The sign of the correlation coefficient gives the direction of the association
• The correlation of x with y is the same as the correlation of y with x
• Unitless
• Not affected by changes in center or scale of the variables
• Measures the strength of a linear relationship
• Sensitive to outliers
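Several of these properties can be checked numerically. The following is a rough sketch on made-up data, with the hypothetical helper `pearson_r`. It swaps the roles of x and y, converts one variable from Celsius to Fahrenheit (a change of both center and scale), and reverses the response to show the sign tracking the direction.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: temperature in Celsius vs. ice cream sales.
celsius = [10, 15, 20, 25, 30]
sales = [12, 18, 21, 30, 34]

r = pearson_r(celsius, sales)

# Symmetry: correlation of x with y equals correlation of y with x.
r_swapped = pearson_r(sales, celsius)

# Shifting and rescaling a variable (Celsius to Fahrenheit) leaves r unchanged.
fahrenheit = [c * 9 / 5 + 32 for c in celsius]
r_rescaled = pearson_r(fahrenheit, sales)

# Reversing the response flips the direction, and with it the sign of r.
r_negated = pearson_r(celsius, [-s for s in sales])

print(f"r = {r:.4f}, swapped = {r_swapped:.4f}, "
      f"rescaled = {r_rescaled:.4f}, negated = {r_negated:.4f}")
```

The first three values agree up to floating-point rounding, and the last has the same magnitude with the opposite sign.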