Correlation
Minium, Clarke & Coladarci, Chapter 7
Association
• Univariate vs. bivariate: one variable vs. two variables
• When we have two variables, we can ask about the degree to which they co-vary
  • Is there any relationship between an individual’s score on one variable and his or her score on a second variable?
    • Number of beers consumed and reaction time (RT)
    • Number of hours of studying and score on an exam
    • Years of education and salary
    • Parent’s anxiety (or depression) and child anxiety (or depression)*
• The correlation coefficient
  • “A bivariate statistic that measures the degree of linear association between two quantitative variables.”
  • The Pearson product-moment correlation coefficient
Bivariate Distributions and Scatterplots
Scatter diagram
• Graph that shows the degree and pattern of the relationship between two variables
• Horizontal axis
  • Usually the variable that does the predicting (this is somewhat arbitrary)
  • e.g., price, studying, income, caffeine intake
• Vertical axis
  • Usually the variable that is predicted
  • e.g., quality, grades, happiness, alertness
Steps for making a scatter diagram
1. Draw axes and assign variables to them
2. Determine the range of values for each variable and mark the axes
3. Mark a dot for each person’s pair of scores
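The three steps can be sketched in a few lines of Python. This is only an illustration: the function name `text_scatter`, the text-grid output, and the five pairs of integer scores are assumptions made for the example, and a plotting library would normally be used instead.

```python
def text_scatter(xs, ys):
    """Return text rows that plot integer (x, y) pairs as '*' (illustrative helper)."""
    # Step 2: determine the range of values for each variable.
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Step 1: X goes on the horizontal axis (columns), Y on the vertical axis (rows).
    grid = [[" "] * (x_max - x_min + 1) for _ in range(y_max - y_min + 1)]
    # Step 3: mark a dot for each person's pair of scores.
    for x, y in zip(xs, ys):
        grid[y_max - y][x - x_min] = "*"  # top row holds the largest Y
    # Label each row with its Y value so the vertical axis is readable.
    return [f"{y_max - i:>3} |" + "".join(row) for i, row in enumerate(grid)]

# Hypothetical scores for five people.
x_scores = [9, 7, 5, 3, 1]
y_scores = [13, 9, 7, 11, 5]
rows = text_scatter(x_scores, y_scores)
for line in rows:
    print(line)
```

The dots drift upward to the right, which is the pattern of a positive relationship.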
• Linear correlation
  • Pattern on a scatter diagram is a straight line
• Curvilinear correlation
  • More complex relationship between variables
  • Pattern in a scatter diagram is not a straight line
• Positive linear correlation
  • High scores on one variable matched by high scores on another
  • Line slants up to the right
• Negative linear correlation
  • High scores on one variable matched by low scores on another
  • Line slants down to the right
• Zero correlation
  • No line, straight or otherwise, can be fit to the relationship between the two variables
  • Two variables are said to be “uncorrelated”
a. Negative linear correlation
b. Curvilinear correlation
c. Positive linear correlation
d. No correlation
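The Covariance
• Covariance is a number that reflects the degree and direction of association between two variables.
• Definition: Cov = Σ(X-Xm)(Y-Ym) / n, where Xm and Ym are the means of X and Y
• Note its similarity to the definition of variance: S² = Σ(X-Xm)² / n
• The logic of the covariance: a product (X-Xm)(Y-Ym) is positive when a person’s two scores fall on the same side of their means and negative when they fall on opposite sides, so the sign of the sum gives the direction of the association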
Example (Positive Correlation) (see p. 109)

Person    X     Y    X-Xm   Y-Ym   (X-Xm)(Y-Ym)
A         9    13      4      4         16
B         7     9      2      0          0
C         5     7      0     -2          0
D         3    11     -2      2         -4
E         1     5     -4     -4         16

n = 5    Xm = 5    Ym = 9    sum = 28
Cov = 28/5 = 5.6
Example (Negative Correlation) (see p. 109)

Person    X     Y    X-Xm   Y-Ym   (X-Xm)(Y-Ym)
A         9     5      4     -4        -16
B         7    11      2      2          4
C         5     7      0     -2          0
D         3     9     -2      0          0
E         1    13     -4      4        -16

n = 5    Xm = 5    Ym = 9    sum = -28
Cov = -28/5 = -5.6
Example (Zero Correlation) (see p. 109)

Person    X     Y    X-Xm   Y-Ym   (X-Xm)(Y-Ym)
A         9    13      4     2.8       11.2
B         7     9      2    -1.2       -2.4
C         5     7      0    -3.2        0.0
D         3     9     -2    -1.2        2.4
E         1    13     -4     2.8      -11.2

n = 5    Xm = 5    Ym = 10.2    sum = 0
Cov = 0/5 = 0
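The three worked examples can be checked with a short Python sketch. It divides the sum of deviation products by n, as the examples do (28/5, -28/5, 0/5). The function name `covariance` is just a label chosen for this illustration.

```python
def covariance(xs, ys):
    """Sum of (X - Xm)(Y - Ym) divided by n, matching the worked examples."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / n

x = [9, 7, 5, 3, 1]  # persons A to E, same X in all three examples

cov_positive = covariance(x, [13, 9, 7, 11, 5])
cov_negative = covariance(x, [5, 11, 7, 9, 13])
cov_zero = covariance(x, [13, 9, 7, 9, 13])

print(cov_positive, cov_negative, cov_zero)
```

Up to floating-point rounding, the three values should match the 5.6, -5.6, and 0 obtained by hand.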
The Pearson r: the Pearson product-moment coefficient of correlation
• The correlation coefficient, r, indicates the precise degree of linear correlation between two variables
• It can vary from -1 (perfect negative correlation) through 0 (no correlation) to +1 (perfect positive correlation)
• r is more useful than Cov because it is independent of the underlying scales of the two variables
  • If two variables produce an r of .5, for example, r will still equal .5 after any linear transformation of the two variables
  • Linear transformation: adding, subtracting, dividing or multiplying by a constant (multiplying or dividing by a negative constant reverses the sign of r but leaves its magnitude unchanged)
  • e.g., converting Celsius to Fahrenheit: F = 32 + 1.8C
  • e.g., converting Fahrenheit to Celsius: C = (F - 32) / 1.8
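The scale-independence of r can be illustrated with a small Python sketch. It assumes the usual definition r = Cov / (Sx · Sy) with n in every denominator, which the slides do not spell out. The data and the name `pearson_r` are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r as Cov / (Sx * Sy), dividing by n throughout (assumed convention)."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / n
    s_x = sqrt(sum((x - x_mean) ** 2 for x in xs) / n)
    s_y = sqrt(sum((y - y_mean) ** 2 for y in ys) / n)
    return cov / (s_x * s_y)

# Hypothetical data: temperatures in Celsius paired with some score.
celsius = [9, 7, 5, 3, 1]
score = [13, 9, 7, 11, 5]

fahrenheit = [32 + 1.8 * c for c in celsius]  # F = 32 + 1.8C
negated = [-2 * c for c in celsius]           # multiplying by a negative constant

r_celsius = pearson_r(celsius, score)
r_fahrenheit = pearson_r(fahrenheit, score)
r_negated = pearson_r(negated, score)

print(round(r_celsius, 3), round(r_fahrenheit, 3), round(r_negated, 3))
```

Converting the temperatures to Fahrenheit leaves r unchanged, while multiplying by -2 only flips its sign. The covariance, by contrast, would change with each rescaling.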
Figure: scatterplots illustrating r = .81, r = -.75, r = .46, r = -.42, r = .16, and r = -.18
Correlation and Causality
• When two variables are correlated, there are three possible directions of causality
  • 1st variable causes 2nd
  • 2nd variable causes 1st
  • Some 3rd variable causes both the 1st and the 2nd
• There is inherent ambiguity in correlations
Factors influencing the Pearson r
• Linearity
  • “To the extent that a bivariate distribution departs from linearity, r will underestimate that relationship.” (p. 121)
• Outliers
  • “Discrepant data points, or outliers, affect the magnitude of r and the direction of the effect depending on the outlier’s location in the scatterplot.” (p. 122)
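Both effects can be seen in a small Python sketch. All data here are made up for illustration, and `pearson_r` is a hypothetical helper that assumes the usual definition r = Cov / (Sx · Sy).

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r as Cov / (Sx * Sy), dividing by n throughout (assumed convention)."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / n
    s_x = sqrt(sum((x - x_mean) ** 2 for x in xs) / n)
    s_y = sqrt(sum((y - y_mean) ** 2 for y in ys) / n)
    return cov / (s_x * s_y)

# Linearity: Y is completely determined by X (Y = X squared), yet r is zero
# because the relationship is curvilinear, not linear.
x_curve = [-2, -1, 0, 1, 2]
y_curve = [x ** 2 for x in x_curve]
r_curve = pearson_r(x_curve, y_curve)

# Outliers: five hypothetical points with a clear positive trend ...
x_base = [1, 2, 3, 4, 5]
y_base = [2, 1, 4, 3, 5]
r_without = pearson_r(x_base, y_base)

# ... and the same points plus one discrepant point at (10, 0).
r_with = pearson_r(x_base + [10], y_base + [0])

print(round(r_curve, 3), round(r_without, 3), round(r_with, 3))
```

In this made-up data a single outlier is enough to turn a strong positive r into a negative one, which is why the scatterplot should be inspected before r is interpreted.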
• Restriction of Range
  • “Other things being equal, restricted variation in either X or Y will result in a lower Pearson r than would be obtained were variation greater.” (p. 122)
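A minimal Python sketch of restriction of range follows. The ten data points are invented for illustration, and `pearson_r` is a hypothetical helper that assumes r = Cov / (Sx · Sy).

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r as Cov / (Sx * Sy), dividing by n throughout (assumed convention)."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / n
    s_x = sqrt(sum((x - x_mean) ** 2 for x in xs) / n)
    s_y = sqrt(sum((y - y_mean) ** 2 for y in ys) / n)
    return cov / (s_x * s_y)

# Hypothetical full sample: X ranges from 1 to 10, Y follows X with some scatter.
x_full = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_full = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]
r_full = pearson_r(x_full, y_full)

# Restricted sample: keep only the cases with X between 5 and 8.
kept = [(x, y) for x, y in zip(x_full, y_full) if 5 <= x <= 8]
x_restricted = [x for x, _ in kept]
y_restricted = [y for _, y in kept]
r_restricted = pearson_r(x_restricted, y_restricted)

print(round(r_full, 3), round(r_restricted, 3))
```

The scatter around the trend is the same in both samples, but with less variation in X the trend accounts for less of the picture, so r drops.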
• Context
  • “Because of the many factors that influence r, there is no such thing as the correlation between two variables. Rather, the obtained r must be interpreted in full view of the factors that affect it and the particular conditions under which it was obtained.” (p. 124)
Judging the Strength of Association
• r²: proportion of common variance
  • The coefficient of determination, r², is the proportion of common variance shared by two variables.
  • We will talk more about this when we discuss Chapter 8.
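As a worked illustration, take five pairs of scores, X = 9, 7, 5, 3, 1 and Y = 13, 9, 7, 11, 5, for which the covariance is 5.6. The arithmetic assumes the usual definition r = Cov / (S_X S_Y) with n in each denominator, which is not stated on the slides.

```latex
\begin{align*}
S_X^2 &= \frac{\sum (X - \bar{X})^2}{n} = \frac{16 + 4 + 0 + 4 + 16}{5} = 8 \\
S_Y^2 &= \frac{\sum (Y - \bar{Y})^2}{n} = \frac{16 + 0 + 4 + 4 + 16}{5} = 8 \\
r &= \frac{\mathrm{Cov}}{S_X S_Y} = \frac{5.6}{\sqrt{8}\,\sqrt{8}} = .70 \\
r^2 &= (.70)^2 = .49
\end{align*}
```

Under these assumptions, about 49% of the variance is common to the two variables, even though r itself is .70.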