Correlation

Correlation Overview and interpretation

Making a Scatterplot • Line up the data in columns (eliminate cases with missing data) • Plot each student’s score on each variable (Adapted from Wiersma, W., & Jurs, S. G. (1990). Educational measurement and testing (2nd ed.). Needham Heights, MA: Allyn and Bacon.)

Inspect the scatterplot • The correlation coefficient (Pearson r) can only be interpreted for linear relationships • These are all examples of linear relationships • The strength of the correlation varies from plot to plot (Shavelson, R. J. (1996). Statistical reasoning for the behavioral sciences (3rd ed.). Needham Heights, MA: Allyn & Bacon.)

Inspect the scatterplot (2) • If you see these types of distributions, you are dealing with a curvilinear relationship
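Why a curvilinear pattern matters can be seen in a small sketch. The data below are made up for illustration: y depends perfectly on x, but along a U-shaped curve, and Pearson r still comes out at zero. The helper name pearson_r is an assumption of this sketch, not something from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson r computed from deviation scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a perfect U-shaped (curvilinear) relationship.
x = [-3, -2, -1, 0, 1, 2, 3]
y_curved = [v ** 2 for v in x]

r_curved = pearson_r(x, y_curved)
print(r_curved)  # 0.0: r misses a relationship that is not linear
```

A coefficient near zero therefore does not rule out a strong relationship; it only rules out a strong linear one, which is why the scatterplot is inspected first.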

Inspect the scatterplot (3) • Students who seem to be ‘out on their own’ in the scatterplot are called outliers • Including outliers in the calculation can change the apparent relationship
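The effect of a single outlier can be illustrated with a small sketch. The scores are invented for the example, and pearson_r is a hypothetical helper defined here rather than anything taken from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson r computed from deviation scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores for eight students on two tests.
test_a = [1, 2, 3, 4, 5, 6, 7, 8]
test_b = [2, 1, 4, 3, 6, 5, 8, 7]
r_without = pearson_r(test_a, test_b)

# Add one student who scored high on test A but zero on test B.
r_with = pearson_r(test_a + [9], test_b + [0])

print(round(r_without, 2))  # 0.9
print(round(r_with, 2))     # 0.33
```

One extra case out of nine is enough to move the coefficient from strong to weak here, so it is worth checking whether an outlier reflects a data-entry error or a genuinely unusual student before deciding how to treat it.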

Pearson r correlation coefficient • Ranges from -1.0 (perfect inverse correlation) to +1.0 (perfect positive correlation) • The sign (+, -) shows the direction of the relationship • The number shows the strength of the relationship (regardless of sign) • No relationship is 0.0

The formula • Note that other, equivalent formulas are also possible.
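One widely used version is the deviation-score formula, $r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}}$. Whether this is the version shown on the slide is an assumption. The sketch below implements it directly on invented scores; the function and variable names are illustrative only.

```python
import math

def pearson_r(xs, ys):
    """Deviation-score form of Pearson r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of cross-products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the sums of squares.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired scores for five students.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_example = pearson_r(x, y)               # moderate-to-strong positive
r_pos = pearson_r(x, [2 * v for v in x])  # perfect positive: +1.0
r_neg = pearson_r(x, [-v for v in x])     # perfect inverse: -1.0

print(round(r_example, 3), r_pos, r_neg)
```

The two extreme cases show the range described earlier: scores that rise together in a straight line give +1.0, and scores that move in exactly opposite directions give -1.0.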

Assumptions of Pearson correlation • Each pair of scores is independent • Each set of scores is normally distributed • The relationship between scores is linear

Interpreting correlation • Correlation merely shows a relationship between two variables, not the meaning of the relationship • Correlation is not causation • Statistical significance does not imply importance • Statistical significance merely indicates that the correlation strength is greater than one would expect by chance

Statistical significance of r (Cody & Smith, 1997) • Imagine a population with a zero correlation • Now, sample 10 points from this population • The resulting sample would probably have a non-zero correlation
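The thought experiment above can be tried out as a rough simulation. This is a sketch under stated assumptions: both variables are drawn independently from a standard normal distribution (so the population correlation is zero), the seed and the number of repetitions are arbitrary, and 0.632 is the tabled two-tailed critical r for N - 2 = 8 at alpha = .05.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson r computed from deviation scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def sample_r(n):
    """Draw n independent (x, y) pairs and return their sample r."""
    xs = [random.gauss(0, 1) for _ in range(n)]
    ys = [random.gauss(0, 1) for _ in range(n)]  # unrelated to xs
    return pearson_r(xs, ys)

random.seed(0)  # arbitrary seed so the run is repeatable
sample_rs = [sample_r(10) for _ in range(2000)]

# Typical size of r in a sample of 10, even though the true value is 0.
mean_abs_r = sum(abs(r) for r in sample_rs) / len(sample_rs)

# Share of samples whose r reaches the tabled critical value.
share_large = sum(abs(r) >= 0.632 for r in sample_rs) / len(sample_rs)

print(round(mean_abs_r, 2), round(share_large, 3))
```

Nearly every sample yields a non-zero r, and only a small share (close to the chosen alpha) reaches the critical value. That small share is what "larger than one would expect by chance" refers to.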

Statistical significance • If a correlation is much larger than what one would expect by chance, it is considered to be significant • Significant does not mean important or strong • Significant merely means that the size of the correlation coefficient is larger than would be expected by a chance sampling from a zero correlation population

Determining significance • Most statistical software packages will automatically flag significant correlations • If checking by hand, compare the r value with the appropriate table • 2-tailed decision at alpha = .05 is common • If the value of r is equal to or larger than the value in the table, the correlation is significant

• This is the table from the back of a statistics book • Decide the level of certainty that you want • Find the number corresponding to your N - 2 (the degrees of freedom) • Check whether your correlation coefficient, ignoring its sign, is as large as or larger than the one in the table
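The table lookup can be sketched in code. The critical r values in such tables follow from critical t values through r = t / sqrt(t² + df), with df = N - 2. The sketch below hard-codes a few two-tailed critical t values at alpha = .05 from a standard t table; the dictionary covers only those sample sizes, and the function names are hypothetical.

```python
import math

# Two-tailed critical t at alpha = .05 for a few degrees of freedom,
# copied from a standard t table (extend as needed).
T_CRIT_05 = {8: 2.306, 10: 2.228, 18: 2.101, 28: 2.048}

def critical_r(n):
    """Critical r for a sample of n pairs (df = n - 2), alpha = .05."""
    df = n - 2
    t = T_CRIT_05[df]  # raises KeyError for df values not listed above
    return t / math.sqrt(t * t + df)

def is_significant(r, n):
    """True if |r| is equal to or larger than the critical value."""
    return abs(r) >= critical_r(n)

print(round(critical_r(10), 3))  # 0.632
print(is_significant(0.70, 10))  # True
print(is_significant(0.50, 10))  # False
print(is_significant(0.50, 20))  # True: same r, larger sample
```

The last two lines show why significance is not the same as strength: an r of .50 is not significant with 10 pairs but is with 20, because the critical value shrinks as N grows.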

Coefficient of determination • The coefficient of determination (r²) is a measure of the shared variance between the two variables (Shavelson, 1996) • For example, r = .60 gives r² = .36, so the two variables share 36% of their variance

Potential problems in correlation analysis • Restriction of range (e.g., correlating TOEFL, GRE, etc. with grade point average) • Skewness (e.g., a test that is too easy or too difficult) • Attribution of causality: variables must be correlated to claim that they are causally related, but correlation alone is not sufficient to prove causality
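Restriction of range can be illustrated with a simulation. Everything below is an assumption made for the sketch: the scores are simulated standardized values with a population correlation of .60, and "admission" means keeping only cases more than one standard deviation above the mean on the first variable.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson r computed from deviation scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(1)  # arbitrary seed so the run is repeatable

# Simulated standardized "admission test" and "GPA" scores.
# The weights 0.6 and 0.8 give a population correlation of 0.6.
test = [random.gauss(0, 1) for _ in range(5000)]
gpa = [0.6 * t + 0.8 * random.gauss(0, 1) for t in test]
r_full = pearson_r(test, gpa)

# Keep only the "admitted" cases: test score above +1 SD.
admitted = [(t, g) for t, g in zip(test, gpa) if t > 1.0]
r_restricted = pearson_r([t for t, _ in admitted],
                         [g for _, g in admitted])

print(round(r_full, 2), round(r_restricted, 2))
```

Among the admitted group the test scores vary much less, so the observed correlation with GPA is noticeably lower than in the full group, even though the underlying relationship is unchanged.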

Point-biserial correlation • Used to correlate a dichotomous variable with a continuous variable • In testing, used to correlate a person’s performance on an item (correct, incorrect) with their total test score • Used as an index of item discrimination

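Point-biserial formula
$$ r_{pbi} = \frac{M_p - M_q}{S_t} \sqrt{p\,q} $$
• M_p = mean on the test for people who got the item correct • M_q = mean on the test for people who got the item incorrect • S_t = standard deviation for the test • p = IF for the item • q = 1 - IF for the item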
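A minimal sketch of the point-biserial calculation from the quantities listed above: the two group means, the test standard deviation, and the item facility (IF). The data are invented, the function names are hypothetical, and the standard deviation is assumed to be the population form (dividing by N), which is the form that makes the result identical to Pearson r on the 0/1 item scores.

```python
import math

def point_biserial(item, totals):
    """item: 1 (correct) or 0 (incorrect) per person; totals: test scores."""
    n = len(totals)
    n_correct = sum(item)
    p = n_correct / n  # IF for the item
    q = 1 - p          # 1 - IF for the item
    mean_p = sum(t for i, t in zip(item, totals) if i == 1) / n_correct
    mean_q = sum(t for i, t in zip(item, totals) if i == 0) / (n - n_correct)
    mean_t = sum(totals) / n
    # Population standard deviation of the total scores (divide by N).
    sd_t = math.sqrt(sum((t - mean_t) ** 2 for t in totals) / n)
    return (mean_p - mean_q) / sd_t * math.sqrt(p * q)

def pearson_r(xs, ys):
    """Ordinary Pearson r, used here only as a cross-check."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: one item and total scores for eight test takers.
item = [1, 1, 1, 0, 1, 0, 0, 0]
totals = [9, 8, 7, 6, 6, 4, 3, 2]

r_pb = point_biserial(item, totals)
r_check = pearson_r(item, totals)
print(round(r_pb, 3), round(r_check, 3))  # the two values agree
```

People who answered the item correctly have a higher mean total score than those who did not, so the coefficient is positive, which is what a discriminating item should show.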

TAP output

                Number   Item   Disc.  # Correct    # Correct    Point    Adj.
Item     Key    Correct  Diff.  Index  in High Grp  in Low Grp   Biser.   Pt Bis
-------  -----  -------  -----  -----  -----------  -----------  -------  -------
Item 01  (2 )   22       0.44   0.72   14 (0.93)     3 (0.21)    0.64     0.60
Item 02  (4 )   29       0.58   0.58   13 (0.87)     4 (0.29)    0.51     0.47
Item 03  (4 )   35       0.70   0.71   15 (1.00)     4 (0.29)    0.52     0.48
Item 04  (3 )   26       0.52   0.72   14 (0.93)     3 (0.21)    0.63     0.59
Item 05  (2 )   37       0.74   0.50   15 (1.00)     7 (0.50)    0.38     0.34
Item 06  (1 )   19       0.38   0.72   13 (0.87)     2 (0.14)    0.59     0.55
Item 07  (3 )   36       0.72   0.43   14 (0.93)     7 (0.50)    0.34     0.28
Item 08  (4 )   23       0.46   0.79   15 (1.00)     3 (0.21)    0.63     0.59
Item 09  (4 )   23       0.46   0.79   14 (0.93)     2 (0.14)    0.61     0.56
Item 10  (4 )#  37       0.74   0.22   13 (0.87)     9 (0.64)    0.18     0.12
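How some of the columns relate to one another can be sketched as follows. The group sizes are not stated in the output: 50 test takers, 15 in the high group and 14 in the low group are assumptions inferred from the printed proportions (for example 22 correct and 0.44, or 14 correct and 0.93). The function names are hypothetical.

```python
# Assumed sizes, inferred from the proportions in the output above.
N_EXAMINEES = 50
N_HIGH = 15
N_LOW = 14

def item_difficulty(n_correct, n_examinees=N_EXAMINEES):
    """Proportion of all test takers who answered the item correctly."""
    return n_correct / n_examinees

def discrimination_index(high_correct, low_correct,
                         n_high=N_HIGH, n_low=N_LOW):
    """Proportion correct in the high group minus the low group."""
    return high_correct / n_high - low_correct / n_low

# (number correct, correct in high group, correct in low group)
rows = {
    "Item 01": (22, 14, 3),
    "Item 05": (37, 15, 7),
    "Item 10": (37, 13, 9),
}

for name, (n_correct, high, low) in rows.items():
    diff = item_difficulty(n_correct)
    disc = discrimination_index(high, low)
    print(name, round(diff, 2), round(disc, 2))
```

Under these assumptions the computed values match the Item Diff. and Disc. Index columns for the three items shown. Item 10 stands out: it has the lowest discrimination index and also the lowest point-biserial in the table.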
