
Correlation


jed

Presentation Transcript


  1. Correlation: Overview and interpretation

  2. Making a Scatterplot
     • Line up the data in columns (eliminate missing data)
     • Plot each student’s score on each variable
     [Figure: scatterplot with one point labeled “Bill”]
     Adapted from Wiersma, W., & Jurs, S. G. (1990). Educational measurement and testing (2nd ed.). Needham Heights, MA: Allyn and Bacon.
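
The "line up the data and eliminate missing data" step can be sketched in Python as follows. This is a minimal illustration, not part of the original slides; the function name `complete_pairs`, the variable names, and the scores are all hypothetical.

```python
def complete_pairs(scores_a, scores_b):
    """Keep only the students who have a score on both variables."""
    return [
        (a, b)
        for a, b in zip(scores_a, scores_b)
        if a is not None and b is not None
    ]


# Hypothetical scores for five students; None marks a missing value.
reading = [55, 62, None, 71, 80]
listening = [50, 60, 65, None, 78]

# Each remaining (x, y) pair becomes one point on the scatterplot.
pairs = complete_pairs(reading, listening)
print(pairs)
```

Students 3 and 4 are dropped because each is missing one of the two scores, so only complete pairs are plotted.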

  3. Inspect the scatterplot
     • The correlation coefficient (Pearson r) can only be interpreted for linear relationships
     • These are all examples of linear relationships
     • The strength of the correlation varies
     Shavelson, R. J. (1996). Statistical reasoning for the behavioral sciences (3rd ed.). Needham Heights, MA: Allyn & Bacon.

  4. Inspect the scatterplot (2)
     • If you see these types of distributions, you are dealing with a curvilinear relationship
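
To see why a curvilinear pattern matters, the sketch below computes Pearson r for a perfectly U-shaped relationship. The data are made up for illustration, and `pearson_r` is a hand-written helper using the deviation-score form of the formula, not code from the original presentation.

```python
import math


def pearson_r(xs, ys):
    """Pearson r from deviation scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: y depends entirely on x, but through a U-shaped curve.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curve = pearson_r(xs, ys)
print(r_curve)
```

Here r is zero even though y is fully determined by x, which is why the scatterplot should be inspected before r is interpreted.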

  5. Inspect the scatterplot (3)
     • Students who seem to be ‘out on their own’ in the scatterplot are called outliers
     • Including outliers in the calculation can change the relationship
     [Figure: scatterplot with one point labeled “outlier”]
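
The effect of a single outlier can be illustrated with a small, deliberately artificial data set. The scores are hypothetical and `pearson_r` is a hand-written helper, so treat this as a sketch rather than a recommended procedure.

```python
import math


def pearson_r(xs, ys):
    """Pearson r from deviation scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Five hypothetical students who fall exactly on a line.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
r_without = pearson_r(xs, ys)

# Add one student who is 'out on their own': high on x, very low on y.
r_with = pearson_r(xs + [6], ys + [0])

print(round(r_without, 2), round(r_with, 2))
```

With these made-up numbers, one added point moves r from a perfect correlation to a weak one, so it is worth checking whether outliers are driving a result.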

  6. Pearson r correlation coefficient
     • Ranges from -1.0 (perfect inverse correlation) to +1.0 (perfect correlation)
     • The sign (+, -) shows the direction of the relationship
     • The number shows the strength of the relationship (regardless of sign)
     • A value of 0.0 indicates no linear relationship

  7. The formula
     Note that other equivalent formulas are also possible.
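
The formula shown on the slide is not preserved in this transcript. For reference, two standard and algebraically equivalent ways of writing Pearson r are given below; which form the slide used is unknown, and the symbols (X, Y for the paired scores, N for the number of pairs) are chosen here for illustration.

```latex
% Deviation-score form
r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}
         {\sqrt{\sum (X - \bar{X})^{2} \, \sum (Y - \bar{Y})^{2}}}

% Raw-score (computational) form
r = \frac{N \sum XY - \sum X \sum Y}
         {\sqrt{\left[ N \sum X^{2} - \left( \sum X \right)^{2} \right]
                \left[ N \sum Y^{2} - \left( \sum Y \right)^{2} \right]}}
```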

  8. Assumptions of Pearson correlation
     • Each pair of scores is independent
     • Each set of scores is normally distributed
     • The relationship between scores is linear

  9. Interpreting correlation
     • Correlation merely shows a relationship between two variables, not the meaning of the relationship
     • Correlation is not causation
     • Statistical significance does not imply importance
     • Statistical significance merely indicates that the correlation strength is greater than one would expect by chance

  10. Statistical significance of r (Cody & Smith, 1997)
     • Imagine a population with a zero correlation
     • Now, sample 10 points from this population
     • The resulting sample would probably have a non-zero correlation
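
The thought experiment above can be simulated. The sketch below draws many samples of 10 pairs from two independent normal variables (so the population correlation is zero) and looks at the sample correlations. The seed, the number of samples, and the helper `pearson_r` are choices made for this illustration; 0.632 is the tabled two-tailed critical value at alpha = .05 for N - 2 = 8.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson r from deviation scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is repeatable

sample_rs = []
for _ in range(1000):
    # x and y are generated independently, so the population r is zero.
    xs = [rng.gauss(0, 1) for _ in range(10)]
    ys = [rng.gauss(0, 1) for _ in range(10)]
    sample_rs.append(pearson_r(xs, ys))

# Share of samples whose |r| reaches the tabled critical value.
share_large = sum(abs(r) >= 0.632 for r in sample_rs) / len(sample_rs)
print(min(sample_rs), max(sample_rs), share_large)
```

Individual samples show clearly non-zero correlations purely by chance, and roughly 5% of them should reach the critical value, which is what the alpha = .05 criterion means.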

  11. Statistical significance
     • If a correlation is much larger than what one would expect by chance, it is considered to be significant
     • Significant does not mean important or strong
     • Significant merely means that the size of the correlation coefficient is larger than would be expected by a chance sampling from a zero correlation population

  12. Determining significance
     • Most statistical software packages will automatically flag significant correlations
     • If checking by hand, compare the r value with the appropriate table
     • 2-tailed decision at alpha = .05 is common
     • If the value of r is equal to or larger than the value in the table, the correlation is significant

  13. Decide the level of certainty that you want
     • This is the table from the back of a statistics book
     • Find the number corresponding to your N - 2
     • Check to see if your correlation coefficient is as large or larger than the one in the table
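
The table lookup can also be reproduced numerically. Tables of critical r values are derived from the t distribution with N - 2 degrees of freedom, so a critical t can be converted into a critical r. The sketch below assumes N = 10 and an observed r of .70, both hypothetical; the function names are illustrative, and 2.306 is the standard two-tailed critical t at alpha = .05 for 8 degrees of freedom.

```python
import math


def t_from_r(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    df = n - 2
    return r * math.sqrt(df) / math.sqrt(1 - r ** 2)


def critical_r(critical_t, df):
    """Convert a critical t value into the matching critical r."""
    return critical_t / math.sqrt(critical_t ** 2 + df)


N = 10                  # hypothetical sample size, so N - 2 = 8
CRITICAL_T_DF8 = 2.306  # two-tailed, alpha = .05, df = 8

r_crit = critical_r(CRITICAL_T_DF8, N - 2)
observed_r = 0.70       # hypothetical observed correlation
is_significant = abs(observed_r) >= r_crit

print(round(r_crit, 3), is_significant)
```

The computed critical value matches the tabled .632 for N - 2 = 8, so an observed r of .70 from 10 pairs would be flagged as significant at this level.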

  14. Coefficient of determination
     • The coefficient of determination (r²) is a measure of the shared variance between the two variables (Shavelson, 1996)
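
As a worked example with a hypothetical correlation of r = .60:

```latex
r = .60 \quad\Longrightarrow\quad r^{2} = (.60)^{2} = .36
```

In this illustration the two variables share 36% of their variance, and the remaining 64% is not shared. Because squaring removes the sign, r = -.60 gives the same r².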

  15. Potential problems in correlation analysis
     • Restriction of range
        • e.g., correlation of TOEFL, GRE, etc. with grade point average
     • Skewness
        • e.g., test too easy or too difficult
     • Attribution of causality
        • Variables must be correlated to claim that they are causally related, but correlation alone is not sufficient to prove causality
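
Restriction of range can be demonstrated with a small constructed data set: the same underlying relationship yields a much lower r when only the top of the x range is kept, as happens when only admitted students (those with high test scores) have a grade point average to correlate. The data, the cutoff of 15, and the helper `pearson_r` are all invented for this sketch.

```python
import math


def pearson_r(xs, ys):
    """Pearson r from deviation scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores: y follows x with an alternating +3 / -3 disturbance.
xs = list(range(20))
ys = [x + (3 if x % 2 == 0 else -3) for x in xs]
r_full = pearson_r(xs, ys)

# Keep only the top of the x range (x >= 15), as if lower scorers were never observed.
kept = [(x, y) for x, y in zip(xs, ys) if x >= 15]
r_restricted = pearson_r([x for x, _ in kept], [y for _, y in kept])

print(round(r_full, 2), round(r_restricted, 2))
```

The disturbance is the same size in both cases; r drops only because the spread of x has been cut down.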

  16. Point-biserial correlation
     • Used to correlate a dichotomous variable with a continuous variable
     • In testing, used to correlate a person’s performance on an item (correct, incorrect) with their total test score
     • Used as an index of item discrimination

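  17. Point-biserial formula

     $$ r_{pbi} = \frac{\bar{X}_p - \bar{X}_q}{S_t} \sqrt{pq} $$

     • p = IF (item facility) for the item
     • q = 1 - IF for the item
     • \bar{X}_p = mean on the test for people who got the item correct
     • \bar{X}_q = mean on the test for people who got the item incorrect
     • S_t = standard deviation for the test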
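
The quantities listed on this slide combine in the standard point-biserial formula, (mean for correct - mean for incorrect) / SD × sqrt(p × q). The Python sketch below applies it to four hypothetical examinees. The function name and data are invented, and the population (divide-by-N) standard deviation is assumed, which makes the result equal to Pearson r between the 0/1 item score and the total score.

```python
import math


def point_biserial(item_correct, total_scores):
    """Point-biserial correlation between a 0/1 item and total test score."""
    n = len(total_scores)
    right = [t for c, t in zip(item_correct, total_scores) if c]
    wrong = [t for c, t in zip(item_correct, total_scores) if not c]

    p = len(right) / n   # item facility (proportion correct)
    q = 1 - p            # proportion incorrect

    mean_right = sum(right) / len(right)
    mean_wrong = sum(wrong) / len(wrong)

    # Population standard deviation of the total scores (assumption).
    mean_all = sum(total_scores) / n
    sd = math.sqrt(sum((t - mean_all) ** 2 for t in total_scores) / n)

    return (mean_right - mean_wrong) / sd * math.sqrt(p * q)


# Hypothetical data: 1 = item answered correctly, 0 = incorrectly.
item = [1, 1, 0, 0]
totals = [4, 3, 2, 1]
r_pbi = point_biserial(item, totals)
print(round(r_pbi, 2))
```

The function assumes at least one examinee answered correctly and at least one incorrectly; otherwise one of the group means is undefined.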

  18. TAP output

                     Number   Item   Disc.  # Correct    # Correct    Point   Adj.
     Item     Key    Correct  Diff.  Index  in High Grp  in Low Grp   Biser.  Pt Bis
     -------  -----  -------  -----  -----  -----------  -----------  ------- -------
     Item 01  (2 )   22       0.44   0.72   14 (0.93)    3 (0.21)     0.64    0.60
     Item 02  (4 )   29       0.58   0.58   13 (0.87)    4 (0.29)     0.51    0.47
     Item 03  (4 )   35       0.70   0.71   15 (1.00)    4 (0.29)     0.52    0.48
     Item 04  (3 )   26       0.52   0.72   14 (0.93)    3 (0.21)     0.63    0.59
     Item 05  (2 )   37       0.74   0.50   15 (1.00)    7 (0.50)     0.38    0.34
     Item 06  (1 )   19       0.38   0.72   13 (0.87)    2 (0.14)     0.59    0.55
     Item 07  (3 )   36       0.72   0.43   14 (0.93)    7 (0.50)     0.34    0.28
     Item 08  (4 )   23       0.46   0.79   15 (1.00)    3 (0.21)     0.63    0.59
     Item 09  (4 )   23       0.46   0.79   14 (0.93)    2 (0.14)     0.61    0.56
     Item 10  (4 )#  37       0.74   0.22   13 (0.87)    9 (0.64)     0.18    0.12
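
Two of the columns in the output above can be recomputed from the counts. The sketch below assumes 50 examinees in total, a high group of 15, and a low group of 14. These sizes are not stated in the output; they are inferred from its proportions (22 correct = 0.44, 15 correct = 1.00, 7 correct = 0.50). The function names are illustrative and are not part of TAP.

```python
def item_facility(n_correct, n_examinees):
    """Item difficulty (facility): proportion of examinees answering correctly."""
    return n_correct / n_examinees


def discrimination_index(correct_high, n_high, correct_low, n_low):
    """Proportion correct in the high group minus proportion correct in the low group."""
    return correct_high / n_high - correct_low / n_low


# Group sizes inferred from the proportions in the output (assumption).
N_TOTAL, N_HIGH, N_LOW = 50, 15, 14

# Item 01: 22 correct overall, 14 in the high group, 3 in the low group.
print(round(item_facility(22, N_TOTAL), 2))
print(round(discrimination_index(14, N_HIGH, 3, N_LOW), 2))

# Item 10: 13 correct in the high group, 9 in the low group.
print(round(discrimination_index(13, N_HIGH, 9, N_LOW), 2))
```

Under these assumptions the computed values match the Item Diff. and Disc. Index columns for Item 01 (0.44 and 0.72) and the Disc. Index for Item 10 (0.22), the item with the weakest discrimination in the output.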
