Correlation and Prediction

barbarae

Presentation Transcript


  1. Correlation and Prediction Chapter 3

  2. Chapter Outline • Graphing Correlations: The Scatter Diagram • Patterns of Correlation • The Correlation Coefficient • Issues in Interpreting the Correlation Coefficient • Prediction • The Correlation Coefficient and Proportion of Variance Accounted for • Correlation and Prediction in Research Articles • Advanced Topic: Multiple Regression • Advanced Topic: Multiple Regression in Research Articles

  3. Correlations • Can be thought of as a descriptive statistic for the relationship between two variables • Describes the relationship between two equal-interval numeric variables • e.g., correlation between amount of time studying and amount learned • e.g., correlation between number of years of education and salary

  4. Correlation instruct.uwo.ca/geog/500/correlation_by_6.pdf

  5. Scatter Diagram or Scatter Plot • Graph showing the pattern of the relationship between two variables

  6. Patterns of Correlation • A linear correlation • relationship between two variables on a scatter diagram roughly approximating a straight line • Curvilinear correlation • any association between two variables other than a linear correlation • relationship between two variables that shows up on a scatter diagram as dots following a systematic pattern that is not a straight line • No correlation • no systematic relationship between two variables

  7. Positive and Negative Linear Correlation • Positive Correlation • High scores go with high scores. • Low scores go with low scores. • Medium scores go with medium scores. • e.g., level of education achieved and income • Negative Correlation • High scores go with low scores. • e.g., the relationship between fewer hours of sleep and higher levels of stress • Strength of the Correlation • how close the dots on a scatter diagram fall to a simple straight line

  8. Positive Linear correlation

  9. Negative correlation

  10. Zero Correlation ludwig-sun2.unil.ch/~darlene/Rmini/lec/20021031.ppt

  11. Curvilinear Relationship ludwig-sun2.unil.ch/~darlene/Rmini/lec/20021031.ppt

  12. Curvilinear

  13. How Are You Doing? • What does it mean when two variables have a curvilinear relationship? • True or False: When two variables are negatively correlated, high scores go with high scores, low scores go with low scores, and medium scores go with medium scores.

  14. The Correlation Coefficient • A number that gives the exact degree of correlation between two variables • can tell you direction and strength • uses Z scores to compare scores on different variables • Z scores allow you to calculate a cross-product that tells you the direction of the correlation. • A cross-product is the result of multiplying a score on one variable by a score on the other variable. • If you multiply a high (positive) Z score by a high (positive) Z score, you will always get a positive cross-product. • If you multiply a low (negative) Z score by a low (negative) Z score, you will always get a positive cross-product. • If you multiply a high Z score by a low Z score or a low Z score by a high Z score, you will get a negative cross-product.
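A minimal Python sketch of the cross-product idea on this slide. The Z scores below are hypothetical values for four people, chosen only to show how the sign of each cross-product follows from the signs of the two Z scores.

```python
# Hypothetical Z scores (Zx, Zy) for four people; not data from the slides.
pairs = [(1.5, 1.1), (-1.2, -0.9), (1.0, -0.7), (-0.8, 1.3)]


def cross_product(zx, zy):
    """Multiply a person's Z score on X by their Z score on Y."""
    return zx * zy


# High with high and low with low give "+"; mixed pairs give "-".
signs = ["+" if cross_product(zx, zy) > 0 else "-" for zx, zy in pairs]
print(signs)
```

When most cross-products are positive, their sum is positive and the correlation is positive; when most are negative, the correlation is negative.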

  15. The Correlation Coefficient (r) • The sign of r (Pearson correlation coefficient) tells the general trend of a relationship between two variables. • A + sign means the correlation is positive. • A - sign means the correlation is negative. • The value of r ranges from -1 to 1. • 1 is the highest value a correlation can have, and -1 is the lowest. • A correlation of 1 or -1 means that the variables are perfectly correlated. • 0 = no linear correlation • The absolute value of a correlation defines the strength of the correlation regardless of the sign. • e.g., -.99 is a stronger correlation than .75

  16. Formula for a Correlation Coefficient • r = (∑ZxZy)/N • Zx = Z score for each person on the X variable • Zy = Z score for each person on the Y variable • ZxZy = cross-product of Zx and Zy • ∑ZxZy = sum of the cross-products of the Z scores over all participants in the study
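The formula can be sketched in Python as follows. This assumes that N is the number of participants and that Z scores are computed with the population standard deviation (dividing by N), which is what makes the average cross-product equal Pearson's r. The function names and the data are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Convert raw scores to Z scores using the population SD (divide by N)."""
    m = mean(values)
    sd = pstdev(values)
    return [(v - m) / sd for v in values]


def pearson_r(xs, ys):
    """r = (sum of ZxZy cross-products) / N."""
    zx = z_scores(xs)
    zy = z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


# Hypothetical scores for five people on variables X and Y.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))
```

A perfectly linear increasing pattern such as `pearson_r([1, 2, 3], [2, 4, 6])` comes out at 1 (up to floating-point rounding), and a perfectly linear decreasing pattern at -1.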

  17. Pearson Correlation Coefficient • Pearson correlation coefficient "r" is the average value of the cross-product of Zx and Zy • r is a measure of LINEAR ASSOCIATION (Direction: + vs. – and Strength: How much

  18. Definitional Formula

  19. Computational Formula

  20. Bivariate Correlation

  21. Issues in Interpreting the Correlation Coefficient • Direction of causality • path of causal effect (e.g., X causes Y) • You cannot determine the direction of causality just because two variables are correlated.

  22. Three Possible Directions of Causality • Variable X causes variable Y. • e.g., less sleep causes more stress • Variable Y causes variable X. • e.g., more stress causes people to sleep less • There is a third variable that causes both variable X and variable Y. • e.g., working longer hours causes both stress and fewer hours of sleep

  23. Ruling Out Some Possible Directions of Causality • Longitudinal Study • a study where people are measured at two or more points in time • e.g., evaluating number of hours of sleep at one time point and then evaluating their levels of stress at a later time point • True Experiment • a study in which participants are randomly assigned to a particular level of a variable and then measured on another variable • e.g., exposing individuals to varying amounts of sleep in a laboratory environment and then evaluating their stress levels

  24. The Statistical Significance of r • A correlation is statistically significant if it is unlikely that you could have gotten a correlation as big as you did if in fact there was no relationship between variables. • If the probability (p) is less than some small degree of probability (e.g., 5% or 1%), the correlation is considered statistically significant.
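One way to make this probability concrete is a permutation test, sketched below in Python. This is an illustrative assumption, not the procedure the slides name: it shuffles one variable many times to break any real relationship and counts how often the shuffled correlation is at least as large as the observed one. The data, function names, number of shuffles, and seed are all hypothetical.

```python
import random
from statistics import mean, pstdev


def pearson_r(xs, ys):
    """Average cross-product of Z scores (population SD)."""
    mx, sx = mean(xs), pstdev(xs)
    my, sy = mean(ys), pstdev(ys)
    return sum(((a - mx) / sx) * ((b - my) / sy) for a, b in zip(xs, ys)) / len(xs)


def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Estimate how often a correlation this large arises with no real relationship."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    at_least_as_big = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # pairing of X and Y is now random
        if abs(pearson_r(xs, shuffled)) >= observed:
            at_least_as_big += 1
    return at_least_as_big / n_shuffles


# Hypothetical data with a strong positive linear pattern.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [1.2, 1.9, 3.3, 3.8, 5.1, 6.2, 6.8, 8.1, 9.0, 9.9]
p = permutation_p_value(x, y)
print(p < 0.05)
```

If the estimated p falls below the chosen cutoff (e.g., .05), the correlation would be called statistically significant in the sense described on this slide.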

  25. Malawi Med J. 2012 Sep; 24(3): 69–71.

  26. Key Points • Two variables are correlated when they are associated in a clear pattern. • A scatter diagram displays the relationship between two variables. • A linear correlation is seen when the dots in a scatter diagram generally follow a straight line. In a curvilinear correlation, the dots follow a pattern that does not approximate a straight line. When there is no correlation, the dots do not follow a pattern. • In a positive correlation, the highs go with the highs, the lows with the lows, and the mediums go with the mediums. With a negative correlation, the lows go with the highs. • r is the correlation coefficient and gives you the direction and strength of a correlation. r = (∑ZxZy)/N • The maximum positive value of r = 1 and the maximum negative value of r = -1. • The closer the correlation is to -1 or 1, the stronger the correlation. • Correlation does not tell you the direction of causation. • Prediction model using Z scores: predicted Zy = (β)(Zx), where β is the standardized regression coefficient (equal to r when predicting from a single variable). • Prediction model with raw scores: predicted Y = (SDy)(predicted Zy) + My. • r² = proportion of variance accounted for and is used to compare linear correlations. • Correlation coefficients are reported both in the text and in tables of research articles.
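The two prediction models in the Key Points can be sketched in Python as below. The sketch assumes prediction from a single variable, where the standardized regression coefficient equals r, and uses population standard deviations throughout. The function names and the data are hypothetical.

```python
from statistics import mean, pstdev


def pearson_r(xs, ys):
    """Average cross-product of Z scores (population SD)."""
    mx, sx = mean(xs), pstdev(xs)
    my, sy = mean(ys), pstdev(ys)
    return sum(((a - mx) / sx) * ((b - my) / sy) for a, b in zip(xs, ys)) / len(xs)


def predict_y(x_new, xs, ys):
    """Predict a raw Y score from a raw X score via Z scores."""
    r = pearson_r(xs, ys)
    zx = (x_new - mean(xs)) / pstdev(xs)     # raw X score to Zx
    predicted_zy = r * zx                    # predicted Zy = (r)(Zx), single predictor
    return pstdev(ys) * predicted_zy + mean(ys)  # (SDy)(predicted Zy) + My


# Hypothetical scores for five people on variables X and Y.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
r_squared = r ** 2               # proportion of variance accounted for
prediction = predict_y(5, x, y)
print(round(r_squared, 2), round(prediction, 2))
```

For these data r² is 0.6, and the predicted Y for X = 5 is 5.2. A person scoring exactly at the mean of X (Zx = 0) is predicted to score at the mean of Y.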
