
Exploring Correlations: Testing Relationships Between Variables

This lecture covers correlations: testing relationships between two metric variables using Pearson's correlation coefficient and statistical significance. Learn about interpreting correlations, null and alternative hypotheses, factors that limit the correlation coefficient, handling outliers, and more.


Presentation Transcript


  1. Lecture 18: Correlations: testing relationships between two metric variables

  2. Agenda • Reminder about Lab 3 • Brief Update on Data for Final • Correlations

  3. Probability Revisited • To make a reasonable decision, we must know: • Probability Distribution • What would the distribution look like if it were due to chance alone? • Decision Rule • What criterion do we use to decide whether an observation is just due to chance or not?

  4. Quick Recap of an Earlier Issue: Why N-1? • If we have a randomly distributed variable in a population, extreme cases (i.e., in the tails) are less likely to be sampled than common cases (i.e., within 1 SD of the mean). • One result of this: sample variance tends to be lower than the actual population variance. Dividing by n-1 corrects this bias when calculating sample statistics.
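The bias the slide describes is easy to see by simulation. A sketch with made-up parameters (a population with true variance 4, many small samples): dividing by n systematically underestimates the population variance, while dividing by n-1 does not.

```python
import random

random.seed(42)

# Hypothetical population: Normal(0, sd = 2), so the true variance is 4.
pop_var = 4.0
n = 5            # small samples exaggerate the bias
trials = 20000

biased_sum = unbiased_sum = 0.0
for _ in range(trials):
    sample = [random.gauss(0, 2) for _ in range(n)]
    m = sum(sample) / n
    ss = sum((x - m) ** 2 for x in sample)
    biased_sum += ss / n          # divide by n: biased low
    unbiased_sum += ss / (n - 1)  # divide by n-1: Bessel's correction

biased_avg = biased_sum / trials      # averages near (n-1)/n * 4 = 3.2
unbiased_avg = unbiased_sum / trials  # averages near 4.0
print(round(biased_avg, 2), round(unbiased_avg, 2))
```

On average the n-divisor estimate comes out around (n-1)/n of the true variance, which is exactly the shortfall the n-1 divisor undoes.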

  5. Checking for simple linear relationships • Pearson's correlation coefficient • Measures the extent to which two metric (interval-type) variables are linearly related • The statistic is Pearson's r, also called the linear or product-moment correlation • Equivalently, the correlation coefficient is the average of the cross products of the corresponding z-scores.

  6. Correlations • Ranges from -1 to +1, where +1 = a perfect positive linear relationship, -1 = a perfect negative linear relationship, and 0 = no linear relationship • Negative relations (r < 0) vs. positive relations (r > 0) • Remember: correlation ONLY measures linear relationships, not all relationships!

  7. Interpretation • Recall that correlation is a precondition for causality, but by itself it is not sufficient to show causality (why?) • Correlation is a proportional (unit-free) measure; it does not depend on the specific scales of measurement • Interpreting a correlation: • Direction (+/-) • Magnitude of effect (-1 to 1), reported as r • Statistical significance (p<.05, p<.01, p<.001)

  8. Correlation: Null and Alt Hypotheses • Null versus alternative hypothesis • H0 • H1, H2, etc. • Test statistics and significance level • Test statistic • Calculated from the data • Has a known probability distribution • Significance level • Usually reported as a p-value (the probability that a result at least this extreme would occur if the null hypothesis were true). • Example output (correlation of price and mpg, with the p-value printed beneath the coefficient):

             price      mpg
    price   1.0000
    mpg    -0.4686   1.0000
            0.0000
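One intuitive way to see what that p-value means is a permutation test: shuffle one variable so the null hypothesis (no relationship) is true by construction, and ask how often chance alone produces a correlation as strong as the one observed. A sketch with made-up data and a hand-rolled `pearson_r` helper (neither is from the lecture):

```python
import random
from statistics import mean, pstdev

def pearson_r(x, y):
    # r = average of the cross products of the z-scores
    zx = [(v - mean(x)) / pstdev(x) for v in x]
    zy = [(v - mean(y)) / pstdev(y) for v in y]
    return mean(a * b for a, b in zip(zx, zy))

random.seed(1)
x = list(range(20))
y = [v + random.gauss(0, 2) for v in x]   # strong linear relation plus noise

r_obs = pearson_r(x, y)

# Null distribution: shuffling y destroys any real pairing, so every
# permuted r is a correlation produced by chance alone.
perms, extreme = 2000, 0
for _ in range(perms):
    y_shuffled = random.sample(y, len(y))
    if abs(pearson_r(x, y_shuffled)) >= abs(r_obs):
        extreme += 1

p_value = extreme / perms
print(round(r_obs, 3), round(p_value, 3))
```

A tiny p-value says a correlation this large almost never arises from a random pairing, which is the evidence used to reject H0.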

  9. Factors that limit the correlation coefficient • Homogeneity of the sample group • Non-linear relationships • Censored or limited scales • Unreliable measurement instruments • Outliers

  10. Homogenous Groups

  11. Homogenous Groups: Adding Groups

  12. Homogenous Groups: Adding More Groups

  13. Separate Groups (non-homogeneous)
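The point of slides 10-13 can be reproduced numerically: two groups with zero correlation inside each group can show a strong correlation when pooled, purely because their group means differ. A sketch with constructed (not real) data:

```python
from statistics import mean, pstdev

def pearson_r(x, y):
    # r = average of the cross products of the z-scores
    zx = [(v - mean(x)) / pstdev(x) for v in x]
    zy = [(v - mean(y)) / pstdev(y) for v in y]
    return mean(a * b for a, b in zip(zx, zy))

# Two groups with ZERO correlation inside each group...
group_a = ([0, 1, 2], [0, 2, 0])
group_b = ([10, 11, 12], [10, 12, 10])   # same shape, shifted up and right
r_within = pearson_r(*group_a)           # exactly 0

# ...but pooling them manufactures a strong "correlation"
x = group_a[0] + group_b[0]
y = group_a[1] + group_b[1]
r_pooled = pearson_r(x, y)
print(round(r_within, 4), round(r_pooled, 4))
```

This is why checking whether the sample mixes non-homogeneous groups matters before interpreting r.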

  14. Non-Linear Relationships
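A quick numeric illustration of this slide's point, with constructed data: y is a perfect (deterministic!) function of x, yet r comes out at zero because the relationship is a parabola, not a line.

```python
from statistics import mean, pstdev

def pearson_r(x, y):
    # r = average of the cross products of the z-scores
    zx = [(v - mean(x)) / pstdev(x) for v in x]
    zy = [(v - mean(y)) / pstdev(y) for v in y]
    return mean(a * b for a, b in zip(zx, zy))

x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]   # y = x^2: a perfect but non-linear relationship

r = pearson_r(x, y)
print(abs(round(r, 6)))   # 0.0 -- r completely misses the relationship
```

The symmetric positive and negative halves of the parabola cancel in the cross products, which is why plotting the data first is always worthwhile.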

  15. Censored or Limited Scales…

  16. Censored or Limited Scales
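Censoring can also be simulated: take a perfectly linear relationship and impose a ceiling on the y scale (as when a test is too easy and many respondents hit the top score). The values below are made up for illustration.

```python
from statistics import mean, pstdev

def pearson_r(x, y):
    # r = average of the cross products of the z-scores
    zx = [(v - mean(x)) / pstdev(x) for v in x]
    zy = [(v - mean(y)) / pstdev(y) for v in y]
    return mean(a * b for a, b in zip(zx, zy))

x = list(range(11))        # 0..10
y_full = list(x)           # perfectly linear: r = 1
ceiling = 5
y_censored = [min(v, ceiling) for v in y_full]   # the scale tops out at 5

r_full = pearson_r(x, y_full)
r_censored = pearson_r(x, y_censored)
print(round(r_full, 3), round(r_censored, 3))
```

The flat run of ceiling values carries no information about x, so the censored r is pulled well below the true value of 1.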

  17. Unreliable Instrument

  18. Unreliable Instrument

  19. Unreliable Instrument

  20. Outliers

  21. Outliers (scatterplot with the outlier labeled)
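A single extreme point can dominate the correlation. In this constructed example (not the lecture's data), three points with zero correlation become r > 0.9 once one outlier is appended:

```python
from statistics import mean, pstdev

def pearson_r(x, y):
    # r = average of the cross products of the z-scores
    zx = [(v - mean(x)) / pstdev(x) for v in x]
    zy = [(v - mean(y)) / pstdev(y) for v in y]
    return mean(a * b for a, b in zip(zx, zy))

# A small cloud of points with no linear relationship at all...
x = [0, 1, 2]
y = [0, 2, 0]
r_clean = pearson_r(x, y)                  # exactly 0

# ...plus one extreme outlier far from the cloud
r_outlier = pearson_r(x + [20], y + [20])
print(round(r_clean, 4), round(r_outlier, 4))
```

Because r is built from squared deviations, the outlier's huge cross product swamps the other points, which is why outliers should be inspected (and their influence checked) before reporting a correlation.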

  22. Examples with Real Data…
