1 / 55

Overview

Overview. Correlation -Definition -Deviation Score Formula, Z score formula -Hypothesis Test Regression Intercept and Slope Unstandardized Regression Line Standardized Regression Line Hypothesis Tests. Associations among Continuous Variables. 3 characteristics of a relationship.

zach
Download Presentation

Overview

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Overview • Correlation • -Definition • -Deviation Score Formula, Z score formula • -Hypothesis Test • Regression • Intercept and Slope • Unstandardized Regression Line • Standardized Regression Line • Hypothesis Tests

  2. Associations among Continuous Variables

  3. 3 characteristics of a relationship • Direction • Positive(+) • Negative (-) • Degree of association • Between –1 and 1 • Absolute values signify strength • Form • Linear • Non-linear

  4. Direction Positive Negative • Large values of X = large values of Y, • Small values of X = small values of Y. • - e.g. IQ and SAT • Large values of X = small values of Y • Small values of X = large values of Y • -e.g. SPEED and ACCURACY

  5. Degree of association Strong(tight cloud) Weak(diffuse cloud)

  6. Form Linear Non- linear

  7. Regression & Correlation

  8. What is the best fitting straight line? Regression Equation: Y = a + bX How closely are the points clustered around the line? Pearson’s R

  9. Correlation

  10. Dataset Obs X Y A 1 1 B 1 3 C 3 2 D 4 5 E 6 4 F 7 5 Correlation - Definition Correlation: a statistical technique that measures and describes the degree of linear relationship between two variables Scatterplot Y X

  11. Pearson’s r A value ranging from -1.00 to 1.00 indicating the strength and direction of the linear relationship. Absolute value indicates strength +/- indicates direction

  12. Cross-Product = The Logic of Correlation MEAN of X MEAN of Y For a strong positive association, the cross-products will mostly be positive

  13. The Logic of Correlation MEAN of X MEAN of Y For a strong negative association, the cross-products will mostly be negative Cross-Product =

  14. The Logic of Correlation MEAN of X MEAN of Y For a weak association, the cross-products will be mixed Cross-Product =

  15. Deviation score formula SP (sum of products) = ∑ (X – X)(Y – Y) Pearson’s r

  16. Deviation Score Formula

  17. Deviation Score Formula = .99

  18. Deviation score formula SP (sum of products) = ∑ (X – X)(Y – Y) Pearson’s r For a strong positive association, the SP will be a big positive number

  19. ∑ (X – X)(Y – Y) Pearson’s r Deviation score formula For a strong negative association, the SP will be a big negative number SP (sum of products) =

  20. ∑ (X – X)(Y – Y) Pearson’s r Deviation score formula For a weak association, the SP will be a small number (+ and – will cancel each other out) SP (sum of products) =

  21. Z score formula Standardized cross-products Pearson’s r

  22. Z-score formula

  23. Z-score formula

  24. Z-score formula r = .99

  25. Formulas for R Z score formula Deviations formula

  26. Interpretation of R • A measure of strength of association: how closely do the points cluster around a line? • A measure of the direction of association: is it positive or negative?

  27. Interpretation of R • r = .10 very small association, not usually reliable • r = .20 small association • r = .30 typical size for personality and social studies • r = .40 moderate association • r = .60 you are a research rock star • r = .80 hmm, are you for real?

  28. Interpretation of R-squared • The amount of covariation compared to the amount of total variation • “The percent of total variance that is shared variance” • E.g. “If r = .80, then X explains 64% of the variability in Y” (and vice versa)

  29. Hypothesis testing with r Hypotheses H0:  = 0 HA :  ≠ 0 Test statistic = r Or just use table E.2 to find critical values of r

  30. Practice

  31. Practice

  32. Properties of R • A standardized statistic – will not change if you change the units of X or Y. (bc based on z-scores) • The same whether X is correlated with Y or vice versa • Fairly unstable with small n • Vulnerable to outliers • Has a skewed distribution

  33. Linear Regression

  34. Linear Regression • But how do we describe the line? • If two variables are linearly related it is possible to develop a simple equation to predict one variable from the other • The outcome variable is designated the Y variable, and the predictor variable is designated the X variable • E.g. centigrade to Fahrenheit: F = 32 + 1.8C this formula gives a specific straight line

  35. The Linear Equation • F = 32 + 1.8(C) • General form is Y = a + bX • The prediction equation: Y’ = a+ bX • Where • a = intercept • b = slope • X = the predictor • Y = the criterion a and b are constants in a given line; X and Y change

  36. The Linear Equation • F = 32 + 1.8(C) • General form is Y = a + bX • The prediction equation: Y’ = a + bX • Where • a = intercept • b = slope • X = the predictor • Y = the criterion Different b’s…

  37. The Linear Equation • F = 32 + 1.8(C) • General form is Y = a + bX • The prediction equation: Y’ = a + bX • Where • a = intercept • b = slope • X = the predictor • Y = the criterion Different a’s…

  38. The Linear Equation • F = 32 + 1.8(C) • General form is Y = a + bX • The prediction equation: Y’ = a + bX • Where • a = intercept • b = slope • X = the predictor • Y = the criterion Different a’s and b’s …

  39. Slope and Intercept • Equation of the line • The slope b: the amount of change in y with one unit change in x • The intercept a: the value of y when x is zero

  40. Slope and Intercept • Equation of the line • The slope • The intercept The slope is influenced by r, but is not the same as r

  41. When there is no linear association (r = 0), the regression line is horizontal. b=0. and our best estimate of age is 29.5 at all heights.

  42. When the correlation is perfect (r = ± 1.00), all the points fall along a straight line with a slope

  43. When there is some linear association (0<|r|<1), the regression line fits as close to the points as possible and has a slope

  44. Where did this line come from? • It is a straight line which is drawn through a scatterplot, to summarize the relationship between X and Y • It is the line that minimizes the squared deviations (Y’ – Y)2 • We call these vertical deviations “residuals”

  45. Regression lines Minimizing the squared vertical distances, or “residuals”

  46. Unstandardized Regression Line • Equation of the line • The slope • The intercept

  47. Properties of b (slope) • An unstandardized statistic – will change if you change the units of X or Y. • Depends on whether Y is regressed on X or vice versa

  48. Standardized Regression Line • Equation of the line • The slope • The intercept A person 1 stdev above the mean on height would be how many stdevs above the mean on weight?

  49. Properties of β (standardized slope) • A standardized statistic – will not change if you change the units of X or Y. • Is equal to r, in simple linear regression

More Related