
Correlation and Regression


devlin




Presentation Transcript


  1. Correlation and Regression Quantitative Methods in HPELS 440:210

  2. Agenda • Introduction • The Pearson Correlation • Hypothesis Tests with the Pearson Correlation • Regression • Instat • Nonparametric versions

  3. Introduction • Correlation: Statistical technique used to measure and describe a relationship between two variables • Direction of relationship: • Positive • Negative • Form of relationship: • Linear • Quadratic . . . • Degree of relationship: • Ranges from -1.0 to +1.0, with 0.0 indicating no relationship

  4. Uses of Correlations • Prediction • Validity • Reliability

  5. Agenda • Introduction • The Pearson Correlation • Hypothesis Tests with the Pearson Correlation • Regression • Instat • Nonparametric versions

  6. The Pearson Correlation • Statistical Notation → Recall for ANOVA: • r = Pearson correlation • SP = sum of products of deviations • Mx = mean of X scores • SSx = sum of squares of X scores

  7. Pearson Correlation • Formula Considerations → Recall for ANOVA: • SP = Σ(X - Mx)(Y - My) • SP = ΣXY - (ΣX)(ΣY) / n • SSx = Σ(X - Mx)² • SSy = Σ(Y - My)² • r = SP / √(SSx · SSy)

  8. Pearson Correlation • Step 1: Calculate SP • Step 2: Calculate SS for X and Y values • Step 3: Calculate r

  9. Step 1 → SP • ΣX = 30, ΣY = 10, n = 5 • Computational formula: • ΣXY = (0*1) + (10*3) + (4*1) + (8*2) + (8*3) • ΣXY = 0 + 30 + 4 + 16 + 24 • ΣXY = 74 • SP = ΣXY - (ΣX)(ΣY) / n • SP = 74 - [30(10)] / 5 • SP = 74 - 60 • SP = 14 • Definitional formula: • SP = Σ(X - Mx)(Y - My) • SP = (-6*-1) + (4*1) + (-2*-1) + (2*0) + (2*1) • SP = 6 + 4 + 2 + 0 + 2 • SP = 14
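
The two routes to SP on this slide can be checked with a short Python sketch. The X and Y lists are read off the products shown above, (0*1), (10*3), and so on; the variable names are illustrative only.

```python
# Scores read off the products on the slide: (0*1), (10*3), (4*1), (8*2), (8*3)
x = [0, 10, 4, 8, 8]
y = [1, 3, 1, 2, 3]
n = len(x)

# Computational formula: SP = sum(XY) - sum(X) * sum(Y) / n
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sp_computational = sum_xy - sum(x) * sum(y) / n

# Definitional formula: SP = sum((X - Mx) * (Y - My))
mx = sum(x) / n
my = sum(y) / n
sp_definitional = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

print(sum_xy, sp_computational, sp_definitional)  # 74 14.0 14.0
```

Both formulas give SP = 14, so either can be used; the computational form avoids working out each deviation by hand.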

  10. Step 2 → SSx and SSy • SSx = Σ(X - Mx)² = 36 + 16 + 4 + 4 + 4 = 64 • SSy = Σ(Y - My)² = 1 + 1 + 1 + 0 + 1 = 4

  11. Step 3 → r • r = SP / √(SSx · SSy) • r = 14 / √((64)(4)) • r = 14 / √256 • r = 14 / 16 • r = 0.875
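
The three steps (SP, then SSx and SSy, then r) can be sketched as one small Python function. This is a minimal illustration using only the standard library; `pearson_r` is a name chosen here, not something from the course materials, and the scores are those from the worked example.

```python
import math

def pearson_r(x, y):
    """Pearson correlation from raw scores, following the three slide steps."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    # Step 1: sum of products of deviations
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    # Step 2: sums of squares for X and Y
    ss_x = sum((xi - mx) ** 2 for xi in x)
    ss_y = sum((yi - my) ** 2 for yi in y)
    # Step 3: r = SP / sqrt(SSx * SSy)
    return sp / math.sqrt(ss_x * ss_y)

r = pearson_r([0, 10, 4, 8, 8], [1, 3, 1, 2, 3])
print(r)  # 0.875
```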

  12. Interpretation of r • Correlation ≠ causality • Restricted range • If the data do not represent the full range of scores, be wary • Outliers can have a dramatic effect • Figure 16.9 • Correlation and variability • Coefficient of determination (r²)
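
The coefficient of determination is simply r squared: the proportion of variability in one variable that is accounted for by its relationship with the other. A quick sketch, using r = 0.875 from the worked example:

```python
r = 0.875           # Pearson r from the worked example
r_squared = r ** 2  # coefficient of determination

print(round(r_squared, 3))  # 0.766
```

So roughly 77% of the variability in Y is accounted for by X in that example, even though the sample has only five pairs of scores.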

  13. Agenda • Introduction • The Pearson Correlation • Hypothesis Tests with the Pearson Correlation • Regression • Instat • Nonparametric versions

  14. The Process • Step 1: State hypotheses • Non-directional: • H0: ρ = 0 (no population correlation) • H1: ρ ≠ 0 (population correlation exists) • Directional: • H0: ρ ≤ 0 (no positive population correlation) • H1: ρ > 0 (positive population correlation exists) • Step 2: Set criteria • α = 0.05 • Step 3: Collect data and calculate statistic • r • Step 4: Make decision • Accept or reject H0

  15. Example • Researchers are interested in determining if leg strength is related to jumping ability • Researchers measure leg strength with 1RM squat (lbs) and vertical jump height (inches) in 5 subjects (n = 5)

  16. Step 1: State Hypotheses • Non-directional • H0: ρ = 0 • H1: ρ ≠ 0 • Step 2: Set Criteria • Alpha (α) = 0.05 • Critical value: use the Critical Values for Pearson Correlation table, Appendix B.6 (p 697) • Information needed: • df = n - 2 = 5 - 2 = 3 • Alpha (α) = 0.05 • Directional or non-directional? • Critical value = 0.878

  17. Step 3: Collect Data and Calculate Statistic • Data: ΣXY = 27135, ΣX = 1065, ΣY = 126, n = 5 • Calculate SP • SP = ΣXY - (ΣX)(ΣY) / n • SP = 27135 - [1065(126)] / 5 • SP = 27135 - 26838 • SP = 297 • Calculate SSx

  18. Step 3: Collect Data and Calculate Statistic • Calculate SSy • Calculate r • r = SP / √(SSx · SSy) • r = 297 / √(11780(16.8)) • r = 297 / √197904 • r = 297 / 444.86 • r = 0.667 • Step 4: Make Decision • 0.667 < 0.878 • Accept or reject? The obtained r does not exceed the critical value, so H0 is not rejected
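
The decision in this example can be reproduced from the summary values on the slides (SP = 297, SSx = 11780, SSy = 16.8, n = 5). The sketch below also shows where a critical r of 0.878 can come from: it is the critical t converted to the r scale. The value `t_crit = 3.182` is an assumption taken from a standard two-tailed t table for df = 3 at α = 0.05, not from the slides.

```python
import math

# Summary values from the slides
sp, ss_x, ss_y = 297, 11780, 16.8
n = 5
df = n - 2

r = sp / math.sqrt(ss_x * ss_y)

# Assumption: two-tailed critical t for df = 3, alpha = 0.05 (standard t table)
t_crit = 3.182
# Convert the critical t to a critical r: r = t / sqrt(t^2 + df)
r_crit = t_crit / math.sqrt(t_crit ** 2 + df)

# Equivalent test statistic on the t scale
t_obs = r * math.sqrt(df / (1 - r ** 2))

reject_h0 = abs(r) > r_crit
print(round(r, 4), round(r_crit, 3), round(t_obs, 2), reject_h0)
```

Unrounded, r is about 0.6676, which the slides report as 0.667. Either way it falls short of 0.878, so the null hypothesis is not rejected with only five subjects.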

  19. Agenda • Introduction • The Pearson Correlation • Hypothesis Tests with the Pearson Correlation • Regression • Instat • Nonparametric versions

  20. Regression • Recall → Several uses of correlation: • Prediction • Validity • Reliability • Regression attempts to predict one variable based on information about the other variable • Line of best fit

  21. Regression • Line of best fit can be described with the following linear equation → Y = bX + a, where: • Y = predicted Y value • b = slope of line • X = any X value • a = intercept

  22. Y = bX + a, where: • Y = cost (?) • b = cost per hour ($5) • X = number of hours (?) • a = membership cost ($25) • For 10 hours: Y = 5X + 25 • Y = 5(10) + 25 • Y = 50 + 25 = 75 • For 30 hours: Y = 5X + 25 • Y = 5(30) + 25 • Y = 150 + 25 = 175
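
The membership example is a direct use of Y = bX + a, with b = 5 and a = 25. A minimal sketch (the function and parameter names are illustrative, not from the slides):

```python
def membership_cost(hours, rate_per_hour=5, membership_fee=25):
    """Linear equation Y = bX + a with b = rate per hour and a = membership fee."""
    return rate_per_hour * hours + membership_fee

print(membership_cost(10))  # 75
print(membership_cost(30))  # 175
```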

  23. Line of best fit minimizes distances of points from line

  24. Calculation of the Regression Line • Regression line = line of best fit = linear equation • SP = Σ(X - Mx)(Y - My) • SSx = Σ(X - Mx)² • b = SP / SSx • a = My - bMx
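
As a rough sketch of these formulas in Python, the function below computes the slope and intercept from raw scores. For illustration it reuses the five score pairs from the earlier correlation example (slide 9), where SP = 14 and SSx = 64; `regression_line` is a name chosen here.

```python
def regression_line(x, y):
    """Return (b, a) for Y = bX + a, using b = SP / SSx and a = My - b * Mx."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    ss_x = sum((xi - mx) ** 2 for xi in x)
    b = sp / ss_x
    a = my - b * mx
    return b, a

b, a = regression_line([0, 10, 4, 8, 8], [1, 3, 1, 2, 3])
print(b, a)  # 0.21875 0.6875
```

Here b = 14 / 64 and a = 2 - 0.21875(6), so the fitted line for those scores is Y = 0.21875X + 0.6875.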

  25. Example 16.14, p 557 • Mx = 5, My = 6 • SP = Σ(X - Mx)(Y - My) • SP = 16 • SSx = Σ(X - Mx)² • SSx = 10 • b = SP / SSx • b = 16 / 10 = 1.6 • a = My - bMx • a = 6 - 1.6(5) = -2 • Y = bX + a • Y = 1.6(X) - 2
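
The same arithmetic can be run from the summary values alone (Mx = 5, My = 6, SP = 16, SSx = 10). The sketch also checks a general property of the least-squares line: it passes through the point (Mx, My). The `predict` helper is hypothetical, added only for illustration.

```python
import math

# Summary values from Example 16.14
mx, my = 5, 6
sp, ss_x = 16, 10

b = sp / ss_x    # slope
a = my - b * mx  # intercept

def predict(x_value):
    """Predicted Y for a given X, using Y = bX + a."""
    return b * x_value + a

print(b, a)  # 1.6 -2.0
# The regression line passes through (Mx, My)
print(math.isclose(predict(mx), my))  # True
```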

  26. Agenda • Introduction • The Pearson Correlation • Hypothesis Tests with the Pearson Correlation • Regression • Instat • Nonparametric versions

  27. Instat - Correlation • Type data from sample into a column. • Label column appropriately. • Choose “Manage” • Choose “Column Properties” • Choose “Name” • Choose “Statistics” • Choose “Regression” • Choose “Correlation”

  28. Instat – Correlation • Choose the appropriate variables to be correlated • Click OK • Interpret the p-value

  29. Instat – Regression • Type data from sample into a column. • Label column appropriately. • Choose “Manage” • Choose “Column Properties” • Choose “Name” • Choose “Statistics” • Choose “Regression” • Choose “Simple”

  30. Instat – Regression • Choose appropriate variables for: • Response (Y) • Explanatory (X) • Check “significance test” • Check “ANOVA table” • Check “Plots” • Click OK • Interpret p-value

  31. Reporting Correlation Results • Information to include: • Value of the r statistic • Sample size • p-value • Examples: • A correlation of the data revealed that strength and jumping ability were not significantly related (r = 0.667, n = 5, p > 0.05) • Correlation matrices are used when interrelationships of several variables are tested (Table 1, p 541)

  32. Agenda • Introduction • The Pearson Correlation • Hypothesis Tests with the Pearson Correlation • Regression • Instat • Nonparametric versions

  33. Nonparametric Versions • Spearman rho → when at least one of the data sets is ordinal • Point biserial correlation → when one set of data is ratio/interval and the other is dichotomous • Male vs. female • Success vs. failure • Phi coefficient → when both data sets are dichotomous
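
Spearman rho can be obtained by converting each set of scores to ranks and then applying the Pearson formula to the ranks. The sketch below assumes tied scores receive the mean of the ranks they occupy; the data are hypothetical and the function names are illustrative.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with the value at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson r = SP / sqrt(SSx * SSy)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    ss_x = sum((xi - mx) ** 2 for xi in x)
    ss_y = sum((yi - my) ** 2 for yi in y)
    return sp / math.sqrt(ss_x * ss_y)

def spearman_rho(x, y):
    """Spearman rho: Pearson r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical ordinal data: finishing place and a judge's ranking
finish_place = [1, 2, 3, 4, 5]
judge_rank = [2, 1, 4, 3, 5]
print(spearman_rho(finish_place, judge_rank))  # 0.8
```

In the same spirit, applying `pearson_r` to a dichotomous variable coded 0/1 and an interval/ratio variable yields the point-biserial correlation, and applying it to two 0/1-coded variables yields the phi coefficient.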

  34. Violation of Assumptions • Nonparametric Version → Friedman Test (Not covered) • When to use the Friedman Test: • Related-samples design with three or more groups • Scale of measurement assumption violation: • Ordinal data • Normality assumption violation: • Regardless of scale of measurement

  35. Textbook Assignment • Problems: 5, 7, 10, 23 (with post hoc)
