
Statistical Analysis Regression & Correlation


Presentation Transcript


  1. Statistical Analysis: Regression & Correlation. Psyc 250, Winter 2013

  2. Review: Types of Variables & Steps in Analysis

  3. Variables & Statistical Tests

  4. Evaluating an hypothesis
     • Step 1: What is the relationship in the sample?
     • Step 2: How confidently can one generalize from the sample to the universe from which it comes? p < .05

  5. Evaluating an hypothesis

  6. Evaluating an hypothesis

  7. Relationships between Scale Variables: Regression, Correlation

  8. Regression
     • The amount by which a dependent variable increases (or decreases) for each unit increase in an independent variable.
     • Expressed as the equation for a line, y = m(x) + b: the “regression line”.
     • Interpret by the slope of the line, m. (Or interpret by the “odds ratio” in “logistic regression”.)
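As a minimal sketch of how the slope m and the constant b of a regression line can be obtained, the following fits y = m(x) + b by least squares to a small made-up data set. The data values and the function name `fit_line` are illustrative assumptions and are not part of the course materials.

```python
def fit_line(x, y):
    """Return (m, b) of the least-squares line y = m*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    m = sxy / sxx
    # The least-squares line passes through the point (mean_x, mean_y).
    b = mean_y - m * mean_x
    return m, b


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

m, b = fit_line(x, y)
print(f"y = {m:.1f}(x) + {b:.1f}")  # prints: y = 0.6(x) + 2.2
```

Here the slope of 0.6 means that y rises by 0.6 on average for each one-unit increase in x.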

  9. Correlation
     • Strength of association of scale measures
     • r ranges from -1 through 0 to +1
         +1: perfect positive correlation
         -1: perfect negative correlation
          0: no correlation
     • Interpret r in terms of variance
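The three reference values of r can be illustrated with a short sketch. The helper `pearson_r` and the data below are assumptions made up for this example; they compute the usual Pearson correlation coefficient.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, for illustration only.
x = [-2, -1, 0, 1, 2]

r_pos = pearson_r(x, [2 * xi + 1 for xi in x])   # points on a rising line
r_neg = pearson_r(x, [-3 * xi for xi in x])      # points on a falling line
r_zero = pearson_r(x, [xi ** 2 for xi in x])     # symmetric U shape

print(r_pos, r_neg, r_zero)
```

The last case is a reminder that r measures linear association: y is completely determined by x there, yet r is 0 because the relationship is not a line.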

  10. Mean & Variance

  11. Example: Weight & Height
      Survey of Class, n = 42
      Variables: Height, Mother’s height, Mother’s education, SAT, Estimate IQ, Well-being (7 pt. Likert), Weight, Father’s education, Family income, G.P.A., Health (7 pt. Likert)

  12. Frequency Table for: HEIGHT

      Value    Frequency    Percent    Valid Percent    Cum Percent
      59.00        1           2.4          2.4              2.4
      61.00        2           4.8          4.8              7.1
      62.00        3           7.1          7.1             14.3
      63.00        3           7.1          7.1             21.4
      65.00        5          11.9         11.9             33.3
      66.00        3           7.1          7.1             40.5
      67.00        4           9.5          9.5             50.0
      68.00        5          11.9         11.9             61.9
      69.00        1           2.4          2.4             64.3
      70.00        6          14.3         14.3             78.6
      71.00        1           2.4          2.4             81.0
      72.00        4           9.5          9.5             90.5
      73.00        3           7.1          7.1             97.6
      74.00        1           2.4          2.4            100.0
      Total       42         100.0        100.0

      Valid cases: 42    Missing cases: 0

  13. Frequency Table for: HEIGHT (repeated from slide 12)

      Descriptive Statistics for: HEIGHT

      Variable    Mean     Std Dev    Variance    Range    Minimum    Maximum    Valid N
      HEIGHT      67.33     3.87       14.96      15.00     59.00      74.00       42

      Callout: mean

  14. Variance
      Variance: $s^2 = \dfrac{\sum_i (x_i - \text{Mean})^2}{N - 1}$
      Standard Deviation: $s = \sqrt{\text{variance}} = \sqrt{s^2}$
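As a check on the formula, the following sketch recomputes the mean, variance, and standard deviation of HEIGHT from the frequency table on slide 12. The variable names are illustrative; the values and frequencies are copied from that table.

```python
import math

# HEIGHT frequency table from slide 12: value -> frequency.
height_freq = {59: 1, 61: 2, 62: 3, 63: 3, 65: 5, 66: 3, 67: 4,
               68: 5, 69: 1, 70: 6, 71: 1, 72: 4, 73: 3, 74: 1}

# Expand the table to one observation per respondent.
heights = [value for value, count in height_freq.items() for _ in range(count)]

n = len(heights)
mean = sum(heights) / n
# Sum of squared deviations from the mean, divided by N - 1.
variance = sum((x - mean) ** 2 for x in heights) / (n - 1)
std_dev = math.sqrt(variance)

print(n, round(mean, 2), round(variance, 2), round(std_dev, 2))
# prints: 42 67.33 14.96 3.87
```

These match the descriptive statistics reported for HEIGHT on slide 13.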

  15. Frequency Table for: WEIGHT

      Value     Frequency    Percent    Valid Percent    Cum Percent
      115.00        1           2.4          2.4              2.4
      120.00        1           2.4          2.4              4.8
      124.00        1           2.4          2.4              7.1
      125.00        4           9.5          9.5             16.7
      128.00        1           2.4          2.4             19.0
      130.00        6          14.3         14.3             33.3
      135.00        4           9.5          9.5             42.9
      136.00        1           2.4          2.4             45.2
      140.00        3           7.1          7.1             52.4
      145.00        2           4.8          4.8             57.1
      150.00        3           7.1          7.1             64.3
      155.00        2           4.8          4.8             69.0
      160.00        6          14.3         14.3             83.3
      165.00        2           4.8          4.8             88.1
      170.00        1           2.4          2.4             90.5
      185.00        1           2.4          2.4             92.9
      190.00        2           4.8          4.8             97.6
      210.00        1           2.4          2.4            100.0
      Total        42         100.0        100.0

      Valid cases: 42    Missing cases: 0

      Descriptive Statistics for: WEIGHT

      Variable    Mean      Std Dev    Variance    Range    Minimum    Maximum    Valid N
      WEIGHT      146.38     21.30      453.80     95.00    115.00     210.00       42

      Callout: mean

  16. Relationship of weight & height: Regression Analysis

  17. “Least Squares” Regression Line
      Dependent = ( B ) ( Independent ) + constant
      weight = ( B ) ( height ) + constant
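The name “least squares” refers to the criterion used to choose B and the constant: the line that makes the sum of squared vertical distances from the points as small as possible. A minimal sketch of that idea, using made-up data and a hypothetical helper `sum_squared_residuals` (neither comes from the course materials):

```python
def sum_squared_residuals(x, y, slope, constant):
    """Sum of squared vertical distances from the points to the line."""
    return sum((yi - (slope * xi + constant)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# The least-squares solution for these five points is slope 0.6, constant 2.2.
best = sum_squared_residuals(x, y, 0.6, 2.2)

# Lines with a different slope or constant leave larger squared distances.
steeper = sum_squared_residuals(x, y, 0.8, 2.2)
flatter = sum_squared_residuals(x, y, 0.4, 2.2)
shifted = sum_squared_residuals(x, y, 0.6, 2.7)

print(best < steeper, best < flatter, best < shifted)
```

Any other choice of slope and constant gives a larger total for these points, which is what makes the fitted line the “least squares” line.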

  18. Regression line

  19. Regression: WEIGHT on HEIGHT

      Multiple R             .59254
      R Square               .35110
      Adjusted R Square      .33488
      Standard Error       17.37332

      Analysis of Variance
                     DF    Sum of Squares    Mean Square
      Regression      1      6532.61322      6532.61322
      Residual       40     12073.29154       301.83229

      F = 21.64319    Signif F = .0000

      Variables in the Equation
      Variable              B           SE B        Beta         T      Sig T
      HEIGHT            3.263587      .701511     .592541      4.652    .0000
      (Constant)      -73.367236    47.311093                 -1.551

      [ Equation: Weight = 3.3 ( height ) - 73 ]
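Several figures in this output can be derived from one another, which is a useful way to read the table. The sketch below recomputes R Square, F, and the T value for HEIGHT from the sums of squares and coefficients printed above; the variable names are illustrative.

```python
# Values copied from the regression output on slide 19.
ss_regression = 6532.61322
ss_residual = 12073.29154
df_regression = 1
df_residual = 40
b_height = 3.263587
se_b_height = 0.701511

# R Square: share of the total sum of squares accounted for by the regression.
ss_total = ss_regression + ss_residual
r_square = ss_regression / ss_total

# F: regression mean square divided by residual mean square.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# T for HEIGHT: the coefficient divided by its standard error.
t_height = b_height / se_b_height

print(round(r_square, 4), round(f_stat, 2), round(t_height, 3))
# prints: 0.3511 21.64 4.652
```

With a single independent variable, F equals the square of T for that variable, which holds here up to rounding.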

  20. Regression line W = 3.3 H - 73
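To illustrate how the rounded equation is used, the sketch below computes predicted weights for a few heights within the observed range (59 to 74). The function name `predicted_weight` and the chosen heights are assumptions for this example.

```python
def predicted_weight(height):
    """Predicted weight from the rounded equation W = 3.3 H - 73."""
    return 3.3 * height - 73


# A few heights inside the range observed in the class survey.
for h in (62, 67, 72):
    print(h, round(predicted_weight(h), 1))

# The slope is the change in predicted weight per one-unit change in height.
difference = predicted_weight(68) - predicted_weight(67)
print(round(difference, 1))
```

Predictions for heights far outside the observed range should be treated with caution, since the line was fitted only to the surveyed values.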

  21. Strength of Relationship (“Goodness of Fit”): Correlation
      How well does the regression line “fit” the data?

  22. Correlation
      • Strength of association of scale measures
      • r ranges from -1 through 0 to +1
          +1: perfect positive correlation
          -1: perfect negative correlation
           0: no correlation
      • Interpret r in terms of variance

  23. Frequency Table and Descriptive Statistics for: WEIGHT (repeated from slide 15)
      Mean 146.38, Std Dev 21.30, Variance 453.80, Valid N 42
      Callout: mean

  24. mean Variance = 454

  25. Regression line mean

  26. Correlation: “Goodness of Fit”
      • Variance (average squared distance from the mean) = 454
      • “Least squares” (average squared distance from the regression line) = 295

  27. [Figure labels: regression line, l.s. = 295; mean, s² = 454]

  28. Correlation: “Goodness of Fit”
      How much is variance reduced by calculating from the regression line?
      454 - 295 = 159
      159 / 454 = .35
      Variance is reduced 35% by calculating “least squares” from the regression line: r² = .35
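The arithmetic on this slide can be written out as a short sketch. The variable names are illustrative; the two input values are the rounded figures from the slides.

```python
variance_about_mean = 454   # average squared distance from the mean
variance_about_line = 295   # average squared distance from the regression line

# How much of the original variance disappears when distances are
# measured from the regression line instead of from the mean.
reduction = variance_about_mean - variance_about_line
r_squared = reduction / variance_about_mean

# r is the square root of r squared; the positive root is taken here
# because the slope of weight on height is positive.
r = r_squared ** 0.5

print(reduction, round(r_squared, 2), round(r, 2))
# prints: 159 0.35 0.59
```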

  29. Correlation coefficient = r
      r² = % of variance in WEIGHT “explained” by HEIGHT

  30. Correlation: HEIGHT with WEIGHT

                  HEIGHT       WEIGHT
      HEIGHT      1.0000        .5925
                  (  42)       (  42)
                  P= .         P= .000

      WEIGHT       .5925       1.0000
                  (  42)       (  42)
                  P= .000      P= .

  31. r = .59; r² = .35
      HEIGHT “explains” 35% of variance in WEIGHT

  32. Sentence & G.P.A.
      • Regression: form of relationship
      • Correlation: strength of relationship
      • p value: statistical significance

  33. Legal Attitudes Study:
      • Relationship of sentence length to G.P.A.?
      • Relationship of sentence length to Liberal-Conservative views?

  34. G. P. A.

  35. Length of Sentence (simulated data)

  36. Scatterplot: Sentence on G.P.A.

  37. Regression Coefficients Sentence = -3.5 G.P.A. + 18

  38. “Least Squares” Regression Line Sent = -3.5 GPA + 18

  39. Correlation: Sentence & G.P.A.

  40. Statistical Significance Regression: Correlation p = .31

  41. Interpreting Correlations
      • r = -.22
      • r² = .05
      p = .31
      G.P.A. “explains” 5% of the variance in length of sentence

  42. Write Results
      “A regression analysis found that each one-unit increase in GPA was associated with a 3.5-month decrease in sentence length, but the correlation was low (r = -.22) and not statistically significant (p = .31).”

  43. Multiple Regression
      • Problem: relationship of weight and calorie consumption
      • Both weight and calorie consumption are related to height
      • Need to “control for” height, or assess the relative effects of height and calorie consumption

  44. Multiple Regression Regression line mean

  45. Multiple Regression Regression line Residuals mean

  46. Multiple Regression
      • Regress weight residuals (dependent variable) on caloric intake (independent variable)
      • Statistically “controls” for height: removes the effect or “confound” of height.
      • How much variance in weight does caloric intake account for over and above height?
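The two-step idea above can be sketched with simulated data. Everything in this example is an assumption made for illustration: the simulated numbers, the helper names (`cross`, `slope`, `residuals`), and the comparison against a closed-form two-predictor coefficient. No caloric intake data appear in the class survey.

```python
import random

random.seed(0)  # simulated data, not the class survey
n = 200
height = [random.gauss(67, 4) for _ in range(n)]
# Caloric intake is made to depend partly on height (the "confound").
calories = [2000 + 40 * (h - 67) + random.gauss(0, 200) for h in height]
weight = [3.3 * h - 73 + 0.01 * c + random.gauss(0, 10)
          for h, c in zip(height, calories)]


def mean(values):
    return sum(values) / len(values)


def cross(a, b):
    """Sum of cross-products of deviations from the means."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))


def slope(x, y):
    """Least-squares slope of y on x."""
    return cross(x, y) / cross(x, x)


def residuals(x, y):
    """What is left of y after a simple regression of y on x."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Step 1: remove the part of weight that height predicts.
weight_resid = residuals(height, weight)

# Step 2 (as described on the slide): regress weight residuals on calories.
slope_on_raw_calories = slope(calories, weight_resid)

# Variant: also remove height from calories before step 2.
calories_resid = residuals(height, calories)
slope_on_resid_calories = slope(calories_resid, weight_resid)

# Coefficient for calories in a regression of weight on height and
# calories together, from the closed form for two predictors.
s_hh, s_cc = cross(height, height), cross(calories, calories)
s_hc = cross(height, calories)
s_hw, s_cw = cross(height, weight), cross(calories, weight)
b_calories = (s_hh * s_cw - s_hc * s_hw) / (s_hh * s_cc - s_hc ** 2)

print(slope_on_raw_calories, slope_on_resid_calories, b_calories)
```

In this sketch the variant that removes height from both weight and caloric intake reproduces the multiple-regression coefficient for calories. The simpler version, which uses raw caloric intake in step 2, gives a slope that is smaller in magnitude whenever height and caloric intake are correlated.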

  47. Multiple Regression
      • How much variance in the dependent measure (weight, length of sentence) do all independent variables combined account for? → multiple R²
      • What is the best “model” for predicting the dependent variable?
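For reference, multiple R² can be written in terms of the sums of squares that appear in a regression output. The notation below is an assumption of this note rather than the slides' own: n is the number of cases and k is the number of independent variables. The worked line uses the figures from the WEIGHT on HEIGHT output on slide 19 (n = 42, k = 1).

```latex
R^2 = \frac{SS_{\text{regression}}}{SS_{\text{regression}} + SS_{\text{residual}}}
    = 1 - \frac{SS_{\text{residual}}}{SS_{\text{total}}}

R^2_{\text{adjusted}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - k - 1}

% Slide 19: R^2 = .35110, n = 42, k = 1
R^2_{\text{adjusted}} = 1 - (1 - .35110)\,\frac{41}{40} = .33488
```

Adjusted R² penalizes each added independent variable, which is one reason it is often consulted when comparing candidate “models” with different numbers of predictors.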
