
My last name starts with a letter somewhere between A. A – D B. E – L C. M – R D. S – Z


billy

Presentation Transcript


  1. Please click in. My last name starts with a letter somewhere between: A. A – D, B. E – L, C. M – R, D. S – Z. Homework due next class (April 19th): ANOVA Project. Homework due April 21st: Correlation and Regression Using Excel. Be sure that your Class ID is on your homework. Hand in your homework. Please double check that all cell phones and other electronic devices are turned off and stowed away. Turn your clicker on.

  2. MGMT 276: Statistical Inference in Management. Welcome. Please double check that all cell phones and other electronic devices are turned off and stowed away. http://www.thedailyshow.com/video/index.jhtml?videoId=188474&title=an-arab-family-man

  3. Use this as your study guide. By the end of lecture today (4/14/11): • Logic of hypothesis testing with correlations • Interpreting correlations and scatterplots • Conducting tests of significance for correlational data • Introduction to Regression • A note regarding empowerment

  4. Readings for next exam • Lind: Chapter 13: Linear Regression and Correlation; Chapter 14: Multiple Regression; Chapter 15: Chi-Square • Plous: Chapter 17: Social Influences; Chapter 18: Group Judgments and Decisions

  5. Five steps to hypothesis testing • Step 1: Identify the research problem (hypothesis). Describe the null and alternative hypotheses. For correlation, the null is that r = 0 (no relationship) • Step 2: Decision rule: alpha level (α = .05 or .01)? Critical statistic (e.g. critical r) value from table? • Step 3: Calculations: compute the observed statistic (e.g. F = MSBetween / MSWithin for ANOVA; the observed r for correlation) • Step 4: Make decision whether or not to reject the null hypothesis. If the observed r is farther from zero than the critical r, then reject the null • Step 5: Conclusion: tie findings back in to the research problem

  6. Finding a statistically significant correlation • The result is “statistically significant” if: • the observed correlation is larger than the critical correlation (we want our r to be big if we want it to be significantly different from zero; either negative or positive, but just far away from zero) • the p value is less than 0.05 (which is our alpha); we want our “p” to be small • we reject the null hypothesis • then we have support for our alternative hypothesis

  7. Correlation Correlation: Measure of how two variables co-occur and also can be used for prediction • Range between -1 and +1

  8. Correlation • The closer to zero the weaker the relationship and the worse the prediction • Positive or negative

  9. Positive correlation • as values on one variable go up, so do values for the other variable • pairs of observations tend to occupy similar relative positions • higher scores on one variable tend to co-occur with higher scores on the second variable • lower scores on one variable tend to co-occur with lower scores on the second variable • scatterplot shows a cluster of points running from lower left to upper right

  10. Negative correlation • as values on one variable go up, values for the other variable go down • pairs of observations tend to occupy dissimilar relative positions • higher scores on one variable tend to co-occur with lower scores on the second variable • lower scores on one variable tend to co-occur with higher scores on the second variable • scatterplot shows a cluster of points running from upper left to lower right

  11. Zero correlation • as values on one variable go up, values for the other variable go... anywhere • pairs of observations tend to occupy seemingly random relative positions • scatterplot shows no apparent slope

  12. Correlation • The more closely the dots approximate a straight line, the stronger the relationship is • Perfect correlation = +1.00 or -1.00 • One variable perfectly predicts the other • No variability in the scatter plot • The dots fall exactly on a straight line

  13. Correlation - How do we calculate the exact r? Computational formula for correlation, abbreviated by r. Pearson correlation coefficient (r): a number between -1.00 and +1.00 that describes the linear relationship between pairs of quantitative variables. The formula: $r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{n\sum X^2 - (\sum X)^2}\,\sqrt{n\sum Y^2 - (\sum Y)^2}}$

  14. Correlation - How do we calculate the exact r? We want to know the relationship between math ability and spelling ability. We gave 5 people a 20-point math test and a 20-point spelling test.

      Name   Math (X)   Spelling (Y)    XY     X²     Y²
      KL        13          14         182    169    196
      GC         9          18         162     81    324
      JB         7          12          84     49    144
      MD         5          10          50     25    100
      RG         1           6           6      1     36
      Σ         35          60         484    325    800

  15. First, let’s draw a scatter plot of the five math (X) and spelling (Y) pairs: (13, 14), (9, 18), (7, 12), (5, 10), (1, 6)

  16. Correlation - Let’s do one • Step 1: Find n: n = 5 (5 pairs) • Step 2: Find ΣX and ΣY: ΣX = 35, ΣY = 60 • Step 3: Find ΣXY: ΣXY = 484 • Step 4: Find ΣX² and ΣY²: ΣX² = 325, ΣY² = 800 • Step 5: Plug the numbers into the formula for r

  17. Step 5: Plug in the numbers (n = 5, ΣX = 35, ΣY = 60, ΣXY = 484, ΣX² = 325, ΣY² = 800):

      $$r = \frac{(5)(484) - (35)(60)}{\sqrt{(5)(325) - (35)^2}\,\sqrt{(5)(800) - (60)^2}} = \frac{2420 - 2100}{\sqrt{1625 - 1225}\,\sqrt{4000 - 3600}} = \frac{320}{\sqrt{400}\,\sqrt{400}} = \frac{320}{400} = .80$$
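The five steps on slides 16 and 17 can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `pearson_r` and the variable names are chosen here for the example and are not part of the lecture.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r via the computational (raw-score) formula from the slides."""
    n = len(xs)                                   # Step 1: n pairs
    sum_x, sum_y = sum(xs), sum(ys)               # Step 2: sum of X, sum of Y
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # Step 3: sum of XY
    sum_x2 = sum(x * x for x in xs)               # Step 4: sum of X squared
    sum_y2 = sum(y * y for y in ys)               #         and sum of Y squared
    # Step 5: plug in the numbers
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

math_scores = [13, 9, 7, 5, 1]        # X column of the slide table
spelling_scores = [14, 18, 12, 10, 6] # Y column of the slide table
r = pearson_r(math_scores, spelling_scores)
print(round(r, 2))  # 0.8
```

The intermediate values match the hand calculation: the numerator is 320 and the denominator is 400.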

  18. Make decision whether the correlation is different from zero. α = 0.05, df = 3. Observed r(3) = 0.80; critical r(3) = 0.878. Conclusion: r = 0.80 is not bigger than the critical r = .878, so this is not a significant r (not significantly different from zero – nothing going on). r(3) = 0.80; n.s.

  19. Observed r(3) = 0.80; critical r(3) = 0.878. Conclusion: r = 0.80 is not bigger than the critical r = .878, so this is not a significant r (not significantly different from zero – nothing going on). These data suggest a strong positive correlation between math ability and spelling ability; however, this correlation was not large enough to reach significance, r(3) = 0.80; n.s.
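Where does a critical r such as 0.878 come from? One way to see it is to convert the critical t value for the same degrees of freedom, using r = t / √(t² + df). The sketch below assumes the two-tailed critical t for df = 3 at α = .05 is 3.182, a standard t-table value that does not appear in the slides; the function names are illustrative.

```python
from math import sqrt

def critical_r(t_crit, df):
    """Convert a critical t value into the matching critical r."""
    return t_crit / sqrt(t_crit ** 2 + df)

def is_significant(observed_r, crit_r):
    """Reject the null when the observed r is farther from zero than critical r."""
    return abs(observed_r) > crit_r

# Assumption: two-tailed critical t for df = 3, alpha = .05, from a t table.
T_CRIT_DF3 = 3.182

crit = critical_r(T_CRIT_DF3, df=3)
print(round(crit, 3))              # 0.878
print(is_significant(0.80, crit))  # False
```

With only 5 pairs, even r = 0.80 does not clear the bar.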

  20. What if we ran more subjects?

  21. Correlation - How do we calculate the exact r? Computational formula for correlation, abbreviated by r. Pearson correlation coefficient (r): a number between -1.00 and +1.00 that describes the linear relationship between pairs of quantitative variables. The formula: $r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{n\sum X^2 - (\sum X)^2}\,\sqrt{n\sum Y^2 - (\sum Y)^2}}$

  22. What if we ran more subjects? Correlation - How do we calculate the exact r? We want to know the relationship between math ability and spelling ability. We gave 50 people a 20-point math test and a 20-point spelling test. The same data were copied 10 times to highlight the power of larger samples.

      Name   Math (X)   Spelling (Y)    XY     X²     Y²
      KL        13          14         182    169    196
      GC         9          18         162     81    324
      JB         7          12          84     49    144
      :          :           :           :      :      :
      RG         1           6           6      1     36
      Σ        350         600        4840   3250   8000

  23. Correlation - Let’s do one • Step 1: Find n: n = 50 (50 pairs) • Step 2: Find ΣX and ΣY: ΣX = 350, ΣY = 600 • Step 3: Find ΣXY: ΣXY = 4840 • Step 4: Find ΣX² and ΣY²: ΣX² = 3250, ΣY² = 8000 • Step 5: Plug the numbers into the formula for r

  24. Step 5: Plug in the numbers (n = 50, ΣX = 350, ΣY = 600, ΣXY = 4840, ΣX² = 3250, ΣY² = 8000):

      $$r = \frac{(50)(4840) - (350)(600)}{\sqrt{(50)(3250) - (350)^2}\,\sqrt{(50)(8000) - (600)^2}} = \frac{242000 - 210000}{\sqrt{40000}\,\sqrt{40000}} = \frac{32000}{40000} = .80$$
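Copying the data ten times leaves r unchanged but raises the degrees of freedom, and that is what changes the outcome of the test. The sketch below shows this with the equivalent t statistic, t = r√df / √(1 − r²). That formula is a standard conversion rather than something taken from the slides, and the function names are illustrative.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r via the computational (raw-score) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    return numerator / (sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2))

def t_from_r(r, df):
    """t statistic equivalent to a correlation r with df = n - 2."""
    return r * sqrt(df) / sqrt(1 - r ** 2)

base_x = [13, 9, 7, 5, 1]
base_y = [14, 18, 12, 10, 6]

for copies in (1, 10):
    xs, ys = base_x * copies, base_y * copies   # same data copied `copies` times
    n = len(xs)
    df = n - 2
    r = pearson_r(xs, ys)
    print(n, df, round(r, 2), round(t_from_r(r, df), 2))
# 5 3 0.8 2.31
# 50 48 0.8 9.24
```

The same r = 0.80 corresponds to a much larger t with 50 pairs than with 5.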

  25. What if we had run more participants? α = 0.05, df = 48. Observed r(48) = 0.80; critical r(48) = 0.288. r(48) = 0.80; p < 0.05.

  26. Observed r(48) = 0.80; critical r(48) = 0.288. Conclusion: r = 0.80 is bigger than the critical r = .288, so there is a significant r (yes, significantly different from zero – something going on). These data suggest a strong positive correlation between math ability and spelling ability, and this correlation was large enough to reach significance, r(48) = 0.80; p < 0.05

  27. Correlation matrices. Correlation matrix: table showing correlations for all possible pairs of variables.

                  Education   Age      IQ       Income
      Education   1.0**       0.41*    0.38*    0.65**
      Age         0.41*       1.0**    -0.02    0.52*
      IQ          0.38*       -0.02    1.0**    0.27*
      Income      0.65**      0.52*    0.27*    1.0**

      * p < 0.05   ** p < 0.01

  28. Correlation matrices. Correlation matrix: table showing correlations for all possible pairs of variables.

                  Education   Age      IQ       Income
      Education
      Age         0.41*
      IQ          0.38*       -0.02
      Income      0.65**      0.52*    0.27*

      * p < 0.05   ** p < 0.01
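A correlation matrix is just the pairwise r for every combination of variables. The sketch below builds one and prints the lower triangle, as on slide 28. It reuses the math and spelling scores from the earlier example, and the third variable (`shoe_size`) is made-up data added only for illustration. It does not compute the significance stars.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r via the computational (raw-score) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    return numerator / (sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2))

def correlation_matrix(data):
    """Return {row: {col: r}} for every pair of variables in `data`."""
    names = list(data)
    return {a: {b: pearson_r(data[a], data[b]) for b in names} for a in names}

data = {
    "math": [13, 9, 7, 5, 1],         # from the math/spelling example
    "spelling": [14, 18, 12, 10, 6],  # from the math/spelling example
    "shoe_size": [8, 11, 7, 10, 9],   # hypothetical values, illustration only
}

matrix = correlation_matrix(data)
names = list(data)
for i, row in enumerate(names):
    # Print only the lower triangle (including the diagonal).
    cells = [f"{matrix[row][col]:6.2f}" for col in names[: i + 1]]
    print(f"{row:>10}", *cells)
```

The diagonal is always 1.00 (each variable correlated with itself), and the matrix is symmetric, which is why the lower triangle carries all the information.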

  29. Finding a statistically significant correlation • The result is “statistically significant” if: • the observed correlation is larger than the critical correlation (we want our r to be big if we want it to be significantly different from zero; either negative or positive, but just far away from zero) • the p value is less than 0.05 (which is our alpha); we want our “p” to be small • we reject the null hypothesis • then we have support for our alternative hypothesis

  30. Correlation matrices • Variable names: make up any name that means something to you • VARX = “Variable X” • VARY = “Variable Y” • VARZ = “Variable Z” • Labeled cells in the matrix: correlation of X with X, correlation of Y with Y, correlation of Z with Z

  31. Correlation matrices. Does this correlation reach statistical significance? • Variable names: make up any name that means something to you • VARX = “Variable X” • VARY = “Variable Y” • VARZ = “Variable Z” • Labeled cells in the matrix: correlation of X with Y, p value for correlation of X with Y

  32. Correlation matrices. Does this correlation reach statistical significance? • Variable names: make up any name that means something to you • VARX = “Variable X” • VARY = “Variable Y” • VARZ = “Variable Z” • Labeled cells in the matrix: correlation of X with Z, p value for correlation of X with Z

  33. Correlation matrices. Does this correlation reach statistical significance? • Variable names: make up any name that means something to you • VARX = “Variable X” • VARY = “Variable Y” • VARZ = “Variable Z” • Labeled cells in the matrix: correlation of Y with Z, p value for correlation of Y with Z

  34. Correlation matrices What do we care about?

  35. Correlation: Independent and dependent variables • When used for prediction we refer to the predicted variable as the dependent variable and the predictor variable as the independent variable • What are we predicting? [Scatterplot axes labeled: Dependent Variable, Independent Variable]

  36. What are we predicting? Correlation • Positive correlation: as values on one variable go up, so do values for the other variable • Negative correlation: as values on one variable go up, the values for the other variable go down • Example: yearly income by expenses per year: positive correlation [scatterplot axes: Yearly Income, Expenses per year]

  37. What are we predicting? Correlation • Positive correlation: as values on one variable go up, so do values for the other variable • Negative correlation: as values on one variable go up, the values for the other variable go down • Example: temperatures by time spent outside in Tucson in summer: negative correlation [scatterplot axes: Temperature, Time outside]

  38. What are we predicting? Correlation • Positive correlation: as values on one variable go up, so do values for the other variable • Negative correlation: as values on one variable go up, the values for the other variable go down • Example: height by average driving speed: zero correlation [scatterplot axes: Height, Average Speed]

  39. What are we predicting? Correlation • Positive correlation: as values on one variable go up, so do values for the other variable • Negative correlation: as values on one variable go up, the values for the other variable go down • Example: amount Healthtex spends per month on advertising by sales in the month: positive correlation [scatterplot axes: Amount of sales, Amount spent on advertising]

  40. Correlation - What do we need to define a line? • Y-intercept = “a” (also “b0”): where the line crosses the Y axis • Slope = “b” (also “b1”): how steep the line is • The predicted variable goes on the “Y” axis and is called the dependent variable • The predictor variable goes on the “X” axis and is called the independent variable [Scatterplot of Yearly Income (Y) against Expenses per year (X), annotated: “If you spend this much” (X), “you probably make this much” (Y)]

  41. Correlation - the prediction line - what is it good for? Prediction line • makes the relationship easier to see (even if specific observations - dots - are removed) • identifies the center of the cluster of (paired) observations • identifies the central tendency of the relationship (kind of like a mean) • can be used for prediction • should be drawn to provide a “best fit” for the data • should be drawn to provide maximum predictive power for the data • should be drawn to provide minimum predictive error

  42. Correlation - What do we need to define a line? • Y-intercept = “a”: where the line crosses the Y axis • Slope = “b”: how steep the line is [Scatterplots of Yearly Income against Expenses per year, with panel annotations: “Y-intercept is good…slope is wrong” and “Y-intercept is wrong…slope is good”]
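"Minimum predictive error" can be made concrete by summing the squared prediction errors for a candidate line. The sketch below compares three lines on made-up income and expense figures (hypothetical data, not from the slides). The pair a = 14.1, b = 1.03 is the least-squares line for this particular made-up data set, and the other two lines mimic the "slope is wrong" and "Y-intercept is wrong" panels.

```python
def sse(a, b, xs, ys):
    """Sum of squared prediction errors for the line y = a + b * x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data (not from the slides), in thousands of dollars.
expenses = [10, 20, 30, 40, 50]   # predictor, goes on the X axis
income = [25, 32, 48, 55, 65]     # predicted, goes on the Y axis

# Candidate lines as (Y-intercept a, slope b).
candidates = {
    "least-squares line": (14.1, 1.03),
    "intercept good, slope wrong": (14.1, 0.50),
    "intercept wrong, slope good": (30.0, 1.03),
}

errors = {name: sse(a, b, expenses, income) for name, (a, b) in candidates.items()}
for name, err in errors.items():
    print(f"{name}: SSE = {err:.2f}")
```

Getting either the slope or the intercept wrong inflates the error, so both are needed to define the best-fitting line.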

  43. Correlation - let’s do another one. Does brushing your teeth correlate with fewer cavities? • Step 1: Draw scatterplot [axes: Number of times per day teeth are brushed, Number of cavities; both run from 0 to 5] • Step 2: Data table

             X     Y    XY    X²    Y²
             1     5     5     1    25
             3     4    12     9    16
             2     3     6     4     9
             3     2     6     9     4
             5     1     5    25     1
      Σ     14    15    34    48    55

      • Step 3: Estimate r and prediction line • Step 4: Find r

  44. Correlation - Let’s do one • Step 1: Find n: n = 5 (5 pairs) • Step 2: Find ΣX and ΣY: ΣX = 14, ΣY = 15 • Step 3: Find ΣXY: ΣXY = 34 • Step 4: Find ΣX² and ΣY²: ΣX² = 48, ΣY² = 55 • Step 5: Plug the numbers into the formula for r

  45. Correlation - Let’s do one. Step 5: Plug in the numbers (n = 5, ΣX = 14, ΣY = 15, ΣXY = 34, ΣX² = 48, ΣY² = 55):

      $$r = \frac{(5)(34) - (14)(15)}{\sqrt{(5)(48) - (14)^2}\,\sqrt{(5)(55) - (15)^2}} = \frac{170 - 210}{\sqrt{44}\,\sqrt{50}} = \frac{-40}{46.90} = -.85$$

  46. Find r: r = -0.85
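As a cross-check on r = -0.85, the same value can be reached from deviations around the means, the definitional form of Pearson r, rather than from raw sums. The slides use the computational formula; the deviation form below is the standard equivalent, and the function name is illustrative.

```python
from math import sqrt

def pearson_r_deviation(xs, ys):
    """Pearson r from deviations around the means (definitional form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

xs = [1, 3, 2, 3, 5]   # X column of the brushing/cavities table
ys = [5, 4, 3, 2, 1]   # Y column of the brushing/cavities table
r = pearson_r_deviation(xs, ys)
print(round(r, 2))  # -0.85
```

Both forms give the same answer. The raw-score version is simply easier to fill in by hand from a table of sums.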

  47. Draw a scatterplot of the five (X, Y) pairs: (1, 5), (3, 4), (2, 3), (3, 2), (5, 1)

  48. Draw a scatterplot

  49. Draw a regression line
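The regression line can also be computed rather than drawn by eye. The sketch below uses the usual least-squares formulas, b = (nΣXY − ΣXΣY) / (nΣX² − (ΣX)²) and a = (ΣY − bΣX) / n. These are standard results that are not shown in the slides up to this point, and the function names are illustrative.

```python
def regression_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b * x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
    a = (sum_y - b * sum_x) / n                                   # Y-intercept
    return a, b

def predict(a, b, x):
    """Predicted Y for a given X on the line y = a + b * x."""
    return a + b * x

xs = [1, 3, 2, 3, 5]   # X column of the brushing/cavities table
ys = [5, 4, 3, 2, 1]   # Y column of the brushing/cavities table
a, b = regression_line(xs, ys)
print(round(a, 2), round(b, 2))     # 5.55 -0.91
print(round(predict(a, b, 4), 2))   # 1.91
```

The negative slope matches the sign of r = -0.85: as X goes up, the predicted Y goes down.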
