Please click in. Set your clicker to channel 41. My last name starts with a letter somewhere between A. A – D B. E – L C. M – R D. S – Z.
Introduction to Statistics for the Social Sciences. SBS200, COMM200, GEOG200, PA200, POL200, SOC200. Lecture Section 001, Fall 2011. Room 201 Physics-Atmospheric Sciences (PAS). 10:00 - 10:50 Mondays & Wednesdays + Lab Session. Welcome. Please double check: all cell phones and other electronic devices are turned off and stowed away. http://www.youtube.com/watch?v=oSQJP40PcGI
Use this as your study guide. By the end of lecture today (11/14/11): Logic of hypothesis testing with correlations; Interpreting correlations and scatterplots; Simple regression; Using correlation for predictions; r versus r²
Homework due next class, November 16th. Assignment 12: Correlation worksheet (can be found on class website). Be sure that your Class ID is on your homework. Hand in your homework.
Readings for next exam. Lind: Chapter 13: Linear Regression and Correlation; Chapter 14: Multiple Regression; Chapter 15: Chi-Square. Plous: Chapter 17: Social Influences; Chapter 18: Group Judgments and Decisions.
Review (we've seen this slide before). Scatterplot: displays relationships between two continuous variables. Correlation: measure of how two variables co-occur; it can also be used for prediction. Ranges between -1 and +1. The closer to zero, the weaker the relationship and the worse the prediction. Can be positive or negative.
Correlation: ranges between -1 and +1
+1.00 perfect relationship = perfect predictor
+0.80 strong relationship = good predictor
+0.20 weak relationship = poor predictor
 0    no relationship = very poor predictor
-0.20 weak relationship = poor predictor
-0.80 strong relationship = good predictor
-1.00 perfect relationship = perfect predictor
Positive correlation: as values on one variable go up, so do values for the other variable. Example: height of mothers by height of daughters.
Negative correlation: as values on one variable go up, the values for the other variable go down. Example: brushing teeth by number of cavities.
Zero correlation • as values on one variable go up, values for the other variable go... anywhere • pairs of observations tend to occupy seemingly random relative positions • scatterplot shows no apparent slope
Five steps to hypothesis testing
Step 1: Identify the research problem (hypothesis). Describe the null and alternative hypotheses. For correlation, the null is that r = 0 (no relationship).
Step 2: Decision rule • Alpha level (α = .05 or .01)? • Critical statistic (e.g. critical r) value from table?
Step 3: Calculations
Step 4: Make decision whether or not to reject the null hypothesis. If observed r is bigger than critical r (in absolute value), then reject the null.
Step 5: Conclusion: tie findings back in to the research problem
Finding a statistically significant correlation • The result is “statistically significant” if: • the observed correlation is larger than the critical correlation (we want our r to be big if we want it to be significantly different from zero: either negative or positive, but just far away from zero) • the p value is less than 0.05 (which is our alpha); we want our “p” to be small • we reject the null hypothesis • then we have support for our alternative hypothesis
Correlation - How do we calculate the exact r? Computational formula for correlation, abbreviated by r. Pearson correlation coefficient (r): a number between -1.00 and +1.00 that describes the linear relationship between pairs of quantitative variables. The formula:
$$ r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{n\sum X^2 - (\sum X)^2}\,\sqrt{n\sum Y^2 - (\sum Y)^2}} $$
Correlation - How do we calculate the exact r? We want to know the relationship between math ability and spelling ability. We gave 5 people a 20-point math test and a 20-point spelling test.
Name   Math (X)   Spelling (Y)    XY     X²     Y²
KL        13          14         182    169    196
GC         9          18         162     81    324
JB         7          12          84     49    144
MD         5          10          50     25    100
RG         1           6           6      1     36
Σ         35          60         484    325    800
First, let's draw a scatter plot.
Correlation - Let's do one
Step 1: Find n. n = 5 (5 pairs)
Step 2: Find ΣX and ΣY. ΣX = 35, ΣY = 60
Step 3: Find ΣXY. ΣXY = 484
Step 4: Find ΣX² and ΣY². ΣX² = 325, ΣY² = 800
Step 5: Plug the numbers into the formula
Step 5: Plug in the numbers
$$ r = \frac{(5)(484) - (35)(60)}{\sqrt{(5)(325) - (35)^2}\,\sqrt{(5)(800) - (60)^2}} = \frac{2420 - 2100}{\sqrt{1625 - 1225}\,\sqrt{4000 - 3600}} = \frac{320}{\sqrt{400}\,\sqrt{400}} = \frac{320}{400} = .80 $$
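The same arithmetic can be checked in a few lines of Python. This is only a sketch of the computational formula applied to the five pairs of scores from the table; the function name `pearson_r` and the variable names are made up for illustration.

```python
from math import sqrt

# Scores from the worked example: five people, 20-point math and spelling tests.
math_scores = [13, 9, 7, 5, 1]         # X
spelling_scores = [14, 18, 12, 10, 6]  # Y

def pearson_r(x, y):
    """Pearson r via the computational (raw-score) formula."""
    n = len(x)                                  # Step 1: number of pairs
    sum_x, sum_y = sum(x), sum(y)               # Step 2: sum of X, sum of Y
    sum_xy = sum(a * b for a, b in zip(x, y))   # Step 3: sum of XY
    sum_x2 = sum(a * a for a in x)              # Step 4: sum of X squared
    sum_y2 = sum(b * b for b in y)              #         and sum of Y squared
    # Step 5: plug the sums into the formula
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

r = pearson_r(math_scores, spelling_scores)
print(round(r, 2))  # 0.8
```

The intermediate values match the slide: the numerator is 320 and the denominator is 20 × 20 = 400.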
Make decision whether the correlation is different from zero. α = 0.05, df = 3. Observed r(3) = 0.80. Critical r(3) = 0.878. Conclusion: r = 0.80 is not bigger than r = .878, so it is not a significant r (not significantly different from zero, nothing going on). r(3) = 0.80; n.s.
These data suggest a strong positive correlation between math ability and spelling ability; however, this correlation was not large enough to reach significance, r(3) = 0.80; n.s.
What if we had run more participants? α = 0.05, df = 50. Observed r(50) = 0.80. Critical r(50) = 0.273. r(50) = 0.80; p < 0.05.
Conclusion: r = 0.80 is bigger than r = .273, so there is a significant r (yes, significantly different from zero, something going on). These data suggest a strong positive correlation between math ability and spelling ability, and this correlation was large enough to reach significance, r(50) = 0.80; p < 0.05.
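The decision rule in the two examples above can be sketched as a small lookup-and-compare. The two critical values are the ones quoted on the slides for α = 0.05; the dictionary and the function names `is_significant` and `report` are hypothetical, and any other df would have to be looked up in a critical-r table.

```python
# Critical r values quoted on the slides (alpha = 0.05).
# Any other df would need to come from a critical-r table.
CRITICAL_R_05 = {3: 0.878, 50: 0.273}

def is_significant(observed_r, df, table=CRITICAL_R_05):
    """Reject the null (r = 0) when |observed r| is bigger than critical r."""
    return abs(observed_r) > table[df]

def report(observed_r, df):
    """Format the result the way the slides write it up."""
    tail = "p < 0.05" if is_significant(observed_r, df) else "n.s."
    return f"r({df}) = {observed_r:.2f}; {tail}"

print(report(0.80, 3))   # r(3) = 0.80; n.s.
print(report(0.80, 50))  # r(50) = 0.80; p < 0.05
```

The observed r is the same in both cases; only the degrees of freedom (and so the critical r) change.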
Correlation: more examples
Negative correlation: altitude by density of oxygen
Negative correlation: brushing teeth by number of cavities
Positive correlation: amount of ice cream sold by number of sunburns
Negative correlation: miles per gallon (MPG) by size of engine
Perfect correlation = +1.00 or -1.00. One variable perfectly predicts the other. Positive correlation: height in inches and height in feet. Negative correlation: speed (mph) and time to finish race.
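A quick way to see a perfect correlation is to compute r for two variables where one is an exact linear function of the other. The sketch below uses made-up heights for the inches-versus-feet case, and a hypothetical negative case (points earned versus points missed on a 20-point test) that is not from the slides.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r via the computational (raw-score) formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

# Hypothetical heights: the same people measured in inches and in feet.
heights_in = [60, 63, 66, 69, 72]
heights_ft = [h / 12 for h in heights_in]
r_positive = pearson_r(heights_in, heights_ft)

# Hypothetical perfect negative case: points earned vs. points missed
# on a 20-point test (each missed value is exactly 20 minus the score).
scores = [13, 9, 7, 5, 1]
points_missed = [20 - s for s in scores]
r_negative = pearson_r(scores, points_missed)

print(round(r_positive, 6), round(r_negative, 6))
```

In both cases knowing one value tells you the other exactly, which is what "perfect predictor" means here.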
Correlation does not imply causation. Is it possible that they are causally related? Yes, but the correlational analysis does not answer that question. Remember crimes and bathrooms! Remember the birthday cakes! [Scatterplot: number of birthdays by number of birthday cakes]
Linear vs curvilinear relationship Linear relationship is a relationship that can be described best with a straight line Curvilinear relationship is a relationship that can be described best with a curved line Review We’ve seen this slide before
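Because Pearson r describes only the straight-line part of a relationship, a strongly curved relationship can still produce an r of zero. The sketch below uses made-up data (a straight line and a U-shaped curve over the same X values) to illustrate this; none of these numbers come from the lecture.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r via the computational (raw-score) formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

x = [-3, -2, -1, 0, 1, 2, 3]            # hypothetical X values
y_linear = [2 * v + 1 for v in x]       # straight line: Y = 2X + 1
y_curved = [v * v for v in x]           # U-shaped curve: Y = X squared

r_linear = pearson_r(x, y_linear)
r_curved = pearson_r(x, y_curved)

print(round(r_linear, 2), round(r_curved, 2))
```

Y is completely determined by X in both cases, but only the straight-line relationship shows up as a large r, which is one reason to look at the scatterplot before interpreting a correlation.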
1. Describe one positive correlation. Draw a scatterplot (label axes).
2. Describe one negative correlation. Draw a scatterplot (label axes).
3. Describe one zero correlation. Draw a scatterplot (label axes).
4. Describe one perfect correlation (positive or negative). Draw a scatterplot (label axes).
5. Describe a curvilinear relationship. Draw a scatterplot (label axes).
Example for item 1: This shows the strong positive (.8) relationship between the heights of daughters (measured in inches) and the heights of their mothers (measured in inches). Both variables are listed, as are direction and strength. Both axes and values are labeled.
[Scatterplot: height of mothers (in) by height of daughters (inches)]
Let's review the values of the correlation coefficient for each of the following. Top row: variability differs (aka scatter or noise); these three have the same slope but different scatter. Middle row: slope differs; these three have the same scatter (none) but different slopes. Bottom row: non-linear relationships. http://www.ruf.rice.edu/~lane/stat_sim/reg_by_eye/index.html http://argyll.epsb.ca/jreed/math9/strand4/scatterPlot.htm
Correlation: the more closely the dots approximate a straight line, the stronger the relationship is. • Perfect correlation = +1.00 or -1.00 • One variable perfectly predicts the other • No variability in the scatter plot • The dots fall on a straight line
Let's estimate the correlation coefficient for each of the following: r = +.98, r = .20, r = +.83, r = -.63, r = +.04, r = -.43. http://www.ruf.rice.edu/~lane/stat_sim/reg_by_eye/index.html http://argyll.epsb.ca/jreed/math9/strand4/scatterPlot.htm
Correlation matrices. Correlation matrix: table showing correlations for all possible pairs of variables.
[Table: correlation matrix for Education, Age, IQ, and Income. Diagonal entries (each variable with itself) are 1.0**. Off-diagonal entries: 0.41*, 0.38*, 0.65**, -0.02, 0.52*, 0.27*. The full matrix shows each of these twice, once above and once below the diagonal; the reduced version shows each only once.]
* p < 0.05 ** p < 0.01
Correlation matrices: reading the output
• Variable names: make up any name that means something to you (VARX = “Variable X”, VARY = “Variable Y”, VARZ = “Variable Z”)
• The diagonal holds the correlation of each variable with itself (X with X, Y with Y, Z with Z)
• For each pair (X with Y, X with Z, Y with Z), the output gives the correlation and the p value for that correlation, and each pair appears twice in the matrix
• For each pair, ask: does this correlation reach statistical significance?
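To make the layout of a correlation matrix concrete, here is a minimal sketch that builds one for three variables. The scores are made up, the names VARX, VARY, and VARZ simply follow the slide's naming convention, and the helper `correlation_matrix` is hypothetical. P values are left out because computing them needs a t distribution, which this stdlib-only sketch does not include.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r via the computational (raw-score) formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

# Hypothetical scores for three made-up variables (not data from the lecture).
data = {
    "VARX": [2, 4, 5, 7, 9],
    "VARY": [1, 3, 6, 6, 10],
    "VARZ": [8, 7, 5, 4, 1],
}

def correlation_matrix(columns):
    """Correlate every variable with every variable, including itself."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names}
            for a in names}

matrix = correlation_matrix(data)
print("      " + "   ".join(data))
for row_name in data:
    cells = "  ".join(f"{matrix[row_name][col]:+.2f}" for col in data)
    print(row_name, " ", cells)
```

The diagonal comes out as 1.00 (each variable with itself), and the value for X with Y is the same as the value for Y with X, which is why each pair shows up twice in the full matrix.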
Correlation matrices What do we care about?
Thank you! See you next time!!