Statistics for the Social Sciences. Psychology 340 Fall 2006. Relationships between variables. Correlation.
Statistics for the Social Sciences Psychology 340 Fall 2006 Relationships between variables
Correlation • Write down what (you think) a correlation is. • Write down an example of a correlation. • Association between scores on two variables • Age and coordination skills in children: as kids get older, their motor coordination tends to improve • Price and quality: generally, the more expensive something is, the higher its quality
Correlation and Causality • Correlational research design • Correlation as a kind of research design (observational designs) • Correlation as a statistical procedure
Another thing to consider about correlation • Correlations describe relationships between two variables, but DO NOT explain why the variables are related. Suppose that Dr. Steward finds that rates of spilled coffee and severity of plane turbulence are strongly positively correlated. • One might argue that turbulence causes coffee spills • One might argue that spilling coffee causes turbulence
Another thing to consider about correlation • Correlations describe relationships between two variables, but DO NOT explain why the variables are related. Suppose that Dr. Cranium finds a positive correlation between head size and digit span (roughly the number of digits you can remember). • One might argue that the bigger your head, the larger your digit span • One might argue that head size and digit span both increase with AGE (but head size and digit span aren’t directly related)
Another thing to consider about correlation • Correlations describe relationships between two variables, but DO NOT explain why the variables are related. For many years, instructors have noted that the reported fatality rate of grandparents increases during midterm and final exam periods. • One might argue that college exams cause grandparent death
Relationships between variables • Properties of a correlation • Form (linear or non-linear) • Direction (positive or negative) • Strength (none, weak, strong, perfect) • To examine this relationship you should: • Make a scatterplot - a picture of the relationship • Compute the Correlation Coefficient - a numerical description of the relationship
Graphing Correlations • Steps for making a scatterplot (scatter diagram) • Draw axes and assign variables to them • Determine range of values for each variable and mark on axes • Mark a dot for each person’s pair of scores
Scatterplot • Plots one variable against the other • Each point corresponds to a different individual • Imagine a line through the data points • Useful for “seeing” the relationship • Form, Direction, and Strength
Person   X   Y
A        6   6
B        1   2
C        5   6
D        3   4
E        3   2
[Scatterplot: the five points plotted with X (1 to 6) on the horizontal axis and Y (1 to 6) on the vertical axis]
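The scatterplot steps can be tried out with a small text-only sketch. It uses the five (X, Y) pairs from the slide; the function name `text_scatterplot` and the character-grid layout are illustrative assumptions, not part of the lecture.

```python
# Five (X, Y) pairs from the scatterplot slide, labelled A to E.
points = {"A": (6, 6), "B": (1, 2), "C": (5, 6), "D": (3, 4), "E": (3, 2)}


def text_scatterplot(points, size=6):
    """Return the rows of a character grid with one label per individual."""
    rows = []
    for y in range(size, 0, -1):  # the top row holds the largest Y value
        cells = []
        for x in range(1, size + 1):
            # Use the person's label if someone has this (x, y) pair, else a dot.
            label = next((name for name, p in points.items() if p == (x, y)), ".")
            cells.append(label)
        rows.append(f"{y} " + " ".join(cells))
    # Bottom row: the X axis tick labels.
    rows.append("  " + " ".join(str(x) for x in range(1, size + 1)))
    return rows


plot_rows = text_scatterplot(points)
print("\n".join(plot_rows))
```

The labels drift from lower left to upper right, which is the visual cue for a positive, roughly linear relationship.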
Form • Linear • Non-linear
Direction • Positive: X & Y vary in the same direction; as X goes up, Y goes up; positive Pearson’s r • Negative: X & Y vary in opposite directions; as X goes up, Y goes down; negative Pearson’s r
Strength • The strength of the relationship • Spread around the line (note the axis scales) • Correlation coefficient will range from -1 to +1 • Zero means “no linear relationship” • The farther r is from zero, the stronger the relationship
Strength • r = -1.0: “perfect negative corr.”, r2 = 100% • r = 0.0: “no relationship”, r2 = 0.0 • r = +1.0: “perfect positive corr.”, r2 = 100% • The farther from zero, the stronger the relationship
The Correlation Coefficient • Formulas for the correlation coefficient: Used this one in PSY138 Common alternative
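Judging from the two worked examples in the following slides (one using SP, one using z-scores), the two formulas can be written as below. Which of them carries the “PSY138” label is not stated in the text, so the ordering here is an assumption. The z-score version assumes population z-scores, as in the worked example.

```latex
r = \frac{SP}{\sqrt{SS_X \, SS_Y}},
\qquad SP = \sum (X - \bar{X})(Y - \bar{Y}),
\quad SS_X = \sum (X - \bar{X})^2,
\quad SS_Y = \sum (Y - \bar{Y})^2

r = \frac{\sum Z_X Z_Y}{N}
```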
Computing Pearson’s r (using SP) • Step 1: SP (Sum of the Products)
        X      Y     X - mean   Y - mean   product
        6      6       2.4        2.0        4.8
        1      2      -2.6       -2.0        5.2
        5      6       1.4        2.0        2.8
        3      4      -0.6        0.0        0.0
        3      2      -0.6       -2.0        1.2
mean    3.6    4.0
sum                    0.0        0.0       14.0 = SP
• Quick check: each deviation column sums to 0.0
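Step 1 can be reproduced with a few lines of Python. This is a minimal sketch using the slide's five pairs; the variable names are illustrative.

```python
# The five (X, Y) pairs from the worked example.
xs = [6, 1, 5, 3, 3]
ys = [6, 2, 6, 4, 2]

mean_x = sum(xs) / len(xs)  # 3.6
mean_y = sum(ys) / len(ys)  # 4.0

# Deviation of each score from its own mean.
dev_x = [x - mean_x for x in xs]
dev_y = [y - mean_y for y in ys]

# SP: multiply the paired deviations, then add the products up.
products = [dx * dy for dx, dy in zip(dev_x, dev_y)]
sp = sum(products)

print("deviation sums:", round(sum(dev_x), 2), round(sum(dev_y), 2))
print("SP:", round(sp, 2))
```

The deviation sums act as the slide's quick check: if either is not (approximately) zero, a mean or a subtraction went wrong.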
Computing Pearson’s r (using SP) • Step 2: SSX & SSY
        X      Y     X - mean   (X - mean)^2   Y - mean   (Y - mean)^2   product
        6      6       2.4         5.76           2.0         4.0          4.8
        1      2      -2.6         6.76          -2.0         4.0          5.2
        5      6       1.4         1.96           2.0         4.0          2.8
        3      4      -0.6         0.36           0.0         0.0          0.0
        3      2      -0.6         0.36          -2.0         4.0          1.2
mean    3.6    4.0
sum                    0.0        15.20 = SSX     0.0        16.0 = SSY   14.0 = SP
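Step 2 squares the same deviations and sums them. A minimal sketch, with `sum_of_squares` as an illustrative helper name:

```python
xs = [6, 1, 5, 3, 3]
ys = [6, 2, 6, 4, 2]


def sum_of_squares(values):
    """Sum of squared deviations of the values from their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


ss_x = sum_of_squares(xs)
ss_y = sum_of_squares(ys)

print("SSX:", round(ss_x, 2))
print("SSY:", round(ss_y, 2))
```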
Computing Pearson’s r (using SP) • Step 3: compute r
r = SP / √(SSX × SSY) = 14.0 / √(15.20 × 16.0) = 14.0 / 15.59 ≈ .90
[Scatterplot of the five (X, Y) points, both axes running from 1 to 6]
• Appears linear • Positive relationship • Fairly strong relationship • .90 is far from 0, near +1
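Putting the three steps together gives a short function. This is a sketch of the SP-based computation under the slide's data; `pearson_r` is an illustrative name, not a library call.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r computed as SP / sqrt(SSX * SSY)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)


xs = [6, 1, 5, 3, 3]
ys = [6, 2, 6, 4, 2]
r = pearson_r(xs, ys)
print(f"r = {r:.3f}")
```

Without intermediate rounding the value is about 0.898, so whether it is quoted as .89 or .90 depends on truncating versus rounding to two decimals.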
The Correlation Coefficient • Formulas for the correlation coefficient: Used this one in PSY138 Common alternative
Computing Pearson’s r (using z-scores) • Step 1: compute standard deviation for X and Y (note: keep track of sample or population) • For this example we will assume the data are from a population
           X      Y     X - mean   (X - mean)^2   Y - mean   (Y - mean)^2
           6      6       2.4         5.76           2.0         4.0
           1      2      -2.6         6.76          -2.0         4.0
           5      6       1.4         1.96           2.0         4.0
           3      4      -0.6         0.36           0.0         0.0
           3      2      -0.6         0.36          -2.0         4.0
Mean       3.6    4.0
Sum                       0.0        15.20 = SSX     0.0        16.0 = SSY
Std dev    1.74   1.79    (population: √(SS / N), N = 5)
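The population standard deviations can be checked by hand or with the standard library. The sketch below does both; `population_sd` is an illustrative helper, while `statistics.pstdev` and `statistics.stdev` are the standard-library population and sample versions.

```python
import math
import statistics

xs = [6, 1, 5, 3, 3]
ys = [6, 2, 6, 4, 2]


def population_sd(values):
    """Population standard deviation: sqrt(SS / N)."""
    mean = sum(values) / len(values)
    ss = sum((v - mean) ** 2 for v in values)
    return math.sqrt(ss / len(values))


sd_x = population_sd(xs)
sd_y = population_sd(ys)
print(f"SD_X = {sd_x:.2f}, SD_Y = {sd_y:.2f}")

# The sample version divides by N - 1 instead of N, so it comes out larger.
sample_sd_x = statistics.stdev(xs)
print(f"sample SD_X = {sample_sd_x:.2f}")
```

This is why the slide says to keep track of sample versus population: the two choices give different standard deviations and therefore different z-scores.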
Computing Pearson’s r (using z-scores) • Step 2: compute z-scores
           X      Y     X - mean   Y - mean     Z_X      Z_Y
           6      6       2.4        2.0        1.38     1.1
           1      2      -2.6       -2.0       -1.49    -1.1
           5      6       1.4        2.0        0.80     1.1
           3      4      -0.6        0.0       -0.34     0.0
           3      2      -0.6       -2.0       -0.34    -1.1
Mean       3.6    4.0
Std dev    1.74   1.79
Sum                                             0.0      0.0
• Quick check: each z-score column sums to 0.0
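Each z-score is the deviation from the mean divided by the standard deviation. A minimal sketch, assuming population standard deviations as the slide does; `z_scores` is an illustrative name.

```python
import statistics

xs = [6, 1, 5, 3, 3]
ys = [6, 2, 6, 4, 2]


def z_scores(values):
    """Population z-scores: (value - mean) / population standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


z_x = z_scores(xs)
z_y = z_scores(ys)

print("Z_X:", [round(z, 2) for z in z_x])
print("Z_Y:", [round(z, 2) for z in z_y])
# Quick check: z-scores always sum to (approximately) zero.
print("sums:", round(sum(z_x), 2), round(sum(z_y), 2))
```

The printed values can differ from the table in the last digit because the table divides by standard deviations already rounded to two decimals.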
Computing Pearson’s r (using z-scores) • Step 3: compute r
           X      Y      Z_X      Z_Y     Z_X × Z_Y
           6      6      1.38     1.1       1.52
           1      2     -1.49    -1.1       1.64
           5      6      0.80     1.1       0.88
           3      4     -0.34     0.0       0.0
           3      2     -0.34    -1.1       0.37
Sum                      0.0      0.0       4.41
r = Σ(Z_X × Z_Y) / N = 4.41 / 5 ≈ .88 with the rounded z-scores above (≈ .90 if the z-scores are not rounded first)
[Scatterplot of the five (X, Y) points, both axes running from 1 to 6]
• Appears linear • Positive relationship • Fairly strong relationship • About .9 is far from 0, near +1
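The z-score route can be sketched the same way. The function name `pearson_r_from_z` is illustrative, and the computation assumes population z-scores (dividing by N), as in the slides.

```python
import statistics


def pearson_r_from_z(xs, ys):
    """Pearson's r as the mean of the products of paired population z-scores."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sd_x, sd_y = statistics.pstdev(xs), statistics.pstdev(ys)
    z_products = [
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y) for x, y in zip(xs, ys)
    ]
    return sum(z_products) / len(xs)


xs = [6, 1, 5, 3, 3]
ys = [6, 2, 6, 4, 2]
r_z = pearson_r_from_z(xs, ys)
print(f"r = {r_z:.3f}")
```

With full precision this agrees with the SP-based formula (about 0.898). The hand calculation lands slightly lower only because the z-scores were rounded before multiplying.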
A few more things to consider about correlation • Correlations are greatly affected by the range of scores in the data • Consider height and age relationship • Extreme scores can have dramatic effects on correlations • A single extreme score can radically change r • When considering "how good" a relationship is, we really should consider r2 (coefficient of determination), not just r.
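The effect of a single extreme score, and the difference between r and r2, can be illustrated with the worked example's data. The added point (10, 0) is a hypothetical extreme score invented for this sketch, and `pearson_r` is an illustrative helper.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r computed as SP / sqrt(SSX * SSY)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)


xs = [6, 1, 5, 3, 3]
ys = [6, 2, 6, 4, 2]

r_original = pearson_r(xs, ys)
# Hypothetical extreme score: one person with X = 10 and Y = 0.
r_with_extreme = pearson_r(xs + [10], ys + [0])
# Coefficient of determination for the original five pairs.
r_squared = r_original ** 2

print(f"r (original five pairs)  = {r_original:.2f}")
print(f"r (with one extreme pair) = {r_with_extreme:.2f}")
print(f"r2 (original five pairs) = {r_squared:.2f}")
```

In this sketch one added pair is enough to turn a strong positive correlation into a weak negative one. Note also that r2 is smaller than r for the original data, so r alone makes the relationship look somewhat stronger than r2 suggests.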
Correlation in Research Articles • Correlation matrix • A display of the correlations between more than two variables • Why have a “-”? • Why only half the table filled with numbers? [Example correlation matrix from a research article; one of the variables is Acculturation]
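A small correlation matrix can be built to see why such tables look the way they do. In this sketch X and Y come from the worked example, while the third variable W is hypothetical data invented for illustration; `pearson_r` is an illustrative helper.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r computed as SP / sqrt(SSX * SSY)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)


variables = {
    "X": [6, 1, 5, 3, 3],
    "Y": [6, 2, 6, 4, 2],
    "W": [1, 5, 2, 4, 3],  # hypothetical third variable, not from the slides
}
names = list(variables)

table_rows = []
for i, row_name in enumerate(names):
    cells = []
    for j, col_name in enumerate(names):
        if j < i:
            # Below the diagonal: report the correlation once.
            r = pearson_r(variables[row_name], variables[col_name])
            cells.append(f"{r:.2f}")
        elif j == i:
            # A variable correlates perfectly with itself, so print a dash.
            cells.append("-")
        else:
            # Above the diagonal would repeat values: r(X, Y) equals r(Y, X).
            cells.append("")
    table_rows.append([row_name] + cells)

print("  ".join(f"{cell:>5}" for cell in [""] + names))
for row in table_rows:
    print("  ".join(f"{cell:>5}" for cell in row))
```

The dash marks each variable's correlation with itself, which is always 1 and carries no information. Only half the table is filled because the matrix is symmetric.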
Next time • Predicting a variable based on other variables