Introduction to Quantitative Data Analysis (continued). Reading on Quantitative Data Analysis: Baxter and Babbie, 2004, Chapter 12. Recall: Correlation. Correlation is used to measure and describe a relationship between two variables.
Introduction to Quantitative Data Analysis (continued) Reading on Quantitative Data Analysis: Baxter and Babbie, 2004, Chapter 12
Recall: Correlation • Correlation is used to measure and describe a relationship between two variables. • Usually these two variables are simply observed as they exist in the environment; there is no attempt to control or manipulate them.
Correlation • The correlation coefficient measures three characteristics of the relationship between X and Y: • The direction of the relationship. • The form of the relationship. • The degree of the relationship. • Pearson’s r • Called the Pearson product-moment correlation coefficient in the textbook (p. 290).
Scatter Plot • What is the relationship between level of education and lifetime earnings?
Scatter Plot • Designate one variable X and the other Y. • In some cases it does not matter which is which, but when one variable is used to predict the other, X is the “predictor” variable (the variable you are predicting from; the independent variable in the hypothesis). • Draw axes of equal length for your graph. • Determine the range of values for each variable. Place the high values of X to the right on the horizontal axis and the high values of Y toward the top of the vertical axis. Label convenient points along each axis. • For each pair of scores, find the point where the X and Y values intersect and mark it with a dot. • Label each axis and give the entire graph a title.
Direction of Relationship • A scatter plot shows at a glance the direction of the relationship. • A positive correlation appears as a cluster of data points that slopes from the lower left to the upper right.
Positive Correlation • If the higher scores on X are generally paired with the higher scores on Y, and the lower scores on X are generally paired with the lower scores on Y, then the direction of the correlation between two variables is positive.
Direction of Relationship • A scatter plot shows at a glance the direction of the relationship. • A negative correlation appears as a cluster of data points that slopes from the upper left to the lower right.
Negative Correlation • If the higher scores on X are generally paired with the lower scores on Y, and the lower scores on X are generally paired with the higher scores on Y, then the direction of the correlation between two variables is negative.
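The pairing rule for positive and negative correlation can be checked numerically by looking at the sign of the sum of products of deviations, which is the numerator of Pearson's r. The following is a minimal sketch only: the function name `direction` and the scores are invented for illustration and do not come from the slides.

```python
def direction(xs, ys):
    """Return the direction of the linear relationship between paired scores."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations: positive when high X tends to go with
    # high Y, negative when high X tends to go with low Y.
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sp > 0:
        return "positive"
    if sp < 0:
        return "negative"
    return "none"


# Hypothetical scores, made up for this example.
x_scores = [1, 2, 3, 4]
print(direction(x_scores, [2, 4, 5, 9]))  # high X paired with high Y
print(direction(x_scores, [9, 5, 4, 2]))  # high X paired with low Y
```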
No Correlation • In cases where there is no correlation between two variables (both high and low values of X are equally paired with both high and low values of Y), there is no direction in the pattern of the dots. • They are scattered about the plot in an irregular pattern.
Perfect Correlation • When there is a perfect linear relationship, every change in the X variable is accompanied by a proportional change in the Y variable, so all of the data points fall exactly on a straight line and r is either -1 or +1.
Form of Relationship • Pearson’s r assumes an underlying linear relationship (a relationship that can be best represented by a straight line). • Not all relationships are linear.
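To illustrate why the linearity assumption matters, the sketch below compares a perfectly linear relationship with a perfectly curved one. It is only an illustration: the helper `pearson_r` and the data are assumptions made up for this example. In the curved case Y is completely determined by X, yet r comes out at zero because the relationship is not linear.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from deviations about the means (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)


# Made-up X values, symmetric around zero.
xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]  # points fall on a straight line
curved = [x ** 2 for x in xs]     # points fall on a U-shaped curve

r_linear = pearson_r(xs, linear)
r_curved = pearson_r(xs, curved)
print(r_linear, r_curved)
```

A scatter plot would reveal the U-shape immediately, which is one reason to plot the data before relying on r.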
Strength of Relationship • How can we describe the strength of the relationship in a scatter plot? • The correlation coefficient is a number between -1 and +1 that describes the relationship between two variables. • The sign (- or +) indicates the direction of the relationship. • The absolute value of the number indicates the strength of the relationship. • -1 is a perfect negative relationship, 0 is no relationship, and +1 is a perfect positive relationship. • The closer to -1 or +1, the stronger the relationship.
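The way sign and size are read separately can be sketched in a few lines. The function names `describe_r` and `stronger` are hypothetical and introduced only for this illustration.

```python
def describe_r(r):
    """Split a correlation coefficient into its direction and its strength."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    # Strength is the distance from zero, regardless of sign.
    return direction, abs(r)


def stronger(r1, r2):
    """Return whichever coefficient indicates the stronger relationship."""
    return r1 if abs(r1) >= abs(r2) else r2


print(describe_r(-0.8))
print(stronger(-0.8, 0.5))
```

Note that -0.8 indicates a stronger relationship than +0.5: the sign only gives the direction.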
Pearson’s r • Definitional formula: Computational formula:
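The formula images from this slide did not survive in the transcript. For reference, the standard textbook forms of Pearson's r are given below; the notation (X and Y for the paired scores, bars for means, n for the number of pairs) is an assumption and may differ from what the slide showed.

```latex
% Definitional formula: deviations of each score from its mean
\[
r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}
         {\sqrt{\sum (X - \bar{X})^{2} \, \sum (Y - \bar{Y})^{2}}}
\]

% Computational formula: raw-score sums, n = number of (X, Y) pairs
\[
r = \frac{\sum XY - \dfrac{\sum X \sum Y}{n}}
         {\sqrt{\left(\sum X^{2} - \dfrac{(\sum X)^{2}}{n}\right)
                \left(\sum Y^{2} - \dfrac{(\sum Y)^{2}}{n}\right)}}
\]
```

The two forms are algebraically equivalent; the computational form is usually easier to work by hand because it avoids computing each deviation.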
An Example: Correlation • What is the relationship between level of education and lifetime earnings?
An Example: Correlation • Researchers who measure reaction time for human participants often observe a relationship between the reaction time scores and the number of errors that the participants commit. This relationship is known as the speed-accuracy tradeoff. The following data are from a reaction time study where the researcher recorded the average reaction time (milliseconds) and the total number of errors for each individual in a sample of 8 participants. Calculate the correlation coefficient.
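The data table for this exercise is not reproduced in the transcript, so the sketch below uses invented scores for 8 participants purely to show the mechanics of the calculation. The numbers are hypothetical and are not the values from the study, and the function names are assumptions introduced for illustration. It computes r with both the deviation-based and the raw-score versions of the standard formula.

```python
import math

# HYPOTHETICAL data for 8 participants (not the values from the slide).
reaction_time_ms = [200, 210, 220, 230, 240, 250, 260, 270]
errors = [12, 11, 9, 10, 7, 6, 6, 3]


def r_definitional(xs, ys):
    """Pearson's r from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)


def r_computational(xs, ys):
    """Pearson's r from raw-score sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sp = sum_xy - sum_x * sum_y / n
    ss_x = sum_x2 - sum_x ** 2 / n
    ss_y = sum_y2 - sum_y ** 2 / n
    return sp / math.sqrt(ss_x * ss_y)


r_def = r_definitional(reaction_time_ms, errors)
r_comp = r_computational(reaction_time_ms, errors)
print(round(r_def, 3), round(r_comp, 3))
```

With these made-up scores the coefficient is strongly negative: slower responses go with fewer errors, which is the pattern a speed-accuracy tradeoff would produce.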
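Interpreting Pearson’s r • Values can be influenced by the range of scores: a correlation computed from a restricted range of X or Y can differ substantially from the correlation over the full range.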
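A small made-up dataset can show the effect of the range of scores. In the sketch below, the helper `pearson_r` and all values are assumptions for illustration; the same X-Y pairs give a much weaker correlation when only the middle of the X range is kept.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from deviations about the means (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)


# Hypothetical paired scores covering the full range of X.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2, 1, 4, 3, 7, 5, 6, 8, 9, 10]

# Keep only the pairs whose X value falls in the middle of the range.
restricted = [(x, y) for x, y in zip(xs, ys) if 4 <= x <= 7]
xs_mid = [x for x, _ in restricted]
ys_mid = [y for _, y in restricted]

r_full = pearson_r(xs, ys)
r_restricted = pearson_r(xs_mid, ys_mid)
print(round(r_full, 3), round(r_restricted, 3))
```

The practical point is that a correlation should not be generalized beyond the range of scores in the sample.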
Interpreting Pearson’s r • Values can be influenced by outliers: a single extreme pair of scores can substantially change r.
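The influence of an outlier can be seen by computing r with and without one extreme pair. The data and the helper `pearson_r` below are invented for illustration only.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from deviations about the means (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)


# Hypothetical scores with almost no linear relationship.
xs = [1, 2, 3, 4, 5]
ys = [3, 1, 4, 2, 3]
r_without = pearson_r(xs, ys)

# Add a single extreme pair far from the rest of the data.
r_with = pearson_r(xs + [15], ys + [15])

print(round(r_without, 3), round(r_with, 3))
```

Here one added point turns a near-zero correlation into a very strong one, which is why a scatter plot should be inspected before r is interpreted.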
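Interpreting Pearson’s r • Correlation does not equal causation. • A correlation can tell you the strength and direction of a relationship between two variables, but not why the variables are related. • The third variable problem: an unmeasured third variable may influence both X and Y and produce the observed relationship. • The directionality problem: a correlation alone cannot show whether X influences Y or Y influences X.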