Chris Morgan, MATH G160 csmorgan@purdue.edu April 20, 2012 Lecture 32. Chapter 2.4: Scatter Plots and Trend Lines.
Key Points of a Scatter Plot • Shows the relationship between two quantitative variables • The explanatory (independent) variable ALWAYS goes on the x-axis (horizontal) • The response (dependent) variable ALWAYS goes on the y-axis (vertical)
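Correlation • Correlation (r) measures the direction and strength of the linear relationship between two quantitative variables. • r always falls between -1 and 1. • r near 0 indicates no linear relationship. • r near 1 indicates a strong positive relationship. • r near -1 indicates a strong negative relationship. • Formula for r (one standard form): $r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i-\bar{x}}{s_x}\right)\left(\frac{y_i-\bar{y}}{s_y}\right)$, where $s_x$ and $s_y$ are the sample standard deviations of x and y.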
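To make the calculation concrete, here is a minimal Python sketch that computes r as the average product of standardized x and y values (dividing by n - 1), a common textbook definition. The function name `pearson_r` and the GPA and income numbers are made up for illustration; they are not the data from the lecture.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation r from standardized values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # Sum the products of standardized values, then divide by n - 1.
    total = sum(((x - mean_x) / s_x) * ((y - mean_y) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)

# Hypothetical data, made up for illustration only.
gpa = [2.0, 2.5, 3.0, 3.5, 4.0]
income = [30, 38, 41, 52, 55]  # median income in thousands of dollars

r = pearson_r(gpa, income)
print(round(r, 3))  # close to 1: a strong positive relationship
```

Swapping in data that falls exactly on a line should give r of 1 or -1, which is a quick way to sanity-check the function.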
Determining Positive and Negative Correlation from the Scatter Plot • POSITIVE: an increase in one variable tends to go with an increase in the other variable. • Scatterplot slopes upward from left to right. • NEGATIVE: an increase in one variable tends to go with a decrease in the other variable. • Scatterplot slopes downward from left to right.
Correlation • NOTE: Correlation can change drastically when just a few points are added. • An applet that demonstrates this can be found at www.whfreeman.com/scc7e • Click on “Statistical Applets” and then on “Correlation and Regression Applet”
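The effect can also be seen without the applet. The sketch below, in Python, uses a small made-up data set with almost no linear pattern and then adds a single far-away point. The data and the helper name `pearson_r` are hypothetical, chosen only to illustrate the note above.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation via sums of squares and cross-products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical points with very little linear pattern.
xs = [1, 2, 3, 4, 5]
ys = [3, 1, 4, 2, 3]
r_before = pearson_r(xs, ys)

# Add one point far from the rest of the cloud.
r_after = pearson_r(xs + [15], ys + [12])

print(f"r before: {r_before:.2f}, r after: {r_after:.2f}")
```

One extra point moves r from near 0 to near 1 here, so it is worth looking at the scatterplot before trusting the number.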
Example • Scatterplot in Excel: • College GPA vs Median Income at 30 • ESSENTIALS FOR A GOOD SCATTERPLOT: • A good title that describes the data • Labels on each axis (both the x-axis and the y-axis)
Trendlines • If we have a trendline Y = mx + b, then: • m is the slope • b is the y-intercept • x is the value of the explanatory (X) variable • Y is the predicted value of the response
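As a rough illustration of where m and b come from, the Python sketch below fits an ordinary least-squares line, which is the kind of line a linear trendline in a spreadsheet fits. The function name `fit_trendline` and the data are hypothetical and are not taken from the lecture's Excel example.

```python
def fit_trendline(xs, ys):
    """Return (m, b) for the least-squares line Y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx            # slope
    b = mean_y - m * mean_x  # y-intercept: the line passes through the means
    return m, b

# Hypothetical data, made up for illustration only.
gpa = [2.0, 2.5, 3.0, 3.5, 4.0]
income = [30, 38, 41, 52, 55]  # median income in thousands of dollars

m, b = fit_trendline(gpa, income)
predicted = m * 3.0 + b  # Y, the predicted response at x = 3.0
print(m, b, predicted)
```

With these made-up numbers, each additional GPA point goes with roughly 12.8 thousand dollars more in predicted median income.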
R-Squared in Regression • R² is the percent of variability in Y (response) that is explained by X (predictor). • The correlation r is the square root of R², taken with the same sign as the slope of the trendline.
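A small Python sketch can show both statements at once. It assumes a least-squares trendline and computes R² as 1 minus the ratio of unexplained variation to total variation, then recovers r by attaching the sign of the slope. The function name and data are made up for illustration.

```python
from math import copysign, sqrt

def slope_and_r_squared(xs, ys):
    """Fit Y = m*x + b by least squares; return (m, R^2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    # Variation left over after using the line to predict Y.
    sse = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    # Total variation of Y around its mean.
    sst = sum((y - mean_y) ** 2 for y in ys)
    return m, 1 - sse / sst

# Hypothetical data, made up for illustration only.
gpa = [2.0, 2.5, 3.0, 3.5, 4.0]
income = [30, 38, 41, 52, 55]

m, r_squared = slope_and_r_squared(gpa, income)
r = copysign(sqrt(r_squared), m)  # r takes the sign of the slope
print(f"R^2 = {r_squared:.3f}, r = {r:.3f}")
```

For a downward-sloping trendline the same code returns a negative r, which a plain square root would miss.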
What is the slope? • What is the Y-intercept? • What is the value of R²? Interpret it. • What is the value of r? Interpret it. • What is the expected increase in median income for an increase of 0.5 in GPA?
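For the last question, a short worked step may help. It assumes only the trendline form Y = mx + b; the numeric slope still has to be read from the fitted trendline in the example.

```latex
\begin{aligned}
Y_{\text{new}} - Y_{\text{old}} &= \bigl(m(x + 0.5) + b\bigr) - (mx + b) \\
&= 0.5\,m
\end{aligned}
```

So the expected increase in median income is half the slope, whatever the starting GPA.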
Interpolation versus Extrapolation • Interpolation predicts within the range of the observed x-values; extrapolation predicts outside that range and is generally unreliable. • Suppose you’d like to predict the median income for a person with a GPA of 3.1. Is this prediction valid? Why or why not? If it is appropriate, find the predicted value. • Suppose you’d like to predict the median income for a person with a GPA of 0.8. Is this prediction valid? Why or why not? If it is appropriate, find the predicted value.
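One way to build the habit of checking the range before predicting is sketched below in Python. The observed GPA values, the slope and intercept, and the function name `predict_if_in_range` are all hypothetical; the real range has to be read from the lecture's scatterplot.

```python
def predict_if_in_range(x_new, observed_xs, m, b):
    """Return m*x_new + b only when x_new lies inside the observed x-range.

    Returns None for values outside the range, where the trendline
    would be extrapolating.
    """
    if min(observed_xs) <= x_new <= max(observed_xs):
        return m * x_new + b
    return None

# Hypothetical observed GPAs and trendline, for illustration only.
observed_gpa = [2.0, 2.5, 3.0, 3.5, 4.0]
m, b = 12.8, 4.8

inside = predict_if_in_range(3.1, observed_gpa, m, b)   # interpolation
outside = predict_if_in_range(0.8, observed_gpa, m, b)  # extrapolation
print(inside, outside)
```

Under these assumed numbers a GPA of 3.1 gets a prediction, while 0.8 falls below every observed GPA and gets none.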
Statistics and Causation • A strong relationship between two variables does not always mean that changes in one variable cause changes in the other. • Often a lurking variable in the background influences both variables and produces the relationship between them.
"Correlation does not imply causation" Examples of Confounding and Lurking variables: Ice-cream sales and people running through sprinklers Carrying matches and being diagnosed with lung cancer