Chapter 14 – Correlation and Simple Regression Math 22 Introductory Statistics
Numerical Vs. Numerical Variables • Scatterplot - Way to display bivariate data.
Scatterplots A scatterplot is useful for displaying trends in the data and revealing any association that might exist between the two variables of bivariate data. If the two variables are related, we then ask: • what is the form of the relationship? • how strong is the relationship? • can we predict the value of one variable from the other?
Scatterplots • “Positive” Relationships • “Negative” Relationships
Pearson’s Correlation • Pearson correlation coefficient (r) • Measures the strength of the linear relationship between the variables x and y.
Interpretation of r • If r = 1 or r = -1: perfect linear correlation. • If r = 0: no linear correlation. x and y are not linearly related, but they may still be related in another way.
Interpretation of r • The closer r is to 1 or -1, the stronger the linear relationship between x and y.
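The interpretation above can be checked numerically. The following is a minimal sketch, assuming the usual sample formula r = Sxy / sqrt(Sxx · Syy); the function name `pearson_r` and the small data sets are made up for illustration and do not come from the slides.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient r for paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Illustrative (made-up) data sets
x = [1, 2, 3, 4, 5]
r_line = pearson_r(x, [2 * v + 1 for v in x])    # points on an increasing line
r_neg = pearson_r(x, [10 - 3 * v for v in x])    # points on a decreasing line
r_parabola = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])  # y = x^2

print(r_line, r_neg, r_parabola)
```

The parabola case illustrates the r = 0 slide: y is completely determined by x, yet there is no linear relationship for r to detect.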
Positive and Negative Correlation • Positive Correlation (r > 0) As one variable increases, the other tends to increase. As one variable decreases, the other tends to decrease. • Negative Correlation (r < 0) As one variable increases, the other tends to decrease (and vice versa).
Correlation and Causation • A large positive correlation between two variables means that large values of one variable tend to be associated with large values of the other variable. This does not necessarily mean that the large values of the first variable caused the large values of the other variable.
Correlation and Causation • The same could be said of negative correlation. The large values of one variable do not necessarily cause the small values of the other variable (and vice versa).
Confidence Interval for the Population Correlation Coefficient • We have the sample correlation coefficient, r. • Since we have this statistic, we can estimate the population correlation coefficient, ρ. • We can do this by obtaining a confidence interval for ρ based on r.
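The slides do not say which method the course uses for this interval. One common approach is the Fisher z-transformation, sketched below under the assumption that the data are roughly bivariate normal; the function name `fisher_ci` and the values r = 0.8, n = 28 are hypothetical.

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, confidence=0.95):
    """Approximate confidence interval for the population correlation,
    using the Fisher z-transformation (an assumed method, not taken
    from the slides)."""
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("need more than 3 observations")
    z = math.atanh(r)                  # transform r to the z scale
    se = 1 / math.sqrt(n - 3)          # approximate standard error of z
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Build the interval on the z scale, then transform back to the r scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Hypothetical sample: r = 0.8 from n = 28 pairs
low, high = fisher_ci(r=0.8, n=28)
print(low, high)
```

Because the interval is built on the transformed scale, it always stays inside (-1, 1) and is generally not symmetric about r.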
Spearman’s Rank Correlation • Uses rank to determine correlation. • Determines the degree to which a monotonic relationship exists between the two variables. • Unlike Pearson’s correlation coefficient, Spearman’s correlation coefficient can measure certain nonlinear trends.
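A minimal sketch of the idea, assuming Spearman’s coefficient is computed as Pearson’s r applied to the ranks (ties get the average of their ranks). The helper names and the data y = x³ are made up for illustration.

```python
import math

def average_ranks(values):
    """Rank values 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Made-up monotonic but nonlinear data: y = x^3
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

Here the relationship is perfectly monotonic, so the rank correlation is 1, while Pearson’s r falls short of 1 because the points do not lie on a straight line.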
Response and Predictor Variable • Dependent (response) variable (y): the variable we wish to predict or describe based on the values of another variable. • Independent (predictor) variable (x): the variable that is used to predict the response variable.
Regression • Least Squares Criterion • Equation of a Straight Line • Least Squares Regression Line • Graphing the Least Squares Line with the Scatterplot • Residual (Error) • Residual Plot
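The topics listed above can be sketched in a few lines, assuming the standard least squares formulas: slope b1 = Sxy / Sxx and intercept b0 = ȳ − b1·x̄, which minimize the sum of squared residuals. The function name and the data are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least squares line y = b0 + b1 * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("slope is undefined when x is constant")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up data: predictor x, response y
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = least_squares_line(x, y)
fitted = [b0 + b1 * v for v in x]                        # predicted values
residuals = [obs - fit for obs, fit in zip(y, fitted)]   # observed minus predicted

print(b0, b1)
print(residuals)
```

Plotting `residuals` against `x` gives the residual plot; a patternless scatter around zero suggests a straight line is a reasonable model. For a least squares line with an intercept, the residuals always sum to zero (up to rounding).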