Cautions to keep in mind when using regression and correlation analysis: the limitations of correlation, the dangers of extrapolation, lurking variables, and outliers and influential data points.
Cautions: Regression & Correlation • Correlation measures only linear association. • Extrapolation often produces unreliable predictions. • Correlation and least-squares regression are not resistant. • Lurking variables can make a correlation or regression misleading.
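The first caution can be illustrated with a small sketch. The data and the helper `pearson_r` below are made up for illustration and are not part of the original slides. The points lie exactly on the curve y = x², so y is completely determined by x, yet the correlation is zero because the relationship is not linear.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: x values symmetric about zero.
xs = [-3, -2, -1, 0, 1, 2, 3]

# A perfect but curved relationship: y = x^2.
ys_curved = [x ** 2 for x in xs]
r_curved = pearson_r(xs, ys_curved)

# A perfect linear relationship for comparison: y = 2x + 1.
ys_linear = [2 * x + 1 for x in xs]
r_linear = pearson_r(xs, ys_linear)

print("r for y = x^2:   ", r_curved)
print("r for y = 2x + 1:", r_linear)
```

A correlation near zero therefore says only that there is no linear association. It does not say that x and y are unrelated, which is why the scatterplot should always be examined.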
Residual Plots • A residual plot is a scatterplot of the regression residuals (observed y minus predicted y) against the explanatory variable. • Residual plots make patterns in the original scatterplot of data more apparent. • If the regression catches the overall pattern of the data, there should be no evident pattern to the residuals.
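As a minimal sketch of how residuals expose a missed pattern, the code below fits a least-squares line to made-up data that actually follow a curve. The data and the helper `fit_line` are assumptions for illustration only. The residuals are listed rather than plotted so that the example needs no plotting library.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data that follow a curve (y = x^2), not a line.
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [x ** 2 for x in xs]

slope, intercept = fit_line(xs, ys)

# Residual = observed y minus the y predicted by the fitted line.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

for x, res in zip(xs, residuals):
    print(f"x = {x}: residual = {res:+.1f}")
```

For these data the residuals work out to 5, 0, -3, -4, -3, 0, 5: positive at both ends and negative in the middle. Plotted against x, that U shape is the kind of evident pattern that signals the line did not catch the overall pattern of the data.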
Outliers & Influential Data Points • Remember, an outlier is an observation that lies outside the overall pattern of the other observations. • In a least-squares regression, does an outlier have to have a large residual?
Outliers & Influential Data Points • Points that are outliers in the y direction have large regression residuals. • Other outliers need not have large residuals.
Outliers & Influential Data Points • An observation is influential if removing it would markedly change the result of the regression. • Outliers in the x direction of a scatterplot are often influential in least-squares regression.
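The sketch below illustrates both ideas with made-up data, using hypothetical helpers `fit_line` and `residual`. Five points lie exactly on the line y = x, and one extra point sits far out in the x direction. The code compares the fitted slope with and without that point, and compares the size of its residual with the residuals of the other points.

```python
def fit_line(points):
    """Least-squares slope and intercept for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residual(point, slope, intercept):
    """Observed y minus the y predicted by the line."""
    x, y = point
    return y - (intercept + slope * x)

# Hypothetical data: five points on y = x plus one outlier in the x direction.
base = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]
x_outlier = (15, 3)

slope_without, intercept_without = fit_line(base)
slope_with, intercept_with = fit_line(base + [x_outlier])

outlier_residual = residual(x_outlier, slope_with, intercept_with)
largest_other = max(abs(residual(p, slope_with, intercept_with)) for p in base)

print("slope without the outlier:", slope_without)
print("slope with the outlier:   ", slope_with)
print("outlier residual:         ", outlier_residual)
print("largest other residual:   ", largest_other)
```

In this example the single x-direction outlier pulls the line toward itself, so the slope changes markedly while the outlier's own residual stays smaller than the largest residual among the other points. A point can be influential without standing out in a residual plot.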
Lurking Variable • A lurking variable is a variable that is not among the explanatory and response variables, yet may influence the interpretation of the relationships among those variables. • Association does not imply causation! • A lurking variable may have a cause-and-effect relationship with both the x and y variables, creating a strong association between x and y even when neither one causes the other.
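A small constructed example can make this concrete. Everything below is an assumption for illustration: the lurking variable `z`, the noise terms, and the helper `pearson_r` are all made up. Both x and y are built as z plus their own noise, and the two noise terms are uncorrelated, so x and y have no direct link to each other.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical lurking variable that drives both x and y.
z = [1, 2, 3, 4, 5, 6, 7, 8]

# Hypothetical noise terms, chosen so that they are uncorrelated.
noise_x = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
noise_y = [0.5, 0.5, -0.5, -0.5, 0.5, 0.5, -0.5, -0.5]

# x and y each depend on z, never on each other.
x = [zi + e for zi, e in zip(z, noise_x)]
y = [zi + e for zi, e in zip(z, noise_y)]

r_xy = pearson_r(x, y)                # association seen between x and y
r_noise = pearson_r(noise_x, noise_y) # what is left once z is removed

print("r between x and y:", round(r_xy, 3))
print("r after removing z:", round(r_noise, 3))
```

Here x and y are strongly correlated only because both follow z. Once the shared dependence on z is taken out, nothing is left, so changing x would not change y.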