Explore the coefficient of determination, perform residual analysis, and identify influential observations in regression models. Learn how to assess whether a linear model is appropriate and how to identify outliers.
Lesson 4-3: Diagnostics on the Least-Squares Regression Line
Objectives • Compute and interpret the coefficient of determination • Perform residual analysis on a regression model • Identify influential observations
Vocabulary • Coefficient of determination, R² – measures the percentage of total variation in the response variable explained by the least-squares regression line • Deviations – differences between predicted value and actual value • Total Deviation – deviation between observed value, y, and mean of y, y-bar • Explained Deviation – deviation between predicted value, y-hat, and mean of y, y-bar • Unexplained Deviation – deviation between observed value, y, and predicted value, y-hat • Influential Observation – observation that significantly affects the value of the slope
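In symbols, the vocabulary above can be written as follows. This is a sketch using the standard textbook notation, which the slides do not spell out: the index i and the summation over all observations are assumptions about notation, and the R² formula holds for a least-squares line fitted with an intercept.

```latex
\[
\underbrace{y_i - \bar{y}}_{\text{total deviation}}
\qquad
\underbrace{\hat{y}_i - \bar{y}}_{\text{explained deviation}}
\qquad
\underbrace{y_i - \hat{y}_i}_{\text{unexplained deviation}}
\]
\[
R^2 = \frac{\sum_i (\hat{y}_i - \bar{y})^2}{\sum_i (y_i - \bar{y})^2}
\]
```

Multiplying R² by 100 gives the percentage of total variation explained by the line.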
Is the Linear Model Appropriate? • Patterned residuals: if a plot of the residuals against the explanatory variable shows a discernible pattern, such as a curve, then the response and explanatory variables may not be linearly related • Constant variance of the residuals: if a plot of the residuals against the explanatory variable shows the spread of the residuals increasing or decreasing as the explanatory variable increases, then a requirement of the linear model, called constant error variance, is violated • Influential observations: an observation is typically influential when its x-value is an outlier relative to the other values of the explanatory variable • Outliers and influential observations: remove only if there is justification to do so
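One informal way to see whether an observation is influential is to refit the line without it and compare slopes. The sketch below does this for every point in turn. The helper names `fit_line` and `slope_without` and the data are hypothetical, and this leave-one-out comparison is only an illustration, not a procedure given in the lesson.

```python
from statistics import mean

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line for the data."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def slope_without(x, y, index):
    """Slope of the least-squares line refit with one observation removed."""
    x_rest = x[:index] + x[index + 1:]
    y_rest = y[:index] + y[index + 1:]
    return fit_line(x_rest, y_rest)[0]

# Hypothetical data: five points on y = x, plus one point whose
# x-value is far from the others and which does not follow the trend.
x = [1, 2, 3, 4, 5, 15]
y = [1, 2, 3, 4, 5, 5]

full_slope, _ = fit_line(x, y)

# How much the slope moves when each observation is dropped in turn.
slope_changes = [abs(slope_without(x, y, i) - full_slope) for i in range(len(x))]
most_influential = max(range(len(x)), key=lambda i: slope_changes[i])

print("slope with all points:", round(full_slope, 4))
print("index of the point that moves the slope most:", most_influential)
```

Here the point with the unusual x-value moves the slope far more than any other point, which matches the description of influential observations above. Whether to remove such a point still requires justification.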
Deviations and Predictions • The relationship is Total Deviation = Explained Deviation + Unexplained Deviation • The larger the explained deviation relative to the total, the better the model is at prediction / explanation • The larger the unexplained deviation relative to the total, the worse the model is at prediction / explanation
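The relationship above can be checked numerically. The following is a minimal sketch with made-up data: it fits the least-squares line, splits each total deviation into its explained and unexplained parts, and computes R² from the sums of squares. The helper name `fit_line` and the data values are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line for the data."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = fit_line(x, y)
y_bar = mean(y)
y_hat = [slope * xi + intercept for xi in x]

total = [yi - y_bar for yi in y]                      # observed minus mean
explained = [yh - y_bar for yh in y_hat]              # predicted minus mean
unexplained = [yi - yh for yi, yh in zip(y, y_hat)]   # observed minus predicted

# Total = Explained + Unexplained holds for every observation.
identity_holds = all(
    abs(t - (e + u)) < 1e-9 for t, e, u in zip(total, explained, unexplained)
)

# For a least-squares line the same split also holds for the sums of squares.
ss_total = sum(t ** 2 for t in total)
ss_explained = sum(e ** 2 for e in explained)
ss_unexplained = sum(u ** 2 for u in unexplained)
r_squared = ss_explained / ss_total

print("identity holds:", identity_holds)
print("R^2:", round(r_squared, 4))
```

For this data the line explains 60% of the total variation in y and leaves 40% unexplained.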
Identifying Outliers • From a scatter diagram • From a residual plot • From a boxplot
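For the boxplot approach, a modified boxplot conventionally marks as outliers any values more than 1.5 times the interquartile range beyond the quartiles. The sketch below applies that rule. The function name `boxplot_outliers` and the data are hypothetical, and the quartile convention (medians of the lower and upper halves of the sorted data) is an assumption, since other conventions give slightly different fences.

```python
from statistics import median

def boxplot_outliers(values):
    """Return the values lying outside the 1.5 * IQR fences."""
    data = sorted(values)
    if len(data) < 4:
        return []  # too few points for meaningful quartiles
    half = len(data) // 2
    q1 = median(data[:half])    # median of the lower half
    q3 = median(data[-half:])   # median of the upper half
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in data if v < lower_fence or v > upper_fence]

# Hypothetical values of an explanatory variable.
x = [1, 2, 3, 4, 5, 15]
outliers = boxplot_outliers(x)
print("outliers:", outliers)
```

With these values the quartiles are 2 and 5, so the fences sit at -2.5 and 9.5 and only 15 is flagged.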
TI-83 Instructions for Residuals Plot • With diagnostics turned on, the explanatory variable in L1, and the response variable in L2 • Press STAT, highlight CALC, select 4:LinReg(ax+b), and press ENTER twice • Press 2nd Y= (STAT PLOT) • Select Plot 1 • Choose the scatter diagram icon • Set Xlist to L1 • Set Ylist to RESID: place the cursor on Ylist, press 2nd STAT (LIST), and scroll down to choose the list named RESID • Press ZOOM and select 9:ZoomStat
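Away from the calculator, the residuals being plotted are simply observed minus predicted values. As a rough sketch, the Python below fits a line to deliberately curved, made-up data and lists each x-value with its residual. The helper name `fit_line` and the data are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line for the data."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical data that follows a curve (y = x squared), not a line.
x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]

slope, intercept = fit_line(x, y)

# Residual = observed y minus predicted y.
residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]

# The pairs below are what a residual plot would display.
for xi, ri in zip(x, residuals):
    print(xi, round(ri, 4))
```

The residuals are positive at both ends and negative in the middle, the kind of curved pattern that suggests a linear model is not appropriate.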
TI-83 Instructions for Boxplots • With explanatory variable in L1 • Press 2nd Y= (STAT PLOT) • Select Plot 1 • Choose modified boxplot icon • XList is L1 • Press Zoom and select 9: ZoomStat
Summary and Homework • Summary: • Diagnostics are very important in assessing the quality of a least-squares regression model • The coefficient of determination measures the percent of total variation explained by the model • The plot of residuals can detect nonlinear patterns, error variances that are not constant, and outliers • We must be careful when there are influential observations because they have an unusually large effect on the computation of our model parameters • Homework: pg 235 – 239; 2, 5, 9, 11-15, 29
Homework Answers • 12 -- no pattern • 14 -- patterned residuals (parabolic)