
Chapter 8 Homework



  1. Chapter 8 Homework # 4, 6, 8, 17, 28, 32, and 36

  2. #4 • The curved pattern in the residuals plot indicates that the linear model is not appropriate. The relationship is not linear. • The fanned pattern indicates heteroscedastic data. The model’s predicting power increases as the values of the explanatory variable increase. • The scattered residual plot indicates an appropriate linear model.

  3. #6 • The 4 x-values are plugged into y-hat = 1975 – 0.45x, the 4 predicted values are y-hat = 1885, 1795, 1705, and 1615, respectively. • The 4 residuals are 65, -145, 95, and -15. • The squared residuals are 4225, 21025, 9025, and 225, respectively. • The sum of the squared residuals is 34,500. • Least squares means that no other line has a sum lower than 34,500. In other words, it’s the best fit.
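The arithmetic on this slide can be checked with a short sketch. The x- and y-values below are not listed on the slide; they are assumptions back-calculated from the stated predictions and residuals (x from y-hat = 1975 – 0.45x, y from predicted + residual).

```python
# Fitted line from the slide: y-hat = 1975 - 0.45x
def predict(x):
    return 1975 - 0.45 * x

# Assumed data, back-calculated from the slide's predictions and residuals
# (the slide itself does not list the raw values).
xs = [200, 400, 600, 800]
ys = [1950, 1650, 1800, 1600]

predicted = [predict(x) for x in xs]
# Residual = actual - predicted
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]
squared = [e ** 2 for e in residuals]
sse = sum(squared)  # sum of squared residuals

print([round(p) for p in predicted])
print([round(e) for e in residuals])
print([round(s) for s in squared])
print(round(sse))
```

Any other line through these four points would give a larger value of `sse`, which is what "least squares" means.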

  4. #8 • The linear model is appropriate. Although the relationship is not strong, it is reasonably straight, and the residuals plot shows no pattern. • 33.3% of the variability in attendance can be explained by variability in the number of wins. • The correlation between attendance and number of wins is r, the square root of R²: r = √0.333 ≈ 0.577. Since the relationship is positive, the sign of r is positive.

  5. #8 (cont'd) • A team that is two standard deviations above the mean in number of wins would be expected to have attendance that is 1.154 (or 2 x 0.577) standard deviations above the mean attendance.

  6. #8 (cont'd) • A team that is one standard deviation below the mean in attendance would be expected to have a number of wins that is 0.577 standard deviations (in other words, r standard deviations) below the mean number of wins. The correlation between two variables is the same, regardless of the direction in which predictions are made. Be careful, though, since the same is NOT true for predictions made using the slope of the regression equation. Slopes are valid only for predictions in the direction for which they were intended.
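The three #8 slides rest on two facts: r is the signed square root of R², and in standardized units the regression prediction is z-hat = r × z. A minimal sketch, where the function name `predicted_z` is only illustrative:

```python
import math

r_squared = 0.333          # from the slide: 33.3% of variability explained
r = math.sqrt(r_squared)   # positive root, because the association is positive

def predicted_z(z_predictor, r=r):
    """Predicted z-score of the response: z-hat = r * z_predictor."""
    return r * z_predictor

print(round(r, 3))                # correlation
print(round(predicted_z(2), 3))   # wins 2 SD above the mean -> attendance
print(round(predicted_z(-1), 3))  # attendance 1 SD below the mean -> wins
```

The same `r` is used in both directions because the correlation is symmetric. A slope in original units would not be.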

  7. #17 • There is a moderate, positive, linear association between SAT Math and SAT Verbal scores. • One student got a 500 Verbal and 800 Math. That set of scores doesn’t seem to fit the pattern. • r = 0.685 indicates a moderate, positive association between SAT Math and SAT Verbal. This is a meaningful summary only because the scatterplot shows a linear relationship. • Using the “Summary Statistics” formulas: • Math-hat = 217.692 + 0.662 (Verbal)

  8. #17 (cont'd) • For each additional point in verbal score, the model predicts an increase of 0.662 points in math score. • Since SAT scores are in increments of 10, you may scale it to…for each additional 10 points in verbal score, the model predicts an increase of 6.62 points in math score. • For a verbal score of 500, the predicted math score is 548.692. • For a verbal score of 800, the predicted math score is 747.292. If her actual math score was 800, her residual is 800 – 747.292 = 52.708 points.
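The predictions and the residual for #17 follow directly from the fitted equation. A small sketch, assuming the two predictions correspond to verbal scores of 500 and 800 (the values that reproduce the slide's predicted scores); the helper names are illustrative only:

```python
def predict_math(verbal):
    # Math-hat = 217.692 + 0.662 (Verbal), from the slide
    return 217.692 + 0.662 * verbal

def residual(actual, predicted):
    # Residual = actual - predicted
    return actual - predicted

print(round(predict_math(500), 3))                 # predicted math at 500 verbal
print(round(predict_math(800), 3))                 # predicted math at 800 verbal
print(round(residual(800, predict_math(800)), 3))  # actual math 800 at 800 verbal
print(round(10 * 0.662, 2))                        # change per 10 verbal points
```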

  9. #28 • The association between cost of living in 2000 and 2001 is strong, positive, and linear. • R² = (0.957)² = 0.9158. This means that 91.6% of the variability in cost of living in 2001 can be explained by the variability in cost of living in 2000. • Moscow had a cost of living of 136.1% of New York’s in 2000. The model predicts Moscow should be 119.3%. So the residual is 136.1% (actual) – 119.3% (predicted) = +16.8%.

  10. #28 (cont'd) • The residual means that the actual cost of living in Moscow was more than what the model predicts. The model underestimated.

  11. #32 • A scatterplot of the live birth rates over time shows a negative, strong, curved relationship. • Although it is slightly curved, it is straight enough to try a linear model. The linear regression model is: • Birthrate-hat = 246.051 – 0.1159 (Year) • The residuals plot shows a slight curve. The linear model may not be appropriate. We will proceed with caution.

  12. #32 (cont'd) • The model predicts that each passing year is associated with a decline in birth rate of 0.116 births per 1000 women. • The model predicts 16.73 births per 1000 women in 1978. • The residual is –1.73 births per 1000 women. This means the model predicted 1.73 births per 1000 women more than the actual rate.

  13. #32 (cont'd) • The model predicts the birth rate in 2005 to be 13.60 births per 1000 women. This seems low…it is an extrapolation outside the range of the data, and the model explains only 61% of the variation. Don’t put much faith in this estimate. • The model predicts 11.86 births per 1000 women in 2020. This is an extreme extrapolation, which is dangerous. No faith should be placed in this prediction.
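A quick sketch of the #32 predictions, using the coefficients exactly as rounded on the slide. With these rounded values the results come out near 16.80, 13.67, and 11.93, a few hundredths above the slide's figures. Presumably (this is an assumption, not something the slides state) the slide's numbers came from unrounded calculator output. With an x-variable near 2000, a tiny change in the slope shifts every prediction noticeably.

```python
def predict_birthrate(year):
    # Birthrate-hat = 246.051 - 0.1159 (Year), coefficients as rounded on the slide
    return 246.051 - 0.1159 * year

# 1978 is within the problem's data; 2005 and 2020 are the extrapolations
# discussed on the slide.
for year in (1978, 2005, 2020):
    print(year, round(predict_birthrate(year), 2))
```

The gap between these outputs and the slide's values is itself a reminder to carry full precision in the coefficients when the predictor values are large.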

  14. #36 • Weight is the proper dependent (y) variable. The researchers will use length to predict weight. • r = 0.914 • Weight-hat = -393 + 5.9 (Length) • For each additional inch in length, the model predicts an increase of 5.9 pounds in weight. • The estimates should be fairly accurate. The model explains 83.6% of the variability in weight. However, care should be taken. With no scatterplot and no residuals plot, we cannot verify the condition of linearity. There may be a curved association, in which case the linear model is not appropriate.
