Learn how to construct prediction intervals to estimate values in regression analysis with formulas and examples. Understand explained and unexplained variation in correlation, with coefficients of determination explained.
Lecture Slides Elementary Statistics, Twelfth Edition, and the Triola Statistics Series by Mario F. Triola
Chapter 10 Correlation and Regression
10-1 Review and Preview
10-2 Correlation
10-3 Regression
10-4 Prediction Intervals and Variation
10-5 Multiple Regression
10-6 Nonlinear Regression
Key Concept In this section we present a method for constructing a prediction interval, which is an interval estimate of a predicted value of y. (Interval estimates of parameters are confidence intervals, but interval estimates of variables are called prediction intervals.)
Requirements For each fixed value of x, the corresponding sample values of y are normally distributed about the regression line, with the same variance.
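Formulas For a fixed and known $x_0$, the prediction interval for an individual y value is: $\hat{y} - E < y < \hat{y} + E$, with margin of error: $E = t_{\alpha/2}\, s_e \sqrt{1 + \frac{1}{n} + \frac{n(x_0 - \bar{x})^2}{n(\sum x^2) - (\sum x)^2}}$, where $t_{\alpha/2}$ has $n - 2$ degrees of freedom and $s_e$ is the standard error of estimate.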
Formulas The standard error of estimate is: $s_e = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}}$ (It is suggested to use technology to get prediction intervals.)
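The two formulas can be sketched in plain Python as follows. This is only an illustration, not the textbook's technology output: the function name `prediction_interval` and the five data pairs are hypothetical, and the critical value `t_crit` is assumed to be looked up separately (from a t table or software) for n - 2 degrees of freedom.

```python
import math

def prediction_interval(xs, ys, x0, t_crit):
    """Return (y_hat, E, lower, upper) for an individual y at x = x0.

    t_crit is the critical value t_{alpha/2} with n - 2 degrees of
    freedom; it is supplied by the caller, not computed here.
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # Least-squares slope b1 and intercept b0 of y_hat = b0 + b1 * x
    denom = n * sum_x2 - sum_x ** 2
    b1 = (n * sum_xy - sum_x * sum_y) / denom
    b0 = (sum_y - b1 * sum_x) / n

    # Standard error of estimate: sqrt(sum((y - y_hat)^2) / (n - 2))
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(sse / (n - 2))

    # Margin of error E for an individual y value at x0
    x_bar = sum_x / n
    E = t_crit * s_e * math.sqrt(1 + 1 / n + n * (x0 - x_bar) ** 2 / denom)

    y_hat = b0 + b1 * x0
    return y_hat, E, y_hat - E, y_hat + E

# Hypothetical data (not the shoe print data from Appendix B)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

# 95% interval: t_{0.025} with n - 2 = 3 degrees of freedom is 3.182
y_hat, E, lower, upper = prediction_interval(xs, ys, x0=3.0, t_crit=3.182)
print(f"y_hat = {y_hat:.3f}, E = {E:.3f}, interval = ({lower:.3f}, {upper:.3f})")
```

Passing `t_crit` in keeps the sketch free of third-party dependencies; with a statistics library available, the critical value could be computed instead of looked up.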
Example If we use the 40 pairs of shoe lengths and heights, construct a 95% prediction interval for the height, given that the shoe print length is 29.0 cm. Recall (found using technology, data in Appendix B):
Example - Continued The 95% prediction interval is 162 cm < y < 186 cm. This is a wide range of values, so a single shoe print does not give us very good information about someone's height.
Explained and Unexplained Variation
Assume the following:
• There is sufficient evidence of a linear correlation.
• The equation of the line is ŷ = 3 + 2x.
• The mean of the y-values is 9.
• One of the pairs of sample data is x = 5 and y = 19.
• The point (5,13) is on the regression line.
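Explained and Unexplained Variation The figure shows (5,13) lies on the regression line, but (5,19) does not. With ȳ = 9, ŷ = 13, and y = 19 at x = 5, we arrive at:
Total deviation (from the mean) = y − ȳ = 19 − 9 = 10
Explained deviation = ŷ − ȳ = 13 − 9 = 4
Unexplained deviation = y − ŷ = 19 − 13 = 6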
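Relationships
(total deviation) = (explained deviation) + (unexplained deviation)
$(y - \bar{y}) = (\hat{y} - \bar{y}) + (y - \hat{y})$
(total variation) = (explained variation) + (unexplained variation)
$\sum (y - \bar{y})^2 = \sum (\hat{y} - \bar{y})^2 + \sum (y - \hat{y})^2$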
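A short Python sketch can check both relationships numerically. The single-point check uses the values assumed on the earlier slide (ȳ = 9, ŷ = 13, y = 19); the sums of squares use a small hypothetical data set, and the function name `variation_components` is likewise only illustrative. The variation identity holds when the line is the least-squares regression line, which is what the sketch fits.

```python
def variation_components(xs, ys):
    """Return (total, explained, unexplained) variation for the
    least-squares regression line fitted to the paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Least-squares slope and intercept
    b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    b0 = y_bar - b1 * x_bar
    y_hats = [b0 + b1 * x for x in xs]

    total = sum((y - y_bar) ** 2 for y in ys)                       # sum (y - y_bar)^2
    explained = sum((yh - y_bar) ** 2 for yh in y_hats)             # sum (y_hat - y_bar)^2
    unexplained = sum((y - yh) ** 2 for y, yh in zip(ys, y_hats))   # sum (y - y_hat)^2
    return total, explained, unexplained

# Single point from the slide: y = 19, y_hat = 13, y_bar = 9
y, y_hat, y_bar = 19, 13, 9
total_dev = y - y_bar          # 10
explained_dev = y_hat - y_bar  # 4
unexplained_dev = y - y_hat    # 6
print(total_dev, "=", explained_dev, "+", unexplained_dev)

# Hypothetical paired data for the sums of squares
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
total, explained, unexplained = variation_components(xs, ys)
print(f"total = {total:.2f}, explained = {explained:.2f}, unexplained = {unexplained:.2f}")
```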
Definition The coefficient of determination is the amount of the variation in y that is explained by the regression line. It is computed as r² = (explained variation) / (total variation). The value of r² is the proportion of the variation in y that is explained by the linear relationship between x and y.
Example If we use the 40 paired shoe lengths and heights from the data in Appendix B, we find the linear correlation coefficient to be r = 0.813. Then, the coefficient of determination is r² = 0.813² = 0.661. We conclude that 66.1% of the total variation in height can be explained by shoe print length, and the other 33.9% cannot be explained by shoe print length.
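The arithmetic in this example can be reproduced in a few lines of Python. The only input is the value r = 0.813 stated above; the variable names are illustrative.

```python
# Linear correlation coefficient reported for the 40 shoe print / height pairs
r = 0.813

# Coefficient of determination
r_squared = r ** 2

# Share of the variation in height explained / not explained by shoe print length
explained_pct = round(100 * r_squared, 1)
unexplained_pct = round(100 - explained_pct, 1)

print(f"r^2 = {r_squared:.3f}")
print(f"explained: {explained_pct}%, unexplained: {unexplained_pct}%")
```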