Inference for Regression. Chapter 14: Inference about the Linear Model. Inference about the Model.
Inference about the Model
• When a scatterplot shows a linear relationship between a quantitative explanatory variable x and a quantitative response variable y, we can use the least-squares regression line (LSRL) fitted to the data to predict y for a given value of x.
ALWAYS
• PLOT THE GIVEN DATA: Use a scatterplot to look for the form, direction, and strength of the relationship, as well as for outliers or other deviations.
• LOOK AT THE NUMERICAL SUMMARY:
• The correlation (r) describes the direction and strength of the linear relationship.
• The coefficient of determination (r²) describes the percentage of the variation in y that is explained by the linear relationship with x.
• Find the LSRL for predicting: ŷ = a + bx
• Remember, predicting too far outside the range of the given data can give bad predictions. This is called extrapolation.
Conditions for Regression Inference
Before constructing a confidence interval or performing a significance test on the slope, we must check the following conditions:
• Observed ordered pairs (response values) are independent of each other.
• The true relationship is linear. (Is the scatterplot roughly linear?)
• The standard deviation of the response is constant. (Is the scatter about the LSRL consistent?)
• The response varies Normally about the true regression line. (Are the residuals approximately Normally distributed?)
True Regression Line
• The LSRL we calculated is only an estimate of the true relationship between the variables. If we could measure ALL specimens, we would get the true regression line: μy = α + βx
• When the regression model describes our data and we calculate the LSRL, ŷ = a + bx, the slope b is an unbiased estimator of the true slope β, and the intercept a is an unbiased estimator of the true intercept α.
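To illustrate what "unbiased" means here, the following is a minimal simulation sketch. The "true" values ALPHA, BETA, and SIGMA, the x values, the seed, and the helper name `fit_line` are all hypothetical choices made for illustration and do not come from the slides. The idea is that, over many samples drawn from the same true line, the fitted slopes and intercepts should average out close to β and α.

```python
import random

def fit_line(x, y):
    """Return the least-squares intercept a and slope b for y-hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b

# Hypothetical "true" model, chosen only for illustration.
ALPHA, BETA, SIGMA = 2.0, 0.5, 1.0
x = list(range(1, 11))

rng = random.Random(14)  # fixed seed so the run is repeatable
intercepts, slopes = [], []
for _ in range(2000):
    # Responses vary Normally about the true line with constant sigma.
    y = [ALPHA + BETA * xi + rng.gauss(0, SIGMA) for xi in x]
    a, b = fit_line(x, y)
    intercepts.append(a)
    slopes.append(b)

mean_a = sum(intercepts) / len(intercepts)
mean_b = sum(slopes) / len(slopes)
print(f"mean a = {mean_a:.3f} (alpha = {ALPHA})")
print(f"mean b = {mean_b:.3f} (beta = {BETA})")
```

Any single sample gives a line that misses the true one, but the averages of a and b should land near α and β.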
Standard Error about the LSRL
• s = √( Σ(y – ŷ)² / (n – 2) ), where each y – ŷ is a residual
• Use s to estimate the unknown σ in the regression model.
• Degrees of Freedom → (n – 2)
Archaeopteryx
a) r = .994, so we have a strong linear relationship between the femur and humerus lengths. The LSRL is ŷ = 1.1969x – 3.660.
b) β, which is estimated by b = 1.1969, says that as the femur length increases by 1 inch, the humerus length increases by 1.1969 inches on average. α, the intercept of the true line, is estimated by a = –3.660 inches.
c) Residual = actual – predicted. s = √( Σ residual² / (n – 2) ) = 1.982. We now have estimates of the 3 parameters of our model: a for α, b for β, and s for σ.
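The estimates for this example can be checked with a short Python sketch. The slides do not list the raw measurements, so the five femur and humerus lengths below are an assumption: they are the values commonly cited for this exercise, and they appear to reproduce the r, slope, intercept, and s quoted above.

```python
import math

# Assumed data: NOT listed in the slides (see the note above).
femur = [38, 56, 59, 64, 74]      # explanatory variable x
humerus = [41, 63, 70, 72, 84]    # response variable y

n = len(femur)
x_bar = sum(femur) / n
y_bar = sum(humerus) / n
sxx = sum((x - x_bar) ** 2 for x in femur)
syy = sum((y - y_bar) ** 2 for y in humerus)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(femur, humerus))

b = sxy / sxx                      # slope, estimates beta
a = y_bar - b * x_bar              # intercept, estimates alpha
r = sxy / math.sqrt(sxx * syy)     # correlation

# Residual = actual - predicted; s uses n - 2 degrees of freedom.
residuals = [y - (a + b * x) for x, y in zip(femur, humerus)]
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))  # estimates sigma

print(f"y-hat = {b:.4f}x + ({a:.3f})")
print(f"r = {r:.3f}, r^2 = {r * r:.3f}, s = {s:.3f}")
```

Dividing by n – 2 rather than n reflects the two parameters (α and β) already estimated from the same data.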
14.6 p. 794 (cont. 14.1)
a) The equation of the LSRL of humerus length on femur length is ŷ = 1.1969x – 3.660.
b) From the output, the t test statistic is t = b/SEb, so t = 1.1969/0.0751 = 15.937.
c) df = 5 – 2 = 3. The value t = 15.937 is larger than every critical value in the t table for df = 3, so the P-value is very small. Therefore there is sufficient evidence of a positive, straight-line relationship between femur length and humerus length.
d) As the femur length increases by 1 inch, the humerus length increases by 1.1969 inches on average.
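As a rough check of parts b) and c), the sketch below recomputes t from the slide values and evaluates the one-sided P-value. It relies on the closed-form CDF of the t distribution that holds only for exactly 3 degrees of freedom. The helper name `t_upper_tail_df3` is hypothetical, and a statistics package would normally be used instead.

```python
import math

def t_upper_tail_df3(t):
    """P(T > t) for a t distribution with exactly 3 degrees of freedom.

    Uses the closed form of the CDF that is specific to df = 3:
    F(t) = 1/2 + (1/pi) * [u / (1 + u^2) + arctan(u)], with u = t / sqrt(3).
    """
    u = t / math.sqrt(3)
    cdf = 0.5 + (u / (1 + u * u) + math.atan(u)) / math.pi
    return 1 - cdf

b = 1.1969      # slope, from the slide
se_b = 0.0751   # standard error of the slope, from the slide

t_stat = b / se_b
p_one_sided = t_upper_tail_df3(t_stat)

print(f"t = {t_stat:.3f}")
print(f"one-sided P-value = {p_one_sided:.5f}")
```

The P-value should come out below 0.0005, which fits with the statistic lying beyond the right edge of a printed table.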
Confidence Interval for Regression Slope
• A level C confidence interval for β is given by b ± t*SEb
• The standard error of the least-squares slope b is: SEb = s / √( Σ(x – x̄)² )
• t* is determined using (n – 2) degrees of freedom.
• You will rarely have to calculate the SE by hand. We'll learn how to use computer output or our calculator to provide that for us...
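For readers who want to see the standard error computed once by hand, here is a minimal sketch of SEb = s / √( Σ(x – x̄)² ). The femur lengths are an assumption (the slides do not list the raw data), s = 1.982 is the value quoted in the Archaeopteryx example, and the function name `slope_standard_error` is illustrative only.

```python
import math

def slope_standard_error(x, s):
    """Standard error of the least-squares slope: s / sqrt(sum((x - x_bar)^2))."""
    x_bar = sum(x) / len(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return s / math.sqrt(sxx)

femur = [38, 56, 59, 64, 74]   # assumed x values, not given in the slides
s = 1.982                      # standard error about the line, from the slides

se_b = slope_standard_error(femur, s)
print(f"SE_b = {se_b:.4f}")
```

The formula shows that x values spread more widely give a smaller SEb, so the slope is estimated more precisely.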
14.6 e)
• CI → b ± t*SEb
• CI → 1.1969 ± (5.841 × .0751)
• CI → (.7582, 1.636)
• We are 99% confident that the true slope, the mean increase in humerus length for each 1-inch increase in femur length, is between .7582 inches and 1.636 inches.
The Leaning Tower of Pisa
a) The scatterplot shows a strong, fairly linear relationship between year and the lean of the tower.
b) The parameter that gives the rate at which the tilt is increasing is β (the slope). It is estimated by b = 9.3187, so the lean appears to be increasing at a rate of about 9.32 mm per year.
#11c) Confidence Interval
CI → b ± t*SEb
CI → 9.3187 ± (2.201 × .3099)
CI → (8.637, 10.001)
We are 95% confident that the true rate at which the tower's lean is increasing is between 8.637 mm per year and 10.001 mm per year.
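Both slope intervals in these slides follow the same recipe, b ± t*SEb, so the arithmetic can be checked with a small helper. This is only a sketch: the function name `slope_confidence_interval` is hypothetical, and the values of b, t*, and SEb are taken directly from the slides rather than recomputed from data.

```python
def slope_confidence_interval(b, t_star, se_b):
    """Return (lower, upper) for the interval b +/- t_star * se_b."""
    margin = t_star * se_b
    return b - margin, b + margin

# Archaeopteryx: 99% interval, df = 3, t* = 5.841
arch_lo, arch_hi = slope_confidence_interval(1.1969, 5.841, 0.0751)

# Leaning Tower of Pisa: 95% interval, t* = 2.201
pisa_lo, pisa_hi = slope_confidence_interval(9.3187, 2.201, 0.3099)

print(f"Archaeopteryx: ({arch_lo:.4f}, {arch_hi:.4f})")
print(f"Pisa:          ({pisa_lo:.3f}, {pisa_hi:.3f})")
```

With these inputs the lower Pisa endpoint evaluates to about 8.637. The Archaeopteryx interval is much wider relative to its slope, because it uses a higher confidence level and has only 3 degrees of freedom.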