
Warm up


Presentation Transcript


  1. Warm up Use a calculator to find r, r², a, and b

  2. Chapter 8 LSRL – Least Squares Regression Line

  3. AP Statistics Objectives Ch8 • Find the Least Squares Regression Line and interpret its slope, y-intercept, and the coefficients of correlation and determination.

  4. Bivariate data • x-variable: the independent or explanatory variable • y-variable: the dependent or response variable • Use x to predict y

  5. Fat Versus Protein: An Example • The following is a scatterplot of total fat versus protein for 30 items on the Burger King menu:

  6. The Linear Model • The correlation in this example is 0.83. It says “There seems to be a linear association between these two variables,” but it doesn’t tell what that association is. • We can say more about the linear relationship between two quantitative variables with a model. • A model simplifies reality to help us understand underlying patterns and relationships.

  7. The Linear Model (cont.) • The linear model is just an equation of a straight line through the data. • The points in the scatterplot don’t all line up, but a straight line can summarize the general pattern with only a couple of parameters. • The linear model can help us understand how the values are associated.

  8. The linear model ŷ = a + bx • b is the slope: the amount by which y increases when x increases by 1 unit • a is the y-intercept: the height of the line when x = 0; in some situations, the y-intercept has no meaning • ŷ (y-hat) means the predicted y. Be sure to put the hat on the y
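For reference, the LSRL coefficients can be computed from summary statistics. These are the standard textbook formulas (likely what the chart mentioned on slide 32 provides), where r is the correlation, s_x and s_y are the sample standard deviations, and x̄, ȳ are the sample means:

```latex
\hat{y} = a + bx, \qquad b = r\,\frac{s_y}{s_x}, \qquad a = \bar{y} - b\,\bar{x}
```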

  9. Least Squares Regression Line (LSRL) • The line that gives the best fit to the data set • The line that minimizes the sum of the squares of the deviations from the line

  10. Scattergram • 1. Plot of all (Xᵢ, Yᵢ) pairs • 2. Suggests how well the model will fit

  11.–17. Thinking Challenge How would you draw a line through the points? How do you determine which line ‘fits best’? (This prompt repeats on slides 11–17.)

  18. For the points (0, 0), (3, 10), (6, 2) and the line ŷ = 0.5x + 4: • ŷ = 0.5(0) + 4 = 4, residual 0 - 4 = -4 • ŷ = 0.5(3) + 4 = 5.5, residual 10 - 5.5 = 4.5 • ŷ = 0.5(6) + 4 = 7, residual 2 - 7 = -5 • Sum of the squares = 61.25
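The arithmetic on this slide can be reproduced with a short Python sketch (not part of the original slides): it evaluates the line ŷ = 0.5x + 4 at each point, forms the residuals, and sums their squares.

```python
# Points from the slide and the proposed line y-hat = 0.5x + 4
points = [(0, 0), (3, 10), (6, 2)]

sse = 0.0
for x, y in points:
    y_hat = 0.5 * x + 4           # predicted value from the proposed line
    resid = y - y_hat             # observed minus predicted
    sse += resid ** 2
    print(f"x = {x}: y_hat = {y_hat}, residual = {resid}")

print("Sum of the squares =", sse)  # 61.25, matching the slide
```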

  19. Residuals • The model won’t be perfect, regardless of the line we draw. • Some points will be above the line and some will be below. • The estimate made from a model is the predicted value (denoted as ŷ).

  20. Residuals (cont.) • The difference between the observed value and its associated predicted value is called the residual. • To find the residuals, we always subtract the predicted value from the observed one: residual = y - ŷ

  21. Now use a calculator to find the line of best fit for the same points (0, 0), (3, 10), (6, 2) and find the residuals y - ŷ: -3, 6, -3 • What is the sum of the deviations from the line? Will it always be zero? • The line that minimizes the sum of the squares of the deviations from the line is the LSRL. • Sum of the squares = 54
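As a check on those numbers, the sketch below (again not from the slides) fits the LSRL to the same three points using the summary-statistic formulas and recomputes the residuals; their sum is zero and the sum of squares falls to 54, smaller than the 61.25 obtained on slide 18.

```python
# LSRL for (0,0), (3,10), (6,2) via b = Sxy/Sxx, a = y-bar - b*x-bar
points = [(0, 0), (3, 10), (6, 2)]
n = len(points)
x_bar = sum(x for x, _ in points) / n
y_bar = sum(y for _, y in points) / n

sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
sxx = sum((x - x_bar) ** 2 for x, _ in points)
b = sxy / sxx              # slope = 1/3
a = y_bar - b * x_bar      # intercept = 3

residuals = [y - (a + b * x) for x, y in points]
print("residuals:", residuals)                            # [-3.0, 6.0, -3.0]
print("sum of residuals:", sum(residuals))                # 0.0 (always, for the LSRL)
print("sum of squares:", sum(e ** 2 for e in residuals))  # 54.0, matching the slide
```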

  22. Types of Regression Models • Positive linear relationship • Negative linear relationship • Relationship NOT linear • No relationship

  23. Interpretations • Slope: For each unit increase in x, there is an approximate increase/decrease of b in y.

  24. The ages (in months) and heights (in inches) of seven children are given. x: 16 24 42 60 75 102 120 y: 24 30 35 40 48 56 60 • Find the LSRL. • Interpret the slope and correlation coefficient in the context of the problem.
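For readers following along without a graphing calculator, here is a hedged Python sketch of the same fit (statistics.correlation needs Python 3.10+; the printed values are approximate):

```python
from statistics import mean, stdev, correlation  # correlation: Python 3.10+

ages    = [16, 24, 42, 60, 75, 102, 120]   # months
heights = [24, 30, 35, 40, 48, 56, 60]     # inches

r = correlation(ages, heights)         # about 0.994
b = r * stdev(heights) / stdev(ages)   # slope, about 0.34 inches per month
a = mean(heights) - b * mean(ages)     # intercept, about 20.4 inches

print(f"r = {r:.3f},  y-hat = {a:.1f} + {b:.2f}x")
```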

  25. Interpretations • Slope: For each unit increase in x, there is an approximate increase/decrease of b in y.

  26. Correlation coefficient: There is a _________________ association between the ________ • Slope: For an increase in _______________, there is an approximate ______ of _______________________________

  27. Correlation coefficient: There is a strong, positive, linear association between the age and height of children. • Slope: For an increase in age of one month, there is an approximate increase of 0.34 inches in the heights of children.

  28. The ages (in months) and heights (in inches) of seven children are given. x: 16 24 42 60 75 102 120 y: 24 30 35 40 48 56 60 • Predict the height of a child who is 4.5 years old. • Predict the height of someone who is 20 years old.
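Continuing that sketch: the predict helper below and the coefficients 20.4 and 0.342 are the approximate values from the previous fit, not numbers given on the slide.

```python
# Approximate fitted line y-hat = 20.4 + 0.342x, with x in months
def predict(months, a=20.4, b=0.342):
    return a + b * months

print(predict(4.5 * 12))   # 4.5 years = 54 months  -> about 38.9 inches
print(predict(20 * 12))    # 20 years  = 240 months -> about 102.5 inches
```

The second answer is over eight feet tall; as the next slide explains, 240 months lies far outside the observed ages, so this prediction is extrapolation and should not be trusted.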

  29. Extrapolation • The LSRL should not be used to predict y for values of x outside the data set. • It is unknown whether the pattern observed in the scatterplot continues outside this range.

  30. The ages (in months) and heights (in inches) of seven children are given. x: 16 24 42 60 75 102 120 y: 24 30 35 40 48 56 60 • Calculate x̄ and ȳ. • Plot the point (x̄, ȳ) on the LSRL. • Will this point always be on the LSRL?
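The answer to the last question is yes. Substituting x̄ into the fitted line and using the intercept formula shows that (x̄, ȳ) always lies on the LSRL:

```latex
\hat{y}(\bar{x}) = a + b\bar{x} = (\bar{y} - b\bar{x}) + b\bar{x} = \bar{y}
```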

  31. The correlation coefficient and the LSRL are both non-resistant measures; both are strongly influenced by outliers.

  32. Formulas – on chart

  33. The following statistics are found for the variables posted speed limit and the average number of accidents. Find the LSRL & predict the number of accidents for a posted speed limit of 50 mph.

  34.–41. Chapter 8 (slides 34–41 contain only images; no transcript text)

  42. Chapter 8 • R² must also be interpreted when describing a regression model • “With the linear regression model, _____% of the variability in _______ (response variable) is accounted for by variation in ________ (explanatory variable)” • The remaining variation is due to the residuals
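As an illustration (a sketch with approximate numbers, not from the slides), the age/height data from slide 24 gives an R² of roughly 0.99, so the fill-in sentence would read: with the linear regression model, about 98.8% of the variability in height is accounted for by variation in age.

```python
from statistics import correlation   # Python 3.10+

ages    = [16, 24, 42, 60, 75, 102, 120]
heights = [24, 30, 35, 40, 48, 56, 60]

r_sq = correlation(ages, heights) ** 2   # about 0.988
print(f"With the linear regression model, {r_sq:.1%} of the variability "
      f"in height is accounted for by variation in age.")
```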

  43. Examples of Approximate R² Values • R² = 1: Perfect linear relationship between x and y; 100% of the variation in y is explained by variation in x

  44. Examples of Approximate R² Values • 0 < R² < 1: Weaker linear relationship between x and y; some but not all of the variation in y is explained by variation in x

  45. Examples of Approximate R² Values • R² = 0: No linear relationship between x and y; the value of y does not depend on x (none of the variation in y is explained by variation in x)

  46. Did you say 2? Wrong. Try again. So what?

  47. Important Note: The correlation is not given directly in this software package; you need to look in two places for it. Taking the square root of the “R squared” (coefficient of determination) is not enough; you must also look at the sign of the slope. A positive slope gives a positive r-value; a negative slope gives a negative r-value.

  48. So here you should note that the slope is positive. The correlation will be positive too. Since R² is 0.482, r will be +0.694.
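That rule can be written as r = (sign of slope) × √R²; a quick check of the numbers on this slide (the variable names below are just for illustration):

```python
import math

r_squared = 0.482
slope_sign = +1                       # the slope in the software output is positive
r = slope_sign * math.sqrt(r_squared)
print(round(r, 3))                    # 0.694
```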

  49. Coefficient of Determination = (0.694)² = 0.4816

  50. R² = 0.4816: With the linear regression model, 48.2% of the variability in airline fares is accounted for by the variation in the distance of the flight.
