
Linear Regression


jcrain

Presentation Transcript


  1. Linear Regression • Essentials • Line Basics • y = mx + b vs. ŷ = b0 + b1x • Definitions • Scatter Plot & Regression Line • Notation & Formulae • Regression Considerations • Line of Best Fit – Least Squares Line • Example

  2. Essentials: Regression (predictions based upon the known) • Understand what the regression process does: prediction. • Be able to state the steps we use leading up to the decision to conduct regression. • Be able to calculate the slope of a line and the y-intercept. • Be able to calculate a regression equation and apply it to the prediction of other values. Know that these are estimates, not necessarily the actual values that might occur. • Know what the Least Squares Property and the Line of Best Fit are. Residual – what’s that?

  3. A Linear Equation in One Independent Variable: y = mx + b • b is the y-intercept (the point at which the line intersects the y-axis). It is the value of y when x = 0. • y is the dependent variable (also called the response variable). Its value depends on the value of x. • x is the independent variable (also known as the predictor variable). • m is the slope of the line. The slope indicates how much the y-value increases (or decreases if the slope is negative) when the x-value increases by 1 unit. When m is positive, the line will have an upward slope. When m is negative, the line will have a downward slope.

  4. [Figure: empty coordinate grid, x-axis from -4 to 4, y-axis from -5 to 5]

  5. [Figure: coordinate grid, x-axis from -4 to 4, y-axis from -5 to 5, with the points (-1, 4) and (-2, 2) marked; labels: -x, -y]
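As a minimal sketch of the slope and y-intercept calculation, the following assumes a straight line is drawn through the two points marked on this slide, (-1, 4) and (-2, 2). The function name `line_through` is illustrative and does not come from the presentation.

```python
def line_through(p1, p2):
    """Return (m, b) for the line y = m*x + b through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("vertical line: the slope is undefined")
    m = (y2 - y1) / (x2 - x1)  # change in y per one-unit change in x
    b = y1 - m * x1            # value of y when x = 0
    return m, b


# The two points marked on the slide.
m, b = line_through((-1, 4), (-2, 2))
print(f"y = {m}x + {b}")
```

For these two points the slope is 2 and the y-intercept is 6, so the line through them is y = 2x + 6.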

  6. [Figure: coordinate grid, x-axis from -4 to 4, y-axis from -5 to 5, showing the general form y = mx + b and the plotted line y = 2x + 1]

  7. The Regression Equation: ŷ = b0 + b1x (recall, y = mx + b) • Where: b0 = y-intercept, b1 = slope • x is the independent variable (predictor variable) • ŷ is the dependent variable (response variable)

  8. Regression Definitions: ŷ = b0 + b1x • Regression Equation: given a collection of paired data, the regression equation algebraically describes the relationship between the two variables. • Regression Line (line of best fit or least-squares line): the regression line is the graph of the regression equation.

  9. Always Look at a Scatterplot First You should be able to “see” a straight line being passed through the data points.

  10. Regression Line Plotted on Scatterplot

  11. The Regression Line is calculated to minimize the sum of the squared vertical distances between the line and the observed values.

  12. Notation for Regression Equation (Population Parameter / Sample Statistic) • y-intercept of regression equation: β0 / b0 • Slope of regression equation: β1 / b1 • Equation of the regression line: y = β0 + β1x / ŷ = b0 + b1x

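  13. Formulas for b0 and b1 • Slope: $b_1 = \dfrac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}$ • y-intercept: $b_0 = \bar{y} - b_1\bar{x}$ • NOTE: If you do not find b1 first, then b0 may be determined by: $b_0 = \dfrac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n\sum x^2 - (\sum x)^2}$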
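The slope and y-intercept calculation can be sketched in a few lines of Python. This is an illustration that assumes the standard least-squares sums (Σx, Σy, Σxy, Σx²); the function name `least_squares` is hypothetical, and the sample values are the four (x, y) pairs from the residuals slide later in this presentation.

```python
def least_squares(xs, ys):
    """Return (b0, b1) for the least-squares line y-hat = b0 + b1*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    denom = n * sxx - sx ** 2
    if denom == 0:
        raise ValueError("all x-values are equal: the slope is undefined")
    b1 = (n * sxy - sx * sy) / denom  # slope
    b0 = sy / n - b1 * sx / n         # y-intercept: y-bar minus b1 times x-bar
    return b0, b1


# Paired data from the residuals slide: x = 1, 2, 4, 5 and y = 4, 24, 8, 32.
b0, b1 = least_squares([1, 2, 4, 5], [4, 24, 8, 32])
print(f"y-hat = {b0} + {b1}x")
```

For these four pairs the result is b0 = 5 and b1 = 4, which agrees with the equation ŷ = 5 + 4x shown on that slide.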

  14. The Regression Line ŷ = b0 + b1x • Fits the sample points best. • Distances between this line and the sample points are at a minimum.

  15. When Is It Reasonable to Do Regression? Start by asking the following: • Does it make sense to look at the relationship between these two variables? • Does a scatter plot present a relationship (either positive or negative)? • If yes to both, calculate r (the correlation). • Is the correlation statistically significant? Yes: go on to regression. No: the best estimate becomes the mean of the y variable. • Conduct regression analysis (if yes above). • Use the regression equation to calculate (estimate) a y-value given a specific x-value.

  16. Predictions: In predicting a value of y based on some given value of x ... 1. If there is not a significant linear correlation, the best predicted y-value is ȳ (the sample mean of the y-values). 2. If there is a significant linear correlation, the best predicted y-value is found by substituting the x-value into the regression equation.

  17. Predicting the Value of a Variable (flowchart) • Start: Calculate the value of r and test the hypothesis that ρ = 0. • Is there a significant linear correlation? • Yes: Use the regression equation to make predictions. Substitute the given value in the regression equation. • No: Given any value of one variable, the best predicted value of the other variable is its sample mean.
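The decision procedure on this slide can be sketched as follows. The slides do not say which test to use for ρ = 0, so this sketch assumes the usual t statistic, t = r·sqrt(n - 2)/sqrt(1 - r²), compared against a two-tailed critical value that the caller looks up in a t table. The function names and the `t_crit` parameter are illustrative, and the sample data are the four pairs from the residuals slide.

```python
import math


def pearson_r(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt(
        (n * sxx - sx ** 2) * (n * syy - sy ** 2)
    )


def best_prediction(xs, ys, x_new, t_crit):
    """Follow the flowchart: use the regression line only if r is significant.

    t_crit is the two-tailed critical t value for n - 2 degrees of freedom,
    taken from a t table by the caller (an assumption of this sketch).
    """
    n = len(xs)
    r = pearson_r(xs, ys)
    # Test statistic for the hypothesis rho = 0; a perfect fit counts as significant.
    t = math.inf if abs(r) >= 1 else r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    if abs(t) > t_crit:
        sx, sy = sum(xs), sum(ys)
        sxy = sum(x * y for x, y in zip(xs, ys))
        sxx = sum(x * x for x in xs)
        b1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
        b0 = sy / n - b1 * sx / n
        return b0 + b1 * x_new  # "Yes" branch: substitute into the equation
    return sum(ys) / n          # "No" branch: the sample mean of y


xs, ys = [1, 2, 4, 5], [4, 24, 8, 32]
# 4.303 is the two-tailed critical t value for alpha = 0.05 with 2 degrees of freedom.
print(pearson_r(xs, ys), best_prediction(xs, ys, 3, t_crit=4.303))
```

With only four points, r is about 0.55 and the test statistic falls well below the critical value, so the sketch takes the "No" branch and returns the sample mean of y, 17.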

  18. Guidelines for Using The Regression Equation • If there is no significant linear correlation, don’t use the regression equation to make predictions. • When using the regression equation for predictions, stay within the scope of the available sample data. • A regression equation based on old data is not necessarily valid now. • Don’t make predictions about a population that is different from the population from which the sample data was drawn.

  19. Definitions • Marginal Change: the amount a variable changes when the other variable changes by exactly one unit. • Outlier: a point lying far away from the other data points. • Influential Points: points which strongly affect the graph of the regression line.

  20. Residuals and the Least-Squares Property: Definitions • Residual: for a sample of paired (x, y) data, the difference (y - ŷ) between an observed sample y-value and ŷ, the value of y that is predicted by using the regression equation. • Least-Squares Property: a straight line satisfies this property if the sum of the squares of the residuals is the smallest sum possible.

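  21. Residuals and the Least-Squares Property • Data: x = 1, 2, 4, 5; y = 4, 24, 8, 32 • Regression equation: ŷ = 5 + 4x • [Figure: scatterplot of the four points with the regression line, x-axis from 0 to 5, y-axis from 0 to 32, each residual drawn as a vertical distance] • Residuals: -5 (at x = 1), 11 (at x = 2), -13 (at x = 4), 7 (at x = 5)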
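The residuals on this slide can be checked with a short sketch. It uses only the slide's data and equation; the helper name `sum_squared_residuals` and the two comparison lines are illustrative choices, picked to show that moving away from ŷ = 5 + 4x increases the sum of squared residuals.

```python
xs = [1, 2, 4, 5]
ys = [4, 24, 8, 32]


def sum_squared_residuals(b0, b1):
    """Sum of squared residuals (y - y-hat) for the line y-hat = b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Residuals for the slide's regression equation y-hat = 5 + 4x.
residuals = [y - (5 + 4 * x) for x, y in zip(xs, ys)]
print(residuals)                    # [-5, 11, -13, 7]
print(sum_squared_residuals(5, 4))  # 364

# Two arbitrary nearby lines, for comparison only.
print(sum_squared_residuals(6, 4))    # 368
print(sum_squared_residuals(5, 4.5))  # 375.5
```

Both comparison lines give a larger sum than 364, which is what the least-squares property requires of any other straight line.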

  22. Example: Orion Cars • The age and price for a sample of 11 Orions are noted below. Calculate a correlation coefficient and, if appropriate, a regression equation for the relationship. Determine the value of cars that are 4.5 years and 10 years old.
      Car   Age (yrs.)   Price ($100s)
      1     5            85
      2     4            103
      3     6            70
      4     5            82
      5     5            89
      6     5            98
      7     6            66
      8     6            95
      9     2            169
      10    7            70
      11    7            48
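One way to work this example is sketched below, using the 11 (age, price) pairs from the slide. The significance check assumes a t test at α = 0.05 with 9 degrees of freedom (two-tailed critical value 2.262), which the slides do not specify, and the helper `predict_price` with its in-range flag is an illustrative addition.

```python
import math

# Age in years and price in hundreds of dollars, from the slide.
ages = [5, 4, 6, 5, 5, 5, 6, 6, 2, 7, 7]
prices = [85, 103, 70, 82, 89, 98, 66, 95, 169, 70, 48]

n = len(ages)
sx, sy = sum(ages), sum(prices)
sxy = sum(x * y for x, y in zip(ages, prices))
sxx = sum(x * x for x in ages)
syy = sum(y * y for y in prices)

# Correlation coefficient and its t statistic (assumed test for rho = 0).
r = (n * sxy - sx * sy) / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
T_CRIT = 2.262  # two-tailed, alpha = 0.05, 9 degrees of freedom (t table)
significant = abs(t) > T_CRIT

# Regression coefficients.
b1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
b0 = sy / n - b1 * sx / n


def predict_price(age):
    """Return (predicted price in $100s, whether age lies within the sample range)."""
    in_range = min(ages) <= age <= max(ages)
    return b0 + b1 * age, in_range


print(f"r = {r:.3f}, significant: {significant}")
print(f"y-hat = {b0:.2f} + ({b1:.2f})x")
for age in (4.5, 10):
    price, in_range = predict_price(age)
    note = "" if in_range else " (outside the sampled ages: extrapolation)"
    print(f"age {age}: predicted price {price:.2f} hundred dollars{note}")
```

With these data r is about -0.92 and the fitted line is roughly ŷ = 195.47 - 20.26x, so a 4.5-year-old car is estimated at about 104.29 hundred dollars. The sampled ages run from 2 to 7 years, so the 10-year prediction is an extrapolation; it comes out negative, which shows why predictions should stay within the scope of the sample data.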

  23. Example : Orion Cars

  24. Example : Orion Cars

  25. Example : Orion Cars (Price in thousands)

  26. Example : Orion Cars (Price in thousands)

  27. Example : Orion Cars (Price in thousands)

  28. With influential point Without influential point (Price in thousands)
