
Linear Regression


Presentation Transcript


  1. Linear Regression: The Least Squares Regression Model

  2. Regression Line A regression line is a line that describes how a response variable y changes as an explanatory variable x changes. We often use regression to predict the value of y given an x value.

  3. Equation of a Regression Line • A regression line relating x to y has the equation ŷ = a + bx. • ŷ (read "y hat") is the predicted value of the response variable y for a given value of the explanatory variable x. • b is the slope, the amount y is expected to change when x increases by one unit. • a is the y-intercept, the predicted value of y when x = 0.
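As a concrete illustration of the prediction equation, here is a minimal Python sketch; the intercept and slope values are made up for illustration, not taken from the slides.

```python
# Prediction from a regression line: y_hat = a + b * x.
a = 2.0  # y-intercept: the predicted y when x = 0 (hypothetical value)
b = 0.5  # slope: expected change in y per one-unit increase in x (hypothetical)

def predict(x):
    """Return the predicted response y_hat for a given x."""
    return a + b * x

print(predict(10))  # 2.0 + 0.5 * 10 = 7.0
```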

  4. Prediction • Interpolation is the use of a regression line to predict within the range of the known observations. • Extrapolation is the use of a regression line to predict outside the range of the known observations. • Predictions from extrapolation are often not accurate, as the sketch below illustrates.
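A small sketch of the distinction, assuming a hypothetical fitted line and an assumed observed x-range of 0 to 100; the warning logic is illustrative only.

```python
# Interpolation vs. extrapolation with a hypothetical fitted line.
a, b = 2.0, 0.5            # made-up intercept and slope: y_hat = a + b * x
x_min, x_max = 0.0, 100.0  # assumed range of the observed x values

def predict(x):
    """Predict y_hat, flagging predictions outside the observed x range."""
    if x < x_min or x > x_max:
        print(f"warning: x = {x} is outside [{x_min}, {x_max}] -- extrapolation")
    return a + b * x

predict(50.0)   # interpolation: x lies inside the observed range
predict(250.0)  # extrapolation: the prediction is often not accurate
```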

  5. Residuals • A residual is the difference between an observed value of the response variable and the value predicted by the regression line. • Residual = observed y − predicted y • Residual = y − ŷ
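In code, the residuals fall out of a single subtraction. The data and fitted line below are made up for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])  # made-up explanatory values
y = np.array([2.3, 2.9, 4.1, 4.4])  # made-up observed responses
a, b = 1.5, 0.75                    # hypothetical fitted line y_hat = a + b*x

y_hat = a + b * x      # values predicted by the regression line
residuals = y - y_hat  # residual = observed y - predicted y
print(residuals)
```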

  6. Least Squares Regression Line • The least-squares regression line of y on x is the line that makes the sum of the squared residuals as small as possible. • Equation: ŷ = a + bx, with slope b = r(s_y / s_x) and intercept a = ȳ − b·x̄.
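A sketch of the slope and intercept formulas in Python, checked against a library fit; the data are made up for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # made-up data
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

r = np.corrcoef(x, y)[0, 1]            # correlation between x and y
b = r * y.std(ddof=1) / x.std(ddof=1)  # slope: b = r * (s_y / s_x)
a = y.mean() - b * x.mean()            # intercept: a = y_bar - b * x_bar

# Sanity check: np.polyfit minimizes the sum of squared residuals directly.
b_fit, a_fit = np.polyfit(x, y, deg=1)
print(a, b)          # formula-based intercept and slope
print(a_fit, b_fit)  # least-squares fit (should agree)
```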

  7. Other Calculations

  8. How well does a line fit the data? Since residuals tell us how far the data points fall from the regression line, they are a natural place to look when assessing fit. A residual plot is a scatterplot of the residuals against the explanatory variable.

  9. How do residual plots help us assess the fit of the data? • A residual plot in effect turns the regression line horizontal. • Residual plots magnify the deviations of points from the line, making it easier to see unusual observations and patterns; a sketch of such a plot follows.
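A minimal residual-plot sketch with matplotlib; the data are made up for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # made-up data
y = np.array([2.0, 3.1, 3.9, 5.2, 5.8, 7.1])

b, a = np.polyfit(x, y, deg=1)  # least-squares slope and intercept
residuals = y - (a + b * x)

# The zero line plays the role of the regression line turned horizontal.
plt.scatter(x, residuals)
plt.axhline(0, color="gray", linestyle="--")
plt.xlabel("x (explanatory variable)")
plt.ylabel("residual (y - y_hat)")
plt.title("Residual plot")
plt.show()
```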

  10. What we look for in residual plots • NO obvious pattern. • A curved pattern shows a nonlinear relationship. • A megaphone pattern shows residuals that grow as x increases. • The residuals should be relatively small. • Their typical size is the typical prediction error, taken up on the next slide.

  11. The average prediction error • The standard deviation of the residuals, s, measures the size of a typical prediction error: s = √( Σ(y − ŷ)² / (n − 2) ).
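A short sketch of computing s; the data are made up for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # made-up data
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

b, a = np.polyfit(x, y, deg=1)
residuals = y - (a + b * x)

n = len(y)
s = np.sqrt(np.sum(residuals**2) / (n - 2))  # typical prediction error
print(s)
```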

  12. Home example • We want to predict the price of a home in Arvada. A random sample of 10 homes for sale is taken, with prices recorded in thousands of dollars. • Make a prediction for the cost of the 11th house if we know its square footage is 1789 ft².

  13. Well, here is what I would do • I would make a scatterplot. • Then I would find the least-squares regression line. • Finally, I would use the regression line to predict the cost. • Here's what I found: ŷ = 231.67 + .33x, r = .87, r² = .76. • Thus the price of the home would be $353.47 thousand.
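The prediction step itself is one line of arithmetic. Note that plugging x = 1789 into the equation as printed on the slide does not reproduce the slide's $353.47 figure, so at least one of the slide's numbers was likely garbled in transcription; the code below only shows the mechanics.

```python
# Coefficients as printed on the slide: y_hat = 231.67 + 0.33x.
a, b = 231.67, 0.33
x_new = 1789  # square footage of the 11th house

print(a + b * x_new)  # 822.04 (thousands); the slide reports 353.47
```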

  14. So what is this r² thing? • r² is the coefficient of determination. • Yes, I know it is r squared, but why do we bother?

  15. More house example Now I am going to change one small thing in our house example: we don't know the size of the 11th house. What would you predict the price to be now? Without a value of x, the best prediction is the mean price, so I would predict $339.8 thousand. Not as good as our last prediction, but not bad.

  16. Explained vs. unexplained variability • We would expect our linear regression model to predict the price better than the mean, but is it really that much different? • The sum of squared prediction errors if we use the mean is 70913.6. This is the TOTAL sum of squares, SST = Σ(y − ȳ)². • The sum of squared residuals is 16754.6. This is the ERROR sum of squares, SSE = Σ(y − ŷ)².

  17. How SST and SSE make r² • The ratio SSE/SST tells us the proportion of variation in y still remaining unexplained. • SSE/SST = 16754.6 / 70913.6 = .236 • Thus 23.6% of the variation is unaccounted for by our model. • Thus the proportion accounted for by our model is 1 − .236 = .764.
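This bookkeeping can be reproduced directly from the SST and SSE values reported on the slides (the raw home data are not given in the transcript).

```python
sst = 70913.6  # total sum of squares: sum of (y - y_bar)^2
sse = 16754.6  # error sum of squares: sum of (y - y_hat)^2

unexplained = sse / sst       # proportion of variation still remaining
r_squared = 1 - unexplained   # proportion explained by the model
print(round(unexplained, 3))  # 0.236
print(round(r_squared, 3))    # 0.764
```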

  18. HOLD ON Wasn't r² = .76? Yes, it was. In fact, we can calculate r² by finding the ratio SSE/SST and subtracting it from 1: r² = 1 − SSE/SST. Thus, what r² tells us is the proportion of the variability in y that is explained by the model.

  19. So finally
