
Regression


minh

Presentation Transcript


  1. Regression

  2. Regression
     • Correlation measures the strength of the linear relationship.
     • Great! But what is that relationship? How do we describe it?
     • Regression, regression line, regression equation
     • The regression line is used for prediction.

  3. Predicting weights from heights
     • Independent variable: height
     • Dependent variable: weight
     • How can we predict one from the other?
     • Regression is to a scatter plot as the mean is to a histogram.

  4. Weights vs. Heights

  5. Salary by years employed [scatter plot: SALARY on the vertical axis against YRS EM, years employed, on the horizontal axis]

  6. Regression by local averages
     • Approximation of local averages by the regression line
     • Inappropriate use of the regression line (use other methods)

  7. The equation of a line: y = a + bx
     • a represents the y-intercept
       • When x equals zero, y equals a.
       • Is this always meaningful in the context of a problem?
       • Is it always useful in defining a line?
     • b represents the slope of the line (rise/run)
       • For every unit change in x, y changes by b.
       • Does this mean that if we physically change x by one unit, y will change by b units? Say we gain another year of experience. Will our salary go up by 1107?

  8. Regression equation
     • What is the predicted weight of somebody whose height is h cm?
     • w = intercept + slope × h
     • This is known as the regression equation.
     • How do we get this formula?
     • We have a statistical model.

  9. Regression line by minimising residual errors [figure label: A residual]
     • e_i = error of the i-th observation from the regression line
     • The best candidate line will minimise these errors.
     • No line can make all errors vanish (some positive, some negative).
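The slide does not say how the individual errors are combined into one quantity to minimise. A common choice, assumed here, is the sum of squared residuals (ordinary least squares). The Python sketch below fits such a line in closed form; the height and weight values and the names `fit_line` and `sum_sq_resid` are hypothetical, made up for illustration and not taken from the presentation.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_sq_resid(xs, ys, intercept, slope):
    """Sum of squared residuals e_i = y_i - (intercept + slope * x_i)."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical heights (cm) and weights (kg), for illustration only.
heights = [150.0, 160.0, 165.0, 172.0, 180.0, 188.0]
weights = [52.0, 58.0, 63.0, 70.0, 74.0, 85.0]

a, b = fit_line(heights, weights)
residuals = [w - (a + b * h) for h, w in zip(heights, weights)]
best = sum_sq_resid(heights, weights, a, b)
```

With an intercept in the model, the least-squares residuals sum to zero, so some are positive and some negative, and nudging either `a` or `b` away from the fitted values increases `sum_sq_resid`.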

  10. Regression and correlation
      • Want to predict weight for those people who are 1 SD above avg. height.
      • SD line says: pred. wt. = overall avg. wt. + SD of wt.
      • Regression line says: predicted wt. = overall avg. wt. + r × SD of wt.
      • For people who are k SDs away from avg. height: predicted wt. = overall avg. wt. + r × k × SD of wt.
      • Clearly valid for r ≈ 0 or r ≈ 1
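The two prediction rules on this slide can be compared numerically. The sketch below is a minimal Python illustration, assuming population SDs (divide by n) and using hypothetical height and weight values that are not from the presentation; the function names are likewise made up for this example.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sd(values):
    """Population SD (divides by n)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def correlation(xs, ys):
    """Average product of the standardised x and y values."""
    mx, my, sx, sy = mean(xs), mean(ys), sd(xs), sd(ys)
    return mean([((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)])


def regression_prediction(xs, ys, k):
    """Predicted y for an x that is k SDs above the average x."""
    return mean(ys) + correlation(xs, ys) * k * sd(ys)


def sd_line_prediction(ys, k):
    """What the SD line would give for the same x (ignores r)."""
    return mean(ys) + k * sd(ys)


# Hypothetical heights (cm) and weights (kg), for illustration only.
heights = [150.0, 160.0, 165.0, 172.0, 180.0, 188.0]
weights = [52.0, 58.0, 63.0, 70.0, 74.0, 85.0]

r = correlation(heights, weights)
at_one_sd = regression_prediction(heights, weights, 1)
on_sd_line = sd_line_prediction(weights, 1)
```

Because r lies strictly between 0 and 1 for these values, the regression prediction at 1 SD above average height falls between the overall average weight and the SD-line value. The rule is equivalent to a line with slope r × (SD of weight) / (SD of height) passing through the point of averages.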

  11. RMS error of regression
      • RMS error = √(1 − r²) × SD of y
      • The RMS error is inversely related to the correlation: the closer r is to ±1, the smaller the RMS error.
      • RMS error is to regression what SD is to the average.
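For a least-squares line, the RMS error of the residuals equals √(1 − r²) × SD of y when both are computed with the same divisor. The Python sketch below checks this on hypothetical height and weight values (not from the presentation), assuming population SDs; `rms_error_of_regression` is a name made up for this example.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sd(values):
    """Population SD (divides by n)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def rms_error_of_regression(xs, ys):
    """Fit the least-squares line and return (r, RMS of its residuals)."""
    mx, my, sx, sy = mean(xs), mean(ys), sd(xs), sd(ys)
    r = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)]) / (sx * sy)
    slope = r * sy / sx
    intercept = my - slope * mx
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    rms = math.sqrt(mean([e ** 2 for e in residuals]))
    return r, rms


# Hypothetical heights (cm) and weights (kg), for illustration only.
heights = [150.0, 160.0, 165.0, 172.0, 180.0, 188.0]
weights = [52.0, 58.0, 63.0, 70.0, 74.0, 85.0]

r, rms = rms_error_of_regression(heights, weights)
shortcut = math.sqrt(1 - r ** 2) * sd(weights)
```

Here `rms` and `shortcut` agree up to floating-point rounding, and both are smaller than the SD of the weights, which is the error one would make by always predicting the average.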

  12. Residuals
      residual = observed − predicted

  13. Example: ozone vs. temperature
      > air[,c(1,3)]
        ozone temperature
         3.45          67
         3.30          72
         2.29          74
         2.62          62
         2.84          65
         . . .
      > cor(ozone,temperature)
      [1] 0.7531038

  14. Fitting a regression model in S
      > ozone.lm <- lm(ozone ~ temperature, data = air)
      > summary(ozone.lm)
      Coefficients:
                   Value Std. Error t value Pr(>|t|)
      (Intercept)  -2.23       0.46   -4.82   0.0000
      temperature   0.07       0.01   11.95   0.0000
      Multiple R-Squared: 0.5672
      > var(ozone)
      [1] 0.7928069
      > var(resid(ozone.lm))
      [1] 0.3431544
      > cor(ozone,temperature)
      [1] 0.7531038
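The three numbers printed on this slide are tied together: for a simple linear regression, R-squared equals r², and it also equals 1 − var(residuals) / var(y). A short Python check using only the values shown in the S output (the variable names are chosen here for illustration):

```python
import math

# Values copied from the S output on the slide.
var_ozone = 0.7928069   # var(ozone)
var_resid = 0.3431544   # var(resid(ozone.lm))
r = 0.7531038           # cor(ozone, temperature)

# R-squared computed two ways.
r_squared_from_r = r ** 2
r_squared_from_var = 1 - var_resid / var_ozone

# Ratio of residual spread to the spread of ozone, compared with sqrt(1 - r^2).
spread_ratio = math.sqrt(var_resid / var_ozone)
expected_ratio = math.sqrt(1 - r ** 2)

print(round(r_squared_from_r, 4), round(r_squared_from_var, 4))
print(round(spread_ratio, 4), round(expected_ratio, 4))
```

Both routes reproduce the reported Multiple R-Squared of 0.5672 to four decimal places, and the residual spread is about √(1 − r²) times the spread of ozone.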

  15. Checking model appropriateness
      • What assumptions have we made in the regression model?
      • Checking model assumptions in S-plus:
      > par(mfrow=c(2,3))
      > plot(ozone.lm)

  16. Residual diagnostics for ozone data

  17. Extrapolation
      Beware of extrapolation. Pizza party at the Frat:
      • How many laps would you predict a pledge could run if he ate 6 slices of pizza?
      • How many laps if he ate 9 slices of pizza?
      • A pledge shows off and eats 35 slices of pizza. How many laps would you predict he would run?
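The pizza data themselves are not in the transcript, so the Python sketch below uses made-up slices and laps values purely to illustrate the warning. It fits a least-squares line and flags any prediction whose x value lies outside the observed range; `fit_line` and `predict_with_flag` are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict_with_flag(x, xs, intercept, slope):
    """Return (prediction, extrapolating), where extrapolating is True
    when x lies outside the range of the observed xs."""
    extrapolating = x < min(xs) or x > max(xs)
    return intercept + slope * x, extrapolating


# Hypothetical data, for illustration only: slices eaten and laps run.
slices = [1, 2, 3, 4, 5, 6, 7, 8]
laps = [20, 18, 17, 14, 13, 10, 9, 6]

a, b = fit_line(slices, laps)
for x in (6, 9, 35):
    pred, extrapolating = predict_with_flag(x, slices, a, b)
    note = " (extrapolation: treat with caution)" if extrapolating else ""
    print(f"{x} slices -> {pred:.1f} laps{note}")
```

With these made-up values, 6 slices is inside the observed range, 9 is just outside it, and 35 is far outside, where the line returns a negative number of laps. The fitted line has no way of knowing that its pattern stops holding beyond the data.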
