Every day is a new beginning in life. Every moment is a time for self-vigilance.

abrial

Presentation Transcript


  1. Every day is a new beginning in life. Every moment is a time for self-vigilance.

  2. Simple Linear Regression • Regression model • Goodness of fit • Model diagnosis

  3. Goal: to predict the length of Armspan for a given Height • Humm… • How long is my armspan?

  4. Armspan Data

     HEIGHT  ARMSPAN
     68.75   64.25
     75.75   70.25
     45.75   43.00
     66.75   66.25
     66.50   66.75
     72.25   71.25
     48.25   47.25
     …
     75.50   70.00
     75.00   77.25
     64.00   65.25
     68.50   67.50

  5. Review: Math Equation for a Line • Y: the response variable • X: the explanatory variable • Y = a + bX, where a is the intercept and b is the slope (the change in Y for a one-unit increase in X)

  6. Regression Model • The regression line models the relationship between X and Y on average. • Population regression line • Least-squares regression line • The math equation of a regression line is called the regression equation.

  7. The Predicted Y Value • We use the regression line to estimate the average Y value for a specified X value, and we use this estimate to predict the Y value we might observe at that X value in the near future. • This predicted Y value, denoted $\hat{y}$ and pronounced “y hat,” is the Y value on the regression line. So the regression equation is $\hat{y} = a + bX$.

  8. The Usage of the Regression Equation • Predict the value of Y for a given X value. E.g., we wish to predict a lady’s weight (WT) from her height (HT). ** What is X? Y? ** Suppose a and b are estimated as -205 and 5. ** For ladies with HT of 60”, the predicted WT is -205 + 5 × 60 = 95 pounds, the (estimated) average WT of all ladies with HT of 60”.
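
The weight example above can be written as a small function. This is a minimal sketch using the estimates a = -205 and b = 5 from the slide; the function name `predict_weight` is only illustrative.

```python
def predict_weight(height_in, a=-205.0, b=5.0):
    """Predicted weight (pounds) from the regression equation y-hat = a + b * x."""
    return a + b * height_in


# Height of 60 inches, as in the slide's example.
print(predict_weight(60))  # 95.0
```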

  9. Examples of the Predicted Y • The predicted WT for a given HT • The predicted armspan for a given height

  10. The Limitation of the Regression Equation • The regression equation cannot be used to predict the Y value for X values that are (far) beyond the range in which the data were observed. E.g., given HT of 40”, the regression equation gives a WT of -205 + 5 × 40 = -5 pounds!
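
One way to guard against this kind of extrapolation is to refuse predictions outside the observed X range. The sketch below assumes a hypothetical observed height range of 55 to 75 inches, since the slides do not state the actual range; the function name `predict_in_range` is also illustrative.

```python
def predict_in_range(x, a, b, x_min, x_max):
    """Return a + b * x, but refuse to extrapolate outside [x_min, x_max]."""
    if not (x_min <= x <= x_max):
        raise ValueError(f"x={x} is outside the observed range [{x_min}, {x_max}]")
    return a + b * x


# Hypothetical observed height range in inches (an assumption, not from the slides).
HT_MIN, HT_MAX = 55.0, 75.0

print(predict_in_range(60, -205.0, 5.0, HT_MIN, HT_MAX))  # 95.0

try:
    predict_in_range(40, -205.0, 5.0, HT_MIN, HT_MAX)
except ValueError as err:
    # The slide's 40-inch example is rejected instead of returning -5 pounds.
    print("refused:", err)
```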

  11. The Unpredicted Part • The value $y - \hat{y}$ is the part that the regression equation (model) cannot capture. It is called the “residual,” denoted e, and it is an estimate of the “error” at this observation.
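
As a worked example of a residual, the sketch below takes the fitted coefficients from the Minitab output on slide 20 and the first data row on slide 4 (HEIGHT 68.75, ARMSPAN 64.25). The function name `residual` is illustrative.

```python
# Fitted intercept and slope, copied from the Minitab output on slide 20.
A, B = -3.728, 1.03655


def residual(height, armspan, a=A, b=B):
    """Residual e = observed y minus predicted y-hat."""
    predicted = a + b * height
    return armspan - predicted


# First row of the armspan data on slide 4.
e = residual(68.75, 64.25)
print(round(e, 2))  # -3.28: the observed armspan lies below the fitted line
```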

  12. [Figure: a residual marked on the regression plot]

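  13. Least Squares Method • The regression line is the line that minimizes the sum of squared residuals (SSE), so the formulas for the intercept and slope of the regression line are $a = \bar{y} - b\bar{x}$ and $b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$.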
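
The least-squares estimates can be computed directly from the standard closed-form expressions (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x). This is a minimal sketch that uses only the 11 rows listed on slide 4; the full data set has 40 rows (Total DF = 39 on slide 20), so these estimates will not match the Minitab output exactly. The function name `least_squares` is illustrative.

```python
def least_squares(xs, ys):
    """Return (intercept a, slope b) of the least-squares line y-hat = a + b * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


# Only the rows shown on slide 4; the elided rows are not available here.
heights = [68.75, 75.75, 45.75, 66.75, 66.50, 72.25, 48.25, 75.50, 75.00, 64.00, 68.50]
armspans = [64.25, 70.25, 43.00, 66.25, 66.75, 71.25, 47.25, 70.00, 77.25, 65.25, 67.50]

a, b = least_squares(heights, armspans)
residuals = [y - (a + b * x) for x, y in zip(heights, armspans)]

print(f"intercept a = {a:.3f}, slope b = {b:.3f}")
# With an intercept in the model, least-squares residuals sum to zero
# (up to floating-point rounding).
print(f"sum of residuals = {sum(residuals):.6f}")
```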

  14. Inference for Regression Slope b • Standard error of b • Confidence interval • Hypothesis test
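
The slide's formulas did not survive in this transcript, so the sketch below assumes the usual t-based procedures: test statistic t = b / SE(b) with n - 2 degrees of freedom, and confidence interval b ± t* × SE(b). The slope and standard error come from the Minitab output on slide 20; the critical value 2.024 is an approximate t-table value for 38 degrees of freedom and is entered by hand.

```python
# Slope estimate and its standard error, from the Minitab output on slide 20.
coef, se = 1.03655, 0.04082
n = 40        # Total DF = 39 in the ANOVA table, so n = 40
df = n - 2    # degrees of freedom for the slope in simple linear regression

# Test statistic for H0: slope = 0.
t_stat = coef / se

# Assumed: approximate 97.5th percentile of the t distribution with 38 df,
# read from a t-table rather than computed.
t_crit = 2.024
ci = (coef - t_crit * se, coef + t_crit * se)

print(f"t = {t_stat:.2f} on {df} df")                        # t = 25.39 on 38 df
print(f"95% CI for slope: ({ci[0]:.3f}, {ci[1]:.3f})")       # (0.954, 1.119)
```

The computed t matches the T column for HEIGHT in the Minitab output.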

  15. Goodness of Fit • For each observation: residuals • For the whole data set: the coefficient of determination $R^2$, which measures the proportion of variability in Y explained by the model (the linear regression of Y on X) • For simple linear regression (only one predictor), $R^2 = r^2$, where r is the correlation between X and Y
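
The identity between the coefficient of determination and the squared correlation can be checked numerically. This sketch uses only the 11 rows listed on slide 4, so the value differs from the 94.4% reported for the full data set; the variable names are illustrative.

```python
import math

# Only the rows shown on slide 4; the elided rows are not available here.
heights = [68.75, 75.75, 45.75, 66.75, 66.50, 72.25, 48.25, 75.50, 75.00, 64.00, 68.50]
armspans = [64.25, 70.25, 43.00, 66.25, 66.75, 71.25, 47.25, 70.00, 77.25, 65.25, 67.50]

n = len(heights)
x_bar = sum(heights) / n
y_bar = sum(armspans) / n
sxx = sum((x - x_bar) ** 2 for x in heights)
syy = sum((y - y_bar) ** 2 for y in armspans)   # total sum of squares of Y
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, armspans))

# Pearson correlation between X and Y.
r = sxy / math.sqrt(sxx * syy)

# Least-squares fit, then R^2 = 1 - SSE / SST.
b = sxy / sxx
a = y_bar - b * x_bar
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(heights, armspans))
r_squared = 1 - sse / syy

print(f"r^2 = {r ** 2:.4f}, R^2 = {r_squared:.4f}")  # the two values agree
```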

  16. Model Assumptions and Diagnosis • Independent observations • Y|X=x follows a normal distribution with a common standard deviation σ, independent of the x value • Diagnosis: residual plot (residuals vs. fitted values)

  17. Residual Plot: Is the spread of the residuals more or less the same across the fitted values?

  18. Minitab: Stat >> Regression >> Regression … • Select the response and predictors accordingly • Click “Graphs” for residual plots

  19. Residual Plots Click “residuals versus fits”

  20. Regression Analysis: ARMSPAN versus HEIGHT

      The regression equation is
      ARMSPAN = - 3.73 + 1.04 HEIGHT

      Predictor     Coef  SE Coef      T      P
      Constant    -3.728    2.660  -1.40  0.169
      HEIGHT     1.03655  0.04082  25.39  0.000

      S = 2.12905   R-Sq = 94.4%   R-Sq(adj) = 94.3%

      Analysis of Variance

      Source          DF      SS      MS       F      P
      Regression       1  2922.8  2922.8  644.81  0.000
      Residual Error  38   172.2     4.5
      Total           39  3095.1

      (Minitab output)
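
The summary statistics in this output can be reproduced from the Analysis of Variance rows. The sketch below uses the rounded sums of squares as printed, so the results agree with Minitab only up to rounding; the variable names are illustrative.

```python
import math

# Values copied from the Analysis of Variance table above.
ss_regression = 2922.8
ss_error = 172.2
ss_total = 3095.1
df_error = 38

r_sq = ss_regression / ss_total   # proportion of variability explained
mse = ss_error / df_error         # mean squared error
s = math.sqrt(mse)                # residual standard deviation
f_stat = ss_regression / mse      # regression MS (1 df) divided by MSE

print(f"R-Sq = {100 * r_sq:.1f}%")  # R-Sq = 94.4%
print(f"S = {s:.3f}")               # S = 2.129
print(f"F = {f_stat:.1f}")          # close to Minitab's 644.81; the gap comes from rounded SS
```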
