
Logistic regression for binary response variables




  1. Logistic regression for binary response variables

  2. Space shuttle example • n = 24 space shuttle launches prior to the Challenger disaster on January 28, 1986 • Response y is an indicator variable • y = 1 if O-ring failures during launch • y = 0 if no O-ring failures during launch • Predictor x1 is launch temperature, in degrees Fahrenheit

  3. Space shuttle example

  4. A model

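  5. The mean of a binary response • If there are 20% smokers and 80% non-smokers, and $Y_i = 1$ if smoker and $Y_i = 0$ if non-smoker, then: $E(Y_i) = 1(0.2) + 0(0.8) = 0.2$ • If $\pi_i = P(Y_i = 1)$ and $1 - \pi_i = P(Y_i = 0)$, then: $E(Y_i) = 1(\pi_i) + 0(1 - \pi_i) = \pi_i$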

  6. A linear regression model for a binary response • If the simple linear regression model is: $Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ for $Y_i = 0, 1$ • Then, the mean response $E(Y_i) = \beta_0 + \beta_1 x_i = \pi_i$ is the probability that $Y_i = 1$ when the level of the predictor variable is $x_i$.

  7. Space shuttle example

  8. (Simple) logistic regression function

  9. Space shuttle example

  10. Alternative formulation of (simple) logistic regression function (algebra) “logit”
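The algebra linking the two formulations can be sketched as follows. This assumes the simple logistic regression function has the usual form with intercept $\beta_0$ and slope $\beta_1$; these symbol names are an assumption, and the estimation slides later write the same two parameters as α and β.

```latex
\begin{align*}
\pi_i &= \frac{\exp(\beta_0 + \beta_1 x_i)}{1 + \exp(\beta_0 + \beta_1 x_i)} \\
1 - \pi_i &= \frac{1}{1 + \exp(\beta_0 + \beta_1 x_i)} \\
\frac{\pi_i}{1 - \pi_i} &= \exp(\beta_0 + \beta_1 x_i) \\
\operatorname{logit}(\pi_i) = \log\!\left(\frac{\pi_i}{1 - \pi_i}\right) &= \beta_0 + \beta_1 x_i
\end{align*}
```

The last line shows why the logit is convenient: the log odds are linear in the predictor, even though the probability itself is not.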

  11. Space shuttle example

  12. Interpretation of slope coefficients

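  13. Odds • If $\pi_i = P(Y_i = 1)$ and $1 - \pi_i = P(Y_i = 0)$, then the odds that $Y_i = 1$ are $\pi_i / (1 - \pi_i)$ and the odds that $Y_i = 0$ are $(1 - \pi_i) / \pi_i$. • If there are 20% smokers and 80% non-smokers, the odds of being a non-smoker are $0.8 / 0.2 = 4$ and the odds of being a smoker are $0.2 / 0.8 = 0.25$. • “Odds are 4 to 1” … 4 non-smokers to 1 smoker.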

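  14. Odds ratio • MALE: 20% smokers and 80% non-smokers: odds of being a nonsmoker $= 0.8 / 0.2 = 4$ • FEMALE: 40% smokers and 60% non-smokers: odds of being a nonsmoker $= 0.6 / 0.4 = 1.5$ • Odds ratio $= 4 / 1.5 = 2.67$: the odds that a male is a nonsmoker are 2.67 times the odds that a female is a nonsmoker.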
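The smoker arithmetic above can be checked with a few lines of Python. This is a minimal sketch; the helper names `odds` and `odds_ratio` are illustrative and not part of the presentation.

```python
def odds(p):
    """Odds of an event with probability p, i.e. p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


def odds_ratio(p_group1, p_group2):
    """Odds of the event in group 1 divided by the odds in group 2."""
    return odds(p_group1) / odds(p_group2)


# Proportions of non-smokers quoted on the slide.
male_nonsmoker = 0.80
female_nonsmoker = 0.60

male_odds = odds(male_nonsmoker)
female_odds = odds(female_nonsmoker)
ratio = odds_ratio(male_nonsmoker, female_nonsmoker)

print(f"male odds   = {male_odds:.2f}")    # 4.00
print(f"female odds = {female_odds:.2f}")  # 1.50
print(f"odds ratio  = {ratio:.2f}")        # 2.67
```

Swapping the two groups gives the reciprocal, 1.5 / 4 = 0.375, so it matters which group is named first when reporting an odds ratio.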

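  15. Odds ratio • Group 1: odds $= \pi_1 / (1 - \pi_1)$ • Group 2: odds $= \pi_2 / (1 - \pi_2)$ • The odds ratio: $\dfrac{\pi_1 / (1 - \pi_1)}{\pi_2 / (1 - \pi_2)}$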

  16. Space shuttle example • Predicted odds: • Predicted odds at x1 = 55 degrees: • Predicted odds at x1 = 80 degrees:

  17. Space shuttle example • Predicted odds ratio for x1 = 55 relative to x1 = 80: • The odds of O-ring failure at 55 degrees Fahrenheit are 76 times the odds of O-ring failure at 80 degrees Fahrenheit!

  18. Interpretation of slope coefficients • The ratio of the odds at X1 = A relative to the odds at X1 = B (for fixed values of other X’s) is: $\exp\{\beta_1 (A - B)\}$
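As a rough check of this interpretation, the sketch below plugs in the `temp` coefficient reported in the space shuttle output (−0.17132). The function name `odds_ratio_between` is illustrative. With this rounded coefficient the ratio for 55 versus 80 degrees comes out a little above 72, close to but not identical to the 76 quoted on the slide; the difference may reflect rounding of intermediate values or a slightly different fit.

```python
import math


def odds_ratio_between(slope, a, b):
    """Ratio of the odds at X1 = a to the odds at X1 = b: exp(slope * (a - b))."""
    return math.exp(slope * (a - b))


# 'temp' coefficient as printed in the space shuttle regression table.
TEMP_SLOPE = -0.17132

or_55_vs_80 = odds_ratio_between(TEMP_SLOPE, 55, 80)
or_per_degree = odds_ratio_between(TEMP_SLOPE, 1, 0)

print(f"odds ratio, 55 vs 80 degrees: {or_55_vs_80:.1f}")
print(f"odds ratio per 1-degree rise: {or_per_degree:.2f}")
```

The per-degree value reproduces the 0.84 shown in the Odds Ratio column of the output, since that column is simply the exponentiated coefficient.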

  19. Estimation of logistic regression coefficients

  20. Maximum likelihood estimation • Choose as estimates of the parameters the values that assign the highest probability to (“maximize likelihood of”) the observed outcome.

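  21. Suppose the logistic regression function has intercept α and slope β • For first observation, Y1 = 1 and x1 = 53: $P(Y_1 = 1) = \exp(\alpha + 53\beta) / \{1 + \exp(\alpha + 53\beta)\}$ • … for second observation, Y2 = 1 and x2 = 56: $P(Y_2 = 1) = \exp(\alpha + 56\beta) / \{1 + \exp(\alpha + 56\beta)\}$ • … and for 24th observation, Y24 = 0 and x24 = 81: $P(Y_{24} = 0) = 1 / \{1 + \exp(\alpha + 81\beta)\}$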

  22. If α = 10 and β = -0.15, what is the probability of the observed outcome? • The likelihood of the observed outcome is: • The log likelihood of the observed outcome is:
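In general form, and assuming the launches are independent, the likelihood and log likelihood for a binary response can be written as below. The notation ($L$, $\pi_i$, $y_i$) is the standard textbook one and is an assumption here, since the slide's own expressions are not reproduced in the text.

```latex
\begin{align*}
L(\alpha, \beta) &= \prod_{i=1}^{n} \pi_i^{\,y_i} (1 - \pi_i)^{1 - y_i},
\qquad \pi_i = \frac{\exp(\alpha + \beta x_i)}{1 + \exp(\alpha + \beta x_i)} \\
\log L(\alpha, \beta) &= \sum_{i=1}^{n} \bigl[\, y_i \log \pi_i + (1 - y_i) \log(1 - \pi_i) \,\bigr]
\end{align*}
```

Each launch with a failure contributes $\pi_i$ to the product and each launch without a failure contributes $1 - \pi_i$.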

  23. Maximum likelihood estimation • Choose as estimates of the parameters the values that assign the highest probability to (“maximize likelihood of”) the observed outcome.

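  24. Suppose the logistic regression function has intercept α and slope β • For first observation, Y1 = 1 and x1 = 53: $P(Y_1 = 1) = \exp(\alpha + 53\beta) / \{1 + \exp(\alpha + 53\beta)\}$ • … for second observation, Y2 = 1 and x2 = 56: $P(Y_2 = 1) = \exp(\alpha + 56\beta) / \{1 + \exp(\alpha + 56\beta)\}$ • … and for 24th observation, Y24 = 0 and x24 = 81: $P(Y_{24} = 0) = 1 / \{1 + \exp(\alpha + 81\beta)\}$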

  25. If α = 10.8 and β = -0.17, what is the probability of the observed outcome? • The likelihood of the observed outcome is: • The log likelihood of the observed outcome is:
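The calculation on these slides can be sketched in Python. Only three of the 24 launches are listed in the text (the first, second, and 24th), so the example below evaluates the log likelihood over just those three; it illustrates the mechanics and is not a comparison of the two candidate parameter pairs on the full data. The function names are illustrative.

```python
import math


def prob_failure(alpha, beta, temp):
    """P(Y = 1) under the logistic model with linear predictor alpha + beta * temp."""
    eta = alpha + beta * temp
    return 1.0 / (1.0 + math.exp(-eta))


def log_likelihood(alpha, beta, observations):
    """Sum of log P(observed y) over (temperature, y) pairs with y in {0, 1}."""
    total = 0.0
    for temp, y in observations:
        p = prob_failure(alpha, beta, temp)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


# The three observations spelled out on the slides: (x, Y).
listed_observations = [(53, 1), (56, 1), (81, 0)]

# The two candidate parameter pairs considered on the slides.
candidates = [(10.0, -0.15), (10.8, -0.17)]

results = {}
for alpha, beta in candidates:
    ll = log_likelihood(alpha, beta, listed_observations)
    results[(alpha, beta)] = ll
    print(f"alpha={alpha}, beta={beta}: "
          f"log likelihood over listed launches = {ll:.4f}, "
          f"likelihood = {math.exp(ll):.4f}")
```

Maximum likelihood estimation repeats this evaluation over all observations and searches for the (α, β) pair with the largest value, which statistical software does numerically.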

  26. Space shuttle example

      Link Function: Logit

      Response Information
      Variable  Value  Count
      failure   1          7  (Event)
                0         17
                Total     24

      Logistic Regression Table
                                                  Odds     95% CI
      Predictor      Coef  SE Coef      Z      P  Ratio  Lower  Upper
      Constant     10.875    5.703   1.91  0.057
      temp       -0.17132  0.08344  -2.05  0.040   0.84   0.72   0.99

  27. Properties of MLEs • If a model is correct and the sample size is large enough: • MLEs are essentially unbiased. • Formulas exist for estimating the standard errors of the estimators. • The estimators are about as precise as any nearly unbiased estimators. • MLEs are approximately normally distributed.

  28. Test and confidence intervals for single coefficients

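  29. Inference for βj • Test statistic: $z = \hat{\beta}_j / \mathrm{SE}(\hat{\beta}_j)$, which follows an approximate standard normal distribution when $\beta_j = 0$. • Confidence interval: $\hat{\beta}_j \pm z_{1-\alpha/2}\, \mathrm{SE}(\hat{\beta}_j)$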

  30. Space shuttle example

      Link Function: Logit

      Response Information
      Variable  Value  Count
      failure   1          7  (Event)
                0         17
                Total     24

      Logistic Regression Table
                                                  Odds     95% CI
      Predictor      Coef  SE Coef      Z      P  Ratio  Lower  Upper
      Constant     10.875    5.703   1.91  0.057
      temp       -0.17132  0.08344  -2.05  0.040   0.84   0.72   0.99

  31. Space shuttle example • There is sufficient evidence, at the α = 0.05 level, to conclude that temperature is related to the probability of O-ring failure. • For every 1-degree increase in temperature, the odds of O-ring failure are estimated to be multiplied by 0.84 (95% CI is 0.72 to 0.99).
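The Z, P, odds ratio, and confidence limits in the `temp` row can be reproduced from the coefficient and its standard error alone. The sketch below assumes the usual Wald construction (estimate divided by standard error, compared with a standard normal); the function name `wald_summary` is illustrative, and the normal distribution comes from Python's standard `statistics` module.

```python
import math
from statistics import NormalDist


def wald_summary(coef, se, level=0.95):
    """Wald z statistic, two-sided p-value, and odds-ratio confidence interval."""
    std_normal = NormalDist()
    z = coef / se
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    z_crit = std_normal.inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    lower = coef - z_crit * se
    upper = coef + z_crit * se
    return {
        "z": z,
        "p_value": p_value,
        "odds_ratio": math.exp(coef),
        "or_lower": math.exp(lower),
        "or_upper": math.exp(upper),
    }


# Coefficient and standard error for 'temp' from the space shuttle output.
temp = wald_summary(coef=-0.17132, se=0.08344)

print(f"z = {temp['z']:.2f}, p = {temp['p_value']:.3f}")
print(f"odds ratio = {temp['odds_ratio']:.2f} "
      f"(95% CI {temp['or_lower']:.2f} to {temp['or_upper']:.2f})")
```

The interval is computed on the coefficient scale and then exponentiated, which is why the limits 0.72 and 0.99 are not symmetric about 0.84.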

  32. Survival in the Donner Party • In 1846, Donner and Reed families traveled from Illinois to California by covered wagon. • Group became stranded in eastern Sierra Nevada mountains when hit by heavy snow. • 40 of 87 members died from famine and exposure. • Are females better able to withstand harsh conditions than are males?

  33. Survival in the Donner Party

  34. Survival in the Donner Party

      Link Function: Logit

      Response Information
      Variable  Value     Count
      STATUS    SURVIVED     20  (Event)
                DIED         25
                Total        45

      Logistic Regression Table
                                                  Odds     95% CI
      Predictor      Coef  SE Coef      Z      P  Ratio  Lower  Upper
      Constant      1.633    1.110   1.47  0.141
      AGE        -0.07820  0.03729  -2.10  0.036   0.92   0.86   0.99
      Gender       1.5973   0.7555   2.11  0.034   4.94   1.12  21.72
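To see what the fitted Donner Party model implies, the sketch below turns the printed coefficients into predicted survival probabilities. Two things are assumptions: the age of 30 is a hypothetical value chosen for illustration, and the coding of `Gender` is not stated in the output, so the code treats it only as a 0/1 indicator without saying which sex is coded 1.

```python
import math

# Coefficients as printed in the Donner Party regression table.
INTERCEPT = 1.633
AGE_COEF = -0.07820
GENDER_COEF = 1.5973


def survival_probability(age, gender_indicator):
    """Predicted P(survived) for a given age and a 0/1 gender indicator."""
    eta = INTERCEPT + AGE_COEF * age + GENDER_COEF * gender_indicator
    return 1.0 / (1.0 + math.exp(-eta))


def to_odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


hypothetical_age = 30  # illustrative value, not taken from the data
p_indicator_0 = survival_probability(hypothetical_age, 0)
p_indicator_1 = survival_probability(hypothetical_age, 1)
gender_odds_ratio = to_odds(p_indicator_1) / to_odds(p_indicator_0)

print(f"P(survived | age 30, Gender = 0) = {p_indicator_0:.2f}")
print(f"P(survived | age 30, Gender = 1) = {p_indicator_1:.2f}")
print(f"odds ratio, Gender 1 vs 0        = {gender_odds_ratio:.2f}")
```

The odds ratio between the two gender codes is the same at every age and matches the 4.94 in the table, because age enters both linear predictors identically and cancels in the ratio.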
