
Logistic regression analysis

Logistic regression analysis. Martin van der Esch, PhD. Discovering statistics using SPSS Andy Field http://www.youtube.com/watch?v=OvQShzJ7Sns (part 1) http://www.youtube.com/watch?v=zdJhydkcqv4 (part 2) http://www.youtube.com/watch?v=hxcDOoupB4Y (part 3) etc.


Presentation Transcript


  1. Logistic regression analysis • Martin van der Esch, PhD

  2. Discovering statistics using SPSS Andy Field • http://www.youtube.com/watch?v=OvQShzJ7Sns (part 1) • http://www.youtube.com/watch?v=zdJhydkcqv4 (part 2) • http://www.youtube.com/watch?v=hxcDOoupB4Y (part 3) • etc

  3. Logistic regression analysis • The basic principle of logistic regression is much the same as in linear regression analysis • The aim is to predict a transformation of the dichotomous dependent variable • This transformation is the logit transformation

  4. Steps to follow • Step 1: simple linear regression equation for a binary dependent variable • Step 2: formulate the estimated probability of Y • Step 3: in logistic regression we use the odds of the estimated probability

  5. Steps to follow (2) • Step 4: when the data are skewed (to the right), apply the logit transformation, which gives the log odds • Step 5: different ways of presentation: the estimated probability p can be calculated from a combination of variables
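
The equations belonging to the five steps did not survive in this transcript. The block below is a reconstruction in the standard textbook notation, not the slides' own formulas; the symbols b_0, b_1 and X_1 are assumptions, shown for a single predictor.

```latex
% Step 1: linear regression equation (unsuitable as-is for a 0/1 outcome)
Y = b_0 + b_1 X_1

% Step 2: estimated probability of Y
P(Y) = \frac{1}{1 + e^{-(b_0 + b_1 X_1)}}

% Step 3: odds of the estimated probability
\text{odds} = \frac{P(Y)}{1 - P(Y)}

% Step 4: logit transformation gives the log odds, which is linear in X_1
\ln\!\left(\frac{P(Y)}{1 - P(Y)}\right) = b_0 + b_1 X_1

% Step 5: back-transformation to an estimated probability
P(Y) = \frac{e^{b_0 + b_1 X_1}}{1 + e^{b_0 + b_1 X_1}}
```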

  6. Binary instead of continuous outcome • We are interested in a binary outcome measure • For example: heart attack • Y = 0 (“no”) • Y = 1 (“yes”)

  7. … and we want But, how do we get there…?

  8. Analysing a binary variable (Y) as if it were a continuous variable • Not possible, because Y (heart attack) is no or yes (0 or 1)

  9. Number of heart attacks in different age groups

  10. Possible… [Figure: relation between age (x-axis, 20 to 90 years) and probability of heart attack, p(y=1) (y-axis, 0 to 1)]

  11. Use of the logistic model • The dichotomous outcome event itself is NOT modelled • The model gives the probability of the outcome event given a set of prognostic factors • probability (D=1 | X1,X2,…,Xn) • e.g. probability (death | man, 80 yrs, with hypertension, normal cholesterol level)

  12. Outcome becomes estimated probability of outcome

  13. Estimated probability of outcome But, distribution of probability is skewed…

  14. Logit(p) of outcome • Logit transformation of the proportion to remove skewness!

  15. Logit(p) of outcome • A probability can be transformed into a number between minus infinity and infinity in two steps • 1. obtain the odds (2 out of 5 are sick: odds = 2/3) • 2. take the natural logarithm • The natural logarithm is the logarithm with base e (e = 2.71828…), written 'elog' or 'ln'
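
The two steps can be sketched in a few lines of Python using the slide's own example (2 out of 5 sick). The function names are illustrative assumptions, not part of the presentation.

```python
import math

def odds_from_probability(p):
    """Step 1: turn a probability into odds, p / (1 - p)."""
    return p / (1.0 - p)

def logit(p):
    """Step 2: take the natural logarithm of the odds."""
    return math.log(odds_from_probability(p))

def inverse_logit(z):
    """Map a log odds value back to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

p_sick = 2 / 5                             # 2 out of 5 are sick
odds_sick = odds_from_probability(p_sick)  # 2 sick versus 3 not sick
log_odds_sick = logit(p_sick)

print(f"odds = {odds_sick:.4f}, ln(odds) = {log_odds_sick:.4f}")
print(f"back-transformed probability = {inverse_logit(log_odds_sick):.2f}")
```

Probabilities below 0.5 give a negative ln(odds), a probability of exactly 0.5 gives 0, and probabilities above 0.5 give a positive value.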

  16. Model • the ln(odds) of an event is modelled • the model is similar to the linear regression model

  17. Model • It is far easier to model along the whole number line, as in linear regression • from minus infinity to infinity • a probability, by contrast, is defined as being between 0 and 1

  18. Solution: logit transformation (is linear in x) Logit

  19. Model for logistic regression • Outcome = natural logarithm of the odds of the outcome

  20. Summary • 1. Rewrite the outcome as a probability of the outcome • 2. Logit transformation: rewrite the outcome as ln(odds)

  21. Model for logistic regression • The β’s (betas) are estimated with the maximum likelihood procedure

  22. Logistic regression analysis • The ‘best’ line is calculated with the ‘maximum likelihood procedure’ • Maximum likelihood: the estimates are obtained by repeated cycles of calculation (iteration) until they no longer change
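
As an illustration of the "repeated cycles of calculation", here is a minimal sketch of Newton-Raphson iteration for a logistic model with one predictor. This is one common way to maximise the likelihood, not necessarily the exact routine SPSS uses; the grouped counts and the function name `fit_logistic` are hypothetical.

```python
import math

# Hypothetical grouped data: (x value, number of subjects, number of events).
groups = [(0, 100, 20), (1, 100, 35)]

def fit_logistic(groups, max_cycles=25, tol=1e-10):
    """Estimate b0 and b1 in ln(odds) = b0 + b1 * x by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for cycle in range(1, max_cycles + 1):
        g0 = g1 = 0.0            # gradient of the log likelihood
        i00 = i01 = i11 = 0.0    # information matrix (negative Hessian)
        for x, n, y in groups:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            resid = y - n * p            # observed minus expected events
            w = n * p * (1.0 - p)        # binomial variance weight
            g0 += resid
            g1 += x * resid
            i00 += w
            i01 += w * x
            i11 += w * x * x
        det = i00 * i11 - i01 * i01
        # Solve the 2x2 system: step = information^-1 * gradient
        step0 = (i11 * g0 - i01 * g1) / det
        step1 = (-i01 * g0 + i00 * g1) / det
        b0 += step0
        b1 += step1
        if abs(step0) < tol and abs(step1) < tol:
            break                # estimates no longer change
    return b0, b1, cycle

b0, b1, cycles = fit_logistic(groups)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, cycles used = {cycles}")
```

With a single binary predictor the result can be checked by hand: b0 should equal ln(20/80) and b1 should equal the ln of the odds ratio between the two groups.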

  23. Example: Recovery

  24. Example: Binary outcome (heart attack) and one binary predictor (smoking)

  25. ln(odds) of heart attack = -0.171 + 0.8 × smoking • What is ß0? • ß0 = ln(odds) of heart attack for a non-smoker • odds of heart attack for a non-smoker = EXP(ß0)

  26. ln(odds) of heart attack = -0.171 + 0.8 × smoking • ln(odds) smoking - ln(odds) non-smoking = ß0 + ß1 - ß0 = ß1 • ln[(odds) smoking / (odds) non-smoking] = ß1 • ln(OR) = ß1 • OR = EXP(ß1) = EXP(0.8) = 2.23 • Interpretation?
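
The arithmetic on slides 25 and 26 can be reproduced as follows. The coefficients are taken as printed on slide 25 (ß0 = -0.171, ß1 = 0.8); the variable and function names are illustrative assumptions.

```python
import math

b0 = -0.171   # intercept as printed on slide 25
b1 = 0.8      # coefficient for smoking (0 = non-smoker, 1 = smoker)

odds_non_smoker = math.exp(b0)        # smoking = 0
odds_smoker = math.exp(b0 + b1)       # smoking = 1
odds_ratio = odds_smoker / odds_non_smoker   # equals exp(b1)

def probability(odds):
    """Convert odds back to a probability."""
    return odds / (1.0 + odds)

print(f"odds non-smoker = {odds_non_smoker:.3f}")
print(f"odds smoker     = {odds_smoker:.3f}")
print(f"odds ratio      = {odds_ratio:.2f}")
print(f"p(heart attack | non-smoker) = {probability(odds_non_smoker):.3f}")
print(f"p(heart attack | smoker)     = {probability(odds_smoker):.3f}")
```

One way to word the interpretation: the odds of a heart attack for smokers are about 2.23 times the odds for non-smokers. This is a ratio of odds, not of probabilities.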

  27. Hypothesis testing: statistical difference between smokers and non-smokers • Wald test • 95% CI of the odds ratio • Likelihood-ratio test (see M2-HC7 diagnosis)

  28. Wald test = (b / SE(b))² • Chi-square distributed with one degree of freedom • (0.7997 / 0.2454)² ≈ 10.62
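
A small sketch of the Wald test and the 95% CI of the odds ratio, using the coefficient and standard error shown on this slide. The p-value uses the fact that a chi-square variable with one degree of freedom is a squared standard normal; the multiplier 1.96 for the 95% interval is the usual normal approximation and is an assumption here, as are the variable names.

```python
import math

b = 0.7997     # regression coefficient for smoking (from the slide)
se = 0.2454    # its standard error (from the slide)

# Wald statistic: chi-square distributed with 1 degree of freedom
wald = (b / se) ** 2

# Upper-tail probability of chi-square(1): P(|Z| > sqrt(wald))
p_value = math.erfc(math.sqrt(wald / 2.0))

# 95% confidence interval of the odds ratio: exp(b +/- 1.96 * SE)
z = 1.96
odds_ratio = math.exp(b)
ci_low = math.exp(b - z * se)
ci_high = math.exp(b + z * se)

print(f"Wald = {wald:.2f}, p = {p_value:.4f}")
print(f"OR = {odds_ratio:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

If the interval does not include 1, the difference between smokers and non-smokers is statistically significant at the 5% level, which agrees with the Wald test.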

  29. Example: Binary outcome (heart attack) and one binary predictor (smoking)

  30. Testing the regression coefficient • Likelihood-ratio test: • The -2 log likelihood of the model with the determinant is compared with the -2 log likelihood of the model without the determinant • The difference is chi-square distributed • The number of degrees of freedom equals the difference in the number of variables between the two models
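
The likelihood-ratio test can be sketched for one binary determinant. The counts below are hypothetical, and for a single binary determinant the fitted probabilities are simply the observed proportions, so no iterative fitting is needed in this sketch.

```python
import math

# Hypothetical counts: (number of subjects, number of events) per group.
unexposed = (100, 20)
exposed = (100, 35)

def minus_2_log_likelihood(groups, probabilities):
    """-2 log likelihood of binomial data given a probability per group."""
    ll = 0.0
    for (n, y), p in zip(groups, probabilities):
        ll += y * math.log(p) + (n - y) * math.log(1.0 - p)
    return -2.0 * ll

groups = [unexposed, exposed]

# Model without the determinant: one overall probability for everyone.
total_n = sum(n for n, _ in groups)
total_y = sum(y for _, y in groups)
p_overall = total_y / total_n
m2ll_without = minus_2_log_likelihood(groups, [p_overall, p_overall])

# Model with the determinant: each group has its own probability.
m2ll_with = minus_2_log_likelihood(groups, [y / n for n, y in groups])

lr_statistic = m2ll_without - m2ll_with   # chi-square, df = 1 extra variable
p_value = math.erfc(math.sqrt(lr_statistic / 2.0))  # valid for df = 1 only

print(f"-2LL without = {m2ll_without:.2f}, -2LL with = {m2ll_with:.2f}")
print(f"LR chi-square = {lr_statistic:.2f} (df = 1), p = {p_value:.4f}")
```

The p-value shortcut via `math.erfc` only holds for one degree of freedom; with more added variables a general chi-square distribution function is needed.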

  31. Logistic regression with categorical predictor Analysis of three groups

  32. Frequency of ‘recovery’

      group          recovery: yes   recovery: no
      medication1    35              65
      medication2    40              60
      placebo        20              80

  33. What to do? • We compare both medication groups with the placebo group using dummy variables
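
A minimal sketch of the dummy coding, using the recovery frequencies from slide 32 with placebo as the reference group. Because the model has one parameter per group, the coefficients follow directly from the observed odds; the function and variable names are illustrative assumptions.

```python
import math

# Recovery frequencies from slide 32: group -> (recovered, not recovered)
table = {
    "medication1": (35, 65),
    "medication2": (40, 60),
    "placebo": (20, 80),
}
reference = "placebo"

def dummy_code(group):
    """Two dummy variables; the reference group scores 0 on both."""
    return (1 if group == "medication1" else 0,
            1 if group == "medication2" else 0)

def log_odds(group):
    recovered, not_recovered = table[group]
    return math.log(recovered / not_recovered)

# ln(odds of recovery) = b0 + b1 * dummy1 + b2 * dummy2
b0 = log_odds(reference)
b1 = log_odds("medication1") - b0
b2 = log_odds("medication2") - b0

for group in table:
    print(group, "dummies =", dummy_code(group))
print(f"OR medication1 vs placebo = {math.exp(b1):.2f}")
print(f"OR medication2 vs placebo = {math.exp(b2):.2f}")
```

Each odds ratio compares one medication group with placebo; comparing medication1 with medication2 directly would require a different reference group.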

  34. Logistic regression analysis can also be used to analyse the relationship between a continuous variable and a binary outcome

  35. Logistic regression analysis with a continuous variable • Relation between age and pain (no/yes) • The effect size is the odds ratio for a change of one unit in the determinant
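
To show what "odds ratio per unit" means for a continuous determinant, the sketch below uses a hypothetical coefficient for age; the value 0.05 is an assumption for illustration and does not come from the presentation.

```python
import math

b_age = 0.05   # hypothetical coefficient: change in ln(odds) of pain per year

def odds_ratio_for_change(b, units):
    """Odds ratio for a change of `units` in the determinant: exp(b * units)."""
    return math.exp(b * units)

or_1_year = odds_ratio_for_change(b_age, 1)
or_10_years = odds_ratio_for_change(b_age, 10)

print(f"OR per 1 year   = {or_1_year:.3f}")
print(f"OR per 10 years = {or_10_years:.3f}")
```

The odds ratio for a 10-unit change is the one-unit odds ratio raised to the power 10, not 10 times the one-unit odds ratio, because the model is linear on the ln(odds) scale.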

  36. Linearity check • Similar to linear regression analysis • No scatter plot, but a histogram • Options: adding a quadratic term, or splitting the exposure variable into groups • Be careful: do not use OR, but !

  37. Questions ?
