
Linear Models III Thursday May 31, 10:15-12:00

Deborah Rosenberg, PhD, Research Associate Professor, Division of Epidemiology and Biostatistics, University of IL School of Public Health. Training Course in MCH Epidemiology. Ordinal and Nominal Outcomes: Outcomes with More than 2 Categories.



  1. Linear Models III • Thursday May 31, 10:15-12:00 • Deborah Rosenberg, PhD • Research Associate Professor • Division of Epidemiology and Biostatistics • University of IL School of Public Health • Training Course in MCH Epidemiology

  2. Ordinal and Nominal Outcomes • Outcomes with More than 2 Categories • Examples of outcomes that might be suited to ordinal or nominal regression: • Ordinal or nominal BMI categories • Nominal cause-of-death categories • Ordinal or nominal severity-of-illness categories • Ordinal or nominal categories of program participation

  3. Ordinal and Nominal Outcomes • The Cumulative Logit Model • The primary motivation for using a logistic model with an ordinal outcome is to accommodate a truly ordinal variable that has "ceiling" and "floor" effects and whose intervals between response categories can be somewhat arbitrary; that is, it is not a continuous variable. • Modeling an ordinal outcome as a continuous variable can yield biased results, in part because the model can produce predicted values outside the range of the ordinal variable.

  4. Ordinal and Nominal Outcomes • The Cumulative Logit Model • An ordered outcome may reflect an underlying continuous variable for which we have no data or for which we don't know the "real" threshold values. • For example, a Likert scale for satisfaction—very dissatisfied to very satisfied—or for agreement—strongly disagree to strongly agree—has response categories reflecting a continuous scale for which there is no data.

  5. Modeling Ordinal Outcomes • Some other ordinal variables may reflect an underlying continuous construct that cannot be measured as such. The ordered values are intended to reflect distinct threshold values. • Examples of ordinal variables of this type: • access to care index • reports of experience of life stress • assessment of overall health status • satisfaction with care

  6. Ordinal and Nominal Outcomes • The Cumulative Logit Model • To appropriately model an outcome as ordinal with a cumulative logit model, the proportional odds assumption must hold. • The proportional odds assumption: • if an independent variable increases (or decreases) the odds of being in category 1 v. the remaining categories, then it similarly increases (or decreases) the odds of being in categories 1 and 2 combined v. the remaining categories, in categories 1, 2, and 3 combined v. the remaining categories, etc.

  7. Ordinal and Nominal Outcomes • The Cumulative Logit Model • The null hypothesis for the proportional odds assumption is that the odds ratios for the association between a risk factor and an ordinal outcome are constant regardless of how the category boundaries are drawn. • If the proportional odds assumption holds, then the association between an independent variable and the outcome can be expressed as a single summary estimate—a common odds ratio—across all categories.

  8. Ordinal and Nominal Outcomes • The Cumulative Logit Model • The proportional odds assumption can be tested with a chi-square statistic (a score test). A nonsignificant result means that the null hypothesis is not rejected and that a cumulative logit model is reasonable; a significant result means that the proportional odds assumption may not hold.

  9. Ordinal and Nominal Outcomes • The Cumulative Logit Model: • For an ordered outcome with k categories • Both the numerator and denominator change • http://www.indiana.edu/%7Estatmath/stat/all/cat/2b1.html

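  10. Ordinal and Nominal Outcomes • Cumulative odds among the exposed, for outcome counts a, b, c, d in categories 1 to 4: • Category 1 v. categories 2, 3, 4: Odds = a / (b+c+d) • Categories 1, 2 v. categories 3, 4: Odds = (a+b) / (c+d) • Categories 1, 2, 3 v. category 4: Odds = (a+b+c) / d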
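The cumulative odds can also be written in general notation. The symbols Y (the ordered outcome), j (the cutpoint), and k (the number of categories) are introduced here for illustration and are not taken from the slides:

```latex
\[
\text{odds}(Y \le j)
  = \frac{P(Y \le j)}{P(Y > j)}
  = \frac{P(Y=1) + \cdots + P(Y=j)}{P(Y=j+1) + \cdots + P(Y=k)},
\qquad j = 1, \ldots, k-1 .
\]
```

With k = 4 and counts a, b, c, d among the exposed, the three cutpoints give a / (b+c+d), (a+b) / (c+d), and (a+b+c) / d, so both the numerator and the denominator change from one cutpoint to the next.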

  11. Ordinal and Nominal Outcomes • The Cumulative Logit Model • Given k categories of an ordered outcome variable, a cumulative logit model yields k-1 intercept terms. Each intercept corresponds to a category combined with all adjacent lower-ordered categories. • Since proportional odds are assumed, and therefore a common odds ratio, the effect of each covariate is reflected in a single beta coefficient.

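  12. Ordinal and Nominal Outcomes • The Cumulative Logit Model • Suppose an outcome variable has 4 categories and we are modeling one independent variable x. The cumulative logit model consists of three equations, one per cutpoint, that share a single slope: • ln(odds of category 1 v. categories 2, 3, 4) = b0,1 + b1 x • ln(odds of categories 1, 2 v. categories 3, 4) = b0,12 + b1 x • ln(odds of categories 1, 2, 3 v. category 4) = b0,123 + b1 x • The odds ratio is the same regardless of cutpoint: OR = exp(b1)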
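A small numeric sketch may help show why a single slope implies a common odds ratio. The intercept and slope values below are hypothetical, chosen only for illustration, and the function names are not from the slides:

```python
import math

def cumulative_probs(intercepts, slope, x):
    """Return P(Y <= j) at each cutpoint j under a cumulative logit model."""
    return [1.0 / (1.0 + math.exp(-(b0 + slope * x))) for b0 in intercepts]

def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

# Hypothetical values for illustration only (not estimates from the slides).
intercepts = [-2.0, -0.5, 1.0]   # b0,1  b0,12  b0,123 (increasing by cutpoint)
slope = 0.4                      # b1, shared by all three equations

exposed = cumulative_probs(intercepts, slope, x=1)
unexposed = cumulative_probs(intercepts, slope, x=0)

# Odds ratio (exposed v. unexposed) at each of the three cutpoints.
odds_ratios = [odds(pe) / odds(pu) for pe, pu in zip(exposed, unexposed)]
print([round(r, 4) for r in odds_ratios], round(math.exp(slope), 4))
```

The cumulative probabilities differ by cutpoint because the intercepts differ, but the odds ratio at every cutpoint equals exp(b1).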

  13. Ordinal and Nominal Outcomes • A stratified approach to mimic a cumulative logit model for a 4-category variable would mean creating new dichotomous variables, something like the following:
        /* category 1 v. categories 2, 3, 4 */
        if ordvar = 1 then ordvar1 = 1;
        else if ordvar ^= . then ordvar1 = 0;
        /* categories 1, 2 v. categories 3, 4 */
        if 1<=ordvar<=2 then ordvar2 = 1;
        else if ordvar ^= . then ordvar2 = 0;
        /* categories 1, 2, 3 v. category 4 */
        if 1<=ordvar<=3 then ordvar3 = 1;
        else if ordvar ^= . then ordvar3 = 0;

  14. Ordinal and Nominal Outcomes • Mimicking Cumulative Logit with Binary Logistic Models
        proc logistic;
          model ordvar1 = factors;
        run;
        proc logistic;
          model ordvar2 = factors;
        run;
        proc logistic;
          model ordvar3 = factors;
        run;
      • The OR from each model will be approx. the same if the proportional odds assumption holds. • Note that all observations are used in each model.

  15. Ordinal and Nominal Outcomes • The Cumulative Logit Model • If the proportional odds assumption does not hold, it might be because the outcome variable is nominal rather than ordinal, or it might be that we have mis-specified the categories, failing to pinpoint important thresholds on the underlying continuum. • The score test is quite sensitive—it is up to the analyst to examine the pattern of ORs for different dichotomous cutpoints and decide whether it is reasonable to use a cumulative logit model.

  16. Ordinal and Nominal Outcomes • The Generalized Logit Model • In contrast to the cumulative logit model, in a generalized logit model, the outcome categories are like dummy variables—mutually exclusive categories compared to a common reference group.

  17. Ordinal and Nominal Outcomes • The Generalized Logit Model: • For a nominal outcome with k categories • Fixed denominator (reference category) • http://www.indiana.edu/%7Estatmath/stat/all/cat/2b1.html

  18. Ordinal and Nominal Outcomes • Odds Among the exposed = a / d • Odds Among the exposed = b / d • Odds Among the exposed = c / d
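In general notation, the generalized logit odds compare each category with one fixed reference category. The symbols Y, j, and k are introduced here for illustration, with the last category k taken as the reference:

```latex
\[
\text{odds}_j = \frac{P(Y = j)}{P(Y = k)},
\qquad j = 1, \ldots, k-1 .
\]
```

With k = 4 and counts a, b, c, d among the exposed, this gives a / d, b / d, and c / d: only the numerator changes, and the denominator stays fixed at the reference category.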

  19. Ordinal and Nominal Outcomes • The Generalized Logit Model • Given k categories of an outcome variable, a generalized logit model yields k-1 intercept terms. Each intercept corresponds to a single category. • Since proportional odds are not assumed, odds ratios can vary across categories, and therefore the effect of each covariate is reflected in k-1 slope parameters.

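  20. Ordinal and Nominal Outcomes • The Generalized Logit Model • Suppose an outcome variable has 4 categories, with category 4 as the reference, and we are modeling one independent variable x. The generalized logit model consists of three equations, each with its own intercept and slope: • 1. ln(odds of category 1 v. category 4) = b0,1 + b1,1 x • 2. ln(odds of category 2 v. category 4) = b0,2 + b1,2 x • 3. ln(odds of category 3 v. category 4) = b0,3 + b1,3 x • The odds ratios are distinct for each category: OR1 = exp(b1,1), OR2 = exp(b1,2), OR3 = exp(b1,3)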
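The following sketch shows how category probabilities and category-specific odds ratios follow from a generalized logit model with category 4 as the reference. The coefficient values are hypothetical and the function name is an illustration, not something from the slides:

```python
import math

def glogit_probs(intercepts, slopes, x):
    """Category probabilities under a generalized logit model.

    The last category is the reference; its linear predictor is fixed at 0.
    """
    etas = [b0 + b1 * x for b0, b1 in zip(intercepts, slopes)] + [0.0]
    denom = sum(math.exp(e) for e in etas)
    return [math.exp(e) / denom for e in etas]

# Hypothetical values for illustration only (not estimates from the slides).
intercepts = [-3.0, -2.0, -1.0]   # b0,1  b0,2  b0,3
slopes = [0.7, 0.2, -0.3]         # b1,1  b1,2  b1,3

p_exp = glogit_probs(intercepts, slopes, x=1)
p_unexp = glogit_probs(intercepts, slopes, x=0)

# Odds of category j v. reference category 4, exposed relative to unexposed.
odds_ratios = [(p_exp[j] / p_exp[3]) / (p_unexp[j] / p_unexp[3]) for j in range(3)]
print([round(r, 4) for r in odds_ratios])
print([round(math.exp(b), 4) for b in slopes])
```

Each odds ratio equals the exponentiated slope of its own equation, so the three odds ratios are free to differ in size and direction.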

  21. Ordinal and Nominal Outcomes • The Generalized Logit Model • Each slope parameter compares the odds of being in one outcome category, rather than the reference category, between levels of the independent variable. • Compared to those without Factor A, individuals with Factor A have ___ times the odds of being in outcome category 1 rather than the reference category; • Compared to those without Factor A, individuals with Factor A have ___ times the odds of being in outcome category 2 rather than the reference category; • Compared to those without Factor A, individuals with Factor A have ___ times the odds of being in outcome category 3 rather than the reference category.

  22. Ordinal and Nominal Outcomes • A stratified approach to mimic a generalized logit model for a 4-category variable would not require creating new variables, but would mean running models like the following:

  23. Ordinal and Nominal Outcomes • Mimicking Generalized Logit with Binary Logistic Models
        proc logistic;
          where ordvar in(1,4);
          model ordvar = factors;
        run;
        proc logistic;
          where ordvar in(2,4);
          model ordvar = factors;
        run;
        proc logistic;
          where ordvar in(3,4);
          model ordvar = factors;
        run;
      • The ORs from the models will differ. • Note that different subsets of observations are used in each model.

  24. Example 1. • The Association of Smoking and Fetal/Infant Death in Preterm Deliveries • Crude OR = 1.07

  25. Example 1. • The Association of Smoking and Fetal/Infant Death in Preterm Deliveries • Crude Logistic Model with Dichotomous Outcome

  26. Example 1. • Cumulative Logit: Odds of type of death among smokers and the OR for smoker v. nonsmoker • Fetal death v. neonatal death or survivor: Odds = 46 / (33+1135) = 0.04, OR = 1.04 • Fetal or neonatal death v. survivor: Odds = (46+33) / 1135 = 0.07, OR = 1.07
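The odds and odds ratios on this slide can be reproduced by hand. The smoker counts (46, 33, 1135) come from the slide itself. The nonsmoker counts are an assumption: they are obtained by subtracting the smoker counts from the outcome frequencies printed in the SAS output for this example (332, 229, 8520), on the assumption that those frequencies cover smokers and nonsmokers together.

```python
# Smoker counts are taken from the odds shown on the slide.
smokers = {"fetal": 46, "neonatal": 33, "survivor": 1135}

# ASSUMPTION: nonsmoker counts are derived by subtraction from the total
# outcome frequencies in the SAS output (332, 229, 8520); the slides do not
# list them directly.
nonsmokers = {"fetal": 332 - 46, "neonatal": 229 - 33, "survivor": 8520 - 1135}

def cumulative_odds(counts):
    """Odds at each cutpoint: fetal v. rest, then fetal + neonatal v. survivor."""
    f, n, s = counts["fetal"], counts["neonatal"], counts["survivor"]
    return [f / (n + s), (f + n) / s]

odds_smokers = cumulative_odds(smokers)
odds_nonsmokers = cumulative_odds(nonsmokers)
odds_ratios = [a / b for a, b in zip(odds_smokers, odds_nonsmokers)]

print([round(o, 2) for o in odds_smokers])   # [0.04, 0.07]
print([round(r, 2) for r in odds_ratios])    # [1.04, 1.07]
```

Under that assumption the two crude cumulative odds ratios match the values on the slide, and they are close to each other, which is what the proportional odds assumption requires.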

  27. Example 1. • Cumulative Logit Model with 3 Categories
        Ordered Value   outcome5                     Frequency
        1               fetal death >=20 wks               332
        2               neonatal death 0-28 days           229
        3               survivor >=28 days                8520
      • Probabilities modeled are cumulated over the lower Ordered Values.
      • Score Test for the Proportional Odds Assumption
        Chi-Square   DF   Pr > ChiSq
        0.0400        1   0.8414
      • The proportional odds assumption holds.
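As a rough check on the score test output, the p-value for a chi-square statistic with 1 degree of freedom can be computed with the standard library alone. The function name below is hypothetical, and the identity used holds only for 1 df, which is the DF shown in the output:

```python
import math

def chi2_pvalue_1df(chi_square):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom.

    For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi_square / 2.0))

# Chi-square value as printed in the score test output above.
p = chi2_pvalue_1df(0.0400)
print(round(p, 4))
```

The result may differ from the printed 0.8414 in the last digit because the printed chi-square statistic is itself rounded. A p-value this large gives no evidence against proportional odds.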

  28. Example 1. • Cumulative Logit: Each intercept corresponds to a category plus all categories with lower ordered values v. the remaining categories. • The odds ratio is an ‘average’ of the cumulative logits • 46 / (33+1135) = e^(-3.2803+0.0635) = 0.04 • (46+33) / 1135 = e^(-2.7291+0.0635) = 0.07
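The exponentials on this slide can be checked directly. The sketch below uses the estimates exactly as printed; reading 0.0635 as the common log odds ratio for smoking is an interpretation based on how the slide adds it to each intercept:

```python
import math

# Estimates as printed on the slide.
intercept_1 = -3.2803   # fetal death v. (neonatal death + survivor)
intercept_2 = -2.7291   # (fetal + neonatal death) v. survivor
slope = 0.0635          # interpreted here as the common log odds ratio for smoking

odds_1 = math.exp(intercept_1 + slope)   # modeled odds among smokers, cutpoint 1
odds_2 = math.exp(intercept_2 + slope)   # modeled odds among smokers, cutpoint 2
common_or = math.exp(slope)              # single summary odds ratio

print(round(odds_1, 2), round(odds_2, 2), round(common_or, 2))   # 0.04 0.07 1.07
```

The modeled odds agree with the observed odds 46 / (33+1135) and (46+33) / 1135 to two decimals, and the common odds ratio of about 1.07 matches the crude OR reported earlier for this example.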

  29. Example 1. • Generalized Logit Model with 3 Categories • In a generalized logit model, each intercept and slope correspond to a single category. • Is 1.07 a reasonable summary of 1.047 and 1.096?
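The two category-specific odds ratios quoted above can be approximated from crude counts, each death category being compared with survivors only. Smoker counts (46, 33, 1135) appear on the cumulative odds slide for this example. The nonsmoker counts are an assumption, derived by subtracting the smoker counts from the printed outcome frequencies (332, 229, 8520):

```python
# Smoker counts from the slides; nonsmoker counts are an ASSUMPTION obtained
# by subtraction from the total outcome frequencies (332, 229, 8520).
smokers = {"fetal": 46, "neonatal": 33, "survivor": 1135}
nonsmokers = {"fetal": 332 - 46, "neonatal": 229 - 33, "survivor": 8520 - 1135}

def odds_vs_reference(counts, category, reference="survivor"):
    """Odds of one outcome category relative to the reference category."""
    return counts[category] / counts[reference]

or_fetal = odds_vs_reference(smokers, "fetal") / odds_vs_reference(nonsmokers, "fetal")
or_neonatal = odds_vs_reference(smokers, "neonatal") / odds_vs_reference(nonsmokers, "neonatal")

print(round(or_fetal, 3), round(or_neonatal, 3))
```

Under that assumption the results land close to the 1.047 and 1.096 on the slide. Because both values sit near the common odds ratio of 1.07, a single summary estimate loses little information here.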

  30. Example 2. • The Association of Maternal Risk and Fetal/Infant Death in Preterm Deliveries

  31. Example 2. • The Association of Maternal Risk and Fetal/Infant Death in Preterm Deliveries • Crude Logistic Model with Dichotomous Outcome

  32. Example 2. • Cumulative Logit Model with 3 Categories
        Ordered Value   outcome5                     Frequency
        1               fetal death >=20 wks               418
        2               neonatal death 0-28 days           261
        3               survivor >=28 days                9549
      • Probabilities modeled are cumulated over the lower Ordered Values.
      • Score Test for the Proportional Odds Assumption
        Chi-Square   DF   Pr > ChiSq
        10.7077       1   0.0011
      • The proportional odds assumption does not hold.

  33. Example 2. • Cumulative Logit Model with 3 Categories • The odds ratio is an ‘average’ of the cumulative logits • e^(-3.1750+0.0473) = 0.04 • e^(-2.6629+0.0473) = 0.07

  34. Example 2. • Generalized Logit Model with 3 Categories • Is 1.048 a reasonable summary of 0.86 and 1.5?

  35. Example 3. LBW • Modeling a 3-category birthweight variable:
        /* cumulative logit */
        proc logistic order=formatted;
          model bwcat = smoking late_no_pnc;
        run;

  36. Example 3. LBW

  37. Example 3. LBW
        /* mimicking cumulative logit with binary models */
        /* vlbw v. mlbw and normal */
        proc logistic order=formatted;
          model vlbw = smoking late_no_pnc;
        run;
        /* vlbw and mlbw v. normal */
        proc logistic order=formatted;
          model lbw = smoking late_no_pnc;
        run;
      • Both models include all observations in the sample

  38. Example 3. LBW
        /* generalized logit */
        proc logistic order=formatted;
          model bwcat(ref='normal bw') = smoking late_no_pnc
                / link=glogit;
        run;

  39. Example 3. LBW • vlbw v. normal and mlbw v. normal

  40. Example 3. LBW
        /* mimicking generalized logit with binary models */
        proc logistic order=formatted;
          where bwcat = 2 or bwcat = 0;
          model bwcat(ref='normal bw') = smoking late_no_pnc
                / link=glogit;
        run;
        proc logistic order=formatted;
          where bwcat = 1 or bwcat = 0;
          model bwcat(ref='normal bw') = smoking late_no_pnc
                / link=glogit;
        run;

  41. Example 3. LBW • Generalized logit approach using binary models with only a subset of observations in each model • vlbw v. normal • mlbw v. normal

  42. Example 3. LBW • Generalized logit models can get complicated, but custom estimates can still be obtained in the usual way.
        proc logistic order=formatted;
          where 2<=momage<=3;
          class parityrisk(ref='no hx preterm') / param=ref;
          model bwcat = smoking late_no_pnc matrisk momage
                parityrisk smoking*parityrisk / link=glogit;
          contrast 'sm-risk, hxpreterm' smoking 1 matrisk 1
                   smoking*parityrisk 1 0 / estimate=exp;
          contrast 'sm-risk, primips' smoking 1 matrisk 1
                   smoking*parityrisk 0 1 / estimate=exp;
          contrast 'sm-risk, lorisk multips' smoking 1 matrisk 1
                   smoking*parityrisk 0 0 / estimate=exp;
        run;
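What each contrast estimates can be sketched as follows. The beta symbols are introduced here for illustration. The sketch assumes that smoking and matrisk are coded 0/1, that the two smoking*parityrisk columns are ordered as the contrast labels suggest (history of preterm birth first, primips second), and that the expression applies separately to each non-reference birthweight category:

```latex
\[
\begin{aligned}
\text{hx preterm:} \quad
  & \exp\left(\beta_{\text{smoking}} + \beta_{\text{matrisk}}
      + \beta_{\text{smoking}\times\text{hxpreterm}}\right) \\
\text{primips:} \quad
  & \exp\left(\beta_{\text{smoking}} + \beta_{\text{matrisk}}
      + \beta_{\text{smoking}\times\text{primips}}\right) \\
\text{low-risk multips:} \quad
  & \exp\left(\beta_{\text{smoking}} + \beta_{\text{matrisk}}\right)
\end{aligned}
\]
```

The low-risk multips contrast has no interaction term because that group is the reference level of parityrisk, so both interaction coefficients are multiplied by 0.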

  43. Example 3. LBW • The tests for the constructs in the model are all statistically significant:

  44. Example 3. LBW • Not all beta coefficients are statistically significant.

  45. Example 3. LBW • Parity-specific contrasts of the joint effect of smoking and having some antepartum medical risk, adjusting for entry into prenatal care and maternal age. • Should we leave the smoking*parityrisk term in the model?

  46. Example 4. Prenatal Care Should we consider the categories ordinal or nominal?

  47. Example 4. Prenatal Care • The overlapping dichotomous contrasts: • No PNC v. Any PNC, OR = 3.2 • Inad/No v. Adeq+/Adeq/Inter, OR = 2.7 • Inter/Inad/No v. Adeq+/Adeq, OR = 1.8 • All others v. Adeq+, OR = 0.60

  48. Example 4. Prenatal Care Non-overlapping dichotomous contrasts:

  49. Example 4. Prenatal Care Cumulative Logit: The null hypothesis of proportional odds is rejected. Any association is obscured by averaging across levels of APNCU.

  50. Example 4. Prenatal Care • Generalized Logit
