Conditional Stereotype Logistic Regression: A new estimation command
Rob Woodruff, Battelle Memorial Institute, Health & Analytics (Email: woodruffr@battelle.org)
Cynthia Ferre, Centers for Disease Control and Prevention
Overview
• What is it?
  - Stereotype Logistic Regression
  - Conditional on what?
• What's it good for?
• Syntax and Examples
Constrained Multinomial Logistic Regression
• Multinomial Model
  - Categorical outcome variable
  - Vector of explanatory variables
  - Related through m logits, one for each non-reference outcome category
Constrained Multinomial (continued)
• The stereotype model constrains each category's coefficient vector to be a scalar multiple (phi) of one common coefficient vector
• Note: The phi's are scalar quantities
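The formulas for these two slides did not survive in the text, so the following is a reconstruction in standard notation rather than the authors' own display. It assumes outcome categories 0, ..., m with category 0 as the reference, a covariate vector x of length p, and the usual stereotype constraint; the symbols alpha_k, beta_k, phi_k and beta are assumed names.

```latex
% Assumed notation: categories 0..m (0 = reference), x is a p-vector of covariates.
\[
\begin{aligned}
\log \frac{P(Y = k \mid x)}{P(Y = 0 \mid x)} &= \alpha_k + \beta_k^{\top} x,
  & k &= 1, \dots, m && \text{(multinomial logits)} \\
\beta_k &= \phi_k \, \beta,
  & k &= 1, \dots, m && \text{(stereotype constraint)}
\end{aligned}
\]
```

One phi has to be fixed for identifiability. Fixing the phi of the highest category at 1 would match the later example, where exp(beta) is read directly as the odds ratio for the <32 weeks category, but that convention is inferred here, not stated on the slides.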
It's all about the phi's
• Full multinomial model has m(p+1) parameters
• Stereotype model has (m-1) + m + p = 2m-1+p parameters
• The phi parameters give a way to quantify the ordinality of the outcome variable: if the estimated phi's are monotonically ordered across the outcome categories, then we have evidence of an ordinal effect
• They also allow tests of distinguishability of outcome categories
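The parameter counts above are easy to check numerically. This is a small sketch; the function names are made up for illustration, and reading "m-1" as the number of free phi's (one phi fixed for identifiability) is an interpretation of the slide. The values m = 3 and p = 3 are hypothetical.

```python
def multinomial_param_count(m: int, p: int) -> int:
    """Full multinomial model: each of the m logits has an intercept and p slopes."""
    return m * (p + 1)


def stereotype_param_count(m: int, p: int) -> int:
    """Stereotype model: m intercepts, m - 1 free phi's, and p common slopes."""
    return m + (m - 1) + p


# Hypothetical sizes: 3 non-reference categories, 3 covariates.
m, p = 3, 3
print(multinomial_param_count(m, p))  # 12
print(stereotype_param_count(m, p))   # 8
```

The saving grows with p: each extra covariate costs m parameters in the full model but only one in the stereotype model.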
So what's the condition?
• The multinomial and stereotype logistic regression models are implemented in Stata by mlogit and slogit
• Both assume independence of observations, which does not hold for matched case-control data
• In a matched case-control study, only the matched groups (strata, panels, clusters, etc.) are independent
• For 1:M matching, condition on the stratum total for the outcome variable and work with the conditional likelihood instead
• Do I have to? Why condition on this particular event?
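To make the conditioning idea concrete, here is a rough sketch of what one matched set might contribute to a conditional log-likelihood. It assumes exactly one case per stratum, controls in category 0 with phi fixed at 0, and stratum-specific intercepts that cancel once we condition on the stratum's outcome total. The function name and argument layout are hypothetical, and this is not necessarily how cstereo computes its likelihood.

```python
import math


def stratum_cond_loglik(case_category, case_x, control_xs, phi, beta):
    """Conditional log-likelihood contribution of one 1:M matched set.

    Assumes phi[0] == 0 for controls, so only the case category's phi appears:
    contribution = s(case) - log(sum over all members j of exp(s(j))),
    where s(j) = phi[case_category] * (beta . x_j).
    """
    def score(x):
        return phi[case_category] * sum(b * xi for b, xi in zip(beta, x))

    scores = [score(case_x)] + [score(x) for x in control_xs]
    top = max(scores)  # log-sum-exp shift for numerical stability
    log_denominator = top + math.log(sum(math.exp(s - top) for s in scores))
    return scores[0] - log_denominator


# Hypothetical 1:2 matched set with a single covariate.
phi = [0.0, 0.3, 0.6, 1.0]  # illustrative values only
print(stratum_cond_loglik(3, [18.0], [[25.0], [31.0]], phi, beta=[-0.01]))
```

When every member of a matched set has the same covariates, the contribution reduces to -log(M+1): the data say nothing about which member is the case.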
CSTEREO: the cstereo command
Basic syntax:
. cstereo depvar indepvars [if] [in], group(varname) [options]
Example with Real Data: Preterm Birth and Vitamin D
• 1:2 (some 1:1) pooled, matched case-control study of 2,583 mothers in 870 matched groups
• A case is defined as gestational age at delivery of <37 weeks; the outcome variable outcome4 is coded 3 (<32 weeks), 2 (32-35 weeks), 1 (36 weeks), and 0 (control: 37+ weeks)
• Primary exposure variable of interest: Vitamin D level, ohd25_total: blood serum concentration of 25(OH)D in ng/ml
• Sample of other covariates measured:
  - edu = 0/1 indicator of post-high school education
  - vitamin = 0/1 indicator of vitamin use during pregnancy
Interpretation of cstereo output:
• Estimated beta coefficient of ohd25_total = -0.0074 with 95% confidence interval (-0.0358, 0.0210)
• Odds ratio of being in the <32 weeks case category compared to control is exp(-0.0074) = 0.993, 95% C.I. (0.965, 1.021)
• For the odds ratios of the 32-35 weeks and 36 week case categories, we need the products of the parameters (each category's phi times the beta coefficient)
• For standard errors, use the Delta Method via nlcom
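The odds ratio and its interval above come from exponentiating the coefficient and the two confidence limits. A quick check of that arithmetic, using only the numbers reported on the slide (the variable names are illustrative):

```python
import math

# Coefficient of ohd25_total and its 95% confidence limits, as reported above.
beta, beta_lo, beta_hi = -0.0074, -0.0358, 0.0210

# Exponentiate the estimate and both limits to move to the odds-ratio scale.
odds_ratio, or_lo, or_hi = (math.exp(v) for v in (beta, beta_lo, beta_hi))
print(f"{odds_ratio:.3f} ({or_lo:.3f}, {or_hi:.3f})")  # 0.993 (0.965, 1.021)
```

Because the interval contains 1, this estimate on its own does not show an association between 25(OH)D level and the <32 weeks category.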
Interpretation continued:
• Exponentiating the product of the parameters gives the odds ratio of being in the 32-35 weeks case category compared to controls: 0.994 with a 95% C.I. of (0.983, 1.004)
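For readers who want to see what the delta method does for a product of two parameters, here is a minimal sketch. It uses the first-order approximation Var(phi*beta) ≈ beta^2 Var(phi) + phi^2 Var(beta) + 2 phi beta Cov(phi, beta). The function names are hypothetical, and every input except the beta estimate is a placeholder, since the slides do not report the phi estimates or the covariance matrix.

```python
import math


def product_se(phi, beta, var_phi, var_beta, cov_phi_beta):
    """First-order delta-method standard error of g = phi * beta."""
    variance = (beta ** 2) * var_phi + (phi ** 2) * var_beta \
        + 2 * phi * beta * cov_phi_beta
    return math.sqrt(variance)


def odds_ratio_ci(phi, beta, var_phi, var_beta, cov_phi_beta, z=1.96):
    """Odds ratio exp(phi * beta) with a Wald interval built on the log scale."""
    estimate = phi * beta
    se = product_se(phi, beta, var_phi, var_beta, cov_phi_beta)
    return tuple(math.exp(v) for v in (estimate, estimate - z * se, estimate + z * se))


# Placeholder inputs for illustration only; these are NOT the study's estimates.
print(odds_ratio_ci(phi=0.8, beta=-0.0074, var_phi=0.04,
                    var_beta=0.0002, cov_phi_beta=0.0))
```

In Stata, nlcom does this calculation from the stored coefficient vector and covariance matrix, so nothing needs to be coded by hand there.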
Constraints:
• Are the 36 week and 32-35 weeks case categories distinguishable?
Constraint Output
• The log-likelihood from the constrained model is -841.145, compared to -841.139 for the unconstrained stereotype model
• The difference of 0.006 gives a chi2 value of 0.012 on 1 degree of freedom
• P-value = 0.91
• The unconstrained stereotype model does not fit significantly better than the constrained model, so the two case categories are indistinguishable
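The likelihood-ratio arithmetic above can be reproduced with the standard library alone. The sketch uses the two log-likelihoods from the slide and the identity that, for a chi-square variable with 1 degree of freedom, P(X > x) = erfc(sqrt(x/2)); the variable names are illustrative.

```python
import math

# Log-likelihoods reported above.
ll_constrained = -841.145
ll_unconstrained = -841.139

# Likelihood-ratio statistic: twice the gain in log-likelihood.
lr_chi2 = 2 * (ll_unconstrained - ll_constrained)

# Upper-tail probability of a chi-square with 1 df: erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(lr_chi2 / 2))
print(f"chi2(1) = {lr_chi2:.3f}, p = {p_value:.2f}")  # chi2(1) = 0.012, p = 0.91
```

The erfc shortcut only works for 1 degree of freedom; a test with more constraints would need a general chi-square tail function.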
Relationship to Other Models for Ordered/Categorical Outcomes
• Constrained multinomial model
• Not as parsimonious as the proportional odds model (ologit), but the proportional odds model is not valid under outcome-dependent sampling
• The adjacent category model is (basically) a constrained stereotype model, and is also valid under outcome-dependent sampling
Limitations
• Convergence issues
• Currently only a one-dimensional stereotype model
• Cannot currently force an ordering on the stereotype parameters
• Additional dependence structure