This presentation explores the application of logistic regression in case-control studies, focusing on the odds of exposure and the odds of disease. References and methodology are discussed to underline the importance and implications of modeling exposure accurately. The text critically examines the use of continuous independent variables and the varying outcomes they can produce. It concludes with insights on modeling exposures individually for a more comprehensive epidemiological approach.
Using Logistic Regression In Case Control Studies • Department of Community Health Sciences: September 27, 2002
No statistics should stand in the way of an experimenter keeping his eyes open, his mind flexible, and on the lookout for surprises. (William Feller)
Background: • Quan H., Arboleda-Florez J., Fick G.H., Stuart H.L., Love E.J. (2002) Association Between Physical Illness and Suicide Among The Elderly. Social Psychiatry and Psychiatric Epidemiology, 37:190-197 • David Adler, Nimira Kanji, Kiril Trpkov, Gordon Fick, Rhiannon M. Hughes. HPC2/ELAC2 Gene Variants Associated with Prostate Cancer (in submission) • The class of MDSC643.02 in the Winter Term 2002
Case Control Studies • Investigator selects cases and controls • Investigator determines exposure • Primary outcome measure: Odds of exposure (yes/no) • The ‘Magic’ Odds Ratio (OR)
Case Control Studies • Two by Two tables • Classical Stratified Analysis (SA) • Stratum specific odds ratios • Crude odds ratio • Mantel-Haenszel odds ratio • EASY…. Right?
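The quantities on this slide can be sketched in a few lines of Python. This is a minimal illustration, not taken from the talk: the counts are hypothetical, the function names are made up for the example, and each table is written as (a, b, c, d) = (exposed cases, exposed controls, unexposed cases, unexposed controls).

```python
def odds_ratio(a, b, c, d):
    """Odds ratio ad/bc for one two-by-two table."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio over a list of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts for two strata (illustration only, not real data).
strata = [(10, 20, 30, 120), (40, 30, 20, 30)]

# Stratum-specific odds ratios, one per table.
stratum_ors = [odds_ratio(*table) for table in strata]

# Crude odds ratio: collapse the strata into a single table first.
collapsed = [sum(cells) for cells in zip(*strata)]
crude_or = odds_ratio(*collapsed)

# Mantel-Haenszel odds ratio: a weighted summary across the strata.
mh_or = mantel_haenszel_or(strata)
```

With these made-up counts the crude odds ratio differs from the common stratum-specific value, which is the situation the stratified analysis is meant to expose.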
A Definition of the Chi-Square test: A procedure any fool can carry out and frequently does. (SJ Penn)
Logistic Regression • 1) Model the log of the odds of exposure • OR • 2) Model the log of the odds of disease • Does it matter? MOST of the time. • Standard Likelihood theory gives us its blessing for option 1)
It does not matter - • IF the model is equivalent to a stratified analysis, • …. then some of the coefficients from LR will be the same as the log(OR) values from SA • …. not all the coefficients will be the same though…
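The simplest case of a model that is equivalent to a stratified analysis is a single two-by-two table with one binary covariate. The sketch below, assuming hypothetical counts and a hand-rolled Newton-Raphson fit (the function name is invented for this example), fits the logistic model both ways: once for the odds of exposure given disease status, and once for the odds of disease given exposure status. The slopes agree with each other and with log(OR); the intercepts do not.

```python
import math


def fit_logistic_two_groups(y1, n1, y0, n0, iterations=25):
    """Newton-Raphson fit of logit(p) = b0 + b1*x to two binomial groups.

    Group x=1 has y1 successes in n1 trials; group x=0 has y0 in n0.
    Returns (b0, b1).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        p1 = 1.0 / (1.0 + math.exp(-(b0 + b1)))
        p0 = 1.0 / (1.0 + math.exp(-b0))
        w1 = n1 * p1 * (1.0 - p1)
        w0 = n0 * p0 * (1.0 - p0)
        # Score vector (gradient of the log-likelihood).
        g0 = (y1 - n1 * p1) + (y0 - n0 * p0)
        g1 = y1 - n1 * p1
        # Information matrix [[h00, h01], [h01, h11]] and its determinant.
        h00, h01, h11 = w1 + w0, w1, w1
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix)^-1 times the score.
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
    return b0, b1


# Hypothetical table: a exposed cases, b exposed controls,
# c unexposed cases, d unexposed controls.
a, b, c, d = 30, 20, 70, 80
log_or = math.log((a * d) / (b * c))

# Option 1: model the odds of exposure, with disease status as the covariate.
exposure_intercept, exposure_slope = fit_logistic_two_groups(a, a + c, b, b + d)

# Option 2: model the odds of disease, with exposure status as the covariate.
disease_intercept, disease_slope = fit_logistic_two_groups(a, a + b, c, c + d)
```

Here `exposure_slope` and `disease_slope` both converge to `log_or`, while `exposure_intercept` is the log odds of exposure among controls and `disease_intercept` is the log odds of being a case among the unexposed, which depends on how many cases and controls the investigator chose to sample.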
Results will differ - • In ALL other situations (at least a little…) • BUT there are those solid papers in the literature that appear to say “it’s OKAY” to model the odds of disease • AND the textbooks and standard references appear to give a ‘green light’ as well
References • Prentice RL and Pyke R (1979) Biometrika 66(3):403-411 • [after some impenetrable mathematics] • “….is precisely the distributional statement that would arise if [a model for the odds of disease] were directly applied to the case-control data” • BUT… BUT… Aren’t we all frequentists?
The books: • Kleinbaum, Kupper, Morgenstern • Rothman and Greenland • Rosner • Matthews and Farewell • They ALL note the estimates are OKAY • They are ALL silent on the sampling distribution. • BUT what about the standard errors? P-values?
It does not follow that if quantitative methods be indiscriminately applied to inexhaustible quantities of data, scientific understanding will necessarily emerge. (M.K. Hubbert)
An exercise that makes no provision for the definition and estimation of error cannot properly be called an experiment. (D.B. DeLury)
Exposures may not be dichotomous • If exposure is ‘measured’, then the model for exposure could be linear regression • There is no ‘obvious’ magical odds ratio now • BUT it is still SO SO tempting to just model the log of odds of disease with a *continuous* independent variable (exposure)
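One way to read "the model for exposure could be linear regression" is to regress the measured exposure on a case indicator (1 for cases, 0 for controls). The sketch below assumes that reading; the exposure values are invented and the function name is hypothetical. In this setup the slope is the difference in mean exposure between cases and controls, and the intercept is the mean exposure among controls.

```python
def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical measured exposures (illustration only, not real data).
case_exposure = [5.1, 6.3, 7.0, 5.6]
control_exposure = [4.0, 4.8, 5.2, 3.9, 4.6]

# Exposure is the dependent variable; case status is the covariate.
case_indicator = [1] * len(case_exposure) + [0] * len(control_exposure)
exposure = case_exposure + control_exposure

slope, intercept = ols_slope_intercept(case_indicator, exposure)

mean_cases = sum(case_exposure) / len(case_exposure)
mean_controls = sum(control_exposure) / len(control_exposure)
```

The summary measure here is a mean difference in exposure rather than an odds ratio, which is one sense in which there is no obvious odds ratio once exposure is continuous.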
The modelling process - • Can lead us in very different ways to very different models and very different conclusions: • QUAN Hude et al • et al and Rhiannon Hughes
What about the Gate Keepers? • Editors and Associate Editors • Epidemiologists • Biostatisticians
Conclusions • I am taking yet another poke at the much maligned case-control study • Epidemiological issues still dominate the challenges of designing and using case-control studies • It remains safe to model the exposure(s) individually as dependent variable(s) (if we trust the standard likelihood theory)
SJ Penn again: A definition of Power: A probability of a possible outcome of a potential decision conditional upon an imaginable circumstance given a conceivable value of an algebraic embodiment of an abstract mathematical idea and the strict adherence to an extremely precise rule.