Winter Electives • Molecular and Genetic Epidemiology • Decision and Cost-effectiveness Analysis • Grantwriting for Career Development Awards (Workshop – not for credit hours) • Medical Informatics
Next Tuesday (12/5/06) • 8:15 to 9:45: Journal Club • 10:00 to 1:00 pm: Mitch Katz • Note chapters in his text book • Lunch provided • 1:30 to 2:45: Last Small Group Section • Web-based course evaluation • Bring laptop • Distribute Final Exam • Exam due 12/12 (in hands of Olivia by 4 pm) by email or China Basin 5700
Confounding and Interaction: Part III • When Evaluating Association Between an Exposure and an Outcome, the Possible Roles of a 3rd Variable are: • Intermediary Variable • Effect Modifier • Confounder • No Effect • Forming “Adjusted” Summary Estimates to Evaluate Presence of Confounding • Concept of weighted average • Woolf’s Method • Mantel-Haenszel Method • Clinical/biological decision rather than statistical • Handling more than one potential confounder • Limitations of Stratification to Adjust for Confounding • the motivation for multivariable regression
When Assessing the Association Between an Exposure and a Disease, What are the Possible Effects of a Third Variable? • Assumption: the third variable is felt a priori to be relevant • No Effect • Intermediary Variable (I): on the causal pathway • Effect Modifier (EM; interaction): modifies the effect of the exposure • Confounder (C): another pathway to get to the disease (D)
What are the Possible Roles of a 3rd Variable? • Intermediary Variable • Effect Modifier (interaction) • Confounder • No Effect • Decision sequence: • 1. Intermediary Variable? (conceptual decision) • yes: Report Crude Estimate • no: go to 2 • 2. Effect Modifier? (numerically assess both magnitude and statistical differences) • yes: Report stratum-specific estimates • no: go to 3 • 3. Confounder? (numerically assess difference between adjusted and crude; not a statistical decision) • yes: Report “adjusted” summary estimate • no: Report Crude Estimate (3rd variable has no effect)
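The decision sequence on this slide can be sketched as a small function. This is only an illustration of the order in which the questions are asked; the function name and the three boolean inputs are assumptions made for the sketch, and each input stands for a judgment the analyst has already made.

```python
def what_to_report(is_intermediary: bool,
                   is_effect_modifier: bool,
                   is_confounder: bool) -> str:
    """Walk the three questions in order and return the estimate to report."""
    if is_intermediary:
        # Conceptual decision, made before looking at any numbers.
        return "crude estimate"
    if is_effect_modifier:
        # Stratum-specific estimates differ in magnitude and statistically.
        return "stratum-specific estimates"
    if is_confounder:
        # Adjusted differs meaningfully from crude (a judgment, not a test).
        return "adjusted summary estimate"
    return "crude estimate (3rd variable has no effect)"


# Interaction is checked before confounding, so it takes precedence.
print(what_to_report(False, True, True))
```

Because interaction is evaluated first, a variable that is both an effect modifier and an apparent confounder leads to stratum-specific reporting.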
Effect of a Third Variable: Statistical Interaction • Crude: RRcrude = 1.7 • Stratified: Heavy Caffeine Use, RRcaffeine use = 0.7; No Caffeine Use, RRno caffeine use = 2.4
. cs delayed smoking, by(caffeine)
        caffeine |       RR       [95% Conf. Interval]   M-H Weight
-----------------+-------------------------------------------------
     no caffeine |   2.414614      1.42165     4.10112     5.486943
  heavy caffeine |     .70163     .3493615    1.409099     8.156069
-----------------+-------------------------------------------------
           Crude |   1.699096     1.114485    2.590369
    M-H combined |   1.390557     .9246598    2.091201
-----------------+-------------------------------------------------
Test of homogeneity (M-H)   chi2(1) = 7.866   Pr>chi2 = 0.0050
• Report interaction; confounding is not relevant
Statistical Tests of Interaction: Test of Homogeneity (heterogeneity) • Null hypothesis: The individual stratum-specific estimates of the measure of association differ only by random variation • i.e., the strength of association is homogeneous across all strata • i.e., there is no interaction • The test statistic will have a chi-square distribution with degrees of freedom of one less than the number of strata
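As a rough check on how the homogeneity p value follows from the chi-square statistic and its degrees of freedom, here is a minimal sketch using only the Python standard library. It handles just the closed-form cases of 1 and 2 degrees of freedom (2 or 3 strata); the function names are assumptions made for this illustration, and it is not how Stata computes the value internally.

```python
import math


def homogeneity_df(n_strata: int) -> int:
    """Degrees of freedom: one less than the number of strata."""
    return n_strata - 1


def chi2_upper_tail(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square statistic (df = 1 or 2 only)."""
    if x < 0:
        raise ValueError("chi-square statistic cannot be negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise NotImplementedError("this sketch only covers df = 1 or 2")


# Caffeine example: two strata, chi2(1) = 7.866 from the Stata output.
p = chi2_upper_tail(7.866, homogeneity_df(2))
print(f"{p:.4f}")
```

With the statistic from the caffeine example this gives a p value of about 0.005, in line with the Pr>chi2 reported in the output.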
Report vs Ignore Interaction? Some Guidelines • Is an art form: requires consideration of both clinical and statistical significance
If Interaction is not Present, What Next? • Case-control study of post-exposure AZT use in preventing HIV seroconversion after needlestick (NEJM 1997) • Crude: ORcrude = 0.61 (95% CI: 0.26 - 1.4)
Post-exposure prophylaxis with AZT after a needlestick • Diagram: Severity of Exposure is related to both AZT Use and HIV • Confounding by Indication
Evaluating for Interaction • Potential confounder: severity of exposure • Crude: ORcrude = 0.61 • Stratified: Minor Severity, OR = 0.0; Major Severity, OR = 0.35
Is there interaction? Is there confounding? What is the adjusted measure of association?
Assuming Interaction is not Present, Form a Summary of the Unconfounded Stratum-Specific Estimates • Construct a weighted average • Assign weights to the individual strata • Summary Estimate = Weighted Average of the stratum-specific estimates • a simple mean is a weighted average where the weights are equal to 1 • which weights to use depends on type of effect estimate desired (OR, RR, RD) and characteristics of the data • e.g., • Woolf’s method • Mantel-Haenszel method
Forming a Summary Estimate for Stratified Data • Crude: ORcrude = 0.61 • Stratified: Minor Severity, OR = 0.0; Major Severity, OR = 0.35 • How would you weight these strata? • According to sample size? • No. of cases?
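Summary Estimators: Woolf’s Method • aka Directly pooled or precision estimator • Woolf’s estimate for adjusted odds ratio: $\mathrm{OR}_{\mathrm{Woolf}} = \exp\left( \frac{\sum_i w_i \ln \mathrm{OR}_i}{\sum_i w_i} \right)$ • where $w_i = \left( \frac{1}{a_i} + \frac{1}{b_i} + \frac{1}{c_i} + \frac{1}{d_i} \right)^{-1}$ and $a_i, b_i, c_i, d_i$ are the four cell counts of the 2 x 2 table in stratum $i$ • $w_i$ is the inverse of the variance of the stratum-specific log(odds ratio)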
Calculating a Summary Effect Using the Woolf Estimator • e.g. AZT use, severity of needlestick, and HIV • Crude: ORcrude = 0.61 • Stratified: Minor Severity, OR = 0.0; Major Severity, OR = 0.35
Summary Estimators: Woolf’s Method • Conceptually straightforward • Best when: • number of strata is small • sample size within each stratum is large • Cannot be calculated when any cell in any stratum is zero, because log(0) and 1/0 are undefined • 1/2 cell corrections have been suggested but are subject to bias • Formulae for Woolf’s summary estimates for other measures (e.g., RR, RD) are available in texts and software documentation • Sensitive to small strata and to cells with “0” • Computationally messy
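A minimal sketch of Woolf's directly pooled odds ratio follows, including the zero-cell failure noted on this slide. The cell layout (a = exposed cases, b = unexposed cases, c = exposed controls, d = unexposed controls), the function name, and the two strata of counts are all assumptions made for illustration; the counts are hypothetical and are not the AZT data.

```python
import math


def woolf_odds_ratio(strata):
    """Woolf (directly pooled) OR from a list of (a, b, c, d) cell counts."""
    weighted_sum = 0.0
    weight_total = 0.0
    for a, b, c, d in strata:
        if min(a, b, c, d) == 0:
            # log(OR) or its variance is undefined with a zero cell.
            raise ValueError("Woolf's estimate is undefined when a cell is zero")
        log_or = math.log((a * d) / (b * c))
        weight = 1.0 / (1 / a + 1 / b + 1 / c + 1 / d)  # inverse variance of log(OR)
        weighted_sum += weight * log_or
        weight_total += weight
    return math.exp(weighted_sum / weight_total)


# Hypothetical counts: both strata have a stratum-specific OR of 4.
hypothetical_strata = [(10, 20, 5, 40), (20, 10, 10, 20)]
print(round(woolf_odds_ratio(hypothetical_strata), 2))
```

When every stratum has the same odds ratio, the pooled estimate equals that common value regardless of the weights, which is a convenient sanity check.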
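Summary Estimators: Mantel-Haenszel • Mantel-Haenszel estimate for odds ratios • $\mathrm{OR}_{MH} = \frac{\sum_i a_i d_i / n_i}{\sum_i b_i c_i / n_i} = \frac{\sum_i w_i \mathrm{OR}_i}{\sum_i w_i}$ • $w_i = b_i c_i / n_i$, where $a_i, b_i, c_i, d_i$ are the four cell counts of the 2 x 2 table in stratum $i$, $n_i$ is the stratum total, and $\mathrm{OR}_i = a_i d_i / (b_i c_i)$ • $w_i$ is the inverse of the variance of the stratum-specific odds ratio under the null hypothesis (OR = 1)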
Summary Estimators: Mantel-Haenszel • Mantel-Haenszel estimate for odds ratios • “relatively” resistant to the effects of large numbers of strata with few observations • resistant to cells with a value of “0” • computationally easy • most commonly used
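Calculating a Summary Effect Using the Mantel-Haenszel Estimator • $\mathrm{OR}_{MH} = \frac{\sum_i a_i d_i / n_i}{\sum_i b_i c_i / n_i}$ • Using the M-H weights ($b_i c_i / n_i$) that Stata reports for this example: $\mathrm{OR}_{MH} = \frac{(1.07 \times 0.0) + (6.96 \times 0.35)}{1.07 + 6.96} = 0.30$ • Crude: ORcrude = 0.61 • Stratified: Minor Severity, OR = 0.0; Major Severity, OR = 0.35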
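For comparison with Woolf's method, here is a minimal sketch of the Mantel-Haenszel odds ratio computed directly from cell counts. The cell layout (a = exposed cases, b = unexposed cases, c = exposed controls, d = unexposed controls), the function name, and the counts are assumptions for illustration; the counts are hypothetical and deliberately include a zero cell.

```python
def mantel_haenszel_odds_ratio(strata):
    """M-H summary OR from a list of (a, b, c, d) cell counts."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n      # sum of a_i * d_i / n_i
        denominator += b * c / n    # sum of b_i * c_i / n_i (the M-H weights)
    return numerator / denominator


# Hypothetical counts; the first stratum has no exposed cases (OR = 0).
hypothetical_strata = [(0, 5, 10, 35), (10, 20, 5, 40)]
print(mantel_haenszel_odds_ratio(hypothetical_strata))
```

Unlike the directly pooled estimate, the calculation still goes through when a stratum contains a zero cell, because no logarithm or reciprocal of a cell count is taken.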
Calculating a Summary Effect in Stata • epitab commands: tables for epidemiologists • see “Survival Analysis and Epidemiological Tables Reference Manual” • To produce crude estimates and 2 x 2 tables: • For cross-sectional or cohort studies: • cs var_case var_exposed • For case-control studies: • cc var_case var_exposed • To stratify by a third variable: • cs var_case var_exposed, by(var_third) • cc var_case var_exposed, by(var_third) • Default summary estimator is Mantel-Haenszel • Adding the pool option, e.g. by(var_third) pool, also produces the directly pooled (Woolf) estimate
Calculating a Summary Effect Using the Mantel-Haenszel Estimator • e.g. AZT use, severity of needlestick, and HIV • Crude: ORcrude = 0.61 • Stratified: Minor Severity, OR = 0.0; Major Severity, OR = 0.35
. cc HIV AZTuse, by(severity) pool
        severity |       OR       [95% Conf. Interval]   M-H Weight
-----------------+-------------------------------------------------
           minor |          0            0    2.302373     1.070588
           major |        .35     .1344565    .9144599     6.956522
-----------------+-------------------------------------------------
           Crude |   .6074729     .2638181    1.401432
 Pooled (direct) |          .            .           .
    M-H combined |     .30332     .1158571    .7941072
-----------------+-------------------------------------------------
Test of homogeneity (B-D)   chi2(1) = 0.60   Pr>chi2 = 0.4400
Test that combined OR = 1:
                 Mantel-Haenszel chi2(1) = 6.06
                                 Pr>chi2 = 0.0138
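The M-H combined value in this output can be reproduced as a weighted average of the stratum-specific odds ratios, using the M-H Weight column as the weights. The sketch below uses only numbers printed in the output; the helper name is an assumption made for illustration.

```python
def weighted_average(estimates, weights):
    """Weighted average: sum(w_i * estimate_i) / sum(w_i)."""
    return sum(w * e for e, w in zip(estimates, weights)) / sum(weights)


stratum_or = [0.0, 0.35]            # minor, major severity (from the output)
mh_weight = [1.070588, 6.956522]    # M-H Weight column (from the output)

or_mh = weighted_average(stratum_or, mh_weight)
print(round(or_mh, 4))
```

The minor-severity stratum contributes an odds ratio of 0 with a small weight, so the summary is pulled slightly below the major-severity value of 0.35.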
Calculating a Summary Effect Using the Mantel-Haenszel Estimator • In addition to the odds ratio, Mantel-Haenszel estimators are also available in Stata for: • risk ratio • “cs var_case var_exposed, by(var_third)” • rate ratio • “ir var_case var_exposed var_time, by(var_third)”
Assessment of Confounding: Interpretation of Summary Estimate • Compare “adjusted” estimate to crude estimate • e.g. compare ORMH (= 0.30) to ORcrude (= 0.61) • If “adjusted” measure “differs meaningfully” from crude estimate, then confounding is present • e.g., does ORMH = 0.30 “differ meaningfully” from ORcrude = 0.61? • What does “differs meaningfully” mean? • a matter of judgement based on biologic/clinical sense rather than on a statistical test • no one correct answer • the objective is to remove bias • 10% change from the crude often used • your threshold needs to be stated a priori and included in your methods section
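One way to make the “10% change from the crude” rule concrete is sketched below. The definition used here, the absolute difference divided by the crude estimate, is an assumption; other definitions (for example, relative to the adjusted estimate, or on the log scale for ratio measures) are also in use, so whichever is chosen should be stated in the methods section.

```python
def relative_change_from_crude(crude: float, adjusted: float) -> float:
    """Absolute change in the estimate as a fraction of the crude estimate."""
    return abs(adjusted - crude) / crude


def differs_meaningfully(crude: float, adjusted: float, threshold: float = 0.10) -> bool:
    """Apply a pre-specified change-in-estimate threshold (default 10%)."""
    return relative_change_from_crude(crude, adjusted) >= threshold


# AZT example: ORcrude = 0.61, ORMH = 0.30
change = relative_change_from_crude(0.61, 0.30)
print(f"{change:.1%}", differs_meaningfully(0.61, 0.30))
```

For the AZT example the change is roughly half of the crude estimate, well beyond a 10% threshold.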
Statistical Testing for Confounding is Inappropriate • Testing for statistically significant differences between crude and adjusted measures is inappropriate • e.g., when examining an association for which a factor is a known confounder (say age in the association between HTN and CAD) • if the study has a small sample size, even large differences between crude and adjusted measures may not be statistically different • yet, we know confounding is present • therefore, the difference between crude and adjusted measures cannot be ignored as merely chance. • bias must be prevented: the difference must be reported as confounding • the issue of confounding is one of internal validity, not of sampling error. • we must live with – within reason -- whatever effects we see after adjustment for a factor for which there is an a priori belief about confounding • we’re not concerned that sampling error is causing confounding and therefore we don’t have to worry about testing for role of chance
Confidence Interval Estimation and Hypothesis Testing for the Mantel-Haenszel Estimator • e.g. AZT use, severity of needlestick, and HIV
. cc HIV AZTuse, by(severity) pool
        severity |       OR       [95% Conf. Interval]   M-H Weight
-----------------+-------------------------------------------------
           minor |          0            0    2.302373     1.070588
           major |        .35     .1344565    .9144599     6.956522
-----------------+-------------------------------------------------
           Crude |   .6074729     .2638181    1.401432
 Pooled (direct) |          .            .           .
    M-H combined |     .30332     .1158571    .7941072
-----------------+-------------------------------------------------
Test of homogeneity (B-D)   chi2(1) = 0.60   Pr>chi2 = 0.4400
Test that combined OR = 1:
                 Mantel-Haenszel chi2(1) = 6.06
                                 Pr>chi2 = 0.0138
• What does the p value = 0.0138 mean?
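A quick way to read the M-H combined confidence interval is to note that it appears to be symmetric on the log scale, so the point estimate is the geometric mean of the two limits. The sketch below checks this with the numbers in the output and backs out an approximate standard error of ln(OR); it assumes a 95% interval built as estimate ± 1.96 standard errors on the log scale, which is an assumption about the interval's construction rather than something stated on the slide.

```python
import math

# M-H combined row from the Stata output
point_estimate = 0.30332
lower, upper = 0.1158571, 0.7941072

# If the interval is symmetric on the log scale, the geometric mean
# of the limits should recover the point estimate.
geometric_mean = math.sqrt(lower * upper)

# Approximate standard error of ln(OR), assuming limits at +/- 1.96 SE.
se_log_or = (math.log(upper) - math.log(lower)) / (2 * 1.96)

print(round(geometric_mean, 4), round(se_log_or, 2))
```

The interval excludes 1, which agrees with the small p value from the test that the combined OR equals 1.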
Mantel-Haenszel Techniques • Mantel-Haenszel estimators • Mantel-Haenszel chi-square statistic • Mantel’s test for trend (dose-response)
Summary Effect in Stata – another example • e.g. Spermicide use, maternal age and Down’s • Crude: OR = 3.5 • Stratified: Age < 35, OR = 3.4; Age > 35, OR = 5.7 • Is there confounding present? • Which answer should you report as “final”?
No Effect of Third Variable • Crude: ORcrude = 21.0 (95% CI: 16.4 - 26.9) • Stratified: Matches Present, ORmatches = 21.0; Matches Absent, ORno matches = 21.0 • Adjusted: ORadj = 21.0 (95% CI: 14.2 - 31.1)
Whether or not to accept the “adjusted” summary estimate in favor of the crude? • Methodologic literature is inconsistent on this • Bias-variance tradeoff • Scientifically most rigorous approach would appear to be to create two lists of potential confounders prior to the analysis: • A List: Those factors for which you will accept the adjusted result no matter how small the difference from the crude. • Factors you know must be confounders • B List: Those factors for which you will accept the adjusted result only if it meaningfully differs from the crude (with some pre-specified difference, e.g., 10%) • Factors you are less sure about • For some analyses, may have no factors on A list. For other analyses, no factors on B list. • Always putting all factors on A list may seem conservative, but not necessarily the right thing to do in that there may be a penalty in statistical imprecision
Stratifying by Multiple Potential Confounders • Crude • Stratified into six strata: <40 smokers; 40-60 smokers; >60 smokers; <40 non-smokers; 40-60 non-smokers; >60 non-smokers
The Need for Evaluation of Joint Confounding • Variables that evaluated alone show no confounding may show confounding when evaluated jointly • Compare: Crude • Stratified by Factor 1 alone • Stratified by Factor 2 alone • Stratified by Factor 1 & 2
Approaches for When More than One Potential Confounder is Present • Backward versus forward confounder evaluation strategies • relevant both for stratification and especially multivariable modeling (“model selection”) • Backwards Strategy • initially evaluate all potential confounders together (i.e., look for joint confounding) • conceptually preferred because in nature variables are all present and act together • Procedure: • with all potential confounders considered, form adjusted estimate. This is the “gold standard” • one variable can then be dropped and the adjusted estimate is re-calculated (adjusted for remaining variables) • if dropping the variable results in a non-meaningful (e.g., <10%) change compared to the gold standard, it can be eliminated • procedure continues until no more variables can be dropped (i.e., all remaining variables are relevant) • Problem: • with many potential confounders, cells become very sparse and some strata provide no information
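The backwards procedure can be sketched as a loop. This is one possible reading of the steps, not a standard routine: the function name, the rule of dropping the variable whose removal changes the estimate least, and the `HYPOTHETICAL_OR` lookup table (invented adjusted odds ratios for three made-up variables A, B, C) are all assumptions made for illustration.

```python
# Invented adjusted ORs for every subset of adjustment variables (illustration only).
HYPOTHETICAL_OR = {
    frozenset("ABC"): 2.00, frozenset("AB"): 2.05, frozenset("AC"): 2.60,
    frozenset("BC"): 2.02, frozenset("A"): 2.80, frozenset("B"): 2.04,
    frozenset("C"): 2.50, frozenset(): 3.00,
}


def backward_selection(variables, adjusted_estimate, threshold=0.10):
    """Drop variables while the estimate stays within threshold of the gold standard."""
    current = frozenset(variables)
    gold = adjusted_estimate(current)  # adjusted for all potential confounders
    while current:
        # Relative change from the gold standard if each variable were dropped.
        changes = {v: abs(adjusted_estimate(current - {v}) - gold) / gold
                   for v in current}
        candidate = min(sorted(changes), key=changes.get)
        if changes[candidate] >= threshold:
            break  # every remaining variable matters
        current = current - {candidate}
    return sorted(current)


print(backward_selection("ABC", lambda s: HYPOTHETICAL_OR[frozenset(s)]))
```

With these invented numbers only variable B is retained, because removing it moves the estimate far from the gold standard while removing A or C does not.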
Example: Backwards Selection • Research question: Is prior hospitalization associated with the presence of methicillin-resistant S. aureus (MRSA)? (from Kleinbaum 2003) • Outcome variable: MRSA (present or absent) • Primary predictor: prior hospitalization (yes/no) • Potential confounders: age (<55, >55), gender, prior antibiotic use (atbxuse; yes/no) • Assume no interaction • Which OR to report?
Approaches for When More than One Potential Confounder is Present • Forward Strategy • start with the variable that has the biggest “change-in-estimate” impact • then add the variable with the second biggest impact • keep this variable if its presence meaningfully changes the adjusted estimate • procedure continues until no other added variable has an important impact • Advantage: • avoids the initial sparse cell problem of backwards approach • Problem: • does not evaluate joint confounding effects of many variables
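The forward strategy can be sketched in the same style. Again this is only one reading of the steps: the function name and the `HYPOTHETICAL_OR` table (invented adjusted odds ratios for made-up variables A, B, C) are assumptions, and the sketch applies the change-in-estimate threshold at every step, including the first.

```python
# Invented adjusted ORs for every subset of adjustment variables (illustration only).
HYPOTHETICAL_OR = {
    frozenset("ABC"): 2.00, frozenset("AB"): 2.05, frozenset("AC"): 2.60,
    frozenset("BC"): 2.02, frozenset("A"): 2.80, frozenset("B"): 2.04,
    frozenset("C"): 2.50, frozenset(): 3.00,
}


def forward_selection(variables, adjusted_estimate, threshold=0.10):
    """Add the variable with the biggest change-in-estimate until none matters."""
    kept = frozenset()
    current = adjusted_estimate(kept)  # crude estimate
    remaining = set(variables)
    while remaining:
        # Relative change in the current estimate from adding each variable.
        changes = {v: abs(adjusted_estimate(kept | {v}) - current) / current
                   for v in remaining}
        best = max(sorted(changes), key=changes.get)
        if changes[best] < threshold:
            break  # no remaining variable has an important impact
        kept = kept | {best}
        current = adjusted_estimate(kept)
        remaining.remove(best)
    return sorted(kept)


print(forward_selection("ABC", lambda s: HYPOTHETICAL_OR[frozenset(s)]))
```

Note that the loop never evaluates the estimate adjusted for all three variables together, which is the joint-confounding blind spot listed under “Problem”.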
An Analysis Plan • Written before the data are analyzed • Content • Detailed description of the techniques to be used to both explore and formally analyze data • Forms the basis of “Statistical Analysis” section in manuscripts • Parameters/rules/logic to guide key decisions: • which variables will be assessed for confounding and interaction? • what p value will be used to guide reporting of interaction? • what is a meaningful difference between two estimates (e.g. 10%)? • Required for clinical trial registration • Can observational work be far behind? • Utility: A plan helps to keep the analysis: • Focused • Reproducible • Honest (avoids p value shopping)
Stratification to Manage Confounding • Advantages • straightforward to implement and comprehend • easy way to evaluate interaction • Limitations • Looks at only one exposure-disease assoc. at a time • Requires continuous variables to be discretized • loses information; possibly results in “residual confounding” • Deteriorates with multiple confounders • e.g. suppose 4 confounders with 3 levels • 3x3x3x3=81 strata needed • unless huge sample, many cells have “0”’s and strata have undefined effect measures • Solution: • Mathematical modeling (multivariable regression) • e.g. • linear regression • logistic regression • proportional hazards regression
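The arithmetic behind the 81 strata, and why cells become sparse, can be sketched in a few lines. The sample size used here is a hypothetical figure chosen only to show the scale of the problem.

```python
from math import prod


def count_strata(levels_per_confounder):
    """Number of strata = product of the number of levels of each confounder."""
    return prod(levels_per_confounder)


n_strata = count_strata([3, 3, 3, 3])   # 4 confounders with 3 levels each
hypothetical_sample_size = 500          # assumed for illustration only

# Each stratum holds a 2 x 2 table, so there are four cells per stratum.
subjects_per_cell = hypothetical_sample_size / (n_strata * 4)
print(n_strata, round(subjects_per_cell, 1))
```

Even before accounting for uneven distribution across strata, the average number of subjects per cell falls to a handful, which is why zero cells and undefined stratum-specific measures become common.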