1.15k likes | 1.23k Views
DAGs intro, Epidemiology 8h DAG=Directed Acyclic Graph. Hein Stigum http://folk.uio.no/heins/ courses. Jan-17. Jan-17. Jan-17. Jan-17. Jan-17. H.S. H.S. H.S. H.S. 1. 1. 1. 1. 1. Agenda. DAG concepts Causal thinking, Paths Analyzing DAGs Examples
E N D
DAGs intro, Epidemiology 8hDAG=Directed Acyclic Graph Hein Stigum http://folk.uio.no/heins/ courses Jan-17 Jan-17 Jan-17 Jan-17 Jan-17 H.S. H.S. H.S. H.S. H.S. 1 1 1 1 1
Agenda • DAG concepts • Causal thinking, Paths • Analyzing DAGs • Examples • DAGs and stat/epi phenomena • Selection bias • Mediation • Time dependent confounding • Effects of adjustments • Drawing DAGs • Limitations, problems Exercises Jan-17 Jan-17 Jan-17 Jan-17 Jan-17 H.S. H.S. H.S. H.S. H.S. 2 2 2 2 2
Background • Potential outcomes: Neyman, 1923 • Causal path diagrams: Wright, 1920 • Causal DAGs: Pearl, 2000 H.S.
Regression purpose • Prediction models • Predict the outcome from the covariates • Ex: Air pollution from distance to roads • Estimation models • Estimate effect of exposure on outcome • Ex: Smokers have RR=20 for lung cancer DAGs are of no interrest DAGs are important H.S.
Why causal graphs? • Estimate effect of exposure on disease • Problem • Association measures are biased • Causal graphs help: • Understanding • Confounding, mediation, selection bias • Analysis • Adjust or not • Discussion • Precise statement of prior assumptions H.S.
Causal versus casual CONCEPTS (Rothman et al. 2008; Veieroed et al. 2012 H.S.
god-DAG Causal Graph: Node = variable Arrow = cause E=exposure, D=disease DAG=Directed Acyclic Graph Read of the DAG: Causality = arrow Association = path Independency = no path Estimations: E-D association has two parts: ED causal effect keep open ECUD bias try to close Conditioning (Adjusting): E[C]UD Time Jan-17 Jan-17 Jan-17 H.S. H.S. H.S. 7 7 7
Association and Cause 3 possible causal structure Association (reverse cause) E D + more complicated structures Jan-17 Jan-17 H.S. H.S. H.S. 8 8
A confounder induces an association between its effects Conditioning on a confounder removes the association Condition = (restrict, stratify, adjust) Paths Simplest form Causal confounding, (exception: see outcome dependent selection) Confounder idea + A common cause Adjust for smoking Smoking Smoking + + + + Yellow fingers Lung cancer Yellow fingers Lung cancer Jan-17 Jan-17 Jan-17 Jan-17 H.S. H.S. H.S. H.S. 9 9 9 9
Conditioning on a collider induces an association between its causes “And” and “or” selection leads to different bias Paths Simplest form Collider idea • or • + and Two causes for selection to study Selected subjects Selected Selected + + + + Yellow fingers Lung cancer Yellow fingers Lung cancer Jan-17 Jan-17 Jan-17 Jan-17 H.S. H.S. H.S. H.S. 10 10 10 10
Mediator • Have found a cause (E) M effect indirect • How does it work? • Mediator (M) • Paths direct effect D E Use ordinary regression methods if: no E-M interaction and collapsible effect measures Otherwise,need new methods H.S.
Concepts: Summing up Associations visible in data. Causal structure from outside the data. DAG: no arrow means independence Cause E D M Cause with Mediator E D C Cause with Confounder E D K Cause with Collider E D H.S.
Aims in papers • Standard aim (in introduction) • “We what to estimate the association between E and D” • Problems • Imprecise many E-D association • Why adjust gives no rationale for adjusting • Solution • Be bold: • “We what to estimate the effect of E on D” • Or more realistic: • “We what to estimate the association closest to the effect of E on D ” H.S.
Regression before DAGs Use statistical criteria for variable selection Risk factors for D: Report all variables in the model as equals Association Both can be misleading! C E D H.S.
Statistical criteria for variable selection - Want the effect of E on D(E precedes D) - Observe the two associations C-E and C-D - Assume statistical criteria dictates adjusting for C (likelihood ratio, Akaike (赤池 弘次) or 10% change in estimate) C E D The undirected graph above is compatible with three DAGs: C C C E D E D E D Confounder 1. Adjust Mediator 2. Direct: adjust 3. Total: not adjust Collider 4. Not adjust Conclusion: The data driven method is correct in 2 out of 4 situations Need information from outside the data to do a proper analysis DAGs variable selection: close all non-causal paths H.S.
Reporting variable as equals: Association versus causation Use statistical criteria for variable selection Risk factors for D: Report all variables in the model as equals Association Causation Base adjustments on a DAG C C Report only the E-effect or use different models for each variable E D E D Symmetrical Directional C is a confounder for E-D C is a confounder for ED E is a confounder for C-D E is a mediator for CD Westreich& Greenland 2013 H.S.
Exercise: report variables as equals? Risk factors for Fractures Interpret as effect of: Diabetes adjusted for all other vars. Phy. act. adjusted for all other vars. O B P D E Obesity adjusted for all other vars. fractures diabetes 2 physical activity bone density obesity Bone d. adjusted for all other vars. P is a confounder for E→D, but is E a confounder for P→D? Which effects are reported correctly in the table? 5 min H.S.
Causal thinking: Summing up Jan-17 Make a clear aim Data driven analyses do not work Need causal information from outside the data. (Data driven prediction models OK though). Reporting table of adjusted associations is misleading. Simpson’s paradox: causal information resolves the paradox. Jan-17 H.S. H.S. H.S. 19 19
Paths The Path of the Righteous Jan-17 Jan-17 H.S. H.S. H.S. 20 20
Path definitions Path: any trail from E to D(without repeating itself) Type: causal, non-causal State: open, closed Four paths: Goal: Keep causal paths of interest open Close all non-causal paths Jan-17 H.S. H.S. 21
Four rules 1. Causal path: ED (all arrows in the same direction) otherwise non-causal Before conditioning: 2. Closed path: K (closed at a collider, otherwise open) Conditioning on: 3. a non-collider closes: [M] or [C] 4. a collider opens: [K] (or a descendant of a collider) H.S.
ANALYZING DAGs H.S.
Confounding examples Jan-17 Jan-17 Jan-17 Jan-17 Jan-17 H.S. H.S. H.S. H.S. H.S. 24 24 24 24 24
Vitamin and birth defects Is there a bias in the crude E-D effect? Should we adjust for C? What happens if age also has a direct effect on D? Bias This is an example of confounding Question: Is U a confounder? No bias Jan-17 Jan-17 Jan-17 H.S. H.S. H.S. 25 25 25
Exercise: Physical activity and Coronary Heart Disease (CHD) • We want the total effect of Physical Activity on CHD. • Write down the paths. • Are they causal/non-causal, open/closed? • What should we adjust for? 5 minutes Jan-17 Jan-17 H.S. H.S. 26 26
Intermediate variables Direct and indirect effects Jan-17 Jan-17 Jan-17 Jan-17 Jan-17 H.S. H.S. H.S. H.S. H.S. 27 27 27 27 27
Exercise: Tea and depression • Write down the paths. • You want the total effect of tea on depression. What would you adjust for? • You want the direct effect of tea on depression. What would you adjust for? • Is caffeine an intermediate variable or a variable on a confounder path? 10 minutes Hintikka et al. 2005 H.S.
Exercise: Statin and CHD D U C E lifestyle CHD statin cholesterol 10 minutes Jan-17 • Write down the paths. • You want the total effect of statin on CHD. What would you adjust for? • If lifestyle is unmeasured, can we estimate the direct effect of statin on CHD (not mediated through cholesterol)? • Is cholesterol an intermediate variable or a collider? H.S. H.S. 29
Mixed Confounder, collider and mediator Jan-17 Jan-17 Jan-17 Jan-17 Jan-17 H.S. H.S. H.S. H.S. H.S. 30 30 30 30 30
Diabetes and Fractures We want the total effect of Diabetes (type 2) on fractures Questions: Paths ←→? More paths? B a collider? V and P ind? Mediators Confounders H.S.
Selection bias Three concepts Jan-17 Jan-17 Jan-17 Jan-17 Jan-17 H.S. H.S. H.S. H.S. H.S. 32 32 32 32 32
Selection bias: concept 1Simple version • “Selected different from unselected” • Prevalence (D) Old have lower prevalence than young Old respond less to survey Selection bias: prevalence overestimated • Effect (E→D) Old have lower effect of E than young Old respond less to survey Selection bias: effect of E overestimated H.S.
Selection bias: concept 1“Selected different from unselected” S age smoke CHD Normally, selection variables unknown • Properties: • - Need smoke-age interaction • - Cannotbe adjusted for, but stratum effects OK • True RR=weighted average of stratum effects • RR in “natural” range (2.0-4.0) • Scale dependent Name: interaction based? H.S.
Selection bias: concept 2Simple version • “Distorted E-D distributions” • DAG Collider bias • Words Selection by sex and/or age • Distorted sex-age distribution Old have more disease Men are more exposedDistorted E - D distribution H.S.
Selection bias: concept 2“Distorted E-D distributions” S sex age smoke CHD • Properties: • Open non-causal path (collider) • Does not need interaction • Can be adjusted for(sex or age) • Not in “natural” range (“surprising bias”) Name: Collider stratification bias Selection bias types: Berkson’s, loss to follow up, nonresponse, self-selection, healthy worker Hernan et al, A structural approach to selection bias, Epidemiology 2004 H.S.
Selection bias: concept 3 Outcome dependent selection S D E Selection into the study based on D. Get bias among selected. U • Explanation: • Always have exogenous U. • D is a collider on E→D←U, S is a descendant of collider D. • Conditioning on (a descendant of) a collider opens the E→D←U path, and U becomes associated with E. • U now acts a confounder for E→D. Selection depends on: Strength of E→D. Strength of U→D Example of non-causal confounding Unmatched Case-Control H.S.
Exercise: Dust and COPDChronic Obstructive Pulmonary Disease • Calculate the RR of dust on COPD in good and poor health groups. • Write down the paths for the effect of E on D. E0 and D0 are unknown (past) measures. • What would you adjust for? • Suppose the crude effect of dust on COPD is RR=0.7 and the true RR=2. What do you call this bias? • Could the concept 1 (interaction based) selection bias work here? D0 E0 D E S H health COPD prior dust cur. dust diseases cur. worker COPD risks: 10 minutes H.S.
Convenience sample, homogenous sample • Convenience: • Conduct the study among • hospital patients? D E H 2. Homogeneous sample: Population data, exclude hospital patients? fractures diabetes hospital Collider, selection bias Collider stratification bias: at least on stratum is biased H.S.
Selection bias summing up S S age S sex age smoke CHD smoke CHD U smoke CHD Quite different concepts H.S.
Mediation Analysis Hafemanand Schwartz 2009; Lange and Hansen 2011; Pearl 2012; Robins and Greenland 1992; VanderWeele2009, 2014 H.S.
Why mediation analysis? • Have found a cause • How does it work? M effect indirect direct effect Y A H.S.
Counterfactual causal effect • Two possible outcome variables • Outcome if treated: Y1 • Outcome if untreated: Y0 Counterfactuals Potential outcomes • Causal effect • Individual:Y1i-Y0i • Average:E(Y1)-E(Y0) or other effect measures • Fundamental problem: either Y1or Y0 is missing Hernan 2004 H.S.
Controlled Direct effect Direct effect: Effect of statin on CHD “for the same cholesterol” Fixed M Fixed M: controlled direct effect Y A M CDE=E(Y|A=1,M=m) - E(Y|A=0,M=m) CHD statin cholesterol m Problems Conceptual: Can we fix cholesterol levels? Technical: A*M Interaction? (Technical: non-linear models?) 0/1 Solution? Robins and Greenland 1992; VanderWeele 2009 Jan-17 H.S. H.S. 46
Natural Direct effect Direct effect: Effect of statin on CHD “for the same cholesterol” Y A M CHD statin cholesterol M1 M0 A set to 1 A set to 0 M0 M0 Natural Direct Effect: Keep M at M0 Takes care of the 3 earlier problems: Don’t need to fix M=m OK for interactions (OK for non-linear models) in 4 slides 0/1 Jan-17 H.S. H.S. 48
Motivating example Population: Exposure: Alcohol use at two time points Mediator/confounder: HDL cholesterol Outcome: Coronary Heart Disease HDL Time Dependent Confounder: a confounder (HDL) that depends on earlier exposure (A1) A1 A2 CHD Estimate: Joint effect of alcohol use (or effect of A1 and A2) Simplified DAG, several variables and arrows missing Could have HDL1 and HDL2 Common situation in patient-doctor follow up! HS