1 / 109

DAGs intro, Epidemiology 8h DAG=Directed Acyclic Graph

DAGs intro, Epidemiology 8h DAG=Directed Acyclic Graph. Hein Stigum http://folk.uio.no/heins/ courses. Jan-17. Jan-17. Jan-17. Jan-17. Jan-17. H.S. H.S. H.S. H.S. 1. 1. 1. 1. 1. Agenda. DAG concepts Causal thinking, Paths Analyzing DAGs Examples

margaritaf
Download Presentation

DAGs intro, Epidemiology 8h DAG=Directed Acyclic Graph

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. DAGs intro, Epidemiology 8hDAG=Directed Acyclic Graph Hein Stigum http://folk.uio.no/heins/ courses Jan-17 Jan-17 Jan-17 Jan-17 Jan-17 H.S. H.S. H.S. H.S. H.S. 1 1 1 1 1

  2. Agenda • DAG concepts • Causal thinking, Paths • Analyzing DAGs • Examples • DAGs and stat/epi phenomena • Selection bias • Mediation • Time dependent confounding • Effects of adjustments • Drawing DAGs • Limitations, problems Exercises Jan-17 Jan-17 Jan-17 Jan-17 Jan-17 H.S. H.S. H.S. H.S. H.S. 2 2 2 2 2

  3. Background • Potential outcomes: Neyman, 1923 • Causal path diagrams: Wright, 1920 • Causal DAGs: Pearl, 2000 H.S.

  4. Regression purpose • Prediction models • Predict the outcome from the covariates • Ex: Air pollution from distance to roads • Estimation models • Estimate effect of exposure on outcome • Ex: Smokers have RR=20 for lung cancer DAGs are of no interrest DAGs are important H.S.

  5. Why causal graphs? • Estimate effect of exposure on disease • Problem • Association measures are biased • Causal graphs help: • Understanding • Confounding, mediation, selection bias • Analysis • Adjust or not • Discussion • Precise statement of prior assumptions H.S.

  6. Causal versus casual CONCEPTS (Rothman et al. 2008; Veieroed et al. 2012 H.S.

  7. god-DAG Causal Graph: Node = variable Arrow = cause E=exposure, D=disease DAG=Directed Acyclic Graph Read of the DAG: Causality = arrow Association = path Independency = no path Estimations: E-D association has two parts: ED causal effect keep open ECUD bias try to close Conditioning (Adjusting): E[C]UD Time  Jan-17 Jan-17 Jan-17 H.S. H.S. H.S. 7 7 7

  8. Association and Cause 3 possible causal structure Association (reverse cause) E D + more complicated structures Jan-17 Jan-17 H.S. H.S. H.S. 8 8

  9. A confounder induces an association between its effects Conditioning on a confounder removes the association Condition = (restrict, stratify, adjust) Paths Simplest form Causal confounding, (exception: see outcome dependent selection) Confounder idea + A common cause Adjust for smoking Smoking Smoking + + + + Yellow fingers Lung cancer Yellow fingers Lung cancer Jan-17 Jan-17 Jan-17 Jan-17 H.S. H.S. H.S. H.S. 9 9 9 9

  10. Conditioning on a collider induces an association between its causes “And” and “or” selection leads to different bias Paths Simplest form Collider idea • or • + and Two causes for selection to study Selected subjects Selected Selected + + + + Yellow fingers Lung cancer Yellow fingers Lung cancer Jan-17 Jan-17 Jan-17 Jan-17 H.S. H.S. H.S. H.S. 10 10 10 10

  11. Mediator • Have found a cause (E) M effect indirect • How does it work? • Mediator (M) • Paths direct effect D E Use ordinary regression methods if: no E-M interaction and collapsible effect measures Otherwise,need new methods H.S.

  12. Concepts: Summing up Associations visible in data. Causal structure from outside the data. DAG: no arrow means independence Cause E D M Cause with Mediator E D C Cause with Confounder E D K Cause with Collider E D H.S.

  13. Causal thinking in analyses H.S.

  14. Aims in papers • Standard aim (in introduction) • “We what to estimate the association between E and D” • Problems • Imprecise many E-D association • Why adjust gives no rationale for adjusting • Solution • Be bold: • “We what to estimate the effect of E on D” • Or more realistic: • “We what to estimate the association closest to the effect of E on D ” H.S.

  15. Regression before DAGs Use statistical criteria for variable selection Risk factors for D: Report all variables in the model as equals Association Both can be misleading! C E D H.S.

  16. Statistical criteria for variable selection - Want the effect of E on D(E precedes D) - Observe the two associations C-E and C-D - Assume statistical criteria dictates adjusting for C (likelihood ratio, Akaike (赤池 弘次) or 10% change in estimate) C E D The undirected graph above is compatible with three DAGs: C C C E D E D E D Confounder 1. Adjust Mediator 2. Direct: adjust 3. Total: not adjust Collider 4. Not adjust Conclusion: The data driven method is correct in 2 out of 4 situations Need information from outside the data to do a proper analysis DAGs variable selection: close all non-causal paths H.S.

  17. Reporting variable as equals: Association versus causation Use statistical criteria for variable selection Risk factors for D: Report all variables in the model as equals Association Causation Base adjustments on a DAG C C Report only the E-effect or use different models for each variable E D E D Symmetrical Directional C is a confounder for E-D C is a confounder for ED E is a confounder for C-D E is a mediator for CD Westreich& Greenland 2013 H.S.

  18. Exercise: report variables as equals? Risk factors for Fractures Interpret as effect of: Diabetes adjusted for all other vars. Phy. act. adjusted for all other vars. O B P D E Obesity adjusted for all other vars. fractures diabetes 2 physical activity bone density obesity Bone d. adjusted for all other vars. P is a confounder for E→D, but is E a confounder for P→D? Which effects are reported correctly in the table? 5 min H.S.

  19. Causal thinking: Summing up Jan-17 Make a clear aim Data driven analyses do not work Need causal information from outside the data. (Data driven prediction models OK though). Reporting table of adjusted associations is misleading. Simpson’s paradox: causal information resolves the paradox. Jan-17 H.S. H.S. H.S. 19 19

  20. Paths The Path of the Righteous Jan-17 Jan-17 H.S. H.S. H.S. 20 20

  21. Path definitions Path: any trail from E to D(without repeating itself) Type: causal, non-causal State: open, closed Four paths: Goal: Keep causal paths of interest open Close all non-causal paths Jan-17 H.S. H.S. 21

  22. Four rules 1. Causal path: ED (all arrows in the same direction) otherwise non-causal Before conditioning: 2. Closed path: K (closed at a collider, otherwise open) Conditioning on: 3. a non-collider closes: [M] or [C] 4. a collider opens: [K] (or a descendant of a collider) H.S.

  23. ANALYZING DAGs H.S.

  24. Confounding examples Jan-17 Jan-17 Jan-17 Jan-17 Jan-17 H.S. H.S. H.S. H.S. H.S. 24 24 24 24 24

  25. Vitamin and birth defects Is there a bias in the crude E-D effect? Should we adjust for C? What happens if age also has a direct effect on D? Bias This is an example of confounding Question: Is U a confounder? No bias Jan-17 Jan-17 Jan-17 H.S. H.S. H.S. 25 25 25

  26. Exercise: Physical activity and Coronary Heart Disease (CHD) • We want the total effect of Physical Activity on CHD. • Write down the paths. • Are they causal/non-causal, open/closed? • What should we adjust for? 5 minutes Jan-17 Jan-17 H.S. H.S. 26 26

  27. Intermediate variables Direct and indirect effects Jan-17 Jan-17 Jan-17 Jan-17 Jan-17 H.S. H.S. H.S. H.S. H.S. 27 27 27 27 27

  28. Exercise: Tea and depression • Write down the paths. • You want the total effect of tea on depression. What would you adjust for? • You want the direct effect of tea on depression. What would you adjust for? • Is caffeine an intermediate variable or a variable on a confounder path? 10 minutes Hintikka et al. 2005 H.S.

  29. Exercise: Statin and CHD D U C E lifestyle CHD statin cholesterol 10 minutes Jan-17 • Write down the paths. • You want the total effect of statin on CHD. What would you adjust for? • If lifestyle is unmeasured, can we estimate the direct effect of statin on CHD (not mediated through cholesterol)? • Is cholesterol an intermediate variable or a collider? H.S. H.S. 29

  30. Mixed Confounder, collider and mediator Jan-17 Jan-17 Jan-17 Jan-17 Jan-17 H.S. H.S. H.S. H.S. H.S. 30 30 30 30 30

  31. Diabetes and Fractures We want the total effect of Diabetes (type 2) on fractures Questions: Paths ←→? More paths? B a collider? V and P ind? Mediators Confounders H.S.

  32. Selection bias Three concepts Jan-17 Jan-17 Jan-17 Jan-17 Jan-17 H.S. H.S. H.S. H.S. H.S. 32 32 32 32 32

  33. Selection bias: concept 1Simple version • “Selected different from unselected” • Prevalence (D) Old have lower prevalence than young Old respond less to survey  Selection bias: prevalence overestimated • Effect (E→D) Old have lower effect of E than young Old respond less to survey  Selection bias: effect of E overestimated H.S.

  34. Selection bias: concept 1“Selected different from unselected” S age smoke CHD Normally, selection variables unknown • Properties: • - Need smoke-age interaction • - Cannotbe adjusted for, but stratum effects OK • True RR=weighted average of stratum effects • RR in “natural” range (2.0-4.0) • Scale dependent Name: interaction based? H.S.

  35. Selection bias: concept 2Simple version • “Distorted E-D distributions” • DAG Collider bias • Words Selection by sex and/or age • Distorted sex-age distribution Old have more disease Men are more exposedDistorted E - D distribution H.S.

  36. Selection bias: concept 2“Distorted E-D distributions” S sex age smoke CHD • Properties: • Open non-causal path (collider) • Does not need interaction • Can be adjusted for(sex or age) • Not in “natural” range (“surprising bias”) Name: Collider stratification bias Selection bias types: Berkson’s, loss to follow up, nonresponse, self-selection, healthy worker Hernan et al, A structural approach to selection bias, Epidemiology 2004 H.S.

  37. 1) “Exclusive or” selection H.S.

  38. Selection bias: concept 3 Outcome dependent selection S D E Selection into the study based on D. Get bias among selected. U • Explanation: • Always have exogenous U. • D is a collider on E→D←U, S is a descendant of collider D. • Conditioning on (a descendant of) a collider opens the E→D←U path, and U becomes associated with E. • U now acts a confounder for E→D. Selection depends on: Strength of E→D. Strength of U→D Example of non-causal confounding Unmatched Case-Control H.S.

  39. Exercise: Dust and COPDChronic Obstructive Pulmonary Disease • Calculate the RR of dust on COPD in good and poor health groups. • Write down the paths for the effect of E on D. E0 and D0 are unknown (past) measures. • What would you adjust for? • Suppose the crude effect of dust on COPD is RR=0.7 and the true RR=2. What do you call this bias? • Could the concept 1 (interaction based) selection bias work here? D0 E0 D E S H health COPD prior dust cur. dust diseases cur. worker COPD risks: 10 minutes H.S.

  40. Convenience sample, homogenous sample • Convenience: • Conduct the study among • hospital patients? D E H 2. Homogeneous sample: Population data, exclude hospital patients? fractures diabetes hospital Collider, selection bias Collider stratification bias: at least on stratum is biased H.S.

  41. Selection bias summing up S S age S sex age smoke CHD smoke CHD U smoke CHD Quite different concepts H.S.

  42. Mediation Analysis Hafemanand Schwartz 2009; Lange and Hansen 2011; Pearl 2012; Robins and Greenland 1992; VanderWeele2009, 2014 H.S.

  43. Why mediation analysis? • Have found a cause • How does it work? M effect indirect direct effect Y A H.S.

  44. Counterfactual causal effect • Two possible outcome variables • Outcome if treated: Y1 • Outcome if untreated: Y0 Counterfactuals Potential outcomes • Causal effect • Individual:Y1i-Y0i • Average:E(Y1)-E(Y0) or other effect measures • Fundamental problem: either Y1or Y0 is missing Hernan 2004 H.S.

  45. Classic approach: controlled effect H.S.

  46. Controlled Direct effect Direct effect: Effect of statin on CHD “for the same cholesterol” Fixed M Fixed M: controlled direct effect Y A M CDE=E(Y|A=1,M=m) - E(Y|A=0,M=m) CHD statin cholesterol m Problems Conceptual: Can we fix cholesterol levels? Technical: A*M Interaction? (Technical: non-linear models?) 0/1 Solution? Robins and Greenland 1992; VanderWeele 2009 Jan-17 H.S. H.S. 46

  47. New approach: natural effect H.S.

  48. Natural Direct effect Direct effect: Effect of statin on CHD “for the same cholesterol” Y A M CHD statin cholesterol M1 M0 A set to 1 A set to 0 M0 M0 Natural Direct Effect: Keep M at M0 Takes care of the 3 earlier problems: Don’t need to fix M=m OK for interactions (OK for non-linear models) in 4 slides 0/1 Jan-17 H.S. H.S. 48

  49. Time Dependent Confounding H.S.

  50. Motivating example Population: Exposure: Alcohol use at two time points Mediator/confounder: HDL cholesterol Outcome: Coronary Heart Disease HDL Time Dependent Confounder: a confounder (HDL) that depends on earlier exposure (A1) A1 A2 CHD Estimate: Joint effect of alcohol use (or effect of A1 and A2) Simplified DAG, several variables and arrows missing Could have HDL1 and HDL2 Common situation in patient-doctor follow up! HS

More Related