Causal Diagrams for Epidemiological Research

Eyal Shahar, MD, MPH
Professor, Division of Epidemiology & Biostatistics
Mel and Enid Zuckerman College of Public Health
The University of Arizona

What is it and why does it matter?
A tool (method) that:
• clarifies our wordy or vague causal thoughts about the research topic
• helps us decide which covariates should enter the statistical model, and which should not
• unifies our understanding of confounding bias, selection bias, and information bias

What is the key question in a non-randomized study?
When estimating the effect of E (“exposure”) on D (“disease”), what should we adjust for?
Or, put differently: what is our confounder selection strategy?

Adjusting for Confounders: Common Practice
• The “change-in-estimate” method
• List “potential confounders”
• Adjust for (condition on) potential confounders
• Compare adjusted estimate to crude estimate (or “fully adjusted” to “partially adjusted”)
• Decide whether “potential confounders” were “real confounders”
• Decide how much confounding existed
• Premise: The data informs us about confounding.
• Are we asking too much from the data?

Adjusting for Confounders: Common Practice
• What is “a potential confounder”?
• Typically, “a cause of the disease that is associated with the exposure”
[Diagram: triangle with nodes Confounder, E, D]
• What is the effect of a confounder?
• It contributes to the crude (observed, marginal) association between E and D

Adjusting for Confounders: Common Practice
• Extension to multiple confounders
[Diagrams: six separate confounder triangles, one for each of C1 to C6, each with E and D]

Adjusting for Confounders: Common Practice, Problems
• A sequence of isolated, independent causal diagrams
• but C1, C2, C3, C4, C5, ... might be connected causally
• Unidirectional arrow = a causal direction
• but what is the meaning of the bidirectional arrow?
• Even with a single confounder, the “change-in-estimate” method could fail

Adjusting for Confounders: Problems
• An example where the “change-in-estimate” method fails
[Diagram: nodes U1, U2, C, E, D]
• The crude estimate may be closer to the truth than the C-adjusted estimate
• To be explained

Alternative: A Causal Diagram
• A method for selecting covariates
• Extension of the confounder triangle
• Premises displayed in the diagram
• New terms:
• Path
• Collider on a path
• Confounding path

Selected references
• Pearl J. Causality: models, reasoning, and inference. Cambridge University Press; 2000
• Greenland S et al. Causal diagrams for epidemiologic research. Epidemiology 1999;10:37-48
• Robins JM. Data, design, and background knowledge in etiologic inference. Epidemiology 2001;12:313-320
• Hernan MA et al. A structural approach to selection bias. Epidemiology 2004;15:615-625
• Shahar E. Causal diagrams for encoding and evaluation of information bias. J Eval Clin Pract (forthcoming)

A Causal Diagram: Notation and Terms
• An arrow = causal direction between two variables
[Diagram: E → D]
• An arrow could abbreviate both direct and indirect effects
[Diagram: a single arrow E → D could summarize a larger diagram with nodes E, U1, U2, U3, D]

A Causal Diagram: Notation and Terms
• A path between E and D: any sequence of causal arrows that connects E to D, regardless of the direction of each arrow
[Diagrams: one path with nodes E, D and three paths with nodes E, U1, U2, D]

A Causal Diagram: Notation and Terms
• Circularity (self-causation) does not exist: Directed Acyclic Graph
[Diagram: nodes E, U1, D, U2]
• A collider on the path between E and D
[Diagram: path E, U1, U2, D in which E → U1 ← U2]
• E and U2 collide at U1

A Causal Diagram: Notation and Terms
• A confounding path for the effect of E on D: any path between E and D that meets the following criteria:
• The arrow next to E points to E
• There are no colliders on the path
[Diagram: nodes C, U1, V1, U2, V2, U3, E, D]
In short: a path showing a common cause of E and D

• The paths below are NOT confounding paths for the effect of E on D
[Diagrams: three rearrangements of the nodes C, U1, V1, U2, V2, U3, E, D]
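The two criteria for a confounding path can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: it stores a diagram as a set of (cause, effect) pairs, lists every path between E and D, and flags colliders. The function names and the two example diagrams (a confounder triangle, and a diagram in which C is a common effect of U1 and U2) are assumptions chosen for illustration.

```python
def simple_paths(edges, start, end):
    """Yield every path from start to end, ignoring arrow direction."""
    neighbours = {}
    for cause, effect in edges:
        neighbours.setdefault(cause, set()).add(effect)
        neighbours.setdefault(effect, set()).add(cause)

    def walk(path):
        if path[-1] == end:
            yield path
            return
        for nxt in sorted(neighbours.get(path[-1], ())):
            if nxt not in path:  # no variable is visited twice
                yield from walk(path + [nxt])

    yield from walk([start])


def colliders(edges, path):
    """Return the variables on the path at which two arrowheads meet."""
    return [
        path[i]
        for i in range(1, len(path) - 1)
        if (path[i - 1], path[i]) in edges and (path[i + 1], path[i]) in edges
    ]


def is_confounding_path(edges, path):
    """Slide criteria: the arrow next to E points to E, and no colliders."""
    arrow_points_to_e = (path[1], path[0]) in edges
    return arrow_points_to_e and not colliders(edges, path)


# Assumed example 1: the confounder triangle.
triangle = {("C", "E"), ("C", "D"), ("E", "D")}
# Assumed example 2: C is a common effect of U1 and U2.
common_effect = {("U1", "E"), ("U1", "C"), ("U2", "C"), ("U2", "D"), ("E", "D")}

for name, edges in [("triangle", triangle), ("common effect", common_effect)]:
    for path in simple_paths(edges, "E", "D"):
        print(name, path,
              "colliders:", colliders(edges, path),
              "confounding:", is_confounding_path(edges, path))
```

In the triangle, the path through C satisfies both criteria. In the second diagram, the path through U1, C, and U2 starts with an arrow into E but contains the collider C, so it is not a confounding path.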

What can affect the association between E and D? (Why do we observe an association between two variables?)
• Causal path: E causes D [Diagram: E → D]
• Causal path: D causes E [Diagram: D → E]
• Confounding paths [Diagram: triangle with nodes C, E, D]
• Adjustment for colliders on a path from E to D (later…)

Why does a confounding path affect the crude (marginal) association between E and D?
Intuitively:
• Association = being able to “guess” the value of one variable (D) from the value of another (E)
• E → D allows us to guess D from E (and E from D)
• A confounding path allows for sequential guesses along the path
[Diagram: nodes C, U1, V1, U2, V2, U3, E, D]

How can we block a confounding path between E and D?
• Condition on a variable on the path (on any variable)
• Methods for conditioning:
• Restriction
• Stratification
• Regression
[Diagram: nodes C, U1, V1, U2, V2, U3, E, D]

A point to remember
• We don’t need to adjust for confounders (the top of the triangle). Adjustment for any U or V below will do.
• U and V are surrogates for the confounder C
[Diagram: nodes C, U1, V1, U2, V2, U3, E, D]

Example
• If the diagram below corresponds to reality, then we have several options for conditioning
• For example:
• On C and U2
• Only on U2
• Only on U3
[Diagram: nodes C, U1, V1, U2, V2, U3, E, D]
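The claim that conditioning on any variable along a confounding path blocks it can be checked with a small exact calculation. The sketch below is an illustration, not the author's example: it assumes a shortened diagram C → U → E and C → D with no effect of E on D, binary variables, and made-up probabilities. It computes the risk difference for D by E in the whole population and within strata of C and of U.

```python
from itertools import product

NAMES = ("C", "U", "E", "D")


def bernoulli(p, value):
    """Probability that a binary variable with P(1) = p takes `value`."""
    return p if value == 1 else 1 - p


def joint_distribution():
    """Exact joint distribution for C -> U -> E and C -> D (assumed values)."""
    table = {}
    for c, u, e, d in product((0, 1), repeat=4):
        table[(c, u, e, d)] = (
            bernoulli(0.5, c)
            * bernoulli(0.8 if c else 0.2, u)  # U depends on C
            * bernoulli(0.7 if u else 0.3, e)  # E depends on U only
            * bernoulli(0.6 if c else 0.1, d)  # D depends on C only, not on E
        )
    return table


def prob(table, **fixed):
    """Total probability of the cells that match the fixed values."""
    return sum(
        p for cell, p in table.items()
        if all(cell[NAMES.index(name)] == v for name, v in fixed.items())
    )


def risk_difference(table, **stratum):
    """P(D=1 | E=1, stratum) - P(D=1 | E=0, stratum)."""
    risk_exposed = prob(table, E=1, D=1, **stratum) / prob(table, E=1, **stratum)
    risk_unexposed = prob(table, E=0, D=1, **stratum) / prob(table, E=0, **stratum)
    return risk_exposed - risk_unexposed


table = joint_distribution()
print("crude:", round(risk_difference(table), 4))
for name in ("C", "U"):
    for level in (0, 1):
        rd = risk_difference(table, **{name: level})
        print(f"within {name}={level}:", round(rd, 4))
```

With these assumed numbers the crude risk difference is 0.12 although E has no effect on D, and it vanishes within every stratum of C and within every stratum of the surrogate U.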

What can affect the association between E and D?
• Causal path: E causes D [Diagram: E → D]
• Causal path: D causes E [Diagram: D → E]
• Confounding paths [Diagram: triangle with nodes C, E, D]
• Adjustment for colliders on a path from E to D (NOW!)

Conditioning on a Collider: A Trap
• A collider may be viewed as the opposite of a confounder
• Collider and confounder are symmetrical entities, like matter and anti-matter
[Diagram: nodes C, U1, V1, U2, V2, U3, E, D, with the labels “Collider” and “Confounder”]

Conditioning on a Collider: A Trap
• A path from E to D that contains a collider is NOT a confounding path. There is no transfer of “guesses” across a collider.
• A path from E to D that contains a collider does NOT generate an association between E and D
• Conditioning on the collider, however, will turn that path into a confounding path. Why?

Conditioning on a Collider: A Trap
[Diagram: nodes C, V1, U1, U2, V2, U3, E, D, with a horizontal line between two of the nodes]
The horizontal line indicates an association (the possibility of “guesses”) that was induced by conditioning on a collider

Properties of a Collider: Intuitive Explanation
• A dataset contains three variables for N cars:
• Brake condition (good/bad)
• Street condition in the owner’s town (good/bad)
• Involved in an accident in the owner’s town? (yes/no)
[Diagram: Brake condition (good, bad) → Accident (yes, no) ← Street condition (good, bad)]
• Accident is a collider.
• Brake condition and street condition are not associated in the dataset. We cannot use the data to guess one from the other.

Properties of a Collider: Intuitive Explanation
• Why can’t we make a guess from the data?
• Let’s try. Suppose we are told:
• Car A has good brakes and car B has bad brakes.
• This information tells us nothing about the street condition in each owner’s town.
• Intuition: a common effect (collider) does not induce an association between its causes (colliding variables)

Properties of a Collider: Intuitive Explanation
• If, however, we condition (stratify) on the collider “accident”, we can make some guesses about the street condition from the brake condition.
Stratum #1: Accident = yes

Properties of a Collider: Intuitive Explanation
• Similarly, in the other stratum
Stratum #2: Accident = no
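The car example can be made concrete with an exact calculation. The sketch below is illustrative only: the probabilities of bad brakes, bad streets, and an accident are made-up values, not data from the slides. It assumes brake condition and street condition are independent causes of an accident, and it compares the probability of bad streets given the brake condition in the full dataset and within each accident stratum.

```python
from itertools import product

P_BAD_BRAKES = 0.3   # assumed
P_BAD_STREETS = 0.4  # assumed, independent of brake condition
P_ACCIDENT = {       # assumed P(accident = yes | brakes, streets)
    ("good", "good"): 0.05,
    ("bad", "good"): 0.20,
    ("good", "bad"): 0.20,
    ("bad", "bad"): 0.50,
}


def joint_distribution():
    """Exact joint distribution of (brakes, streets, accident)."""
    table = {}
    levels = ("good", "bad")
    for brakes, streets, accident in product(levels, levels, ("yes", "no")):
        p_brakes = P_BAD_BRAKES if brakes == "bad" else 1 - P_BAD_BRAKES
        p_streets = P_BAD_STREETS if streets == "bad" else 1 - P_BAD_STREETS
        p_accident = P_ACCIDENT[(brakes, streets)]
        if accident == "no":
            p_accident = 1 - p_accident
        table[(brakes, streets, accident)] = p_brakes * p_streets * p_accident
    return table


def p_bad_streets(table, brakes, accident=None):
    """P(streets = bad | brakes), optionally within one accident stratum."""
    cells = {
        key: p for key, p in table.items()
        if key[0] == brakes and (accident is None or key[2] == accident)
    }
    bad = sum(p for key, p in cells.items() if key[1] == "bad")
    return bad / sum(cells.values())


table = joint_distribution()
for stratum in (None, "yes", "no"):
    label = "all cars" if stratum is None else f"accident = {stratum}"
    print(label,
          "| good brakes:", round(p_bad_streets(table, "good", stratum), 3),
          "| bad brakes:", round(p_bad_streets(table, "bad", stratum), 3))
```

Across all cars, the probability of bad streets is the same for good and bad brakes, so the brake condition says nothing about the streets. Within either accident stratum the two probabilities differ: among cars that had an accident, for instance, good brakes make bad streets the more likely explanation.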

Properties of a Collider
In summary:
• Conditioning on a collider creates an association between the colliding variables and, therefore, may open a confounding path
[Diagrams: nodes U1, U2, C, E, D, shown before conditioning on C and after conditioning on C]

Derivations
• The “change-in-estimate” method could fail if we condition on colliders, and thereby open confounding paths
• To (rationally) select covariates for adjustment, we must commit to a causal diagram (premises)
(But we often say that we don’t know and can’t commit, and hope that the change-in-estimate method will work.)
Causal inference, like all scientific inference, is conditional on premises (which may be false), not on ignorance

Derivations
• Do not condition on colliders, if possible
• If you condition on a collider:
• Connect the colliding variables by a line
• Check if you opened a new confounding path
• Condition on another variable to block that new path
[Diagrams: nodes U1, U2, C, E, D, shown for “Conditioning on C alone” and for “Conditioning on C and (U1 or U2)”]
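The repair rule above (after conditioning on a collider, condition on another variable to block the newly opened path) can be illustrated with an exact calculation. The sketch below assumes a specific arrow structure, since the arrows of the slide's diagram are not preserved here: U1 → E, U1 → C, U2 → C, U2 → D, and no effect of E on D. All probabilities are made-up values.

```python
from itertools import product

NAMES = ("U1", "U2", "C", "E", "D")


def bernoulli(p, value):
    """Probability that a binary variable with P(1) = p takes `value`."""
    return p if value == 1 else 1 - p


def joint_distribution():
    """Exact joint distribution; C is a common effect of U1 and U2 (assumed)."""
    table = {}
    for u1, u2, c, e, d in product((0, 1), repeat=5):
        table[(u1, u2, c, e, d)] = (
            bernoulli(0.5, u1)
            * bernoulli(0.5, u2)
            * bernoulli(0.1 + 0.4 * u1 + 0.4 * u2, c)  # collider
            * bernoulli(0.2 + 0.6 * u1, e)             # E depends on U1 only
            * bernoulli(0.2 + 0.6 * u2, d)             # D depends on U2 only
        )
    return table


def prob(table, **fixed):
    """Total probability of the cells that match the fixed values."""
    return sum(
        p for cell, p in table.items()
        if all(cell[NAMES.index(name)] == v for name, v in fixed.items())
    )


def risk_difference(table, **stratum):
    """P(D=1 | E=1, stratum) - P(D=1 | E=0, stratum)."""
    risk_exposed = prob(table, E=1, D=1, **stratum) / prob(table, E=1, **stratum)
    risk_unexposed = prob(table, E=0, D=1, **stratum) / prob(table, E=0, **stratum)
    return risk_exposed - risk_unexposed


table = joint_distribution()
print("crude:", round(risk_difference(table), 4))
for c in (0, 1):
    print(f"within C={c}:", round(risk_difference(table, C=c), 4))
    for u1 in (0, 1):
        rd = risk_difference(table, C=c, U1=u1)
        print(f"within C={c}, U1={u1}:", round(rd, 4))
```

Under these assumptions the crude risk difference is zero, which is the truth. Conditioning on C alone produces a nonzero estimate, and additionally conditioning on U1 (or on U2) returns it to zero.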

Practical advice
• Study one exposure at a time
• A model that may be good for exposure A might not be good for exposure B (even if B is in the model)
• Never adjust for an effect of the exposure
• Never adjust for an effect of the disease
• Never select covariates by stepwise regression
• Never look at p-values to decide on confounding
• (actually, never look at p-values…)

Extension to other problems of causal inquiry
• Causation always remains uncertain, even if we deal with a single confounder
• We draw: [Diagram: nodes C, E, D]
• Unbeknown to us, the reality happens to be: [Diagram: nodes U1, U2, C, E, D]
• We naively condition on C, and our adjustment may fail

Extension to other problems of causal inquiry
• Estimating the “direct” effect by conditioning on an intermediary variable, I
[Diagram: nodes E, I, D]
• We should remember that variable I may be a collider
[Diagrams: nodes U, E, I, D]

Extension to other problems of causal inquiry
• Causal diagrams explain the mechanism of selection bias
• Example: What happens if we estimate the effect of marital status on dementia in a sample of nursing home residents?
• Assume: marital status has no effect on dementia, and both variables affect “place of residence” (home or nursing home)

Extension to other problems of causal inquiry
[Diagram: Marital status → Place of residence (home, nursing home) ← Dementia]
• By studying a sample of nursing home residents, we are conditioning on a collider (on a “sampling collider”) and might create an association between marital status and dementia in that stratum

Extension to other problems of causal inquiry
[Diagram: Marital status, Dementia, and Place of residence (home, nursing home), shown after “stratification” into Home and Nursing home]

Extensions: control selection bias (Source: Hernan et al, Epidemiology 2004)
• Source cohort: no effect of estrogen (E) on MI (D)
• Selection into a case-control sample: S (0,1), where S=1 is the sample
• D → S because disease status affects selection. Diseased members of the cohort are over-sampled (cases) relative to non-diseased (controls)
• Suppose: F is hip fracture
• Suppose: E → F
• Suppose: controls preferentially selected from women with hip fracture
[Diagrams: nodes E (Estrogen), D (MI), S, shown without and with F]

Extensions: control selection bias (Source: Hernan et al, Epidemiology 2004)
[Diagram: nodes E (Estrogen), D (MI), F, S (0,1), stratified into S=1 (our case-control sample) and S=0 (remainder of the source cohort)]
• Association of E (HRT) and D (MI) was created
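The control selection example can be sketched numerically. The following is a hedged illustration with made-up probabilities, not figures from Hernan et al: it assumes estrogen use (E) lowers the risk of hip fracture (F), has no effect on MI (D), that all cases are sampled, and that controls with a hip fracture are ten times more likely to be sampled than controls without one. It compares the E-D odds ratio in the source cohort with the odds ratio in the selected sample (S=1).

```python
from itertools import product


def bernoulli(p, value):
    """Probability that a binary variable with P(1) = p takes `value`."""
    return p if value == 1 else 1 - p


def source_cohort():
    """Exact joint distribution of (E, F, D, S); all values are assumed."""
    table = {}
    for e, f, d, s in product((0, 1), repeat=4):
        # Cases are always sampled; controls mostly from women with F = 1.
        p_select = 1.0 if d else (0.20 if f else 0.02)
        table[(e, f, d, s)] = (
            bernoulli(0.30, e)
            * bernoulli(0.05 if e else 0.15, f)  # E -> F
            * bernoulli(0.10, d)                 # D does not depend on E
            * bernoulli(p_select, s)             # D -> S and F -> S
        )
    return table


def odds_ratio(table, selected=None):
    """Odds ratio for E and D, optionally restricted to S == selected."""
    cells = {(e, d): 0.0 for e in (0, 1) for d in (0, 1)}
    for (e, f, d, s), p in table.items():
        if selected is None or s == selected:
            cells[(e, d)] += p
    return (cells[(1, 1)] * cells[(0, 0)]) / (cells[(1, 0)] * cells[(0, 1)])


cohort = source_cohort()
print("source cohort OR:", round(odds_ratio(cohort), 3))
print("case-control sample OR:", round(odds_ratio(cohort, selected=1), 3))
```

In the source cohort the odds ratio is 1. In the sample it exceeds 1, because estrogen users are under-represented among controls drawn preferentially from women with hip fracture.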

Extensions: information bias (LAST EXAMPLE)
[Diagram: nodes Estrogen use, Endometrial cancer, Diagnosed endometrial cancer, Vaginal bleeding, Frequency of exams, with the symbols E, D, D*, Z; one arrow is marked “?”]

Summary Points
• The “change-in-estimate” method could fail if we condition on colliders, and thereby open confounding paths
• The theory of causal diagrams extends the idea of a confounder to the multi-confounder case
• Unification of confounding bias, selection bias, and information bias under a single theoretical framework

• “Back-door algorithm”
• Sufficient set for adjustment
• Minimally sufficient set
• Differential losses to follow-up
• Time-dependent confounders
• Interpretation of hazard ratios
• Conditioning on a common effect always induces an association between its causes, but this association could be restricted to some levels of the common effect

[Diagram: Smoking status and FEV1, with the arrow between them marked “?”, plus Age (young, old), Sex, Smoking drive (low, high), Physical activity (low, high), Asthma (yes, no)]

[Diagram: Ulcer, Pneumonia, Abdominal Pain, Coughing, and Hospitalization Status (hospitalized, not hospitalized), with one arrow marked “?”]
[Diagram: the same variables after stratification into hospitalized patients and other patients]

Example: Do men have higher systolic blood pressure than women?
(In other words: estimate the gender effect on systolic blood pressure)
The following table summarizes the answer to this question from two regression models.
So, which is the true estimate and which is biased?

[Diagram: nodes WHR, Gender, SBP, BMI, Z1, Z2, ...]

[Diagram: nodes U, WHR, Gender, SBP, BMI, Z1, Z2, ...]
