1 / 50

Causal Discovery

Explore non-experimental evidence, causal estimation, representation, and ideal interventions in understanding the link between causation and probability. This outline covers statistical causal models, structural equation models, causal graphs, and more.

mleahy
Download Presentation

Causal Discovery

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour, and many others Dept. of Philosophy & CALD Carnegie Mellon Nov. 13th, 2003

  2. Outline • Motivation • Representation • Connecting Causation to Probability (Independence) • Searching for Causal Models • Improving on Regression for Causal Inference Nov. 13th, 2003

  3. 1. Motivation Non-experimental Evidence TypicalPredictiveQuestions • Can we predict aggressiveness from the amount of violent TV watched • Can we predict crime rates from abortion rates 20 years ago Causal Questions: • Does watching violent TV cause Aggression? • I.e., if we change TV watching, will the level of Aggression change? Nov. 13th, 2003

  4. Causal Estimation Manipulated Probability P(Y | X set= x, Z=z) from Unmanipulated Probability P(Y | X = x, Z=z) When and how can we use non-experimental data to tell us about the effect of an intervention? Nov. 13th, 2003

  5. 2. Representation • Association & causal structure -qualitatively • Interventions • StatisticalCausal Models • Bayes Networks • Structural Equation Models Nov. 13th, 2003

  6. Causation & Association X and Y are associated (X _||_ Y) iff x1  x2 P(Y | X = x1)  P(Y | X = x2) Association is symmetric: X _||_ Y  Y _||_ X X is a cause of Y iff x1  x2 P(Y | X set= x1)  P(Y | X set= x2) Causation is asymmetric: X Y  Y X Nov. 13th, 2003

  7. Direct Causation X is a direct cause of Y relative to S, iff z,x1  x2 P(Y | X set= x1 , Zset=z)  P(Y | X set= x2 , Zset=z) where Z = S - {X,Y} Nov. 13th, 2003

  8. Causal Graphs Causal Graph G = {V,E} Each edge X  Y represents a direct causal claim: X is a direct cause of Y relative to V Chicken Pox Nov. 13th, 2003

  9. Causal Graphs Not Cause Complete Common Cause Complete Nov. 13th, 2003

  10. Modeling Ideal Interventions • Ideal Interventions (on a variable X): • Completely determine the value or distribution of a variable X • Directly Target only X • (no “fat hand”) • E.g., Variables: Confidence, Athletic Performance • Intervention 1: hypnosis for confidence • Intervention 2: anti-anxiety drug (also muscle relaxer) Nov. 13th, 2003

  11. Modeling Ideal Interventions Interventions on the Effect Post Pre-experimental System Room Temperature Sweaters On Nov. 13th, 2003

  12. Modeling Ideal Interventions Interventions on the Cause Post Pre-experimental System Sweaters On Room Temperature Nov. 13th, 2003

  13. Interventions & Causal Graphs • Model an ideal intervention by adding an “intervention” variable outside the original system • Erase all arrows pointing into the variable intervened upon Intervene to change Inf Post-intervention graph? Pre-intervention graph Nov. 13th, 2003

  14. Conditioningvs. Intervening P(Y | X = x1)vs. P(Y | X set= x1)Teeth Slides Nov. 13th, 2003

  15. Causal Bayes Networks The Joint Distribution Factors According to the Causal Graph, i.e., for all X in V P(V) = P(X|Immediate Causes of(X)) P(S = 0) = .7 P(S = 1) = .3 P(YF = 0 | S = 0) = .99 P(LC = 0 | S = 0) = .95 P(YF = 1 | S = 0) = .01 P(LC = 1 | S = 0) = .05 P(YF = 0 | S = 1) = .20 P(LC = 0 | S = 1) = .80 P(YF = 1 | S = 1) = .80 P(LC = 1 | S = 1) = .20 P(S,YF, L) = P(S) P(YF | S) P(LC | S) Nov. 13th, 2003

  16. Structural Equation Models 1. Structural Equations 2. Statistical Constraints Causal Graph Statistical Model Nov. 13th, 2003

  17. Structural Equation Models • Structural Equations: One Equation for each variable V in the graph: V = f(parents(V), errorV) for SEM (linear regression) f is a linear function • Statistical Constraints: Joint Distribution over the Error terms Causal Graph Nov. 13th, 2003

  18. Causal Graph SEM Graph (path diagram) Structural Equation Models Equations: Education = ed Income =Educationincome Longevity =EducationLongevity Statistical Constraints: (ed, Income,Income ) ~N(0,2) 2diagonal - no variance is zero Nov. 13th, 2003

  19. 3. ConnectingCausation to Probability Nov. 13th, 2003

  20. The Markov Condition Causal Structure Statistical Predictions Causal Markov Axiom Independence X _||_ Z | Y i.e., P(X | Y) = P(X | Y, Z) Causal Graphs Nov. 13th, 2003

  21. Causal Markov Axiom If G is a causal graph, and P a probability distribution over the variables in G, then in P: every variable V is independent of its non-effects, conditional on its immediate causes. Nov. 13th, 2003

  22. Causal Markov Condition Two Intuitions: 1) Immediate causes make effectsindependent of remote causes (Markov). 2) Common causes make their effectsindependent (Salmon). Nov. 13th, 2003

  23. Causal Markov Condition 1) Immediate causes make effectsindependent of remote causes (Markov). E = Exposure to Chicken Pox I = Infected S = Symptoms Markov Cond. E || S | I Nov. 13th, 2003

  24. Causal Markov Condition 2) Effects are independent conditional on their commoncauses. YF || LC | S Markov Cond. Nov. 13th, 2003

  25. Causal Structure Statistical Data Nov. 13th, 2003

  26. Causal Markov Axiom In SEMs, d-separation follows from assuming independence among error terms that have no connection in the path diagram - i.e., assuming that the model is common cause complete. Nov. 13th, 2003

  27. Causal Markov and D-Separation • In acyclic graphs: equivalent • Cyclic Linear SEMs with uncorrelated errors: • D-separation correct • Markov condition incorrect • Cyclic Discrete Variable Bayes Nets: • If equilibrium --> d-separation correct • Markov incorrect Nov. 13th, 2003

  28. P(X3 | X2)  P(X3 | X2, X1) X3 _||_ X1 | X2 P(X3 | X2 set= ) = P(X3 | X2 set=, X1) X3 _||_ X1 | X2set= D-separation: Conditioning vs. Intervening Nov. 13th, 2003

  29. 4. Search From Statistical Datato Probability toCausation Nov. 13th, 2003

  30. Statistical Inference Background Knowledge - X2 before X3 - no unmeasured common causes Causal DiscoveryStatistical DataCausal Structure Nov. 13th, 2003

  31. Representations ofD-separation Equivalence Classes We want the representations to: • Characterize the Independence Relations Entailed by the Equivalence Class • Represent causal features that are shared by every member of the equivalence class Nov. 13th, 2003

  32. Patterns & PAGs • Patterns(Verma and Pearl, 1990): graphical representation of an acyclic d-separation equivalence - no latent variables. • PAGs: (Richardson 1994) graphical representation of an equivalence class including latent variable models and sample selection bias that are d-separation equivalent over a set of measured variables X Nov. 13th, 2003

  33. Patterns Nov. 13th, 2003

  34. Patterns: What the Edges Mean Nov. 13th, 2003

  35. Patterns Nov. 13th, 2003

  36. PAGs: Partial Ancestral Graphs What PAG edges mean. Nov. 13th, 2003

  37. PAGs: Partial Ancestral Graph Nov. 13th, 2003

  38. Tetrad 4 Demo www.phil.cmu.edu/projects/tetrad_download/ Nov. 13th, 2003

  39. Overview of Search Methods • Constraint Based Searches • TETRAD • Scoring Searches • Scores: BIC, AIC, etc. • Search: Hill Climb, Genetic Alg., Simulated Annealing • Very difficult to extend to latent variable models Heckerman, Meek and Cooper (1999). “A Bayesian Approach to Causal Discovery” chp. 4 in Computation, Causation, and Discovery, ed. by Glymour and Cooper, MIT Press, pp. 141-166 Nov. 13th, 2003

  40. 5. Regessionand Causal Inference Nov. 13th, 2003

  41. Regression to estimate Causal Influence • Let V = {X,Y,T}, where - Y : measured outcome - measured regressors: X = {X1, X2, …, Xn}- latent common causes of pairs in X U Y: T = {T1, …, Tk} • Let the true causal model over V be a Structural Equation Model in which each V  V is a linear combination of its direct causes and independent, Gaussian noise. Nov. 13th, 2003

  42. Regression to estimate Causal Influence • Consider the regression equation: Y = b0 + b1X1 + b2X2 + ..…bnXn • Let the OLS regression estimate bi be the estimatedcausal influence of Xi on Y. • That is, holding X/Xi experimentally constant, bi is an estimate of the change in E(Y) that results from an intervention that changes Xi by 1 unit. • Let the realCausalInfluence Xi Y = bi • When is the OLS estimate bi an unbiased estimate of bi ? Nov. 13th, 2003

  43. Linear Regression Let the other regressors O = {X1, X2,....,Xi-1, Xi+1,...,Xn} bi = 0 if and only if Xi,Y.O = 0 In a multivariate normal distribuion, Xi,Y.O = 0if and only if Xi _||_ Y | O Nov. 13th, 2003

  44. Linear Regression So in regression: bi = 0 Xi _||_ Y | O But provably : bi = 0 S  O, Xi _||_ Y | S So S  O, Xi _||_ Y | S  bi = 0 ~ S  O, Xi _||_ Y | S  don’t know(unless we’re lucky) Nov. 13th, 2003

  45. X1 _||_ Y | X2 Regression Example b1 0 b2 = 0 X2 _||_ Y | X1 Don’t know ~S  {X2} X1 _||_ Y | S b2 = 0 S  {X1} X2 _||_ Y | {X1} Nov. 13th, 2003

  46. X1 _||_ Y | {X2,X3} X2 _||_ Y | {X1,X3} X3 _||_ Y | {X1,X2} Regression Example b1 0 b2 0 b3 0 DK ~S  {X2,X3}, X1 _||_ Y | S b2 = 0 S  {X1,X3}, X2 _||_ Y | {X1} DK ~S  {X1,X2}, X3 _||_ Y | S Nov. 13th, 2003

  47. X2 X1 X3 Y PAG Regression Example Nov. 13th, 2003

  48. Regression Bias If • Xi is d-separated from Y conditional on X/Xi in the true graph after removing Xi Y, and • X contains no descendant of Y, then: biis an unbiased estimate ofbi See “Using Path Diagrams ….” Nov. 13th, 2003

  49. Applications • Rock Classification • Spartina Grass • College Plans • Political Exclusion • Satellite Calibration • Naval Readiness • Parenting among Single, Black Mothers • Pneumonia • Photosynthesis • Lead - IQ • College Retention • Corn Exports Nov. 13th, 2003

  50. Causation, Prediction, and Search, 2nd Edition, (2000), by P. Spirtes, C. Glymour, and R. Scheines ( MIT Press) Causality: Models, Reasoning, and Inference, (2000), Judea Pearl, Cambridge Univ. Press Computation, Causation, & Discovery (1999), edited by C. Glymour and G. Cooper, MIT Press Causality in Crisis?, (1997) V. McKim and S. Turner (eds.), Univ. of Notre Dame Press. TETRAD IV: www.phil.cmu.edu/projects/tetrad Web Course on Causal and Statistical Reasoning : www.phil.cmu.edu/projects/csr/ References Nov. 13th, 2003

More Related