Ockham’s Razor in Causal Discovery: A New Explanation. Kevin T. Kelly and Conor Mayo-Wilson, Department of Philosophy, Joint Program in Logic and Computation, Carnegie Mellon University. www.hss.cmu.edu/philosophy/faculty-kelly.php
I. Prediction vs. Policy
Predictive Links • Correlation or co-dependency allows one to predict Y from X. [Figure: a scientist and a policy maker; variables Ash trays and Lung cancer. Speech bubble: “Ash trays linked to lung cancer!”]
Policy • Policy manipulates X to achieve a change in Y. [Figure: Ash trays, Lung cancer. Speech bubbles: “Ash trays linked to lung cancer!” “Prohibit ash trays!”]
Policy • Policy manipulates X to achieve a change in Y. [Figure: Ash trays, Lung cancer. Speech bubble: “We failed!”]
Correlation is not Causation • Manipulation of X can destroy the correlation of X with Y. [Figure: Ash trays, Lung cancer. Speech bubble: “We failed!”]
Standard Remedy • Randomized controlled study. [Figure: Ash trays, Lung cancer. Speech bubble: “That’s what happens if you carry out the policy.”]
Infeasibility • Expense • Morality [Figure: Lead, IQ. Speech bubble: “Let me force a few thousand children to eat lead.”]
Infeasibility • Expense • Morality [Figure: Lead, IQ. Speech bubble: “Just joking!”]
Ironic Alliance [Figure: industry; Lead, IQ. Speech bubble: “Ha! You will never prove that lead affects IQ…”]
Ironic Alliance [Figure: Lead, IQ. Speech bubble: “And you can’t throw my people out of work on a mere whim.”]
Ironic Alliance [Figure: Lead, IQ. Speech bubble: “So I will keep on polluting, which will never settle the matter because it is not a randomized trial.”]
Causal Discovery • Patterns of conditional correlation can imply unambiguous causal conclusions (Pearl, Spirtes, Glymour, Scheines, etc.). [Figure: network over Protein A, Protein B, Protein C, and Cancer protein. Speech bubble: “Eliminate protein C!”]
Basic Idea • Causation is a directed, acyclic network over variables. • What makes a network causal is a relation of compatibility between networks and joint probability distributions. [Figure: a network G over X, Y, Z and a joint distribution p, linked by “compatibility”]
Compatibility • Joint distribution p is compatible with directed, acyclic network G iff: • Causal Markov Condition: each variable X is independent of its non-effects given its immediate causes. • Faithfulness Condition: every conditional independence relation that holds in p is a consequence of the Causal Markov Condition. [Figure: example network over V, W, X, Y, Z]
Common Cause • B yields info about C (Faithfulness); • B yields no further info about C given A (Markov). [Figure: A is a common cause of B and C]
Causal Chain • B yields info about C (Faithfulness); • B yields no further info about C given A (Markov). [Figure: causal chain over B, A, C, with A in the middle]
Common Effect • B yields no info about C (Markov); • B yields extra info about C given A (Faithfulness). [Figure: A is a common effect of B and C]
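The three patterns above can be checked numerically. The following is a minimal sketch, assuming linear relationships with independent standard normal noise and a chain oriented B → A → C; both are illustrative choices that the slides do not specify. The helper names `pearson`, `partial_corr`, and `simulate` are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after linearly controlling for zs."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def simulate(structure, n=20000, seed=0):
    """Return (corr(B, C), partial corr(B, C given A)) for one structure."""
    rng = random.Random(seed)
    a, b, c = [], [], []
    for _ in range(n):
        if structure == "common_cause":      # B <- A -> C
            ai = rng.gauss(0, 1)
            bi = ai + rng.gauss(0, 1)
            ci = ai + rng.gauss(0, 1)
        elif structure == "chain":           # B -> A -> C (assumed direction)
            bi = rng.gauss(0, 1)
            ai = bi + rng.gauss(0, 1)
            ci = ai + rng.gauss(0, 1)
        elif structure == "common_effect":   # B -> A <- C
            bi = rng.gauss(0, 1)
            ci = rng.gauss(0, 1)
            ai = bi + ci + rng.gauss(0, 1)
        else:
            raise ValueError(structure)
        a.append(ai)
        b.append(bi)
        c.append(ci)
    return pearson(b, c), partial_corr(b, c, a)


if __name__ == "__main__":
    for s in ("common_cause", "chain", "common_effect"):
        r, pr = simulate(s)
        print(f"{s:14s} corr(B,C)={r:+.2f}  corr(B,C | A)={pr:+.2f}")
```

Under these assumed models the population values are: common cause, correlation 0.5 and partial correlation 0; chain, correlation 1/√3 and partial correlation 0; common effect, correlation 0 and partial correlation −0.5. Only the common effect reverses the pattern, which is why it is the distinctive case.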
Distinguishability [Figure: four networks over A, B, C, grouped under the labels “indistinguishable” and “distinctive”]
Immediate Connections • There is an immediate causal connection between X and Y iff X is dependent on Y given every subset of variables not containing X and Y (Spirtes, Glymour and Scheines). [Figure: two networks over X, Y, Z, W. Captions: “No intermediate conditioning set breaks dependency”; “Some conditioning set breaks dependency”]
Recovery of Skeleton • Apply preceding condition to recover every non-oriented immediate causal connection. [Figure: the true network (“truth”) and its undirected “skeleton”]
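The skeleton condition can be sketched as a brute-force search over conditioning sets. This is an illustration only: it assumes a perfect independence oracle, and the example network X → Z ← Y, Z → W, the function `recover_skeleton`, and the table `_INDEPENDENCIES` are all hypothetical.

```python
from itertools import combinations


def recover_skeleton(variables, independent):
    """Keep edge X - Y iff no conditioning set makes X and Y independent.

    `independent(x, y, cond)` is an assumed oracle for the true
    conditional independence relations. Returns the undirected edges
    and, for each non-adjacent pair, the first separating set found.
    """
    edges = set()
    sepsets = {}
    for x, y in combinations(variables, 2):
        others = [v for v in variables if v not in (x, y)]
        separator = None
        for size in range(len(others) + 1):
            for cond in combinations(others, size):
                if independent(x, y, frozenset(cond)):
                    separator = frozenset(cond)
                    break
            if separator is not None:
                break
        if separator is None:
            edges.add(frozenset((x, y)))
        else:
            sepsets[frozenset((x, y))] = separator
    return edges, sepsets


# Hypothetical oracle: the independencies of X -> Z <- Y, Z -> W.
_INDEPENDENCIES = {
    frozenset("XY"): [frozenset()],
    frozenset("XW"): [frozenset("Z"), frozenset("ZY")],
    frozenset("YW"): [frozenset("Z"), frozenset("ZX")],
}


def oracle(x, y, cond):
    """True iff x and y are independent given cond in the example network."""
    return cond in _INDEPENDENCIES.get(frozenset((x, y)), [])


if __name__ == "__main__":
    edges, sepsets = recover_skeleton("XYZW", oracle)
    print(sorted("-".join(sorted(e)) for e in edges))
```

The search is exponential in the number of variables; it is written for clarity, not efficiency. The recorded separating sets are what an orientation step would consult next.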
Orientation of Skeleton • Look for the distinctive pattern of common effects. [Figure: the true network (“truth”) and the skeleton, with the common effect marked]
Orientation of Skeleton • Look for the distinctive pattern of common effects. • Draw all deductive consequences of these orientations. [Figure: the true network (“truth”) and the skeleton, with the common effect marked. Annotation: “Y is not common effect of ZY, so orientation must be downward.”]
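The two orientation steps can be sketched as follows. This is a simplified illustration, not the authors' full procedure: it applies the common-effect test and a single propagation rule, and it assumes the skeleton and separating sets are already known. The function `orient` and the example inputs (for a hypothetical network X → Z ← Y, Z → W) are assumptions.

```python
from itertools import combinations


def orient(edges, sepsets):
    """Orient a skeleton: first common effects, then forced consequences.

    edges: set of frozenset pairs (the undirected skeleton).
    sepsets: maps each non-adjacent pair to a set that separates it.
    Returns a set of directed (cause, effect) tuples; edges that stay
    undirected are omitted.
    """
    nodes = sorted(set().union(*edges))

    def adjacent(u, v):
        return frozenset((u, v)) in edges

    directed = set()

    # Step 1: X - Z - Y with X, Y non-adjacent and Z outside their
    # separating set is the distinctive common-effect pattern.
    for z in nodes:
        neighbours = [v for v in nodes if adjacent(v, z)]
        for x, y in combinations(neighbours, 2):
            if not adjacent(x, y) and z not in sepsets[frozenset((x, y))]:
                directed.update({(x, z), (y, z)})

    # Step 2: if X -> Z and Z - W is unoriented with X, W non-adjacent,
    # then Z -> W, because W -> Z would create a new common effect.
    changed = True
    while changed:
        changed = False
        for x, z in list(directed):
            for w in nodes:
                if w == x or not adjacent(z, w) or adjacent(x, w):
                    continue
                if (z, w) not in directed and (w, z) not in directed:
                    directed.add((z, w))
                    changed = True
    return directed


# Hypothetical inputs for the network X -> Z <- Y, Z -> W.
EDGES = {frozenset("XZ"), frozenset("YZ"), frozenset("ZW")}
SEPSETS = {
    frozenset("XY"): frozenset(),
    frozenset("XW"): frozenset("Z"),
    frozenset("YW"): frozenset("Z"),
}

if __name__ == "__main__":
    print(sorted(orient(EDGES, SEPSETS)))
```

In the example, Z is a common effect of X and Y because Z is not in their separating set. Z is not a common effect of X and W, so the remaining edge is forced to point from Z to W, which mirrors the “orientation must be downward” step on the slide.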
Causation from Correlation • The following network is causally unambiguous if all variables are observed. [Figure: network over Protein A, Protein B, Protein C, and Cancer protein]
Causation from Correlation • The red arrow is also immune to latent confounding causes. [Figure: network over Protein A, Protein B, Protein C, and Cancer protein, with one arrow in red]
Brave New World for Policy • Experimental (confounder-proof) conclusions from correlational data! [Figure: network over Protein A, Protein B, Protein C, and Cancer protein. Speech bubble: “Eliminate protein C!”]
Metaphysics vs. Inference • The above results all assume that the true statistical independence relations for p are given. • But they must be inferred from finite samples. [Figure: Sample; Inferred statistical dependencies; Causal conclusions]
Problem of Induction • Independence is indistinguishable from sufficiently small dependence at sample size n. [Figure: data; dependence; independence]
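To make the dependence on sample size concrete, here is a small sketch using the Fisher z-test for a correlation. The slides do not name a test, so this choice, the function name `detection_threshold`, and the 1.96 cutoff (roughly a 5% two-sided level) are assumptions.

```python
import math


def detection_threshold(n, z_crit=1.96):
    """Smallest |sample correlation| rejected by a two-sided Fisher z-test.

    Under independence, atanh(r) * sqrt(n - 3) is approximately standard
    normal, so the test rejects when |r| > tanh(z_crit / sqrt(n - 3)).
    """
    if n <= 3:
        raise ValueError("need n > 3")
    return math.tanh(z_crit / math.sqrt(n - 3))


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, round(detection_threshold(n), 4))
```

The threshold shrinks roughly like 1/√n but never reaches zero. A true correlation of 0.01, for instance, falls below the threshold at n = 10,000 and above it at n = 1,000,000, so a dependence missed today can surface at a larger sample size.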
Bridging the Inductive Gap • Assume conditional independence until the data show otherwise. • Ockham’s razor: assume no more causal complexity than necessary.
Inferential Instability • No guarantee that small dependencies will not be detected later. • Can have spectacular impact on prior causal conclusions.
Current Policy Analysis [Figure: network over Protein A, Protein B, Protein C, and Cancer protein. Speech bubble: “Eliminate protein C!”]
As Sample Size Increases… [Figure: network over Protein A, Protein B, Protein C, Protein D, and Cancer protein, with one connection marked “weak”. Speech bubble: “Rescind that order!”]
As Sample Size Increases Again… [Figure: network over Protein A, Protein B, Protein C, Protein D, Protein E, and Cancer protein, with three connections marked “weak”. Speech bubble: “Eliminate protein C again!”]
As Sample Size Increases Again… [Figure: as on the previous slide, with “Etc.” added. Speech bubble: “Eliminate protein C again!”]
Typical Applications • Linear Causal Case: each variable X is a linear function of its parents and a normally distributed hidden variable called an “error term”. The error terms are mutually independent. • Discrete Multinomial Case: each variable X takes on a finite range of values.
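The linear causal case can be sketched as a sampler. This is a minimal illustration, assuming standard normal error terms and a model listed in causal order; the function names and the example model `MODEL` (X → Y → Z with coefficients 2.0 and −1.0) are hypothetical.

```python
import random


def sample_linear_model(parents, n, seed=0):
    """Draw n samples from a linear causal model with N(0, 1) error terms.

    parents: dict mapping each variable to {parent: coefficient}. It is
    assumed to list variables in causal order (causes before effects).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = {}
        for var, coeffs in parents.items():
            error = rng.gauss(0, 1)  # independent error term for this variable
            row[var] = sum(w * row[p] for p, w in coeffs.items()) + error
        rows.append(row)
    return rows


def slope(rows, cause, effect):
    """Least-squares slope of `effect` regressed on `cause`."""
    n = len(rows)
    mx = sum(r[cause] for r in rows) / n
    my = sum(r[effect] for r in rows) / n
    sxy = sum((r[cause] - mx) * (r[effect] - my) for r in rows)
    sxx = sum((r[cause] - mx) ** 2 for r in rows)
    return sxy / sxx


# Hypothetical model: X -> Y with coefficient 2.0, Y -> Z with coefficient -1.0.
MODEL = {"X": {}, "Y": {"X": 2.0}, "Z": {"Y": -1.0}}

if __name__ == "__main__":
    rows = sample_linear_model(MODEL, 5000)
    print(round(slope(rows, "X", "Y"), 2), round(slope(rows, "Y", "Z"), 2))
```

With enough samples the regression slopes along each arrow approach the coefficients in the model, since each variable is a linear function of its parents plus an independent error.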
An Optimistic Concession • No unobserved latent confounding causes. [Figure: Genetics, Smoking, Cancer]
Causal Flipping Theorem • No matter what a consistent causal discovery procedure has seen so far, there exists a pair G, p satisfying the above assumptions so that the current sample is arbitrarily likely in p and the procedure produces arbitrarily many opposite conclusions in p about an arbitrary causal arrow in G as sample size increases. [Figure: repeated speech bubbles: “oops I meant”]
Causal Flipping Theorem • Every consistent causal inference method is covered. • Therefore, multiple instability is an intrinsic feature of the causal discovery problem. [Figure: repeated speech bubbles: “oops I meant”]
The Crooked Course "Living in the midst of ignorance and considering themselves intelligent and enlightened, the senseless people go round and round, following crooked courses, just like the blind led by the blind." Katha Upanishad, I. ii. 5.
Extremist Reaction • Since causal discovery cannot lead straight to the truth, it is not justified. [Speech bubble: “I must remain silent. Therefore, I win.”]
Moderate Reaction • Many explanations have been offered to make sense of the here-today-gone-tomorrow nature of medical wisdom — what we are advised with confidence one year is reversed the next — but the simplest one is that it is the natural rhythm of science. • (Do We Really Know What Makes us Healthy?, NY Times Magazine, Sept. 16, 2007).
Skepticism Inverted • Unavoidable retractions are justified because they are unavoidable. • Avoidable retractions are not justified because they are avoidable. • So the best possible methods for causal discovery are those that minimize causal retractions. • The best possible means for finding the truth are justified.
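Comparing methods by retractions requires a way to count them. The sketch below treats a retraction as any change from one announced conclusion to a different one. That is a simplifying assumption, since this section gives no formal definition. The function name and the example history, loosely modeled on the protein C slides, are hypothetical.

```python
def count_retractions(conclusions):
    """Count how often a method abandons its previous conclusion."""
    return sum(
        1 for prev, cur in zip(conclusions, conclusions[1:]) if prev != cur
    )


# Hypothetical sequence of policy conclusions as sample size grows.
history = ["eliminate C", "rescind", "eliminate C", "eliminate C"]

if __name__ == "__main__":
    print(count_retractions(history))
```

Repeating a conclusion costs nothing under this count; only a reversal does.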
Larger Proposal • The same holds for Ockham’s razor in general when the aim is to find the true theory.
Ockham Says: Choose the Simplest!
But Why? [Speech bubble: “Gotcha!”]
Puzzle • An indicator must be sensitive to what it indicates. [Figure label: “simple”]