Challenges in Causality • Isabelle Guyon, Clopinet • Constantin Aliferis and Alexander Statnikov, Vanderbilt Univ. • André Elisseeff and Jean-Philippe Pellet, IBM Zürich • Gregory F. Cooper, University of Pittsburgh • Peter Spirtes, Carnegie Mellon • clopinet.com/causality
Causal discovery • What affects… …your health? …climate changes? …the economy? • Which actions will have beneficial effects?
What is causality? • Many definitions: • Science • Philosophy • Law • Psychology • History • Religion • Engineering • “Cause is the effect concealed, effect is the cause revealed” (Hindu philosophy)
An engineering view…
Systemic causality • The system • External agent
Feature Selection • Predict Y from features X1, X2, … • Select the most predictive features.
Causation • Predict the consequences of actions: under “manipulations” by an external agent, some features are no longer predictive.
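The point of the Causation slide can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not material from the slides: the feature X is taken to be an effect of the target Y, the noise levels are arbitrary, and the names `pearson` and `simulate` are hypothetical helpers. In observational data X predicts Y well; once an external agent sets X independently of Y, the correlation disappears.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n, manipulate, seed=0):
    """Return corr(X, Y) when X is an effect of Y (assumed toy model).

    manipulate=False: X = Y + noise (natural, observational regime).
    manipulate=True:  an external agent sets X at random, ignoring Y.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        y = rng.gauss(0, 1)
        if manipulate:
            x = rng.gauss(0, 1)            # value imposed from outside
        else:
            x = y + 0.5 * rng.gauss(0, 1)  # value produced by the system
        xs.append(x)
        ys.append(y)
    return pearson(xs, ys)


observational = simulate(2000, manipulate=False)
manipulated = simulate(2000, manipulate=True)
print(f"corr(X, Y) observational: {observational:.2f}")
print(f"corr(X, Y) manipulated:   {manipulated:.2f}")
```

A feature selected only for its predictive power in the first regime would be useless in the second, which is why the slides distinguish feature selection from causal discovery.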
What is out there?
Available data • A lot of “observational” data. Correlation ≠ Causality! • Experiments are often needed, but: • Costly • Unethical • Infeasible
Causal discovery from “observational data” • Example algorithm: PC (Peter Spirtes and Clark Glymour, 1999). Let A, B, C ∈ X and V ⊂ X. Initialize with a fully connected un-oriented graph. • Conditional independence. Cut the connection A — B if ∃ V s.t. (A ⊥ B | V). • Colliders. In triplets A — C — B (with no edge A — B), if there is no subset V containing C s.t. A ⊥ B | V, orient edges as: A → C ← B. • Constraint-propagation. Orient edges until no change: (i) If A → B → … → C, and A — C, then A → C. (ii) If A → B — C, then B → C.
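The first two steps of PC (edge removal and collider orientation) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function name `pc_skeleton_and_colliders` and the `indep(a, b, cond)` callback are hypothetical, and the callback stands in for a statistical conditional-independence test. The constraint-propagation step is omitted.

```python
from itertools import combinations


def pc_skeleton_and_colliders(variables, indep):
    """Steps 1-2 of PC, given an independence oracle.

    indep(a, b, cond) -> bool is an assumed callback answering
    "is a independent of b given the set cond?".
    Returns (adjacency dict, set of oriented edges (tail, head)).
    """
    # Start from a fully connected un-oriented graph.
    adj = {v: set(variables) - {v} for v in variables}
    sepset = {}

    # Conditional independence: cut A - B if some V separates them.
    size = 0
    while any(len(adj[a]) - 1 >= size for a in variables):
        for a in variables:
            for b in list(adj[a]):
                others = adj[a] - {b}
                for cond in combinations(sorted(others), size):
                    if indep(a, b, frozenset(cond)):
                        adj[a].discard(b)
                        adj[b].discard(a)
                        sepset[frozenset((a, b))] = frozenset(cond)
                        break
        size += 1

    # Colliders: for A - C - B with A, B non-adjacent, orient
    # A -> C <- B when C is not in the set that separated A and B.
    arrows = set()
    for c in variables:
        for a, b in combinations(sorted(adj[c]), 2):
            if b not in adj[a] and c not in sepset[frozenset((a, b))]:
                arrows.add((a, c))
                arrows.add((b, c))
    return adj, arrows


# Toy oracle for the graph A -> C <- B: A and B are independent
# only as long as C is not conditioned on.
def collider_oracle(a, b, cond):
    return {a, b} == {"A", "B"} and "C" not in cond


adj, arrows = pc_skeleton_and_colliders(["A", "B", "C"], collider_oracle)
print(sorted(arrows))
```

With real data the oracle would be replaced by a test with finite-sample error, which is where the overfitting and efficiency issues listed on the next slide come in.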
Difficulties • Violated assumptions: • Causal sufficiency • Markov equivalence • Faithfulness • Linearity • Gaussianity • Overfitting (statistical complexity): • Finite sample size • Algorithm efficiency (computational complexity): • Thousands of variables • Tens of thousands of examples
Causality workbench
Our approach • What is the causal question? • Why should we care? • What is hard about it? • Is this solvable? • Is this a good benchmark?
Challenge datasets • Toy datasets • First datasets
On-line feedback
Our challenges • Find… • Problems • Data • Metrics • Challenge protocols • Implementation
Upcoming datasets • Healthcare • Mass spec • Marketing • Ecology • DALTON • Conceptual • ECONO • Neuroscience • Epidemiology • Psychology • TIED • Climatology • Internet • Sociology • Security
Want to contribute data? • Real data: • Non-confidential • Large number of samples • Large number of variables • Observational and experimental • Semi-artificial data: • Re-simulated • Real data + artificial variables
Performance assessment
Metrics • Fulfillment of an objective: • Future (prediction) • Past (counterfactual) • Causal relationships: • Existence • Strength • Degree
Examples of objectives • Medicine and epidemiology • Maximize life expectancy • Maximize drug efficacy • Minimize contagion • Economy and marketing • Maximize Gross National Product (GNP) • Maximize sales • Minimize churn rate
Causality assessment with manipulations • LUCAS0: natural • [Figure: causal graph over the variables Anxiety, Peer Pressure, Born an Even Day, Yellow Fingers, Smoking, Genetics, Allergy, Lung Cancer, Attention Disorder, Coughing, Fatigue, Car Accident]
Causality assessment with manipulations • LUCAS1: manipulated • [Figure: same graph as LUCAS0, with manipulated variables]
Causality assessment with manipulations • LUCAS2: manipulated • [Figure: same graph as LUCAS0, with manipulated variables]
Goal driven causality • We define: V = variables of interest (e.g. MB (Markov blanket), direct causes, ...) • Participants return: S = selected subset (ordered or not). • We assess causal relevance: R = f(V, S). • [Figure: graph with nodes numbered 0 to 11]
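The slide leaves the relevance function f unspecified. As a purely illustrative sketch, one hypothetical choice of f compares the returned subset S against the variables of interest V using precision and recall. The function name `causal_relevance` and the example set V below are assumptions, not part of the challenge definition.

```python
def causal_relevance(v, s):
    """One hypothetical f(V, S): (precision, recall) of S with respect to V.

    v: variables of interest (e.g. a Markov blanket), as any iterable.
    s: subset returned by a participant; ordering is ignored here.
    """
    v, s = set(v), set(s)
    hits = len(v & s)
    precision = hits / len(s) if s else 0.0
    recall = hits / len(v) if v else 0.0
    return precision, recall


# Assumed example: V is made up for illustration; S is a six-variable subset.
V = {1, 2, 3, 4, 11}
S = [7, 11, 4, 1, 2, 3]
precision, recall = causal_relevance(V, S)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A score that exploits the ordering of S (for example, an area under a curve over nested subsets) would be another reasonable choice; the sketch only shows the shape of the interface R = f(V, S).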
Causality assessment without manipulation?
Using artificial “probes” • LUCAP0: natural • [Figure: LUCAS graph (Anxiety, Peer Pressure, Born an Even Day, Yellow Fingers, Smoking, Genetics, Allergy, Lung Cancer, Attention Disorder, Coughing, Fatigue, Car Accident) extended with probes P1, P2, P3, …, PT]
Using artificial “probes” • LUCAP1&2: manipulated • [Figure: same graph with probes P1, P2, P3, …, PT, under manipulation]
Scoring using “probes” • What we can compute (Fscore): • Negative class = probes (here, all “non-causes”, all manipulated). • Positive class = other variables (may include causes and non-causes). • What we want (Rscore): • Positive class = causes. • Negative class = non-causes. • What we get (asymptotically): Fscore = (N_TruePos / N_Real) · Rscore + 0.5 · (N_TrueNeg / N_Real)
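The asymptotic relation between Fscore and Rscore can be checked with a short calculation. This is a direct transcription of the formula on the slide; the function names and the numbers in the example are made up for illustration, and N_Real is passed explicitly rather than derived from the other counts.

```python
def fscore_from_rscore(rscore, n_true_pos, n_true_neg, n_real):
    """Fscore = (N_TruePos / N_Real) * Rscore + 0.5 * (N_TrueNeg / N_Real)."""
    return (n_true_pos / n_real) * rscore + 0.5 * (n_true_neg / n_real)


def rscore_from_fscore(fscore, n_true_pos, n_true_neg, n_real):
    """Invert the relation above to recover Rscore from a measured Fscore."""
    return (fscore - 0.5 * n_true_neg / n_real) * n_real / n_true_pos


# Illustrative numbers (assumed): 10 real variables, of which 4 are
# true causes and 6 are true non-causes, and an Rscore of 0.9.
f = fscore_from_rscore(0.9, n_true_pos=4, n_true_neg=6, n_real=10)
r = rscore_from_fscore(f, n_true_pos=4, n_true_neg=6, n_real=10)
print(f"Fscore={f:.2f} Rscore={r:.2f}")
```

Because the relation is linear with a positive slope, ranking entries by Fscore gives the same order as ranking them by Rscore for a fixed dataset.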
Conclusion • Try our first challenge, learn, and win! • WCCI08 Workshop, Hong Kong, June 2008 • Travel grants for top ranking students. • Proceedings of JMLR. Top ranking entrants will be invited to write a paper. • Best paper award: free WCCI registration. • Prizes: P(i)=$100. P = n*sum P(i). • Your problem solved by dozens of research groups: • help us organize the next challenge!