Results of the Causality Challenge
Isabelle Guyon, Clopinet
Constantin Aliferis and Alexander Statnikov, Vanderbilt Univ.
André Elisseeff and Jean-Philippe Pellet, IBM Zürich
Gregory F. Cooper, University of Pittsburgh
Peter Spirtes, Carnegie Mellon
clopinet.com/causality

Causal discovery
What affects…
…your health?
…climate changes?
…the economy?
Which actions will have beneficial effects?

Systemic causality
(Figure labels: The system; External agent.)

Feature Selection
(Figure labels: X, Y.)
Predict Y from features X1, X2, …
Select most predictive features.

Causation
(Figure labels: X, Y.)
Predict the consequences of actions:
Under “manipulations” by an external agent, some features are no longer predictive.
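
The contrast between the two slides above can be illustrated with a small simulation. This is only a sketch under stated assumptions: it borrows three variable names from the toy example later in the deck (Smoking, Lung Cancer, Coughing), assumes the chain Smoking → Lung Cancer → Coughing, and uses made-up probabilities that do not come from the challenge data. It shows that an effect of the target (Coughing) is predictive in natural data but stops being predictive once an external agent sets its value, while a cause (Smoking) remains predictive.

```python
import random


def correlation(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def sample(n, manipulate_coughing, rng):
    """Draw n rows (smoking, cancer, coughing); all probabilities are illustrative."""
    rows = []
    for _ in range(n):
        smoking = rng.random() < 0.3
        cancer = rng.random() < (0.6 if smoking else 0.1)
        if manipulate_coughing:
            # External agent sets Coughing, ignoring Lung Cancer.
            coughing = rng.random() < 0.5
        else:
            # Natural mechanism: Coughing depends on Lung Cancer.
            coughing = rng.random() < (0.8 if cancer else 0.2)
        rows.append((int(smoking), int(cancer), int(coughing)))
    return rows


def corr_with_cancer(rows, column):
    """Correlation of one column with the target (column 1, Lung Cancer)."""
    return correlation([r[column] for r in rows], [r[1] for r in rows])


rng = random.Random(0)
natural = sample(20000, manipulate_coughing=False, rng=rng)
manipulated = sample(20000, manipulate_coughing=True, rng=rng)

natural_cough = corr_with_cancer(natural, 2)
manip_cough = corr_with_cancer(manipulated, 2)
natural_smoke = corr_with_cancer(natural, 0)
manip_smoke = corr_with_cancer(manipulated, 0)

print(f"Coughing vs cancer: natural {natural_cough:.2f}, manipulated {manip_cough:.2f}")
print(f"Smoking  vs cancer: natural {natural_smoke:.2f}, manipulated {manip_smoke:.2f}")
```

A feature selector trained only on the natural data would likely keep Coughing; under this manipulation only Smoking still carries information about the target.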

Challenge Design

Available data
• A lot of “observational” data. Correlation ≠ Causality!
• Experiments are often needed, but:
  • Costly
  • Unethical
  • Infeasible
• This challenge, semi-artificial data:
  • Re-simulated data
  • Real data with artificial “probes”

Challenge datasets
Toy datasets
Four tasks

On-line feed-back

Difficulties
• Violated assumptions:
  • Causal sufficiency
  • Markov equivalence
  • Faithfulness
  • Linearity
  • “Gaussianity”
• Overfitting (statistical complexity):
  • Finite sample size
• Algorithm efficiency (computational complexity):
  • Thousands of variables
  • Tens of thousands of examples

Evaluation
• Fulfillment of an objective
  • Prediction of a target variable
  • Predictions under manipulations
• Causal relationships:
  • Existence
  • Strength
  • Degree

Setting
• Predict a target variable (on training and test data).
• Return the set of features used.
• Flexibility:
  • Sorted or unsorted list of features
  • Single prediction or table of results
• Complete entry = xxx0, xxx1, xxx2 results (for at least one dataset).

Metrics
• Results ranked according to the test set target prediction performance “Tscore”:
• We also directly assess the feature set with an “Fscore”, not used for ranking.

Toy Examples

Causality assessment with manipulations
LUCAS0: natural
(Figure: causal graph over Anxiety, Peer Pressure, Born an Even Day, Yellow Fingers, Smoking, Genetics, Allergy, Lung Cancer, Attention Disorder, Coughing, Fatigue, Car Accident.)

Causality assessment with manipulations
LUCAS1: manipulated
(Figure: causal graph over Anxiety, Peer Pressure, Born an Even Day, Yellow Fingers, Smoking, Genetics, Allergy, Lung Cancer, Attention Disorder, Coughing, Fatigue, Car Accident.)

Causality assessment with manipulations
LUCAS2: manipulated
(Figure: causal graph over Anxiety, Peer Pressure, Born an Even Day, Yellow Fingers, Smoking, Genetics, Allergy, Lung Cancer, Attention Disorder, Coughing, Fatigue, Car Accident.)

Goal driven causality
• We define:
  • V = variables of interest
  • (e.g. MB, direct causes, ...)
• Participants return: S = selected subset, e.g. 11 4 1 2 3 (ordered or not).
• We assess causal relevance: Fscore = f(V, S).
(Figure: graph with nodes numbered 0 to 11.)
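
The slide leaves the function f unspecified. As a rough illustration only, the sketch below uses one hypothetical choice of f for an unordered subset: the harmonic mean of precision and recall of S with respect to V. The function name `set_overlap_score` and the set V are invented for this example; S is the subset that appears on the slide. This is not claimed to be the challenge's actual Fscore.

```python
def set_overlap_score(v, s):
    """Hypothetical f(V, S): harmonic mean of precision and recall of S w.r.t. V."""
    v, s = set(v), set(s)
    hits = len(v & s)
    precision = hits / len(s) if s else 0.0  # fraction of S that is in V
    recall = hits / len(v) if v else 0.0     # fraction of V recovered by S
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


V = {1, 2, 3, 10}      # hypothetical variables of interest (not from the slides)
S = [11, 4, 1, 2, 3]   # selected subset as it appears on the slide

score = set_overlap_score(V, S)
print(f"overlap score = {score:.3f}")
```

A score of this kind ignores the order of S; a ranked list would call for a rank-aware measure instead.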

Causality assessment without manipulation?

Using artificial “probes”
LUCAP0: natural
(Figure: causal graph over Anxiety, Peer Pressure, Born an Even Day, Yellow Fingers, Smoking, Genetics, Allergy, Lung Cancer, Attention Disorder, Coughing, Fatigue, Car Accident, plus probes P1, P2, P3, …, PT.)

Using artificial “probes”
LUCAP1&2: manipulated
(Figure: causal graph over Anxiety, Peer Pressure, Born an Even Day, Yellow Fingers, Smoking, Genetics, Allergy, Lung Cancer, Attention Disorder, Coughing, Fatigue, Car Accident, plus probes P1, P2, P3, …, PT.)

Scoring using “probes”
• What we can compute (Fscore):
  • Negative class = probes (here, all “non-causes”, all manipulated).
  • Positive class = other variables (may include causes and non-causes).
• What we want (Rscore):
  • Positive class = causes.
  • Negative class = non-causes.
• What we get (asymptotically):
  Fscore = (NTruePos/NReal) Rscore + 0.5 (NTrueNeg/NReal)
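
The asymptotic relation above can be checked numerically. The sketch below makes several assumptions that are not in the slides: both scores are treated as areas under the ROC curve, each variable gets a real-valued relevance score, causes are drawn from a Gaussian shifted by 1, and probes are drawn from the same distribution as the real non-causes. Under those assumptions a real non-cause outranks a probe half the time, which is where the 0.5 term comes from.

```python
import random


def auc(pos, neg):
    """Probability that a random positive outscores a random negative (ties count 1/2)."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
n_true_pos, n_true_neg, n_probes = 300, 700, 1000  # illustrative sizes
n_real = n_true_pos + n_true_neg

# Relevance scores assigned by some hypothetical feature-ranking method.
causes = [rng.gauss(1.0, 1.0) for _ in range(n_true_pos)]
non_causes = [rng.gauss(0.0, 1.0) for _ in range(n_true_neg)]
probes = [rng.gauss(0.0, 1.0) for _ in range(n_probes)]  # same law as non-causes

# What we can compute: real variables (positive) versus probes (negative).
fscore = auc(causes + non_causes, probes)
# What we want: causes (positive) versus non-causes (negative).
rscore = auc(causes, non_causes)
# Right-hand side of the relation on the slide.
predicted = (n_true_pos / n_real) * rscore + 0.5 * (n_true_neg / n_real)

print(f"Fscore = {fscore:.3f}, predicted from Rscore = {predicted:.3f}")
```

With finite samples the two numbers agree only approximately; the gap should shrink as the number of variables and probes grows.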

Results

Challenge statistics
• Start: December 15, 2007.
• End: April 30, 2008.
• Total duration: 20 weeks.
• Last (complete) entry ranked:
(Figure: number of ranked entrants and number of ranked submissions.)

Learning curves
(Figure: four panels, REGED, SIDO, MARTI, CINA. Each plots Tscore, from 0.3 to 1, against days into the challenge, from 0 to 140, with legend entries 0, 1, 2.)

AUC distribution

REGED

SIDO

CINA

MARTI

Pairwise comparisons

Top ranking methods
• According to the rules of the challenge:
  • Yin Wen Chang: SVM => best prediction accuracy on REGED and CINA. Prize: $400 donated by Microsoft.
  • Gavin Cawley: Causal explorer + linear ridge regression ensembles => best prediction accuracy on SIDO and MARTI. Prize: $400 donated by Microsoft.
• According to pairwise comparisons:
  • Jianxin Yin and Prof. Zhi Geng’s group: Partial Orientation and Local Structural Learning => best on Pareto front, new original causal discovery algorithm. Prize: free WCCI 2008 registration.

Pairwise comparisons
(Figure: four panels, REGED, SIDO, MARTI, CINA.)

Conclusion
• We have found a good correlation between causation and prediction under manipulations.
• Several algorithms have demonstrated effectiveness at discovering causal relationships.
• We still need to investigate what makes them fail in some cases.
• We need to capitalize on the power of classical feature selection methods.