Results of the Causality Challenge

Isabelle Guyon, Clopinet
Constantin Aliferis and Alexander Statnikov, Vanderbilt Univ.
André Elisseeff and Jean-Philippe Pellet, IBM Zürich
Gregory F. Cooper, University of Pittsburgh
Peter Spirtes, Carnegie Mellon
clopinet.com/causality

Causal discovery
What affects… your health? …climate changes? …the economy?
Which actions will have beneficial effects?

Systemic causality
[Figure: the system and an external agent.]

Feature Selection
Predict Y from features X1, X2, …
Select the most predictive features.

Causation
Predict the consequences of actions:
Under “manipulations” by an external agent, some features are no longer predictive.
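The point of this slide can be illustrated with a small simulation. The sketch below is not taken from the challenge; it assumes a hypothetical toy chain cause -> Y -> effect with Gaussian noise, and the names `sample` and `correlation` are made up for illustration. When an external agent sets the effect variable independently of Y, its correlation with Y vanishes, while the cause stays predictive.

```python
import random


def correlation(xs, ys):
    """Pearson correlation of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def sample(n, manipulate_effect, rng):
    """Draw n examples from a hypothetical chain: cause -> Y -> effect."""
    cause, target, effect = [], [], []
    for _ in range(n):
        c = rng.gauss(0, 1)
        y = c + rng.gauss(0, 1)
        if manipulate_effect:
            # The external agent sets the variable, ignoring Y.
            e = rng.gauss(0, 1)
        else:
            # Natural regime: the variable is a noisy consequence of Y.
            e = y + rng.gauss(0, 1)
        cause.append(c)
        target.append(y)
        effect.append(e)
    return cause, target, effect


rng = random.Random(0)
c0, y0, e0 = sample(5000, False, rng)  # natural distribution
c1, y1, e1 = sample(5000, True, rng)   # effect variable manipulated

natural_effect_corr = correlation(e0, y0)
manipulated_effect_corr = correlation(e1, y1)
natural_cause_corr = correlation(c0, y0)
manipulated_cause_corr = correlation(c1, y1)
```

In this toy setting a feature selector trained on natural data would favour the effect variable, yet that variable carries no information about Y once it is manipulated.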

Challenge Design

Available data
• A lot of “observational” data. Correlation ≠ Causality!
• Experiments are often needed, but:
  • Costly
  • Unethical
  • Infeasible
• This challenge, semi-artificial data:
  • Re-simulated data
  • Real data with artificial “probes”

Challenge datasets
• Toy datasets
• Four tasks

On-line feed-back

Difficulties
• Violated assumptions:
  • Causal sufficiency
  • Markov equivalence
  • Faithfulness
  • Linearity
  • “Gaussianity”
• Overfitting (statistical complexity):
  • Finite sample size
• Algorithm efficiency (computational complexity):
  • Thousands of variables
  • Tens of thousands of examples

Evaluation
• Fulfillment of an objective
  • Prediction of a target variable
  • Predictions under manipulations
• Causal relationships:
  • Existence
  • Strength
  • Degree

Setting
• Predict a target variable (on training and test data).
• Return the set of features used.
• Flexibility:
  • Sorted or unsorted list of features
  • Single prediction or table of results
• Complete entry = xxx0, xxx1, xxx2 results (for at least one dataset).

Metrics
• Results ranked according to the test set target prediction performance “Tscore”.
• We also assess directly the feature set with a “Fscore”, not used for ranking.

Toy Examples

Causality assessment with manipulations
LUCAS0: natural
[Figure: causal graph over Anxiety, Peer Pressure, Born an Even Day, Yellow Fingers, Smoking, Genetics, Allergy, Lung Cancer, Attention Disorder, Coughing, Fatigue, Car Accident.]

Causality assessment with manipulations
LUCAS1: manipulated
[Figure: causal graph over Anxiety, Peer Pressure, Born an Even Day, Yellow Fingers, Smoking, Genetics, Allergy, Lung Cancer, Attention Disorder, Coughing, Fatigue, Car Accident.]

Causality assessment with manipulations
LUCAS2: manipulated
[Figure: causal graph over Anxiety, Peer Pressure, Born an Even Day, Yellow Fingers, Smoking, Genetics, Allergy, Lung Cancer, Attention Disorder, Coughing, Fatigue, Car Accident.]

Goal driven causality
• We define: V = variables of interest (e.g. MB, direct causes, ...).
• Participants return: S = selected subset (ordered or not).
• We assess causal relevance: Fscore = f(V,S).
[Figure: graph with nodes numbered 0 to 11.]
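The slide leaves f unspecified. As a rough illustration only, the sketch below assumes f is an area under the ROC curve that measures how well the returned list S separates the variables in V from the rest; the function name `fscore_sketch`, the tie handling, and the example feature indices are all hypothetical and not taken from the challenge.

```python
def fscore_sketch(all_features, selected, relevant):
    """AUC-style agreement between a selected list S and a target set V.

    Features earlier in `selected` get higher scores; features that were
    not selected all share the lowest score. Ties count for one half.
    """
    score = {f: 0 for f in all_features}
    for rank, f in enumerate(selected):
        score[f] = len(selected) - rank
    pos = [score[f] for f in all_features if f in relevant]
    neg = [score[f] for f in all_features if f not in relevant]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical example: eight candidate features, three of interest.
features = list(range(1, 9))
V = {1, 2, 3}

perfect = fscore_sketch(features, [1, 2, 3], V)  # S matches V exactly
partial = fscore_sketch(features, [4, 1, 2], V)  # one wrong feature ranked first
empty = fscore_sketch(features, [], V)           # nothing selected: all ties
```

With this choice, an unordered S can be handled by giving every selected feature the same score, which only changes the scoring loop.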

Causality assessment without manipulation?

Using artificial “probes”
LUCAP0: natural
[Figure: the LUCAS graph (Anxiety, Peer Pressure, Born an Even Day, Yellow Fingers, Smoking, Genetics, Allergy, Lung Cancer, Attention Disorder, Coughing, Fatigue, Car Accident) with added probes P1, P2, P3, PT.]

Using artificial “probes”
LUCAP1&2: manipulated
[Figure: the LUCAS graph (Anxiety, Peer Pressure, Born an Even Day, Yellow Fingers, Smoking, Genetics, Allergy, Lung Cancer, Attention Disorder, Coughing, Fatigue, Car Accident) with added probes P1, P2, P3, PT.]

Scoring using “probes”
• What we can compute (Fscore):
  • Negative class = probes (here, all “non-causes”, all manipulated).
  • Positive class = other variables (may include causes and non-causes).
• What we want (Rscore):
  • Positive class = causes.
  • Negative class = non-causes.
• What we get (asymptotically):
  Fscore = (NTruePos/NReal) Rscore + 0.5 (NTrueNeg/NReal)
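The asymptotic relation can be checked numerically on a small example. The sketch below rests on two assumptions that the slide does not spell out: both scores are areas under the ROC curve over a feature ranking, and the probes are ranked like the real non-causes (here they are exact copies of the non-cause scores). The relevance scores themselves are made up for illustration.

```python
def auc(pos_scores, neg_scores):
    """Probability that a positive outranks a negative (ties count 0.5)."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical relevance scores produced by some ranking method.
causes = [0.9, 0.8, 0.3]           # real variables that are causes
non_causes = [0.7, 0.4, 0.2, 0.1]  # real variables that are not causes
probes = non_causes * 2            # probes mimic the non-cause distribution

n_true_pos, n_true_neg = len(causes), len(non_causes)
n_real = n_true_pos + n_true_neg

# What we want: causes versus non-causes.
rscore = auc(causes, non_causes)
# What we can compute: all real variables versus probes.
fscore = auc(causes + non_causes, probes)
# Right-hand side of the relation on the slide.
predicted = (n_true_pos / n_real) * rscore + 0.5 * (n_true_neg / n_real)
```

The 0.5 term arises because real non-causes and probes are indistinguishable in the ranking, so a real non-cause outranks a probe half of the time.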

Results

Challenge statistics
• Start: December 15, 2007.
• End: April 30, 2008.
• Total duration: 20 weeks.
• Last (complete) entry ranked:
[Figures: number of ranked entrants; number of ranked submissions.]

Learning curves
[Figure: four panels (REGED, SIDO, MARTI, CINA) plotting Tscore against days into the challenge, from 0 to 140; legend entries 0, 1, 2.]

AUC distribution

REGED

SIDO

CINA

MARTI

Pairwise comparisons

Top ranking methods
• According to the rules of the challenge:
  • Yin Wen Chang: SVM => best prediction accuracy on REGED and CINA. Prize: $400 donated by Microsoft.
  • Gavin Cawley: Causal explorer + linear ridge regression ensembles => best prediction accuracy on SIDO and MARTI. Prize: $400 donated by Microsoft.
• According to pairwise comparisons:
  • Jianxin Yin and Prof. Zhi Geng’s group: Partial Orientation and Local Structural Learning => best on Pareto front, new original causal discovery algorithm. Prize: free WCCI 2008 registration.
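For readers unfamiliar with the term, a Pareto front is the set of entries that no other entry beats on every criterion at once. The slide does not state which criteria were compared; the sketch below assumes two scores that are both maximized, such as the Fscore and Tscore defined earlier. The entry names and values are hypothetical and are not challenge results.

```python
def pareto_front(entries):
    """Return the names of entries that are not dominated by any other.

    `entries` maps a name to a (fscore, tscore) pair, both to be maximized.
    An entry is dominated if another one is at least as good on both
    scores and strictly better on at least one.
    """
    front = []
    for name, (f, t) in entries.items():
        dominated = any(
            f2 >= f and t2 >= t and (f2 > f or t2 > t)
            for other, (f2, t2) in entries.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)


# Hypothetical entries, for illustration only.
entries = {
    "A": (0.90, 0.70),
    "B": (0.80, 0.80),
    "C": (0.70, 0.75),  # worse than B on both scores
    "D": (0.60, 0.95),
}
front = pareto_front(entries)
```

An entry can sit on the front without having the best value of either score, which is how a method can rank well in pairwise comparisons without topping the prediction ranking.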

Pairwise comparisons: REGED, SIDO, MARTI, CINA

Conclusion
• We have found good correlation between causation and prediction under manipulations.
• Several algorithms have demonstrated effectiveness in discovering causal relationships.
• We still need to investigate what makes them fail in some cases.
• We need to capitalize on the power of classical feature selection methods.
