KDD-Cup 2004
Chairs: Rich Caruana & Thorsten Joachims
Web Master++: Lars Backstrom
Cornell University
KDD-Cup Tasks
• Goal: Optimize learning for different performance metrics
• Task 1: Particle Physics
  • Accuracy
  • Cross-Entropy
  • ROC Area
  • SLAC Q-Score
• Task 2: Protein Matching
  • Squared Error
  • Average Precision
  • Top 1
  • Rank of Last
Competition Participation
• Timeline
  • April 28: tasks and datasets available
  • July 14: submission of predictions
• Participation
  • 500+ registrants/downloads
  • 102 teams submitted predictions
    • Physics: 65 submissions
    • Protein: 59 submissions
    • Both: 22 groups
• Demographics
  • Registrations from 49 countries (including .com)
  • Winners from China, Germany, India, New Zealand, USA
  • Winners half from companies, half from universities
Task 1: Particle Physics
• Data contributed by Charles Young et al., SLAC (Stanford Linear Accelerator)
• Binary classification: distinguishing B from B-Bar particles
• Balanced: 50-50 B/B-Bar
• 78 features (most real-valued) describing the track
• Some missing values
• Train: 50,000 cases
• Test: 100,000 cases
Task 1: Particle Physics Metrics
• 4 performance metrics:
  • Accuracy: participants had to specify a classification threshold
  • Cross-Entropy: requires probabilistic predictions
  • ROC Area: only the ordering of predictions matters
  • SLAC Q-Score: domain-specific performance metric from SLAC
• Participants could submit separate predictions for each metric
  • About half of the participants submitted different predictions for different metrics
  • The winner submitted four sets of predictions, one per metric
• Performance calculated with the PERF software we provided to participants (sketch below)
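PERF itself is the organizers' evaluation program; as a rough illustration, a minimal Python sketch of the three generic metrics might look like the following (the SLAC Q-Score is domain-specific and omitted; the 0.5 default threshold and all function names are assumptions of this sketch, not part of PERF):

```python
import numpy as np
from scipy.stats import rankdata

def accuracy(y, p, threshold=0.5):
    """Fraction correct after thresholding scores
    (participants chose their own threshold; 0.5 here is illustrative)."""
    y, p = np.asarray(y), np.asarray(p)
    return np.mean((p >= threshold) == (y == 1))

def cross_entropy(y, p, eps=1e-12):
    """Mean negative log-likelihood of probabilistic predictions."""
    y, p = np.asarray(y), np.clip(np.asarray(p), eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def roc_area(y, p):
    """AUC via the rank-sum (Mann-Whitney) formula; average ranks handle ties."""
    y, p = np.asarray(y), np.asarray(p)
    r = rankdata(p)                      # average ranks over the whole test set
    n_pos = int((y == 1).sum())
    n_neg = len(y) - n_pos
    return (r[y == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```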
Determining the Winners
• For each performance metric:
  • Calculate performance using the same PERF software available to participants
  • Rank participants by performance
  • Honorable mention (HM) for the participant ranked first
• Overall winner is the participant with the best average rank across all metrics (see the sketch below)
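A minimal sketch of the average-rank rule, assuming scores are collected in a per-metric, per-team table; tie-breaking is ignored, and the dictionary layout is an assumption of this sketch:

```python
def overall_ranking(scores, higher_is_better):
    """scores: {metric: {team: score}}; higher_is_better: {metric: bool}
    (True for accuracy/ROC area, False for cross-entropy).
    Returns teams ordered by mean rank across metrics, best first."""
    teams = list(next(iter(scores.values())))
    ranks = {t: [] for t in teams}
    for metric, by_team in scores.items():
        # rank 1 = best performance on this metric
        ordered = sorted(teams, key=lambda t: by_team[t],
                         reverse=higher_is_better[metric])
        for rank, t in enumerate(ordered, start=1):
            ranks[t].append(rank)
    return sorted(teams, key=lambda t: sum(ranks[t]) / len(ranks[t]))
```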
Task 1: Physics Winners
• 1st place overall: David Vogel et al. (MEDai Inc. / University of Central Florida); HM ROC Area, HM Cross-Entropy, HM SLAC Q-Score
• 2nd place overall: Lalit Wangikar et al. (Inductis Inc.); HM Accuracy
• 3rd place overall (out of 65): Christophe Lambert (Golden Helix Inc.)
Bootstrap Analysis of Results
• How much does the selection of the winner depend on the specific test set (100k cases)?
• Algorithm (sketched below): repeat many times:
  • Take a 100k bootstrap sample (with replacement) from the test set
  • Evaluate performance on the bootstrap sample and re-rank participants
• What is each participant's probability of winning/placing?
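A minimal sketch of this bootstrap, assuming predictions are kept per team and the metric is a callable where higher is better (win_probability and the other names are illustrative; this is not the analysis code used for the slides):

```python
import numpy as np

def win_probability(y, preds, metric, n_boot=1000, seed=0):
    """Estimate each team's chance of ranking first under resampling of the test set.
    y: true labels; preds: {team: predictions}; metric: callable(y, p) -> score,
    higher is better."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    n = len(y)
    wins = {team: 0 for team in preds}
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)          # bootstrap sample, with replacement
        yb = y[idx]
        scores = {t: metric(yb, np.asarray(p)[idx]) for t, p in preds.items()}
        wins[max(scores, key=scores.get)] += 1    # re-rank; credit the winner
    return {team: w / n_boot for team, w in wins.items()}
```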
Physics Winners: Bootstrap Analysis
• 1,000 bootstrap samples
Task 2: Protein Matching
• Data contributed by Ron Elber, Cornell University
• Task: finding homologous proteins (structural similarity)
• 74 real-valued features describing the match between two proteins
• Data comes in blocks, one block per test protein
• Unbalanced: typically < 10 homologs (positives) per block of ~1,000 candidates
• Train: 153 proteins (145,751 cases)
• Test: 150 proteins (139,658 cases)
Task 2: Protein Matching Metrics
• Four performance metrics:
  • Mean Squared Error: requires probabilistic predictions
  • Mean Average Precision: only the ordering within each block matters
  • Mean Top 1: is the best-predicted match in each block a true homolog?
  • Mean Rank of Last: how deep must you go to find all homologs?
• Again, participants submitted separate predictions for each metric
  • Again, about half of the participants submitted multiple sets of predictions
  • 19 of the top 20 participants submitted multiple sets of predictions
• Optimizing to each metric separately helped more on Protein than on Physics (see the sketch of the rank-based metrics below)
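For concreteness, here is a minimal sketch of the three rank-based protein metrics, assuming each block pairs 0/1 homolog labels with real-valued match scores (higher = more likely homolog), at least one homolog per block, and no tied scores; it is an illustration, not the PERF implementation:

```python
import numpy as np

def protein_metrics(blocks):
    """blocks: list of (y, p) pairs, one per test protein;
    y: 0/1 homolog labels, p: predicted match scores."""
    ap, top1, rank_last = [], [], []
    for y, p in blocks:
        order = np.argsort(-np.asarray(p))           # best predicted match first
        y_sorted = np.asarray(y)[order]
        hits = np.flatnonzero(y_sorted) + 1          # 1-based ranks of true homologs
        precision_at_hits = np.arange(1, len(hits) + 1) / hits
        ap.append(precision_at_hits.mean())          # average precision for this block
        top1.append(float(y_sorted[0] == 1))         # did the top match hit a homolog?
        rank_last.append(hits[-1])                   # rank of the last homolog found
    # means across blocks: Mean Average Precision, Mean Top 1, Mean Rank of Last
    return np.mean(ap), np.mean(top1), np.mean(rank_last)
```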
Task 2: Protein Winners
• 1st place overall: Bernhard Pfahringer (University of Waikato)
• 2nd place overall: Yan Fu et al. (Inst. of Comp. Tech., Chinese Academy of Sciences); HM Squared Error, HM Average Precision
• 3rd place overall: David Vogel et al. (MEDai Inc. / University of Central Florida); HM Top 1
• HM Rank of Last: Katharina Morik et al. (University of Dortmund)
Protein Winners: Bootstrap Analysis
• 10,000 bootstrap samples
Does Optimizing to Each Metric Help?
• About half of the participants submitted different predictions for each metric
• Among the winners:
  • Some evidence that top performers benefit from optimizing to each metric
  • Some metrics are incompatible: e.g., optimizing for squared error (RMS) hurts average precision (APR)
How to Win KDD-Cup 2005: Collaborate
• Ensemble that averages the predictions of the best participants (sketch below)
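A minimal sketch of such an averaging ensemble, assuming the participants' prediction arrays are aligned on the same test cases; the rank-averaging variant is a hypothetical addition for scores on incomparable scales, not something shown on the slide:

```python
import numpy as np
from scipy.stats import rankdata

def mean_ensemble(preds):
    """preds: list of 1-D prediction arrays, one per participant, same test-case order."""
    return np.mean(np.stack(preds), axis=0)

def rank_mean_ensemble(preds):
    """Average of each participant's within-file ranks;
    robust when participants' scores are on different scales."""
    return np.mean([rankdata(p) for p in preds], axis=0)
```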
Closing
• Data and all results available online: http://kodiak.cs.cornell.edu/kddcup
• PERF software download: http://www.cs.cornell.edu/~caruana
• Thanks to:
  • Web Master++: Lars Backstrom (Cornell)
  • Physics Data: Charles Young (SLAC)
  • Protein Data: Ron Elber (Cornell)
  • PERF: Alex Niculescu (Cornell), Filip Radlinski (Cornell), Claire Cardie (Cornell), …
• Thanks to the participants who found bugs in the PERF software:
  • Chinese Academy of Sciences
  • University of Dortmund
• And of course, thanks to everyone who participated!