In Defense of One-Vs-All Classification Ryan Rifkin and Aldebaro Klautau Journal of Machine Learning Research, Volume 5, (December 2004), Pages: 101 – 141. Presented by Shuiwang Ji Machine Learning Lab at CSE Center for Evolutionary Functional Genomics The Biodesign Institute Part of these slides is taken from: http://www.mit.edu/~9.520/Classes/class08.html
Main thesis • “The one-against-rest scheme is extremely powerful, producing results that are often at least as accurate as other methods.” • “Experimental evidence claimed for the superiority of the proposed methods over a simple one-against-rest scheme is either improperly controlled or improperly measured.”
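To make the one-against-rest (OVA) scheme concrete, here is a minimal sketch in R, assuming the e1071 package (LIBSVM) is installed; the helpers ova_train and ova_predict are illustrative names, not part of any package, and Platt-scaled probabilities stand in for the raw decision values f_i(x) to sidestep LIBSVM's sign convention:

```r
library(e1071)  # binary SVMs via LIBSVM

# Train one binary SVM per class: class i vs. the rest.
ova_train <- function(x, y, ...) {
  models <- lapply(levels(y), function(cl) {
    yy <- factor(ifelse(y == cl, cl, "rest"))
    svm(x, yy, probability = TRUE, ...)
  })
  names(models) <- levels(y)
  models
}

# Predict the class whose machine is most confident on x.
ova_predict <- function(models, x) {
  scores <- sapply(names(models), function(cl) {
    p <- predict(models[[cl]], x, probability = TRUE)
    attr(p, "probabilities")[, cl]
  })
  factor(names(models)[max.col(scores)], levels = names(models))
}

models <- ova_train(iris[, -5], iris$Species, kernel = "radial", cost = 1)
table(predicted = ova_predict(models, iris[, -5]), truth = iris$Species)
```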
Outline • Single machine approaches; • Error correcting code approaches; • Tree-structured approaches (NOT in the paper); • Experiments.
Weston & Watkins (WW) (1998) • Binary classification: learn one function; each machine is penalized separately based on its own margin violations; • Multi-class (WW): train the machines jointly, paying a penalty based on the relative values output by the machines.
Weston & Watkins (WW) (1998) • Learn k functions f_1, …, f_k jointly. If a point x_i is in class y_i, try to make f_{y_i}(x_i) ≥ f_j(x_i) + 2 for every other class j ≠ y_i, paying a slack ξ_{ij} ≥ 0 per violated constraint (n(k-1) slacks in total):
$$\min_{w,b,\xi}\ \tfrac{1}{2}\sum_{m=1}^{k}\|w_m\|^2 + C\sum_{i=1}^{n}\sum_{j\neq y_i}\xi_{ij}\quad\text{s.t.}\quad \langle w_{y_i},x_i\rangle + b_{y_i} \ge \langle w_j,x_i\rangle + b_j + 2 - \xi_{ij},\ \ \xi_{ij}\ge 0.$$
Weston & Watkins (WW) (1998) • Too many constraints and slack variables, n(k-1) of each; • Not easy to decompose (not scalable); • Experimental setup is problematic.
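To experiment with the WW formulation directly, a minimal sketch assuming the kernlab package, whose ksvm type "kbb-svc" implements the Weston & Watkins multi-class SVM:

```r
library(kernlab)
data(iris)

# Weston & Watkins joint multi-class SVM (kernlab type "kbb-svc")
ww <- ksvm(Species ~ ., data = iris, type = "kbb-svc",
           kernel = "rbfdot", C = 1)
table(predicted = predict(ww, iris), truth = iris$Species)
```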
Crammer & Singer (2001) • Weston & Watkins: pay a separate penalty for each class j whose output f_j(x_i) violates the margin against f_{y_i}(x_i); • Crammer & Singer: penalize only the largest such violation,
$$\xi_i = \max_{j\neq y_i}\bigl(f_j(x_i) + 1 - f_{y_i}(x_i)\bigr)_+ .$$
Crammer & Singer (2001) • Number of slack variables: Weston & Watkins n(k-1); Crammer & Singer only n (one per training point).
Crammer & Singer (2001) • Fewer slacks (n, compared to n(k-1) for Weston & Watkins); • Can be decomposed (more scalable); • Many tricks for efficient training are developed and implemented; • C and R source code available: http://www.cis.upenn.edu/~crammer/code/MCSVM/MCSVM_1_0.tar.gz and the kernlab package for R (http://www.r-project.org/)
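A corresponding sketch via kernlab, whose ksvm type "spoc-svc" implements the Crammer & Singer formulation:

```r
library(kernlab)
data(iris)

# Crammer & Singer native multi-class SVM (kernlab type "spoc-svc")
cs <- ksvm(Species ~ ., data = iris, type = "spoc-svc",
           kernel = "rbfdot", C = 1)
table(predicted = predict(cs, iris), truth = iris$Species)
```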
Lee, Lin & Wahba: Analysis • Like the WW formulation, this formulation is big, and no decomposition method is provided; • This is an asymptotic analysis: it requires n → ∞ and λ → 0, and no rates are provided. But asymptotically, density estimation will also allow us to recover the optimal Bayes rule.
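For reference, the Lee, Lin & Wahba multicategory SVM (a sketch from their paper, not reproduced in these slides) trains k functions under a sum-to-zero constraint:

$$\min_{f}\ \frac{1}{n}\sum_{i=1}^{n}\sum_{j\neq y_i}\Bigl(f_j(x_i) + \frac{1}{k-1}\Bigr)_+ + \lambda\sum_{j=1}^{k}\|h_j\|_{\mathcal{H}_K}^2 \quad\text{s.t.}\quad \sum_{j=1}^{k} f_j(x) = 0\ \ \forall x,$$

where f_j = h_j + b_j with h_j in the reproducing kernel Hilbert space. The sum-to-zero constraint is what drives their Fisher-consistency argument.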
Outline • Single machine approaches; • Error correcting code approaches; • Tree-structured approaches; • Experiments.
Error-Correcting Code (ECC): Dietterich & Bakiri (1995) [Figure: each class is assigned a binary codeword; the trained binary classifiers output a bit vector (e.g., 0 1 0 0 0 0 0 0 0 0) that a meta-classifier decodes to the nearest codeword. Source: Dietterich and Bakiri (1995)]
Special cases of ECC Source: http://www-cse.ucsd.edu/users/elkan/254spring01/aldebaro1.pdf
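A minimal sketch of ECC decoding, assuming a {0,1} code matrix with one row per class and one column per binary problem; the names code_matrix and hamming_decode are illustrative. One-against-rest is the special case where the code matrix is the identity:

```r
# Toy 4-class code over 6 binary problems (rows = classes, cols = classifiers).
code_matrix <- matrix(c(0, 0, 0, 1, 0, 1,
                        0, 1, 1, 0, 0, 0,
                        1, 0, 1, 0, 1, 0,
                        1, 1, 0, 1, 1, 1),
                      nrow = 4, byrow = TRUE,
                      dimnames = list(paste0("class", 1:4), NULL))

# Decode a vector of binary classifier outputs to the class whose
# codeword is nearest in Hamming distance.
hamming_decode <- function(bits, M) {
  d <- rowSums(abs(sweep(M, 2, bits)))  # Hamming distance to each codeword
  rownames(M)[which.min(d)]
}

hamming_decode(c(1, 0, 1, 0, 0, 0), code_matrix)  # "class3" (distance 1)
```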
Outline • Single machine approaches; • Error correcting code approaches; • Tree-structured approaches; • Experiments.
Large Margin Directed Acyclic Graph (DAG) • Identical to one-against-one at training time; • At test time, a DAG is used to determine which classifiers to evaluate on a given point; • Classes i and j are compared, and whichever class achieves the lower score is removed from further consideration; • Repeat N-1 times until only one class remains (see the sketch below).
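A sketch of the DAG test-time procedure in R, assuming pairwise_models is a list of trained one-vs-one classifiers keyed by class pair, each predicting one of its two classes; the helper dag_predict is an illustrative name:

```r
# Keep a list of candidate classes; repeatedly compare the first and last
# candidates with their pairwise classifier and drop the loser. After
# N - 1 comparisons a single class remains.
dag_predict <- function(x, classes, pairwise_models) {
  cand <- classes
  while (length(cand) > 1) {
    i <- cand[1]; j <- cand[length(cand)]
    key <- paste(sort(c(i, j)), collapse = "_vs_")
    winner <- as.character(predict(pairwise_models[[key]], x))
    cand <- if (winner == i) cand[-length(cand)] else cand[-1]
  }
  cand
}
```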
Large Margin DAGs for Multiclass Classification Source: Platt et al. (2000)
Margin tree (Tibshirani and Hastie 2006) • An SVM is constructed for each pair of classes to compute the pair-wise margins; • Agglomerative clustering then uses the pair-wise margins as distances to construct the hierarchy bottom-up (see the sketch below); • Three approaches: greedy, single linkage, and complete linkage.
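A rough sketch of the complete-linkage variant, assuming linear SVMs from e1071; for a linear kernel the weight vector can be recovered from the support vectors, giving margin 2/||w||, and the helper pairwise_margin is an illustrative name:

```r
library(e1071)

# Margin between two classes: train a linear SVM on just those classes;
# the margin is 2 / ||w||, with w recovered from coefficients and SVs.
pairwise_margin <- function(x, y, ci, cj) {
  keep <- y %in% c(ci, cj)
  m <- svm(x[keep, ], factor(y[keep]), kernel = "linear",
           cost = 10, scale = FALSE)
  w <- t(m$coefs) %*% m$SV
  2 / sqrt(sum(w^2))
}

x <- as.matrix(iris[, -5]); y <- iris$Species
cls <- levels(y)
D <- matrix(0, length(cls), length(cls), dimnames = list(cls, cls))
for (i in seq_along(cls)) for (j in seq_along(cls)) if (i < j)
  D[i, j] <- D[j, i] <- pairwise_margin(x, y, cls[i], cls[j])

# Classes separated by a small margin are "close", so the margins serve
# directly as distances; complete linkage merges the closest classes first.
plot(hclust(as.dist(D), method = "complete"))
```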
Margin tree Source: Tibshirani and Hastie (2006)
Outline • Single machine approaches; • Error correcting code approaches; • Tree-structured approaches; • Experiments (comparing five ECC approaches).
Observations • In nearly all cases, the results of the compared methods are very close; • In the majority of experiments, 0 lies inside the confidence interval for the performance difference, meaning the classifiers are not statistically distinguishable.
Implementations in R • e1071: one-against-one (LIBSVM) • kernlab: one-against-one, Crammer & Singer, Weston & Watkins • klaR: one-against-rest (SVMlight) • marginTree: http://www-stat.stanford.edu/~tibs/marginTree_1.00.zip
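For completeness, a quick sketch of the off-the-shelf one-against-one routes listed above; both e1071 and kernlab's "C-svc" default to pairwise (one-against-one) classification for multi-class factors:

```r
library(e1071)
library(kernlab)

# e1071 / LIBSVM: one-against-one is the built-in multi-class strategy
m1 <- svm(Species ~ ., data = iris)

# kernlab: type "C-svc" couples pairwise (one-against-one) classifiers
m2 <- ksvm(Species ~ ., data = iris, type = "C-svc")

mean(predict(m1, iris) == iris$Species)  # training accuracy
mean(predict(m2, iris) == iris$Species)
```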
Q & A Thank you!