Scaling multi-class Support Vector Machines using inter-class confusion Authors: Shantanu Godbole, Sunita Sarawagi, Soumen Chakrabarti Advisor: Dr. Hsu Graduate: Ching-Wen Hong
Content • 1. Motivation • 2. Objective • 3. Introduction: (1) SVM (2) Using SVM to solve multi-class problems (3) The method presented in this paper • 4. Our approach: (1) Hierarchical approach (2) The GraphSVM algorithm • 5. Experimental evaluation • 6. Conclusion • 7. Personal opinion
Motivation • Scale accurate SVM classifiers to problems with many classes.
Objective • SVMs excel at two-class discriminative learning problems; their accuracy is high. • SVMs are hard to apply to multi-class problems because training time is long. • The naïve Bayes (NB) classifier is much faster to train than an SVM. • We propose a new technique for multi-way classification that exploits the accuracy of SVMs and the speed of NB classifiers.
Introduction • 1. SVM: • Input: a training set S = {(x1, y1), …, (xN, yN)}, where xi is a feature vector and yi ∈ {+1, −1}. • Output: a classifier f(x) = W·x + b. • Example: medical diagnosis • xi = (age, sex, blood, …, genome, …) • yi indicates the risk of cancer.
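A minimal sketch of the decision rule above, f(x) = W·x + b with sign(f(x)) giving the class. The weights here are made-up toy values, not learned from any real data:

```python
# Sketch of a trained linear SVM's decision rule f(x) = w.x + b.
# (w, b) would normally come from SVM training; here they are toy values.

def svm_decision(w, b, x):
    """Signed distance (margin) of x from the hyperplane (w, b)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def svm_predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x lies."""
    return 1 if svm_decision(w, b, x) >= 0 else -1

# Toy 2-D hyperplane: x1 + x2 - 1 = 0
w, b = [1.0, 1.0], -1.0
print(svm_predict(w, b, [2.0, 2.0]))  # +1
print(svm_predict(w, b, [0.0, 0.0]))  # -1
```

The signed margin `svm_decision` (not just the ±1 label) is what the multi-class schemes on the next slides compare across classifiers.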
2. Using SVM to solve multi-class problems • 1. The "one-vs-others" approach • For each of the N classes, we construct a one-vs-others (yes/no) SVM for that class alone. • The winning SVM is the one that says yes and whose margin is largest among all SVMs.
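The one-vs-others rule above amounts to an argmax over per-class margins. A sketch, with hypothetical decision functions standing in for trained per-class SVMs:

```python
def one_vs_others_predict(classifiers, x):
    """classifiers: dict mapping class name -> decision function that
    returns a signed margin. The winner is the class whose one-vs-others
    SVM reports the largest margin on x."""
    return max(classifiers, key=lambda c: classifiers[c](x))

# Hypothetical per-class decision functions for three classes
classifiers = {
    "a": lambda x: x[0] - x[1],
    "b": lambda x: x[1] - x[0],
    "c": lambda x: -abs(x[0]) - abs(x[1]),
}
print(one_vs_others_predict(classifiers, [3.0, 1.0]))  # a
```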
Using SVM to solve multi-class problems • 2. The accumulated-votes approach • Construct an SVM between every possible pair of classes. • The winning class is the one with the largest number of accumulated votes.
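The pairwise voting can be sketched as follows; each pairwise "classifier" here is a hypothetical stand-in that simply names the winner of its pair:

```python
from itertools import combinations

def one_vs_one_predict(pairwise, classes, x):
    """pairwise[(i, j)](x) returns the winner (i or j) of that pair's SVM.
    The predicted class is the one with the most accumulated votes."""
    votes = {c: 0 for c in classes}
    for i, j in combinations(classes, 2):
        votes[pairwise[(i, j)](x)] += 1
    return max(votes, key=votes.get)

classes = ["a", "b", "c"]
pairwise = {  # toy pairwise winners, not real SVMs
    ("a", "b"): lambda x: "a" if x[0] > 0 else "b",
    ("a", "c"): lambda x: "a" if x[0] > 1 else "c",
    ("b", "c"): lambda x: "b" if x[1] > 0 else "c",
}
print(one_vs_one_predict(pairwise, classes, [2.0, -1.0]))  # a
```

Note this scheme trains N(N−1)/2 SVMs, versus N for one-vs-others, which is part of the scaling problem the paper targets.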
3. The method presented in this paper • Exploit the scalability of NB classifiers w.r.t. the number of classes, and the accuracy of SVMs. • First stage: use a multi-class NB classifier to obtain a confusion matrix. • Second stage: use SVMs with the "one-vs-others" approach.
Our approach • Confusion matrix: computed by running NB on a held-out validation dataset.
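Building the confusion matrix from held-out predictions is straightforward; a sketch with illustrative labels (not from the paper's datasets):

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """M[i][j] = number of held-out documents of true class i that the
    first-stage classifier (e.g. naive Bayes) predicted as class j."""
    M = {i: {j: 0 for j in classes} for i in classes}
    for t, p in zip(true_labels, predicted_labels):
        M[t][p] += 1
    return M

# Illustrative held-out labels
true = ["atheism", "atheism", "religion", "christian"]
pred = ["atheism", "religion", "religion", "religion"]
M = confusion_matrix(true, pred, ["atheism", "religion", "christian"])
print(M["atheism"]["religion"])  # 1
```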
Hierarchical approach • Top level (L1): a classifier (NB or SVM) discriminates among top-level clusters of labels. • Second level (L2): we build multi-class SVMs within each cluster of classes.
Evaluation of the hierarchical approach • We compare four methods: • MCNB (one-vs-others) • MCSVM (one-vs-others) • Hier-NB (L1: NB, L2: NB) • Hier-SVM (L1: NB, L2: SVM)
Evaluation of the hierarchical approach • NB-L2 (89.01%) combined with NB-L1 (93.56%) gives Hier-NB only 83.28%, versus 85.27% for flat MCNB. • SVM-L2 combined with NB-L1 (92.04%) gives Hier-SVM 86.12%, versus 89.66% for flat MCSVM. • The main reason for the low accuracy of the hierarchical approaches is the compounding of errors across the two levels. • This led us to design a new algorithm, GraphSVM.
The GraphSVM algorithm • 1. Obtain the confusion matrix from a fast multi-class NB classifier M1. • For each class i, F(i) = {classes whose documents are mis-classified as class i more than a threshold t% of the time}. • In Figure 1, with i = alt.atheism and t = 3%, F(alt.atheism) = {talk.religion.misc, soc.religion.christian}. • 2. Train a multi-class SVM classifier M2(i) to distinguish among the classes {i} ∪ F(i).
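Step 1 above (extracting F(i) from the confusion matrix) can be sketched as follows; the matrix entries are invented to mirror the alt.atheism example, not the paper's actual figures:

```python
def confused_sets(M, threshold):
    """For each class i, F(i) = classes j != i whose documents are
    mis-classified as i more than `threshold` fraction of the time.
    M[j][i] counts documents of true class j predicted as class i."""
    F = {}
    for i in M:
        F[i] = set()
        for j in M:
            if j == i:
                continue
            total = sum(M[j].values())
            if total and M[j][i] / total > threshold:
                F[i].add(j)
    return F

# Toy confusion matrix over three 20-newsgroups classes (invented counts)
M = {
    "alt.atheism":            {"alt.atheism": 90, "talk.religion.misc": 5,  "soc.religion.christian": 5},
    "talk.religion.misc":     {"alt.atheism": 10, "talk.religion.misc": 85, "soc.religion.christian": 5},
    "soc.religion.christian": {"alt.atheism": 4,  "talk.religion.misc": 6,  "soc.religion.christian": 90},
}
F = confused_sets(M, 0.03)  # t = 3%
print(sorted(F["alt.atheism"]))  # ['soc.religion.christian', 'talk.religion.misc']
```

The second stage would then train one small multi-class SVM per class i over just {i} ∪ F(i), so each SVM sees only the handful of classes NB actually confuses.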
Experimental evaluation • 1. Datasets • 20-newsgroups: 18,828 articles from 20 Usenet groups. We randomly chose 70% of the documents for training and 30% for testing. • Reuters-21578: 135 classes, 8,819 training documents and 1,887 test documents.
Conclusion • GraphSVM is accurate and efficient on multi-class problems. • GraphSVM outperforms SVMs w.r.t. training time and memory requirements. • GraphSVM is very simple to understand and requires negligible coding, yet it can handle very large classification tasks (tens of thousands of classes and millions of instances).
Personal opinion • GraphSVM might perform worse at a high value of the threshold t. • It is nice that, in practice, the accuracy of GraphSVM is not much affected by the threshold t.