Scaling multi-class Support Vector Machines using inter-class confusion


Presentation Transcript


  1. Scaling multi-class Support Vector Machines using inter-class confusion. Authors: Shantanu Godbole, Sunita Sarawagi, Soumen Chakrabarti. Advisor: Dr. Hsu. Graduate student: Ching-wen Hong

  2. Content • 1. Motivation • 2. Objective • 3. Introduction: (1) SVM, (2) using SVM to solve multi-class problems, (3) the method presented in this paper • 4. OUR APPROACH: (1) the hierarchical approach, (2) the GraphSVM algorithm • 5. Experimental evaluation • 6. Conclusion • 7. Personal opinion

  3. Motivation • Solve multi-class classification problems efficiently.

  4. Objective • SVMs excel at two-class discriminative learning problems; their accuracy is high. • SVMs are hard to apply directly to multi-class problems because training time is long. • The naïve Bayes (NB) classifier is much faster to train than an SVM. • We propose a new technique for multi-way classification that exploits the accuracy of SVMs and the speed of NB classifiers.

  5. Introduction • 1. SVM: • Input: a training set S = {(x1, y1), …, (xN, yN)}, where each xi is a feature vector and yi ∈ {+1, −1} • Output: a classifier f(x) = w·x + b • Example: medical diagnosis • xi = (age, sex, blood, …, genome, …) • yi indicates the risk of cancer.
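A minimal sketch of this two-class setting, assuming scikit-learn (not the paper's own code); the toy feature vectors and labels below are invented for illustration.

```python
# Two-class linear SVM sketch: learn f(x) = w.x + b from a toy training set.
# Assumes scikit-learn; the data values are made up for illustration.
import numpy as np
from sklearn.svm import LinearSVC

# Training set S = {(x_i, y_i)}, x_i a feature vector, y_i in {+1, -1}.
X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0],    # label +1
              [6.0, 5.0], [7.0, 8.0], [8.0, 6.0]])   # label -1
y = np.array([1, 1, 1, -1, -1, -1])

clf = LinearSVC(C=1.0).fit(X, y)

# The learned classifier: sign(w.x + b) is the predicted label.
w, b = clf.coef_[0], clf.intercept_[0]
x_new = np.array([[2.0, 2.5]])
print("w.x + b =", (x_new @ w + b)[0], "-> class", clf.predict(x_new)[0])
```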

  6. 1.Linear SVM

  7. Linear SVM

  8. Linear SVM

  9. Linear SVM

  10. Linear SVM

  11. 2. Using SVM to solve multi-class problems • 1. The "one-vs-others" approach • For each of the N classes, we construct a one-vs-others (yes/no) SVM for that class alone. • The winning SVM is the one that says yes and whose margin is the largest among all SVMs.

  12. Using SVM to solve multi-class problems • 2. The accumulated-votes approach • Construct an SVM for every possible pair of classes. • The winning class is the one with the largest number of accumulated votes.
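Both reductions on slides 11 and 12 have standard generic implementations; a hedged sketch using scikit-learn's wrappers around a linear SVM (my choice of toolkit, not necessarily what the authors used), with a built-in toy dataset standing in for a text corpus:

```python
# One-vs-others and pairwise (accumulated-votes) multi-class SVMs via generic wrappers.
# Assumes scikit-learn; the iris data is only a stand-in for a text corpus.
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)

# "one-vs-others": N binary SVMs; the winner is the yes-sayer with the largest margin.
ovr = OneVsRestClassifier(LinearSVC()).fit(X, y)

# Pairwise: N(N-1)/2 binary SVMs; the winner has the most accumulated votes.
ovo = OneVsOneClassifier(LinearSVC()).fit(X, y)

print(ovr.predict(X[:3]), ovo.predict(X[:3]))
```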

  13. 3. The method presented in this paper • Combine the scalability of NB classifiers w.r.t. the number of classes with the accuracy of SVMs. • First stage: use a multi-class NB classifier to obtain a confusion matrix. • Second stage: use SVMs with the "one-vs-others" approach.

  14. OUR APPROACH • Confusion matrix: computed by running the NB classifier on a held-out validation dataset.
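A minimal sketch of this first stage, assuming scikit-learn and its 20-newsgroups loader; the 70/30 split mirrors the experimental setup later in the talk, but the exact preprocessing (TF-IDF, header removal) is my assumption.

```python
# First-stage sketch: train a fast NB classifier, then compute its confusion
# matrix on a held-out validation split. Assumes scikit-learn.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

data = fetch_20newsgroups(subset="train", remove=("headers", "footers", "quotes"))
X = TfidfVectorizer().fit_transform(data.data)
X_tr, X_val, y_tr, y_val = train_test_split(X, data.target,
                                            test_size=0.3, random_state=0)

nb = MultinomialNB().fit(X_tr, y_tr)
C = confusion_matrix(y_val, nb.predict(X_val))   # C[true, predicted] counts
print(C.shape)                                   # (20, 20) for 20-newsgroups
```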

  15. OUR APPROACH

  16. Hierarchical Approach • Top level (L1): a classifier (NB or SVM) discriminates amongst the top-level clusters of labels. • Second level (L2): we build multi-class SVMs within each cluster of classes.
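A rough sketch of the two-level structure, assuming scikit-learn; the mapping from labels to top-level clusters (`cluster_of`) is a hypothetical input, since how the clusters are formed is not shown on this slide.

```python
# Hierarchical classification sketch: an L1 classifier routes an example to a
# cluster of labels, then an L2 multi-class SVM discriminates within that cluster.
# Assumes scikit-learn; `cluster_of` (label -> cluster id) is a hypothetical input.
import numpy as np
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

def train_hierarchical(X, y, cluster_of):
    y = np.asarray(y)
    y_cluster = np.array([cluster_of[label] for label in y])
    l1 = MultinomialNB().fit(X, y_cluster)            # L1: NB over clusters
    l2 = {}
    for g in np.unique(y_cluster):                    # L2: one SVM per cluster
        mask = y_cluster == g
        labels = np.unique(y[mask])
        # A one-label cluster needs no L2 classifier.
        l2[g] = labels[0] if len(labels) == 1 else LinearSVC().fit(X[mask], y[mask])
    return l1, l2

def predict_hierarchical(l1, l2, x):
    g = l1.predict(x)[0]                              # route to a cluster first...
    m = l2[g]
    return m if not hasattr(m, "predict") else m.predict(x)[0]   # ...then refine
```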

  17. Evaluation of the hierarchical approach • We compare four methods: • MCNB (one-vs-others) • MCSVM (one-vs-others) • Hier-NB (L1: NB, L2: NB) • Hier-SVM (L1: NB, L2: SVM)

  18. Evaluation of the hierarchical approach

  19. Evaluation of the hierarchical approach

  20. Evaluation of the hierarchical approach • NB at L2 alone reaches 89.01% and NB at L1 reaches 93.56%, yet the combined Hier-NB achieves only 83.28%, below flat MCNB at 85.27%. • SVM at L2 with NB at L1 reaches 92.04%, yet Hier-SVM achieves only 86.12%, below flat MCSVM at 89.66%. • The main reason for the low accuracy of the hierarchical approaches is the compounding of errors at the two levels. • This led us to design a new algorithm, GraphSVM.

  21. The GraphSVM algorithm • 1. Obtain the confusion matrix from a fast multi-class NB classifier M1. • For each class i, F(i) = {classes mis-classified as class i more than a threshold t% of the time}. • In Figure 1, i = alt.atheism, t = 3%, F(alt.atheism) = {talk.religion.misc, soc.religion.christian}. • 2. Train a multi-class classifier M2(i) to distinguish among the classes {i} ∪ F(i).
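A hedged sketch of the whole GraphSVM pipeline as described on this slide, assuming scikit-learn; the row-normalisation of the confusion matrix, the thresholding detail, and all names below are my reading of the slides, not the authors' code.

```python
# GraphSVM sketch: NB's confusion matrix defines a confusion set F(i) per class;
# a small multi-class SVM M2(i) over {i} U F(i) then refines NB's prediction.
# Assumes scikit-learn and integer class labels 0..n-1 (as in fetch_20newsgroups).
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

def train_graphsvm(X_tr, y_tr, X_val, y_val, t=0.03):
    y_tr = np.asarray(y_tr)
    m1 = MultinomialNB().fit(X_tr, y_tr)                 # fast first-stage classifier M1
    C = confusion_matrix(y_val, m1.predict(X_val))       # C[true, predicted]
    rates = C / np.maximum(C.sum(axis=1, keepdims=True), 1)
    n = C.shape[0]
    m2 = {}
    for i in range(n):
        # F(i): classes mis-classified as i more than a fraction t of the time.
        F_i = [j for j in range(n) if j != i and rates[j, i] > t]
        if not F_i:
            m2[i] = None                                  # nothing is confused with i
            continue
        mask = np.isin(y_tr, [i] + F_i)
        m2[i] = LinearSVC().fit(X_tr[mask], y_tr[mask])   # second-stage SVM M2(i)
    return m1, m2

def predict_graphsvm(m1, m2, x):
    i = m1.predict(x)[0]                                  # NB's tentative class
    return i if m2[i] is None else m2[i].predict(x)[0]    # refine within {i} U F(i)
```

Under this reading, a very large t empties F(i) and GraphSVM degenerates to plain NB, while a very small t grows F(i) toward all classes and approaches a full multi-class SVM, which is where the threshold parameter discussed later comes in.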

  22. Experimental evaluation • 1. Datasets • 20-newsgroups: 18,828 articles from 20 Usenet groups. We randomly chose 70% of the documents for training and 30% for testing. • Reuters-21578: 135 classes, 8,819 training documents and 1,887 test documents.

  23. Overall comparison

  24. Scalability with number of classes

  25. Scalability with number of classes

  26. Scalability with training set size

  27. Effect of the threshold parameter

  28. Conclusion • GraphSVM is accurate and efficient on multi-class problems. • GraphSVM outperforms SVMs w.r.t. training time and memory requirements. • GraphSVM is very simple to understand and requires negligible coding, yet it is useful for dealing with very large classifiers (tens of thousands of classes and millions of instances).

  29. Personal opinion • GraphSVM may perform worse at high positive values of the threshold t. • It is nice that, within a reasonable range, the accuracy of GraphSVM is not much affected by the threshold t.
