
Extreme Re-balancing for SVMs and other classifiers

Authors: Bhavani Raskutti & Adam Kowalczyk, Telstra Corporation, Victoria, Australia. Presenter: Cui, Shuoyang, 2005/03/02.


Presentation Transcript


  1. Extreme Re-balancing for SVMs and other classifiers. Authors: Bhavani Raskutti & Adam Kowalczyk, Telstra Corporation, Victoria, Australia. Presenter: Cui, Shuoyang, 2005/03/02

  2. (Figure labels: Majority, ideal, Minority.) Imbalance places the minority-class samples farther from the true boundary than the majority-class samples. Majority-class samples dominate the penalty introduced by the soft margin.

  3. Data balancing • Up-/down-sampling • No convincing evidence on how the balanced data should be sampled. Imbalance-free algorithm design • The objective function should no longer be accuracy. Reference: Machine Learning from Imbalanced Data Sets 101, http://pages.stern.nyu.edu/~fprovost/Papers/skew.PDF

  4. In this paper: the authors explore the characteristics of two-class learning and analyse situations with supervised learning. The experiments presented later compare one-class learning with two-class learning and list different forms of imbalance compensation.

  5. Two-class discrimination: take examples from the two classes and generate a model for discriminating between them. For many machine learning algorithms, the training data should include examples from both classes.

  6. When the two classes are heavily unbalanced in the data: • design re-balancing • ignore the large pool of negative examples • learn from positive examples only

  7. Why extreme re-balancing • Extreme imbalance in very high-dimensional input spaces • Minority class consists of 1-3% of the total data • Learning sample size is much below the dimensionality of the input space • Data set has more than 10,000 features

  8. The kernel machine. The kernel machine is solved iteratively using the conjugate gradient method. Designing a kernel machine means taking a standard algorithm and massaging it so that all references to the original data vectors x appear only in dot products (xi, xj). Given a training sequence (xi, yi) of binary n-vectors and bipolar labels
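
A minimal sketch of the idea on this slide, assuming a regularised least-squares objective (the slide does not state the exact one). The data enter only through dot products collected in a Gram matrix, and the resulting linear system is solved by conjugate gradient. The function names and the constant `lam` are illustrative and do not come from the presentation.

```python
def dot(u, v):
    """Plain dot product; this is the only place raw data vectors are touched."""
    return sum(a * b for a, b in zip(u, v))


def gram_matrix(xs, kernel=dot):
    """Matrix of pairwise kernel values K[i][j] = kernel(x_i, x_j)."""
    return [[kernel(xi, xj) for xj in xs] for xi in xs]


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite matrix A."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x, with x = 0 initially
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old < tol:   # squared residual norm small enough
            break
        Ap = [dot(row, p) for row in A]
        step = rs_old / dot(p, Ap)
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def fit_kernel_machine(xs, ys, lam=1.0, kernel=dot):
    """Assumed objective: solve (K + lam * I) alpha = y (regularised least squares)."""
    K = gram_matrix(xs, kernel)
    n = len(xs)
    A = [[K[i][j] + (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
    return conjugate_gradient(A, [float(y) for y in ys])


def predict(alpha, xs, x_new, kernel=dot):
    """Decision value f(x) = sum_i alpha_i * kernel(x_i, x); its sign is the label."""
    return sum(a * kernel(xi, x_new) for a, xi in zip(alpha, xs))


# Binary feature vectors with bipolar labels, as on the slide.
xs = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [0, 1, 0]]
ys = [1, 1, -1, -1]
alpha = fit_kernel_machine(xs, ys, lam=1.0)
```

Swapping `dot` for another kernel function changes the model without touching the solver, which is the point of writing the algorithm in terms of dot products only.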

  9. Two different cases of kernel machines used here

  10. Two forms of imbalance compensation • Sample balancing • Weight balancing

  11. Sample balancing • 1:0: the 1-class learner using all of the negative examples • 1:1: the 2-class learner using all training examples • 0:1: the 1-class learner using all of the positive examples
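
The three ratios on this slide can be sketched as a filter over the training set. This is an illustrative reading only: it assumes labels are +1 (positive) and -1 (negative) and follows the slide's convention that the first number refers to the negative examples and the second to the positive examples. The function name `select_by_ratio` is hypothetical.

```python
def select_by_ratio(examples, ratio):
    """Keep training examples according to a sample-balancing ratio.

    examples: list of (x, y) pairs with y in {+1, -1}
    ratio: "1:0" (negatives only), "1:1" (all examples), "0:1" (positives only)
    """
    if ratio not in ("1:0", "1:1", "0:1"):
        raise ValueError("ratio must be '1:0', '1:1' or '0:1'")
    neg_share, pos_share = (int(part) for part in ratio.split(":"))
    kept = []
    for x, y in examples:
        if y == -1 and neg_share > 0:
            kept.append((x, y))
        elif y == 1 and pos_share > 0:
            kept.append((x, y))
    return kept


# Toy training set: 2 positives among 6 examples.
data = [([1, 0], 1), ([0, 1], -1), ([1, 1], -1),
        ([0, 0], -1), ([1, 0], 1), ([0, 1], -1)]
negatives_only = select_by_ratio(data, "1:0")   # input for a 1-class learner
both_classes = select_by_ratio(data, "1:1")     # input for a 2-class learner
positives_only = select_by_ratio(data, "0:1")   # input for a 1-class learner
```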

  12. Weight balancing: uses different values of the regularisation constant for the minority-class and majority-class data. B is a parameter called the balance factor.
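
The formula relating B to the per-class constants is not preserved in this transcript. The sketch below uses one plausible parameterisation as an assumption: with B in [-1, 1], the positive class gets (1 + B) C / (2 m+) per example and the negative class gets (1 - B) C / (2 m-), where m+ and m- are the class sizes. Under this assumption B = 0 gives both classes the same total weight, while B = +1 and B = -1 reduce to learning from one class only. The function names are hypothetical.

```python
def class_constants(labels, C=1.0, B=0.0):
    """Per-example regularisation constants for each class (assumed formula).

    labels: sequence of +1 / -1 labels
    C: overall regularisation constant
    B: balance factor in [-1, 1]
    """
    if not -1.0 <= B <= 1.0:
        raise ValueError("B must lie in [-1, 1]")
    m_pos = sum(1 for y in labels if y == 1)
    m_neg = sum(1 for y in labels if y == -1)
    if m_pos == 0 or m_neg == 0:
        raise ValueError("both classes must be present")
    c_pos = (1.0 + B) * C / (2.0 * m_pos)
    c_neg = (1.0 - B) * C / (2.0 * m_neg)
    return c_pos, c_neg


def per_example_constants(labels, C=1.0, B=0.0):
    """Expand the two class constants into one constant per training example."""
    c_pos, c_neg = class_constants(labels, C, B)
    return [c_pos if y == 1 else c_neg for y in labels]


# 1% minority class, as in the extreme-imbalance setting described earlier.
labels = [1] + [-1] * 99
c_pos_balanced, c_neg_balanced = class_constants(labels, C=1.0, B=0.0)
c_pos_only, c_neg_only = class_constants(labels, C=1.0, B=1.0)
```

With B = 0 the single positive example carries as much total weight as the 99 negatives combined, which is the kind of compensation that weight balancing aims for.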

  13. Experiments: real-world data collections • AHR-data • Reuters data

  14. AHR-data • Combined training and test data set • Each training instance is labeled “control”, “change” or “nc” • All of the information from the different files is converted to a sparse matrix containing 18330 features

  15. Reuters data • A collection of 12902 documents • Each document has been converted to a vector in a 20197-dimensional word-presence feature space

  16. AROC is used as the performance measure. AROC is the area under the ROC curve. Receiver operating characteristic (ROC) curves are used to describe and compare the performance of diagnostic technology and diagnostic algorithms.
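
As a small illustration of the measure, the area under the ROC curve equals the probability that a randomly chosen positive example is scored above a randomly chosen negative one, with ties counted as one half. The sketch below computes it by direct pair counting; the function name `aroc` and the toy scores are illustrative, not from the presentation.

```python
def aroc(scores, labels):
    """Area under the ROC curve by pairwise comparison.

    scores: classifier decision values, higher means "more positive"
    labels: matching +1 / -1 labels
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == -1]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


perfect = aroc([0.9, 0.8, 0.2, 0.1], [1, 1, -1, -1])
partial = aroc([0.9, 0.4, 0.6, 0.1], [1, 1, -1, -1])
uninformative = aroc([0.5, 0.5, 0.5, 0.5], [1, 1, -1, -1])
```

Because the measure depends only on how positives rank against negatives, it does not change with the class proportions, which makes it a more suitable measure than accuracy for heavily imbalanced data.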

  17. Experiments with Real World Data • Impact of regularisation constant • Experiment with sample balancing • Experiments with weight balancing

  18. Impact of regularisation constant

  19. Experiment with sample balancing. The AROC with 2-class learners is close to 1 for all categories, indicating that this categorization problem is easy to learn.

  20. Experiments with weight balancing 1. Test on AHR-data

  21. Experiments with weight balancing 2. Test on Reuters. The aim is to observe the performance of 1-class and 2-class SVMs when most features are removed.

  22. Characteristics of the test on Reuters • The accuracy of all classifiers is very high • As SVM models start degenerating, the drop in performance is larger for the 2-class SVM • 1-class SVM models start outperforming 2-class models • Similar trends • AROC is always greater than 0.5
