
Considering Cost Asymmetry in Learning Classifiers


Presentation Transcript


  1. Considering Cost Asymmetry in Learning Classifiers, by Bach, Heckerman, and Horvitz. Presented by Chunping Wang, Machine Learning Group, Duke University, May 21, 2007

  2. Outline • Introduction • SVM with Asymmetric Cost • SVM Regularization Path (Hastie et al., 2005) • Path with Cost Asymmetry • Results • Conclusions

  3. Introduction (1) Binary classification: real-valued predictors $x \in \mathbb{R}^d$ and a binary response $y \in \{-1, +1\}$. A classifier can be defined as $\hat{y} = \mathrm{sign}(f(x))$, based on a linear decision function $f(x) = w^\top x + b$ with parameters $(w, b)$.

  4. Introduction (2) • Two types of misclassification: • false negative ($y = +1$ predicted as $-1$): cost $C_+$ • false positive ($y = -1$ predicted as $+1$): cost $C_-$. Expected cost: $C_+\,P(y = +1,\ f(x) < 0) + C_-\,P(y = -1,\ f(x) > 0)$. Written in terms of the 0-1 loss function, this is the real loss of interest, but it is non-convex and non-differentiable.
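As a concrete illustration (a minimal sketch; the helper name and the c_fn/c_fp argument names are mine, not the slides'), the empirical version of this cost simply counts the two error types:

```python
import numpy as np

def empirical_cost(y_true, y_pred, c_fn, c_fp):
    """Average asymmetric cost: c_fn per false negative, c_fp per false positive."""
    fn = np.sum((y_true == +1) & (y_pred == -1))  # missed positives
    fp = np.sum((y_true == -1) & (y_pred == +1))  # false alarms
    return (c_fn * fn + c_fp * fp) / len(y_true)
```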

  5. Introduction (3) Convex loss functions serve as surrogates for the 0-1 loss function (for training purposes).
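For concreteness, a minimal sketch (my own, not from the slides) of three common convex surrogates as functions of the margin $u = y f(x)$; each is convex and upper-bounds the 0-1 loss $\mathbf{1}[u \le 0]$:

```python
import numpy as np

# u = y * f(x); each surrogate upper-bounds the 0-1 loss 1[u <= 0]
def hinge(u):    return np.maximum(0.0, 1.0 - u)            # SVM hinge loss
def sq_hinge(u): return np.maximum(0.0, 1.0 - u) ** 2       # squared hinge
def logistic(u): return np.log1p(np.exp(-u)) / np.log(2.0)  # scaled so logistic(0) = 1

print(hinge(np.array([-1.0, 0.0, 2.0])))  # [2. 1. 0.]
```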

  6. Introduction (4) Empirical cost: given $n$ labeled data points $(x_i, y_i)$, replace the expectation by the sample average. Objective function = asymmetry-weighted empirical surrogate loss + regularization. Since convex surrogates of the 0-1 loss function are used for training, the cost asymmetries for training and testing are mismatched. Motivation: efficiently explore many training asymmetries even when the testing asymmetry is given.
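A plausible reconstruction of the objective in LaTeX (the slide's own equation was not preserved; the symbols $C_+$, $C_-$ and the surrogate $\phi$ are assumed):

```latex
\min_{w,\,b}\;\;
  C_+ \sum_{i:\,y_i=+1} \phi\bigl(y_i(w^\top x_i + b)\bigr)
+ C_- \sum_{i:\,y_i=-1} \phi\bigl(y_i(w^\top x_i + b)\bigr)
+ \tfrac{1}{2}\lVert w \rVert^2
```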

  7. SVM with Asymmetric Cost (1) With the hinge loss $\phi(u) = (1-u)_+$, the SVM with asymmetric cost is $\min_{w,b,\xi}\ \tfrac{1}{2}\lVert w\rVert^2 + C_+\sum_{i:\,y_i=+1}\xi_i + C_-\sum_{i:\,y_i=-1}\xi_i$, where $y_i(w^\top x_i + b) \ge 1 - \xi_i$ and $\xi_i \ge 0$ for all $i$.
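In practice, an asymmetric-cost SVM of this form can be fit with standard tools; a minimal sketch using scikit-learn, whose class_weight option rescales C per class (the dataset and the 2:1 asymmetry are arbitrary examples):

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)  # labels in {0, 1}
# Effective per-class costs: C_- = C * weight[0], C_+ = C * weight[1]
clf = SVC(kernel="linear", C=1.0, class_weight={0: 1.0, 1: 2.0})
clf.fit(X, y)
print(clf.score(X, y))
```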

  8. SVM with Asymmetric Cost (2) The Lagrangian, with dual variables $\alpha_i \ge 0$ and $\mu_i \ge 0$ for the two sets of constraints, yields the Karush-Kuhn-Tucker (KKT) conditions; both are reconstructed below.
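A standard reconstruction (the slide's own equations were not preserved; $C_{y_i}$ denotes $C_+$ or $C_-$ according to the label of point $i$):

```latex
\mathcal{L} = \tfrac{1}{2}\lVert w\rVert^2 + \sum_i C_{y_i}\xi_i
  - \sum_i \alpha_i\bigl[y_i(w^\top x_i + b) - 1 + \xi_i\bigr]
  - \sum_i \mu_i\xi_i, \qquad \alpha_i,\ \mu_i \ge 0.

\text{KKT (stationarity):}\quad
w = \sum_i \alpha_i y_i x_i, \qquad
\sum_i \alpha_i y_i = 0, \qquad
\alpha_i + \mu_i = C_{y_i},

\text{plus complementary slackness: }
\alpha_i\bigl[y_i(w^\top x_i + b) - 1 + \xi_i\bigr] = 0, \qquad \mu_i\xi_i = 0.
```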

  9. SVM with Asymmetric Cost (3) The dual problem (standard form below) is a quadratic optimization problem for one given cost structure $(C_+, C_-)$. Solving it from scratch over the whole space of cost structures would be computationally intractable. Following the SVM regularization path algorithm (Hastie et al., 2005), the authors work with equations (1)-(3) and the KKT conditions instead of re-solving the dual problem.
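The standard form of that dual (a reconstruction consistent with the KKT conditions above):

```latex
\max_{\alpha}\;\; \sum_i \alpha_i
  - \tfrac{1}{2}\sum_{i,j}\alpha_i\alpha_j\, y_i y_j\, x_i^\top x_j
\qquad \text{s.t.}\quad 0 \le \alpha_i \le C_{y_i}, \qquad \sum_i \alpha_i y_i = 0.
```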

  10. SVM Regularization Path (1) • Define active sets of data points: • Margin: $M = \{i : y_i f(x_i) = 1\}$ • Left of margin: $L = \{i : y_i f(x_i) < 1\}$ • Right of margin: $R = \{i : y_i f(x_i) > 1\}$. The KKT conditions tie each set to the multipliers: $\alpha_i \in [0, C]$ on $M$, $\alpha_i = C$ on $L$, and $\alpha_i = 0$ on $R$. In the SVM regularization path of Hastie et al. the cost is symmetric, and thus the search is along a single axis (the $\lambda$ axis); a sketch of the set assignment follows.
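A minimal sketch of that partition (the function name and the numeric tolerance are mine):

```python
import numpy as np

def active_sets(y, f, tol=1e-8):
    """Partition indices by the signed margin y_i * f(x_i) (names M/L/R as on the slide)."""
    m = y * f                                  # signed margin of each point
    M = np.where(np.abs(m - 1.0) <= tol)[0]    # exactly on the margin
    L = np.where(m < 1.0 - tol)[0]             # left of the margin (in violation)
    R = np.where(m > 1.0 + tol)[0]             # right of the margin (inactive)
    return M, L, R
```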

  11. SVM Regularization Path (2) Initialization ($n_+ = n_-$): for $\lambda$ sufficiently large (equivalently, $C$ very small), all the points are in $L$ with $\alpha_i = C$. Decrease $\lambda$: the multipliers remain at the bound until one or more positive and negative examples hit the margin simultaneously.

  12. SVM Regularization Path (3) Initialization (continued): define the critical value of $\lambda$ at which the first two points, one positive and one negative, hit the margin. For $n_+ \neq n_-$, this initial condition stays the same except for the definition of that critical value.

  13. SVM Regularization Path (4) • The path: as $\lambda$ decreases, the solution changes only for points in $M$, until one of the following events happens: • a point from $L$ or $R$ enters $M$; • a point in $M$ leaves the set to join either $R$ or $L$. Between events we need consider only the points on the margin, where $y_i f(x_i) = 1$ determines the $\alpha_i$ for $i \in M$ as a function of $\lambda$. Therefore the $\alpha_i$ for points on the margin proceed linearly in $\lambda$, and the decision function changes in a piecewise-inverse manner in $\lambda$.
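Why the multipliers move linearly (a reconstruction in the $C$-parameterization, with $\alpha_j = C_{y_j}$ frozen on $L$, $\alpha_j = 0$ on $R$, and kernel entries $K_{ij} = x_i^\top x_j$): the margin equalities plus the equality constraint form a linear system in $(\alpha_M, b)$,

```latex
\sum_{j\in M} \alpha_j\, y_i y_j K_{ij} + y_i b
  = 1 - \sum_{j\in L} C_{y_j}\, y_i y_j K_{ij}, \quad i \in M,
\qquad
\sum_{j\in M} \alpha_j y_j = -\sum_{j\in L} C_{y_j}\, y_j .
```

Since the right-hand side is affine in $(C_+, C_-)$, the solution $(\alpha_M, b)$ is affine along any ray in that plane, which is exactly the piecewise-linear behavior of the path between events.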


  15. SVM Regularization Path (5) • Update the regularization level • Update the active sets and solutions • Stopping condition: • In the separable case, we terminate when $L$ becomes empty; • In the non-separable case, we terminate when no admissible next event remains among all the possible events.
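The exact event bookkeeping above is intricate; as a brute-force stand-in (not the authors' algorithm), one can watch the active sets evolve on a grid of C values:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y01 = make_classification(n_samples=100, random_state=0)
y = 2 * y01 - 1                                   # map labels {0,1} -> {-1,+1}
for C in np.logspace(-3, 2, 6):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    m = y * clf.decision_function(X)              # signed margins y_i * f(x_i)
    print(f"C={C:8.3f}  |L|={np.sum(m < 1 - 1e-6):3d}  support vectors={len(clf.support_)}")
```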

  16. Path with Cost Asymmetry (1) Exploration in the 2-d space of cost asymmetries $(C_+, C_-)$. Path initialization: start at situations where all points are in $L$. Follow the updating procedure of the 1-d case along a line in the $(C_+, C_-)$ plane: the regularization level changes while the cost asymmetry is fixed. Among all the classifiers obtained, find the best one given the user's cost function. (Figure: paths starting from different initial cost asymmetries.)
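A brute-force counterpart of this 2-d exploration (again a sketch, not the exact path algorithm: $\gamma$ parameterizes the asymmetry, and class_weight realizes $C_- = 2(1-\gamma)C$, $C_+ = 2\gamma C$):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, weights=[0.7], random_state=0)
classifiers = []
for gamma in np.linspace(0.1, 0.9, 9):           # training cost asymmetry
    for C in np.logspace(-2, 2, 10):             # regularization along the ray
        w = {0: 2 * (1 - gamma), 1: 2 * gamma}   # per-class cost multipliers
        clf = SVC(kernel="linear", C=C, class_weight=w).fit(X, y)
        classifiers.append((gamma, C, clf))
```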

  17. Path with Cost Asymmetry (2) Producing ROC curves: collecting $R$ lines, each in the direction of a fixed training cost asymmetry, we can build three ROC curves (used by the three selection methods on the next slide).
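Continuing the grid sketch from the previous slide, one ROC point per trained classifier; the curve is the upper-left envelope of these points (labels assumed in {0, 1}):

```python
import numpy as np

def roc_points(classifiers, X, y):
    """One (FPR, TPR) point per (gamma, C, clf) triple; the ROC is their upper-left envelope."""
    pts = []
    for gamma, C, clf in classifiers:
        pred = clf.predict(X)
        tpr = np.mean(pred[y == 1] == 1)   # true positive rate
        fpr = np.mean(pred[y == 0] == 1)   # false positive rate
        pts.append((fpr, tpr))
    return sorted(pts)
```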

  18. Results (1) • For 1000 testing asymmetries, three methods are compared: • "one" – take the testing asymmetry itself as the training cost asymmetry; • "int" – vary the intercept of "one" and build an ROC, then select the optimal classifier; • "all" – select the optimal classifier from the ROC obtained by varying both the training asymmetry and the intercept. • Use nested cross-validation (sketched below): • The outer cross-validation produces overall accuracy estimates for the classifier; • The inner cross-validation selects the optimal classifier parameters (training asymmetry and/or intercept).
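A minimal sketch of that nested scheme with scikit-learn (selection here is by plain accuracy for brevity, whereas the slides select by the user's cost function; the grids are arbitrary examples):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)
# Inner CV picks the training asymmetry (class_weight) and C; outer CV scores the whole procedure.
grid = {"C": np.logspace(-2, 2, 5),
        "class_weight": [{0: 2 * (1 - g), 1: 2 * g} for g in (0.25, 0.5, 0.75)]}
inner = GridSearchCV(SVC(kernel="linear"), grid, cv=3)
print(cross_val_score(inner, X, y, cv=5).mean())
```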

  19. Results (2)

  20. Conclusions • An efficient algorithm is presented for building ROC curves by varying the training cost asymmetries of SVMs. • The main contribution is generalizing the SVM regularization path (Hastie et al., 2005) from a 1-d axis to a 2-d plane. • Because a convex surrogate of the 0-1 loss is used, training with the testing asymmetry leads to a suboptimal classifier. • Results show the advantage of considering many training asymmetries.
