
Analysis and Visualization of Classifier Performance: Comparison under Imprecise Class and Cost Distributions






Presentation Transcript


1. Analysis and Visualization of Classifier Performance: Comparison under Imprecise Class and Cost Distributions
F. Provost and T. Fawcett
Presented by Ramazan Bitirgen, CSL - ECE

2. Confusion Matrix
• The slide shows the standard 2x2 confusion matrix, cross-tabulating the predicted class (Y, N) against the true class (p, n):

                True p   True n
  Predicted Y     TP       FP
  Predicted N     FN       TN
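A minimal sketch of tallying these four cells in Python; the 'p'/'n' true labels and 'Y'/'N' predictions follow the paper's notation, and the example data are hypothetical.

```python
# Tally the four confusion-matrix cells from paired (true, predicted)
# labels. 'p'/'n' are true classes, 'Y'/'N' are predictions.
def confusion_matrix(y_true, y_pred):
    tp = sum(t == 'p' and y == 'Y' for t, y in zip(y_true, y_pred))
    fp = sum(t == 'n' and y == 'Y' for t, y in zip(y_true, y_pred))
    fn = sum(t == 'p' and y == 'N' for t, y in zip(y_true, y_pred))
    tn = sum(t == 'n' and y == 'N' for t, y in zip(y_true, y_pred))
    return tp, fp, fn, tn

print(confusion_matrix(['p', 'p', 'n', 'n'], ['Y', 'N', 'Y', 'N']))  # (1, 1, 1, 1)
```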

3. Introduction
• Data mining requires experiments with a wide variety of learning algorithms:
  • using different algorithm parameters
  • varying output threshold values
  • using different training regimens
• Accuracy alone is an inadequate metric because:
  • class distributions are skewed (illustrated in the sketch below)
  • misclassification (FP, FN) costs are not uniform
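A minimal sketch of the skew problem, with hypothetical counts: under a 1000:1 class skew, the trivial "always predict negative" rule scores near-perfect accuracy while detecting nothing.

```python
# Hypothetical 1000:1 skew: the trivial "always predict negative"
# classifier gets ~99.9% accuracy yet finds zero positives.
n_pos, n_neg = 10, 10_000
tp, fn = 0, n_pos      # every positive is missed
tn, fp = n_neg, 0      # every negative is correct
accuracy = (tp + tn) / (n_pos + n_neg)
print(f"accuracy = {accuracy:.4f}")    # 0.9990
print(f"TP rate  = {tp / n_pos:.2f}")  # 0.00
```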

4. Class Distributions - Problems with Accuracy
• Accuracy assumes that the class distribution among examples is constant and relatively balanced, which is rarely the case in real life.
• Classifiers are generally used to scan a large number of normal entities in order to find a small number of unusual ones:
  • looking for defrauded customers
  • checking an assembly line for defects
• Skews of 10^6 have been reported (Clearwater & Stern 1991).

5. Misclassification Costs - Problems with Accuracy
• The assumption of equal error costs rarely holds in real-life problems: disease tests, fraud detection, ...
• Instead of maximizing accuracy, we need to minimize the expected error cost:

  Cost = FP · c(Y,n) + FN · c(N,p)

  where c(Y,n) is the cost of a false positive and c(N,p) is the cost of a false negative.
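A minimal sketch of this cost formula; the counts and per-error costs below are hypothetical, chosen so that a missed fraud (false negative) costs far more than a needless investigation (false positive).

```python
# Cost = FP * c(Y,n) + FN * c(N,p), with c_fp = c(Y,n), c_fn = c(N,p).
def total_cost(fp, fn, c_fp, c_fn):
    return fp * c_fp + fn * c_fn

# 40 needless investigations at $10 each, 3 missed frauds at $500 each:
print(total_cost(fp=40, fn=3, c_fp=10.0, c_fn=500.0))  # 1900.0
```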

6. [Figure-only slide; no text captured in the transcript.]

7. ROC Plot and ROC Area
• ROC: Receiver Operating Characteristic.
• Developed in WWII to statistically model the false positive and false negative detections of radar operators.
• Becoming more popular in machine learning, and a standard measure in medicine and biology.
• On its own, however, it does a poor job of guiding the choice between classifiers.
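A minimal sketch of how ROC points are obtained from a scoring classifier: sweep a decision threshold over the scores and record one (FP rate, TP rate) pair per threshold. The scores and labels below are hypothetical.

```python
# Sweep a threshold over classifier scores; each threshold yields one
# ROC point (FP rate, TP rate). Labels: 1 = positive, 0 = negative.
def roc_points(scores, labels):
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for thresh in sorted(set(scores), reverse=True):
        tp = sum(s >= thresh and l == 1 for s, l in zip(scores, labels))
        fp = sum(s >= thresh and l == 0 for s, l in zip(scores, labels))
        points.append((fp / neg, tp / pos))
    return points

print(roc_points([0.9, 0.8, 0.6, 0.4, 0.3], [1, 1, 0, 1, 0]))
```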

8. ROC Graph of Four Classifiers
• Informally, one point in ROC space is better than another if it is to the northwest of it: a higher TP rate, a lower FP rate, or both.
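This "northwest" relation is a simple dominance check on (FP rate, TP rate) pairs; a minimal sketch:

```python
# Point a dominates point b if a is to the northwest of b: no worse on
# both coordinates and not the same point.
def dominates(a, b):  # points are (fp_rate, tp_rate)
    return a[0] <= b[0] and a[1] >= b[1] and a != b

print(dominates((0.1, 0.8), (0.3, 0.6)))  # True
print(dominates((0.1, 0.5), (0.3, 0.6)))  # False: neither dominates
```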

9. [Figure-only slide; no text captured in the transcript.]

10. Iso-performance Lines
• Expected cost of classification by the classifier at ROC point (FP, TP):

  p(p) · (1 − TP) · c(N,p) + p(n) · FP · c(Y,n)

• Therefore, two points (FP1, TP1) and (FP2, TP2) have the same performance if

  (TP2 − TP1) / (FP2 − FP1) = p(n) · c(Y,n) / (p(p) · c(N,p))

• This ratio is the slope m(iso_perf) of an iso-performance line: all classifiers corresponding to points on the same line have the same expected cost.
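A minimal sketch of the slope computation, using the scenarios from the next slides as checks:

```python
# Iso-performance slope m = p(n)*c(Y,n) / (p(p)*c(N,p)).
def iso_perf_slope(p_n, p_p, c_fp, c_fn):
    return (p_n * c_fp) / (p_p * c_fn)

# Scenario A: 10:1 skew, equal error costs -> m = 10.
print(iso_perf_slope(10, 1, c_fp=1.0, c_fn=1.0))    # 10.0
# Scenario B: 10:1 skew, c(N,p) = 100 * c(Y,n) -> m = 0.1.
print(iso_perf_slope(10, 1, c_fp=1.0, c_fn=100.0))  # 0.1
```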

11. ROC Convex Hull
• If a point is not on the upper convex hull of the ROC points, the classifier it represents cannot be optimal under any class or cost distribution.
• In this example, B and D cannot be optimal because none of their points lie on the convex hull.
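A minimal sketch of computing the hull with a monotone-chain upper hull over the classifiers' (FP rate, TP rate) points, always including the trivial classifiers (0,0) ("never alarm") and (1,1) ("always alarm"); the points are hypothetical.

```python
# Upper convex hull of ROC points (monotone chain).
def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def roc_convex_hull(points):
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # pop while the last point falls on or below the new upper edge
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull

pts = [(0.1, 0.5), (0.3, 0.7), (0.4, 0.9), (0.5, 0.75)]
print(roc_convex_hull(pts))  # [(0.0, 0.0), (0.1, 0.5), (0.4, 0.9), (1.0, 1.0)]
```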

12. How to Use the ROC Convex Hull
• Assume p(n):p(p) = 10:1.
• Scenario A: c(N,p) = c(Y,n) → m(iso_perf) = 10
• Scenario B: c(N,p) = 100 · c(Y,n) → m(iso_perf) = 0.1
• The optimal classifier is the hull point first touched by an iso-performance line of slope m(iso_perf) sweeping down from the northwest (see the sketch below).
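A minimal sketch of that selection rule: maximizing TP − m·FP over the hull vertices is equivalent to minimizing the expected cost. The hull points here are hypothetical.

```python
# Given slope m, the optimal operating point is the hull vertex
# maximizing TP - m * FP (equivalently, minimizing expected cost).
def best_on_hull(hull, m):
    return max(hull, key=lambda pt: pt[1] - m * pt[0])

hull = [(0.0, 0.0), (0.02, 0.55), (0.4, 0.95), (1.0, 1.0)]
print(best_on_hull(hull, 10.0))  # Scenario A: the conservative point (0.02, 0.55)
print(best_on_hull(hull, 0.1))   # Scenario B: the liberal point (0.4, 0.95)
```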

13. Adding New Classifiers
• Adding new classifiers may or may not extend the existing hull.
• E may be optimal under some circumstances, since it extends the hull.
• F and G cannot be optimal, since they remain inside the hull.

14. What if Distributions and Costs Are Unknown?
• With no information at all, the ROC convex hull identifies every classifier that may be optimal under some conditions.
• With complete information, the method identifies the single optimal classifier.
• What can we do in between?

15. Sensitivity Analysis
• Imprecise distribution and cost information defines a range of slopes for the iso-performance lines.
• p(n):p(p) = 10:1
• Scenario C: $5 < c(Y,n) < $10 and $500 < c(N,p) < $1000
  → 0.05 < m(iso_perf) < 0.2
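A minimal sketch of propagating these interval bounds through the slope formula; Scenario C's numbers reproduce the slide's range.

```python
# Bounds on m = p(n)*c(Y,n) / (p(p)*c(N,p)): the slope is smallest when
# c(Y,n) is at its minimum and c(N,p) at its maximum, and vice versa.
def slope_range(p_n, p_p, c_fp_lo, c_fp_hi, c_fn_lo, c_fn_hi):
    return (p_n * c_fp_lo) / (p_p * c_fn_hi), (p_n * c_fp_hi) / (p_p * c_fn_lo)

# Scenario C: 10:1 skew, $5 < c(Y,n) < $10, $500 < c(N,p) < $1000.
print(slope_range(10, 1, 5, 10, 500, 1000))  # (0.05, 0.2)
```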

16. Sensitivity Analysis - 2
• Imprecise distribution and cost information defines a range of slopes for the iso-performance lines.
• p(n):p(p) = 10:1
• Scenario D: 0.2 < m(iso_perf) < 2
• If the slope range spans several hull vertices, each of them remains potentially optimal.

17. Sensitivity Analysis - 3
• Can the "do nothing" strategy, i.e. the trivial classifier at ROC point (0, 0), be better than any of the available classifiers?

18. Conclusion
• Accuracy alone is an inadequate performance metric, for the reasons above.
• ROC plots give more accurate information about the performance of classifiers.
• The ROC convex hull method:
  • is an efficient solution to the problem of comparing multiple classifiers in imprecise environments
  • allows new classifiers to be incorporated easily
  • allows us to select the classifiers that are potentially optimal
