ROC & AUC, LIFT
Dr. Avi Rosenfeld
Introduction to ROC curves
• ROC = Receiver Operating Characteristic
• Started in electronic signal detection theory (1940s - 1950s)
• Has become very popular in biomedical applications, particularly radiology and imaging
• Also used in data mining
False Positives / Negatives
[Figure: Confusion matrix 1 and Confusion matrix 2, each with Actual and Predicted axes; the FN and FP cells are marked.]
Precision (P) = 20 / 50 = 0.4
Recall (P) = 20 / 30 = 0.667
F-measure = 2 * 0.4 * 0.667 / 1.067 = 0.5
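The arithmetic on this slide can be checked with a short sketch. The counts TP = 20, FP = 30 and FN = 10 are an assumption inferred from the denominators 50 and 30; the function name `precision_recall_f` is hypothetical.

```python
from typing import Tuple


def precision_recall_f(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) from confusion-matrix counts."""
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that were found
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Assumed counts: 20 of the 50 predicted positives are correct,
# and 20 of the 30 actual positives were found.
p, r, f = precision_recall_f(tp=20, fp=30, fn=10)
print(round(p, 3), round(r, 3), round(f, 3))  # 0.4 0.667 0.5
```

The F-measure is the harmonic mean of precision and recall, so it sits closer to the smaller of the two values.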
Different Cost Measures
• The confusion matrix (easily generalizes to multi-class)
• Machine learning methods usually minimize FP + FN
• TPR (True Positive Rate): TP / (TP + FN) = Recall
• FPR (False Positive Rate): FP / (TN + FP) = 1 - Specificity
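A minimal sketch of the two rates, computed from the four cells of a binary confusion matrix. The counts below are illustrative assumptions (in particular TN = 40 does not come from the slides), and the function names are hypothetical.

```python
def true_positive_rate(tp: int, fn: int) -> float:
    """TPR = TP / (TP + FN), the same quantity as recall."""
    return tp / (tp + fn)


def false_positive_rate(fp: int, tn: int) -> float:
    """FPR = FP / (FP + TN), which equals 1 - specificity."""
    return fp / (fp + tn)


# Illustrative counts only; TN = 40 is an assumed value.
tp, fp, fn, tn = 20, 30, 10, 40

tpr = true_positive_rate(tp, fn)
fpr = false_positive_rate(fp, tn)
specificity = tn / (tn + fp)

print(f"TPR = {tpr:.3f}, FPR = {fpr:.3f}, 1 - specificity = {1 - specificity:.3f}")
```

Both rates are normalized by the actual class sizes (TP + FN and FP + TN), so they do not change when the ratio of positives to negatives in the data changes.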
Specific Example
[Figure: two overlapping distributions of Test Result, one for people without the disease and one for people with the disease.]
Threshold
[Figure: a threshold on the Test Result axis. Patients on one side are called “negative”; patients on the other side are called “positive”.]
Some definitions ...
• True Positives: patients with the disease who are called “positive”
• False Positives: patients without the disease who are called “positive”
• True negatives: patients without the disease who are called “negative”
• False negatives: patients with the disease who are called “negative”
Moving the Threshold: left
[Figure: the threshold shifted left on the Test Result axis, between the “-” and “+” regions of the two distributions.]
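The effect of moving the threshold can be sketched by counting the four outcomes at two different cut-offs. This assumes that a higher test result points toward disease and that results at or above the threshold are called "positive"; the data and the function name `confusion_at_threshold` are made up for illustration.

```python
from typing import Dict, List


def confusion_at_threshold(results: List[float],
                           has_disease: List[bool],
                           threshold: float) -> Dict[str, int]:
    """Count TP, FP, TN, FN when results >= threshold are called positive."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for value, sick in zip(results, has_disease):
        called_positive = value >= threshold
        if called_positive and sick:
            counts["TP"] += 1
        elif called_positive and not sick:
            counts["FP"] += 1
        elif not called_positive and sick:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


# Hypothetical test results for eight patients.
results = [1, 2, 3, 4, 5, 6, 7, 8]
has_disease = [False, False, False, True, False, True, True, True]

original = confusion_at_threshold(results, has_disease, threshold=5)
moved_left = confusion_at_threshold(results, has_disease, threshold=3)
print(original)    # {'TP': 3, 'FP': 1, 'TN': 3, 'FN': 1}
print(moved_left)  # {'TP': 4, 'FP': 2, 'TN': 2, 'FN': 0}
```

Under these assumptions, moving the threshold left trades false negatives for false positives: more sick patients are caught, and more healthy patients are flagged.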
ROC curve
[Figure: ROC curve. Y axis: True Positive Rate (Recall), 0% to 100%. X axis: False Positive Rate (1 - specificity), 0% to 100%.]
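One way to trace such a curve is to sweep the threshold from the highest score down to the lowest and record a (False Positive Rate, True Positive Rate) point at each step. The sketch below assumes labels are 1 for positive and 0 for negative and that higher scores mean "more likely positive"; the scores and the function name `roc_points` are hypothetical.

```python
from typing import List, Tuple


def roc_points(scores: List[float], labels: List[int]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) points obtained by lowering the threshold step by step."""
    pos = sum(labels)              # number of actual positives
    neg = len(labels) - pos        # number of actual negatives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]          # threshold above every score: nothing called positive
    tp = fp = 0
    i = 0
    while i < len(ranked):
        current = ranked[i][0]
        # Items with equal scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == current:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


# Made-up scores for three positives and three negatives.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
for fpr, tpr in points:
    print(f"FPR = {fpr:.2f}, TPR = {tpr:.2f}")
```

The curve always starts at (0, 0), where nothing is called positive, and ends at (1, 1), where everything is.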
Area under ROC curve (AUC)
• A general measure
• The area under the ROC graph
• 0.50 is random selection, 1.0 is perfect
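The area can be approximated with the trapezoid rule over the (False Positive Rate, True Positive Rate) points of a curve. This is a minimal sketch, assuming the points are already ordered by increasing False Positive Rate; the function name `auc_trapezoid` and the example curves are hypothetical.

```python
from typing import List, Tuple


def auc_trapezoid(points: List[Tuple[float, float]]) -> float:
    """Area under a piecewise-linear curve given as (FPR, TPR) points sorted by FPR."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0   # one trapezoid per segment
    return area


# The diagonal corresponds to random selection.
random_curve = [(0.0, 0.0), (1.0, 1.0)]
# A perfect classifier reaches TPR = 1 before any false positive appears.
perfect_curve = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

print(auc_trapezoid(random_curve))   # 0.5
print(auc_trapezoid(perfect_curve))  # 1.0
```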
AUC for ROC curves
[Figure: four ROC plots, each showing True Positive Rate (0% to 100%) against False Positive Rate (0% to 100%), with AUC = 100%, AUC = 50%, AUC = 90%, and AUC = 65%.]
Lift Charts
• X axis is sample size: (TP + FP) / N
• Y axis is TP
[Figure: lift chart comparing the Model curve with the Random baseline.]
• 40% of responses for 10% of cost: lift factor = 4
• 80% of responses for 40% of cost: lift factor = 2
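The lift factors quoted on this slide are the share of responses captured divided by the share of the sample contacted. The sketch below reproduces those two numbers and also builds a cumulative lift table from ranked scores. The Y value is expressed here as a fraction of all responders rather than a raw TP count, which is an assumption made so the ratio can be computed; the data and the function names are hypothetical.

```python
from typing import List, Tuple


def lift_factor(response_fraction: float, sample_fraction: float) -> float:
    """Share of responses captured divided by share of the sample contacted."""
    return response_fraction / sample_fraction


def lift_curve(scores: List[float], labels: List[int]) -> List[Tuple[float, float, float]]:
    """Return (sample fraction, response fraction, lift) after each ranked item."""
    ranked = [y for _, y in sorted(zip(scores, labels), key=lambda pair: -pair[0])]
    total_pos = sum(ranked)
    n = len(ranked)
    rows = []
    tp = 0
    for k, y in enumerate(ranked, start=1):
        tp += y                               # responders found in the top k
        sample_fraction = k / n               # X axis: (TP + FP) / N
        response_fraction = tp / total_pos    # TP as a share of all responders
        rows.append((sample_fraction, response_fraction,
                     response_fraction / sample_fraction))
    return rows


print(lift_factor(0.40, 0.10))  # 40% of responses for 10% of cost
print(lift_factor(0.80, 0.40))  # 80% of responses for 40% of cost

# Made-up scores for ten customers, three of whom respond.
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
rows = lift_curve(scores, labels)
for sample, response, lift in rows:
    print(f"sample = {sample:.1f}, responses = {response:.2f}, lift = {lift:.2f}")
```

When the whole sample is contacted, every responder is captured and the lift falls back to 1, which is the level of the Random baseline.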
Lift factor
[Figure: Lift Value plotted against Sample Size.]
Right-click on the model and then choose Cost / Benefit Analysis for Wood
You can change the threshold and also see the confusion matrix