Cross Validation: False Positives / Negatives
Dr. Avi Rosenfeld
False Positives / Negatives
[Two confusion matrices (actual vs. predicted classes), with the FN and FP cells highlighted]
Precision (P) = 20 / 50, Recall (P) = 20 / 30
Example
Precision (A) = 50% (500/1000)
Recall = 83% (500/600)
Accuracy = 95% (10500/11100)
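To make the arithmetic explicit, here is a minimal Python sketch of the three metrics. The underlying 2x2 counts (TP = 500, FP = 500, FN = 100, TN = 10000) are inferred from the fractions on this slide and are an assumption about the original confusion matrix.

```python
# Assumed 2x2 counts, reconstructed from the slide's fractions.
tp, fp, fn, tn = 500, 500, 100, 10_000

precision = tp / (tp + fp)                    # 500 / 1000   = 0.50
recall    = tp / (tp + fn)                    # 500 / 600    ~ 0.83
accuracy  = (tp + tn) / (tp + fp + fn + tn)   # 10500 / 11100 ~ 0.95

print(precision, recall, accuracy)
# Accuracy looks excellent (95%) while precision is only 50%,
# because the negative class dominates the data.
```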
Another example, with several categories
27 animals: 8 cats, 6 dogs, and 13 rabbits
Confusion matrix (rows = actual class, columns = predicted class):

                 Predicted
Actual        Cat   Dog   Rabbit
Cat             5     3      0
Dog             2     3      1
Rabbit          0     2     11

For the cat class there are 3 false negatives (cats classified as dogs) and 2 false positives (dogs classified as cats); for the dog class there are 2 false positives from rabbits classified as dogs; and there are 0 false positives from cats for the rabbit class.
Recall (Dog) = 3/6, Precision (Dog) = 3/8
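A minimal Python sketch of how the per-class recall and precision fall out of this confusion matrix; the only inputs are the cat/dog/rabbit counts from the table above.

```python
import numpy as np

labels = ["cat", "dog", "rabbit"]
cm = np.array([[5, 3, 0],      # actual cats
               [2, 3, 1],      # actual dogs
               [0, 2, 11]])    # actual rabbits (rows = actual, columns = predicted)

for i, name in enumerate(labels):
    tp = cm[i, i]
    recall = tp / cm[i, :].sum()      # true positives / all actual members of the class
    precision = tp / cm[:, i].sum()   # true positives / everything predicted as the class
    print(f"{name}: recall = {recall:.3f}, precision = {precision:.3f}")

# dog: recall = 3/6 = 0.5, precision = 3/8 = 0.375, matching the slide
```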
Example from WEKA
=== Summary ===
Correctly Classified Instances     320    66.39 %
Incorrectly Classified Instances   162    33.61 %

=== Detailed Accuracy By Class ===
         Precision   Recall
FALSE    0.664       1
TRUE     0           0

=== Confusion Matrix ===
   a    b    <-- classified as
 320    0 |  a = FALSE   -> Precision (A) = 320/482, Recall (A) = 320/320
 162    0 |  b = TRUE    -> Precision (B) = Recall (B) = 0
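This is the output of a classifier that always predicts the majority class (FALSE). As an illustrative sketch, the code below reproduces the same matrix with scikit-learn's DummyClassifier on a label vector rebuilt from the counts (320 FALSE, 162 TRUE); the feature matrix is a placeholder assumption, since the baseline ignores it.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import confusion_matrix, precision_score, recall_score

y = np.array([0] * 320 + [1] * 162)   # 0 = FALSE, 1 = TRUE (counts taken from the matrix)
X = np.zeros((len(y), 1))             # dummy features; the majority-class baseline ignores them

clf = DummyClassifier(strategy="most_frequent").fit(X, y)
pred = clf.predict(X)                 # predicts FALSE for every instance

print(confusion_matrix(y, pred))                  # [[320   0] [162   0]]
print(precision_score(y, pred, pos_label=0))      # 320/482 ~ 0.664
print(recall_score(y, pred, pos_label=0))         # 320/320 = 1.0
```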
Another example (from the animals dataset, zoo.arff)
Correctly Classified Instances      93    92.0792 %
Incorrectly Classified Instances     8     7.9208 %

=== Detailed Accuracy By Class ===
TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area   Class
1         0         1           1        1           1          mammal
1         0         1           1        1           1          bird
0.6       0.01      0.75        0.6      0.667       0.793      reptile
1         0.011     0.929       1        0.963       0.994      fish
0.75      0         1           0.75     0.857       0.872      amphibian
0.625     0.032     0.625       0.625    0.625       0.92       insect
0.8       0.033     0.727       0.8      0.762       0.986      invertebrate

=== Confusion Matrix ===
  a  b  c  d  e  f  g   <-- classified as
 41  0  0  0  0  0  0 |  a = mammal
  0 20  0  0  0  0  0 |  b = bird
  0  0  3  1  0  1  0 |  c = reptile
  0  0  0 13  0  0  0 |  d = fish
  0  0  1  0  3  0  0 |  e = amphibian
  0  0  0  0  0  5  3 |  f = insect
  0  0  0  0  0  2  8 |  g = invertebrate

Recall (Invertebrate) = 8/10 = 0.8, Precision (Invertebrate) = 8/11 = 0.727
10-fold cross-validation (one example of K-fold cross-validation)
1. Randomly divide your data into 10 pieces, 1 through 10.
2. Treat the 1st tenth of the data as the test dataset. Fit the model to the other nine tenths of the data (which are now the training data).
3. Apply the model to the test data (e.g., for logistic regression, calculate predicted probabilities for the test observations).
4. Repeat this procedure for each of the 10 tenths of the data.
5. Calculate statistics of model accuracy and fit (e.g., ROC curves) from the test data only.
A code sketch of this procedure follows below.
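A minimal sketch of these five steps using scikit-learn. The synthetic dataset, the logistic-regression model, and the AUC summary statistic are illustrative assumptions, not part of the original slides.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=500, random_state=0)   # toy data (assumption)

kf = KFold(n_splits=10, shuffle=True, random_state=0)       # step 1: random 10-way split
test_scores = []
for train_idx, test_idx in kf.split(X):                     # steps 2-4: rotate the held-out tenth
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])                   # fit on the nine training tenths
    proba = model.predict_proba(X[test_idx])[:, 1]          # predicted probabilities on the test tenth
    test_scores.append(roc_auc_score(y[test_idx], proba))   # step 5: fit statistics on test data only

print("Mean test AUC over 10 folds:", np.mean(test_scores))
```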