Learn how to generate ROC curves from cross-validation, compare two learning schemes with ROC curves, and use the convex hull to reach any operating point between them. Review classifier evaluation metrics such as accuracy, precision, recall, sensitivity, and specificity, and see how numeric prediction is evaluated with error measures such as mean-squared error, relative errors, and the correlation coefficient.
Data Mining, CSCI 307, Spring 2019
Lecture 34: More on ROC Curves; Evaluating Numeric Prediction
Cross-Validation and ROC Curves
• A simple method of getting an ROC curve using cross-validation:
  • Collect probabilities for instances in the test folds
  • Sort instances according to these probabilities
• This method is implemented in WEKA
• However, this is just one possibility; another is to generate an ROC curve for each fold and average them (see the sketch below)
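As a minimal sketch of the first approach, the snippet below pools predicted probabilities across cross-validation test folds and then ranks them to trace out a single ROC curve. It assumes scikit-learn for illustration; the lecture itself refers to WEKA's implementation.

```python
# Sketch: one ROC curve from pooled cross-validation probabilities.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import roc_curve, auc

X, y = make_classification(n_samples=500, random_state=0)

# Collect class probabilities for every instance while it sits in a test fold.
probs = cross_val_predict(LogisticRegression(max_iter=1000), X, y,
                          cv=10, method="predict_proba")[:, 1]

# roc_curve sorts instances by probability and sweeps the threshold,
# producing the FP-rate / TP-rate points of the pooled curve.
fpr, tpr, _ = roc_curve(y, probs)
print("Pooled-CV AUC: %.3f" % auc(fpr, tpr))
```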
ROC Curves for Two Learning Schemes
• For a small, focused sample, use Method A
• For a larger one, use Method B
• In between, choose between A and B with appropriate probabilities
The Convex Hull
Given two learning schemes, we can achieve any point on the convex hull of their ROC points:
• TP and FP rates for scheme 1: t1 and f1
• TP and FP rates for scheme 2: t2 and f2
• If scheme 1 is used to predict 100 x q % of the cases and scheme 2 for the rest, then:
  • TP rate for the combined scheme: q x t1 + (1 - q) x t2
  • FP rate for the combined scheme: q x f1 + (1 - q) x f2
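As a quick illustration (not from the slides), here is the interpolation written out as code; t1, f1, t2, f2, and q are the symbols defined above.

```python
# Sketch: achieving an intermediate operating point between two schemes.
# Scheme 1 handles a random fraction q of the cases, scheme 2 the rest.
def combined_operating_point(t1, f1, t2, f2, q):
    """Return (TP rate, FP rate) of the randomized combination."""
    tp_rate = q * t1 + (1 - q) * t2
    fp_rate = q * f1 + (1 - q) * f2
    return tp_rate, fp_rate

# Example: halfway between scheme 1 at (TP=0.6, FP=0.1) and scheme 2 at (TP=0.9, FP=0.4).
print(combined_operating_point(0.6, 0.1, 0.9, 0.4, q=0.5))  # -> approximately (0.75, 0.25)
```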
Summary: Classifier Evaluation Metrics
• Classifier accuracy (aka recognition rate): percentage of test set instances correctly classified
  Accuracy = (TP + TN)/All
• Precision (exactness): what % of instances that the classifier labeled as positive are actually positive?
  Precision = TP/(TP + FP)
• Recall (completeness): what % of positive instances did the classifier label as positive?
  Recall = TP/(TP + FN)
• Sensitivity: true positive recognition rate
  Sensitivity = TP/P, where P = TP + FN
• Specificity: true negative recognition rate
  Specificity = TN/N, where N = TN + FP
• Note that Recall == Sensitivity
• Worked example (confusion matrix with TP = 90, FP = 140, FN = 210):
  Precision = 90/230 = 39.13%, Recall = 90/300 = 30.00%
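The same metrics can be computed directly from the confusion-matrix counts. The sketch below reproduces the worked example; TP, FP, and FN come from the slide, while TN is a hypothetical value added only so that accuracy and specificity can be demonstrated.

```python
# Sketch: classifier evaluation metrics from confusion-matrix counts.
TP, FP, FN = 90, 140, 210
TN = 9560  # hypothetical value, not given on the slide

P = TP + FN          # actual positives
N = TN + FP          # actual negatives

accuracy    = (TP + TN) / (TP + TN + FP + FN)
precision   = TP / (TP + FP)
recall      = TP / (TP + FN)   # == sensitivity
sensitivity = TP / P
specificity = TN / N

print(f"Precision   = {precision:.2%}")   # 39.13%
print(f"Recall      = {recall:.2%}")      # 30.00%
print(f"Accuracy    = {accuracy:.2%}")
print(f"Specificity = {specificity:.2%}")
```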
5.9 Evaluating Numeric Prediction
• Same strategies: independent test set, cross-validation, significance tests, etc.
• The difference: the error measures
• Actual target values: a1, a2, ..., an; predicted target values: p1, p2, ..., pn
• Most popular measure: the mean-squared error
  MSE = [(p1 - a1)^2 + (p2 - a2)^2 + ... + (pn - an)^2] / n
• Mean-squared error is easy to manipulate mathematically
Other measures:
• The root mean-squared error, RMSE = sqrt(MSE), gives an error in the same units as the target values
• The mean absolute error, MAE = [|p1 - a1| + |p2 - a2| + ... + |pn - an|] / n, is less sensitive to outliers
• Sometimes relative error values are more appropriate (e.g. 10% for an error of 50 when predicting 500)
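A minimal sketch of these basic error measures using NumPy; the arrays a and p stand for the actual and predicted target values defined above, with made-up numbers.

```python
# Sketch: basic error measures for numeric prediction.
import numpy as np

a = np.array([500.0, 250.0, 100.0, 80.0])   # actual target values a1..an
p = np.array([550.0, 240.0, 130.0, 70.0])   # predicted target values p1..pn

mse  = np.mean((p - a) ** 2)       # mean-squared error
rmse = np.sqrt(mse)                # root mean-squared error
mae  = np.mean(np.abs(p - a))      # mean absolute error

print(f"MSE = {mse:.2f}, RMSE = {rmse:.2f}, MAE = {mae:.2f}")
```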
Improvement on the Mean
How much does the scheme improve on simply predicting the average? Let ā be the mean of the actual values.
• The relative squared error is:
  RSE = [(p1 - a1)^2 + ... + (pn - an)^2] / [(ā - a1)^2 + ... + (ā - an)^2]
• The root relative squared error is the square root of the relative squared error
• The relative absolute error is:
  RAE = [|p1 - a1| + ... + |pn - an|] / [|ā - a1| + ... + |ā - an|]
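Continuing the sketch above, the relative measures compare the scheme's errors with those of a baseline that always predicts the mean of the actual values; the arrays a and p are the same illustrative values as before.

```python
# Sketch: relative error measures (improvement over predicting the mean).
import numpy as np

a = np.array([500.0, 250.0, 100.0, 80.0])   # actual target values
p = np.array([550.0, 240.0, 130.0, 70.0])   # predicted target values

a_bar = a.mean()                             # the trivial "predict the mean" baseline

rse  = np.sum((p - a) ** 2) / np.sum((a_bar - a) ** 2)    # relative squared error
rrse = np.sqrt(rse)                                       # root relative squared error
rae  = np.sum(np.abs(p - a)) / np.sum(np.abs(a_bar - a))  # relative absolute error

print(f"RSE = {rse:.3f}, RRSE = {rrse:.3f}, RAE = {rae:.3f}")
```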
Correlation Coefficient
• Measures the statistical correlation between the predicted values and the actual values
• Scale independent, between -1 and +1
• Good performance leads to large values
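For completeness, the correlation coefficient for the same small example can be computed directly with NumPy (a sketch, not part of the slides):

```python
# Sketch: correlation between predicted and actual values.
import numpy as np

a = np.array([500.0, 250.0, 100.0, 80.0])   # actual target values
p = np.array([550.0, 240.0, 130.0, 70.0])   # predicted target values

# Pearson correlation coefficient: scale independent, in [-1, +1].
r = np.corrcoef(p, a)[0, 1]
print(f"correlation = {r:.3f}")
```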
Which Measure?
• Best to look at all of them
• Often it doesn't matter which one is used: schemes that perform well on one error measure tend to perform well on the others
Issues Affecting Model Selection
• Accuracy: classifier accuracy in predicting the class label
• Speed: time to construct the model (training time) and time to use the model (classification/prediction time)
• Robustness: handling noise and missing values
• Scalability: efficiency on disk-resident databases
• Interpretability: understanding and insight provided by the model
• Other measures, e.g., goodness of rules, such as decision tree size or compactness of classification rules