A Support Vector Method for Multivariate Performance Measures
Author: Thorsten Joachims (ICML’05)
Presenter: Lei Tang
Motivation
• Current classifiers focus on error rate. How can we optimize directly for other performance measures?
• Precision, recall, F-measure, etc.
Existing Approaches
• Accurately estimate the class-membership probabilities of each example. (Difficult)
• Optimize different tractable variants of the measure. But for a non-linear measure (F-measure), extensive cross-validation (CV) is required.
• Directly optimize the measure, as has been done for ROCArea. But not for the F-measure.
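Reformulation
• Sample-based loss: given training examples $S = ((x_1, y_1), \ldots, (x_n, y_n))$ and test examples $S'$ of size $n'$, our goal is to minimize $R^{\Delta}(h) = \int \Delta\big((h(x'_1), \ldots, h(x'_{n'})), (y'_1, \ldots, y'_{n'})\big)\, d\Pr(S')$, with empirical loss $\hat{R}^{\Delta}_S(h) = \Delta\big((h(x_1), \ldots, h(x_n)), (y_1, \ldots, y_n)\big)$
• Example-based loss: decompose the loss function linearly, $\Delta\big((h(x'_1), \ldots, h(x'_{n'})), (y'_1, \ldots, y'_{n'})\big) = \frac{1}{n'} \sum_{i=1}^{n'} \delta(h(x'_i), y'_i)$, which gives $R^{\delta}(h) = \int \delta(h(x'), y')\, d\Pr(x', y')$
• Empirical loss: $\hat{R}^{\delta}_S(h) = \frac{1}{n} \sum_{i=1}^{n} \delta(h(x_i), y_i)$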
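The difference between an example-based and a sample-based loss can be illustrated with a small sketch. The function names and the toy labels below are illustrative assumptions, not part of the slides; labels are assumed to be in {+1, -1}, and F1 is taken to be 0 when it is undefined. Error rate is an average of per-example losses, while the F1 loss depends on the whole prediction vector, so two predictions with the same error rate can have different F1 loss.

```python
def contingency(y_true, y_pred):
    """Return (a, b, c, d) = (true pos, false pos, false neg, true neg)."""
    pairs = list(zip(y_true, y_pred))
    a = sum(1 for t, p in pairs if t == 1 and p == 1)
    b = sum(1 for t, p in pairs if t == -1 and p == 1)
    c = sum(1 for t, p in pairs if t == 1 and p == -1)
    d = sum(1 for t, p in pairs if t == -1 and p == -1)
    return a, b, c, d


def error_rate(y_true, y_pred):
    """Example-based loss: the mean of per-example 0/1 losses."""
    return sum(1 for t, p in zip(y_true, y_pred) if t != p) / len(y_true)


def f1_loss(y_true, y_pred):
    """Sample-based loss: 1 - F1, computed from the whole prediction vector."""
    a, b, c, _ = contingency(y_true, y_pred)
    denom = 2 * a + b + c
    f1 = 2 * a / denom if denom else 0.0  # assumed convention when undefined
    return 1.0 - f1


y_true = [1, 1, -1, -1, -1, -1]
pred_fn = [1, -1, -1, -1, -1, -1]  # one false negative
pred_fp = [1, 1, 1, -1, -1, -1]    # one false positive

print(error_rate(y_true, pred_fn), error_rate(y_true, pred_fp))
print(f1_loss(y_true, pred_fn), f1_loss(y_true, pred_fp))
```

Both predictions make exactly one mistake, yet the false negative costs more F1 than the false positive on this toy sample.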
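SVM
• Original SVM: $\min_{w,\, \xi \ge 0} \frac{1}{2} w \cdot w + C \sum_{i=1}^{n} \xi_i$ subject to $\forall i: y_i (w \cdot x_i) \ge 1 - \xi_i$
• Multivariate SVM: $\min_{w,\, \xi \ge 0} \frac{1}{2} \|w\|^2 + C \xi$ subject to $\forall \bar{y}' \in \bar{Y} \setminus \bar{y}: w^T [\Psi(\bar{x}, \bar{y}) - \Psi(\bar{x}, \bar{y}')] \ge \Delta(\bar{y}', \bar{y}) - \xi$
Here, $\bar{x} = (x_1, \ldots, x_n)$ and $\bar{y} = (y_1, \ldots, y_n)$ are the tuples of all inputs and labels, and $\Psi$ is a function that returns a feature vector for the pair $(\bar{x}, \bar{y})$.
• Prediction: $\bar{h}(\bar{x}) = \arg\max_{\bar{y}' \in \bar{Y}} w^T \Psi(\bar{x}, \bar{y}')$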
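As a minimal sketch of the prediction rule, assume the linear joint feature map $\Psi(\bar{x}, \bar{y}) = \sum_i y_i x_i$ (the slide does not spell out the form of the feature function, so this choice is an assumption). Under that assumption the argmax over all labelings decomposes per example, and the multivariate prediction reduces to taking the sign of $w \cdot x_i$. The helper names and toy data are hypothetical.

```python
from itertools import product


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def psi(xs, ys):
    """Assumed joint feature map: Psi(x_bar, y_bar) = sum_i y_i * x_i."""
    dim = len(xs[0])
    return [sum(y * x[j] for x, y in zip(xs, ys)) for j in range(dim)]


def predict(w, xs):
    """argmax over labelings of w . Psi, solved per example via the sign of w . x_i."""
    return [1 if dot(w, x) > 0 else -1 for x in xs]


def predict_brute_force(w, xs):
    """Same argmax by enumerating all 2^n labelings (only feasible for tiny n)."""
    best = max(product((1, -1), repeat=len(xs)),
               key=lambda ys: dot(w, psi(xs, ys)))
    return list(best)


w = [0.5, -1.0]
xs = [[1.0, 2.0], [-1.0, 0.5], [0.3, -2.0]]
print(predict(w, xs))
print(predict_brute_force(w, xs))
```

The brute-force version is included only to check the shortcut on a toy input; it enumerates every labeling and is exponential in the number of examples.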
Problems
• Too many constraints! With N samples and k class labels, $|\bar{Y}| = k^N$.
• Do we really need to include all the constraints?
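Algorithm: Constraint Selection
• Start with an empty working set of constraints.
• Repeatedly find the most violated constraint under the current w, add it to the working set, and re-solve the optimization problem over the working set only.
• Stop when no constraint is violated by more than a tolerance ε.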
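Contingency Table
• Still impractical! To find the most violated constraint we have to calculate $\arg\max_{\bar{y}' \in \bar{Y}} \{\Delta(\bar{y}', \bar{y}) + w^T \Psi(\bar{x}, \bar{y}')\}$
• Contingency table:

               y = +1    y = -1
  h(x) = +1      a         b
  h(x) = -1      c         d

• With N samples, how many different tables are there?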
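One way to explore the question of how many tables exist is to count them on a toy sample. The sketch below assumes binary labels in {+1, -1} and the usual reading of the table entries (a = true positives, b = false positives, c = false negatives, d = true negatives); the function names are hypothetical. A table is fixed by choosing a and d, so there are (#pos + 1) × (#neg + 1) tables, compared with 2^N labelings.

```python
from itertools import product


def tables_by_enumeration(n_pos, n_neg):
    """All (a, b, c, d) tables: a and d are free, b and c follow from them."""
    return {(a, n_neg - d, n_pos - a, d)
            for a in range(n_pos + 1)
            for d in range(n_neg + 1)}


def tables_by_brute_force(y_true):
    """Collect the table of every one of the 2^N labelings (tiny N only)."""
    tables = set()
    for y_pred in product((1, -1), repeat=len(y_true)):
        pairs = list(zip(y_true, y_pred))
        a = sum(1 for t, p in pairs if t == 1 and p == 1)
        b = sum(1 for t, p in pairs if t == -1 and p == 1)
        c = sum(1 for t, p in pairs if t == 1 and p == -1)
        d = sum(1 for t, p in pairs if t == -1 and p == -1)
        tables.add((a, b, c, d))
    return tables


y_true = [1, 1, -1, -1, -1]
n_pos = y_true.count(1)
n_neg = y_true.count(-1)
print(2 ** len(y_true), "labelings")
print(len(tables_by_enumeration(n_pos, n_neg)), "contingency tables")
```

The number of tables grows at most quadratically in N, which is what makes a search over tables attractive compared with a search over labelings.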
Algorithm for argmax
• Given a table, what should the assignment be?
• Exhaustively search all possible contingency tables and take the maximum.
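The search over contingency tables can be sketched as follows. This is a simplified illustration under stated assumptions, not the author's exact procedure: the joint feature map is assumed to be $\Psi(\bar{x}, \bar{y}) = \sum_i y_i x_i$, so $w^T \Psi$ becomes $\sum_i y'_i s_i$ with scores $s_i = w \cdot x_i$; the loss is taken as 1 - F1; and all names are hypothetical. For a fixed table, the best assignment labels the a highest-scoring positives as +1 and the d lowest-scoring negatives as -1. The sketch rebuilds each labeling from scratch, so it is cubic in N rather than optimized.

```python
from itertools import product


def f1_loss(a, b, c, d):
    """Assumed loss: 1 - F1, with F1 = 0 when it is undefined."""
    denom = 2 * a + b + c
    return 1.0 - (2 * a / denom if denom else 0.0)


def argmax_over_tables(scores, y_true, loss):
    """Maximize loss(a, b, c, d) + sum_i y'_i * scores[i] over all labelings y'."""
    n = len(y_true)
    # Positives sorted by descending score, negatives by ascending score.
    pos = sorted((i for i in range(n) if y_true[i] == 1), key=lambda i: -scores[i])
    neg = sorted((i for i in range(n) if y_true[i] == -1), key=lambda i: scores[i])
    best_val, best_y = float("-inf"), None
    for a in range(len(pos) + 1):
        c = len(pos) - a
        for d in range(len(neg) + 1):
            b = len(neg) - d
            y = [0] * n
            for rank, i in enumerate(pos):
                y[i] = 1 if rank < a else -1   # top-a positives predicted +1
            for rank, i in enumerate(neg):
                y[i] = -1 if rank < d else 1   # bottom-d negatives predicted -1
            val = loss(a, b, c, d) + sum(yi * s for yi, s in zip(y, scores))
            if val > best_val:
                best_val, best_y = val, y
    return best_y, best_val


def argmax_brute_force(scores, y_true, loss):
    """Reference answer by enumerating all 2^n labelings (tiny n only)."""
    best_val = float("-inf")
    for y in product((1, -1), repeat=len(y_true)):
        pairs = list(zip(y_true, y))
        a = sum(1 for t, p in pairs if t == 1 and p == 1)
        b = sum(1 for t, p in pairs if t == -1 and p == 1)
        c = sum(1 for t, p in pairs if t == 1 and p == -1)
        d = sum(1 for t, p in pairs if t == -1 and p == -1)
        val = loss(a, b, c, d) + sum(yi * s for yi, s in zip(y, scores))
        best_val = max(best_val, val)
    return best_val


y_true = [1, 1, -1, -1, -1]
scores = [0.8, -0.3, 0.5, -1.2, 0.1]  # hypothetical values of w . x_i
labeling, value = argmax_over_tables(scores, y_true, f1_loss)
print(labeling, value)
```

Comparing the result with `argmax_brute_force` on small inputs is a quick sanity check that searching over tables loses nothing relative to searching over labelings.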
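Various Losses
• F-measure: $F_\beta = \frac{(1 + \beta^2)\, a}{(1 + \beta^2)\, a + b + \beta^2 c}$
• Precision/Recall at k (just look at the top k data points): $\mathrm{Prec} = \frac{a}{a + b}$, $\mathrm{Rec} = \frac{a}{a + c}$
• Precision/Recall Break-Even Point: the search space is reduced because $a + b = a + c$, i.e., $b = c$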
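The measures above can all be written as functions of the contingency table, which the following sketch illustrates. The function names are hypothetical, undefined ratios are assumed to be 0, and the restriction a + b = k used for "precision/recall at k" is an assumption about how "top k data points" translates into a constraint on the table. The sketch also shows how the break-even restriction b = c shrinks the set of tables to search.

```python
def f_beta(a, b, c, beta=1.0):
    """F-measure from table entries; assumed to be 0 when undefined."""
    denom = (1 + beta ** 2) * a + b + beta ** 2 * c
    return (1 + beta ** 2) * a / denom if denom else 0.0


def precision(a, b):
    return a / (a + b) if a + b else 0.0


def recall(a, c):
    return a / (a + c) if a + c else 0.0


def all_tables(n_pos, n_neg):
    """Every (a, b, c, d) table for a sample with n_pos positives, n_neg negatives."""
    return [(a, n_neg - d, n_pos - a, d)
            for a in range(n_pos + 1)
            for d in range(n_neg + 1)]


def prbep_tables(n_pos, n_neg):
    """Break-even tables: a + b = a + c, i.e. b = c."""
    return [t for t in all_tables(n_pos, n_neg) if t[1] == t[2]]


def at_k_tables(n_pos, n_neg, k):
    """Tables that predict exactly k positives (assumed reading of 'top k')."""
    return [t for t in all_tables(n_pos, n_neg) if t[0] + t[1] == k]


print(f_beta(2, 1, 0))
print(len(all_tables(3, 7)), len(prbep_tables(3, 7)), len(at_k_tables(3, 7, 2)))
```

On this toy setting with 3 positives and 7 negatives, the break-even restriction leaves one table per value of a, so the search becomes linear instead of quadratic in the sample size.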