Learning Neighborhoods for Metric Learning Jun Wang Adam Woznica Alexandros Kalousis University of Applied Sciences Western Switzerland University of Geneva
Outline • Metric learning • Related works • Our approach • Experiments • Conclusion
Computing the distance between instances is a fundamental problem in machine learning. A standard distance metric often lacks the discriminative power to reflect class membership. [Figure: example instances annotated with "Large distance" and "Small distance".]
Metric Learning • Metric learning learns a distance function that reflects the given supervised information. • The most popular metric learning approach learns a Mahalanobis distance metric. • The given information is often represented as constraints involved in the learning process: a) large-margin triplets; b) (dis-)similar pairs (written out below).
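For reference, the Mahalanobis metric and the two constraint types can be written in their standard textbook form (the notation below is assumed, not taken from the slides):

$$ d_{\mathbf{M}}(\mathbf{x}_i,\mathbf{x}_j) \;=\; \sqrt{(\mathbf{x}_i-\mathbf{x}_j)^\top \mathbf{M}\,(\mathbf{x}_i-\mathbf{x}_j)}, \qquad \mathbf{M} \succeq 0 $$

a) Large-margin triplet $(\mathbf{x}_i,\mathbf{x}_j,\mathbf{x}_k)$: $\; d^2_{\mathbf{M}}(\mathbf{x}_i,\mathbf{x}_k) \ge d^2_{\mathbf{M}}(\mathbf{x}_i,\mathbf{x}_j) + 1$, where $\mathbf{x}_j$ shares the label of $\mathbf{x}_i$ and $\mathbf{x}_k$ does not. b) (Dis-)similar pairs: $\; d^2_{\mathbf{M}}(\mathbf{x}_i,\mathbf{x}_j) \le u$ for similar pairs and $\; d^2_{\mathbf{M}}(\mathbf{x}_i,\mathbf{x}_j) \ge l$ for dissimilar pairs.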
Metric Learning • Pipeline: given supervised information → information representation → metric learning. • Information representation: which type of constraints to use? How to extract these constraints? • Metric learning: what kind of objective to optimize for these constraints?
Outline • Metric learning • Related works • Our approach • Experiments • Conclusion
Related Works • Here we only focus on supervised metric learning for nearest neighbor classification.
Large Margin Metric Learning LMNN [Weinberger et al. 05], BoostMetric [Shen et al. 09], SML [Ying et al. 09] • Objective: a large-margin loss plus regularization. • Define the target neighbors of each instance as its k same-class nearest neighbors in the Euclidean space. • Minimize the distance from each point to its target neighbors. • Push away impostors, maintaining a local margin between target neighbors and impostors. • Target neighbors are predefined and are not changed during the learning process. • State-of-the-art predictive performance.
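For reference, the standard LMNN objective (as in the original LMNN paper; notation assumed here) makes the pull on target neighbors and the large-margin push on impostors explicit:

$$ \min_{\mathbf{M}\succeq 0}\;\; (1-\mu)\!\!\sum_{i,\,j\rightsquigarrow i}\! d^2_{\mathbf{M}}(\mathbf{x}_i,\mathbf{x}_j) \;+\; \mu \!\!\sum_{i,\,j\rightsquigarrow i}\;\sum_{k:\,y_k\neq y_i}\! \big[\,1 + d^2_{\mathbf{M}}(\mathbf{x}_i,\mathbf{x}_j) - d^2_{\mathbf{M}}(\mathbf{x}_i,\mathbf{x}_k)\,\big]_+ $$

where $j\rightsquigarrow i$ means $\mathbf{x}_j$ is a target neighbor of $\mathbf{x}_i$ and $[\cdot]_+$ is the hinge function.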
Information-Theoretic Metric Learning ITML [Davis et al. 07], Bk-means [Wu et al. 09] • The Mahalanobis metric is related to the inverse covariance matrix of a multivariate Gaussian distribution. • Metric learning is formulated as minimizing the differential relative entropy between two multivariate Gaussians (one defined by a prior metric), subject to pairwise constraints. • Efficient learning algorithm. • Pairwise constraints are randomly predefined and, again, not changed during the learning process.
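In its usual form (as in the ITML paper; notation assumed here), the problem reads:

$$ \min_{\mathbf{M}\succeq 0}\; \mathrm{KL}\big(p(\mathbf{x};\mathbf{M}_0)\,\|\,p(\mathbf{x};\mathbf{M})\big) \quad \text{s.t.}\quad d^2_{\mathbf{M}}(\mathbf{x}_i,\mathbf{x}_j)\le u \;\;\forall (i,j)\in\mathcal{S}, \qquad d^2_{\mathbf{M}}(\mathbf{x}_i,\mathbf{x}_j)\ge l \;\;\forall (i,j)\in\mathcal{D} $$

where $\mathbf{M}_0$ is the prior metric and $\mathcal{S}$, $\mathcal{D}$ are the sets of similar and dissimilar pairs; the KL term reduces to a LogDet divergence between $\mathbf{M}$ and $\mathbf{M}_0$.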
Stochastic Metric Learning NCA [Goldberger et al. 04], MCML [Globerson and Roweis 05] • Both build on a stochastic (soft) nearest-neighbor model. • NCA: minimize the leave-one-out (LOO) nearest-neighbor error. • MCML: minimize a KL divergence that collapses same-class samples onto one point and pushes different-class samples far away from each other. • The stochastic nearest-neighbor model is computationally expensive. • Target neighbors and impostors are learned implicitly via LOO error minimization.
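Both methods use the same soft-neighbor model (standard NCA notation, assumed here):

$$ p_{ij} \;=\; \frac{\exp\!\big(-d^2_{\mathbf{M}}(\mathbf{x}_i,\mathbf{x}_j)\big)}{\sum_{k\neq i}\exp\!\big(-d^2_{\mathbf{M}}(\mathbf{x}_i,\mathbf{x}_k)\big)}, \qquad p_{ii}=0 $$

NCA maximizes $\sum_i \sum_{j:\,y_j=y_i} p_{ij}$, the expected number of correctly classified points under this model; MCML instead minimizes the KL divergence between $p_{i\cdot}$ and an ideal distribution that places all mass on same-class neighbors.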
Motivation • Apart from NCA, no existing method learns the target neighbors. • The choice of target neighbors is crucial for metric learning. Can we learn the target neighbors? Yes.
Outline • Metric learning • Related works • Our approach • Experiments • Conclusion
Reformulation • Large margin metric learning can be rewritten as a weighted sum of pairwise losses; this is a general formulation that covers many metric learning methods. • An indicator matrix encodes the predefined target-neighbor relation. • Each weighted term represents the loss induced by a same-class instance pair.
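A sketch of this general form, with notation assumed here ($P$ is the target-neighbor indicator matrix, $\ell_{\mathbf{M}}$ the pairwise loss, $R$ a regularizer):

$$ \min_{\mathbf{M}\succeq 0}\; \sum_{i}\sum_{j} P_{ij}\,\ell_{\mathbf{M}}(\mathbf{x}_i,\mathbf{x}_j) \;+\; \lambda\,R(\mathbf{M}), \qquad P_{ij}\in\{0,1\},\;\; P_{ij}=1 \Rightarrow y_i=y_j $$

With $P$ fixed to the k same-class Euclidean nearest neighbors and $\ell_{\mathbf{M}}$ chosen as LMNN's pull-plus-hinge term, this recovers LMNN.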
Methodology • Learn the target neighbors together with the distance metric. • It makes sense to minimize the objective over the neighborhood assignment as well, since each weighted term represents the loss induced by a same-class instance pair. • Minimizing over P favors local target neighbors. • Because the number of target neighbors is not fixed per instance, instances in sparse regions can be assigned fewer target neighbors and instances in dense regions more.
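Schematically, the joint problem optimizes over both $\mathbf{M}$ and $P$; the constraint set shown below (a per-instance minimum $k_{\min}$ plus a total neighborhood budget $K$) is an illustrative assumption, not necessarily the paper's exact formulation:

$$ \min_{\mathbf{M}\succeq 0,\; P\in\{0,1\}^{n\times n}} \; \sum_{i,j} P_{ij}\,\ell_{\mathbf{M}}(\mathbf{x}_i,\mathbf{x}_j) \quad \text{s.t.}\quad P_{ij}=1 \Rightarrow y_i=y_j, \qquad \sum_j P_{ij} \ge k_{\min}\;\;\forall i, \qquad \sum_{i,j} P_{ij} = K $$

Under such a budgeted scheme, instances in sparse regions keep only the minimum number of target neighbors while instances in dense regions absorb more of the budget.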
Optimization • Alternating optimization (see the sketch below). • Fixing the neighborhood matrix P, learning the metric M is a standard metric learning problem. • Fixing M, learning P is a linear programming problem with integer optimal solutions. Proof: show that the constraint matrix is totally unimodular.
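A minimal sketch of this alternating scheme, assuming hypothetical helpers: learn_metric stands for any standard metric learner run with the current target-neighbor assignment, and the greedy solve_neighborhood_lp below is a simple stand-in for the LP over P, not the paper's exact solver.

import numpy as np

def pairwise_loss(M, X, i, j):
    """Loss induced by the pair (i, j) under metric M; the squared
    Mahalanobis distance is used as a simple stand-in."""
    d = X[i] - X[j]
    return float(d @ M @ d)

def solve_neighborhood_lp(M, X, y, k=3):
    """Greedy stand-in for the LP over P: for each instance keep the k
    same-class neighbors with the smallest current pairwise loss."""
    n = len(X)
    P = np.zeros((n, n), dtype=int)
    for i in range(n):
        same = [j for j in range(n) if j != i and y[j] == y[i]]
        same.sort(key=lambda j: pairwise_loss(M, X, i, j))
        P[i, same[:k]] = 1
    return P

def learn_neighborhoods(X, y, learn_metric, n_iter=5, k=3):
    """Alternate between learning M given P and learning P given M."""
    M = np.eye(X.shape[1])                  # start from the Euclidean metric
    P = solve_neighborhood_lp(M, X, y, k)
    for _ in range(n_iter):
        M = learn_metric(X, y, P)           # standard metric learning step (e.g. LMNN)
        P = solve_neighborhood_lp(M, X, y, k)
    return M, P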
Outline • Metric learning • Related works • Our approach • Experiments • Conclusion
Experiments • Does learning neighborhoods improve the classification accuracy?
Setup • Comparison methods • LMNN: predefined neighborhoods • NCA: implicit neighborhood learning • LN-LMNN: explicit neighborhood learning, with parameter tuning • 1-NN • Parameter setting • LMNN: 3 predefined target neighbors for each instance. • LN-LMNN: 2-fold cross-validation to select its parameter.
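As a concrete illustration of such tuning (not the authors' exact protocol), a 2-fold selection loop could look as follows; fit_metric_with_k and the candidate grid k_grid are hypothetical stand-ins, and each candidate is scored by 1-NN accuracy in the learned metric space.

import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier

def select_k_by_cv(X, y, fit_metric_with_k, k_grid=(1, 2, 3, 4, 5)):
    """Pick the neighborhood parameter k by 2-fold cross-validation."""
    cv = StratifiedKFold(n_splits=2, shuffle=True, random_state=0)
    scores = {}
    for k in k_grid:
        fold_acc = []
        for train, test in cv.split(X, y):
            M = fit_metric_with_k(X[train], y[train], k)           # learned Mahalanobis matrix
            L = np.linalg.cholesky(M + 1e-8 * np.eye(M.shape[0]))  # x -> L^T x: Euclidean = Mahalanobis
            knn = KNeighborsClassifier(n_neighbors=1).fit(X[train] @ L, y[train])
            fold_acc.append(knn.score(X[test] @ L, y[test]))
        scores[k] = float(np.mean(fold_acc))
    return max(scores, key=scores.get)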
Examined Datasets 5 small and 7 large datasets.
LN-LMNN achieves better accuracy than LMNN on 4 out of the 5 small datasets, and better accuracy than NCA on 3 of them.
LN-LMNN achieves better accuracy than LMNN on 6 out of the 7 large datasets. • NCA does not scale to datasets of this size. More experimental results and comparison methods are described in the paper.
Conclusion and future work • In this work, we present a simple, general neighborhood-learning method for metric learning. • Learning neighborhoods does improve predictive performance. • Future work: a more theoretically motivated problem formulation. • Future work: learning neighborhoods in the semi-supervised setting, e.g. graph-based semi-supervised learning.