Kernel Methods for Weakly Supervised Mean Shift Clustering Oncel Tuzel & Fatih Porikli Mitsubishi Electric Research Labs Peter Meer Rutgers University
Outline • Motivation • Mean Shift • Method Overview • Kernel Mean Shift • Constrained Kernel Mean Shift • Experiments • Conclusion
Motivation • Clustering is an inherently ambiguous task • In many cases the initially designed similarity metric fails to resolve the ambiguities • Simple supervision can guide clustering toward the desired structure • We present a semi-supervised mean shift clustering algorithm based on pair-wise similarity constraints
Mean Shift • Given n data points x_i on R^d and associated bandwidths h_i, the sample point density estimator is given by

f(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h_i^d} \, k\!\left( \left\| \frac{x - x_i}{h_i} \right\|^2 \right)

where k(x) is the kernel profile • Stationary points of the density can be found via the mean shift procedure

\bar{x} = \frac{ \sum_{i=1}^{n} \frac{x_i}{h_i^{d+2}} \, g\!\left( \left\| \frac{x - x_i}{h_i} \right\|^2 \right) }{ \sum_{i=1}^{n} \frac{1}{h_i^{d+2}} \, g\!\left( \left\| \frac{x - x_i}{h_i} \right\|^2 \right) }

where g(x) = -k'(x)
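As an illustration, the procedure above can be sketched in a few lines of NumPy. This is a minimal hypothetical sketch, not the authors' implementation; it assumes the Gaussian profile k(x) = exp(-x/2), so g(x) = -k'(x) is proportional to k(x) and the constant cancels in the weighted mean. The function name and its arguments are choices made here.

```python
import numpy as np

def mean_shift(X, h, n_iter=50, tol=1e-6):
    """Run the mean shift procedure starting from every data point.

    X : (n, d) data points, h : (n,) per-point bandwidths.
    Gaussian profile assumed: g(.) proportional to exp(-0.5 * .).
    """
    n, d = X.shape
    Y = X.copy()                                    # one iterate per starting point
    for _ in range(n_iter):
        Y_new = np.empty_like(Y)
        for j in range(n):
            d2 = np.sum((Y[j] - X) ** 2, axis=1) / h ** 2
            w = np.exp(-0.5 * d2) / h ** (d + 2)    # g(.) / h_i^{d+2}
            Y_new[j] = w @ X / w.sum()              # weighted mean of the data
        if np.max(np.abs(Y_new - Y)) < tol:         # stop when all iterates settle
            Y = Y_new
            break
        Y = Y_new
    return Y
```

Each row of the returned array is the mode its starting point converged to.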
Mean Shift Clustering • Mean shift iterations are initialized at the data points • The cluster centers are located by the mean shift procedure • The data points associated with the same local maximum of the density function produce a partitioning of the space • Until now there has been no systematic semi-supervised mean shift algorithm
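The partitioning step can be sketched as follows: once the iterates have converged, points whose iterates ended near the same location are assigned to the same mode. The helper name and the `eps` tolerance below are illustrative assumptions, and the sketch assumes modes are separated by much more than `eps`.

```python
import numpy as np

def group_modes(Y, eps=1e-2):
    """Label each converged mean shift iterate by its mode.

    Y : (n, d) converged iterates, one per original data point.
    Iterates within eps of each other share a cluster label.
    """
    n = Y.shape[0]
    labels = -np.ones(n, dtype=int)
    k = 0
    for i in range(n):
        if labels[i] >= 0:
            continue                                 # already assigned
        close = np.linalg.norm(Y - Y[i], axis=1) < eps
        labels[close] = k                            # all iterates at this mode
        k += 1
    return labels
```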
Method Overview • The supervision is given in the form of a few pair-wise similarity constraints • We embed the input space into a space where the constrained pairs are associated with the same mode • Mode seeking is performed on the embedded space • The method preserves all the advantages of mean shift clustering
[Figure: input space with constrained point pairs (marked x) mapped to the embedded space, where each pair collapses onto the same mode]
Pair-wise Constraints on the Input Space • Data points are projected onto the null space of the constraint matrix • Since the constrained point pairs overlap after projection, they are clustered together • The method fails if the clusters are not linearly separable • At most d-1 constraints can be defined
[Figure: input points, constraint vector, projection, and resulting clustering]
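The input-space projection can be sketched directly: with each constraint vector x_i - x_j as a row of A, the matrix P = I - A^T (A A^T)^{-1} A sends every constraint vector to zero, so constrained pairs coincide after projection. The function name below is a choice made here, and A A^T is assumed invertible.

```python
import numpy as np

def null_space_projection(X, pairs):
    """Project data onto the null space of the constraint matrix.

    X : (n, d) data; pairs : list of (i, j) index pairs to be merged.
    Each row of A is the difference x_i - x_j.
    """
    A = np.array([X[i] - X[j] for i, j in pairs])          # (m, d) constraint matrix
    P = np.eye(X.shape[1]) - A.T @ np.linalg.solve(A @ A.T, A)
    return X @ P.T                                         # P is symmetric and idempotent
```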
Pair-wise Constraints on the Feature Space • The method can be extended to handle a larger number of constraints, or the linearly inseparable case, using a mapping function • The mapping embeds the input space into an enlarged feature space • The projection is performed on the feature space • Defining the mapping explicitly is not practical. Solution: the kernel trick
[Figure: input points mapped to the feature space, constraint vector, projection, and resulting clustering]
Kernel Mean Shift (Explicit Form) • Given a mapping \phi : R^d \to H to a d_\phi-dimensional feature space and a p.s.d. kernel satisfying K(x, x') = \phi(x)^\top \phi(x') • The density estimator at y = \phi(x) is given by

f(y) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h_i^{d_\phi}} \, k\!\left( \left\| \frac{y - \phi(x_i)}{h_i} \right\|^2 \right)

• The stationary points can be found via the mean shift procedure

\bar{y} = \frac{ \sum_{i=1}^{n} \frac{\phi(x_i)}{h_i^{d_\phi+2}} \, g\!\left( \left\| \frac{y - \phi(x_i)}{h_i} \right\|^2 \right) }{ \sum_{i=1}^{n} \frac{1}{h_i^{d_\phi+2}} \, g\!\left( \left\| \frac{y - \phi(x_i)}{h_i} \right\|^2 \right) }
Kernel Mean Shift (Implicit Form) • Let \Phi = [\phi(x_1) \ldots \phi(x_n)] be the d_\phi \times n feature matrix and K = \Phi^\top \Phi be the n \times n kernel matrix • At each iteration the estimate y lies in the column space of \Phi, and any point on that subspace can be written as y = \Phi \alpha_y • The distance between two points y = \Phi \alpha_y and y' = \Phi \alpha_{y'} is given by

\|y - y'\|^2 = \alpha_y^\top K \alpha_y + \alpha_{y'}^\top K \alpha_{y'} - 2 \alpha_y^\top K \alpha_{y'}

• The implicit form of mean shift updates the weighting vectors

\bar{\alpha} = \frac{ \sum_{i=1}^{n} \frac{e_i}{h_i^{d_\phi+2}} \, g\!\left( \frac{\|y - \phi(x_i)\|^2}{h_i^2} \right) }{ \sum_{i=1}^{n} \frac{1}{h_i^{d_\phi+2}} \, g\!\left( \frac{\|y - \phi(x_i)\|^2}{h_i^2} \right) }

where e_i denotes the i-th canonical basis vector for R^n (so \alpha_{\phi(x_i)} = e_i)
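One implicit-form update can be sketched as below. All distances are computed through the kernel matrix alone, using the identity that the squared distance from y = Phi alpha to phi(x_i) = Phi e_i equals alpha^T K alpha + K_ii - 2 (K alpha)_i. This is a sketch, not the authors' code: the Gaussian profile is assumed, and treating d_phi as a fixed constant argument is an assumption made here.

```python
import numpy as np

def kernel_mean_shift_step(K, alpha, h, d_phi=1.0):
    """One implicit mean shift update of a weighting vector.

    K : (n, n) kernel matrix, alpha : (n,) weights so that y = Phi alpha,
    h : (n,) per-point bandwidths. Gaussian profile assumed.
    """
    yy = alpha @ K @ alpha
    # ||y - phi(x_i)||^2 via the kernel matrix only:
    d2 = yy + np.diag(K) - 2.0 * (K @ alpha)
    w = np.exp(-0.5 * d2 / h ** 2) / h ** (d_phi + 2)   # g(.) / h_i^{d_phi+2}
    return w / w.sum()          # new alpha: convex combination of the e_i
```

Iterating this map from alpha = e_i for each i gives the implicit analogue of the explicit procedure.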
Kernel Mean Shift Clustering • The clustering algorithm starts the iterations at the data points, \alpha_i = e_i • Upon convergence each mode can be expressed via its weighting vector \bar{\alpha} • When the rank of the kernel matrix K is smaller than n, the columns of \Phi form an overcomplete basis and the modes can only be identified up to an equivalence relationship: \bar{\alpha} and \bar{\alpha}' represent the same mode when (\bar{\alpha} - \bar{\alpha}')^\top K (\bar{\alpha} - \bar{\alpha}') = 0 • The procedure is restricted to the subspace spanned by the feature points, therefore it is sufficient to search this subspace • The convergence of the procedure follows from the original mean shift proof
Constrained Kernel Mean Shift • Let \{(c_{j,1}, c_{j,2})\}_{j=1,\ldots,m} be the set of point pairs to be clustered together • The constraint matrix is given by

A = \left[ \phi(c_{1,1}) - \phi(c_{1,2}) \;\; \ldots \;\; \phi(c_{m,1}) - \phi(c_{m,2}) \right]^\top

• The null space of A is the set of vectors satisfying A v = 0, and the matrix P = I - A^\top (A A^\top)^{-1} A projects the feature space onto null(A) • Under the projection the constraint point pairs overlap: P \phi(c_{j,1}) = P \phi(c_{j,2})
[Figure: feature space, constraint vector, and projection onto the null space]
Constrained Kernel Mean Shift • The constrained mean shift algorithm implicitly maps the data points to the null space of the constraint matrix and performs mean shift on the embedded space • This process is equivalent to applying the kernel mean shift algorithm with the projected kernel function \hat{K}(x, x') = \phi(x)^\top P^\top P \, \phi(x') • The projected kernel matrix involves mappings only through the kernel function and can be expressed in terms of the original kernel matrix as

\hat{K} = K - K Z \, (Z^\top K Z)^{-1} \, Z^\top K

where Z = [e_{c_{1,1}} - e_{c_{1,2}} \;\ldots\; e_{c_{m,1}} - e_{c_{m,2}}] encodes the constraint pairs, K Z is the part of the kernel matrix involving the constraint set, and S = Z^\top K Z is the scaling matrix
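The projected kernel matrix can be computed from K alone, as sketched below. The function name is a choice made here, the constraint differences are encoded by an n-by-m indicator matrix Z, and the scaling matrix Z^T K Z is assumed invertible.

```python
import numpy as np

def projected_kernel(K, pairs):
    """Projected kernel matrix for must-link constraints.

    K : (n, n) kernel matrix; pairs : list of (i, j) must-link index pairs.
    Column c of Z is e_i - e_j, so the constraint directions in feature
    space are Phi Z; projecting them out gives
        K_hat = K - K Z (Z^T K Z)^{-1} Z^T K.
    """
    n = K.shape[0]
    Z = np.zeros((n, len(pairs)))
    for c, (i, j) in enumerate(pairs):
        Z[i, c], Z[j, c] = 1.0, -1.0
    KZ = K @ Z                                  # part of K involving the constraint set
    S = Z.T @ KZ                                # (m, m) scaling matrix
    return K - KZ @ np.linalg.solve(S, KZ.T)
```

After the projection, the rows (and columns) of the projected kernel matrix belonging to a constrained pair become identical, reflecting that the pair overlaps in the embedded space.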
Experiments • We conduct experiments on three datasets • Synthetic experiments • Clustering faces across illumination on the CMU PIE dataset • Clustering object categories on the Caltech-4 dataset • For the first two experiments we utilize the Gaussian kernel function • For the last experiment we utilize a different kernel function • We use adaptive bandwidth mean shift, where the bandwidth for each point is selected as the k-th smallest distance from the point to all the data points in the feature space
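The adaptive bandwidth rule above can be sketched as: sort each row of the pairwise distance matrix and take the k-th entry. Whether the zero self-distance counts toward k is not stated on the slide; the sketch below excludes it by taking column k (column 0 of each sorted row is the self-distance). The function name is a choice made here.

```python
import numpy as np

def adaptive_bandwidths(D, k):
    """Per-point bandwidth: distance to the k-th nearest other point.

    D : (n, n) pairwise distance matrix on the feature space.
    """
    Ds = np.sort(D, axis=1)    # column 0 is the zero self-distance
    return Ds[:, k]
```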
Clustering Linear Structure
[Figure: data points, mean shift result, and constrained mean shift result]
• We generated 240 data points originating from six different lines • The data is corrupted with normally distributed noise with standard deviation 0.1 • Three pair-wise constraints are given
Clustering Circular Structure
[Figure: data points, data points with outliers, mean shift result, and constrained mean shift result]
• We generated 200 data points originating from five concentric circles • The data is corrupted with normally distributed noise with standard deviation 0.1 • 80 outlier points are added • Four pair-wise constraints are enforced, each pairing points from the same circle
Clustering Faces Across Illumination
[Figure: samples from the CMU PIE dataset and the constraint set]
• The dataset contains 441 images of 21 subjects under 21 different illumination conditions • Images are coarsely registered and scaled to the same size, 128×128 • Each image is represented with a 16384-dimensional vector • Two pair-wise similarity constraints are given per subject • Approximately 1/10 of the dataset is labeled
Clustering Faces with Mean Shift
[Figure: pair-wise distances and the mean shift result]
• Mean shift finds 5 clusters, corresponding partly to illumination conditions and partly to subject identities
Clustering Faces with Constrained Mean Shift
[Figure: pair-wise distances after embedding and the constrained mean shift result]
• Constrained mean shift recovers all 21 subjects perfectly
Clustering Object Categories
[Figure: samples from the Caltech-4 dataset]
• The dataset contains 400 images from four object categories: cars, motorcycles, faces, airplanes • Each image is represented with a 500-bin feature histogram • Pair-wise constraints are randomly selected within classes • The experiment is repeated with a varying number of constraints (1 to 20 per object class)
Clustering Object Categories with Mean Shift
[Figure: pair-wise distances and the mean shift result]
• Some of the samples from the airplanes class and half of the motorcycles class are incorrectly identified as cars • The overall clustering accuracy is 74.25%
Clustering Object Categories with Constrained Mean Shift
[Figure: pair-wise distances after embedding and the constrained mean shift result]
• Clustering example after enforcing 10 constraints per class • Only a single example among the 400 is misclustered
Clustering Performance vs. Number of Constraints • The results are averaged over 20 runs where at each run a different constraint set is selected • Clustering accuracy is over 99% for more than 7 constraints per class
Conclusion • We presented a novel constrained mean shift clustering method that can incorporate pair-wise must-link priors • The method preserves all the advantages of the original mean shift clustering algorithm • The presented approach also extends to inner product spaces, thus it is applicable to a wide range of problems