Max-margin Clustering: Detecting Margins from Projections of Points on Lines
Raghuraman Gopalan¹ and Jagan Sankaranarayanan²
¹Center for Automation Research, University of Maryland, College Park, MD, USA
²NEC Labs, Cupertino, CA, USA
E-mail: {raghuram,jagan}@umiacs.umd.edu
Problem Statement
• Given an unlabelled set of points forming k clusters, find a grouping with maximum separating margin among the clusters
• Prior work: (mostly) establishes feedback between different label proposals and runs a supervised classifier on them
• Goal: to understand the relation between data points and margin regions by analyzing projections of data on lines
Two-cluster Problem
• Assumptions
  • Linearly separable clusters
    • Kernel trick for the non-linear case
  • No outliers in data (max margin exists only between clusters)
  • Enforce global cluster balance
• Proposition 1
  • SI* exists ONLY on line segments in the margin region that are perpendicular to the separating hyperplane
  • Such line segments directly provide cluster groupings
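The idea that a well-placed line segment "directly provides cluster groupings" can be pictured with a small sketch. This is an illustration under stated assumptions, not the authors' procedure: SI is not defined in this transcript, so the sketch assumes that "location information" means the gaps between consecutive projected points on a line. The names `project_locations` and `split_at_largest_gap` are hypothetical, and the global cluster balance assumption is not enforced here.

```python
import math

def project_locations(points, a, b):
    """Location of each point's projection on the line through a and b,
    expressed as a fraction of the segment (0 at a, 1 at b)."""
    d = [bk - ak for ak, bk in zip(a, b)]
    dd = sum(x * x for x in d)
    return [sum((pk - ak) * dk for pk, ak, dk in zip(p, a, d)) / dd
            for p in points]

def split_at_largest_gap(points, a, b):
    """Assumed reading of 'location information': find the widest empty
    stretch between consecutive projections and split the points there."""
    t = project_locations(points, a, b)
    order = sorted(range(len(points)), key=lambda i: t[i])
    # Position in the sorted order where the gap to the next projection is widest.
    k = max(range(len(order) - 1),
            key=lambda k: t[order[k + 1]] - t[order[k]])
    lo, hi = t[order[k]], t[order[k + 1]]
    threshold = 0.5 * (lo + hi)
    labels = [0 if ti < threshold else 1 for ti in t]
    # Convert the gap from a fraction of the segment to a length.
    gap_length = (hi - lo) * math.dist(a, b)
    return labels, gap_length

# Two linearly separable groups, projected on a line perpendicular
# to the separating hyperplane (here, the x-axis).
points = [(0.0, 0.0), (0.2, 1.0), (0.1, -1.0),
          (3.0, 0.0), (3.2, 1.0), (2.9, -1.0)]
labels, gap_length = split_at_largest_gap(points, (0.0, 0.0), (3.0, 0.0))
print(labels, gap_length)
```

On a line that is not perpendicular to the separating hyperplane, the projections of the two groups can overlap and the widest gap need not separate them, which is the situation Proposition 1 rules out for margin-region segments.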
Multi-cluster Problem
• SI* doesn’t exist
• Location information of projected points (SI) alone is insufficient to detect margins
The Role of Distance of Projection
• Defn: Dmin of a line interval is the minimum distance of projection of points in that interval.
• Proposition 2: For line intervals in margin region, perpendicular to the separating hyperplane,
• Proposition 3: For line intervals inside a cluster of length more than Mm,
• Proposition 4: An interval with SI having no projected points with distance of projection less than Dmin*, can lie only outside a cluster; where
• No outlier assumption: Max margin between points within a cluster
[Figure: clusters CL1, CL2, CL3 with margins γ1, γ2, γ3]
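The definition of Dmin can be made concrete with a short sketch. It assumes the line is the segment between two given endpoints, split into equal-length intervals; the names `project` and `dmin_per_interval` and the parameter `n_intervals` are hypothetical and do not come from the slides. An interval that receives no projected points is reported as `None`.

```python
import math

def project(p, a, b):
    """Return (t, dist): the location of p's projection on the line through
    a and b (0 at a, 1 at b) and the perpendicular distance of projection."""
    d = [bk - ak for ak, bk in zip(a, b)]
    v = [pk - ak for pk, ak in zip(p, a)]
    dd = sum(x * x for x in d)
    t = sum(x * y for x, y in zip(v, d)) / dd
    foot = [ak + t * dk for ak, dk in zip(a, d)]
    dist = math.sqrt(sum((pk - fk) ** 2 for pk, fk in zip(p, foot)))
    return t, dist

def dmin_per_interval(points, a, b, n_intervals):
    """Minimum distance of projection among the points that project into
    each of n_intervals equal pieces of the segment from a to b."""
    dmin = [None] * n_intervals
    for p in points:
        t, dist = project(p, a, b)
        if not 0.0 <= t <= 1.0:
            continue  # projection falls outside the segment
        k = min(int(t * n_intervals), n_intervals - 1)
        if dmin[k] is None or dist < dmin[k]:
            dmin[k] = dist
    return dmin

points = [(0.5, 0.2), (0.6, -1.0), (3.5, 0.3), (3.4, 2.0)]
dmin = dmin_per_interval(points, (0.0, 0.0), (4.0, 0.0), 4)
print(dmin)
```

In this toy input the two middle intervals receive no projections at all, while the outer intervals have small Dmin values because points lie close to the line there.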
A Pair-wise Similarity Measure for Clustering
• f(xi, xj) = 1 iff xi = xj
• f(xi, xj) << 1 iff xi and xj are from different clusters, and Intij is perpendicular to their separating hyperplane
Max-margin Clustering Algorithm
• Draw lines between all pairs of points
• Estimate the probability of presence of margins between a pair of points xi and xj by computing f(xi, xj)
• Perform global clustering using f between all point-pairs
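The three steps above can be sketched end to end, with two loudly stated substitutions. The formula for f is not recoverable from this transcript, so `pair_similarity` below is a hypothetical stand-in: it looks at the widest empty stretch along the segment between xi and xj, counting only points whose distance of projection is at most `radius`, and maps that gap through a decaying exponential. The slides also do not name the global clustering step, so thresholded connected components are used here purely as a placeholder. The parameters `radius`, `sigma`, and `tau` are assumptions.

```python
import math

def pair_similarity(points, i, j, radius=1.0, sigma=0.5):
    """Hypothetical stand-in for f(xi, xj); NOT the formula from the slides.
    Equals 1 when xi == xj and shrinks as the widest empty stretch on the
    segment between xi and xj grows."""
    a, b = points[i], points[j]
    d = [bk - ak for ak, bk in zip(a, b)]
    dd = sum(x * x for x in d)
    if dd == 0.0:
        return 1.0  # same or coincident points
    ts = []
    for p in points:
        v = [pk - ak for pk, ak in zip(p, a)]
        t = sum(x * y for x, y in zip(v, d)) / dd
        # Perpendicular distance via Pythagoras; clamp tiny negative round-off.
        dist = math.sqrt(max(sum(x * x for x in v) - t * t * dd, 0.0))
        if 0.0 <= t <= 1.0 and dist <= radius:
            ts.append(t)  # xi (t=0) and xj (t=1) always qualify
    ts.sort()
    gap = max(t2 - t1 for t1, t2 in zip(ts, ts[1:])) * math.sqrt(dd)
    return math.exp(-gap / sigma)

def cluster(points, tau=0.2, **kwargs):
    """Placeholder global step: link pairs with similarity above tau and
    return connected-component labels (union-find)."""
    n = len(points)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):              # step 1: all pairs of points
        for j in range(i + 1, n):
            if pair_similarity(points, i, j, **kwargs) > tau:  # step 2
                parent[find(i)] = find(j)
    roots = {}                      # step 3: relabel components 0, 1, ...
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]

points = [(0.0, 0.0), (0.3, 0.1), (0.1, 0.4), (0.4, 0.4),
          (3.0, 0.0), (3.3, 0.2), (3.1, 0.4), (3.4, 0.5)]
labels = cluster(points)
print(labels)
```

Because every pair projects all points onto its own line, this sketch costs on the order of n³ operations for n points, so it is only suitable for small toy inputs like the one shown.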
Results
[Figures: 3D and 2D examples]
Summary
• Clustering: detecting margin regions
• Obtaining statistics of location and distance of projection of points that are specific to line segments in margin regions (Prop. 1 to 4)
• A pair-wise similarity measure to perform clustering, which avoids some optimization-related challenges prevalent in most existing methods