10 likes | 109 Views
Max-margin Clustering: Detecting Margins from Projections of Points on Lines. Raghuraman Gopalan 1 , and Jagan Sankaranarayanan 2 1 Center for Automation Research, University of Maryland, College Park, MD USA; 2 NEC Labs, Cupertino, CA USA . Multi-cluster Problem.
E N D
Max-margin Clustering: Detecting Margins from Projections of Points on Lines Raghuraman Gopalan1, and Jagan Sankaranarayanan2 1Center for Automation Research, University of Maryland, College Park, MD USA; 2NEC Labs, Cupertino, CA USA Multi-cluster Problem Max-margin Clustering Algorithm • Draw lines between all pairs of points • Estimate the probability of presence of margins between a pair of points xi and xj by computing f(xi,xj) • Perform global clustering using f between all point-pairs Location information of projected points (SI) alone is insufficient to detect margins Results The Role of Distance of Projection Proposition 2 For line intervals in margin region, perpendicular to the separating hyperplane Proposition 3 For line intervals inside a cluster of length more than Mm Proposition 4 An interval with SI having no projected points with distance of projection less than Dmin*, can lie only outside a cluster; where Problem Statement • Given an unlabelled set of points forming k clusters, find a grouping with maximum separating margin among the clusters • Prior work: (Mostly) Establish feedback between different label proposals, and run a supervised classifier on it • Goal: To understand the relation between data points and margin regions by analyzing projections of data on lines Defn: Dmin of a line interval is the minimum distance of projection of points in that interval. No outlier assumption: Max margin between points within a cluster Two-cluster Problem Summary • Assumptions • Linearly separable clusters • Kernel trick for non-linear case • No outliers in data (max margin exist only between clusters) • Enforce global cluster balance ClusteringDetecting margin regions • Obtaining statistics of location and distance of projection of points that are specific to line segments in margin regions (Prop. 1 to 4) • A pair-wise similarity measure to perform clustering, which avoids some optimization-related challenges prevalent in most existing methods A Pair-wise Similarity Measure for Clustering • Proposition 1 • SI* exists ONLY on line segments in margin region that are perpendicular to the separating hyperplane • Such line segments directly provide cluster groupings References • f(xi,xj)=1, iff xi=xj • f(xi,xj)<<1, iff xi and xj are from different clusters, and Intij is perpendicular to their separating hyperplane F. De la Torre, and T. Kanade, “Discriminative cluster analysis”, ICML, pp. 241-248, 2006. ([8] in table) K. Zhang, I.W. Tsang, and J.T. Kwok, “Maximum margin clustering made practical”, IEEE Trans. Neural Networks, 20(4), pp. 583-596, 2009. ([31] in table)