Spectral Clustering Jianping Fan, Dept. of Computer Science, UNC Charlotte
Lecture Outline • Motivation • Graph overview and construction • Spectral Clustering • Cool implementations
Spectral Clustering Example – 2 Spirals • Dataset exhibits complex cluster shapes • K-means performs very poorly in this space due to its bias toward dense spherical clusters • In the embedded space given by the two leading eigenvectors, the clusters are trivial to separate
Spectral Clustering Example [figure: Original Points vs. K-means (2 Clusters)] • Why does k-means fail for these two examples?
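This failure is easy to reproduce. Below is an illustrative sketch, not from the slides, that contrasts k-means with spectral clustering on a non-convex "two moons" dataset; the use of scikit-learn (make_moons, KMeans, SpectralClustering) and the parameter values are assumptions made only for demonstration. With these settings spectral clustering typically recovers the two arcs, while k-means splits each moon roughly in half.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.cluster import KMeans, SpectralClustering

# Non-convex dataset: two interleaving half-circles ("moons").
X, y_true = make_moons(n_samples=500, noise=0.05, random_state=0)

# K-means prefers compact, roughly spherical clusters and tends to split each moon.
km_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Spectral clustering works on a similarity graph, so it can follow the curved shapes.
sc_labels = SpectralClustering(n_clusters=2, affinity="rbf", gamma=20.0,
                               random_state=0).fit_predict(X)

print("k-means cluster sizes: ", np.bincount(km_labels))
print("spectral cluster sizes:", np.bincount(sc_labels))
```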
Lecture Outline • Motivation • Graph overview and construction • Spectral Clustering • Cool implementations
Graph-based Representation of Data Similarity
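As a concrete illustration of the graph construction, here is a minimal sketch that builds a weighted similarity graph from a point set. The Gaussian (RBF) kernel and the value of sigma are choices of this write-up, not mandated by the slides; any reasonable pairwise similarity measure works.

```python
import numpy as np

def similarity_graph(S, sigma=1.0):
    """Return the n x n matrix of pairwise similarities w(i, j)."""
    # Squared Euclidean distances between all pairs of points.
    sq_dists = np.sum((S[:, None, :] - S[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)            # no self-loops
    return W

S = np.random.rand(6, 2)                # six 2-D points
W = similarity_graph(S, sigma=0.5)
print(W.round(2))
```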
Lecture Outline • Motivation • Graph overview and construction • Spectral Clustering • Cool implementations
Normalized Cut • A graph $G(V, E)$ can be partitioned into two disjoint sets $A$, $B$ with $A \cup B = V$ and $A \cap B = \emptyset$ • The cut is defined as: $\mathrm{cut}(A, B) = \sum_{u \in A,\, v \in B} w(u, v)$ • An optimal bipartition of the graph $G$ is achieved by minimizing the cut: $\min_{A, B} \mathrm{cut}(A, B)$
Normalized Cut • Association between a partition set and the whole graph: $\mathrm{assoc}(A, V) = \sum_{u \in A,\, t \in V} w(u, t)$ • Normalized Cut: $\mathrm{Ncut}(A, B) = \dfrac{\mathrm{cut}(A, B)}{\mathrm{assoc}(A, V)} + \dfrac{\mathrm{cut}(A, B)}{\mathrm{assoc}(B, V)}$
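To make these quantities concrete, here is a small sketch that evaluates cut, assoc, and Ncut for a given bipartition of a toy similarity matrix; the function name and the example weights are illustrative choices, not from the slides.

```python
import numpy as np

def ncut_value(W, in_A):
    """Ncut(A, B) for the bipartition given by the boolean mask in_A."""
    in_B = ~in_A
    cut = W[np.ix_(in_A, in_B)].sum()    # total weight crossing the cut
    assoc_A = W[in_A, :].sum()           # assoc(A, V): edges from A to all nodes
    assoc_B = W[in_B, :].sum()           # assoc(B, V)
    return cut / assoc_A + cut / assoc_B

# Toy similarity matrix: nodes {0, 1} and {2, 3} form two tight groups.
W = np.array([[0.00, 0.90, 0.10, 0.05],
              [0.90, 0.00, 0.10, 0.05],
              [0.10, 0.10, 0.00, 0.80],
              [0.05, 0.05, 0.80, 0.00]])

print(ncut_value(W, np.array([True, True, False, False])))   # natural split: small Ncut
print(ncut_value(W, np.array([True, False, True, False])))   # bad split: large Ncut
```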
Normalized Cut • With indicator vector $y$, degrees $d(i) = \sum_j w(i, j)$, $D = \mathrm{diag}(d)$, and weight matrix $W$, the normalized cut becomes a Rayleigh quotient: $\min_y \mathrm{Ncut}(A, B) = \min_y \dfrac{y^T (D - W)\, y}{y^T D\, y}$ • The normalized cut can be solved by the generalized eigenvalue equation: $(D - W)\, y = \lambda D\, y$
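A minimal sketch of the relaxed solution, assuming SciPy's generalized symmetric eigensolver (scipy.linalg.eigh); thresholding the second-smallest generalized eigenvector at zero is the usual heuristic for turning the real-valued solution back into a bipartition.

```python
import numpy as np
from scipy.linalg import eigh

def normalized_cut_bipartition(W):
    """Two-way normalized cut via the relaxed problem (D - W) y = lambda * D y."""
    d = W.sum(axis=1)
    D = np.diag(d)
    eigvals, eigvecs = eigh(D - W, D)   # generalized symmetric eigenproblem, ascending
    y = eigvecs[:, 1]                   # second-smallest eigenvector
    return y >= 0                       # threshold at zero to recover a partition

W = np.array([[0.00, 0.90, 0.10, 0.05],
              [0.90, 0.00, 0.10, 0.05],
              [0.10, 0.10, 0.00, 0.80],
              [0.05, 0.05, 0.80, 0.00]])
print(normalized_cut_bipartition(W))    # e.g. [ True  True False False] (sign is arbitrary)
```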
K-way Min-Max Cut • Intra-cluster similarity: $s(A_k, A_k) = \sum_{u \in A_k} \sum_{v \in A_k} w(u, v)$ • Inter-cluster similarity: $s(A_k, \bar{A}_k) = \sum_{u \in A_k} \sum_{v \notin A_k} w(u, v)$ • Decision function for spectral clustering: minimize $J = \sum_{k=1}^{K} \dfrac{s(A_k, \bar{A}_k)}{s(A_k, A_k)}$
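The sketch below evaluates this K-way objective for a given labeling of a toy similarity matrix, so good and poor partitions can be compared directly; the function name and the example weights are illustrative.

```python
import numpy as np

def min_max_cut_objective(W, labels):
    """Sum over clusters of inter-cluster / intra-cluster similarity."""
    total = 0.0
    for k in np.unique(labels):
        in_k = labels == k
        intra = W[np.ix_(in_k, in_k)].sum()    # s(A_k, A_k)
        inter = W[np.ix_(in_k, ~in_k)].sum()   # s(A_k, A_k-bar)
        total += inter / intra
    return total

W = np.array([[0.00, 0.90, 0.10, 0.05],
              [0.90, 0.00, 0.10, 0.05],
              [0.10, 0.10, 0.00, 0.80],
              [0.05, 0.05, 0.80, 0.00]])

print(min_max_cut_objective(W, np.array([0, 0, 1, 1])))   # natural partition: small J
print(min_max_cut_objective(W, np.array([0, 1, 0, 1])))   # mixed partition: large J
```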
Mathematical Description of Spectral Clustering • Refined decision function for spectral clustering: with $x_k$ the binary indicator vector of cluster $A_k$, we have $s(A_k, A_k) = x_k^T W x_k$ and $s(A_k, \bar{A}_k) = x_k^T (D - W)\, x_k$, so $J = \sum_{k=1}^{K} \dfrac{x_k^T (D - W)\, x_k}{x_k^T W x_k}$ • We can further define the scaled indicators $\tilde{x}_k = D^{1/2} x_k \big/ \lVert D^{1/2} x_k \rVert$ and $\hat{W} = D^{-1/2} W D^{-1/2}$
Refined decision function for spectral clustering • In terms of the scaled indicators, minimizing $J$ is equivalent to maximizing $\sum_{k} \tilde{x}_k^T \hat{W} \tilde{x}_k$ • Relaxing the indicators to real-valued vectors, this decision function can be solved as an eigenvalue problem: the optimizers are the $K$ largest eigenvectors of $\hat{W} = D^{-1/2} W D^{-1/2}$, equivalently the generalized eigenvectors of $W x = \lambda D\, x$
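A small numerical sketch of this equivalence: the symmetric matrix D^{-1/2} W D^{-1/2} and the generalized problem W x = lambda D x share the same spectrum, so either form yields the spectral embedding. The toy weights are illustrative.

```python
import numpy as np
from scipy.linalg import eigh

W = np.array([[0.00, 0.90, 0.10, 0.05],
              [0.90, 0.00, 0.10, 0.05],
              [0.10, 0.10, 0.00, 0.80],
              [0.05, 0.05, 0.80, 0.00]])
d = W.sum(axis=1)
d_inv_sqrt = 1.0 / np.sqrt(d)
W_hat = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]   # D^{-1/2} W D^{-1/2}

vals_sym, _ = eigh(W_hat)             # symmetric form
vals_gen, _ = eigh(W, np.diag(d))     # generalized form  W x = lambda D x

print(np.allclose(vals_sym, vals_gen))   # True: identical spectra
```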
Spectral Clustering Algorithm (Ng, Jordan, and Weiss) • Motivation • Given a set of points $S = \{s_1, \dots, s_n\}$ in $\mathbb{R}^l$ • We would like to cluster them into $k$ subsets • (A code sketch of the complete procedure follows the final step below.)
Algorithm • Form the affinity matrix $A \in \mathbb{R}^{n \times n}$ • Define $A_{ij} = \exp\!\left(-\lVert s_i - s_j \rVert^2 / 2\sigma^2\right)$ if $i \neq j$, and $A_{ii} = 0$ • Scaling parameter $\sigma^2$ chosen by user • Define $D$ as a diagonal matrix whose $(i, i)$ element is the sum of $A$'s row $i$
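A minimal sketch of this step (the helper name affinity_and_degree is my own):

```python
import numpy as np

def affinity_and_degree(S, sigma):
    """Step 1: affinity matrix A (zero diagonal) and diagonal degree matrix D."""
    sq_dists = np.sum((S[:, None, :] - S[None, :, :]) ** 2, axis=-1)
    A = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)             # A_ii = 0 by definition
    D = np.diag(A.sum(axis=1))           # D_ii = sum of A's row i
    return A, D
```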
Algorithm • Form the matrix $L = D^{-1/2} A\, D^{-1/2}$ • Find $x_1, x_2, \dots, x_k$, the $k$ largest eigenvectors of $L$ • These form the columns of the new matrix $X \in \mathbb{R}^{n \times k}$ • Note: we have reduced the dimension from $n \times n$ to $n \times k$
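A minimal sketch of this step, assuming SciPy's symmetric eigensolver; spectral_embedding is a name introduced here:

```python
import numpy as np
from scipy.linalg import eigh

def spectral_embedding(A, D, k):
    """Step 2: L = D^{-1/2} A D^{-1/2}; return its k largest eigenvectors as columns."""
    d_inv_sqrt = 1.0 / np.sqrt(np.diag(D))
    L = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    eigvals, eigvecs = eigh(L)           # eigenvalues in ascending order
    return eigvecs[:, -k:]               # X, shape (n, k): n x n reduced to n x k
```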
Algorithm • Form the matrix $Y$ by renormalizing each of $X$'s rows to have unit length: $Y_{ij} = X_{ij} \big/ \left(\sum_j X_{ij}^2\right)^{1/2}$ • Treat each row of $Y$ as a point in $\mathbb{R}^k$ • Cluster them into $k$ clusters via K-means
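A minimal sketch of this step; using scikit-learn's KMeans is an assumption, any k-means implementation will do:

```python
import numpy as np
from sklearn.cluster import KMeans

def row_normalize(X):
    """Step 3: Y_ij = X_ij / (sum_j X_ij^2)^(1/2), i.e. unit-length rows."""
    return X / np.linalg.norm(X, axis=1, keepdims=True)

def cluster_rows(Y, k):
    """Step 4: k-means on the rows of Y, treated as points in R^k."""
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(Y)
```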
Algorithm • Final Cluster Assignment • Assign point $s_i$ to cluster $j$ if and only if row $i$ of $Y$ was assigned to cluster $j$
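Putting the steps together, a hedged end-to-end usage sketch: it reuses the hypothetical helpers introduced in the step sketches above (affinity_and_degree, spectral_embedding, row_normalize, cluster_rows), and the toy data and sigma value are arbitrary demonstration choices, not taken from the slides.

```python
import numpy as np
from sklearn.datasets import make_moons

S, _ = make_moons(n_samples=300, noise=0.05, random_state=0)
k, sigma = 2, 0.15                          # demonstration values only

A, D = affinity_and_degree(S, sigma)        # step 1: affinity and degree matrices
X = spectral_embedding(A, D, k)             # step 2: k largest eigenvectors of L
Y = row_normalize(X)                        # step 3: unit-length rows
labels = cluster_rows(Y, k)                 # step 4: k-means in the embedded space

# Final assignment: point s_i belongs to the cluster of row i of Y.
print(np.bincount(labels))
```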