K-means Clustering via Principal Component Analysis
Based on the paper by Chris Ding and Xiaofeng He, Int’l Conf. Machine Learning, Banff, Canada, 2004
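Traditional K-means Clustering
Minimizing the sum of squared errors: $J_K = \sum_{k=1}^{K} \sum_{i \in C_k} \| x_i - m_k \|^2$
where $X = (x_1, \dots, x_n)$ is the data matrix, $m_k = \frac{1}{n_k} \sum_{i \in C_k} x_i$ is the centroid of cluster $C_k$, and $n_k$ is the number of points in $C_k$.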
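The sum-of-squared-errors objective can be evaluated directly for any given cluster assignment. The following is a minimal sketch, assuming points are stored as columns of a NumPy array; the function name `kmeans_sse` and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np

def kmeans_sse(X, labels):
    """Sum of squared errors for data X (p x n, one point per column)."""
    total = 0.0
    for k in np.unique(labels):
        members = X[:, labels == k]                      # points in cluster C_k
        centroid = members.mean(axis=1, keepdims=True)   # centroid m_k
        total += float(((members - centroid) ** 2).sum())
    return total

# Hypothetical toy data: four 2-D points forming two tight pairs.
X = np.array([[0.0, 0.0, 10.0, 10.0],
              [0.0, 2.0, 0.0, 2.0]])
good = np.array([0, 0, 1, 1])   # groups the nearby points together
bad = np.array([0, 1, 0, 1])    # groups distant points together

sse_good = kmeans_sse(X, good)
sse_bad = kmeans_sse(X, bad)
print(sse_good, sse_bad)
```

The assignment that keeps nearby points together gives the smaller objective value, which is what K-means searches for.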
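Principal Component Analysis (PCA)
Centered data matrix: $Y = (y_1, \dots, y_n)$, with $y_i = x_i - \bar{x}$ and $\bar{x} = \frac{1}{n} \sum_{i} x_i$
Covariance matrix: $Y Y^T = \sum_{i} (x_i - \bar{x})(x_i - \bar{x})^T$ (the factor $1/n$ is ignored)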
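Centering and the unnormalized covariance matrix can be sketched in a few lines. This example assumes points are stored as columns and that the ignored factor is 1/n, so the scatter matrix should equal n times the biased covariance returned by `np.cov`; the random data is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=3.0, size=(4, 30))    # hypothetical data: p = 4, n = 30
n = X.shape[1]

x_bar = X.mean(axis=1, keepdims=True)    # mean of all points
Y = X - x_bar                            # centered data matrix

scatter = Y @ Y.T                        # covariance without the 1/n factor

# Each row of Y sums to zero after centering.
rows_centered = np.allclose(Y.sum(axis=1), 0.0)
# np.cov treats rows as variables by default; bias=True normalizes by n.
matches_cov = np.allclose(scatter, n * np.cov(X, bias=True))
print(rows_centered, matches_cov)
```

Dropping the constant factor rescales the eigenvalues but leaves the eigenvectors, and so the principal directions, unchanged.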
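PCA (continued)
Eigenvalues and eigenvectors: $Y Y^T u_k = \lambda_k u_k$, $Y^T Y v_k = \lambda_k v_k$, $v_k = Y^T u_k / \lambda_k^{1/2}$
Singular value decomposition (SVD): $Y = \sum_{k} \lambda_k^{1/2} u_k v_k^T$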
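The link between the two eigenproblems and the SVD can be checked numerically. The sketch below assumes a centered data matrix Y with points as columns, so the squared singular values are the shared eigenvalues of Y Yᵀ and Yᵀ Y; the random data is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 50))                 # hypothetical data: p = 3, n = 50
Y = X - X.mean(axis=1, keepdims=True)        # centered data matrix

# Thin SVD: Y = U diag(s) Vt, with s sorted in decreasing order.
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
lam = s ** 2                                 # eigenvalues lambda_k

# Columns of U are eigenvectors of Y Y^T (principal directions).
cov_ok = np.allclose(Y @ Y.T @ U, U * lam)
# Rows of Vt are eigenvectors of Y^T Y (principal components).
gram_ok = np.allclose(Y.T @ Y @ Vt.T, Vt.T * lam)
# v_k = Y^T u_k / sqrt(lambda_k)
link_ok = np.allclose(Vt.T, Y.T @ U / s)
# Y is recovered as the sum of sqrt(lambda_k) u_k v_k^T.
recon_ok = np.allclose(Y, (U * s) @ Vt)
print(cov_ok, gram_ok, link_ok, recon_ok)
```

Working from the SVD avoids forming the n × n matrix Yᵀ Y explicitly, which matters when the number of points is large.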
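K-means → PCA
Indicator vectors: $H_K = (h_1, \dots, h_K)$, where $h_k = (0, \dots, 0, 1, \dots, 1, 0, \dots, 0)^T / n_k^{1/2}$ has its nonzero entries at the $n_k$ points of cluster $C_k$
Criterion: $J_K = \mathrm{Tr}(X^T X) - \mathrm{Tr}(H_K^T X^T X H_K)$
Linear transform by a $K \times K$ orthonormal matrix $T$: $Q_K = (q_1, \dots, q_K) = H_K T$
Last column of $T$: $t_K = (\sqrt{n_1/n}, \dots, \sqrt{n_K/n})^T$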
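The indicator-vector form of the criterion can be compared with the direct sum of squared errors. This is a minimal sketch, assuming the trace form J_K = Tr(XᵀX) − Tr(HᵀXᵀXH) from the Ding and He paper, with points stored as columns; the helper `indicator_matrix` and the toy data are hypothetical, and every cluster is assumed to be non-empty.

```python
import numpy as np

def indicator_matrix(labels, K):
    """n x K matrix whose column k is 1/sqrt(n_k) on the members of cluster k."""
    H = np.zeros((labels.shape[0], K))
    for k in range(K):
        members = labels == k
        H[members, k] = 1.0 / np.sqrt(members.sum())   # assumes n_k > 0
    return H

# Hypothetical toy data: four 2-D points, two clusters.
X = np.array([[0.0, 0.0, 10.0, 10.0],
              [0.0, 2.0, 0.0, 2.0]])
labels = np.array([0, 0, 1, 1])
K, n = 2, X.shape[1]

H = indicator_matrix(labels, K)
G = X.T @ X                                   # Gram matrix of the points

j_trace = np.trace(G) - np.trace(H.T @ G @ H)

# Direct sum of squared errors, for comparison.
j_direct = sum(
    ((X[:, labels == k] - X[:, labels == k].mean(axis=1, keepdims=True)) ** 2).sum()
    for k in range(K)
)

# The indicator columns are orthonormal, and combining them with the
# weights sqrt(n_k / n) gives the constant vector e / sqrt(n).
counts = np.bincount(labels, minlength=K)
q_last = H @ np.sqrt(counts / n)
print(j_trace, j_direct, q_last)
```

Because the weighted combination of the indicator columns is always the constant vector, one of the K transformed indicators carries no clustering information, which is why only K − 1 free directions remain.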
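K-means → PCA (continued)
Therefore $q_K = \sqrt{n_1/n}\, h_1 + \dots + \sqrt{n_K/n}\, h_K = e / \sqrt{n}$, where the $h_k$ are the normalized cluster indicator vectors and $e$ is the vector of all ones
Criterion: $J_K = \mathrm{Tr}(Y^T Y) - \mathrm{Tr}(Q_{K-1}^T Y^T Y Q_{K-1})$, where $Y$ is the centered data matrix and $Q_{K-1} = (q_1, \dots, q_{K-1})$ holds the remaining transformed indicator vectors
Optimization becomes: $\max_{Q_{K-1}} \mathrm{Tr}(Q_{K-1}^T Y^T Y Q_{K-1})$
Solution (with the $q_k$ relaxed to continuous values) is the first $K-1$ principal components: $Q_{K-1} = (v_1, \dots, v_{K-1})$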
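For K = 2 the relaxed solution is the single principal component v₁, and one simple way to turn it back into a clustering is to split the points by the sign of their entries in v₁. The sketch below assumes that sign rule (it is not stated on the slide) and two hypothetical, well-separated Gaussian blobs. It also checks that Tr(YᵀY) minus the largest eigenvalue is a lower bound on the K-means objective, which follows from the maximization above.

```python
import numpy as np

def sse(X, labels):
    """K-means sum of squared errors for X (p x n, one point per column)."""
    total = 0.0
    for k in np.unique(labels):
        members = X[:, labels == k]
        total += float(((members - members.mean(axis=1, keepdims=True)) ** 2).sum())
    return total

rng = np.random.default_rng(1)
# Hypothetical data: two well-separated 2-D blobs of 20 points each.
A = rng.normal(loc=[-5.0, 0.0], scale=0.5, size=(20, 2))
B = rng.normal(loc=[5.0, 0.0], scale=0.5, size=(20, 2))
X = np.vstack([A, B]).T                      # shape (2, 40)
true_labels = np.repeat([0, 1], 20)

Y = X - X.mean(axis=1, keepdims=True)        # centered data
_, s, Vt = np.linalg.svd(Y, full_matrices=False)
v1 = Vt[0]                                   # first principal component

# Assumed discretization: cluster membership from the sign of v1.
pca_labels = (v1 > 0).astype(int)

# Cluster ids are arbitrary, so accept either labeling.
match = float(np.mean(pca_labels == true_labels))
agreement = max(match, 1.0 - match)

# Lower bound from the relaxation: Tr(Y^T Y) - lambda_1 <= J_2.
lower_bound = float((Y ** 2).sum() - s[0] ** 2)
j_pca = sse(X, pca_labels)
print(agreement, lower_bound, j_pca)
```

On data this cleanly separated the sign split recovers both blobs; on overlapping clusters it should be read as a starting point for K-means rather than a final answer.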
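PCA → K-means
Clustering by PCA: form the connectivity matrix $C = \sum_{k=1}^{K-1} v_k v_k^T$ from the first $K-1$ principal components $v_k$
Probability of connectivity between $i$ and $j$: $p_{ij} = c_{ij} / \sqrt{c_{ii}\, c_{jj}}$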
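Recovering clusters from the principal components can be sketched with a connectivity matrix. This example assumes the construction C = Σ vₖvₖᵀ over the first K − 1 principal components, normalized as p_ij = c_ij / √(c_ii c_jj), and assumes a cutoff of 0.5 to decide whether two points are connected. The data (three hypothetical blobs) and the cutoff are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical data: three well-separated 2-D blobs of 15 points each.
centers = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 8.66]])
labels = np.repeat(np.arange(3), 15)
X = (centers[labels] + rng.normal(scale=0.3, size=(45, 2))).T   # shape (2, 45)

K = 3
Y = X - X.mean(axis=1, keepdims=True)        # centered data
_, _, Vt = np.linalg.svd(Y, full_matrices=False)
V = Vt[:K - 1].T                             # first K-1 principal components, (45, 2)

C = V @ V.T                                  # connectivity matrix
d = np.sqrt(np.diag(C))
P = C / np.outer(d, d)                       # p_ij = c_ij / sqrt(c_ii * c_jj)

connected = P >= 0.5                         # assumed cutoff
same_cluster = labels[:, None] == labels[None, :]
recovered = bool(np.array_equal(connected, same_cluster))
print(recovered)
```

With equal-sized, well-separated clusters the normalized entries are close to 1 inside a cluster and clearly lower between clusters, so a fixed cutoff separates them; noisier data would need the cutoff chosen with more care.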
Eigenvalues
• Case 1: 164030, 58, 5
• Case 2: 212920, 1892, 157