Coffee Talk: Approximate Kernel k-means
KDD, pp. 895-903, ACM, 2011.

Outline
• Kernel k-means: not suitable for large data
• Two-step kernel k-means (baseline)
• Approximate kernel k-means (proposal)
Kernel k-means (remark)
Kernel k-means behaves like soft (fuzzy) k-means with Parzen density estimation used to obtain the object weights (the cluster-assignment posteriors).
Notation: n objects, d features, k clusters, l iterations.
• Fuzzy k-means: memory O(nd + nk), time O(ndkl)
• Kernel k-means: memory O(n²), time O(n²kl)
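The O(n²) memory and O(n²kl) time costs come from storing the full kernel matrix and recomputing every point-to-cluster distance from it each iteration. A minimal sketch (function name and the RBF test kernel are illustrative, not from the paper):

```python
import numpy as np

def kernel_kmeans(K, k, n_iter=20, seed=0):
    """Plain kernel k-means on a precomputed n x n kernel matrix K.

    Squared distance from point i to the mean of cluster c in feature space:
      ||phi(x_i) - mu_c||^2 = K_ii - (2/|c|) sum_{j in c} K_ij
                              + (1/|c|^2) sum_{j,j' in c} K_jj'
    Storing K costs O(n^2) memory; each sweep costs up to O(n^2 k) time.
    """
    n = K.shape[0]
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=n)
    for _ in range(n_iter):
        dist = np.empty((n, k))
        for c in range(k):
            idx = np.flatnonzero(labels == c)
            if idx.size == 0:            # re-seed an empty cluster
                idx = rng.integers(0, n, size=1)
            # K_ii is constant across clusters, so it can be dropped
            dist[:, c] = -2.0 * K[:, idx].mean(axis=1) + K[np.ix_(idx, idx)].mean()
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```

For large n, materializing K is exactly the bottleneck the two slides below attack.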
Two-step kernel k-means
• Dataset with n points in d dimensions
• Sample m points, with k << m << n
Step 1: cluster the m sampled points with kernel k-means.
Step 2: assign every one of the n points to the nearest cluster mean.
Stop (no further iterations).
Memory: O(mn); time: O(mnd + m²kl + mnk)
Approximate kernel k-means
• Dataset with n points in d dimensions
• Sample m points, with k << m << n
Step 1: compute a linear subspace of the kernel space spanned by the m sampled points.
Step 2: run k-means over all n points within this subspace (l iterations).
Memory: O(mn); time: O(mnd + m³ + m²n + mnkl), where the m³ term comes from inverting an m×m matrix
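The proposal constrains the cluster centers to the span of the m sampled points in kernel space. A minimal sketch of this idea uses the closely related Nyström feature map rather than the paper's exact update rule (an assumption on our part): embed all n points via the sampled kernel columns, then run plain k-means on the embedding. The m³ eigendecomposition plays the role of the m×m inverse from the slide; all names are illustrative:

```python
import numpy as np

def rbf(A, B, gamma=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def approx_kernel_kmeans(X, k, m, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    sample = rng.choice(n, size=m, replace=False)
    Knm = rbf(X, X[sample])              # n x m, O(mnd) to build
    Kmm = rbf(X[sample], X[sample])      # m x m
    # Nystrom embedding: eigendecompose Kmm (the O(m^3) term) and map all
    # n points into the subspace spanned by the m sampled points.
    w, V = np.linalg.eigh(Kmm)
    keep = w > 1e-6 * w.max()            # drop near-null directions for stability
    Phi = Knm @ V[:, keep] / np.sqrt(w[keep])   # n x r embedding, O(m^2 n)
    # Farthest-point initialization, then plain k-means on Phi: O(mnkl)
    centers = np.empty((k, Phi.shape[1]))
    centers[0] = Phi[0]
    for c in range(1, k):
        d2 = ((Phi[:, None, :] - centers[None, :c, :]) ** 2).sum(-1).min(axis=1)
        centers[c] = Phi[d2.argmax()]
    for _ in range(n_iter):
        d2 = ((Phi[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(axis=1)
        for c in range(k):
            pts = Phi[labels == c]
            if len(pts):
                centers[c] = pts.mean(axis=0)
    return labels
```

Unlike the two-step baseline, every one of the n points participates in all l k-means iterations, which is where the extra mnkl term (and the accuracy gain) comes from.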
[Figure: clustering error (MSE) reduction]
[Figure: Normalized Mutual Information w.r.t. the true class labels]