
Unsupervised Learning: K-means Optimization and Soft Clustering

Explore the K-means clustering optimization objective, random initialization, and how to determine the number of clusters, along with an introduction to hierarchical clustering and soft clustering (Fuzzy C-Means). Learn how to choose the right value of K and how to evaluate K-means against various optimization criteria. Dive into hierarchical clustering techniques and understand how soft clustering yields more nuanced cluster assignments. Discover practical applications and methods for optimizing cluster analyses in unsupervised learning.



Presentation Transcript


  1. Machine Learning Clustering • Unsupervised Learning • K-means • Optimization objective • Random initialization • Determining Number of Clusters • Hierarchical Clustering • Soft Clustering (Fuzzy C-Means)

  2. References • Nilsson, N. J. (1996). Introduction to machine learning. An early draft of a proposed textbook. (Chapter 9) • Marsland, S. (2014). Machine learning: an algorithmic perspective. CRC press. (Chapter 9) • Jang, J. S. R., Sun, C. T., & Mizutani, E. (1997). Neuro-fuzzy and soft computing: a computational approach to learning and machine intelligence (Chapter 15) (Fuzzy C-Means) • …

  3. Supervised learning • Training set: {(x^(1), y^(1)), (x^(2), y^(2)), …, (x^(m), y^(m))} => Classification: estimating the separating hyperplane

  4. Unsupervised learning • Training set: {x^(1), x^(2), …, x^(m)} (no labels) => Clustering

  5. Applications of Clustering • Social network analysis (e.g., giant-component analysis in networks) • Market segmentation • Astronomical data analysis (image credit: NASA/JPL-Caltech/E. Churchwell, Univ. of Wisconsin, Madison) • Organizing computing clusters

  6. K-means Algorithm • K: number of clusters • First step: randomly initialize the cluster centers

  7. K-means Algorithm • Second step: assign to each sample the index of its closest cluster center

  8. K-means Algorithm • Third step: move each cluster centroid to the average of the samples assigned to that cluster

  9. K-means Algorithm

  10. K-means Algorithm • Reassigning samples

  11. K-means Algorithm • Moving the centroids to the averages

  12. K-means Algorithm • Reassigning samples

  13. K-means Algorithm • Moving the centroids to the averages

  14. K-means Algorithm • Reassigning samples: no change, so the algorithm has converged!

  15. K-means algorithm • Input: • K (number of clusters) • Training set {x^(1), x^(2), …, x^(m)}, x^(i) ∈ R^n

  16. K-means algorithm
Randomly initialize K cluster centroids μ_1, μ_2, …, μ_K ∈ R^n
Repeat {
  for i = 1 to m:
    c^(i) := index (from 1 to K) of the cluster centroid closest to x^(i)   (cluster assignment step)
  for k = 1 to K:
    μ_k := average (mean) of the points assigned to cluster k   (move centroid step)
}
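A minimal NumPy sketch of this loop (the function and variable names are illustrative, not from the slides):

```python
import numpy as np

def kmeans(X, K, n_iters=100, seed=0):
    """Plain K-means: X is an (m, n) data matrix, K the number of clusters."""
    rng = np.random.default_rng(seed)
    m = X.shape[0]
    # Random initialization: pick K distinct training examples as centroids.
    centroids = X[rng.choice(m, size=K, replace=False)]
    for _ in range(n_iters):
        # Cluster assignment step: c[i] = index of the centroid closest to X[i].
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        c = dists.argmin(axis=1)
        # Move centroid step: each centroid becomes the mean of its points
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            X[c == k].mean(axis=0) if np.any(c == k) else centroids[k]
            for k in range(K)
        ])
        if np.allclose(new_centroids, centroids):  # converged: no change
            break
        centroids = new_centroids
    return c, centroids
```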

  17. Distance Metrics • Euclidean distance (L2 norm): d(x, y) = ||x − y||_2 = sqrt(Σ_j (x_j − y_j)^2) • L1 norm: d(x, y) = Σ_j |x_j − y_j| • Cosine similarity (correlation): cos(x, y) = (x · y) / (||x|| ||y||), transformed into a distance by subtracting from 1: d(x, y) = 1 − cos(x, y)
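The same three metrics in NumPy, as a quick reference (the helper names are mine, not the slide's):

```python
import numpy as np

def l2_distance(x, y):
    return np.sqrt(np.sum((x - y) ** 2))   # Euclidean (L2) distance

def l1_distance(x, y):
    return np.sum(np.abs(x - y))           # L1 (Manhattan) distance

def cosine_distance(x, y):
    cos_sim = x @ y / (np.linalg.norm(x) * np.linalg.norm(y))
    return 1.0 - cos_sim                   # distance = 1 - similarity
```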

  18. K-means for non-separated clusters • Example: T-shirt sizing, clustering customers by height and weight into size groups even though the data form no well-separated clusters

  19. Local optima • Depending on the random initialization, K-means (here K = 3; in general K < m) can converge to a poor local optimum of the cost function

  20. Random initialization to escape local optima
For i = 1 to 100 {
  Randomly initialize K-means.
  Run K-means; get c^(1), …, c^(m), μ_1, …, μ_K.
  Compute the cost function (distortion) J(c^(1), …, c^(m), μ_1, …, μ_K).
}
Pick the clustering that gave the lowest cost J
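A sketch of the multiple-restart loop, reusing the kmeans() function from the earlier sketch (names are illustrative):

```python
import numpy as np

def kmeans_restarts(X, K, n_restarts=100):
    """Run K-means from many random initializations; keep the lowest-cost run."""
    best_J, best_run = np.inf, None
    for seed in range(n_restarts):
        c, centroids = kmeans(X, K, seed=seed)  # kmeans() from the sketch above
        J = np.mean(np.sum((X - centroids[c]) ** 2, axis=1))  # distortion J
        if J < best_J:
            best_J, best_run = J, (c, centroids)
    return best_J, best_run
```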

  21. Optimality of clusters • Optimal clusters should • minimize distances within clusters • maximize distances between clusters • Fisher criterion: maximize the ratio of between-cluster scatter to within-cluster scatter (a sketch follows below)
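The Fisher-criterion formula on the original slide did not survive into this transcript; a common form divides between-cluster scatter by within-cluster scatter. A minimal NumPy sketch under that assumption (function name and exact scatter definitions are mine):

```python
import numpy as np

def fisher_ratio(X, c, centroids):
    """Scalar Fisher-style criterion: between-cluster scatter divided by
    within-cluster scatter; larger = tighter, better-separated clusters."""
    overall_mean = X.mean(axis=0)
    K = centroids.shape[0]
    within = sum(np.sum((X[c == k] - centroids[k]) ** 2) for k in range(K))
    between = sum((c == k).sum() * np.sum((centroids[k] - overall_mean) ** 2)
                  for k in range(K))
    return between / within
```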

  22. Content • Unsupervised Learning • K-means • Optimization objective • Random initialization • Determining Number of Clusters • Hierarchical Clustering • Soft Clustering (Fuzzy C-Means)

  23. What is the right value of K?

  24. Choosing the value of K

  25. Choosing the value of K Sometimes, you’re running K-means to get clusters to use for some later purpose. Evaluate K-means based on a metric for how well it performs for that later purpose. E.g., for the T-shirt sizing example above, choose the number of sizes that best serves the business purpose.
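When no downstream metric exists, one common heuristic (not spelled out in this transcript) is to sweep K, record the distortion J for each, and look for an "elbow" where adding clusters stops helping much. A sketch using the kmeans_restarts() helper above:

```python
# Given a data matrix X, record the best distortion J for each candidate K;
# plotting costs versus K, the "elbow" suggests a reasonable cluster count.
costs = []
for K in range(1, 11):
    J, _ = kmeans_restarts(X, K, n_restarts=20)
    costs.append(J)
```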

  26. K-means optimization objective • c^(i) = index of the cluster (1, 2, …, K) to which example x^(i) is currently assigned • μ_k = cluster centroid k (μ_k ∈ R^n) • μ_c^(i) = cluster centroid of the cluster to which example x^(i) has been assigned Optimization objective: J(c^(1), …, c^(m), μ_1, …, μ_K) = (1/m) Σ_{i=1}^{m} ||x^(i) − μ_c^(i)||^2, minimized over the c^(i) and μ_k
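The objective as a one-line NumPy helper matching the formula above (the name is illustrative):

```python
import numpy as np

def distortion(X, c, centroids):
    """J(c^(1..m), mu_1..K) = (1/m) * sum_i ||x^(i) - mu_{c^(i)}||^2."""
    return np.mean(np.sum((X - centroids[c]) ** 2, axis=1))
```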

  27. K-means optimization objective
Randomly initialize K cluster centroids μ_1, μ_2, …, μ_K ∈ R^n
Repeat {
  for i = 1 to m:
    c^(i) := index (from 1 to K) of the cluster centroid closest to x^(i)   (minimizes J over c^(1), …, c^(m) with the μ_k held fixed)
  for k = 1 to K:
    μ_k := average (mean) of the points assigned to cluster k   (minimizes J over μ_1, …, μ_K with the c^(i) held fixed)
}

  28. Content • Unsupervised Learning • K-means • Optimization objective • Random initialization • Determining Number of Clusters • Hierarchical Clustering • Soft Clustering (Fuzzy C-Means)

  29. Hierarchical clustering: example Clustering important cities in Iran for a business purpose

  30. Hierarchical clustering: example

  31. Hierarchical Clustering: Dendrogram

  32. Hierarchical clustering: forming clusters • Forming clusters from dendrograms: cut the dendrogram at a chosen height; each subtree below the cut becomes one cluster

  33. Hierarchical Clustering • Given the input set S, the goal is to produce a hierarchy (dendrogram) in which nodes represent subsets of S. • Features of the tree obtained: • The root is the whole input set S. • The leaves are the individual elements of S. • The internal nodes are defined as the union of their children. • Each level of the tree represents a partition of the input data into several (nested) clusters or groups.

  34. Hierarchical clustering • Input: a pairwise distance matrix involving all instances in S • Algorithm (see the sketch below) • Place each instance of S in its own cluster (singleton), creating the list of clusters L (initially, the leaves of T): L = S1, S2, S3, ..., Sn-1, Sn. • Compute a merging cost function between every pair of elements in L to find the two closest clusters {Si, Sj}, which are the cheapest pair to merge. • Remove Si and Sj from L. • Merge Si and Sj to create a new internal node Sij in T, which will be the parent of Si and Sj in the resulting tree. • Go to Step 2 until there is only one set remaining.
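A sketch of the same agglomerative procedure using SciPy (the toy data and parameter choices are illustrative):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))          # toy stand-in for the instance set S

# linkage() starts from singletons and repeatedly merges the two cheapest
# clusters, as in steps 1-5 above; 'single' linkage uses the minimum pairwise
# distance as the merging cost. The result Z encodes the dendrogram T.
Z = linkage(X, method='single', metric='euclidean')

# Form flat clusters by cutting the dendrogram into (at most) 3 clusters.
labels = fcluster(Z, t=3, criterion='maxclust')
```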

  35. Soft Clustering: Fuzzy C-Means • An extension of K-means • K-means and hierarchical clustering generate hard partitions: each data point can be assigned to only one cluster • Soft clustering gives probabilities that an instance belongs to each of a set of clusters • Fuzzy C-Means allows data points to be assigned to more than one cluster: each data point has a degree of membership (or probability) of belonging to each cluster • In MATLAB: the fcm command

  36. Soft Clustering: Fuzzy C-Means
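The Fuzzy C-Means formulas from this slide are not in the transcript; the sketch below implements the standard FCM update equations (fuzzifier m, inverse-distance membership update), the same iteration MATLAB's fcm command performs. All names are illustrative:

```python
import numpy as np

def fuzzy_c_means(X, C, m=2.0, n_iters=100, eps=1e-5, seed=0):
    """Standard FCM: returns an (n, C) membership matrix U and C cluster centers.
    The fuzzifier m > 1 controls softness; m -> 1 approaches hard K-means."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Initialize memberships randomly; each row of U sums to 1 across clusters.
    U = rng.random((n, C))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iters):
        Um = U ** m
        # Center update: weighted mean of the points, with weights u_ij^m.
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Membership update: u_ij proportional to d_ij^(-2/(m-1)), rows normalized.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U_new = 1.0 / (d ** (2.0 / (m - 1.0)))
        U_new /= U_new.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < eps:   # memberships stopped changing
            U = U_new
            break
        U = U_new
    return U, centers
```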
