Hierarchical Clustering

Presentation Transcript


  1. Hierarchical Clustering • Dr. Bernard Chen, Assistant Professor

  2. Outline • Hierarchical Clustering • Hybrid Hierarchical K-means Clustering • DBSCAN

  3. Hierarchical Clustering • Venn Diagram of Clustered Data • Dendrogram • From http://www.stat.unc.edu/postscript/papers/marron/Stat321FDA/RimaIzempresentation.ppt

  4. Nearest Neighbor, Level 2, k = 7 clusters. From http://www.stat.unc.edu/postscript/papers/marron/Stat321FDA/RimaIzempresentation.ppt

  5. Nearest Neighbor, Level 3, k = 6 clusters.

  6. Nearest Neighbor, Level 4, k = 5 clusters.

  7. Nearest Neighbor, Level 5, k = 4 clusters.

  8. Nearest Neighbor, Level 6, k = 3 clusters.

  9. Nearest Neighbor, Level 7, k = 2 clusters.

  10. Nearest Neighbor, Level 8, k = 1 cluster.
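
The nearest-neighbor (single-link) sequence on the preceding slides can be sketched as a short agglomerative loop: each level merges the two closest clusters, so the cluster count drops by one per level until a single cluster remains. This is a minimal sketch, not the code behind the slides; the eight 1-D points and the function names `single_link` and `agglomerate` are assumptions made for illustration.

```python
from itertools import combinations

def single_link(a, b):
    """Smallest pairwise distance between members of clusters a and b."""
    return min(abs(x - y) for x in a for y in b)

def agglomerate(points):
    """Merge the two nearest clusters at each level; return the clusters at every level."""
    clusters = [[p] for p in points]            # level 1: every point is its own cluster
    levels = [[list(c) for c in clusters]]
    while len(clusters) > 1:
        # pick the pair of clusters with the smallest single-link distance
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
        levels.append([list(c) for c in clusters])
    return levels

points = [1.0, 1.5, 3.0, 3.2, 7.0, 7.1, 10.0, 15.0]  # hypothetical data, not from the slides
levels = agglomerate(points)
for level, clusters in enumerate(levels, start=1):
    print(f"Level {level}, k = {len(clusters)}")
```

Cutting the dendrogram at a given level corresponds to picking one entry of `levels`.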

  11. Typical Alternatives to Calculate the Distance between Clusters • Single link: smallest distance between an element in one cluster and an element in the other, i.e., $\mathrm{dis}(K_i, K_j) = \min_{p,q} d(t_{ip}, t_{jq})$ • Complete link: largest distance between an element in one cluster and an element in the other, i.e., $\mathrm{dis}(K_i, K_j) = \max_{p,q} d(t_{ip}, t_{jq})$ • Average: average distance between an element in one cluster and an element in the other, i.e., $\mathrm{dis}(K_i, K_j) = \mathrm{avg}_{p,q}\, d(t_{ip}, t_{jq})$, where $t_{ip}$ is the $p$-th element of cluster $K_i$ and $d$ is the distance between two elements
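
The three linkage rules above differ only in how they reduce the set of pairwise element distances to one number. A minimal sketch, assuming Euclidean distance between elements and two small hypothetical clusters `ki` and `kj` (the function names are illustrative, not from the slides):

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

def pairwise(ki, kj):
    """All distances between an element of ki and an element of kj."""
    return [dist(p, q) for p in ki for q in kj]

def single_link(ki, kj):
    return min(pairwise(ki, kj))        # closest pair

def complete_link(ki, kj):
    return max(pairwise(ki, kj))        # farthest pair

def average_link(ki, kj):
    d = pairwise(ki, kj)
    return sum(d) / len(d)              # mean over all pairs

ki = [(0.0, 0.0), (0.0, 1.0)]           # hypothetical cluster K_i
kj = [(3.0, 0.0), (4.0, 0.0)]           # hypothetical cluster K_j
print(single_link(ki, kj), average_link(ki, kj), complete_link(ki, kj))
```

For any pair of clusters the single-link distance is never larger than the average, and the average is never larger than the complete-link distance.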

  12. Functionally significant gene clusters • Two-way clustering • Sample clusters • Gene clusters

  13. Outline • Hierarchical Clustering • Hybrid Hierarchical K-means Clustering • DBSCAN

  14. Motivation • Among clustering algorithms, hierarchical and K-means clustering are the two most popular classic methods. However, each has an inherent disadvantage. • K-means clustering requires the number of clusters to be specified in advance and chooses its initial centroids randomly; in other words, you don’t know how to start • With hierarchical clustering, it is hard to decide where to cut the dendrogram

  15. Hybrid Hierarchical K-means Clustering (HHK) Algorithm • The basic idea is to cluster roughly half of the data with hierarchical clustering and then continue with K-means for the remaining data • In order to generate super-rules, we let hierarchical clustering terminate when it generates the largest number of clusters
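
One possible reading of the HHK idea can be sketched as follows. The slides do not spell out the details, so several points here are assumptions: "largest number of clusters" is read as the moment when the count of multi-point clusters peaks during agglomeration, single link is used as the merge rule, the centroids of those clusters seed K-means, and K-means then runs over all points. The 1-D data and the names `hierarchical_seeds` and `kmeans` are hypothetical.

```python
from itertools import combinations

def mean(xs):
    return sum(xs) / len(xs)

def single_link(a, b):
    return min(abs(x - y) for x in a for y in b)

def hierarchical_seeds(points):
    """Agglomerate and keep the state where the number of multi-point
    clusters is largest (assumed stopping rule); return their centroids."""
    clusters = [[p] for p in points]
    best = []
    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
        multi = [c for c in clusters if len(c) > 1]
        if len(multi) > len(best):              # first state with the most multi-point clusters
            best = [list(c) for c in multi]
    return [mean(c) for c in best]

def kmeans(points, centroids, iterations=20):
    """Plain K-means, started from the given centroids instead of random ones."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda c: abs(p - centroids[c]))
            groups[nearest].append(p)
        new = [mean(g) if g else c for g, c in zip(groups, centroids)]
        if new == centroids:                    # assignments are stable, so stop
            break
        centroids = new
    return centroids, groups

points = [10, 14, 50, 52, 90, 95, 11, 58, 99]   # hypothetical data
seeds = hierarchical_seeds(points)              # number of clusters comes from the hierarchy
centroids, groups = kmeans(points, seeds)
print(sorted(sorted(g) for g in groups))
```

In this sketch the hierarchical phase supplies both the number of clusters and the starting centroids, which are the two things plain K-means leaves open.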

  16. Hybrid Hierarchical K-means Clustering (HHK) Algorithm

  17-23. Hybrid Hierarchical K-means Clustering (HHK) Algorithm Example
