
A Divide-and-Merge Methodology for Clustering

This study introduces a novel divide-and-merge technique combining top-down and bottom-up clustering methods to generate hierarchies and flat clusters effectively. The authors propose a spectral algorithm for the divide phase and dynamic programming for the merge phase. Experimental evaluations show promising results on real-world data.

Presentation Transcript


  1. A Divide-and-Merge Methodology for Clustering • Advisor: Dr. Hsu • Presenter: Hsin-Yi Huang • Authors: David Cheng, Ravi Kannan, Santosh Vempala, and Grant Wang • 2007.TODS.27

  2. Outline • Motivation • Objective • Methodology • Divide Phase • Merge Phase • Application • Experiment • Conclusion • Comments

  3. Motivation • Previous algorithms use either top-down or bottom-up methods to construct a hierarchical clustering. • Others produce a flat clustering using local search (e.g., k-means).

  4. Objective • The authors present a divide-and-merge methodology that combines top-down and bottom-up techniques to create both a hierarchy and a flat clustering. (Figure: divide, then merge)

  5. Divide Phase • For the divide phase, the authors suggest an efficient spectral algorithm. (Figure: divide phase)
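The slide does not spell out the spectral algorithm, so the following is only a minimal sketch of one common form of recursive spectral bisection, not the authors' implementation. It assumes a dense, symmetric similarity matrix with positive row sums; the names `conductance`, `spectral_bisect`, and `divide` are hypothetical. Each step sorts the items by the second eigenvector of the normalized similarity matrix and keeps the prefix cut with the smallest conductance.

```python
import numpy as np

def conductance(sim, mask):
    """Conductance of the cut (mask, ~mask) in the weighted graph `sim`."""
    cut = sim[np.ix_(mask, ~mask)].sum()
    denom = min(sim[mask].sum(), sim[~mask].sum())
    return cut / denom if denom > 0 else np.inf

def spectral_bisect(sim):
    """Split items 0..n-1 into two groups (assumes n >= 2, positive row sums)."""
    n = sim.shape[0]
    d_inv_sqrt = 1.0 / np.sqrt(sim.sum(axis=1))
    # Symmetrically normalized similarity matrix D^{-1/2} A D^{-1/2}.
    norm = sim * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(norm)      # eigenvalues in ascending order
    v2 = vecs[:, -2] * d_inv_sqrt       # second-largest eigenvector, rescaled
    order = np.argsort(v2)
    best, best_mask = np.inf, None
    for i in range(1, n):               # try every prefix cut of the ordering
        mask = np.zeros(n, dtype=bool)
        mask[order[:i]] = True
        c = conductance(sim, mask)
        if best_mask is None or c < best:
            best, best_mask = c, mask
    return np.flatnonzero(best_mask), np.flatnonzero(~best_mask)

def divide(sim, items=None):
    """Recursively bisect; returns a nested-tuple tree whose leaves are item indices."""
    items = np.arange(sim.shape[0]) if items is None else np.asarray(items)
    if len(items) == 1:
        return int(items[0])
    left, right = spectral_bisect(sim[np.ix_(items, items)])
    return (divide(sim, items[left]), divide(sim, items[right]))
```

Calling `divide(sim)` yields a complete binary tree over the items, which is the kind of hierarchy a merge phase can then cut into a flat clustering.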

  6. Merge Phase • Given an objective function g to maximize, the dynamic program finds the best tree-respecting clustering C_OPT-TREE in the tree. (Formula annotations: error; the number of clusters)
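As an illustration of how a dynamic program over the tree might work, here is a minimal sketch for the k-means objective, phrased as minimizing squared error rather than maximizing g. It assumes the tree is a nested tuple with integer leaves indexing rows of `points`; the function names are hypothetical. For each node and each cluster count i, it keeps the cheapest way to split the node's leaves into i clusters that are themselves subtrees.

```python
import numpy as np

def leaves(tree):
    """Leaf indices under a nested-tuple tree node."""
    if isinstance(tree, tuple):
        return leaves(tree[0]) + leaves(tree[1])
    return [tree]

def kmeans_cost(points, idx):
    """Sum of squared distances from the chosen points to their centroid."""
    block = points[idx]
    return float(((block - block.mean(axis=0)) ** 2).sum())

def best_tree_clustering(tree, points, k):
    """Return (cost, clusters) of the cheapest tree-respecting k-clustering."""
    memo = {}

    def opt(node, i):
        if (node, i) in memo:
            return memo[(node, i)]
        members = leaves(node)
        if i == 1:
            # Keep the whole subtree as one cluster.
            result = (kmeans_cost(points, members), [members])
        elif not isinstance(node, tuple) or i > len(members):
            # A leaf (or too few items) cannot be split into i clusters.
            result = (float("inf"), [])
        else:
            left, right = node
            result = (float("inf"), [])
            for j in range(1, i):   # j clusters on the left, i - j on the right
                cost_l, part_l = opt(left, j)
                cost_r, part_r = opt(right, i - j)
                if cost_l + cost_r < result[0]:
                    result = (cost_l + cost_r, part_l + part_r)
        memo[(node, i)] = result
        return result

    return opt(tree, k)
```

For example, with `points = np.array([[0.0], [0.1], [5.0], [5.1], [10.0]])` and `tree = (((0, 1), (2, 3)), 4)`, calling `best_tree_clustering(tree, points, 3)` selects the clusters `[0, 1]`, `[2, 3]`, and `[4]`.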

  7. Merge Phase (cont.) • K-means • Min-Diameter • Min-Sum • Correlation Clustering

  8. Application

  9. Application (cont.) • The authors implemented the methodology in a meta-search engine, available at http://eigencluster.csail.mit.edu • The divide phase: spectral algorithm • The merge phase: relaxed correlation clustering • Objective terms: the dissimilarity within a cluster; the amount of similarity the clustering fails to capture
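The two quantities named on this slide can be combined into a single cost, sketched below under stated assumptions: pairwise similarities lie in [0, 1], dissimilarity is taken as one minus similarity, and `alpha` and `beta` are hypothetical weights for the two terms. The exact form used in the search engine is not given in the transcript, so this is only an illustration.

```python
import numpy as np

def relaxed_correlation_cost(sim, clusters, alpha=1.0, beta=1.0):
    """Weighted sum of within-cluster dissimilarity and cross-cluster similarity.

    sim      : symmetric (n, n) array of similarities, assumed to lie in [0, 1]
    clusters : list of lists of item indices forming a partition of 0..n-1
    """
    n = sim.shape[0]
    label = np.empty(n, dtype=int)
    for c, members in enumerate(clusters):
        label[members] = c
    same = label[:, None] == label[None, :]
    upper = np.triu(np.ones((n, n), dtype=bool), k=1)  # count each pair once
    within = (1.0 - sim)[same & upper].sum()   # dissimilarity within clusters
    missed = sim[~same & upper].sum()          # similarity the clustering fails to capture
    return alpha * within + beta * missed
```

Putting everything in one cluster drives the second term to zero but inflates the first, while all-singleton clusters do the opposite, so the weights control how fine the resulting clustering is.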

  10. Experiment • F-Measure • Entropy • Accuracy • Confusion Matrix: the columns are the classes C_j; the rows are the clusters
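Two of these measures can be computed directly from a confusion matrix laid out as on the slide (rows are clusters, columns are classes). The sketch below uses commonly cited definitions of clustering entropy and F-measure; the slides do not give the exact formulas, so treat the details as assumptions. Accuracy is left out because it depends on how clusters are matched to classes.

```python
import numpy as np

def entropy(conf):
    """Size-weighted average entropy (in bits) of the class mix in each cluster."""
    conf = np.asarray(conf, dtype=float)
    n = conf.sum()
    total = 0.0
    for row in conf:
        size = row.sum()
        if size == 0:
            continue
        p = row[row > 0] / size
        total += (size / n) * -(p * np.log2(p)).sum()
    return total

def f_measure(conf):
    """For each class, take the best F1 over clusters, then weight by class size."""
    conf = np.asarray(conf, dtype=float)
    n = conf.sum()
    class_sizes = conf.sum(axis=0)
    cluster_sizes = conf.sum(axis=1)
    score = 0.0
    for j, class_size in enumerate(class_sizes):
        best = 0.0
        for i, cluster_size in enumerate(cluster_sizes):
            hit = conf[i, j]
            if hit == 0:
                continue
            precision = hit / cluster_size
            recall = hit / class_size
            best = max(best, 2 * precision * recall / (precision + recall))
        score += (class_size / n) * best
    return score
```

Lower entropy and higher F-measure both indicate clusters that line up more closely with the known classes.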

  11. Experiment (cont.) • Divide Phase • Reuters • Merge Phase

  12. Conclusion • The authors present a divide-and-merge methodology for clustering. • For the divide phase, an efficient and effective spectral algorithm. • For the merge phase, dynamic programming formulations that compute the optimal tree-respecting clustering for standard objective functions. • A thorough experimental evaluation of the methodology shows that the technique is effective on real-world data.

  13. Comments • Advantage • an interesting idea • Drawback • … • Application • clustering
