MGR: An information theory based hierarchical divisive clustering algorithm for categorical data

MGR: An information theory based hierarchical divisive clustering algorithm for categorical data. Presenter: Bei-YI Jiang. Authors: Hongwu Qin, Xiuqin Ma, Tutut Herawan, Jasni Mohamad Zain. 2014, KBS. Outline: Motivation, Objectives, Methodology, Experiments, Conclusions, Comments.


Presentation Transcript


  1. MGR: An information theory based hierarchical divisive clustering algorithm for categorical data • Presenter: Bei-YI Jiang • Authors: Hongwu Qin, Xiuqin Ma, Tutut Herawan, Jasni Mohamad Zain • 2014. KBS

  2. Outline • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments

  3. Motivation • Many algorithms for clustering categorical data have low clustering accuracy, while others have high computational complexity.

  4. Objectives • Propose a new hierarchical divisive clustering algorithm for categorical data, termed MGR, based on information theory. • Achieve better clustering performance and efficiency.

  5. Methodology

  6. Methodology • 1. Information system • 2. Mean gain ratio and entropy of cluster • 3. Algorithm • 4. Computational complexity

  7. Methodology • Information system

  8. Methodology • Mean gain ratio and entropy of cluster

  9. Methodology • Mean gain ratio and entropy of cluster

  10. Methodology • Mean gain ratio and entropy of cluster

  11. Methodology • Mean gain ratio and entropy of cluster
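
The transcript keeps only the slide titles here, so the following is a rough sketch rather than the authors' formulation. It assumes a C4.5-style gain ratio: the information gain about attribute a_j obtained from the partition induced by attribute a_i, divided by the entropy of a_i, then averaged over all other attributes to give a "mean gain ratio". The function names and the toy table are hypothetical and do not come from the presentation.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def conditional_entropy(target, given):
    """Entropy of `target` within each group of `given`, weighted by group size."""
    n = len(target)
    groups = {}
    for t, g in zip(target, given):
        groups.setdefault(g, []).append(t)
    return sum(len(ts) / n * entropy(ts) for ts in groups.values())


def gain_ratio(rows, i, j):
    """Assumed definition: information gain about column j from column i,
    normalised by the entropy of column i."""
    col_i = [r[i] for r in rows]
    col_j = [r[j] for r in rows]
    split = entropy(col_i)
    if split == 0:  # column i has a single value and induces no partition
        return 0.0
    return (entropy(col_j) - conditional_entropy(col_j, col_i)) / split


def mean_gain_ratio(rows, i):
    """Average of gain_ratio(rows, i, j) over every other column j (assumed)."""
    others = [j for j in range(len(rows[0])) if j != i]
    return sum(gain_ratio(rows, i, j) for j in others) / len(others)


# Hypothetical toy table: columns 0 and 1 are perfectly correlated,
# column 2 is independent of both.
rows = [("a", "x", "p"), ("a", "x", "q"), ("b", "y", "p"), ("b", "y", "q")]
scores = [mean_gain_ratio(rows, i) for i in range(3)]
best_attribute = max(range(3), key=lambda i: scores[i])
```

Under these assumptions, an attribute whose partition explains the other attributes well receives a high score, which is why a divisive algorithm could use it to choose the attribute to split on.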

  12. Methodology • Algorithm

  13. Methodology • Algorithm

  14. Methodology • Algorithm

  15. Methodology • Example

  16. Methodology • Comparisons with MMR

  17. Methodology • Comparisons with MMR

  18. Methodology • Comparisons with MMR

  19. Methodology • Comparisons with MMR

  20. Methodology • Comparisons with MMR

  21. Experiments • manually label • randomly select 100 English articles from Wikipedia • labeled 3072 concepts that belong to 29044 categories (7780 relevant categories)

  22. Experiments

  23. Experiments

  24. Experiments

  25. Experiments

  26. Experiments

  27. Experiments

  28. Experiments

  29. Conclusions • MGR has better clustering accuracy and stability. • MGR has better clustering efficiency and scalability.

  30. Comments • Advantages: better clustering accuracy and stability; no need to specify the number of clusters • Applications: categorical data; clustering