MGR: An information theory based hierarchical divisive clustering algorithm for categorical data. Presenter: Bei-YI Jiang. Authors: Hongwu Qin, Xiuqin Ma, Tutut Herawan, Jasni Mohamad Zain. 2014. KBS. Outlines: Motivation, Objectives, Methodology, Experiments, Conclusions, Comments.
MGR: An information theory based hierarchical divisive clustering algorithm for categorical data Presenter: Bei-YI Jiang Authors: Hongwu Qin, Xiuqin Ma, Tutut Herawan, Jasni Mohamad Zain 2014. KBS
Outlines • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments
Motivation • Many algorithms for clustering categorical data have low clustering accuracy, while others have high computational complexity.
Objectives • Propose a new hierarchical divisive clustering algorithm for categorical data, termed MGR, based on information theory. • Achieve better clustering performance and efficiency.
Methodology 1. Information system 2. Mean gain ratio and entropy of cluster 3. Algorithm 4. Computational complexity
Methodology • Information system
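A minimal sketch of what an information system for categorical data can look like, assuming the usual definition S = (U, A, V, f): a set of objects U, a set of attributes A, the value domains V, and a lookup f(x, a) giving the value of object x on attribute a. The toy rows and the helper names `value_domain` and `partition` are hypothetical and serve only as illustration.

```python
from collections import defaultdict

# Hypothetical toy data: each row is an object in U, each key an attribute in A.
U = [
    {"color": "red", "shape": "round", "size": "small"},
    {"color": "red", "shape": "round", "size": "large"},
    {"color": "blue", "shape": "square", "size": "small"},
    {"color": "blue", "shape": "round", "size": "small"},
]
A = ["color", "shape", "size"]


def value_domain(objects, a):
    """V_a: the set of values that attribute a takes over the objects."""
    return {x[a] for x in objects}


def partition(objects, a):
    """U/a: group object indices into classes that share a value on a."""
    classes = defaultdict(list)
    for i, x in enumerate(objects):
        classes[x[a]].append(i)
    return dict(classes)
```

Each attribute induces one such partition of U; the information-theoretic measures on the following slides are computed from the sizes of these classes.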
Methodology • Mean gain ratio and entropy of cluster
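The sketch below shows one common way to compute a mean gain ratio for a categorical attribute. It assumes the standard information-theoretic definitions: entropy of an attribute, conditional entropy given a second attribute, gain ratio as information gain divided by entropy, and the mean taken over all other attributes. Whether this matches the deck's formulas term by term is an assumption, and all function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(col):
    """Shannon entropy (base 2) of the value distribution in one column."""
    n = len(col)
    return -sum((c / n) * log2(c / n) for c in Counter(col).values())


def conditional_entropy(col_i, col_j):
    """Entropy of attribute i within each value group of attribute j,
    weighted by the relative size of the group."""
    n = len(col_i)
    groups = {}
    for vi, vj in zip(col_i, col_j):
        groups.setdefault(vj, []).append(vi)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def gain_ratio(col_i, col_j):
    """Information gain of attribute i with respect to j, normalized by
    the entropy of i. Defined as 0 here when i is constant (assumption)."""
    e = entropy(col_i)
    if e == 0:
        return 0.0
    return (e - conditional_entropy(col_i, col_j)) / e


def mean_gain_ratio(columns, i):
    """Average gain ratio of attribute i against every other attribute.
    `columns` maps attribute name -> list of values (at least two attributes)."""
    others = [j for j in columns if j != i]
    return sum(gain_ratio(columns[i], columns[j]) for j in others) / len(others)


# Hypothetical toy columns: "a" and "b" determine each other, "c" is unrelated.
columns = {
    "a": ["x", "x", "y", "y"],
    "b": ["p", "p", "q", "q"],
    "c": ["u", "v", "u", "v"],
}
```

On the toy columns, `a` is fully explained by `b` and not at all by `c`, so its mean gain ratio lies halfway between the two extremes, while `c` scores zero.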
Methodology • Algorithm
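A hedged sketch of a hierarchical divisive loop in the spirit of MGR, not a reproduction of the authors' algorithm. It assumes that each iteration (1) picks the attribute with the highest mean gain ratio, (2) splits the current objects by that attribute, (3) outputs the equivalence class with the lowest entropy as a cluster, and (4) repeats on the remaining objects. The cluster entropy used here (sum of per-attribute entropies) and the stopping rule are assumptions, and `mgr_cluster` is a hypothetical name.

```python
from collections import Counter
from math import log2


def entropy(values):
    n = len(values)
    return -sum(c / n * log2(c / n) for c in Counter(values).values())


def mean_gain_ratio(rows, attrs, a):
    """Average gain ratio of attribute a against every other attribute."""
    col = [r[a] for r in rows]
    e = entropy(col)
    others = [b for b in attrs if b != a]
    if e == 0 or not others:
        return 0.0
    total = 0.0
    for b in others:
        groups = {}
        for r in rows:
            groups.setdefault(r[b], []).append(r[a])
        cond = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
        total += (e - cond) / e
    return total / len(others)


def cluster_entropy(rows, attrs):
    # Assumption: a cluster's entropy is the sum of its per-attribute entropies.
    return sum(entropy([r[a] for r in rows]) for a in attrs)


def mgr_cluster(rows, attrs, k=None):
    """Split off one cluster per iteration. With k=None (assumed stopping
    rule) the loop runs until the remaining objects agree on every attribute."""
    remaining = list(range(len(rows)))
    clusters = []
    while remaining and (k is None or len(clusters) < k - 1):
        sub = [rows[i] for i in remaining]
        # Only attributes that still take more than one value can split the data.
        candidates = [a for a in attrs if len({r[a] for r in sub}) > 1]
        if not candidates:
            break
        best = max(candidates, key=lambda a: mean_gain_ratio(sub, attrs, a))
        classes = {}
        for i in remaining:
            classes.setdefault(rows[i][best], []).append(i)
        # Output the purest equivalence class as the next cluster.
        out = min(classes.values(),
                  key=lambda idx: cluster_entropy([rows[i] for i in idx], attrs))
        clusters.append(out)
        taken = set(out)
        remaining = [i for i in remaining if i not in taken]
    if remaining:
        clusters.append(remaining)
    return clusters


# Hypothetical toy data set.
rows = [
    {"a": "x", "b": "p", "c": "u"},
    {"a": "x", "b": "p", "c": "v"},
    {"a": "y", "b": "q", "c": "u"},
    {"a": "y", "b": "q", "c": "v"},
]
attrs = ["a", "b", "c"]
```

Calling `mgr_cluster(rows, attrs, k=2)` separates objects 0 and 1 from objects 2 and 3, because attributes `a` and `b` explain each other while `c` does not. Passing `k` is optional in this sketch, which mirrors the option of running without a preset number of clusters.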
Methodology • Example
Methodology • Comparisons with MMR
Experiments • Manually labeled • Randomly selected 100 English articles from Wikipedia • Labeled 3072 concepts that belong to 29044 categories (7780 relevant categories)
Conclusions • MGR has better clustering accuracy and stability. • MGR has better clustering efficiency and scalability.
Comments • Advantages • Better clustering accuracy and stability • No need to specify the number of clusters • Applications • Categorical data • Clustering