TreeSOM: Cluster analysis in the self-organizing map Advisor: Dr. Hsu Presenter: Zih-Hui Lin Authors: Elena V. Samsonova, Joost N. Kok, Ad P. IJzerman Neural Networks 19 (2006) 935–949
Outline • Motivation • Objective • Introduction • Method • Conclusions • Case studies
Motivation • Different map initializations and different input orders of the data elements may result in different clusterings. • Comparing many maps by hand to find a reliable clustering is a lengthy and laborious task, so far not automated. • For large data sets manual analysis becomes intractable, forcing the user to select a single “good” SOM and accept it as the final result, omitting tests of confidence.
Objective • We present a new method for cluster analysis and for finding a reliable clustering.
Method • 100 SOMs with different random seeds were produced (protein data) • Map size: 5 × 4 • Phase 1: starting learning rate 0.2, starting radius 6, 1,000 iterations • Phase 2: starting learning rate 0.02, starting radius 3, 100,000 iterations • Clustering: cluster discovery, calibration • Consensus trees: SOM as a tree, clustering confidence, the most representative SOM
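The two-phase training schedule above can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `train_som`, the rectangular grid, the Gaussian neighbourhood, the linear decay of learning rate and radius, and the random initialization inside the data's bounding box are all choices made here for the example. Only the map size, the starting learning rates, the starting radii and the iteration counts come from the slide.

```python
import math
import random

def train_som(data, rows=5, cols=4,
              phases=((0.2, 6.0, 1000), (0.02, 3.0, 100_000)), seed=0):
    """Train a small SOM; returns one weight vector per map cell.

    phases: (starting learning rate, starting radius, iterations) per phase.
    """
    rng = random.Random(seed)  # the seed controls both init and input order
    dim = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    # Assumption: initialise weights uniformly inside the data bounding box.
    lo = [min(x[d] for x in data) for d in range(dim)]
    hi = [max(x[d] for x in data) for d in range(dim)]
    weights = [[rng.uniform(lo[d], hi[d]) for d in range(dim)] for _ in grid]
    for lr0, radius0, n_iter in phases:
        for t in range(n_iter):
            frac = 1.0 - t / n_iter            # assumption: linear decay
            lr = lr0 * frac
            radius = max(radius0 * frac, 0.5)  # keep the radius positive
            x = rng.choice(data)
            # Best-matching unit: the cell whose weights are closest to x.
            bmu = min(range(len(grid)),
                      key=lambda i: sum((w - v) ** 2
                                        for w, v in zip(weights[i], x)))
            br, bc = grid[bmu]
            for i, (r, c) in enumerate(grid):
                # Assumption: Gaussian neighbourhood on the map grid.
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2)
                             / (2.0 * radius ** 2))
                weights[i] = [w + lr * h * (v - w)
                              for w, v in zip(weights[i], x)]
    return weights
```

Producing the set of maps then amounts to something like `maps = [train_som(data, seed=s) for s in range(100)]`, where each seed gives a different initialization and input order.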
Clustering 2.1. Cluster discovery 2.2. Calibration
Consensus tree 3.1. SOM as a tree The best clustering
Consensus tree 3.2. Clustering confidence 1. All the nodes in all the trees in the set are evaluated in terms of their leaf sets, and equivalent nodes are grouped. 2. The number of nodes in each equivalence group divided by the total number of trees gives the confidence value of each node in the group. 3. 4. The distance between two sets P and Q equals the average distance between each element of P and each element of Q.
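The grouping and averaging steps above can be sketched as follows. The sketch rests on several assumptions that are not in the slides: trees are represented as nested tuples with string leaves, two nodes count as equivalent when their leaf sets are identical, and the confidence denominator is the number of trees (so a node found in every tree scores 1.0). The function names are hypothetical.

```python
from collections import Counter

def leaf_set(node):
    """Set of leaf labels under a node (a leaf is any non-tuple value)."""
    if isinstance(node, tuple):
        return frozenset().union(*(leaf_set(child) for child in node))
    return frozenset([node])

def internal_leaf_sets(tree):
    """Leaf sets of every internal node of the tree, root included."""
    sets = []
    def visit(node):
        if isinstance(node, tuple):
            sets.append(leaf_set(node))
            for child in node:
                visit(child)
    visit(tree)
    return sets

def node_confidence(trees):
    """Map each leaf set to the fraction of trees that contain such a node."""
    # set(...) counts a leaf set at most once per tree.
    counts = Counter(s for tree in trees for s in set(internal_leaf_sets(tree)))
    return {s: n / len(trees) for s, n in counts.items()}

def average_set_distance(P, Q, dist):
    """Average of dist(p, q) over all pairs with p in P and q in Q."""
    return sum(dist(p, q) for p in P for q in Q) / (len(P) * len(Q))
```

For example, with `trees = [(("a", "b"), ("c", "d")), (("a", "b"), ("c", "d")), (("a", "c"), ("b", "d"))]`, the cluster {a, b} appears in two of the three trees, so `node_confidence(trees)[frozenset("ab")]` is 2/3, while the root {a, b, c, d} appears in all of them.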
Consensus tree 3.3. The most representative SOM
Conclusions • In this paper we present a new look at self-organizing maps, improving their applicability to clustering problems and facilitating comparisons of clustering results with those of hierarchical classifiers.
Introduction • Protein data • GPCRs • Consensus tree vs. phylogenetic trees
My opinion • Advantage • We can find the most confident clusters easily. • Drawback • …. • Application • …..