Improving the Quality of Self-Organizing Maps by Self-Intersection Avoidance • Presenter: Bei-YI Jiang • Authors: Guenael Cabanes, Younes Bennani, Dominique Fresneau • 2012, Elsevier
Outline • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments
Motivation • The exponential growth of data produces very large databases, often terabytes in size. • The growing number of data dimensions and data objects poses major challenges for data analysis and data exploration methods and tools.
Objectives • Develop a method for describing data through enriched and segmented prototypes, using a topological clustering algorithm. • Visualize the data via maps and graphs, to support a comprehensive exploration of the data structure.
Methodology-learning data structure • Prototype enrichment • Input: the distance matrix Dist(w, x) between the M prototypes w and the N data points x. • Output: the density D_i and the local variability s_i associated with each prototype w_i, and the neighborhood value v_{i,j} associated with each pair of prototypes w_i and w_j.
Methodology-learning data structure • Principle • Density modes: a measure of the data density surrounding the prototype (local density). • Local variability: the average distance between a prototype and the data it represents. • Neighborhood: a measure of a prototype's neighborhood.
Methodology-learning data structure • Algorithm
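The algorithm itself did not survive on this slide. As a rough illustration only, the three enrichment quantities could be computed from the distance matrix as sketched below. The Gaussian kernel, the bandwidth parameter `sigma`, the use of the closest prototype for local variability, and the count of "closest and second-closest" pairs for the neighborhood value are all assumptions of this sketch, and `enrich_prototypes` is a hypothetical name, not taken from the slides.

```python
import math


def enrich_prototypes(dist, sigma=1.0):
    """Sketch of prototype enrichment.

    dist[i][k] is the distance between prototype i and data point k.
    Returns (density, variability, neighborhood).
    """
    m = len(dist)      # number of prototypes
    n = len(dist[0])   # number of data points
    norm = sigma * math.sqrt(2.0 * math.pi)

    # Local density (assumed form): mean Gaussian kernel response over all data.
    density = [
        sum(math.exp(-(d * d) / (2.0 * sigma * sigma)) / norm for d in row) / n
        for row in dist
    ]

    sums = [0.0] * m
    counts = [0] * m
    neighborhood = [[0] * m for _ in range(m)]
    for k in range(n):
        # Prototypes ordered from closest to farthest for data point k.
        order = sorted(range(m), key=lambda i: dist[i][k])
        first = order[0]
        sums[first] += dist[first][k]
        counts[first] += 1
        if m > 1:
            # Assumed neighborhood value: number of data points whose two
            # closest prototypes are i and j.
            second = order[1]
            neighborhood[first][second] += 1
            neighborhood[second][first] += 1

    # Local variability: mean distance to the data a prototype represents
    # (0.0 when a prototype represents no data point).
    variability = [sums[i] / counts[i] if counts[i] else 0.0 for i in range(m)]
    return density, variability, neighborhood


if __name__ == "__main__":
    # Three prototypes, four data points (made-up distances).
    example = [
        [0.1, 0.2, 2.0, 2.1],
        [1.0, 0.9, 1.0, 1.1],
        [2.0, 2.1, 0.1, 0.3],
    ]
    for name, values in zip(("D", "s", "v"), enrich_prototypes(example, sigma=0.5)):
        print(name, values)
```

The value of `sigma` controls how local the density estimate is; the slides do not say how it is chosen.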
Methodology-learning data structure • Clustering of prototypes • Input: density values D_i and neighborhood values v_{i,j}. • Output: the clusters of prototypes.
Methodology-learning data structure • Algorithm
Methodology-learning data structure • Algorithm
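The algorithm content is missing from these slides. One plausible reading of the stated inputs and output is a density-ascent clustering on the neighborhood graph: each prototype climbs to its densest neighbor until it reaches a local density maximum, and prototypes sharing a maximum form one cluster. The sketch below illustrates only that idea. The link threshold `min_links` and the function name `cluster_prototypes` are hypothetical, and any later merging of weakly separated clusters is left out.

```python
def cluster_prototypes(density, neighborhood, min_links=1):
    """Sketch: label each prototype by the density mode it climbs to.

    density[i] is the density of prototype i; neighborhood[i][j] is the
    neighborhood value of the pair (i, j). Returns one cluster id per prototype.
    """
    m = len(density)

    # Assumption: two prototypes are neighbors when enough data points link them.
    neighbors = [
        [j for j in range(m) if j != i and neighborhood[i][j] >= min_links]
        for i in range(m)
    ]

    def climb(i):
        # Move to the densest neighbor while it is strictly denser than the
        # current prototype; stop at a local density maximum.
        while True:
            best = max(neighbors[i], key=lambda j: density[j], default=i)
            if density[best] <= density[i]:
                return i
            i = best

    modes = [climb(i) for i in range(m)]

    # Relabel the density modes as consecutive cluster ids.
    ids = {}
    return [ids.setdefault(mode, len(ids)) for mode in modes]


if __name__ == "__main__":
    # Five prototypes on a chain 0-1-2-3-4 with two density peaks (made-up values).
    d = [0.9, 0.5, 0.2, 0.6, 1.0]
    v = [[0] * 5 for _ in range(5)]
    for a, b in [(0, 1), (1, 2), (2, 3), (3, 4)]:
        v[a][b] = v[b][a] = 3
    print(cluster_prototypes(d, v))
```

Because the number of clusters equals the number of density modes found, it does not have to be supplied in advance, which matches the property claimed for the method.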
Methodology-learning data structure • The algorithm has several useful properties • The number of clusters is detected automatically. • Non-linearly separable and non-hyperspherical clusters can be detected. • The algorithm can deal with noise (i.e. touching clusters) by using density estimation.
Methodology-learning data structure • Modeling data distributions • Density function
Methodology-learning data structure • Algorithm
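The density function and its estimation algorithm are not shown on these slides. A common way to model a data distribution from prototypes is a mixture of spherical Gaussian kernels centered on the prototypes; the sketch below assumes that form. The choice of per-prototype bandwidths and mixture weights, and the name `mixture_density`, are assumptions of this sketch rather than the authors' definition.

```python
import math


def mixture_density(x, prototypes, bandwidths, weights):
    """Sketch: evaluate a Gaussian mixture density at point x.

    prototypes: list of prototype vectors (kernel centers).
    bandwidths: one strictly positive kernel width per prototype.
    weights:    one non-negative mixture weight per prototype
                (normalized here so that they sum to 1).
    """
    total = sum(weights)
    dim = len(x)
    value = 0.0
    for w, h, a in zip(prototypes, bandwidths, weights):
        # Squared Euclidean distance between x and the prototype.
        sq = sum((xi - wi) ** 2 for xi, wi in zip(x, w))
        # Normalization constant of a spherical Gaussian in `dim` dimensions.
        norm = (math.sqrt(2.0 * math.pi) * h) ** dim
        value += (a / total) * math.exp(-sq / (2.0 * h * h)) / norm
    return value


if __name__ == "__main__":
    protos = [[0.0, 0.0], [4.0, 4.0]]
    print(mixture_density([0.1, 0.0], protos, [0.5, 0.5], [1.0, 1.0]))
    print(mixture_density([2.0, 2.0], protos, [0.5, 0.5], [1.0, 1.0]))
```

In such a model the bandwidths could, for instance, come from each prototype's local variability and the weights from its density, but a prototype with zero variability would need a positive fallback width.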
Conclusions • The paper proposes a new method for modeling data structure, based on the learning of prototypes. • It proposes a new co-clustering algorithm that addresses different kinds of problems. The results are easy to read and understand, and are compatible with biologists' knowledge. • It provides a visualization method that enhances the data structure within and between groups.
Comments • Advantages: resolves some clustering problems; the results are easy to read and understand; enhances the data structure. • Applications: analysis and visualization of biological experimental data.