Improving the Quality of Self-Organizing Maps by Self-Intersection Avoidance

Presenter: Bei-YI Jiang • Authors: Guenael Cabanes, Younes Bennani, Dominique Fresneau • 2012, Elsevier

Outline • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments

Motivation • The exponential growth of data produces very large databases, on the order of terabytes. • The growing number of data dimensions and data objects presents tremendous challenges for effective data analysis and data exploration methods and tools.

Objectives • Develop a method of describing data from enriched and segmented prototypes using a topological clustering algorithm. • Provide data visualizations via maps and graphs that support a comprehensive exploration of the data structure.

Methodology

Methodology-learning data structure • Prototype enrichment • Input: the distance matrix Dist(w, x) between the M prototypes w and the N data x. • Output: the density D_i and the local variability s_i associated with each prototype w_i; the neighborhood values v_{i,j} associated with each pair of prototypes w_i and w_j.

Methodology-learning data structure • Principle • Density modes: a measure of the data density surrounding the prototype (local density). • Local variability: the average distance between a prototype and the data it represents. • The neighborhood: a prototype's neighborhood measure.
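The three quantities above can be illustrated with a minimal Python sketch. The slides do not give the formulas, so everything here is an assumption made for illustration: the function name `enrich_prototypes`, the Gaussian kernel with bandwidth `sigma` used for the density, and the choice to count, for each pair of prototypes, the data whose two nearest prototypes are that pair. It should be read as one plausible reading of the slide, not as the authors' exact method.

```python
import numpy as np

def enrich_prototypes(dist, sigma=1.0):
    """Hypothetical helper: dist is an (M, N) array with
    dist[i, k] = distance between prototype i and datum k (M >= 2)."""
    dist = np.asarray(dist, dtype=float)
    M, N = dist.shape

    # Local density D_i (assumption: mean Gaussian kernel value, bandwidth sigma).
    kernel = np.exp(-dist ** 2 / (2.0 * sigma ** 2))
    density = kernel.sum(axis=1) / (N * sigma * np.sqrt(2.0 * np.pi))

    # For every datum, indices of its nearest and second-nearest prototypes.
    order = np.argsort(dist, axis=0)
    first, second = order[0], order[1]

    # Local variability s_i: mean distance between prototype i and the data
    # it represents (here: the data for which i is the nearest prototype).
    variability = np.zeros(M)
    for i in range(M):
        represented = first == i
        if represented.any():
            variability[i] = dist[i, represented].mean()

    # Neighborhood v_{i,j} (assumption: number of data whose two nearest
    # prototypes are i and j, in either order).
    neighborhood = np.zeros((M, M), dtype=int)
    np.add.at(neighborhood, (first, second), 1)
    neighborhood = neighborhood + neighborhood.T

    return density, variability, neighborhood
```

A call such as `enrich_prototypes(dist, sigma=0.5)` returns the three arrays in the order density, local variability, neighborhood. A prototype that represents no data keeps a local variability of 0 in this sketch.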

Methodology-learning data structure • Algorithm

Methodology-learning data structure • Clustering of prototypes • Input: density values D_i; neighborhood values v_{i,j}. • Output: the clusters of prototypes.
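As a rough illustration of how clusters of prototypes could be derived from these two inputs, the sketch below uses a simple density-ascent rule: each prototype is attached to its densest neighbor (a neighbor being any prototype with a positive neighborhood value), and prototypes that climb to the same local density maximum share a cluster. The function name `cluster_prototypes` is hypothetical, and this simplified rule is a stand-in chosen for illustration; the slides do not spell out the actual procedure, and any further refinement of the clusters is left out.

```python
import numpy as np

def cluster_prototypes(density, neighborhood):
    """Hypothetical helper: density is a length-M array of D_i,
    neighborhood is an (M, M) array of v_{i,j}. Returns labels 0..K-1."""
    density = np.asarray(density, dtype=float)
    neighborhood = np.asarray(neighborhood)
    M = len(density)

    # Each prototype points to its densest neighbor if that neighbor is
    # strictly denser; otherwise it points to itself (a local density mode).
    parent = np.arange(M)
    for i in range(M):
        neighbors = np.flatnonzero(neighborhood[i] > 0)
        if neighbors.size:
            best = neighbors[np.argmax(density[neighbors])]
            if density[best] > density[i]:
                parent[i] = best

    # Follow the pointers uphill until a mode is reached. This terminates
    # because density strictly increases along every pointer.
    modes = np.empty(M, dtype=int)
    for i in range(M):
        j = i
        while parent[j] != j:
            j = parent[j]
        modes[i] = j

    # Relabel the modes as consecutive cluster ids.
    _, labels = np.unique(modes, return_inverse=True)
    return labels
```

In this sketch the number of clusters is not a parameter: it equals the number of local density maxima found on the neighborhood graph, and prototypes with no shared neighborhood can never end up in the same cluster.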

Methodology-learning data structure

Methodology-learning data structure • The algorithm has some interesting qualities • The number of clusters is detected automatically by the algorithm. • Clusters that are not linearly separable or not hyper-spherical can be detected. • The algorithm can deal with noise (i.e. touching clusters) by using density estimation.

Methodology-learning data structure • Modeling data distributions • Density function

Methodology-A new two-level coclustering algorithm

Experiments

Conclusions • Propose a new data structure modeling method, based on the learning of prototypes. • Propose a new coclustering algorithm to solve different kinds of problems. The results are easy to read and understand, and are fully compatible with biologists' knowledge. • Propose a visualization method able to enhance the data structure within and between groups.

Comments • Advantages • Resolve some clustering problems • Obtained results are easy to read and understand • Enhance the data structure • Applications - Analyze and visualize biological experimental
