Speech and Image Processing Unit, Department of Computer Science, University of Joensuu, FINLAND Self-organizing map Clustering Methods: Part 9 Pasi Fränti
SOM main principles • Self-organizing map (SOM) is a clustering method especially suitable for visualization. • Clustering is represented by centroids organized in a 1-d or 2-d network. • Dimensionality reduction and visualization are achieved as a side product. • Clustering is performed by the competitive learning principle.
Self-organizing map: Initial configuration • M nodes, one for each cluster • Nodes connected by network topology (1-d or 2-d) • Initial locations not important
Self-organizing map: Final configuration • Node locations adapt during the learning stage • Network keeps neighbor vectors close to each other • Network limits the movement of vectors during learning
SOM pseudo code (1/2): Learning stage
SOM pseudo code (2/2): Update centroids
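The pseudocode bodies did not survive extraction. A minimal Python sketch of the whole procedure, assuming a 1-d chain topology, linear decay of both the learning rate and the neighborhood radius, and weights that halve per step of chain distance (function and parameter names are mine, not from the slides):

```python
import numpy as np

def som_1d(data, M, T, A=0.5, d_max=3, rng=None):
    """Minimal 1-d SOM sketch: M centroids on a chain topology,
    trained by competitive learning for T passes over the data."""
    rng = np.random.default_rng(rng)
    # Initial locations are not important: pick M random data vectors.
    centroids = data[rng.choice(len(data), M, replace=False)].astype(float)
    for t in range(T):
        alpha = A * (1 - t / T)                  # learning rate decreases with time
        d = max(0, round(d_max * (1 - t / T)))   # neighborhood radius shrinks too
        for x in data:                           # each vector processed once per pass
            # Find the nearest centroid (the "winner").
            j = int(np.argmin(((centroids - x) ** 2).sum(axis=1)))
            # Move the winner and its chain neighbors towards the vector;
            # the effect weakens with topological distance from the winner.
            for i in range(max(0, j - d), min(M, j + d + 1)):
                w = 2.0 ** (-abs(i - j))
                centroids[i] += alpha * w * (x - centroids[i])
    return centroids
```

The outer loop is the learning stage; the innermost loop is the centroid update restricted to the winner's topological neighborhood.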
Competitive learning • Each data vector is processed once. • Find the nearest centroid: j = arg min_i ‖x − c_i‖ • The centroid is updated by moving it towards the data vector: c_j ← c_j + α(t)·(x − c_j) • The learning stage is similar to k-means, but the centroid update follows a different principle.
Learning rate (α) • Decreases with time: movement is large in the beginning but eventually stabilizes. • Linear decrease of weighting: α(t) = A·(1 − t/T) • Exponential decrease of weighting: α(t) = A·r^t, with 0 < r < 1
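The two decay schedules can be sketched as small helper functions. The exponential variant below is parameterized by a final learning rate rather than a raw ratio r; that parameterization (and the default values) is my assumption, not from the slides:

```python
def alpha_linear(t, T, A=0.5):
    """Linear decrease: falls from A at t = 0 to 0 at t = T."""
    return A * (1 - t / T)

def alpha_exponential(t, T, A=0.5, alpha_final=0.01):
    """Exponential decrease: falls from A at t = 0 to alpha_final at t = T.
    Equivalent to A * r**t with r = (alpha_final / A) ** (1 / T)."""
    return A * (alpha_final / A) ** (t / T)
```

Both start at the maximum learning rate A; the exponential schedule drops faster early on, so most of the large movements happen in the first iterations.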
Neighborhood (d) • Neighboring centroids (within topological distance d of the winner j) are also updated: c_i ← c_i + α(t)·w(i, j)·(x − c_i) • The effect is stronger for nearby centroids: the weight w(i, j) decreases with the network distance between i and j.
Weighting of the neighborhood • The weighting decreases exponentially with the topological distance from the winner.
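A sketch of such a weighting on a 1-d chain topology; the specific base 2 (weight halves per step of chain distance) is my assumption, the slides only state that the decrease is exponential:

```python
def neighbor_weight(i, j, d):
    """Weight applied to centroid i when centroid j wins, on a 1-d chain.
    Exponentially decaying with topological distance; zero outside radius d."""
    dist = abs(i - j)
    return 2.0 ** (-dist) if dist <= d else 0.0
```

The winner itself gets weight 1, its immediate neighbors 0.5, and so on; centroids farther than d steps away are not moved at all.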
Parameter setup • Number of iterations T • Convergence of SOM is rather slow → should be set as high as possible • Roughly 100–1000 iterations at minimum. • Size of the initial neighborhood Dmax • Small enough to allow local adaptation. • Value D = 0 indicates no neighbor structure. • Maximum learning rate A • Higher values have a mostly random effect. • Most critical are the final stages (D ≤ 2). • Optimal choices of A and Dmax are highly correlated.
Difficulty of parameter setup • Fixing the total number of iterations (T·Dmax) to 20, 40 and 80. • Finding the optimal parameter combination is non-trivial.
Adaptive number of iterations • To reduce the effect of the parameter set-up, T should be as high as possible. • This gives enough time to adapt, at the cost of high time complexity. • Adaptive number of iterations: roughly double the iteration count each time the neighborhood size decreases by one, so most iterations are spent on the critical final stages. • For Dmax = 10 and Tmax = 100: Ti = {1, 1, 1, 1, 2, 3, 6, 13, 25, 50, 100}
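A sketch of this schedule: halving Tmax once per step of neighborhood size, rounding half up, and never going below one iteration. The exact rounding rule is my assumption, chosen because it reproduces the sequence on the slide:

```python
def adaptive_iterations(d_max, t_max):
    """Iteration counts per neighborhood size: element i is the number of
    iterations run while the neighborhood size is d_max - i, so the count
    roughly doubles each time the neighborhood shrinks by one."""
    return [max(1, int(t_max / 2 ** (d_max - i) + 0.5))  # round half up, min 1
            for i in range(d_max + 1)]
```

For d_max = 10 and t_max = 100 this yields the slide's sequence, front-loading only a handful of iterations on the large neighborhoods.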
Example of SOM (1-d) • One cluster missing • One cluster too many
Example of SOM (2-d) (to appear sometime in future)
Literature • T. Kohonen, Self-Organization and Associative Memory. Springer-Verlag, New York, 1988. • N.M. Nasrabadi and Y. Feng, "Vector quantization of images based upon the Kohonen self-organization feature maps", Neural Networks, 1 (1), 518, 1988. • P. Fränti, "On the usefulness of self-organizing maps for the clustering problem in vector quantization", 11th Scandinavian Conf. on Image Analysis (SCIA’99), Kangerlussuaq, Greenland, vol. 1, 415-422, 1999.