Local versus Global Interactions in Clustering Algorithms

Wesam M. Ashour. Computer Engineering Department. 21/03/2010.

Presentation Transcript


  1. Computer Engineering Department 21/03/2010 Local versus Global Interactions in Clustering Algorithms Wesam M. Ashour

  2. Outline
     • Clustering?
       - K-means Clustering Algorithm
     • New algorithms
       - Weighted K-means (WKM)
       - Inverse Weighted K-means (IWKM)
     • Topology-Preserving mappings
       - Generative Topographic Mapping (GTM)
       - Inverse-Weighted K-means Topology-Preserving Map (IKToM)

  3. Clustering?
     • Cluster: a collection of data objects
       - Objects are similar to objects in the same cluster
       - Objects are dissimilar to objects in other clusters
     • Cluster analysis
       - Finding groups of objects such that the objects in a group are similar (or related) to one another and different from (or unrelated to) the objects in other groups
     • Clustering is unsupervised learning: no predefined classes

  4. Clustering?
     • Partitioning Algorithms
     • Hierarchical Algorithms
     • Density based Algorithms
     • Grid based Algorithms
     • Graph based Algorithms
     • Model based Algorithms

  5. Clustering?
     • Pattern Recognition
     • Compression
     • Web documents
     • Biology
     • Marketing

  6. Background: K-means
     The algorithm tries to locate K prototypes throughout a data set in such a way that the K prototypes in some way best represent the data.
     Disadvantages
     • The number of clusters must be specified in advance
     • Sensitivity to prototype initialization
     • Dead prototypes
     • Convergence to a local optimum
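
The K-means procedure and the dead-prototype problem can be sketched in a few lines. This is a minimal batch K-means in plain Python, not code from the presentation; the function names, the toy data, and the starting positions are all assumptions made for illustration.

```python
import math

def closest(point, prototypes):
    """Return the index of the prototype nearest to point (Euclidean distance)."""
    return min(range(len(prototypes)), key=lambda k: math.dist(point, prototypes[k]))

def kmeans(data, prototypes, iterations=100):
    """Batch K-means: alternate assignment and mean update until nothing moves."""
    prototypes = [list(m) for m in prototypes]
    for _ in range(iterations):
        # Assignment step: each point goes to its closest prototype only.
        groups = [[] for _ in prototypes]
        for x in data:
            groups[closest(x, prototypes)].append(x)
        # Update step: each prototype moves to the mean of the points it won.
        updated = []
        for m, group in zip(prototypes, groups):
            if group:
                updated.append([sum(x[d] for x in group) / len(group)
                                for d in range(len(m))])
            else:
                updated.append(m)  # dead prototype: wins no points, never moves
        if updated == prototypes:
            break
        prototypes = updated
    return prototypes

# Toy data (assumed): two well-separated pairs of points.
data = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = kmeans(data, [(1.0, 0.5), (9.0, 0.5)])       # sensible initialization
dead = kmeans(data, [(5.0, 0.5), (100.0, 100.0)])   # second prototype starts far away
```

With the sensible start, each prototype settles on one pair. With the second start, the far prototype never wins a point, so it stays where it began and the other prototype sits between the two groups, which illustrates both the sensitivity to initialization and the dead-prototype problem listed on the slide.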

  7. Weighted K-Means (WKM)
     • The performance function for K-means may be written as (1)
     • Optimization
     [Figure: data points x1, x2, x3 and prototypes m1, m2, m3]
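
The equation on this slide did not survive in the transcript. For orientation, the standard textbook form of the K-means performance function is given below; the symbols N (number of data points), K (number of prototypes) and J_K are assumed notation, and the slide's own Eq. (1) may be written differently.

```latex
J_{K} = \sum_{i=1}^{N} \min_{k \in \{1,\dots,K\}} \lVert x_i - m_k \rVert^{2}
```

Because of the minimum, each data point x_i contributes only through its closest prototype, so the interaction between points and prototypes is purely local.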

  8. Weighted K-Means (cont.)
     • Consider the following performance function: (2)
     • Optimization
     [Figure: data points x1, x2, x3 and prototypes m1, m2, m3]

  9. Weighted K-Means (cont.)
     • We wish to form a performance function with the following properties:
       - Minimum performance gives good clustering
       - Creates a relationship between all data points and all prototypes
     (3)
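
The second property, a relationship between all data points and all prototypes, can be illustrated numerically. The Python sketch below compares the standard K-means objective with `global_performance`, a hypothetical objective chosen only because it has that property. It is an assumption made for illustration and is not necessarily the slide's Eq. (3).

```python
import math

def kmeans_performance(data, prototypes):
    """Standard K-means objective: each point sees only its closest prototype."""
    return sum(min(math.dist(x, m) ** 2 for m in prototypes) for x in data)

def global_performance(data, prototypes):
    """Hypothetical objective (assumption): the closest-prototype term is
    weighted by the sum of distances to ALL prototypes, so every prototype
    influences the contribution of every data point."""
    total = 0.0
    for x in data:
        distances = [math.dist(x, m) for m in prototypes]
        total += sum(distances) * min(distances) ** 2
    return total

# Toy data (assumed). Only the distant second prototype differs between setups.
data = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
near = [(1.0, 0.0), (50.0, 0.0)]
far = [(1.0, 0.0), (60.0, 0.0)]

local_change = kmeans_performance(data, far) - kmeans_performance(data, near)
global_change = global_performance(data, far) - global_performance(data, near)
```

Moving the distant prototype leaves the K-means objective unchanged, since that prototype is never the closest one, whereas the hypothetical global objective responds to the move. That response is what lets a prototype that wins no points still receive an update signal.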

  10. Weighted K-Means (cont.)
      • Batch mode: all data points are presented together

  11. Weighted K-Means (cont.)
      • Optimization: generate two sets of updates
      • Let mr be the closest prototype to xi; then, in batch mode, the updates are given by (4) and (5), where Vk is the set of indices of data points that are closest to mk and Vj is the set of indices of the other points
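
The index sets Vk and Vj can be computed directly from their definitions. The Python sketch below does only that; the update equations (4) and (5) themselves are not preserved in the transcript and are not reproduced. The function name `split_indices` and the toy data are assumptions.

```python
import math

def split_indices(data, prototypes, k):
    """Return (Vk, Vj): indices of the points whose closest prototype is mk,
    and indices of all the other points."""
    v_k, v_j = [], []
    for i, x in enumerate(data):
        # r is the index of the prototype closest to x_i (mr on the slide).
        r = min(range(len(prototypes)), key=lambda j: math.dist(x, prototypes[j]))
        (v_k if r == k else v_j).append(i)
    return v_k, v_j

# Toy one-dimensional data (assumed) with two prototypes.
data = [(0.0,), (1.0,), (9.0,), (10.0,)]
prototypes = [(0.5,), (9.5,)]
v_k, v_j = split_indices(data, prototypes, 0)
```

For the first prototype this puts points 0 and 1 in Vk and points 2 and 3 in Vj. In standard K-means only Vk contributes to the update of mk; having a second set of updates means the points in Vj can contribute as well.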

  12. Weighted K-Means (cont.)
      • Problem which needs to be solved! (7)

  13. Inverse-Weighted K-Means (IWKM)
      • Performance function: (10)
      • Optimization
        - Batch mode
        - Find the partial derivative of the performance function with respect to mk, set it to zero, and then solve for mk: (11)
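
The same recipe (differentiate, set to zero, solve for mk) can be shown on the standard K-means objective, where the result is well known. This worked example is not the IWKM derivation, whose equations (10) and (11) are not preserved in the transcript; J_K, N and the use of Vk as the set of indices of points closest to mk are assumed notation.

```latex
J_{K} = \sum_{i=1}^{N} \min_{k} \lVert x_i - m_k \rVert^{2}
      = \sum_{k=1}^{K} \sum_{i \in V_k} \lVert x_i - m_k \rVert^{2}

\frac{\partial J_{K}}{\partial m_k}
      = -2 \sum_{i \in V_k} (x_i - m_k) = 0
\quad\Longrightarrow\quad
m_k = \frac{1}{\lvert V_k \rvert} \sum_{i \in V_k} x_i
```

For K-means the solution is the mean of the points assigned to mk, which is the familiar batch update. If Vk is empty the derivative is identically zero and mk receives no update, which is the dead-prototype case.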

  14. Simulation
      [Figures: Example 1 and Example 2, each comparing K-means and IWKM]

  15. Simulation
      [Figure: Example 3, comparing K-means and IWKM]

  16. Simulation
      [Figures: Example 4 and Example 5; panel labels: IWKM, IWKM, KHMO]

  17. Inverse-weighted K-means Topology-Preserving Map (IKToM)
      • Has the same structure as GTM
      • K latent points in a latent space with some structure
      • Mapped through M basis functions to feature space
      • Then mapped to data space to K points using weights W, mk=ΦkW
      • Use IWKM to find mk
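
The mapping from latent points to prototypes, mk=ΦkW, can be sketched as follows. This Python example assumes a one-dimensional latent space and Gaussian basis functions with a common width; those choices, the function names, and all numeric values are assumptions, since the slide does not specify them. Only the forward mapping is shown, not the step that fits W.

```python
import math

def basis_row(latent_point, centres, width):
    """Evaluate M Gaussian basis functions at one latent point (one row of Phi)."""
    return [math.exp(-((latent_point - c) ** 2) / (2.0 * width ** 2))
            for c in centres]

def prototypes_from_latent(latent_points, centres, width, weights):
    """Map K latent points to K prototypes in data space: mk = Phi_k W."""
    dims = len(weights[0])
    prototypes = []
    for t in latent_points:
        phi_k = basis_row(t, centres, width)
        # Row vector (1 x M) times weight matrix (M x D) gives one prototype.
        prototypes.append([sum(phi_k[j] * weights[j][d] for j in range(len(centres)))
                           for d in range(dims)])
    return prototypes

latent_points = [0.0, 0.5, 1.0]      # K = 3 points on a 1-D latent line
centres = [0.0, 1.0]                 # M = 2 basis-function centres
weights = [[1.0, 0.0], [0.0, 1.0]]   # W is M x D, here D = 2
prototypes = prototypes_from_latent(latent_points, centres, 1.0, weights)
```

Because neighbouring latent points produce similar basis-function responses, they map to nearby prototypes in data space, which is what gives the map its topology-preserving character. Moving the prototypes then amounts to changing W rather than each mk independently.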

  18. Simulation
      • Example 1: Genes data set (40 samples, 3036 dimensions, 3 types)
      • Example 2: Algae data set (72 samples, 18 dimensions, 9 types)
      • Example 3: Glass data set (218 samples, 10 dimensions, 6 types)

  19. Conclusion
      • WKM and IWKM
        - Solve the problem of sensitivity to initial conditions in K-means
        - Provide two sets of updates
        - Work well on high-dimensional data sets
        - Can be extended for visualization
      • Visualization
        - Extension of IWKM
        - Has the same structure as GTM

  20. Thank you. Any questions, please?
