
Pattern Recognition: Statistical and Neural

Nanjing University of Science & Technology. Pattern Recognition: Statistical and Neural. Lonnie C. Ludeman Lecture 27 Nov 9, 2005. Lecture 27 Topics. K-Means Clustering Algorithm Details K-Means Step by Step Example ISODATA Algorithm -Overview


Presentation Transcript


  1. Nanjing University of Science & Technology. Pattern Recognition: Statistical and Neural. Lonnie C. Ludeman. Lecture 27, Nov 9, 2005.

  2. Lecture 27 Topics: 1. K-Means Clustering Algorithm Details; 2. K-Means Step-by-Step Example; 3. ISODATA Algorithm Overview; 4. Agglomerative Hierarchical Clustering Algorithm Description.

  3. K-Means Clustering Algorithm: Basic Procedure. Randomly select K cluster centers from the pattern space. Assign each pattern to its nearest cluster center (minimum distance). Compute a new cluster center for each cluster. Repeat this process until the cluster centers do not change.

  4. Flow Diagram for K-Means Algorithm

  5. Step 1: Initialization. Choose K initial cluster centers M1(1), M2(1), ... , MK(1). Method 1: the first K samples. Method 2: K data samples selected at random. Method 3: K random vectors. Set m = 1 and go to Step 2.
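The three initialization methods can be sketched as follows. This is a minimal illustration, not the lecture's own code: the function names and the sample data are made up, and for Method 3 the slide does not say how the random vectors are drawn, so the sketch assumes they are drawn uniformly inside the bounding box of the data.

```python
import random

def init_first_k(samples, k):
    """Method 1: use the first K samples as the initial centers."""
    return [list(x) for x in samples[:k]]

def init_random_samples(samples, k, rng):
    """Method 2: pick K distinct data samples at random."""
    return [list(x) for x in rng.sample(samples, k)]

def init_random_vectors(samples, k, rng):
    """Method 3: draw K random vectors.

    Assumption: uniform draws inside the bounding box of the data.
    """
    dims = range(len(samples[0]))
    lo = [min(x[d] for x in samples) for d in dims]
    hi = [max(x[d] for x in samples) for d in dims]
    return [[rng.uniform(lo[d], hi[d]) for d in dims] for _ in range(k)]

# Made-up 2-D data, for illustration only.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
rng = random.Random(0)  # fixed seed so the run is repeatable

centers_1 = init_first_k(samples, 2)
centers_2 = init_random_samples(samples, 2, rng)
centers_3 = init_random_vectors(samples, 2, rng)
```

Method 1 is deterministic, which makes a worked example easy to reproduce by hand. Methods 2 and 3 depend on the random seed.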

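  6. Step 2: Determine New Clusters. Using the current cluster centers, distribute the pattern vectors by minimum distance. Method 1: use Euclidean distance. Method 2: use another distance measure. Assign sample xj to class Ck if $d(x_j, M_k(m)) \le d(x_j, M_i(m))$ for all $i = 1, 2, \dots, K$, where $d$ is the chosen distance measure. Go to Step 3.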
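A minimal sketch of the minimum-distance assignment in Step 2, assuming Euclidean distance (Method 1). The function name and data are hypothetical. Cluster indices are 0-based here, whereas the slides number clusters from 1. Ties are broken randomly when a random generator is supplied, as the worked example suggests; otherwise the lowest index wins.

```python
import math
import random

def assign_to_clusters(samples, centers, rng=None):
    """Return the index of the nearest center for each sample."""
    labels = []
    for x in samples:
        dists = [math.dist(x, m) for m in centers]
        best = min(dists)
        # All centers at exactly the minimum distance are tied.
        tied = [k for k, d in enumerate(dists) if d == best]
        labels.append(rng.choice(tied) if rng else tied[0])
    return labels

# Made-up data, for illustration only.
samples = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
centers = [(0.0, 0.0), (6.0, 5.0)]
labels = assign_to_clusters(samples, centers)
print(labels)  # [0, 0, 1, 1]
```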

  7. Step 3: Compute New Cluster Centers. Using the new cluster assignment Clk(m), k = 1, 2, ... , K, compute the new cluster centers Mk(m+1), k = 1, 2, ... , K, using $M_k(m+1) = \frac{1}{N_k} \sum_{x \in Cl_k(m)} x$, where Nk, k = 1, 2, ... , K, is the number of pattern vectors in Clk(m). Go to Step 4.
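The center update in Step 3 is a per-cluster mean, which can be sketched as below. The function name and data are hypothetical. The slide does not say what to do when a cluster receives no samples (Nk = 0); this sketch assumes the old center is kept in that case.

```python
def update_centers(samples, labels, old_centers):
    """Return the mean of the samples assigned to each cluster."""
    new_centers = []
    for k, old in enumerate(old_centers):
        members = [x for x, lab in zip(samples, labels) if lab == k]
        if not members:
            # Assumption: an empty cluster keeps its previous center.
            new_centers.append(list(old))
            continue
        n_k = len(members)
        # zip(*members) groups the coordinates dimension by dimension.
        new_centers.append([sum(col) / n_k for col in zip(*members)])
    return new_centers

# Made-up data, for illustration only.
samples = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
labels = [0, 0, 1, 1]
centers = update_centers(samples, labels, [(0.0, 0.0), (6.0, 5.0)])
print(centers)  # [[0.5, 0.0], [5.5, 5.0]]
```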

  8. Step 4: Check for Convergence. Using the cluster centers from Step 3, check for convergence. Convergence occurs if the means do not change, that is, Mk(m+1) = Mk(m) for every k. If convergence occurs, clustering is complete and the results are given. If there is no convergence, go to Step 5.

  9. Step 5: Check for Maximum Number of Iterations. Define MAXIT as the maximum acceptable number of iterations. If m = MAXIT, display "no convergence" and stop. If m < MAXIT, set m = m + 1 (increment m) and return to Step 2.
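Steps 2 through 5 can be combined into one loop, sketched below under a few assumptions: Euclidean distance, ties going to the lowest-numbered cluster, and empty clusters keeping their old center. The function name `k_means`, the `maxit` parameter name, and the data are illustrative and not taken from the lecture.

```python
import math

def k_means(samples, centers, maxit=100):
    """Run Steps 2-5 from the given initial centers (Step 1).

    Returns (centers, labels, converged).
    """
    centers = [list(c) for c in centers]
    labels = []
    for m in range(1, maxit + 1):
        # Step 2: nearest-center assignment (lowest index wins ties).
        labels = [min(range(len(centers)),
                      key=lambda k: math.dist(x, centers[k]))
                  for x in samples]
        # Step 3: each new center is the mean of its members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [x for x, lab in zip(samples, labels) if lab == k]
            if members:
                new_centers.append(
                    [sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(old)  # assumption: keep old center
        # Step 4: convergence means no center changed.
        if new_centers == centers:
            return centers, labels, True
        centers = new_centers
        # Step 5: the loop bound plays the role of MAXIT.
    return centers, labels, False

# Made-up data; initial centers are the first two samples (Method 1).
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
           (5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]
centers, labels, converged = k_means(samples, samples[:2])
print(labels, converged)  # [0, 0, 0, 1, 1, 1] True
```

On this data the assignment changes once and then stabilizes, so the loop reports convergence on its third pass. With `maxit=1` the same call returns `converged == False`, which corresponds to the "no convergence" exit in Step 5.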

  10. Example: K-Means clustering algorithm. Given the following set of pattern vectors: [pattern vectors not preserved in the transcript]

  11. Plot of Data points in Given set of samples

  12. Do the following

  13. (a) Solution, 2-class case. Initial cluster centers. [Figure: plot of the data points in the given set of samples]

  14. Initial cluster centers and distances from all samples to the cluster centers. First cluster assignment, as labeled on the slide: Cl2, Cl1, Cl2, Cl1, Cl2, Cl2, Cl2. In case of a tie, select randomly.

  15. [Figure: plot of the data points in the given set of samples, with the labels "Closest to x1" and "Closest to x2"]

  16. First cluster assignment. Compute new cluster centers.

  17. New cluster centers. [Figure: plot of the data points in the given set of samples]

  18. Distances from all samples to the cluster centers at iteration 2. Second cluster assignment, as labeled on the slide: Cl2, Cl2, Cl1, Cl1, Cl2, Cl2, Cl1.

  19. [Figure: plot of the data points in the given set of samples, showing the old cluster centers, the cluster centers M1(2) and M2(2), and the new clusters]

  20. Compute new cluster centers.

  21. [Figure: plot of the data points in the given set of samples, showing the cluster centers M1(3) and M2(3) and the new clusters]

  22. Distances from all samples to the cluster centers at iteration 3. Cluster assignment, as labeled on the slide: Cl1, Cl1, Cl1, Cl2, Cl2, Cl2, Cl2. Compute new cluster centers.

  23. (b) Solution, 3-class case. Select initial cluster centers. First cluster assignment using the distances from the pattern vectors to the initial cluster centers.

  24. Compute new cluster centers. Second cluster assignment using the distances from the pattern vectors to the cluster centers.

  25. At the next step we have convergence, as the cluster centers do not change; thus the final cluster assignment becomes: [assignment not preserved in the transcript]

  26. Final 3-class clusters. [Figure: plot of the data points in the given set of samples, showing clusters Cl1, Cl2, and Cl3 and the final cluster centers]

  27. ISODATA Algorithm (Iterative Self-Organizing Data Analysis Technique A). Performs clustering of unclassified quantitative data with an unknown number of clusters. Similar to K-Means, but with the ability to merge and split clusters, which gives flexibility in the number of clusters.

  28. ISODATA parameters that need to be specified: [parameter list not preserved in the transcript; the one surviving fragment reads "... merged at each step"]. ISODATA requires more specified information than the K-Means algorithm.

  29. ISODATA Algorithm Final Clustering

  30. Hierarchical Clustering. Approach 1, Agglomerative: combines groups at each level. Approach 2, Divisive: splits groups at each level. Only agglomerative hierarchical clustering is presented here, as it is the most widely used.

  31. Agglomerative Hierarchical Clustering. Consider a set S of patterns to be clustered: S = { x1, x2, ... , xk, ... , xN }. Define Level N by S1(N) = { x1 }, S2(N) = { x2 }, ... , SN(N) = { xN }. The clusters at Level N are the individual pattern vectors.

  32. Define Level N-1 to be the N-1 clusters formed by merging two of the Level N clusters by the following process: compute the distances between all the clusters at Level N and merge the two with the smallest distance (resolve ties randomly). The clusters at Level N-1 that result from this merging are S1(N-1), S2(N-1), ... , SN-1(N-1).

  33. The process of merging two clusters at each step is performed sequentially until Level 1 is reached. Level 1 is a single cluster containing all samples: S1(1) = { x1, x2, ... , xk, ... , xN }. Thus hierarchical clustering provides cluster assignments for all numbers of clusters from N to 1.
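The level-by-level merging can be sketched as follows. The slides do not specify how the distance between two clusters is measured, so this sketch assumes single linkage (the smallest pairwise Euclidean distance). It also takes the first closest pair rather than resolving ties randomly. All names and data are illustrative.

```python
import math
from itertools import combinations

def single_link(a, b):
    """Cluster distance (assumed): smallest pairwise Euclidean distance."""
    return min(math.dist(x, y) for x in a for y in b)

def agglomerate(samples, cluster_dist=single_link):
    """Return {level: clusters} for Level N down to Level 1."""
    clusters = [[x] for x in samples]  # Level N: one sample per cluster
    levels = {len(clusters): [list(c) for c in clusters]}
    while len(clusters) > 1:
        # Closest pair of clusters; the first such pair wins ties.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: cluster_dist(clusters[p[0]],
                                              clusters[p[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters)
                    if idx not in (i, j)]
        clusters.append(merged)
        levels[len(clusters)] = [list(c) for c in clusters]
    return levels

# Made-up 1-D data, for illustration only.
samples = [(0.0,), (1.0,), (5.0,), (6.5,)]
levels = agglomerate(samples)
print(levels[2])  # [[(0.0,), (1.0,)], [(5.0,), (6.5,)]]
```

Reading `levels[k]` gives the assignment into k clusters, so a single run covers every number of clusters from N to 1. Swapping in another `cluster_dist` (for example, the largest pairwise distance) changes the linkage without touching the merge loop.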

  34. Definition: A dendrogram is a tree-like structure that illustrates the merging of clusters at each step of the hierarchical approach. A typical dendrogram appears on the next slide.

  35. Typical Dendrogram

  36. Summary, Lecture 27: 1. Presented the details of the K-Means clustering algorithm; 2. Showed a step-by-step example of clustering using the K-Means algorithm; 3. Briefly discussed the ISODATA algorithm; 4. Introduced the agglomerative hierarchical clustering algorithm.

  37. End of Lecture 27
