Pattern Recognition: Statistical and Neural

Nanjing University of Science & Technology
Pattern Recognition: Statistical and Neural
Lonnie C. Ludeman
Lecture 27
Nov 9, 2005

Lecture 27 Topics
• K-Means Clustering Algorithm Details
• K-Means Step by Step Example
• ISODATA Algorithm Overview
• Agglomerative Hierarchical Clustering Algorithm Description

K-Means Clustering Algorithm: Basic Procedure
Randomly select K cluster centers from the pattern space.
Distribute the set of patterns to the cluster centers using minimum distance.
Compute new cluster centers for each cluster.
Continue this process until the cluster centers do not change.

Flow Diagram for K-Means Algorithm

Step 1 Initialization
Choose K initial cluster centers M1(1), M2(1), ... , MK(1)
Method 1 – First K samples
Method 2 – K data samples selected randomly
Method 3 – K random vectors
Set m = 1 and go to Step 2
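The three initialization methods can be sketched as follows. This is a minimal illustration, not the lecture's own code: the function names and the toy data are hypothetical, and Method 3 is assumed to draw vectors uniformly from the bounding box of the data, since the slide does not say how the random vectors are generated.

```python
import random

def init_first_k(samples, k):
    """Method 1: use the first K samples as the initial centers."""
    return [tuple(x) for x in samples[:k]]

def init_random_samples(samples, k, rng):
    """Method 2: pick K distinct data samples at random."""
    return [tuple(x) for x in rng.sample(samples, k)]

def init_random_vectors(samples, k, rng):
    """Method 3: draw K random vectors. Assumption of this sketch:
    uniform draws inside the bounding box of the data."""
    dims = range(len(samples[0]))
    lo = [min(x[d] for x in samples) for d in dims]
    hi = [max(x[d] for x in samples) for d in dims]
    return [tuple(rng.uniform(lo[d], hi[d]) for d in dims) for _ in range(k)]

# Hypothetical toy data, not the lecture's example set.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
rng = random.Random(0)  # fixed seed so the sketch is repeatable

centers_m1 = init_first_k(samples, 2)
centers_m2 = init_random_samples(samples, 2, rng)
centers_m3 = init_random_vectors(samples, 2, rng)
```

Method 1 is deterministic, which makes a hand-worked example easy to reproduce; Methods 2 and 3 generally give different results from run to run unless the random seed is fixed.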

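Step 2 Determine New Clusters
Using the current cluster centers, distribute the pattern vectors by minimum distance.
Method 1 – Use Euclidean distance
Method 2 – Use other distance measures
Assign sample xj to cluster Clk(m) if
$d(x_j, M_k(m)) \le d(x_j, M_i(m))$ for all i = 1, 2, ... , K
where d is the chosen distance measure.
Go to Step 3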
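The minimum-distance assignment of Step 2 can be sketched as below, using Euclidean distance (Method 1). The function name and the toy data are hypothetical. One deliberate simplification: a tie goes to the lowest-numbered cluster, whereas the lecture's example resolves ties randomly.

```python
import math

def assign_to_clusters(samples, centers):
    """Return, for each sample, the index k of its nearest center."""
    labels = []
    for x in samples:
        distances = [math.dist(x, m) for m in centers]  # Euclidean distance
        # index() returns the first minimum, so ties go to the lowest index
        labels.append(distances.index(min(distances)))
    return labels

# Hypothetical toy data, not the lecture's example set.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
centers = [(0.0, 0.0), (5.0, 5.0)]
labels = assign_to_clusters(samples, centers)
print(labels)  # [0, 0, 0, 1, 1]
```

Index 0 corresponds to Cl1 and index 1 to Cl2. Replacing `math.dist` with another distance function gives Method 2.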

Step 3 Compute New Cluster Centers
Using the new cluster assignment Clk(m), k = 1, 2, ... , K, compute the new cluster centers Mk(m+1), k = 1, 2, ... , K, using
$M_k(m+1) = \frac{1}{N_k} \sum_{x_j \in Cl_k(m)} x_j$
where Nk, k = 1, 2, ... , K, is the number of pattern vectors in Clk(m).
Go to Step 4
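A minimal sketch of the Step 3 update, where each new center is the mean of the pattern vectors assigned to that cluster. The function name and data are hypothetical. The slide does not say what to do when a cluster receives no samples; keeping the old center in that case is an assumption of this sketch.

```python
def update_centers(samples, labels, old_centers):
    """Compute each new center as the mean of the samples in its cluster."""
    new_centers = []
    for k, old in enumerate(old_centers):
        members = [x for x, lab in zip(samples, labels) if lab == k]
        if not members:
            # Assumption: an empty cluster keeps its previous center.
            new_centers.append(old)
            continue
        n_k = len(members)  # N_k, the number of pattern vectors in the cluster
        new_centers.append(tuple(sum(coord) / n_k for coord in zip(*members)))
    return new_centers

# Hypothetical toy data and assignment, not the lecture's example set.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
labels = [0, 0, 0, 1, 1]
new_centers = update_centers(samples, labels, [(0.0, 0.0), (5.0, 5.0)])
```

Here the second center becomes (5.5, 5.0), the mean of the two samples labeled 1, and the first becomes (1/3, 1/3).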

Step 4 Check for Convergence
Using the cluster centers from Step 3, check for convergence. Convergence occurs if the means do not change, that is, Mk(m+1) = Mk(m) for all k.
If convergence occurs, clustering is complete and the results are given.
If there is no convergence, go to Step 5.

Step 5 Check for Maximum Number of Iterations
Define MAXIT as the maximum number of iterations that is acceptable.
If m = MAXIT, then display "no convergence" and stop.
If m < MAXIT, then set m = m + 1 (increment m) and return to Step 2.
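Steps 2 through 5 can be combined into one loop, sketched below under stated assumptions: Euclidean distance, ties resolved to the lowest cluster index rather than randomly, and an empty cluster keeping its old center. The function name `k_means`, the `maxit` parameter (standing in for MAXIT), and the toy data are hypothetical.

```python
import math

def k_means(samples, initial_centers, maxit=100):
    """Return (centers, labels, iterations, converged)."""
    centers = [tuple(c) for c in initial_centers]
    labels = []
    for m in range(1, maxit + 1):
        # Step 2: assign each sample to its nearest center.
        labels = [min(range(len(centers)), key=lambda k: math.dist(x, centers[k]))
                  for x in samples]
        # Step 3: recompute each center as the mean of its members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [x for x, lab in zip(samples, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(old)  # assumption: keep an empty cluster's center
        # Step 4: convergence means the centers did not change.
        if new_centers == centers:
            return centers, labels, m, True
        centers = new_centers
    # Step 5: MAXIT reached without convergence.
    return centers, labels, maxit, False

# Hypothetical toy data; initial centers are the first K samples (Method 1).
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
centers, labels, iterations, converged = k_means(samples, samples[:2])
```

On this toy set the loop converges at the third pass: the first two passes move the centers, and the third reproduces them exactly, which is the stopping condition of Step 4.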

Example: K-Means cluster algorithm
Given the following set of pattern vectors

Plot of Data points in Given set of samples

Do the following

(a) Solution – 2-class case Initial Cluster centers Plot of Data points in Given set of samples

Initial Cluster Centers
Distances from all samples to the cluster centers
First cluster assignment (labels in the order listed on the slide): Cl2, Cl1, Cl2, Cl1, Cl2, Cl2, Cl2
With a tie, select randomly.

Plot of data points in the given set of samples, annotated with the samples closest to x1 and those closest to x2

First Cluster Assignment Compute New Cluster centers

New Cluster centers Plot of Data points in Given set of samples

Distances from all samples to the cluster centers (iteration 2)
Second cluster assignment (labels in the order listed on the slide): Cl2, Cl2, Cl1, Cl1, Cl2, Cl2, Cl1

Plot of data points in the given set of samples, labeled with the two old cluster centers, the cluster centers M1(2) and M2(2), and the new clusters

Compute New Cluster Centers

Plot of data points in the given set of samples, labeled with the cluster centers M1(3) and M2(3) and the new clusters

Distances from all samples to the cluster centers (iteration 3)
Cluster assignment (labels in the order listed on the slide): Cl1, Cl1, Cl1, Cl2, Cl2, Cl2, Cl2
Compute new cluster centers

(b) Solution: 3-Class case
Select initial cluster centers
First cluster assignment using distances from pattern vectors to initial cluster centers

Compute new cluster centers
Second cluster assignment using distances from pattern vectors to cluster centers

At the next step we have convergence, as the cluster centers do not change; thus the final cluster assignment becomes

Final 3-Class Clusters
Plot of data points in the given set of samples, labeled with the final clusters Cl1, Cl2, and Cl3 and the final cluster centers

Iterative Self-Organizing Data Analysis Technique A (ISODATA) Algorithm
Performs clustering of unclassified quantitative data with an unknown number of clusters.
Similar to K-Means, but with the ability to merge and split clusters, thus giving flexibility in the number of clusters.

ISODATA Parameters that need to be specified
... merged at each step
Requires more specified information than the K-Means Algorithm
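The merge and split decisions that distinguish ISODATA from K-Means can be illustrated with a simplified sketch. This is not the full ISODATA procedure and not the lecture's parameter list, which did not survive on the slide. The names `merge_threshold` and `split_threshold`, both helper functions, and the data are hypothetical, and only one pair of clusters is merged per call.

```python
import math

def merge_closest_pair(centers, sizes, merge_threshold):
    """Merge the two closest centers if they are nearer than merge_threshold.
    The merged center is the size-weighted mean of the pair."""
    best = None
    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            d = math.dist(centers[i], centers[j])
            if best is None or d < best[0]:
                best = (d, i, j)
    if best is None or best[0] >= merge_threshold:
        return centers, sizes  # nothing close enough to merge
    _, i, j = best
    n = sizes[i] + sizes[j]
    merged = tuple((sizes[i] * a + sizes[j] * b) / n
                   for a, b in zip(centers[i], centers[j]))
    keep = [k for k in range(len(centers)) if k not in (i, j)]
    return [centers[k] for k in keep] + [merged], [sizes[k] for k in keep] + [n]

def should_split(members, center, split_threshold):
    """True if the spread along any coordinate exceeds split_threshold."""
    n = len(members)
    for d in range(len(center)):
        variance = sum((x[d] - center[d]) ** 2 for x in members) / n
        if math.sqrt(variance) > split_threshold:
            return True
    return False

# Hypothetical centers and cluster sizes.
centers, sizes = merge_closest_pair(
    [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)], [2, 2, 4], merge_threshold=2.0)
wide = should_split([(0.0, 0.0), (4.0, 0.0)], (2.0, 0.0), split_threshold=1.0)
```

The two thresholds show why ISODATA needs more user-specified information than K-Means: the number of clusters is no longer fixed, so the user must instead state when clusters count as too close or too spread out.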

ISODATA Algorithm Final Clustering

Hierarchical Clustering
Approach 1 Agglomerative: combines groups at each level
Approach 2 Divisive: divides groups at each level
Only Agglomerative Hierarchical Clustering will be presented, as it is the most used.

Agglomerative Hierarchical Clustering
Consider a set S of patterns to be clustered
S = { x1, x2, ... , xk, ... , xN }
Define Level N by
S1(N) = { x1 }
S2(N) = { x2 }
...
SN(N) = { xN }
Clusters at Level N are the individual pattern vectors.

Define Level N-1 to be the N-1 clusters formed by merging two of the Level N clusters by the following process. Compute the distances between all the clusters at Level N and merge the two with the smallest distance (resolve ties randomly) to give the Level N-1 clusters
S1(N-1)
S2(N-1)
...
SN-1(N-1)
Clusters at Level N-1 result from this merging.
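A single merge, from one level to the next lower level, might look like the sketch below. The slide does not say how the distance between two clusters is measured; single linkage (the smallest distance between any two members) is assumed here. Ties go to the first pair in index order rather than being resolved randomly. Function names and data are hypothetical.

```python
import math

def cluster_distance(a, b):
    """Assumption: single linkage, the smallest member-to-member distance."""
    return min(math.dist(x, y) for x in a for y in b)

def merge_closest(clusters):
    """Merge the two closest clusters, giving the next lower level."""
    pairs = [(cluster_distance(clusters[i], clusters[j]), i, j)
             for i in range(len(clusters))
             for j in range(i + 1, len(clusters))]
    _, i, j = min(pairs)  # ties fall to the first pair in index order
    merged = clusters[i] + clusters[j]
    rest = [c for k, c in enumerate(clusters) if k not in (i, j)]
    return rest + [merged]

# Hypothetical toy data: Level N has one cluster per sample.
samples = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (9.0, 5.0)]
level_n = [[x] for x in samples]
level_n_minus_1 = merge_closest(level_n)
```

With N = 4 samples, `level_n_minus_1` holds three clusters, one of which contains the two samples that were closest together.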

The process of merging two clusters at each step is performed sequentially until Level 1 is reached. Level 1 is a single cluster containing all samples
S1(1) = { x1, x2, ... , xk, ... , xN }
Thus hierarchical clustering provides cluster assignments for all numbers of clusters from N to 1.
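Repeating the merge until one cluster remains yields a clustering for every level from N down to 1. The sketch below records each level in a dictionary keyed by the number of clusters. As assumptions, it uses single-linkage Euclidean distance between clusters and resolves ties by index order; the function name and data are hypothetical.

```python
import math

def agglomerate(samples):
    """Return {level: clusters} for every level from N down to 1."""
    clusters = [[x] for x in samples]  # Level N: one cluster per sample
    levels = {len(clusters): [list(c) for c in clusters]}
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Assumption: single-linkage distance between clusters.
                d = min(math.dist(x, y)
                        for x in clusters[i] for y in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        levels[len(clusters)] = [list(c) for c in clusters]
    return levels

# Hypothetical toy data, not from the lecture.
samples = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (9.0, 5.0)]
levels = agglomerate(samples)
two_clusters = levels[2]  # the clustering with exactly two clusters
```

Because every level is kept, the number of clusters can be chosen after the fact by reading off the matching entry, which is the practical advantage over running K-Means once per value of K.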

Definition: A dendrogram is a tree-like structure that illustrates the merging of clusters at each step of the hierarchical approach. A typical dendrogram appears on the next slide.
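A dendrogram can be represented in code as a nested tree in which each internal node records the two subtrees that were merged and the distance at which they merged. The sketch below builds such a tree and renders it as indented text. The representation, the function names, the single-linkage distance, and the toy data are all assumptions made for illustration.

```python
import math

def build_dendrogram(samples):
    """Leaves are sample indices; internal nodes are (left, right, distance)."""
    # Each active cluster is (tree, list of member indices).
    active = [(i, [i]) for i in range(len(samples))]
    while len(active) > 1:
        best = None
        for a in range(len(active)):
            for b in range(a + 1, len(active)):
                # Assumption: single-linkage distance between clusters.
                d = min(math.dist(samples[p], samples[q])
                        for p in active[a][1] for q in active[b][1])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        node = ((active[a][0], active[b][0], d), active[a][1] + active[b][1])
        active = [c for k, c in enumerate(active) if k not in (a, b)] + [node]
    return active[0][0]

def render(tree, depth=0):
    """Return the tree as indented text lines, root first."""
    if isinstance(tree, int):
        return ["  " * depth + f"x{tree + 1}"]
    left, right, d = tree
    return (["  " * depth + f"merge at distance {d:.2f}"]
            + render(left, depth + 1) + render(right, depth + 1))

# Hypothetical toy data, not from the lecture.
samples = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (9.0, 5.0)]
tree = build_dendrogram(samples)
lines = render(tree)
print("\n".join(lines))
```

Cutting the tree at a chosen merge distance gives the clusters at the corresponding level: every subtree below the cut is one cluster.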

Typical Dendrogram

Summary Lecture 27
• Presented the K-Means Clustering Algorithm Details
• Showed Example of Clustering using the K-Means Algorithm (Step by Step)
• Briefly discussed the ISODATA Algorithm
• Introduced the Agglomerative Hierarchical Clustering Algorithm

End of Lecture 27
