FAQ Olli Virmajoki UNIVERSITY OF JOENSUU DEPARTMENT OF COMPUTER SCIENCE JOENSUU, FINLAND 11.12.2004
Merge Cost Equation
• s_i = i-th cluster of data vectors
• s_ij = cluster formed by merging the i-th and j-th clusters
• n_i = number of data vectors in s_i
• n_ij = number of data vectors in s_ij
• centroid (mean) of the data vectors in s_i
• centroid (mean) of the data vectors in s_ij
• average squared error between the centroid and the data vectors in s_i
• average squared error between the centroid and the data vectors in s_ij
• inner product of x and y
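The merge cost equation itself did not survive in this transcript. As a rough sketch, assuming the usual Ward-style form n_i n_j / (n_i + n_j) · ||c_i − c_j||², where c_i and c_j are the centroids, the cost can be computed and checked against the direct increase in total squared error. The function names and the sample data are illustrative, not taken from the slides.

```python
def centroid(vectors):
    """Mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]

def inner(x, y):
    """Inner product of x and y."""
    return sum(a * b for a, b in zip(x, y))

def sq_dist(x, y):
    """Squared Euclidean distance, written via the inner product."""
    d = [a - b for a, b in zip(x, y)]
    return inner(d, d)

def sse(vectors):
    """Total squared error of the vectors around their centroid."""
    c = centroid(vectors)
    return sum(sq_dist(v, c) for v in vectors)

def merge_cost(s_i, s_j):
    """Assumed Ward-style merge cost: n_i*n_j/(n_i+n_j) * ||c_i - c_j||^2."""
    n_i, n_j = len(s_i), len(s_j)
    return n_i * n_j / (n_i + n_j) * sq_dist(centroid(s_i), centroid(s_j))

# Illustrative data: two small clusters in the plane.
s_i = [(0.0, 0.0), (2.0, 0.0)]
s_j = [(5.0, 1.0), (7.0, 3.0), (6.0, 2.0)]

# Increase in total squared error caused by merging, computed directly.
direct_increase = sse(s_i + s_j) - sse(s_i) - sse(s_j)
print(merge_cost(s_i, s_j), direct_increase)
```

Under this assumption the merge cost depends only on the cluster sizes and centroids, so it can be evaluated without revisiting the individual data vectors.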
Multilevel thresholding (three maximization criteria; the equations were not preserved)
Exact calculation of the removal cost
• The data vectors x_i in cluster s_a are divided into subclusters s_a,j
• Removal is conceptually a three-step process:
(1) remove the vectors from the current cluster s_a
(2) form the subclusters s_a,j
(3) merge the subclusters into their neighbor clusters s_j
Removal cost
• The first term is the cost of the cluster before removal
• The second term is the sum of the cost values inside the subclusters
• The third term is the sum of the costs of merging the subclusters s_a,j into their neighbor clusters s_j
Number of clusterings
• M^N iterations to cover the search space
• Mapping N distinct vectors to M non-distinct codewords reduces the search by a factor of M!
• Clusterings(N,M)
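The formula for Clusterings(N,M) was lost in this transcript. A plausible reading, stated here as an assumption, is that it is the Stirling number of the second kind, i.e. the number of ways to split N distinct vectors into M non-empty, unlabeled groups. The sketch below evaluates the standard inclusion-exclusion formula and compares it with the rougher M^N / M! estimate; the function name is hypothetical.

```python
from math import comb, factorial

def clusterings(n, m):
    """Stirling number of the second kind via inclusion-exclusion:
    (1/m!) * sum_{j=0..m} (-1)^(m-j) * C(m, j) * j^n
    """
    total = sum((-1) ** (m - j) * comb(m, j) * j ** n for j in range(m + 1))
    return total // factorial(m)  # the sum is always divisible by m!

n, m = 10, 3
exact = clusterings(n, m)
estimate = m ** n / factorial(m)  # M^N labelings divided by M! orderings
print(exact, estimate)
```

The M^N / M! figure slightly overestimates the exact count because M^N also includes assignments that leave some codewords unused.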
Number of clusterings
• Consider a number of vectors assigned to groups, one vector at a time
• Each vector in turn may:
  • either form a new group on its own, or
  • join the vectors already in an existing group.
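The one-vector-at-a-time argument above corresponds to the standard recurrence for Stirling numbers of the second kind, S(n, m) = S(n−1, m−1) + m · S(n−1, m): the last vector either opens a new group or joins one of the m existing groups. A minimal sketch of that recurrence, with a hypothetical function name:

```python
def count_groupings(n, m):
    """Number of ways to split n distinct vectors into m non-empty groups."""
    # table[i][k] = number of ways to split i vectors into k non-empty groups
    table = [[0] * (m + 1) for _ in range(n + 1)]
    table[0][0] = 1  # zero vectors, zero groups: one (empty) arrangement
    for i in range(1, n + 1):
        for k in range(1, min(i, m) + 1):
            new_group = table[i - 1][k - 1]       # vector i forms its own group
            join_existing = k * table[i - 1][k]   # vector i joins one of k groups
            table[i][k] = new_group + join_existing
    return table[n][m]

print(count_groupings(4, 2))
print(count_groupings(10, 3))
```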