Modified global k-means algorithm for minimum sum-of-squares clustering problems. Presenter: Lin, Shu-Han. Authors: Adil M. Bagirov. Pattern Recognition (PR, 2008).
Modified global k-means algorithm for minimum sum-of-squares clustering problems • Presenter: Lin, Shu-Han • Authors: Adil M. Bagirov • Pattern Recognition (PR, 2008)
Outline • Motivation • Objective • Methodology • Experiments • Conclusion • Comments
Motivation • k-Means algorithm • sensitive to the choice of starting points • inefficient for solving clustering problems in large data sets • Global k-Means (GKM) algorithm • incremental algorithm (adds one cluster center at a time) • uses each data point as a candidate for the k-th cluster center
Objectives • Propose a new version of GKM
Methodology – k-Means • sensitive to the choice of a starting point
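The sensitivity to starting points can be illustrated with a minimal sketch. The one-dimensional data set, the two sets of starting centers, and the helper names `sq_dist`, `kmeans`, and `sse` are all assumptions made for this illustration; they do not come from the slides or the paper.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, centers, max_iter=100):
    """Plain k-means (Lloyd iterations) from the given starting centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            clusters[j].append(p)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its old center).
        new_centers = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else old
            for members, old in zip(clusters, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

# Toy data (hypothetical): three well-separated pairs on a line.
points = [(0.0,), (1.0,), (10.0,), (11.0,), (20.0,), (21.0,)]

good = kmeans(points, [(0.5,), (10.5,), (20.5,)])  # one start per pair
bad = kmeans(points, [(0.0,), (1.0,), (15.0,)])    # two starts in the same pair

print(sse(points, good))  # 1.5
print(sse(points, bad))   # 101.0
```

Both runs converge, but the second one stops at a local minimum whose error is far larger, which is the behavior the slide refers to.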
Methodology – The GKM algorithm • Objective function
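The incremental idea behind GKM (add one center at a time, trying every data point as the candidate for the new center) can be sketched as follows. This is an illustrative reading of the description in the Motivation slide, not the authors' code; the helper names and the toy data are assumptions.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(pts) for c in zip(*pts))

def kmeans(points, centers, max_iter=100):
    """Plain k-means (Lloyd iterations) from the given starting centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            clusters[j].append(p)
        new_centers = [mean(m) if m else old for m, old in zip(clusters, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def global_kmeans(points, k):
    """Sketch of global k-means: grow the solution from 1 to k centers."""
    centers = [mean(points)]  # for k = 1 the best center is the centroid
    for _ in range(2, k + 1):
        best = None
        # Try every data point as the starting position of the new center.
        for candidate in points:
            trial = kmeans(points, centers + [candidate])
            if best is None or sse(points, trial) < sse(points, best):
                best = trial
        centers = best
    return centers

# Toy data (hypothetical): three well-separated pairs on a line.
points = [(0.0,), (1.0,), (10.0,), (11.0,), (20.0,), (21.0,)]
centers = global_kmeans(points, 3)
print(sse(points, centers))  # 1.5
```

Each step runs k-means once per data point, which is why this basic form becomes expensive on large data sets.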
Methodology – Objective function • Old version • Reformulated version
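For reference, the minimum sum-of-squares clustering problem for data points $a^1,\dots,a^m$ and centers $x^1,\dots,x^k$ is usually written in the two forms below. These are the standard forms of the problem; it is an assumption that they correspond to the "old" and "reformulated" versions named on this slide.

```latex
% Assignment-based (combinatorial) form: w_{ij} = 1 if point a^i belongs to cluster j
\min_{x,\,w}\; \psi_k(x, w) = \frac{1}{m} \sum_{i=1}^{m} \sum_{j=1}^{k} w_{ij}\, \lVert x^j - a^i \rVert^2
\quad \text{s.t.} \quad \sum_{j=1}^{k} w_{ij} = 1, \qquad w_{ij} \in \{0, 1\}

% Nonsmooth form: the assignment variables are replaced by a minimum over centers
\min_{x}\; f_k(x^1, \dots, x^k) = \frac{1}{m} \sum_{i=1}^{m} \min_{j = 1, \dots, k} \lVert x^j - a^i \rVert^2
```

The second form has only the center coordinates as variables, at the cost of a nonsmooth, nonconvex objective.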
Methodology – fast GKM algorithm • Old version • Proposed version (auxiliary cluster function)
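A minimal sketch of the two quantities contrasted here, under the following assumptions: the "old version" is taken to be the fast GKM score $b_j = \sum_i \max\{0,\ d^i_{k-1} - \lVert a^j - a^i \rVert^2\}$, and the auxiliary cluster function is taken to be $\bar f_k(y) = \frac{1}{m} \sum_i \min\{d^i_{k-1},\ \lVert y - a^i \rVert^2\}$, where $d^i_{k-1}$ is the squared distance from $a^i$ to its nearest existing center. The function names and the toy data are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest_sq_dists(points, centers):
    """d_prev[i]: squared distance from point i to its nearest current center."""
    return [min(sq_dist(p, c) for c in centers) for p in points]

def fast_gkm_gain(points, d_prev, candidate):
    """Assumed fast-GKM score b_j: guaranteed drop in total squared error
    if a new center is placed at the data point `candidate`."""
    return sum(max(0.0, d - sq_dist(candidate, p)) for p, d in zip(points, d_prev))

def auxiliary(points, d_prev, y):
    """Assumed auxiliary cluster function: average error if a new center is
    placed at y (any position, not only a data point) and old centers stay fixed."""
    return sum(min(d, sq_dist(y, p)) for p, d in zip(points, d_prev)) / len(points)

# Toy data (hypothetical) with one existing center at the centroid.
points = [(0.0,), (1.0,), (10.0,), (11.0,), (20.0,), (21.0,)]
centers = [(10.5,)]
d_prev = nearest_sq_dists(points, centers)
f_prev = sum(d_prev) / len(points)

# Fast GKM restricts the search for the new starting point to data points.
best_point = max(points, key=lambda a: fast_gkm_gain(points, d_prev, a))
print(best_point, fast_gkm_gain(points, d_prev, best_point))

# The auxiliary function can also be evaluated between data points.
print(auxiliary(points, d_prev, (0.5,)), auxiliary(points, d_prev, best_point))
```

On data points the two agree up to a constant, since $\bar f_k(a^j) = f_{k-1} - b_j / m$; the difference is that the auxiliary function is defined for every $y$, so the starting point no longer has to be a data point.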
Methodology – modified GKM algorithm • Proposed version
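The modified scheme can be sketched as below: the starting point for the k-th center is obtained by minimizing the auxiliary cluster function, and k-means is then run from the previous centers plus that point. This is a simplified illustration, not the authors' exact procedure. The k-means-style descent in `minimize_auxiliary`, the relative-decrease stopping rule with parameter `tol`, and all names are assumptions.

```python
def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(pts):
    return tuple(sum(c) / len(pts) for c in zip(*pts))

def kmeans(points, centers, max_iter=100):
    """Plain k-means (Lloyd iterations) from the given starting centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            clusters[j].append(p)
        new_centers = [mean(m) if m else old for m, old in zip(clusters, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def sse(points, centers):
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def auxiliary(points, d_prev, y):
    """Average error if a new center is placed at y, old centers fixed."""
    return sum(min(d, sq_dist(y, p)) for p, d in zip(points, d_prev)) / len(points)

def minimize_auxiliary(points, d_prev, y, max_iter=100):
    """Assumed local search: move y to the mean of the points it would attract."""
    for _ in range(max_iter):
        attracted = [p for p, d in zip(points, d_prev) if sq_dist(y, p) < d]
        if not attracted:
            break
        new_y = mean(attracted)
        if new_y == y:
            break
        y = new_y
    return y

def modified_global_kmeans(points, k_max, tol=0.0):
    centers = [mean(points)]
    f1 = f_prev = sse(points, centers)
    for _ in range(2, k_max + 1):
        d_prev = [min(sq_dist(p, c) for c in centers) for p in points]
        # Local search from every data point; keep the best minimizer found.
        starts = [minimize_auxiliary(points, d_prev, p) for p in points]
        y = min(starts, key=lambda s: auxiliary(points, d_prev, s))
        centers = kmeans(points, centers + [y])  # a single k-means run per step
        f_k = sse(points, centers)
        # Assumed stopping rule: stop once the relative decrease falls below tol.
        if f1 > 0 and (f_prev - f_k) / f1 < tol:
            break
        f_prev = f_k
    return centers

# Toy data (hypothetical): three well-separated pairs on a line.
points = [(0.0,), (1.0,), (10.0,), (11.0,), (20.0,), (21.0,)]
centers = modified_global_kmeans(points, 3)
print(sse(points, centers))  # 1.5
```

In this sketch only one k-means run is needed per added center, because the candidate search is done on the cheaper auxiliary function.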
Experiments • MS k-means: Multi-start k-means • GKM: fast Global k-Means • MGKM: Modified Global k-Means
Experiments • Overall (14 data sets, 140 results) • The MS k-means algorithm finds the best known (or near best known) solutions 42 (33.3%) times • GKM algorithm 76 (60.3%) times • MGKM algorithm 102 (81.0%) times • Large k in large data sets (m) • The MS k-means algorithm failed to find the best known (or near best known) solutions • GKM algorithm finds such solutions 22 (45.8%) times • MGKM algorithm 42 (87.5%) times
Conclusions • A new version of the GKM • Changes the computation of starting points • By minimizing the auxiliary cluster function • Given tolerance • Is more effective than GKM • especially on large data sets • The choice of starting points in k-means is crucial
Comments • Advantage • Theoretical analysis • Drawback • Does not explain why each modification the authors make is important or needed • Application • GKM outperforms the k-means algorithm