Weighted Cluster Ensembles: Methods and analysis
Presenter: Chien-Hsing Chen
Authors: Carlotta Domeniconi, Muna Al-Razgan
2009.TKDD.40
Outline
• Motivation
• Objective
• Overview of the clustering ensemble
• Method
• Experiments
• Conclusion
• Comment
Motivation
• High-dimensional data
  • A dimension (feature) can be highly relevant to one cluster but irrelevant to another.
  • Common global dimensionality reduction techniques are unable to capture such local structure of the data.
  • Weight each dimension per cluster, instead of
    • using an equal weight for all of w1, w2, …, wD, or
    • using the same weight wi for every cluster, where i = 1, …, D.
  • Example with attributes (homerun, baseball, shopping): c1 = {sport} has w = (0.9, 0.8, 0.1)^t and c2 = {auction} has w = (0.1, 0.2, 0.9)^t, so w1,i ≠ w2,i.
• Clustering ensemble
  • An ensemble bag can mix algorithms: K-means, SOM, etc.
  • An alternative bag varies one algorithm's setting: 3-means, 5-means, 7-means.
• How can a technique combine the two aspects?
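To make the per-cluster weights concrete, here is a minimal sketch assuming a weighted Euclidean distance (the slide does not state the distance explicitly). The function name `weighted_distance` and the two sample documents are hypothetical; only the weight vectors come from the slide.

```python
import math

def weighted_distance(x, y, w):
    """Weighted Euclidean distance (an assumed form, not taken from the slide)."""
    return math.sqrt(sum(wi * (xi - yi) ** 2 for xi, yi, wi in zip(x, y, w)))

# Weight vectors from the slide, over (homerun, baseball, shopping).
w_sport = (0.9, 0.8, 0.1)
w_auction = (0.1, 0.2, 0.9)

# Two made-up documents that differ only in the "shopping" attribute.
doc_a = (1.0, 1.0, 0.0)
doc_b = (1.0, 1.0, 1.0)

d_sport = weighted_distance(doc_a, doc_b, w_sport)
d_auction = weighted_distance(doc_a, doc_b, w_auction)
print(d_sport, d_auction)
```

Under the sport weights the two documents look close, because "shopping" barely counts there. Under the auction weights the same difference dominates the distance.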
Objective
• High-dimensional data
  • Provide a first attempt to capture the local structure of the data.
  • LAC-h approach
• Clustering ensemble
  • LAC-1, LAC-3, LAC-29, …
• Combine the two aspects
  • WSPA approach
  • WBPA approach
  • WSBPA approach
Overall work
• 1. A new clustering approach is discussed
  • Handles high-dimensional data
  • Clustering ensemble
• 2. Three ensemble techniques are introduced
  • Consensus function
• 3. Graph cut
[Figure: clustering partitions feed a graph whose edge weights are pairwise similarity scores (0.25, 0.15, 0.95, 0.01, 0.13, 0.20, 0.91, 0.20); s( ) > s( ) compares two such scores.]
LAC (locally adaptive clustering)
• Clustering ensemble
• Distance along attribute i within cluster j
• Example: c1 with w = (0.9, 0.5, 0.1)^t and |nc1| = 4; c2 with w = (0.1, 0.5, 0.9)^t and |nc2| = 3
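The formula on this slide did not survive extraction. The sketch below assumes the exponential weighting scheme usually associated with LAC: the weight of attribute i in cluster j is proportional to exp(-X_ji / h), where X_ji is the average squared distance along attribute i between the cluster's members and its centroid. The name `lac_weights` and the sample data are hypothetical, and this shows only the weight-update step, not the full algorithm.

```python
import math

def lac_weights(points, labels, centroids, h):
    """Per-cluster attribute weights (assumed exponential scheme)."""
    dims = len(centroids[0])
    weights = []
    for j, centroid in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if not members:
            # An empty cluster gets uniform weights in this sketch.
            weights.append([1.0 / dims] * dims)
            continue
        # X_ji: average squared distance along attribute i within cluster j.
        spread = [
            sum((p[i] - centroid[i]) ** 2 for p in members) / len(members)
            for i in range(dims)
        ]
        # Subtracting the minimum keeps exp() stable; it cancels on normalising.
        low = min(spread)
        expo = [math.exp(-(s - low) / h) for s in spread]
        total = sum(expo)
        weights.append([e / total for e in expo])
    return weights

# Made-up data: cluster 0 is tight along attribute 0, cluster 1 along attribute 1.
points = [(0.0, 1.0), (0.1, 5.0), (-0.1, -3.0), (5.0, 2.0), (9.0, 2.1), (1.0, 1.9)]
labels = [0, 0, 0, 1, 1, 1]
centroids = [(0.0, 1.0), (5.0, 2.0)]
for w in lac_weights(points, labels, centroids, h=1.0):
    print(w)
```

An attribute along which a cluster is tight receives a large weight for that cluster, which is how the same attribute can get different weights in different clusters. Under this assumed form, a larger h flattens the weights toward uniform.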
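WSPA 1/2
• Each data point is described by a vector P of cluster membership probabilities.
• Example: P = (0.94, 0.04, 0.02)^t for one point and P = (0.90, 0.06, 0.04)^t for another.
• s( ) scores the similarity of two points from their P vectors.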
WSPA 2/2
• Clustering ensemble
  • Two points have a high similarity score if they often appear in the same cluster across the ensemble's partitions.
• Instance-based graph cut
[Figure: graph over instances with edge weights 0.25, 0.15, 0.13, 0.95, 0.01, 0.20, 0.91, 0.20.]
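The slides do not show how s( ) is computed. The sketch below assumes cosine similarity between two points' membership vectors, averaged over the ensemble members. The name `wspa_similarity` and the input layout are hypothetical; the three P vectors are the ones that appear on the WSPA and WBPA slides.

```python
import math

def cosine(p, q):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

def wspa_similarity(memberships):
    """memberships[m][i] is the membership vector of point i in ensemble member m.

    Returns an n x n matrix of similarities averaged over ensemble members.
    """
    n = len(memberships[0])
    sim = [[0.0] * n for _ in range(n)]
    for member in memberships:
        for i in range(n):
            for j in range(n):
                sim[i][j] += cosine(member[i], member[j]) / len(memberships)
    return sim

# One ensemble member, three points (vectors taken from the slides).
member = [(0.94, 0.04, 0.02), (0.90, 0.06, 0.04), (0.03, 0.91, 0.06)]
for row in wspa_similarity([member]):
    print(row)
```

The resulting matrix can serve as the edge weights of the instance graph, which a graph partitioner then cuts into consensus clusters.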
WBPA 1/3
• Problem definition
  • Example: P = (0.94, 0.04, 0.02)^t for one point and P = (0.03, 0.91, 0.06)^t for another.
  • The two points are never clustered together, so their similarity is ≡ 0 in the instance-based graph.
  • Yet the groups to which the two points belong share the same instances.
WBPA 2/3
• Graph
  • Edges connect a cluster to an instance, instead of connecting instances to each other.
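A minimal sketch of such a cluster-to-instance graph follows. It assumes that the edge between an instance and a cluster is weighted by the instance's membership probability for that cluster. The name `wbpa_graph` and the vertex ordering (cluster vertices first, then instance vertices) are choices made for illustration.

```python
def wbpa_graph(memberships):
    """Build a symmetric adjacency matrix for a cluster-instance graph.

    memberships[m][i] is the membership vector of point i in ensemble member m.
    Vertices 0..c-1 are clusters (all members concatenated); c..c+n-1 are instances.
    """
    n = len(memberships[0])
    # Row i lists point i's memberships over every cluster of every member.
    rows = [[p for member in memberships for p in member[i]] for i in range(n)]
    c = len(rows[0])
    size = c + n
    adj = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for l in range(c):
            # Only cluster-instance edges exist; there are no instance-instance
            # or cluster-cluster edges.
            adj[c + i][l] = rows[i][l]
            adj[l][c + i] = rows[i][l]
    return adj

# One ensemble member with three clusters and the two points from the slide.
member = [(0.94, 0.04, 0.02), (0.03, 0.91, 0.06)]
for row in wbpa_graph([member]):
    print(row)
```

Because clusters are vertices too, cutting this graph groups instances and clusters at the same time, so two points that never co-occur can still end up together through the clusters they relate to.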
WBPA 3/3
[Figure: cluster-instance graph; surviving edge weights: 0.64, 0.94.]
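WSBPA
[Figure, two panels: graph with weighted edges. Panel 1 weights: 0.94, 0.94, 0.93, 0.91, 0.86, 0.86, 0.85, 0.89, 0.93, 0.94, 0.64. Panel 2 weights: 0.04, 0.04, 0.03, 0.01, 0.86, 0.86, 0.85, 0.89, 0.01, 0.89, 0.94, 0.64.]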
Experiment
Conclusion
• High-dimensional data
  • LAC-h approach
• Clustering ensemble
  • LAC-1, LAC-3, LAC-29, …
• Combine the two aspects
  • WSPA approach
  • WBPA approach
  • WSBPA approach
Comment
• Advantage
  • Consensus function
• Drawback
• Application
  • Ensemble clustering on SOM