
MONIC - Modeling and Monitoring Cluster Transitions

M. Spiliopoulou, I. Ntoutsi, Y. Theodoridis, and R. Schult. Proceedings of the 12th International Conference on Knowledge Discovery and Data Mining, ACM SIGKDD, 2006. Presenter: 吳建良.



Presentation Transcript


  1. MONIC - Modeling and Monitoring Cluster Transitions • M. Spiliopoulou, I. Ntoutsi, Y. Theodoridis, and R. Schult • Proceedings of the 12th International Conference on Knowledge Discovery and Data Mining, ACM SIGKDD, 2006 • Presenter: 吳建良

  2. Outline • Motivation • Cluster Model in MONIC • Cluster Transitions in MONIC • Experimental Results

  3. Motivation • Example: data records at four timepoints • Categorizing and tracing the changes in clusters • Did some clusters disappear? • Were clusters absorbed by others? • When is a cluster still the same? • When has a cluster mutated? • MONIC provides insights into the nature of cluster change across the whole clustering

  4. Cluster Model in MONIC • Data stream application • Assume re-clustering at each timepoint • Adopt arbitrary clustering methods • Monitor both changes in existing clusters and new clusters • Data record • for i≠j Initial dataset

  5. Data ageing function • Older records receive lower weights • The ageing function assigns a weight to each data record x at each timepoint t_i • A sliding window is a special case of this function: records outside the window have weight zero
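The ageing idea on this slide can be sketched in a few lines of Python. This is only an illustration: the function names, the integer timepoints, and the exponential variant with its `decay` rate are assumptions, not taken from the slides.

```python
import math


def sliding_window_age(record_time: int, now: int, window_size: int) -> float:
    """Sliding-window ageing: weight 1.0 for records that arrived within the
    last `window_size` timepoints (including `now`), weight 0.0 otherwise."""
    if record_time > now:
        raise ValueError("record_time must not be later than now")
    return 1.0 if now - record_time < window_size else 0.0


def exponential_age(record_time: int, now: int, decay: float = 0.5) -> float:
    """Hypothetical smooth alternative: the weight decays exponentially with
    the record's age, so older records count less but never drop to zero."""
    if record_time > now:
        raise ValueError("record_time must not be later than now")
    return math.exp(-decay * (now - record_time))


# With a window of size 2 at timepoint 3, records from t=2 and t=3 are kept,
# while a record from t=1 falls outside the window.
weights = [sliding_window_age(t, now=3, window_size=2) for t in (1, 2, 3)]
```

Both functions return weights between 0 and 1, so either can be plugged into a weighted cluster comparison.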

  6. Cluster Matching • Cluster overlap • Overlap of X to Y • Cluster match • Y is a match for X in Cj subject to τ
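The overlap and match formulas did not survive in this transcript. The Python sketch below assumes the weighted definition usually given for MONIC: the overlap of X to Y is the total weight of the records shared by X and Y, divided by the total weight of the records in X, and Y is a match for X if it has the highest overlap in the new clustering and that overlap reaches τ. The names `overlap`, `best_match`, and the `weight` dictionary are illustrative.

```python
from typing import Dict, Hashable, Optional, Set

Cluster = Set[Hashable]


def overlap(x: Cluster, y: Cluster, weight: Dict[Hashable, float]) -> float:
    """Weighted share of X's records that also appear in Y (asymmetric)."""
    total = sum(weight.get(r, 0.0) for r in x)
    if total == 0.0:
        return 0.0  # X has no weight left, e.g. all its records aged out
    shared = sum(weight.get(r, 0.0) for r in x & y)
    return shared / total


def best_match(
    x: Cluster,
    clustering: Dict[str, Cluster],
    weight: Dict[Hashable, float],
    tau: float,
) -> Optional[str]:
    """Name of the cluster with the highest overlap for X, or None if even
    the best overlap stays below the threshold tau."""
    best_name, best_score = None, 0.0
    for name, y in clustering.items():
        score = overlap(x, y, weight)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= tau else None


# Toy data: record ids are integers, all records weigh 1.0.
x = {1, 2, 3, 4}
new_clustering = {"Y1": {1, 2, 3, 9}, "Y2": {4, 8}}
uniform = {r: 1.0 for r in range(1, 10)}
match_loose = best_match(x, new_clustering, uniform, tau=0.5)
match_strict = best_match(x, new_clustering, uniform, tau=0.8)
```

Here three of the four records of `x` reappear in `Y1`, so `Y1` is a match at τ = 0.5 but not at τ = 0.8.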

  7. Cluster Transitions in MONIC • External transitions • Survive: • Split into multiple clusters: where

  8. Cluster Transitions in MONIC contd. • Absorb: • Disappear: • None of the above cases holds for X • Emerge:
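The conditions for the external transitions are missing from the transcript, so the following Python sketch fills them in under stated assumptions: a cluster survives if it is the only old cluster matched to its best new cluster, it is absorbed if several old clusters share the same match, it splits if no single match exists but at least two new clusters each overlap it by τ_split and together by τ, and it disappears otherwise. Overlap is unweighted here for brevity, and all function names are illustrative.

```python
from typing import Dict, List, Set, Tuple


def overlap(x: Set[int], y: Set[int]) -> float:
    """Unweighted share of X's records found in Y (assumes all weights are 1)."""
    return len(x & y) / len(x) if x else 0.0


def external_transitions(
    old: Dict[str, Set[int]],
    new: Dict[str, Set[int]],
    tau: float,
    tau_split: float,
) -> Tuple[Dict[str, Tuple[str, List[str]]], List[str]]:
    """Label every old cluster and list the new clusters that emerged."""
    transitions: Dict[str, Tuple[str, List[str]]] = {}
    matched_by: Dict[str, List[str]] = {}
    for x_name, x in old.items():
        scores = {y_name: overlap(x, y) for y_name, y in new.items()}
        best = max(scores, key=scores.get, default=None)
        if best is not None and scores[best] >= tau:
            # Survival or absorption is decided once all matches are known.
            matched_by.setdefault(best, []).append(x_name)
            continue
        parts = [n for n, s in scores.items() if s >= tau_split]
        union = set().union(*(new[n] for n in parts)) if parts else set()
        if len(parts) >= 2 and overlap(x, union) >= tau:
            transitions[x_name] = ("split", sorted(parts))
        else:
            transitions[x_name] = ("disappear", [])
    for y_name, sources in matched_by.items():
        kind = "survive" if len(sources) == 1 else "absorb"
        for x_name in sources:
            transitions[x_name] = (kind, [y_name])
    targets = {t for _, ts in transitions.values() for t in ts}
    emerged = sorted(set(new) - targets)
    return transitions, emerged


old = {"A": {1, 2, 3, 4}, "B": {5, 6, 7, 8}, "C": {9, 10},
       "D": {11, 12}, "E": {20, 21}}
new = {"P": {1, 2, 3, 4, 9, 10}, "Q": {5, 6}, "R": {7, 8},
       "T": {11, 12, 13}, "S": {30, 31}}
result, emerged = external_transitions(old, new, tau=0.6, tau_split=0.3)
```

In this toy data, A and C are both absorbed by P, B splits into Q and R, D survives as T, E disappears, and S emerges.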

  9. Cluster Transitions in MONIC contd. • Internal transitions • Size transition: weights of the records • Shrink: • Expand: • Compactness transition: data distribution • Compacter: • Diffuser:

  10. Cluster Transitions in MONIC contd. • Location transition • Shift of center: • Skewness: • No change • Properties of transitions • Transitions within the same group are mutually exclusive • Transitions from different groups can be combined • Ex: a cluster X matched by Y can become both larger and more compact.
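The way internal transitions combine can be illustrated with a small Python sketch. It assumes one-dimensional records, unweighted sizes, and three hypothetical thresholds (`eps`, `delta`, `shift`); the exact comparison formulas are not preserved in the transcript, so the forms used here are guesses made only for illustration. Skewness is left out.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def internal_transitions(
    x: Sequence[float],
    y: Sequence[float],
    eps: float = 0.1,
    delta: float = 0.1,
    shift: float = 0.1,
) -> List[str]:
    """Compare old cluster x with its match y along three groups of
    transitions; at most one label is emitted per group."""
    labels: List[str] = []
    # Size group: relative change in the number of records.
    if len(y) > len(x) * (1 + eps):
        labels.append("expand")
    elif len(y) < len(x) * (1 - eps):
        labels.append("shrink")
    # Compactness group: change in the standard deviation.
    sx, sy = pstdev(x), pstdev(y)
    if sy < sx - delta:
        labels.append("more compact")
    elif sy > sx + delta:
        labels.append("more diffuse")
    # Location group: movement of the cluster mean.
    if abs(mean(y) - mean(x)) > shift:
        labels.append("shift of center")
    return labels or ["no change"]


old_cluster = [1.0, 2.0, 3.0, 4.0]
new_cluster = [2.4, 2.5, 2.5, 2.6, 2.5, 2.5]
labels = internal_transitions(old_cluster, new_cluster)
```

The `if`/`elif` pairs keep transitions within one group mutually exclusive, while the shared list lets transitions from different groups combine, as in the slide's example of a cluster that becomes larger and more compact.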

  11. Cluster Transitions in MONIC contd. • Lifetime of clusterings • Use the lifetime of clusterings to gain insights into the evolution of the population • Survival ratio • Absorption ratio • Passforward ratio = Survival ratio + Absorption ratio
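The passforward arithmetic can be shown with a short Python sketch. It assumes each ratio is the fraction of clusters in the older clustering that received the corresponding external-transition label; the slide does not spell out the denominators, and the function and label names are illustrative.

```python
from collections import Counter
from typing import Dict


def passforward_ratios(transitions: Dict[str, str]) -> Dict[str, float]:
    """Map each old cluster's external transition label to the three ratios."""
    n = len(transitions)
    if n == 0:
        raise ValueError("the older clustering has no clusters")
    counts = Counter(transitions.values())
    survival = counts["survive"] / n
    absorption = counts["absorb"] / n
    return {
        "survival": survival,
        "absorption": absorption,
        "passforward": survival + absorption,
    }


# Four old clusters: one survives, two are absorbed, one disappears.
ratios = passforward_ratios(
    {"A": "absorb", "C": "absorb", "D": "survive", "E": "disappear"}
)
```

For this toy input the survival ratio is 0.25, the absorption ratio is 0.5, and the passforward ratio is 0.75.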

  12. Experimental Results • Dataset • ACM Digital Library section H.2.8, “Database Applications” • 6 classes: (1) data mining, (2) spatial databases, (3) image databases, (4) statistical databases, (5) scientific databases, (6) uncategorized documents • Time: 1997-2004 • Document: title and list of keywords • Feature space: the 30 most frequent words, TF×IDF-weighted • Clustering algorithm: K-means with K=10 • Data ageing • Sliding window of size 2

  13. Cluster transitions and threshold impact • Fix τ_split = 0.1 • Vary τ from 0.45 to 0.7

  14. Cluster transitions and threshold impact • Fix τ = 0.5 • Vary τ_split from 0.1 to 0.35

  15. Lifetime of clusterings • Passforward ratios for different τ
