Automatically Determining the Number of Clusters in Unlabeled Data Sets

Presenter: Lin, Shu-Han. Authors: Liang Wang, Christopher Leckie, Kotagiri Ramamohanarao, and James Bezdek. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2009.

Presentation Transcript


  1. Automatically Determining the Number of Clusters in Unlabeled Data Sets • Presenter: Lin, Shu-Han • Authors: Liang Wang, Christopher Leckie, Kotagiri Ramamohanarao, and James Bezdek • IEEE Transactions on Knowledge and Data Engineering (TKDE), 2009

  2. Outline • Motivation • Objective • Methodology • Experiments • Conclusion • Comments

  3. Motivation • The “reordered dissimilarity image” (RDI) • How to automatically estimate the number of clusters in an unlabeled data set?

  4. Objectives • Extract dark blocks from the RDI

  5. Methodology – VAT (visual assessment of cluster tendency)

  6. Methodology – VAT
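
The slides show VAT only as figures. As a rough illustration, the sketch below follows the commonly described VAT reordering: start from an object involved in the largest dissimilarity, then repeatedly append the not-yet-ordered object that is closest to the already ordered set. The function names `vat_order` and `reorder` and the toy data are assumptions made for this example; they do not come from the presentation.

```python
def vat_order(D):
    """Return a VAT-style ordering of object indices for a square dissimilarity matrix D."""
    n = len(D)
    # Start from an object that takes part in the largest dissimilarity.
    start = max(range(n), key=lambda i: max(D[i]))
    order = [start]
    remaining = set(range(n)) - {start}
    # nearest[j] = smallest dissimilarity between j and any already-ordered object.
    nearest = {j: D[start][j] for j in remaining}
    while remaining:
        # Append the unordered object closest to the ordered set (ties: lowest index).
        nxt = min(remaining, key=lambda j: (nearest[j], j))
        order.append(nxt)
        remaining.remove(nxt)
        for j in remaining:
            nearest[j] = min(nearest[j], D[nxt][j])
    return order


def reorder(D, order):
    """Permute rows and columns of D to obtain the reordered dissimilarity matrix."""
    return [[D[i][j] for j in order] for i in order]


# Toy data (hypothetical): two interleaved 1-D clusters near 0 and near 10.
points = [0.0, 10.0, 0.5, 10.5, 1.0, 11.0]
D = [[abs(p - q) for q in points] for p in points]
order = vat_order(D)
rdi = reorder(D, order)
```

On this toy input the ordering groups indices 0, 2, 4 before 1, 3, 5, so the reordered matrix has two low-valued (dark) 3x3 blocks on its diagonal.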

  7. Methodology – DBE (dark block extraction) • Steps 1 to 4

  8. Methodology – DBE 1. Dissimilarity transformation and image segmentation • Transformation f(t) • graythresh function (Matlab): σ • Figures: before and after
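
The slide names a transformation f(t), a scale σ, and Matlab's `graythresh`, which implements Otsu's method, but gives no formulas. The sketch below is one plausible reading, under clearly labeled assumptions: the dissimilarities are scaled to [0, 1], σ is taken as the Otsu level of the scaled matrix, f(t) = 1 - exp(-t/σ) is assumed as the transformation, and a second Otsu threshold turns the result into a binary image. The names `otsu_threshold` and `transform_and_segment` are hypothetical.

```python
import math


def otsu_threshold(values, bins=256):
    """Otsu's threshold for values in [0, 1]; returns a level in (0, 1]."""
    hist = [0] * bins
    for v in values:
        hist[min(int(v * bins), bins - 1)] += 1
    total = len(values)
    sum_all = sum(k * h for k, h in enumerate(hist))
    best_k, best_between = 0, -1.0
    w0 = 0    # number of values in bins 0..k
    sum0 = 0  # sum of bin indices over those values
    for k in range(bins):
        w0 += hist[k]
        sum0 += k * hist[k]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2  # between-class variance (unnormalised)
        if between > best_between:
            best_between, best_k = between, k
    return (best_k + 1) / bins


def transform_and_segment(D):
    """Assumed pipeline: scale, apply f(t) = 1 - exp(-t / sigma), then binarise."""
    d_max = max(max(row) for row in D)  # assumes D has at least one non-zero entry
    norm = [[d / d_max for d in row] for row in D]
    sigma = otsu_threshold([v for row in norm for v in row])  # assumption: sigma from Otsu
    transformed = [[1.0 - math.exp(-v / sigma) for v in row] for row in norm]
    level = otsu_threshold([v for row in transformed for v in row])
    # 1 marks a "dark" (low-dissimilarity) pixel, 0 marks everything else.
    return [[1 if v < level else 0 for v in row] for row in transformed]


# Toy data (hypothetical): two well-separated 1-D clusters, already in VAT-like order.
points = [0, 1, 2, 100, 101, 102]
D = [[abs(p - q) for q in points] for p in points]
binary = transform_and_segment(D)
```

Which polarity the binary image uses (dark pixels as 1 or as 0) is a choice made here for readability, not something the slide states.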

  9. Methodology – DBE 2. Directional morphological filtering of the binary image • Examples: a = 2%, a = 1% • Symmetric: along horizontal and vertical directions • Linear: along the same direction
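
The slide does not say which morphological operator is used. As a hedged illustration, the sketch below applies a binary opening with a linear structuring element, first along rows and then along columns, with the element length set to a fraction `a` of the image size. It relies on the fact that a 1-D opening with a line of length L keeps exactly the runs of ones that are at least L pixels long. The names `open_line` and `directional_open` are hypothetical, and the choice of opening is an assumption.

```python
def open_line(line, length):
    """1-D binary opening with a flat line element: keep runs of 1s of at least `length`."""
    out = [0] * len(line)
    i = 0
    while i < len(line):
        if line[i]:
            j = i
            while j < len(line) and line[j]:
                j += 1
            if j - i >= length:
                out[i:j] = [1] * (j - i)
            i = j
        else:
            i += 1
    return out


def directional_open(binary, a):
    """Open along the horizontal direction, then along the vertical direction."""
    n = len(binary)
    length = max(1, round(a * n))  # element length as a fraction of the image size
    rows = [open_line(r, length) for r in binary]
    cols = [open_line(list(c), length) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


# Toy image (hypothetical): two 4x4 blocks on the diagonal plus two isolated noise pixels.
clean = [[1 if (i < 4) == (j < 4) else 0 for j in range(8)] for i in range(8)]
noisy = [row[:] for row in clean]
noisy[0][6] = 1
noisy[6][0] = 1
filtered = directional_open(noisy, a=0.25)
```

The toy image is tiny, so `a` is much larger here than the 1% and 2% shown on the slide; on this input the two isolated pixels are removed and the two blocks are left intact.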

  10. Methodology – DBE 3. Distance transform and diagonal projection of the filtered image • Distance to the nearest non-zero pixel
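
To make this step concrete, here is a minimal sketch under stated assumptions: the distance transform assigns each pixel its Euclidean distance to the nearest non-zero pixel (brute force, suitable only for small images), the non-zero pixels are taken to be the pixels outside the dark blocks, and the projection sums the transformed image along lines perpendicular to the main diagonal. The names `distance_transform` and `diagonal_projection` are hypothetical.

```python
import math


def distance_transform(binary):
    """Euclidean distance from each pixel to the nearest non-zero pixel (brute force)."""
    n_rows, n_cols = len(binary), len(binary[0])
    nonzero = [(r, c) for r in range(n_rows) for c in range(n_cols) if binary[r][c]]
    # Assumes the image contains at least one non-zero pixel.
    return [[min(math.hypot(r - p, c - q) for p, q in nonzero)
             for c in range(n_cols)] for r in range(n_rows)]


def diagonal_projection(image):
    """Sum pixel values along anti-diagonals, giving a 1-D signal of length 2n - 1."""
    n = len(image)
    signal = [0.0] * (2 * n - 1)
    for r in range(n):
        for c in range(n):
            signal[r + c] += image[r][c]
    return signal


# Toy image (hypothetical): two 4x4 dark blocks on the diagonal of an 8x8 image.
dark = [[1 if (i < 4) == (j < 4) else 0 for j in range(8)] for i in range(8)]
# Assumption: pixels outside the dark blocks are the non-zero ones,
# so distances grow towards the interior of each block.
background = [[1 - v for v in row] for row in dark]
projection = diagonal_projection(distance_transform(background))
```

For this toy image the projection is `[4, 6, 7, 6, 4, 2, 1, 0, 1, 2, 4, 6, 7, 6, 4]`: one peak per dark block, separated by a valley where the blocks meet.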

  11. Methodology – DBE 4. Detection of major peaks and valleys in the projection signal • Smooth (parameter: a) • Major “peaks/valleys” (parameter: a)
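
The slide does not specify the smoothing filter or the rule for calling a peak "major". The sketch below swaps in simple stand-ins: a centred moving average whose width is a fraction `a` of the signal length, followed by local-maximum detection that keeps the taller of any two peaks closer than the same width. The number of surviving peaks is then read as the estimated number of clusters. The names `smooth` and `major_peaks` and the example signal are assumptions.

```python
def smooth(signal, window):
    """Centred moving average; the window is truncated near the ends."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def major_peaks(signal, min_separation):
    """Local maxima, keeping the taller of any two closer than min_separation."""
    last = len(signal) - 1
    candidates = [i for i in range(len(signal))
                  if (i == 0 or signal[i] > signal[i - 1])
                  and (i == last or signal[i] >= signal[i + 1])]
    peaks = []
    # Visit candidates from tallest to shortest so taller peaks win conflicts.
    for i in sorted(candidates, key=lambda i: -signal[i]):
        if all(abs(i - p) >= min_separation for p in peaks):
            peaks.append(i)
    return sorted(peaks)


# Hypothetical projection signal with two humps, one per dark block.
projection = [4, 6, 7, 6, 4, 2, 1, 0, 1, 2, 4, 6, 7, 6, 4]
a = 0.2
width = max(1, round(a * len(projection)))  # both steps share the single parameter a
smoothed = smooth(projection, width)
peaks = major_peaks(smoothed, width)
estimated_clusters = len(peaks)
```

With this signal the peaks fall at indices 2 and 12, giving an estimate of two clusters.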

  12. Experiments – Synthetic data sets

  13. Experiments – Compare with CCE

  14. Experiments – Compare with CCE • Synthetic data sets • Real data sets

  15. Conclusions • Most methods prefer “larger” rather than “smaller” clusters • The DBE • (Nearly) automatically estimates the number of clusters • Has just one easy-to-set parameter: a

  16. Comments • Advantage • A visual assessment of cluster tendency (VAT) • Combines the cluster analysis problem with image processing techniques • Drawback • … • Application • …