
Presenter: Keng-Yu Lin. Authors: Amir Ahmad, Lipika Dey. PRL, 2011

A k-means type clustering algorithm for subspace clustering of mixed numeric and categorical datasets.


Presentation Transcript


  1. A k-means type clustering algorithm for subspace clustering of mixed numeric and categorical datasets. Presenter: Keng-Yu Lin. Authors: Amir Ahmad, Lipika Dey. PRL, 2011

  2. Outline • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments

  3. Motivation Almost all subspace clustering algorithms proposed so far are designed for numeric datasets.

  4. Objectives • This paper presents a k-means type clustering algorithm that finds clusters in data subspaces of mixed numeric and categorical datasets.

  5. Methodology • k-means clustering algorithm
     1. Place K points into the space represented by the objects that are being clustered. These points represent the initial group centroids.
     2. Assign each object to the group that has the closest centroid.
     3. When all objects have been assigned, recalculate the positions of the K centroids.
     4. Repeat Steps 2 and 3 until the centroids no longer move. This produces a separation of the objects into groups, from which the metric to be minimized can be calculated.
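The four steps on this slide can be sketched in a few lines of Python. This is only a minimal illustration of plain k-means on numeric data with squared Euclidean distance; it is not the mixed-data subspace algorithm of Ahmad and Dey, whose distance measure and attribute weighting the slides do not spell out. The names `kmeans`, `squared_distance`, `max_iter`, and `seed` are hypothetical, chosen for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two numeric points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means on numeric tuples (illustrative sketch only)."""
    rng = random.Random(seed)
    # Step 1: use K of the objects as the initial centroids (an assumption;
    # the slide only says "place K points into the space").
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(max_iter):
        # Step 2: assign each object to the group with the closest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Step 3: recalculate each centroid as the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                # An empty group keeps its previous centroid.
                new_centroids.append(centroids[j])
        # Step 4: stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels
```

Calling `kmeans(points, k)` on a list of equal-length numeric tuples returns the final centroids and one group label per object. Categorical attributes would need a different distance and a different centroid definition, which is the gap the presented paper addresses.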

  6. Methodology

  7. Experiments • Vote dataset • Error rate: 4.8% • Zaki et al. error rate: 3.8%

  8. Experiments • Mushroom dataset • Error rate: 4.1% • Zaki et al. error rate: 0.3%

  9. Experiments • DNA dataset • Error rate: 17%

  10. Experiments • Australian credit data • Error rate: 13.9% • Huang et al. (2005) error rate: 15%

  11. Conclusions • This paper presented a k-means type algorithm for subspace clustering of mixed numeric and categorical data.

  12. Comments • Advantage • Applications • Subspace clustering.
