
Support Vector Clustering Algorithm


Presentation Transcript


  1. Support Vector Clustering Algorithm presentation by: Jialiang Wu

  2. Reference paper and code website • Support Vector Clustering by Asa Ben-Hur, David Horn, Hava T. Siegelmann, and Vladimir Vapnik. • www.cs.tau.ac.il/~borens/course/ml/cluster.html by Elhanan Borenstein, Ofer, and Orit.

  3. Clustering • A clustering algorithm groups data according to the distance between points. • Points that are close to each other are allocated to the same cluster. • Clustering is most effective when the data has some geometric structure. • Outliers may cause an unwarranted increase in cluster size or a faulty clustering.

  4. Support Vector Machine (SVM) • SVM maps the data from data space to a higher-dimensional feature space through a suitable nonlinear mapping. • In a sufficiently high-dimensional feature space, data from two categories can always be separated by a hyperplane.

  5. Support Vector Machine (SVM) Main idea: 1. Much of the geometry of the data in the embedding space (relative positions) is contained in the pairwise inner products, so we can work in that space by specifying an inner product function between its points; an explicit mapping is not necessary. 2. In many cases the inner product has a simple kernel representation and can therefore be evaluated easily.
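The kernel trick on this slide is easy to make concrete. A minimal sketch in Python (NumPy only; the function name is mine, but q matches the width parameter used on the following slides): the Gaussian kernel returns the feature-space inner product of two points while only ever touching their data-space coordinates.

```python
import numpy as np

def gaussian_kernel(x, y, q):
    """Feature-space inner product <Phi(x), Phi(y)> for the Gaussian
    kernel, evaluated without ever constructing the mapping Phi."""
    return np.exp(-q * np.sum((x - y) ** 2))

x = np.array([0.0, 1.0])
y = np.array([1.0, 0.5])
print(gaussian_kernel(x, y, q=1.0))  # in (0, 1]; K(x, x) = 1 for any x
```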

  6. Support Vector Clustering (SVC) • SVC maps data from data space to a higher-dimensional feature space using a Gaussian kernel. • In feature space we look for the smallest sphere that encloses the image of the data. • When the sphere is mapped back to data space, it forms a set of contours which enclose the data points.
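Finding the smallest enclosing sphere reduces to a quadratic program with one weight β_j per data point: minimize βᵀKβ subject to Σ_j β_j = 1 and 0 ≤ β_j ≤ C, where K is the Gaussian kernel matrix (this is the dual problem from the Ben-Hur et al. paper). A minimal sketch, assuming SciPy's generic SLSQP solver rather than a dedicated QP solver:

```python
import numpy as np
from scipy.optimize import minimize

def svc_sphere(X, q, C):
    """Solve the SVC dual: min beta^T K beta  s.t.  sum(beta) = 1,
    0 <= beta_j <= C.  Returns the weights and the kernel matrix."""
    n = len(X)
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-q * sq_dists)                 # Gaussian kernel matrix
    res = minimize(lambda b: b @ K @ b, np.full(n, 1.0 / n),
                   method="SLSQP", bounds=[(0.0, C)] * n,
                   constraints=[{"type": "eq",
                                 "fun": lambda b: b.sum() - 1.0}])
    return res.x, K

def dist2(x_new, X, beta, K, q):
    """Squared feature-space distance of x_new from the sphere center;
    uses K(x, x) = 1 for the Gaussian kernel."""
    k = np.exp(-q * np.sum((X - x_new) ** 2, axis=1))
    return 1.0 - 2.0 * beta @ k + beta @ K @ beta
```

Points with 0 < β_j < C are support vectors and lie exactly on the sphere, so their distance from the center gives the radius R; a point x maps inside the contours exactly when dist2(x) ≤ R².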

  7. Support Vector Clustering (SVC) • The clustering level is controlled by: 1) q, the width parameter of the Gaussian kernel: as q increases, the number of disconnected contours increases, and with it the number of clusters. 2) C, the soft-margin constant, which allows the sphere in feature space not to enclose all points (so outliers can be left outside).
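Once the sphere is fixed, the paper turns contours into cluster labels with a geometric test: two points belong to the same cluster if every point sampled on the straight segment between them stays inside the sphere. A minimal sketch building on svc_sphere/dist2 above (the 10-sample resolution along each segment is an arbitrary choice here):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def svc_labels(X, beta, K, q, R2, n_samples=10):
    """Connect points i and j if the segment between them stays inside
    the feature-space sphere, then label the connected components."""
    n = len(X)
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            inside = all(
                dist2(X[i] + t * (X[j] - X[i]), X, beta, K, q) <= R2
                for t in np.linspace(0.0, 1.0, n_samples))
            adj[i, j] = adj[j, i] = inside
    _, labels = connected_components(csr_matrix(adj), directed=False)
    return labels
```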

  8. Clustering controlled by q

  9. Cross Dataset: q=0.5, C=1

  10. Cross Dataset: as q grows...

  11. Cross Dataset: as q grows, the number of clusters increases

  12. Circle with noise: #noise pts.=30, q=2, C=1

  13. Circle with noise: #noise pts.=30, q=2, C=1

  14. Circle with noise: #noise pts.=30, q=10, C=1

  15. Circle with noise: #noise pts.=30, q=10, C=1

  16. Circle with noise: #noise pts.=100, q=2, C=1

  17. Circle with noise: #noise pts.=100, q=2, C=1

  18. Conclusions • Points located close to one another tend to be allocated to the same cluster. • The number of clusters increases as q grows. • The appropriate q depends considerably on the specific sample points (scaling, range, scatter, etc.); there is no single q that is always appropriate. A drill-down search over q for each dataset is a solution (see the sketch after this slide), but it is very time consuming. • When the samples represent a relatively large number of classes, SVC is less efficient.
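The drill-down search mentioned in the conclusions can be written as a simple loop: increase q, re-solve, and watch the cluster count grow. A minimal sketch reusing the functions from the earlier slides (the q schedule and the support-vector tolerance are illustrative assumptions):

```python
import numpy as np

def drill_down(X, C=1.0, q_values=(0.5, 1, 2, 5, 10, 20)):
    """Scan increasing kernel widths, re-running the full SVC
    pipeline once per value of q and reporting the cluster count."""
    for q in q_values:
        beta, K = svc_sphere(X, q, C)
        # support vectors (0 < beta < C) sit exactly on the sphere
        sv = np.where((beta > 1e-6) & (beta < C - 1e-6))[0]
        if len(sv) == 0:
            continue                          # degenerate fit; skip
        R2 = np.mean([dist2(X[i], X, beta, K, q) for i in sv])
        labels = svc_labels(X, beta, K, q, R2)
        print(f"q={q}: {labels.max() + 1} clusters")
```

Each step re-solves the quadratic program and the O(n²) labeling pass, which is exactly why the slide calls the search time consuming.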

  19. Work in progress • Theoretical exploration: to find out whether there is a restriction we can impose on the inner product such that the mapped-back figure in data space is connected (i.e., has only one component). • Importance

  20. Q & A
