A generalized cluster centroid based classifier for text categorization

Presenter: Bei-YI Jiang. Authors: Guansong Pang, Shengyi Jiang. Information Processing and Management, 2013.

Presentation Transcript


  1. A generalized cluster centroid based classifier for text categorization • Presenter: Bei-YI Jiang • Authors: Guansong Pang, Shengyi Jiang • Information Processing and Management, 2013

  2. Outline • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments

  3. Motivation • KNN • With the exponential growth of online textual information, how to organize text data effectively and efficiently has become an important and demanding issue. • Rocchio • Fails to obtain an expressive categorization model due to its inherent linear separability assumption.

  4. Objectives • To strengthen the expressiveness of the Rocchio model. • To employ the improved Rocchio model to speed up the categorization process of KNN.

  5. Methodology • KNN • Rocchio
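
The two baselines named on this slide can be sketched in a few lines. This is a minimal illustration of the textbook forms of Rocchio (one mean centroid per class) and KNN (majority vote among the k most similar training documents) with cosine similarity. It is not taken from the slides or the paper: the function names, the sparse dict representation, and the toy documents are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def rocchio_fit(docs, labels):
    """Build one centroid per class: the mean of that class's vectors."""
    sums, counts = defaultdict(Counter), Counter(labels)
    for doc, label in zip(docs, labels):
        for term, weight in doc.items():
            sums[label][term] += weight
    return {c: {t: w / counts[c] for t, w in vec.items()}
            for c, vec in sums.items()}


def rocchio_predict(centroids, doc):
    """Assign the class whose centroid is most similar to the document."""
    return max(centroids, key=lambda c: cosine(centroids[c], doc))


def knn_predict(docs, labels, doc, k=3):
    """Majority vote among the k training documents most similar to doc."""
    ranked = sorted(zip(docs, labels),
                    key=lambda pair: cosine(pair[0], doc), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus: term-frequency vectors for two classes.
docs = [
    {"ball": 2, "goal": 1},
    {"ball": 1, "team": 2},
    {"stock": 2, "market": 1},
    {"market": 2, "bank": 1},
]
labels = ["sport", "sport", "finance", "finance"]
query = {"goal": 1, "team": 1}

centroids = rocchio_fit(docs, labels)
print(rocchio_predict(centroids, query))
print(knn_predict(docs, labels, query, k=3))
```

In this form Rocchio compares a test document with one centroid per class, while KNN compares it with every training document, which is the cost gap the objectives refer to.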

  6-7. Methodology

  8. Methodology • GCC • Determine the threshold

  9-15. Experiments

  16. Conclusions • Strengthens the expressiveness of the Rocchio model • GCCC and its variants achieve impressive performance • Obtains near-linear time complexity in modeling • GCCC’s modeling stage is more time-consuming

  17. Comments • Advantages • relatively stable • favorable performance • Applications • online categorization
