A generalized cluster centroid based classifier for text categorization • Presenter: Bei-YI Jiang • Authors: Guansong Pang, Shengyi Jiang • 2013, Information Processing and Management
Outlines • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments
Motivation • KNN • With the exponential growth of online textual information, how to organize text data effectively and efficiently has become an important and demanding issue. • Rocchio • Fails to obtain an expressive categorization model due to its inherent linear separability assumption.
Objectives • To strengthen the expressiveness of the Rocchio model. • To employ the improved Rocchio model to speed up the categorization process of KNN.
Methodology • KNN • Rocchio
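The two baselines named on this slide can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes documents are already term-weight vectors, uses cosine similarity, and reduces Rocchio to its common one-centroid-per-class form. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rocchio_fit(X, y):
    """Simplified Rocchio: one centroid (mean vector) per class."""
    centroids = {}
    for label in sorted(set(y)):
        members = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def rocchio_predict(centroids, x):
    """Assign x to the class whose centroid is most similar."""
    return max(centroids, key=lambda label: cosine(x, centroids[label]))


def knn_predict(X, y, x, k=3):
    """Majority vote among the k training documents most similar to x."""
    ranked = sorted(zip(X, y), key=lambda pair: cosine(x, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy term-weight vectors (hypothetical data, three terms per document).
X_train = [[3, 0, 1], [2, 0, 0], [4, 1, 0], [0, 3, 1], [0, 2, 0], [1, 4, 0]]
y_train = ["sports", "sports", "sports", "finance", "finance", "finance"]

centroids = rocchio_fit(X_train, y_train)
for doc in ([5, 0, 0], [0, 1, 0]):
    print(doc, rocchio_predict(centroids, doc), knn_predict(X_train, y_train, doc))
```

Rocchio compares each test document with one vector per class, so it is fast but can only draw linear boundaries between classes. KNN compares each test document with every training document, which is more flexible but slower at categorization time.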
Methodology • GCC • Determine the threshold
Conclusions • Strengthens the expressiveness of the Rocchio model • GCCC and its variants achieve impressive performance • Obtains near-linear time complexity in modeling • GCCC's modeling stage is more time-consuming
Comments • Advantages • relatively stable • favorable performance • Applications • online categorization