A fuzzy conceptualization model for text mining with application in opinion polarity classification. Presenter: Jian-Ren Chen. Authors: Sheng-Tun Li, Fu-Ching Tsai. 2013, KBS.
Outlines • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments
Motivation • Most existing document classification algorithms are easily affected by ambiguous terms. • A classifier's ability to disambiguate is thus as important as its ability to classify accurately. • Application: opinion polarity classification
Objectives • We propose a concept-driven text classification approach based on Formal Concept Analysis (FCA) that trains a classifier on concepts instead of documents, so as to reduce the inherent ambiguities. • We further use fuzzy formal concept analysis (FFCA) to take uncertain information into consideration.
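One common way to handle uncertain information in a fuzzy formal context is to keep a membership degree for each (review, term) pair and cut it at a threshold α. The sketch below illustrates that idea only; the membership values, the `alpha_cut` helper, and the choice of `>=` are assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical fuzzy formal context: membership degree of each term in each review.
fuzzy_context = {
    "Review1": {"Phenomenal": 0.9, "Love": 0.3},
    "Review2": {"Awful": 0.8},
    "Review3": {"Awful": 0.6, "Love": 0.2},
}


def alpha_cut(fuzzy, alpha):
    """Keep only the terms whose membership degree is at least alpha.

    Returns a crisp context: review -> set of retained terms.
    """
    return {
        review: {term for term, mu in terms.items() if mu >= alpha}
        for review, terms in fuzzy.items()
    }


# A stricter alpha drops weakly associated terms such as "Love".
crisp_strict = alpha_cut(fuzzy_context, 0.5)
crisp_loose = alpha_cut(fuzzy_context, 0.2)
```

Because the retained terms change with α, a value tuned on one collection may not suit another.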
Formal concept analysis • Objects: {Review6, Review7}, Attributes: {Phenomenal, Fantastic, Love} => formal concept • negative class: "Awful" {Review2, Review3} • neutral class: "Cover" {Review5} • positive class: "Phenomenal", "Fantastic" and "Love" {Review1, Review4, Review6, Review7}
Formal concept analysis • positive class: {Review1, Review4, Review6, Review7} • negative class: {Review2, Review3} • neutral class: {Review5}
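The notion of a formal concept used in the review example can be checked with a few lines of Python. This is a minimal sketch: the slide does not give the full review-term table, so the `context` below is a hypothetical one chosen to be consistent with the classes listed, and the helper names are illustrative.

```python
# Hypothetical formal context (review -> terms it contains), consistent with the slide.
context = {
    "Review1": {"Phenomenal"},
    "Review2": {"Awful"},
    "Review3": {"Awful"},
    "Review4": {"Fantastic", "Love"},
    "Review5": {"Cover"},
    "Review6": {"Phenomenal", "Fantastic", "Love"},
    "Review7": {"Phenomenal", "Fantastic", "Love"},
}


def common_attributes(objects):
    """Terms shared by every review in `objects` (the intent)."""
    if not objects:
        # By convention, the empty object set shares all attributes.
        return set().union(*context.values())
    return set.intersection(*(context[o] for o in objects))


def common_objects(attributes):
    """Reviews that contain every term in `attributes` (the extent)."""
    return {o for o, terms in context.items() if set(attributes) <= terms}


def is_formal_concept(objects, attributes):
    """A pair (objects, attributes) is a formal concept when each set determines the other."""
    return (common_attributes(objects) == set(attributes)
            and common_objects(attributes) == set(objects))


positive_pair = is_formal_concept(
    {"Review6", "Review7"}, {"Phenomenal", "Fantastic", "Love"}
)
```

In this assumed context, ({Review6, Review7}, {Phenomenal, Fantastic, Love}) is a formal concept, while ({Review1}, {Phenomenal}) is not, because Review6 and Review7 also contain "Phenomenal".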
Methodology • Term measures: tf-idf, Inverted Conformity Frequency (ICF), Uniformity (Uni) • Thresholds: tf-idf > 26, ICF < log(2), Uni > 0.2
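The three thresholds can be read as a term filter. The sketch below assumes that the scores are already computed (the slide's formulas for ICF and Uni are not reproduced here), that a term must pass all three conditions, and that log is the natural logarithm; all three are assumptions, and the scores are made-up values for illustration.

```python
import math

TFIDF_MIN = 26.0
ICF_MAX = math.log(2)  # assumption: natural logarithm
UNI_MIN = 0.2


def select_terms(scores):
    """Return the terms whose (tf-idf, ICF, Uni) scores pass all three thresholds."""
    return [
        term
        for term, (tfidf, icf, uni) in scores.items()
        if tfidf > TFIDF_MIN and icf < ICF_MAX and uni > UNI_MIN
    ]


# Hypothetical precomputed scores: term -> (tf-idf, ICF, Uni).
scores = {
    "fantastic": (31.0, 0.30, 0.45),
    "cover": (28.0, 1.10, 0.50),  # rejected: ICF is above log(2)
    "awful": (40.0, 0.10, 0.60),
    "the": (5.0, 0.00, 0.90),     # rejected: tf-idf is below 26
}

selected = select_terms(scores)
```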
Experiments - Data set and evaluation • Data set: • Reuters-21578 • movie review • e-book review • Evaluation
Conclusions • FFCM successfully reduces the impact of textual ambiguity. • The experimental results show that FFCM outperforms other state-of-the-art algorithms on both Reuters-21578 and the two opinion polarity collections.
Comments • Advantages • formal concepts play an important role • Disadvantages • α may differ across datasets • focuses only on single-class classification • Applications • text mining