Selecting Attributes for Sentiment Classification Using Feature Relation Networks. Presenter: Jian-Ren Chen. Authors: Ahmed Abbasi, Stephen France, Zhu Zhang, and Hsinchun Chen. 2011, IEEE TKDE. Outline: Motivation, Objectives, Methodology, Experiments, Conclusions, Comments.
Selecting Attributes for Sentiment Classification Using Feature Relation Networks Presenter: Jian-Ren Chen Authors: Ahmed Abbasi, Stephen France, Zhu Zhang, and Hsinchun Chen 2011, IEEE TKDE
Outline • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments
Motivation Sentiment analysis has emerged as a method for mining opinions from text archives. It is a challenging problem: • it requires large quantities of linguistic features • integrating heterogeneous n-gram categories into a single feature set introduces noise, redundancy, and computational limitations • both polarity and intensity matter, e.g. "I don't like you" vs. "I hate you"
n-gram (Markov model) • Weather example: sunny, cloudy, rainy • 美麗 ("beautiful") vs. 美痢 (a miswritten homophone) • "HAPAX" and "DIS" tags: "I hate Jim" is replaced with "I hate HAPAX"
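The tag replacement and n-gram extraction on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that "HAPAX" marks tokens seen exactly once in the corpus and "DIS" marks tokens seen exactly twice, and the function names `tag_legomena` and `ngrams` as well as the toy corpus are hypothetical.

```python
from collections import Counter


def tag_legomena(docs):
    """Replace rare tokens with placeholder tags.

    Assumption (not stated on the slide): HAPAX = token occurs once in the
    whole corpus, DIS = token occurs twice.
    """
    counts = Counter(tok for doc in docs for tok in doc)
    tags = {1: "HAPAX", 2: "DIS"}
    return [[tags.get(counts[tok], tok) for tok in doc] for doc in docs]


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical toy corpus: "Jim" occurs once, "rain" occurs twice.
docs = [
    ["I", "hate", "Jim"],
    ["I", "hate", "rain", "and", "more", "rain"],
    ["I", "hate", "it"],
]
tagged = tag_legomena(docs)
bigrams = ngrams(tagged[0], 2)
```

With this toy corpus the first document becomes "I hate HAPAX", matching the slide's example, and its bigrams are "I hate" and "hate HAPAX".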
Objectives • The Feature Relation Network (FRN) considers semantic information and also leverages the syntactic relationships between n-gram features. • Goal: enhanced sentiment classification on extended sets of heterogeneous n-gram features.
Methodology - Subsumption Relations A subsumes B (A → B) Example: "I love chocolate" • unigrams: I, LOVE, CHOCOLATE • bigrams: I LOVE, LOVE CHOCOLATE • trigrams: I LOVE CHOCOLATE Are the bigrams and trigrams kept? It depends on their weight: a higher-order n-gram is retained only if its weight exceeds that of its more general lower-order counterparts by a threshold t.
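The subsumption rule above can be sketched as a simple filter. This is a simplified illustration under stated assumptions, not the FRN algorithm itself: the feature weights are hypothetical numbers, the helper names `sub_ngrams` and `apply_subsumption` are invented for this example, and each higher-order n-gram is compared against the best-weighted lower-order n-gram it contains.

```python
def sub_ngrams(feature):
    """Return every shorter contiguous n-gram contained in a feature."""
    toks = feature.split()
    n = len(toks)
    return {
        " ".join(toks[i:i + k])
        for k in range(1, n)
        for i in range(n - k + 1)
    }


def apply_subsumption(weights, t):
    """Keep a higher-order n-gram only if its weight exceeds the weight of
    its best lower-order counterpart by more than t (assumed comparison)."""
    kept = {}
    for feat, w in weights.items():
        lower = [weights[s] for s in sub_ngrams(feat) if s in weights]
        if not lower or w - max(lower) > t:
            kept[feat] = w
    return kept


# Hypothetical weights for the slide's "I love chocolate" example.
weights = {
    "I": 0.01,
    "LOVE": 0.20,
    "CHOCOLATE": 0.05,
    "I LOVE": 0.21,
    "LOVE CHOCOLATE": 0.30,
    "I LOVE CHOCOLATE": 0.31,
}
kept = apply_subsumption(weights, t=0.05)
```

With these made-up weights, "I LOVE" adds almost nothing over "LOVE" and is dropped, "LOVE CHOCOLATE" clears the threshold and is kept, and the trigram is dropped because it barely improves on "LOVE CHOCOLATE".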
Methodology - Parallel Relations A is parallel to B (A - B) • POS tag: "ADMIRE_VP" → "like" • semantic class: "SYN-Affection" → "love" If A and B have a correlation coefficient greater than some threshold p, one of the two attributes is removed to avoid redundancy.
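The parallel-relation rule can be sketched in the same spirit. The slide does not say which correlation measure is used or which of the two attributes is removed, so this sketch assumes Pearson correlation over per-document occurrence vectors and keeps the attribute with the higher weight. The names `pearson` and `apply_parallel`, the vectors, and the weights are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    # A constant vector has no defined correlation; treat it as 0.
    return cov / (sx * sy) if sx and sy else 0.0


def apply_parallel(vectors, weights, p):
    """Visit features from highest to lowest weight and drop any feature
    whose correlation with an already kept feature is greater than p."""
    kept = []
    for feat in sorted(vectors, key=lambda f: weights[f], reverse=True):
        if all(pearson(vectors[feat], vectors[k]) <= p for k in kept):
            kept.append(feat)
    return kept


# Hypothetical occurrence vectors over six documents.
vectors = {
    "like": [1, 0, 1, 1, 0, 0],
    "ADMIRE_VP": [1, 0, 1, 1, 0, 0],  # fires on the same documents as "like"
    "chocolate": [0, 1, 1, 0, 0, 1],
}
weights = {"like": 0.30, "ADMIRE_VP": 0.25, "chocolate": 0.10}
kept_features = apply_parallel(vectors, weights, p=0.90)
```

Here "ADMIRE_VP" is removed because it is redundant with the higher-weighted "like", while "chocolate" survives because it is only weakly (negatively) correlated with "like".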
Experiments - Parameters • t: 0.0005, 0.005, 0.05, and 0.5 • p: 0.80, 0.90, and 1.00
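The parameter values above can be enumerated as a grid. This sketch assumes every t value is paired with every p value, which the slide does not state explicitly; the names `T_VALUES`, `P_VALUES`, and `grid` are illustrative only.

```python
from itertools import product

# Threshold values listed on the slide.
T_VALUES = (0.0005, 0.005, 0.05, 0.5)  # subsumption threshold t
P_VALUES = (0.80, 0.90, 1.00)          # parallel-relation threshold p

# Assumption: a full grid, i.e. every (t, p) combination is a candidate setting.
grid = list(product(T_VALUES, P_VALUES))

for t, p in grid:
    print(f"t={t}, p={p}")
```

Under that assumption there are 4 x 3 = 12 candidate settings to evaluate.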
Conclusions • FRN had significantly higher best accuracy and best percentage within-one across three testbeds. • The ablation and parameter testing results show that the subsumption and parallel relation thresholds play an important role.
Comments • Advantages - accurate, computationally efficient • Disadvantage - sensitive to ablation and parameter settings • Applications - sentiment classification - feature selection method