Leveraging Sentiment to Compute Word Similarity
GWC 2012, Matsue, Japan
Authors: Balamurali A R*,+, Subhabrata Mukherjee+, Akshat Malu+, Pushpak Bhattacharyya+
*IITB-Monash Research Academy, IIT Bombay
+Dept. of Computer Science and Engineering, IIT Bombay
Slides Acknowledgement: Akshat Malu
Roadmap • Similarity Metrics • SenSim Metric: Our Approach • Evaluation • Results & Conclusion
Similarity Metrics • An unavoidable component in an NLP system • Examples: Word Sense Disambiguation (Banerjee & Pedersen, 2002), malapropism detection (Hirst & St-Onge, 1997) • Underlying principle: words are similar if they are distributionally similar in terms of their meaning • Example: Refuge and Asylum are similar • Existing approaches: compute the similarity of a word pair based on its meaning (definition) alone
Similarity Metrics – Is meaning alone enough? [Figure: the word pair Asylum/Refuge contrasted with the connotation "Mad house", suggesting that meaning alone misses sentiment differences between similar words]
Similarity & Sentiment • Our hypothesis: "Knowing the sentiment content of the words is beneficial in measuring their similarity"
SenSim Metric • Uses sentiment along with the meaning of the words to calculate their similarity • The gloss is the most informative piece of information about a synset • We leverage it to calculate both the meaning-based similarity and the sentiment similarity of the word pair • We use a gloss-vector based approach with cosine similarity in our metric
Gloss Vector • A gloss vector represents all the words of the gloss as a vector (a minimal sketch follows below) • Assumption: the synset for the word is already known • Each dimension of the gloss vector holds the sentiment score of the respective content word • Sentiment scores are obtained from different scoring functions based on an external lexicon • SentiWordNet 1.0 is used as the external lexicon • Problem: the vector thus formed is too sparse
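A minimal sketch of this construction, not the authors' code: `senti_lexicon` is a hypothetical dict mapping a content word to its SentiWordNet-style (pos, neg) scores, and `score_fn` is one of the SD/SM/TD/TM functions defined two slides ahead.

```python
def gloss_vector(gloss, senti_lexicon, score_fn, stopwords=frozenset()):
    """One dimension per content word of the gloss, valued by its sentiment score."""
    vec = {}
    for word in gloss.lower().split():
        if word in stopwords or word not in senti_lexicon:
            continue  # skip stopwords and words absent from the lexicon
        pos, neg = senti_lexicon[word]
        vec[word] = score_fn(pos, neg)
    return vec
```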
Augmenting Gloss Vector • To counter the sparsity of gloss vectors, they are augmented with the glosses of related synsets • The context is further extended by adding the glosses of synsets of the words present in the gloss of the original word • [Figure: the gloss of Refuge, "a shelter from danger or hardship", is progressively augmented via relations such as hypernymy (Harborage) and hyponymy (Area, Country), growing to e.g. "a shelter from danger or hardship; a particular geographical region of indefinite boundary; a place of refuge; a structure that provides privacy and protection from danger; the condition of being susceptible to harm or injury; a state of misfortune or affliction"]
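A sketch of the augmentation step using NLTK's WordNet interface (the paper's own tooling may differ); a depth-1 recursion pulls in the glosses of hypernyms and hyponyms:

```python
from nltk.corpus import wordnet as wn

def augmented_gloss(synset, depth=1):
    """Concatenate a synset's gloss with the glosses of related synsets
    (here hypernyms and hyponyms) to counter vector sparsity."""
    glosses = [synset.definition()]
    if depth > 0:
        for related in synset.hypernyms() + synset.hyponyms():
            glosses.append(augmented_gloss(related, depth - 1))
    return "; ".join(glosses)

# e.g. the gloss of 'refuge' grows from "a shelter from danger or hardship"
# to also include the glosses of its related synsets
print(augmented_gloss(wn.synsets('refuge')[0]))
```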
Scoring Functions
• Sentiment Difference (SD): difference between the positive and negative sentiment values
  Score_SD(A) = SWN_pos(A) - SWN_neg(A)
• Sentiment Max (SM): the greater of the positive and negative sentiment values
  Score_SM(A) = max(SWN_pos(A), SWN_neg(A))
• Sentiment Threshold Difference (TD): same as SD but with a minimum threshold value
  Score_TD(A) = sign(max(SWN_pos(A), SWN_neg(A))) × (1 + |SWN_pos(A) - SWN_neg(A)|)
• Sentiment Threshold Max (TM): same as SM but with a minimum threshold value
  Score_TM(A) = sign(max(SWN_pos(A), SWN_neg(A))) × (1 + |max(SWN_pos(A), SWN_neg(A))|)
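The four scoring functions transcribe directly into code; `pos` and `neg` stand for a word's SentiWordNet positive and negative scores:

```python
def sign(x):
    return -1.0 if x < 0 else 1.0

def score_sd(pos, neg):   # Sentiment Difference
    return pos - neg

def score_sm(pos, neg):   # Sentiment Max
    return max(pos, neg)

def score_td(pos, neg):   # Sentiment Threshold Difference
    return sign(max(pos, neg)) * (1 + abs(pos - neg))

def score_tm(pos, neg):   # Sentiment Threshold Max
    return sign(max(pos, neg)) * (1 + abs(max(pos, neg)))
```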
SenSim Metric
SenSim_x(A, B) = cosine(glossvec(sense(A)), glossvec(sense(B)))
glossvec = 1:score_x(1), 2:score_x(2), ..., n:score_x(n)
score_x(Y) = sentiment score of word Y using scoring function x
x = scoring function of type SD/SM/TD/TM
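Putting the pieces together, a sketch of the metric that reuses the hypothetical `gloss_vector` and `augmented_gloss` helpers from the earlier slides, with sparse vectors stored as dicts keyed by word:

```python
import math

def cosine(u, v):
    """Cosine similarity for sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def sensim(sense_a, sense_b, senti_lexicon, score_fn):
    """SenSim_x(A, B): cosine of the sentiment-scored gloss vectors."""
    vec_a = gloss_vector(augmented_gloss(sense_a), senti_lexicon, score_fn)
    vec_b = gloss_vector(augmented_gloss(sense_b), senti_lexicon, score_fn)
    return cosine(vec_a, vec_b)
```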
Evaluation
• Intrinsic: correlation with human annotators; correlation with other metrics
• Extrinsic: synset replacement using similarity metrics
• Intrinsic setup: comparing scores given by SenSim with those given by human annotators; all scores normalized to a scale of 1-5 (1 = least similar, 5 = most similar)
• Two annotation strategies: (a) annotation based on meaning alone, (b) annotation based on sentiment and meaning combined
Synset Replacement using Similarity Metrics
• Addresses the unknown-feature problem in supervised classification: synsets in the test corpus that never occur in the training corpus
• Metrics used: LIN (Lin, 1998), LCH (Leacock and Chodorow, 1998), Lesk (Banerjee and Pedersen, 2002)
• [Flowchart: build Train_Synset_List from the training corpus and Test_Synset_List from the test corpus; if a test synset T is not in Train_Synset_List, use similarity metric S to find a similar training synset R and replace T with R, producing a new test corpus; see the sketch below]
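A sketch of the replacement procedure in the flowchart, with `similarity` standing in for any of the metrics above (LIN, LCH, Lesk, or SenSim):

```python
def replace_unknown_synsets(test_synsets, train_synsets, similarity):
    """Replace each test synset unseen in training with the most
    similar training synset under the given metric."""
    train = set(train_synsets)
    result = []
    for t in test_synsets:
        if t in train:
            result.append(t)  # known feature: keep as-is
        else:
            result.append(max(train, key=lambda r: similarity(t, r)))
    return result
```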
Results – Intrinsic Evaluation (1/4)
• Sentiment as a parameter for finding similarity (1/2)
• Adding sentiment to the context yields better correlation among the annotators
• The decrease in correlation for nouns when sentiment is added may be because sentiment plays a less important role for this part of speech
[Table: Pearson correlation coefficient between two annotators for various annotation strategies]
Results – Intrinsic Evaluation (2/4)
• Sentiment as a parameter for finding similarity (2/2)
[Table: Pearson correlation (r) of various metrics with the gold-standard data; all experiments performed on gold-standard data consisting of 48 sense-marked word pairs]
Results – Extrinsic Evaluation (3/4)
• Effect of SenSim on the synset replacement strategy
• The baseline is the experiment in which no synset replacement is performed
[Table: classification results of the synset replacement experiment using different similarity metrics; PP = Positive Precision (%), NP = Negative Precision (%), PR = Positive Recall (%), NR = Negative Recall (%)]
Results – Extrinsic Evaluation (4/4)
• Effect of SenSim on the synset replacement strategy
• The improvement is only marginal because no complex features are used for training the classifier
[Table: classification results of the synset replacement experiment using different similarity metrics; PP = Positive Precision (%), NP = Negative Precision (%), PR = Positive Recall (%), NR = Negative Recall (%)]
Conclusions
• Proposed that sentiment content can aid similarity measurement, which to date has been done on the basis of meaning alone
• Verified this hypothesis by measuring the correlation between annotators under different annotation strategies
• Annotator correlation was higher when sentiment was included as an additional parameter for similarity than with semantic annotation alone
• SenSim, built on this hypothesis, performs better than existing metrics, which fail to account for sentiment when calculating similarity
References (1/2)
• Balamurali, A., Joshi, A. & Bhattacharyya, P. (2011), Harnessing WordNet senses for supervised sentiment classification, in 'Proc. of EMNLP-2011'.
• Banerjee, S. & Pedersen, T. (2002), An adapted Lesk algorithm for word sense disambiguation using WordNet, in 'Proc. of CICLing-02'.
• Banerjee, S. & Pedersen, T. (2003), Extended gloss overlaps as a measure of semantic relatedness, in 'Proc. of IJCAI-03'.
• Esuli, A. & Sebastiani, F. (2006), SentiWordNet: A publicly available lexical resource for opinion mining, in 'Proc. of LREC-06', Genova, IT.
• Grishman, R. (2001), Adaptive information extraction and sublanguage analysis, in 'Proc. of IJCAI-01'.
• Hirst, G. & St-Onge, D. (1997), 'Lexical chains as representations of context for the detection and correction of malapropisms'.
• Jiang, J. J. & Conrath, D. W. (1997), Semantic similarity based on corpus statistics and lexical taxonomy, in 'Proc. of ROCLING X'.
• Leacock, C. & Chodorow, M. (1998), Combining local context with WordNet similarity for word sense identification, in 'WordNet: A Lexical Reference System and its Application'.
• Leacock, C., Miller, G. A. & Chodorow, M. (1998), 'Using corpus statistics and WordNet relations for sense identification', Comput. Linguist. 24.
• Lin, D. (1998), An information-theoretic definition of similarity, in 'Proc. of ICML '98'.
• Partington, A. (2004), 'Utterly content in each other's company: semantic prosody and semantic preference', International Journal of Corpus Linguistics 9(1).
References (2/2)
• Patwardhan, S. (2003), Incorporating dictionary and corpus information into a context vector measure of semantic relatedness, Master's thesis, University of Minnesota, Duluth.
• Pedersen, T., Patwardhan, S. & Michelizzi, J. (2004), WordNet::Similarity: measuring the relatedness of concepts, in 'Demonstration Papers at HLT-NAACL'04'.
• Rada, R., Mili, H., Bicknell, E. & Blettner, M. (1989), 'Development and application of a metric on semantic nets', IEEE Transactions on Systems, Man and Cybernetics 19(1).
• Resnik, P. (1995a), Disambiguating noun groupings with respect to WordNet senses, in 'Proceedings of the Third Workshop on Very Large Corpora', Somerset, New Jersey.
• Resnik, P. (1995b), Using information content to evaluate semantic similarity in a taxonomy, in 'Proc. of IJCAI-95'.
• Richardson, R., Smeaton, A. F. & Murphy, J. (1994), Using WordNet as a knowledge base for measuring semantic similarity between words, Technical report, Proc. of AICS-94.
• Sinclair, J. (2004), Trust the Text: Language, Corpus and Discourse, Routledge.
• Wan, S. & Angryk, R. A. (2007), Measuring semantic similarity using WordNet-based context vectors, in 'Proc. of SMC'07'.
• Wu, Z. & Palmer, M. (1994), Verb semantics and lexical selection, in 'Proc. of ACL-94', New Mexico State University, Las Cruces, New Mexico.
• Zhong, Z. & Ng, H. T. (2010), It Makes Sense: A wide-coverage word sense disambiguation system for free text, in 'ACL 2010 (System Demonstrations)'.
Metrics used for Comparison
• LIN: uses the information content individually possessed by two concepts in addition to that shared by them
  sim_LIN(A, B) = 2 × log Pr(lso(A, B)) / (log Pr(A) + log Pr(B))
• Lesk: based on the overlap of words in the individual glosses of the two concepts
• Leacock & Chodorow (LCH): based on the shortest path between the concepts through the hypernymy relation
  sim_LCH(A, B) = -log(len(A, B) / (2 × D))
(lso = lowest super-ordinate of A and B; D = maximum depth of the taxonomy)
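The paper computed these baselines with the WordNet::Similarity 2.05 Perl package (see Experimental Setup); NLTK ships comparable implementations, shown here on an illustrative word pair:

```python
from nltk.corpus import wordnet as wn
from nltk.corpus import wordnet_ic

brown_ic = wordnet_ic.ic('ic-brown.dat')   # information content from the Brown corpus
a = wn.synsets('refuge', pos=wn.NOUN)[0]
b = wn.synsets('asylum', pos=wn.NOUN)[0]
print(a.lch_similarity(b))                 # LCH: shortest hypernymy path
print(a.lin_similarity(b, brown_ic))       # LIN: information content
```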
Experimental Setup
• Datasets used: intrinsic evaluation on gold-standard data containing 48 sense-marked word pairs; extrinsic evaluation on the dataset provided by Balamurali et al. (2011)
• Word sense disambiguation carried out using the WSD engine of Zhong & Ng (2010) (82% accuracy)
• WordNet::Similarity 2.05 package used to compute the scores of the other metrics
• Pearson correlation coefficient used to measure inter-annotator agreement
• Sentiment classification done using C-SVM; all results are averages of five-fold cross-validation accuracies
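For illustration, the inter-annotator agreement statistic can be computed with SciPy; the scores below are made-up 1-5 ratings, not the paper's data:

```python
from scipy.stats import pearsonr

# Hypothetical annotator scores for the same word pairs, on the 1-5 scale
annotator_1 = [4, 2, 5, 1, 3, 4]
annotator_2 = [5, 2, 4, 1, 3, 3]
r, p_value = pearsonr(annotator_1, annotator_2)
print(f"Pearson r = {r:.3f} (p = {p_value:.3f})")
```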