Unsupervised Sentiment Classification Across Domains Soumya Ghosh Matthew Koch
Sentiment Classification • What is sentiment classification? • In this study we are primarily interested in polarity.
Motivation • Classifying product reviews into “recommended” or “not recommended” • A web search engine can let a user narrow a search to pages with positive or negative comments • Email filtering • Categorizing news articles into positive and negative views
Background • Turney (2002) uses an unsupervised technique (PMI) for review polarity classification. • Accuracy ranges from 66% (movie reviews) to 84% (automobile reviews). • Pang et al. (2002, 2004) use supervised learning (SVMs) to achieve 82.9% accuracy on the movie domain.
Turney’s Approach • Select a small set of reference words likely to indicate sentiment orientation. • “excellent” and “poor”. • Calculate the pointwise mutual information (PMI) between words in the document to be classified and the reference words.
Pointwise Mutual Information • Log of the ratio between the observed co-occurrence probability of the two words and the co-occurrence probability one would expect if the two words were independent. • Measure of statistical dependence. • PMI > 0 if the two words occur together more frequently than expected by chance. • PMI < 0 if the two words occur together less frequently than expected by chance.
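A minimal sketch of the PMI definition above, computed from raw counts. The function name `pmi`, its parameters, and the toy counts are illustrative assumptions, not taken from the slides.

```python
import math

def pmi(count_xy, count_x, count_y, total):
    """Pointwise mutual information (base 2) from raw co-occurrence counts."""
    p_xy = count_xy / total   # observed co-occurrence probability
    p_x = count_x / total     # marginal probability of word x
    p_y = count_y / total     # marginal probability of word y
    # Log of observed over expected-under-independence.
    return math.log2(p_xy / (p_x * p_y))

# Toy counts (made up for illustration): 1000 text windows in total.
# Expected co-occurrence under independence is 0.1 * 0.2 = 0.02.
together = pmi(count_xy=40, count_x=100, count_y=200, total=1000)  # observed 0.04, so PMI > 0
apart = pmi(count_xy=10, count_x=100, count_y=200, total=1000)     # observed 0.01, so PMI < 0
print(together, apart)
```

A zero co-occurrence count makes the logarithm undefined, so a practical version would need some form of smoothing.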
Turney’s Approach • Compute the semantic orientation (SO) of document phrases: • SO is positive if the word (phrase) is strongly associated with “excellent” and negative if it is strongly associated with “poor”.
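For reference, the semantic orientation described above can be written out as in Turney (2002). The second line is the hit-count estimate used in that paper, where hits(q) is the number of search results for query q; the symbol names are notational choices made here.

```latex
\[
\mathrm{PMI}(w_1, w_2) = \log_2 \frac{p(w_1, w_2)}{p(w_1)\, p(w_2)}
\]
\[
\mathrm{SO}(\mathit{phrase})
  = \mathrm{PMI}(\mathit{phrase}, \text{``excellent''})
  - \mathrm{PMI}(\mathit{phrase}, \text{``poor''})
  \approx \log_2 \frac{\mathrm{hits}(\mathit{phrase}\ \mathrm{NEAR}\ \text{``excellent''}) \cdot \mathrm{hits}(\text{``poor''})}
                      {\mathrm{hits}(\mathit{phrase}\ \mathrm{NEAR}\ \text{``poor''}) \cdot \mathrm{hits}(\text{``excellent''})}
\]
```

The phrase's own hit count cancels in the difference, which is why only four counts are needed per phrase.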
Caveats • Word statistics are gathered from AltaVista search hit counts. • SO is computed over phrases rather than individual words.
Supervised Approaches • Pang et al. compare several supervised learning algorithms for polarity classification. • They explore a wide variety of feature vectors for this purpose. • SVMs perform best, using word-presence features.
Problems with previous approaches • Domain dependent (Engstrom 2004). • Supervised techniques target document (review) level classification. They perform poorly on sentence-level sentiment classification.
Our Work • We are interested in classifying quotes by polarity, and thereby inferring a public figure’s stance on a particular issue. • This is a sentence-level classification problem. • We try to make our classification model domain independent.
Our Approach • Use pointwise mutual information to score word sentiments in a non-sparse corpus. • Select the top p percentile and the bottom q percentile of words. • These are the most sentimentally oriented words. Let’s call this set SO. • Extract the most sentimentally oriented words from the sparse target corpus (the set of sentences to be classified). • This is our development set. • We simply select all adjectives from this corpus.
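The percentile selection step above can be sketched as follows. The function name, the rounding rule, and the toy word scores are assumptions for illustration; the slides do not give the values of p and q.

```python
def select_oriented_words(scores, p=10, q=10):
    """Keep the top p percent and bottom q percent of words by score."""
    ranked = sorted(scores, key=scores.get)        # words in ascending score order
    n = len(ranked)
    n_top = max(1, round(n * p / 100))             # size of the most positive slice
    n_bottom = max(1, round(n * q / 100))          # size of the most negative slice
    keep = ranked[:n_bottom] + ranked[-n_top:]
    return {w: scores[w] for w in keep}

# Toy PMI-based word scores (made up for illustration).
word_scores = {
    "superb": 2.1, "great": 1.4, "fine": 0.3, "okay": 0.1, "new": 0.05,
    "long": 0.0, "plain": -0.1, "flat": -0.4, "dull": -0.9, "awful": -1.8,
}
so = select_oriented_words(word_scores, p=20, q=20)
print(sorted(so))
```

Words near the middle of the ranking carry little sentiment signal, so dropping them keeps SO focused on strongly oriented vocabulary.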
Our Approach • For each word wi in the dev set: • Use WordNet to find its synset. • Assign to every synonym its score from SO; if a synonym is not present in SO, it is assigned a score of 0. • Score of wi = mean score of the synset of wi. • A sentence’s score is the average of its words’ scores.
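A sketch of the synset-averaging step described above. To stay self-contained it uses a hypothetical hand-written `synonyms` dictionary in place of a real WordNet lookup, and it assumes each synonym set includes the word itself; both are assumptions, not details from the slides.

```python
from statistics import mean

def word_score(word, synonyms, so):
    """Mean SO score over a word's synonym set; synonyms missing from SO count as 0."""
    synset = synonyms.get(word, [word])            # fall back to the word alone
    return mean(so.get(s, 0.0) for s in synset)

def sentence_score(words, synonyms, so):
    """Average of the word scores; an empty sentence scores 0."""
    if not words:
        return 0.0
    return mean(word_score(w, synonyms, so) for w in words)

# Toy SO table and synonym sets (made up for illustration).
so = {"great": 1.5, "awful": -2.0}
synonyms = {
    "awesome": ["awesome", "great", "amazing"],
    "terrible": ["terrible", "awful"],
}

awesome = word_score("awesome", synonyms, so)      # (0 + 1.5 + 0) / 3
terrible = word_score("terrible", synonyms, so)    # (0 - 2.0) / 2
sentence = sentence_score(["awesome", "terrible"], synonyms, so)
print(awesome, terrible, sentence)
```

This is how a target-domain word that never appeared in the source corpus ("awesome" here) can still receive a nonzero score through its synonyms.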
Our Approach • Determine the pos/neg score decision boundary from the distribution of the development set. • Assume that the development set is representative of the whole sample. • In our case this boils down to using the median value of the sorted sentence-level scores.
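The median-based decision boundary can be sketched as below. The function name, the toy scores, and the choice to label scores equal to the threshold as negative are assumptions; the slides do not say how ties are handled.

```python
from statistics import median

def classify_by_median(sentence_scores):
    """Use the median development-set score as the positive/negative boundary."""
    threshold = median(sentence_scores)
    # Assumption: scores exactly at the threshold are labeled negative.
    labels = ["positive" if s > threshold else "negative" for s in sentence_scores]
    return threshold, labels

# Toy sentence-level scores (made up for illustration).
dev_scores = [-0.8, -0.25, 0.1, 0.4, 0.9, -0.1]
threshold, labels = classify_by_median(dev_scores)
print(threshold, labels)
```

Using the median implicitly assumes the sample is roughly balanced between positive and negative sentences, which ties in with the assumption that the development set is representative.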
Data • 2000 labeled document-level movie reviews (1000 pos/1000 neg). (Cornell -- IMDB) • 3020 labeled sentence-level movie review sentences. (Yahoo movies) • 100 labeled sentences from the WSJ.
Experiments • Step 1: • Learn word scores from the IMDB data and classify the Yahoo movie review sentences. • Step 2: • Classify the sentences from the WSJ using the same model.
Results • As a baseline we follow Pang’s supervised learning method. • We randomly split the Yahoo corpus into a test set of 420 sentences and a training set of 2600 sentences. • Accuracy over 10 runs: 63.34%, std: 1.78
Results • Next we trained the same SVM model on the IMDB corpus (2000 labeled documents) and tested it on the random set of 420 sentences. • Accuracy over 10 runs: 62.89%, std: 1.57 • The difference from the in-domain baseline is not statistically significant. • This implies high correlation between the two domains.
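As a rough check on the "not statistically significant" claim, the sketch below computes a Welch t statistic from the reported means and standard deviations (63.34%, std 1.78 for the Yahoo-trained baseline; 62.89%, std 1.57 for the IMDB-trained model; 10 runs each). The slides do not say which test was used or whether the reported figures are sample standard deviations, so both are assumptions here.

```python
import math

def welch_t(mean_a, std_a, n_a, mean_b, std_b, n_b):
    """Welch's t statistic for two independent samples given summary statistics."""
    standard_error = math.sqrt(std_a ** 2 / n_a + std_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error

# Reported accuracies over 10 runs each.
t = welch_t(63.34, 1.78, 10, 62.89, 1.57, 10)
print(round(t, 2))
```

The statistic comes out near 0.6, well below the two-sided 95% critical value of roughly 2.1 for about 18 degrees of freedom, which is consistent with the slide's conclusion.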
Results • Turney’s algorithm on the Yahoo movie corpus: • Accuracy over 10 runs: 56.86%, std: 0.04 • Results are significantly worse than the supervised techniques at 95% confidence.
Results • Our results on the Yahoo movie corpus • Averaged over 10 runs • Accuracy: 69.80%, std: 0.02 • The improvement over the previous techniques is statistically significant at 95% confidence • Our results on the WSJ corpus • Averaged over 10 runs • Accuracy: 60.00%, std: 0.001
Our Results (Correct Classifications). • the dollar strengthened Friday as stocks rocketed – positive. • Kidder Peabody reiterated its buy recommendation on the stock – positive. • But the thing it's supposed to measure -- manufacturing strength -- it missed altogether last month - negative. • I just don't feel the extra yield you're getting is worth it to justify all the risks you're taking - negative
Misclassifications. • On past occasions, her finely textured singing has been ample compensation for her mannered gestures, but on this big night -- her first Met opening -- only some of her pianissimos were skillfully deployed. – we predict this is positive. • IBM is five times the size of Digital and is awesome – we predict this is negative.
Conclusion • Our unsupervised algorithm outperforms both the supervised SVM baseline and Turney’s algorithm on sentence-level movie review classification. • Encouraging results for limited cross-domain sentiment classification using knowledge transfer.
Future Work • Look into semantic vocabulary mining. • Explore other WordNet relationships for knowledge transfer. • Examine the algorithm from a more theoretical point of view.
References • Peter D. Turney: Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews. ACL 2002. • Bo Pang, Lillian Lee, Shivakumar Vaithyanathan: Thumbs up? Sentiment Classification using Machine Learning Techniques. CoRR cs.CL/0205070, 2002. • Bo Pang, Lillian Lee: A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts. ACL 2004. • Charlotta Engstrom: Topic Dependence in Sentiment Classification. Master’s thesis, University of Cambridge, July 2004.