
A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts

A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts. Bo Pang, Lillian Lee. Department of Computer Science, Cornell University. Proceedings of the ACL, 2004.




  1. A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts. Bo Pang, Lillian Lee. Department of Computer Science, Cornell University. Proceedings of the ACL, 2004

  2. Abstract • Sentiment analysis seeks to identify the viewpoint(s) underlying a text span. • An example application: classifying a movie review as positive (thumbs up) or negative (thumbs down). • To determine this sentiment polarity, they propose a novel machine-learning method that applies text-categorization techniques to just the subjective portions of the document. • Extracting these subjective portions can be implemented by finding minimum cuts in graphs.

  3. Introduction • Previous approaches focused on selecting indicative lexical features (e.g., the word “good”), then classifying a document according to the number of such features occurring anywhere within it. • In contrast, they propose the following process: (1) label the sentences in the document as either subjective or objective, discarding the objective ones; (2) apply a standard machine-learning classifier to the resulting extract. • This approach prevents the polarity classifier from considering irrelevant or even misleading text.
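The two-step process above can be sketched in a few lines of Python. The keyword lists and the two stand-in classifiers (`is_subjective`, `polarity_of`) are hypothetical placeholders; in the paper, both are trained statistical models (Naive Bayes or SVMs).

```python
# A minimal sketch of the two-step process, using hypothetical
# keyword-based stand-ins; a real system trains both classifiers.

OPINION_WORDS = {"great", "boring", "loved", "awful"}

def is_subjective(sentence):
    # Stand-in for a trained sentence-level subjectivity detector.
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    return any(w in OPINION_WORDS for w in words)

def polarity_of(extract):
    # Stand-in for a trained document-level polarity classifier.
    positive, negative = {"great", "loved"}, {"boring", "awful"}
    words = [w.strip(".,!?").lower() for s in extract for w in s.split()]
    score = sum(w in positive for w in words) - sum(w in negative for w in words)
    return "positive" if score >= 0 else "negative"

def classify_review(sentences):
    # Step (1): keep only subjective sentences; step (2): classify the extract.
    extract = [s for s in sentences if is_subjective(s)]
    return polarity_of(extract)

review = ["The film runs 120 minutes.",  # objective, discarded
          "I loved the acting.",
          "The plot was great."]
print(classify_review(review))  # -> positive
```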

  4. Architecture • Polarity classification via subjectivity detection: • A subjectivity detector determines whether each sentence is subjective or objective; discarding the objective sentences creates an extract.

  5. Cut-based classification (1/5) • In text, sentences occurring near each other may share the same subjectivity status (Wiebe, 1994). • Suppose we have n items x1, . . . , xn to divide into two classes C1 and C2, with access to two types of information: • Individual scores indj(xi): non-negative estimates of each xi’s preference for being in Cj (C1 or C2), based on the features of xi alone. • Association scores assoc(xi, xk): non-negative estimates of how important it is that xi and xk be in the same class.

  6. Cut-based classification (2/5) • We want to maximize each item’s “net happiness”: its individual score for the class it is assigned to, minus its individual score for the other class. • But we also want to penalize putting tightly associated items into different classes. • After some algebra, we arrive at the following optimization problem: • Assign the xi’s to C1 and C2 so as to minimize the partition cost Σ{x in C1} ind2(x) + Σ{x in C2} ind1(x) + Σ{xi in C1, xk in C2} assoc(xi, xk).
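The partition cost described on this slide can be sketched directly: items pay their individual score for the class they were *not* assigned to, plus the association weight of every pair split across the two classes. The dictionary representation here is an illustrative assumption.

```python
# A sketch of the partition cost, assuming ind1, ind2 and assoc are
# given as plain dictionaries keyed by item index (and index pairs).

def partition_cost(C1, C2, ind1, ind2, assoc):
    # Items placed in C1 pay their preference for C2, and vice versa...
    cost = sum(ind2[x] for x in C1) + sum(ind1[x] for x in C2)
    # ...plus the association weight of every pair split across classes.
    cost += sum(assoc.get((min(i, k), max(i, k)), 0.0)
                for i in C1 for k in C2)
    return cost

ind1 = {1: 0.8, 2: 0.5, 3: 0.1}   # preference for C1
ind2 = {1: 0.2, 2: 0.5, 3: 0.9}   # preference for C2
assoc = {(1, 2): 1.0}             # items 1 and 2 are tightly associated
print(partition_cost({1, 2}, {3}, ind1, ind2, assoc))  # 0.2 + 0.5 + 0.1
print(partition_cost({1}, {2, 3}, ind1, ind2, assoc))  # splitting 1 and 2 costs more
```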

  7. Cut-based classification (3/5) • Formulate the situation in the following manner: • Build an undirected graph G with vertices {v1, . . . , vn, s, t}; the last two are, respectively, the source and sink. • Add n edges (s, vi), each with weight ind1(xi), and n edges (vi, t), each with weight ind2(xi). • Finally, add C(n,2) edges (vi, vk), each with weight assoc(xi, xk). • Then, cuts in G are defined as follows: • A cut (S, T) of G is a partition of its nodes into sets S = {s} ∪ S′ and T = {t} ∪ T′, where s is not in S′ and t is not in T′. • Its cost cost(S, T) is the sum of the weights of all edges crossing from S to T. • A minimum cut of G is one of minimum cost.
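The graph construction above can be sketched in plain Python, with a simple Edmonds-Karp max-flow standing in for the polynomial-time max-flow routines the paper relies on (any max-flow algorithm would do; this one is chosen only for illustration). Node 0 is the source s and node n+1 is the sink t.

```python
from collections import deque

# Sketch: build the cut graph from ind1/ind2/assoc and return the set
# of items placed on the source side of a minimum s-t cut.

def min_cut(n, ind1, ind2, assoc):
    s, t = 0, n + 1
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    for i in range(1, n + 1):
        cap[s][i] = ind1[i]          # edge (s, v_i) with weight ind1(x_i)
        cap[i][t] = ind2[i]          # edge (v_i, t) with weight ind2(x_i)
    for (i, k), w in assoc.items():  # undirected association edge (v_i, v_k)
        cap[i][k] += w
        cap[k][i] += w
    # Edmonds-Karp: repeatedly push flow along shortest augmenting paths.
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in range(n + 2):
                if v not in parent and cap[u][v] > 1e-12:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            break
        # Find the bottleneck on the path, then update residual capacities.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        f = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= f
            cap[v][u] += f
    # Nodes still reachable from s in the residual graph form the S side.
    S, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for v in range(n + 2):
            if v not in S and cap[u][v] > 1e-12:
                S.add(v)
                stack.append(v)
    return {i for i in range(1, n + 1) if i in S}

# Toy instance in the spirit of the Y/M/N example on a later slide:
# Y prefers C1, N prefers C2, M is undecided but tied to Y.
Y, M, N = 1, 2, 3
C1 = min_cut(3, ind1={Y: 0.9, M: 0.5, N: 0.1},
                ind2={Y: 0.1, M: 0.5, N: 0.9},
                assoc={(Y, M): 1.0})
print(C1)  # -> {1, 2}: the association pulls M onto Y's side
```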

  8. Cut-based classification (4/5) • Thus, our optimization problem reduces to finding minimum cuts. • Every cut corresponds to a partition of the items and has cost equal to the partition cost. • Formulating the subjectivity-detection problem in terms of graphs allows us to model item-specific information (a sentence’s own features) and pairwise information (proximity relationships between sentences) independently. • A crucial advantage of the minimum-cut-based approach is that we can use maximum-flow algorithms with polynomial asymptotic running times.

  9. Cut-based classification (5/5) • Based on individual scores alone, we would put Y (“yes”) in C1, N (“no”) in C2, and be undecided about M (“maybe”). • But the association scores favor cuts that put Y and M in the same class. • Thus, the minimum cut, indicated by the dashed line in the figure, places M together with Y in C1.

  10. Evaluation Framework (1/4) • Polarity dataset (corpus): • Their data contains 1000 positive and 1000 negative reviews, all written before 2002. • Default polarity classifiers: • They tested support vector machines (SVMs) and Naive Bayes (NB). • They use unigram-presence features: the ith coordinate of a feature vector is 1 if the corresponding unigram occurs in the input text, 0 otherwise. • Each default document-level polarity classifier is trained and tested on the extracts. • Extracts are formed by applying the sentence-level subjectivity detectors to reviews in the polarity dataset.
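The unigram-presence features can be sketched directly: a binary vector over a fixed vocabulary, recording occurrence rather than frequency. The small vocabulary here is illustrative; in practice it is built from the training data.

```python
# Sketch of unigram-presence feature vectors: coordinate i is 1 if
# vocabulary word i occurs anywhere in the text, 0 otherwise.

def presence_vector(text, vocabulary):
    tokens = set(text.lower().split())
    return [1 if word in tokens else 0 for word in vocabulary]

vocab = ["good", "bad", "movie", "plot"]
# "good" appears twice but the feature is still 1 (presence, not count).
print(presence_vector("a good movie with a good plot", vocab))  # -> [1, 0, 1, 1]
```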

  11. Evaluation Framework (2/4) • Subjectivity dataset: • To train sentence-level subjectivity detectors, we need a collection of labeled sentences. • They collect 5000 subjective sentences and 5000 objective sentences for training. • Subjectivity detectors: • For a given document, we build a graph wherein the source s and sink t correspond to the classes of subjective and objective sentences, respectively. • Each internal node vi corresponds to the document’s ith sentence si. • We can set the individual score ind1(si) to Prsub(si), the detector’s estimated probability that si is subjective, and ind2(si) to 1 − Prsub(si).
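Deriving the individual scores from a detector’s subjectivity probability can be sketched as below; the sketch assumes the detector outputs Pr(subjective | sentence), as a Naive Bayes detector naturally does.

```python
# Individual scores from a sentence-level subjectivity probability:
# preference for the subjective class and its complement.

def individual_scores(p_subjective):
    ind1 = p_subjective          # preference for the subjective side (source s)
    ind2 = 1.0 - p_subjective    # preference for the objective side (sink t)
    return ind1, ind2

print(individual_scores(0.75))  # -> (0.75, 0.25)
```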

  12. Evaluation Framework (3/4) • We define the degree of proximity between pairs of sentences, controlled by three parameters: • Threshold T: the maximum distance two sentences can be separated by and still be considered proximal. • Function f(d): specifies how the influence of proximal sentences decays with respect to distance d. • Constant c: controls the relative influence of the association scores; a larger c makes the minimum-cut algorithm more loath to put proximal sentences in different classes.
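The three parameters combine into a proximity-based association score, which can be sketched as follows. The decay function 1/d² and the parameter values are illustrative choices, not the paper’s tuned settings.

```python
# Sketch of a proximity-based association score: sentences farther
# apart than threshold T get no association; otherwise the score is
# the decay f(d) scaled by the constant c.

def assoc_score(i, j, T=3, f=lambda d: 1.0 / d**2, c=0.5):
    d = abs(i - j)               # distance between sentences i and j
    if d == 0 or d > T:          # same sentence, or beyond the threshold
        return 0.0
    return c * f(d)

print(assoc_score(2, 3))  # adjacent sentences: 0.5 * 1/1^2 = 0.5
print(assoc_score(2, 7))  # farther apart than T = 3: 0.0
```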

  13. Evaluation Framework (4/4) • Graph-cut-based creation of subjective extracts.

  14. Experiment Result (1/6) • They evaluate via ten-fold cross-validation over the polarity dataset. • Basic subjectivity extraction: • Both Naive Bayes and SVMs are trained on the subjectivity dataset and then used as basic subjectivity detectors. • Employing Naive Bayes as a subjectivity detector (ExtractNB) in conjunction with a Naive Bayes document-level polarity classifier achieves 86.4% accuracy. • This is a clear improvement over the 82.8% that results when no extraction is applied (Full review). • With SVMs as the polarity classifier instead, the Full review performance rises to 87.15%. • But comparison via the paired t-test reveals that this is statistically indistinguishable from the 86.4% achieved by running the SVM polarity classifier on ExtractNB input.

  15. Experiment Result (2/6) • Basic subjectivity extraction: • These findings indicate that the extracts preserve the sentiment information in the originating documents. Thus, extracts are good summaries from the polarity-classification point of view. • “Flipping” experiment: if we give as input to the default polarity classifier an extract consisting of the sentences labeled objective, accuracy drops dramatically, to 71% for NB and 67% for SVMs. • This confirms that sentences discarded by the subjectivity extraction process are indeed much less indicative of sentiment polarity.

  16. Experiment Result (3/6) • Basic subjectivity extraction: • Moreover, the subjectivity extracts contain on average only about 60% of the source reviews’ words. • They experiment with: (a) the N most subjective sentences from the originating review; (b) the first N sentences (authors often begin documents with an overview); (c) the last N sentences (concluding material may be a good summary); (d) the N least subjective sentences.
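Extraction strategy (a) can be sketched in a few lines: rank sentences by the detector’s subjectivity probability, keep the top N, and emit them in their original document order.

```python
# Sketch of strategy (a): keep the N sentences the detector considers
# most subjective, preserving their order within the review.

def most_subjective(sentences, p_subjective, N):
    # Rank sentence indices by subjectivity probability, keep the top N...
    top = sorted(range(len(sentences)),
                 key=lambda i: p_subjective[i], reverse=True)[:N]
    # ...then emit them in their original order within the review.
    return [sentences[i] for i in sorted(top)]

sents = ["s0", "s1", "s2", "s3"]
probs = [0.2, 0.9, 0.4, 0.8]
print(most_subjective(sents, probs, 2))  # -> ['s1', 's3']
```

Strategies (b) and (c) are just `sentences[:N]` and `sentences[-N:]`, and (d) is the same ranking with `reverse=False`.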

  17. Experiment Result (4/6) • Basic subjectivity extraction: accuracies using N-sentence extracts for NB (left) and SVM (right) default polarity classifiers.

  18. Experiment Result (5/6) • Incorporating context information: • We now examine whether context information, particularly regarding sentence proximity, can further improve subjectivity extraction. • ExtractNB+Prox and ExtractSVM+Prox are the graph-based sentence subjectivity detectors using Naive Bayes and SVMs, respectively.

  19. Experiment Result (6/6) • Word preservation rate vs. accuracy, with NB (left) and SVMs (right) as default polarity classifiers.

  20. Conclusion • They show that the subjectivity extracts accurately represent the sentiment information of the originating documents in a much more compact form. • They achieve a significant improvement (from 82.8% to 86.4%) while retaining only about 60% of the reviews’ words. • The minimum-cut formulation effectively integrates inter-sentence contextual information with traditional bag-of-words features. • Their future research includes incorporating other sources of contextual cues and investigating other means of modeling such information.
