Bo Pang and Lillian Lee Cornell University Carnegie Mellon University ACL 2005

Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales Bo Pang and Lillian Lee Cornell University Carnegie Mellon University ACL 2005

About this problem
• Goal: assign each review a label on a rating scale
• Differs from deciding “thumbs up” or not
• Differs from identifying opinion strength
• Differs from ranking (+classification)
• Movie reviews from Rotten Tomatoes
• Study on human subjects
• Three algorithms

Problem validation and formulation (1)
• Check how well humans perform, as a point of comparison for machine performance
• Use reviews by a single author to factor out the effects of cross-author divergence
• A notch equals half a star on a four- or five-star scale, or 10 points on a 100-point scale
• Random-choice baseline: 33%
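The notch definition above can be sketched as a small conversion helper. This is only an illustration: the function name `notch_difference`, the scale keys `"stars"` and `"points"`, and the example ratings are assumptions, not taken from the slides.

```python
# Size of one notch on each kind of scale, as stated on the slide.
NOTCH_SIZE = {
    "stars": 0.5,    # half a star on a four- or five-star scale
    "points": 10.0,  # 10 points on a 100-point scale
}


def notch_difference(rating_a, rating_b, scale):
    """Return how many notches separate two ratings given on the same scale."""
    return round(abs(rating_a - rating_b) / NOTCH_SIZE[scale])


# With three possible answers, a random guess is right one time in three.
RANDOM_BASELINE = 1 / 3

print(notch_difference(3.5, 2.5, "stars"))
print(notch_difference(80, 50, "points"))
```

Expressing differences in notches makes ratings from star-based and point-based scales comparable.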

Problem validation and formulation (2)
• A three-class task seems like one that most people would do quite well at.
• Because of class imbalance, the five-class problem is reduced to a four-class one

A scale dataset
• Movie reviews from four corpora
• Rating indicators removed
• Objective sentences removed
• 1,770, 902, 1,307, and 1,027 documents for the four authors, respectively

Algorithm (1)
• Using the SVMlight package
• Algorithm 1: One-vs-all (OVA)
• For each label l, an SVM binary classifier distinguishes label l from not-l
• Algorithm 2: Regression
• Find the hyperplane that best fits the training data (points within distance epsilon of it incur no loss)
• Similar items, similar labels
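The two decision ideas above can be sketched without an SVM library. This is a minimal illustration, not the SVMlight implementation: the per-label scores are assumed to come from already-trained "l vs. not-l" classifiers, and the names `ova_predict` and `epsilon_insensitive_loss` are hypothetical.

```python
def ova_predict(scores):
    """One-vs-all: return the label whose binary classifier scores highest.

    `scores` maps each label l to the output of its "l vs. not-l" classifier.
    """
    return max(scores, key=scores.get)


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Regression loss: zero inside the epsilon tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Hypothetical classifier outputs for labels 0, 1, 2.
print(ova_predict({0: -1.2, 1: 0.3, 2: -0.4}))

# A prediction within epsilon of the true label costs nothing.
print(epsilon_insensitive_loss(2.0, 2.125, epsilon=0.25))
print(epsilon_insensitive_loss(2.0, 2.5, epsilon=0.25))
```

OVA treats the labels as unrelated classes, whereas the regression loss grows with the numeric distance from the true label, so it uses the ordering of the scale.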

Algorithm (2)
• Algorithm 3: Metric labeling
• Algorithm 1 or 2 combined with a similarity measure
• Distance metric on labels
• The k nearest neighbors of item x according to sim
• Item-similarity function sim
• Related to locally-weighted learning
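A minimal sketch of how these ingredients might fit together for one test item. It assumes the cost of a candidate label is the negated preference of the initial classifier plus a weighted penalty for disagreeing with similar neighbors, that the neighbors are training items with known labels, and that the label distance is the absolute difference. The weight `alpha` and all function names are hypothetical.

```python
def k_nearest(x, train_items, sim, k):
    """Return the k training items most similar to x according to sim."""
    return sorted(train_items, key=lambda item: sim(x, item[0]), reverse=True)[:k]


def metric_label(pref, neighbors, labels, alpha=1.0):
    """Choose the label with the lowest combined cost.

    pref      -- dict: label -> preference of the initial classifier (OVA or
                 regression); higher means more preferred
    neighbors -- list of (neighbor_label, similarity) pairs
    labels    -- all candidate labels
    alpha     -- trade-off weight (an assumed hyperparameter)
    """
    def cost(label):
        penalty = sum(abs(label - n_label) * s for n_label, s in neighbors)
        return -pref[label] + alpha * penalty

    return min(labels, key=cost)


pref = {0: 0.5, 1: 0.6, 2: 0.4}    # the initial classifier slightly prefers 1
neighbors = [(0, 0.9), (0, 0.8)]   # but both similar neighbors are labeled 0
print(metric_label(pref, neighbors, labels=[0, 1, 2], alpha=1.0))
print(metric_label(pref, neighbors, labels=[0, 1, 2], alpha=0.0))
```

With `alpha=0` the neighbors are ignored and the initial classifier's choice stands; a larger `alpha` pulls the label toward those of similar items.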

Algorithm (3)
• Goal: find an item-similarity function that correlates with labels
• Vocabulary overlap (e.g., cosine similarity) is not suitable
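To see why vocabulary overlap can mislead, here is a small sketch of cosine similarity over term counts. The two example sentences are invented for illustration and the whitespace tokenization is an assumption.

```python
import math
from collections import Counter


def cosine(text_a, text_b):
    """Cosine similarity between the term-count vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Opposite sentiment, nearly identical vocabulary.
print(cosine("the plot is good", "the plot is not good"))
```

The two toy sentences would deserve different ratings, yet their overlap-based similarity is high, which is the kind of mismatch a label-correlated similarity function has to avoid.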

Algorithm (PSP)
• Using PSP (positive-sentence percentage)
• A Naive Bayes classifier trained on 10,662 movie-review snippets (striking extracts, usually one sentence long)
• Apply this classifier to their test data
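A minimal sketch of PSP and of one way to turn it into an item-similarity function. The sentence classifier is abstracted away as a list of booleans, and the similarity is assumed here to be the cosine between the two-dimensional vectors (PSP, 1 − PSP); the function names are hypothetical.

```python
import math


def psp(sentence_is_positive):
    """Fraction of a review's sentences that the classifier labels positive.

    `sentence_is_positive` holds one boolean per sentence, e.g. the output
    of a sentence-level polarity classifier.
    """
    if not sentence_is_positive:
        raise ValueError("review has no sentences to score")
    return sum(sentence_is_positive) / len(sentence_is_positive)


def psp_similarity(psp_x, psp_y):
    """Cosine between (PSP(x), 1 - PSP(x)) and (PSP(y), 1 - PSP(y))."""
    vx = (psp_x, 1.0 - psp_x)
    vy = (psp_y, 1.0 - psp_y)
    dot = vx[0] * vy[0] + vx[1] * vy[1]
    return dot / (math.hypot(*vx) * math.hypot(*vy))


review = [True, True, False, True]   # hypothetical classifier decisions
print(psp(review))
print(psp_similarity(1.0, 0.0))
```

Two reviews with similar shares of positive sentences get a similarity near 1 regardless of which words they use, while an all-positive and an all-negative review get 0.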

Algorithm (PSP)
• Distinguishing terms: terms that appear more than 20 times and occur in a single class at least 50% of the time
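The term filter can be sketched as follows, reading the rule as "more than 20 occurrences overall, with at least half of them in one class". That reading, the function name, and the toy data are assumptions.

```python
from collections import Counter, defaultdict


def distinguishing_terms(docs, min_count=20, min_share=0.5):
    """Return terms that are frequent and concentrated in a single class.

    docs -- iterable of (tokens, label) pairs
    """
    total = Counter()
    per_class = defaultdict(Counter)
    for tokens, label in docs:
        total.update(tokens)
        per_class[label].update(tokens)

    kept = set()
    for term, count in total.items():
        if count <= min_count:
            continue  # must appear more than min_count times
        top_class_count = max(c[term] for c in per_class.values())
        if top_class_count / count >= min_share:
            kept.add(term)
    return kept


# Toy corpus with four rating classes (0 to 3).
docs = [
    (["great"] * 20 + ["movie"] * 10, 3),
    (["great"] * 10 + ["movie"] * 10, 2),
    (["movie"] * 10, 1),
    (["movie"] * 10 + ["rare"] * 5, 0),
]
print(distinguishing_terms(docs))
```

In the toy corpus, "movie" is frequent but spread evenly across classes and "rare" is too infrequent, so only "great" passes the filter.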

Experiment Results (1)

Experiment Results (2)
• Adding PSP is useful; however, PSP by itself is not good enough.

Multiple authors
• Results are comparable

Future Work
• Vary the kernel in the SVM
• Use mixture models (combining “positive” and “negative” language models) to capture class relationships
• Multi-class categorization problems that are not scale-based (positive vs. negative vs. neutral)
• Transductive setting (a small amount of labeled data, exploiting relationships between unlabeled items), which is well suited to the metric-labeling approach
