Peer Review in Online Forums: Classifying Feedback-Sentiment
Greg Harris, Anand Panangadan, and Viktor K. Prasanna
University of Southern California
Outline
• Introduction to Feedback-Sentiment
• Slashdot Dataset
• Interactive Decision Tree
• Experiment Results
Discussion Forums as a Source of Information
• Forums are rich with both information and misinformation.
• Clues as to the accuracy of information can be found in replies.
What is feedback-sentiment?
It is the sentiment in a forum reply directed either toward the parent comment, or toward the author of the parent comment. Examples include:
• (dis)agreeing with a comment/author
• showing appreciation
• insulting the author
• questioning/expressing doubt
• listing a counterexample
What is feedback-sentiment?
Negative example: Citation needed. Just 'cause something has been “known for decades” doesn't make it so.
Positive example: Yeah, they’ve got the worst customer service ever.
Negative example: Um.... Yeah, it is.
Where can feedback-sentiment be used?
• Fact validation by peer review
• Answer selection/validation
• Reputation analysis
• Expert identification
• Monitoring forum health
Dataset
Slashdot.org, “News for nerds. Stuff that matters.”
• All news summaries (100 thousand) and comments (25 million) spanning June 26, 2002, through August 31, 2013.
• Nearly 5 million comments initiate new discussion threads; the rest are replies.
• Dataset available at http://gregharris.info
First-Sentence Heuristic
• The first sentence in a reply is most likely to contain the sentiment of the author toward the parent comment/author.
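The heuristic needs some way to pull the first sentence out of a reply. The slides do not say how sentences were split, so the following is only a minimal sketch: the function name `first_sentence` and the punctuation-based splitting rule are assumptions for illustration, not the authors' method.

```python
import re

# Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
# Real forum text (URLs, abbreviations, quoted HTML) would need more care.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def first_sentence(reply: str) -> str:
    """Return the first sentence of a reply (hypothetical helper)."""
    # Collapse line breaks and repeated spaces before splitting.
    text = " ".join(reply.split())
    if not text:
        return ""
    return _SENTENCE_END.split(text, maxsplit=1)[0]


if __name__ == "__main__":
    print(first_sentence("Citation needed. Just 'cause something has been "
                         "known for decades doesn't make it so."))
```

A trailing ellipsis such as "Um...." is kept as part of the first sentence, which matters because the ending punctuation is itself used as a feature later in the talk.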
Challenges
• Unsupervised methods did not work for us:
  • We used contrast set mining to find phrases that more commonly appear in first sentences. There were too many to label, and the context was missing.
  • The semantic orientation of a phrase is difficult to infer based on statistical association. We started with a seed phrase and calculated the pointwise mutual information (PMI) with other phrases, as in Turney (2002).
  • We looked for coherency when replies contained multiple snippets of quoted text. This happened when a reply refuted the parent comment point-by-point.
  • We looked for coherency in all replies by the same person.
  • We tried using principal component analysis (PCA) to see if the first one or two principal components could identify feedback-sentiment.
  • We looked for association of feedback-sentiment words and phrases with profanity.
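For readers unfamiliar with PMI, the seed-phrase experiment can be illustrated roughly as follows. PMI compares how often two phrases co-occur with how often they would co-occur if independent: PMI(x, y) = log2(p(x, y) / (p(x) p(y))). The sketch below estimates the probabilities from sentence-level co-occurrence using plain substring matching; that counting scheme and the function name `pmi` are assumptions made here, since the slides do not describe how the counts were obtained.

```python
import math
from typing import Sequence


def pmi(sentences: Sequence[str], seed: str, phrase: str) -> float:
    """Sentence-level PMI between a seed phrase and a candidate phrase.

    Assumption: a phrase "occurs" in a sentence if it appears as a
    lowercase substring. Returns -inf when the two never co-occur.
    """
    n = len(sentences)
    has_seed = [seed in s.lower() for s in sentences]
    has_phrase = [phrase in s.lower() for s in sentences]
    n_seed = sum(has_seed)
    n_phrase = sum(has_phrase)
    n_both = sum(a and b for a, b in zip(has_seed, has_phrase))
    if n == 0 or n_both == 0:
        return float("-inf")
    # log2( (n_both / n) / ((n_seed / n) * (n_phrase / n)) )
    return math.log2((n_both * n) / (n_seed * n_phrase))


if __name__ == "__main__":
    toy = [
        "No, you are wrong",
        "No, that is wrong",
        "Yes, I agree",
        "Thanks, great point",
    ]
    print(pmi(toy, seed="wrong", phrase="no,"))
```

A positive score means the candidate phrase appears alongside the seed more often than chance would predict. As the slide notes, such association scores alone proved insufficient for inferring the semantic orientation of a phrase.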
Challenges • Response patterns change over time:
Challenges
Each forum is different. Slashdot has its own idiosyncrasies:
• Poor spelling, grammar, capitalization, and punctuation
• Informal
• Ad hominem attacks
• +1
• MOD PARENT UP
• RTFA (read the full article)
• You must be new here.
Interactive Decision Tree
• Fast way to explore the data
• Focus on most common response patterns
See demo...
Some Useful Features
• Starts with “no” (No, Nope, Not, Nonsense, Nothing, ...)
• Ends in ? or ...
• Yelling through all-caps
• Profanity
• Ends in !
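The features above are simple surface tests on a first sentence, and could be computed along these lines. This is a sketch under stated assumptions: the function name `extract_features`, the minimum letter count for "yelling", the reading of "..." as a literal trailing ellipsis, and the tiny `PROFANITY` placeholder set are all hypothetical; the slides do not give the actual word lists or thresholds.

```python
import re

# Words taken from the slide's examples; the original list is longer ("...").
NO_WORDS = ("no", "nope", "not", "nonsense", "nothing")
# Hypothetical placeholder; a real system would use a full profanity list.
PROFANITY = {"damn", "crap"}


def extract_features(sentence: str) -> dict:
    """Compute boolean surface features for one first sentence."""
    text = sentence.strip()
    words = re.findall(r"[A-Za-z']+", text)
    first = words[0].lower() if words else ""
    letters = [c for c in text if c.isalpha()]
    return {
        "starts_with_no": first in NO_WORDS,
        "ends_question_or_ellipsis": text.endswith("?") or text.endswith("..."),
        # Assumption: at least four letters, all uppercase, counts as yelling.
        "all_caps": len(letters) >= 4 and all(c.isupper() for c in letters),
        "profanity": any(w.lower() in PROFANITY for w in words),
        "ends_exclamation": text.endswith("!"),
    }


if __name__ == "__main__":
    print(extract_features("Nope, that's wrong!"))
    print(extract_features("MOD PARENT UP"))
```

Boolean features like these map naturally onto decision tree splits, where each node asks one yes/no question about the sentence.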
Baselines
• authors-first-sent (authors’ annotation of first sentences)
• mturk-first-sent (Mechanical Turk annotation of first sentences)
• mturk-fulltext (Mechanical Turk annotation of full replies)
• lex-first-sent (first sentence classifier based on word counts from lexicon)
• lex-fulltext (full text classifier based on word counts from lexicon)
• opfin-first-sent (OpinionFinder 2.0 run on first sentences)
• opfin-fulltext (OpinionFinder 2.0 run on full text)
• RNTN-first-sent (Recursive Neural Tensor Network run on first sentences)
• dtree-first-sent (first sentence classifier trained on decision tree)
• dtree-fulltext (dtree-first-sent applied to each sentence in full text)
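To make the lexicon baselines concrete, a word-count classifier of the general kind described might look like the sketch below. The slides do not name the lexicon or the decision rule, so the word sets `POSITIVE` and `NEGATIVE`, the three output labels, and the majority-count rule are assumptions chosen purely for illustration.

```python
import re

# Hypothetical toy lexicon; the lexicon used in the talk is not specified.
POSITIVE = {"agree", "thanks", "right", "exactly", "great"}
NEGATIVE = {"wrong", "nonsense", "no", "disagree", "false"}


def lexicon_classify(text: str) -> str:
    """Label text by comparing counts of positive and negative lexicon words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"  # ties and sentences with no lexicon words


if __name__ == "__main__":
    for s in ["Thanks, I agree.", "No, that is wrong.", "Citation needed."]:
        print(s, "->", lexicon_classify(s))
```

The same function can be applied to a first sentence or to a full reply, which mirrors the first-sent/fulltext split in the list of baselines. A reply such as "Citation needed." contains no lexicon words, which hints at why generic sentiment lexicons can miss forum-specific feedback patterns.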
Results Turkers showed agreement of 58%.
Questions?
gfharris@usc.edu
http://gregharris.info
This work is supported by Chevron U.S.A. Inc. under the joint project, Center for Interactive Smart Oilfield Technologies (CiSoft), at the University of Southern California.
Related Work
• Hassan et al. (2012), Detecting subgroups in online discussions by modeling positive and negative relations among participants.
• Hassan et al. (2012), What’s with the attitude?: Identifying sentences with attitude in online discussions.
• Danescu-Niculescu-Mizil et al. (2013), A computational approach to politeness with application to social factors.
• Sood et al. (2012), Automatic identification of personal insults on social news sites.
• Musat et al. (2013), Direct negative opinions in online discussions.
• Janin et al. (2003), The ICSI meeting corpus.
• Hillard et al. (2003), Detection of agreement vs. disagreement in meetings: Training with unlabeled data.
• Galley et al. (2004), Identifying agreement and disagreement in conversational speech: Use of Bayesian networks to model pragmatic dependencies.
• Hahn et al. (2006), Agreement/disagreement classification: Exploiting unlabeled data using contrast classifiers.
• Germesin and Wilson (2009), Agreement detection in multiparty conversation.