Semi-Supervised Recognition of Sarcastic Sentences in Twitter and Amazon
Problem • Semi-supervised sarcasm identification using SASI (Semi-supervised Algorithm for Sarcasm Identification) • Sarcasm: the activity of saying or writing the opposite of what you mean, or of speaking in a way intended to make someone else feel stupid or to show them that you are angry
Datasets • Twitter Dataset: • Tweets are 140 characters or fewer • Tweets can contain URLs, references to other tweeters (@<user>), or hashtags (#<tag>) • Slang, abbreviations, and emoticons are common • 5.9 million tweets • 14.2 average words per tweet • 18.9% include a URL, 35.3% contain @<user> • 6.9% contain one or more hashtags
Datasets • Amazon Dataset: • 66,000 reviews of 120 products • 953 characters on average • Usually structured and grammatical • Have fields including writer, date, rating, and summary • Amazon reviews have a great deal of context compared to tweets
Classification • The algorithm is semi-supervised • Seeded with a small group of labeled sentences • The seed is annotated with a sarcasm ranking on a 1–5 scale • Syntactic and pattern-based features are used to build a classifier
Data Preprocessing • Specific information was replaced with general tags to facilitate pattern matching: ‘[PRODUCT]’, ‘[COMPANY]’, ‘[TITLE]’, ‘[AUTHOR]’, ‘[USER]’, ‘[LINK]’, and ‘[HASHTAG]’ • All HTML tags were removed
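A minimal sketch of this preprocessing step, assuming simple regular-expression heuristics. The tag names come from the slide above, but the matching rules and the `preprocess` helper are illustrative, not the paper's implementation:

```python
import re

def preprocess(text, product_names=(), company_names=()):
    """Replace specific entities with general tags and strip HTML (illustrative)."""
    text = re.sub(r"<[^>]+>", " ", text)            # drop HTML tags
    text = re.sub(r"https?://\S+", "[LINK]", text)  # URLs -> [LINK]
    text = re.sub(r"@\w+", "[USER]", text)          # @mentions -> [USER]
    text = re.sub(r"#\w+", "[HASHTAG]", text)       # hashtags -> [HASHTAG]
    for name in product_names:                      # known product names -> [PRODUCT]
        text = re.sub(re.escape(name), "[PRODUCT]", text, flags=re.IGNORECASE)
    for name in company_names:                      # known company names -> [COMPANY]
        text = re.sub(re.escape(name), "[COMPANY]", text, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", text).strip()

print(preprocess("My <b>Kindle</b> died in two days, thanks @amazon! http://amzn.to/x #fail",
                 product_names=["Kindle"], company_names=["Amazon"]))
```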
Pattern Extraction and Selection • Words are classified into high-frequency words (HFWs) and content words (CWs) • A pattern is an ordered sequence of HFWs and slots for CWs, e.g. “[COMPANY] CW does not CW much” • Patterns were removed if they appeared in seed sentences ranked both 1 (clearly non-sarcastic) and 5 (clearly sarcastic), since such patterns do not discriminate between the classes • Patterns that appeared only in reference to a single product were also removed
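The pattern step could look roughly like the sketch below. The frequency thresholds and the window size are assumptions made for illustration; the paper defines HFWs and CWs by corpus frequency and builds patterns from several HFWs plus CW slots:

```python
from collections import Counter

# Illustrative thresholds (raw counts), not the paper's exact frequency cutoffs.
HFW_THRESHOLD = 1000
CW_THRESHOLD = 100

def word_classes(corpus_tokens):
    """Split the vocabulary into high-frequency words (HFWs) and content words (CWs)."""
    counts = Counter(corpus_tokens)
    hfw = {w for w, c in counts.items() if c >= HFW_THRESHOLD}
    cw = {w for w, c in counts.items() if c <= CW_THRESHOLD}
    return hfw, cw

def extract_patterns(sentence_tokens, hfw, cw, window=6):
    """Turn each token window into a pattern: HFWs kept verbatim, CWs become 'CW' slots."""
    patterns = set()
    for start in range(len(sentence_tokens)):
        pattern = []
        for tok in sentence_tokens[start:start + window]:
            if tok in hfw:
                pattern.append(tok)
            elif tok in cw:
                pattern.append("CW")
        # keep only patterns that mix at least one HFW with at least one CW slot
        if "CW" in pattern and any(p != "CW" for p in pattern):
            patterns.add(" ".join(pattern))
    return patterns
```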
Other Features • (1) Sentence length in words • (2) Number of “!” characters in the sentence • (3) Number of “?” characters in the sentence • (4) Number of quotes in the sentence • (5) Number of capitalized/all-capitals words in the sentence
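These five surface features are straightforward to compute; a sketch follows (the normalization against the pattern features, which the paper applies, is omitted here):

```python
def punctuation_features(sentence):
    """The five punctuation/surface features listed above (unnormalized sketch)."""
    words = sentence.split()
    return [
        len(words),                                # (1) sentence length in words
        sentence.count("!"),                       # (2) number of '!' characters
        sentence.count("?"),                       # (3) number of '?' characters
        sentence.count('"'),                       # (4) number of quotes
        sum(1 for w in words if w[:1].isupper()),  # (5) capitalized / all-capitals words
    ]

print(punctuation_features('Well, THAT was "great"! Really?'))  # -> [5, 1, 1, 2, 3]
```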
Data Enrichment • Assumption: sentences near a sarcastic sentence are similarly sarcastic • Using the Amazon seed set, perform a Yahoo search for text snippets containing the seed sentences; the sentences surrounding each match are added to the training set with the same annotation as the seed sentence that retrieved them
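An illustrative sketch of the enrichment assumption, leaving retrieval aside (the Yahoo search is assumed to have already produced the snippet's sentences):

```python
def enrich(seed_sentence, seed_label, snippet_sentences):
    """Give the sentences adjacent to a retrieved seed sentence the seed's own label."""
    enriched = []
    for i, sent in enumerate(snippet_sentences):
        if sent == seed_sentence:
            for j in (i - 1, i + 1):  # immediate neighbours only
                if 0 <= j < len(snippet_sentences):
                    enriched.append((snippet_sentences[j], seed_label))
    return enriched
```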
Classification • Similar to kNN • The score for a new instance is the weighted average of the scores of its k nearest training-set vectors, with nearness measured by Euclidean distance
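A minimal sketch of this scoring rule. Inverse-distance weighting is an assumption here; the slide only states that the average is weighted:

```python
import math

def knn_score(query, training, k=5):
    """training: list of (feature_vector, sarcasm_score) pairs; returns a weighted average."""
    def euclidean(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # pick the k closest training vectors and weight their scores by inverse distance
    neighbours = sorted(training, key=lambda item: euclidean(query, item[0]))[:k]
    weights = [1.0 / (euclidean(query, vec) + 1e-9) for vec, _ in neighbours]
    return sum(w * score for w, (_, score) in zip(weights, neighbours)) / sum(weights)
```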
Baseline • Assume sarcasm implies saying the opposite of what you mean • Identify reviews with few stars and decide that sarcasm is present if strongly positive words appear in the review
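A sketch of that baseline heuristic; the positive word list and the star cutoff are illustrative assumptions:

```python
POSITIVE_WORDS = {"great", "excellent", "amazing", "wonderful", "perfect", "best", "love"}

def baseline_is_sarcastic(review_text, star_rating, max_stars=2):
    """Flag a low-star review as sarcastic if it contains strongly positive words."""
    if star_rating > max_stars:
        return False
    words = {w.strip('.,!?"').lower() for w in review_text.split()}
    return bool(words & POSITIVE_WORDS)

print(baseline_is_sarcastic("Simply the best charger ever: it melted on day one.", star_rating=1))  # True
```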
Training Sets • Amazon: • 80 positive and 505 negative examples • (expanded to 471 positive and 5,020 negative after data enrichment) • Twitter: • 1,500 #sarcasm-hashtagged tweets (noisy) • Because the hashtag seed proved noisy, it was replaced with the positive examples from the Amazon seed plus manually selected negative examples from the Twitter dataset
Test Sets • 90 positive and 90 negative examples each for Amazon and Twitter • Only sentences containing a named entity or a reference to a named entity were sampled (more likely to contain sentiment → relevance) • Non-sarcastic sentences were drawn only from negative reviews, increasing the chance that they contain negative sentiment • Mechanical Turk (MTurk) was used to create a gold standard for the test set; each sentence was annotated by three annotators
Inter-Annotator Agreement • Amazon: κ = 0.34 • Twitter: κ = 0.41 • The higher agreement on Twitter is attributed to the lack of context in the medium, which forces sarcasm to be more explicit
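With three annotators per sentence, agreement values like these are typically computed with Fleiss' kappa; assuming that statistic, a small sketch (`ratings[i][c]` is the number of annotators assigning category c to sentence i):

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for items each rated by the same number of annotators."""
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    # proportion of all assignments falling into each category
    p_cat = [sum(row[c] for row in ratings) / (n_items * n_raters)
             for c in range(len(ratings[0]))]
    # mean observed agreement per item
    p_obs = sum((sum(x * x for x in row) - n_raters) / (n_raters * (n_raters - 1))
                for row in ratings) / n_items
    p_exp = sum(p * p for p in p_cat)  # agreement expected by chance
    return (p_obs - p_exp) / (1 - p_exp)

# Two categories (sarcastic / not sarcastic), three annotators per sentence
print(fleiss_kappa([[3, 0], [2, 1], [0, 3], [1, 2]]))  # ~0.33
```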
Baseline Intuitions • The Baseline has high precision, but low recall • It cannot recognize subtly sarcastic sentences • These results imply that the definition “saying the opposite of what you mean” is not a good indicator of sarcasm
Reasons for Good Twitter Results • Robustness of sparse and incomplete pattern matching • SASI learns a model with a feature space spanning over 300 dimensions • Sarcasm may be easier to detect in tweets because tweeters have to go out of their way to make sarcasm explicit in an environment with no context
Notes • #sarcasm tags were unreliable • Punctuation marks were the weakest predictors, in contrast to the findings of Tepperman et al. (2006) • The exception is the use of ellipses, which was a strong predictor in combination with other features