Automatic sense prediction for implicit discourse relations in text Emily Pitler, Annie Louis, Ani Nenkova University of Pennsylvania ACL 2009
Implicit discourse relations • Explicit comparison • I am in Singapore, but I live in the United States. • Implicit comparison • The main conference is over Wednesday. I am staying for EMNLP. • Explicit contingency • I am here because I have a presentation to give at ACL. • Implicit contingency • I am a little tired; there is a 13 hour time difference.
Related work • Soricut and Marcu (2003) • Sentence-level relations only. • Wellner et al. (2006) • Used GraphBank annotations, which do not differentiate between implicit and explicit relations. • Difficult to verify success for implicit relations. • Marcu and Echihabi (2001) • Artificial implicit relations: delete the connective to generate a dataset (sketch below). • [Arg1, but Arg2] => [Arg1, Arg2]
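For concreteness, a minimal Python sketch of this connective-deletion idea; the two-entry pattern table is hypothetical, and Marcu and Echihabi (2001) used far larger pattern sets over huge corpora.

```python
import re

# Hypothetical two-connective pattern table; the original work used a
# much larger set of connective patterns.
CONNECTIVE_SENSES = {
    "but": "Comparison",
    "because": "Contingency",
}

def make_artificial_implicit(sentence):
    """Turn "[Arg1, but Arg2]" into (("Arg1", "Arg2"), sense) by deleting
    the explicit connective, yielding a synthetic implicit example."""
    for connective, sense in CONNECTIVE_SENSES.items():
        match = re.search(rf",\s*{connective}\s+", sentence)
        if match:
            arg1 = sentence[:match.start()].strip()
            arg2 = sentence[match.end():].strip()
            return (arg1, arg2), sense
    return None

print(make_artificial_implicit(
    "I am in Singapore, but I live in the United States."))
# (('I am in Singapore', 'I live in the United States.'), 'Comparison')
```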
Word pairs investigation • The most easily accessible features are the words in the two text spans of the relation. • Some relationship holds between the words in the two arguments. • "The recent explosion of country funds mirrors the 'closed-end fund mania' of the 1920s," Mr. Foot says, "when narrowly focused funds grew wildly popular. They fell into oblivion after the 1929 crash." • "Popular" and "oblivion" are near-antonyms. • This triggers the contrast relation between the sentences.
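A minimal sketch of this basic word-pair feature space, assuming pre-tokenized arguments: the features are the cross product of the words in Arg1 and Arg2.

```python
from itertools import product

def word_pair_features(arg1_tokens, arg2_tokens):
    """Cross product of the tokens in the two arguments: the basic
    word-pair feature space for implicit relation prediction."""
    return {f"{w1}|{w2}" for w1, w2 in product(arg1_tokens, arg2_tokens)}

arg1 = "funds grew wildly popular".split()
arg2 = "they fell into oblivion".split()
print(sorted(word_pair_features(arg1, arg2)))
# includes 'popular|oblivion', the near-antonym pair from the example
```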
Word pairs selection • Marcu and Echihabi (2001) • Only nouns, verbs, and other cue phrases. • Pairs built from all words were superior to those based on only non-function words. • Lapata and Lascarides (2004) • Only verbs, nouns, and adjectives. • Verb pairs are one of the best features. • No useful information was obtained from nouns and adjectives. • Blair-Goldensohn et al. (2007) • Stemming. • Small vocabulary. • Cutoff on the minimum frequency of a feature. • Filtering stop-words has a negative impact on the results.
Analysis of word pair features • Find the word pairs with the highest information gain on the synthetic data (sketch below). • "The government says it has reached most isolated townships by now, but because roads are blocked, getting anything but basic food supplies to people remains difficult." • Remove but => comparison example • Remove because => contingency example
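A sketch of how information gain over such binary word-pair features could be computed; the toy data and the exact selection procedure here are illustrative, not the authors' code.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(examples, feature):
    """Information gain of one binary word-pair feature over sense labels.
    Each example is (set_of_features, label)."""
    labels = [label for _, label in examples]
    with_f = [label for feats, label in examples if feature in feats]
    without_f = [label for feats, label in examples if feature not in feats]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (with_f, without_f) if part)
    return entropy(labels) - remainder

# Toy data: the feature perfectly separates the two senses.
data = [({"popular|oblivion"}, "Comparison"), (set(), "Expansion"),
        ({"popular|oblivion"}, "Comparison"), (set(), "Expansion")]
print(information_gain(data, "popular|oblivion"))  # 1.0
```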
Features for sense prediction • Polarity tags • Inquirer tags • Money/Percent/Number • Verbs • First-last/first 3 words • Context
Polarity tag pairs • Similar to word pairs, but with words replaced by polarity tags. • Each word's polarity is assigned according to its entry in the Multi-perspective Question Answering (MPQA) Opinion Corpus (Wilson et al., 2005). • Each sentiment word is tagged as positive, negative, both, or neutral. • The numbers of negated and non-negated positive, negative, and neutral sentiment words in the two spans are used as features. • Executives at Time Inc. Magazine Co., a subsidiary of Time Warner, have said the joint venture with Mr. Lang wasn't a good one. [NegatedPositive] • The venture, formed in 1986, was supposed to be Time's low-cost, safe entry into women's magazines. [Positive]
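A sketch of the polarity counting, with a tiny stand-in lexicon and a crude two-token negation window in place of the full MPQA resource and negation handling.

```python
from collections import Counter

# Toy stand-in for the MPQA subjectivity lexicon (Wilson et al., 2005).
POLARITY = {"good": "Positive", "safe": "Positive", "low-cost": "Positive"}
NEGATIONS = {"not", "n't", "never"}

def polarity_counts(tokens):
    """Count negated and non-negated sentiment words in one span,
    using a two-token lookback window as a crude negation check."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        tag = POLARITY.get(tok.lower())
        if tag:
            negated = any(t in NEGATIONS for t in tokens[max(0, i - 2):i])
            counts[("Negated" if negated else "") + tag] += 1
    return counts

print(polarity_counts("the joint venture was n't a good one".split()))
# Counter({'NegatedPositive': 1})
```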
Inquirer tags • Look up which semantic categories each word falls into according to the General Inquirer lexicon (Stone et al., 1966). • We see more observations for each semantic class than for any particular word, reducing the data sparsity problem. • Complementary classes: • "Understatement" vs. "Overstatement" • "Rise" vs. "Fall" • "Pleasure" vs. "Pain" • Only verbs are used; see the sketch below.
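A sketch of Inquirer-tag pairing, with a toy three-word lexicon standing in for the General Inquirer; class pairs such as (Rise, Fall) are shared across many word pairs, which is what eases sparsity.

```python
# Toy stand-in for General Inquirer categories (Stone et al., 1966);
# the real lexicon maps each word to semantic classes like Rise/Fall.
INQUIRER = {"grew": {"Rise"}, "fell": {"Fall"}, "enjoyed": {"Pleasure"}}

def inquirer_tag_pairs(arg1_verbs, arg2_verbs):
    """Pairs of Inquirer classes across the two arguments; class pairs
    are far less sparse than raw word pairs."""
    return {(t1, t2)
            for v1 in arg1_verbs for t1 in INQUIRER.get(v1, set())
            for v2 in arg2_verbs for t2 in INQUIRER.get(v2, set())}

print(inquirer_tag_pairs(["grew"], ["fell"]))  # {('Rise', 'Fall')}
```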
Money/Percent/Num • If two adjacent sentences both contain numbers, dollar amounts, or percentages, a comparison relation is likely to hold between them. • Counts of numbers, percentages, and dollar amounts in the two arguments. • Number of times each combination of number/percent/dollar occurs across the two arguments. • Newsweek's circulation for the first six months of 1989 was 3,288,453, flat from the same period last year. • U.S. News' circulation in the same time was 2,303,328, down 2.6%.
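A sketch of these counts using simple regular expressions; the exact patterns and tokenization are assumptions.

```python
import re

# NUM also matches the digits inside percentages and dollar amounts;
# a fuller version would subtract that overlap.
NUM = re.compile(r"\d[\d,.]*")
PERCENT = re.compile(r"\d[\d,.]*\s*%")
DOLLAR = re.compile(r"\$\s*\d[\d,.]*")

def numeric_features(arg1, arg2):
    """Counts of numbers, percentages, and dollar amounts per argument."""
    feats = {}
    for name, span in (("arg1", arg1), ("arg2", arg2)):
        feats[f"{name}_dollar"] = len(DOLLAR.findall(span))
        feats[f"{name}_percent"] = len(PERCENT.findall(span))
        feats[f"{name}_number"] = len(NUM.findall(span))
    return feats

print(numeric_features("circulation was 3,288,453, flat from last year",
                       "circulation was 2,303,328, down 2.6%"))
```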
Verbs • Number of pairs of verbs in Arg1 and Arg2 from the same verb class. • Two verbs are from the same verb class if their highest Levin verb class levels are the same. • The more related the verbs, the more likely the relation is an Expansion. • Average length of verb phrases in each argument: • They [are allowed to proceed] => Contingency • They [proceed] => Expansion, Temporal • POS tags of the main verb: • Same tense => Expansion • Different tense => Contingency, Temporal
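A sketch of the same-class verb-pair count, with a hypothetical three-verb Levin class table; the other verb features (phrase length, POS tag of the main verb) would be computed from a parse.

```python
# Hypothetical stand-in for Levin verb classes; the paper uses the
# highest level of each verb's Levin class hierarchy.
LEVIN_CLASS = {"proceed": "51.1", "advance": "51.1", "say": "37.7"}

def same_class_verb_pairs(arg1_verbs, arg2_verbs):
    """Number of cross-argument verb pairs sharing a Levin class;
    more related verbs suggest an Expansion relation."""
    return sum(1 for v1 in arg1_verbs for v2 in arg2_verbs
               if LEVIN_CLASS.get(v1) is not None
               and LEVIN_CLASS.get(v1) == LEVIN_CLASS.get(v2))

print(same_class_verb_pairs(["proceed"], ["advance", "say"]))  # 1
```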
First-Last, First3 • Prior work found the first and last words very helpful in predicting sense (Wellner et al., 2006). • These words are often explicit connectives.
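A sketch of these positional features, assuming pre-tokenized arguments.

```python
def position_features(arg1_tokens, arg2_tokens):
    """First, last, and first-three words of each argument; these often
    pick up explicit connectives left at span edges."""
    feats = {}
    for name, toks in (("arg1", arg1_tokens), ("arg2", arg2_tokens)):
        feats[f"{name}_first"] = toks[0]
        feats[f"{name}_last"] = toks[-1]
        feats[f"{name}_first3"] = " ".join(toks[:3])
    return feats

print(position_features("I am a little tired".split(),
                        "there is a 13 hour time difference".split()))
```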
Context • Some implicit relations appear immediately before or immediately after certain explicit relations. • Features indicating whether the immediately preceding/following relation was explicit: • The connective. • The sense of the connective. • A feature indicating whether an argument begins a paragraph.
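A sketch of how these context features could be encoded; representing a neighboring explicit relation as a (connective, sense) tuple is an assumption of this sketch.

```python
def context_features(prev_relation, next_relation, arg1_starts_paragraph):
    """Flags describing neighboring explicit relations and paragraph
    position. prev_relation / next_relation are (connective, sense)
    tuples, or None if the neighboring relation is not explicit."""
    feats = {"arg1_starts_paragraph": arg1_starts_paragraph}
    for name, rel in (("prev", prev_relation), ("next", next_relation)):
        feats[f"{name}_is_explicit"] = rel is not None
        if rel is not None:
            connective, sense = rel
            feats[f"{name}_connective"] = connective
            feats[f"{name}_sense"] = sense
    return feats

print(context_features(("but", "Comparison"), None, True))
```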
Dataset • Penn Discourse Treebank • Largest available annotated corpus of discourse relations • Penn Treebank WSJ articles • 16,224 implicit relations between adjacent sentences • I am a little tired; [because] there is a 13-hour time difference. • Contingency.Cause.Reason • Use only the top level of the sense annotations.
Top-level discourse relations • Comparison (contrast): but, yet, even if, surprisingly, however, ... • Contingency (causal): because, due to, therefore, so, ... • Expansion (coordination): also, and, moreover, ... • Temporal (sequence): before this, afterwards, ...
Experiment setting • Developed features on sections 0-1 • Trained on sections 2-20 • Tested on sections 21-22 • Binary classification task for each sense • Trained on equal numbers of positive and negative examples • Tested on natural distribution • Naïve Bayes classifier
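A sketch of the balanced one-vs-all training setup, using scikit-learn's MultinomialNB as a stand-in for whichever Naive Bayes implementation the authors used; the (feature_dict, sense_label) example format is an assumption.

```python
import random
from sklearn.feature_extraction import DictVectorizer
from sklearn.naive_bayes import MultinomialNB

def train_one_vs_all(examples, sense, seed=0):
    """One binary Naive Bayes classifier per sense, trained on equal
    numbers of positive and negative examples (downsampled negatives).
    examples: list of (feature_dict, sense_label) pairs."""
    pos = [(f, 1) for f, s in examples if s == sense]
    neg = [(f, 0) for f, s in examples if s != sense]
    random.Random(seed).shuffle(neg)
    data = pos + neg[:len(pos)]          # balance the training set
    X, y = zip(*data)
    vec = DictVectorizer()
    clf = MultinomialNB().fit(vec.fit_transform(X), y)
    return vec, clf                      # test on the natural distribution
```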
Results: comparison • Polarity is actually the worst feature (f-score 16.63).
Results: expansion • Expansion is the majority class • Precision is more problematic than recall • These features all help other senses
Results: temporal • Temporal arguments often end with words like "Monday" or "yesterday".
Best feature sets • Comparison • Selected word pairs. • Contingency • Polarity, verb, first/last, modality, context, selected word pairs. • Expansion • Polarity, inquirer tags, context. • Temporal • First/last, selected word pairs.
Sequence model for discourse relations • Tried a conditional random field (CRF) classifier.
Conclusion • First study that predicts implicit discourse relations in a realistic setting. • Better understanding of word pairs: • The features in fact do not capture opposite semantic relations, but rather give information about function word co-occurrences. • Empirical validation of new and old features: • Polarity, verb classes, context, and some lexical features indicate discourse relations.