Determining the Hierarchical Structure of Perspective and Speech Expressions
Eric Breck and Claire Cardie
Cornell University, Department of Computer Science
Events in the News
Cornell University Computer Science COLING 2004
Reporting events
Reporting in text
• Clapp sums up the environmental movement's reaction: "The polluters are unreasonable"
• Charlie was angry at Alice's claim that Bob was unhappy
Perspective and Speech Expressions (pse's)
• A perspective expression is text denoting an explicit opinion, belief, sentiment, etc.
  • The actor was elated that …
  • John's firm belief in …
• A speech expression is text denoting spoken or written communication
  • … argued the attorney …
  • … the 9/11 Commission's final report …
Grand Vision
Charlie was angry at Alice's claim that Bob was unhappy
writer (implicit) → Charlie: angry → Alice: claim → Bob: unhappy
This Work
(implicit) → angry → claim → unhappy
System Output: Pse Hierarchy
Charlie was angry at Alice's claim that Bob was unhappy
(implicit) → angry → claim → unhappy
78% accurate!
Related Work: Abstract
• Bergler, 1993
  • Lexical semantics of reporting verbs
• Gerard, 2000
  • Abstract model of news reader
Related Work: Concrete
• Bethard et al., 2004
  • Extract propositional opinions & holders
• Wiebe, 1994
  • Tracks "point of view" in narrative text
• Wiebe et al., 2003
  • Preliminary results on pse identification
• Gildea and Jurafsky, 2002
  • Semantic role ID - use for finding sources?
Baseline 1: Only filter through writer
Attach every pse directly to the writer's (implicit) pse: (implicit) → angry, claim, unhappy
Only 66% correct
Baseline 2: Dependency Tree
Attach pse's according to the sentence's dependency parse
72% correct
A Learning Approach
• How do we cast the recovery of hierarchical structure as a learning problem?
• Simplest solution
  • Learn pairwise attachment decisions
  • Is pse_parent the parent of pse_target?
  • Combine decisions to form a tree
• Other solutions are possible (n-ary decisions, tree-modeling, etc.)
Training instances
Gold structure: (implicit) → angry → claim → unhappy
Each candidate <target, parent> pair becomes one instance:
<unhappy, (implicit)>  <claim, (implicit)>  <angry, (implicit)>
<unhappy, claim>  <claim, unhappy>
<unhappy, angry>  <angry, unhappy>
<angry, claim>  <claim, angry>
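The instance generation above can be sketched as follows. This is a minimal illustration, not the paper's actual code: names like `make_instances`, `IMPLICIT`, and `gold_parent` are mine, and the real system attached many features to each pair.

```python
# Sketch: turn one sentence's gold pse tree into pairwise training instances.
# Each pse is paired with the writer's implicit pse and with every other pse
# in the sentence (both directions); the label says whether the candidate
# parent is the gold parent.
from itertools import permutations

IMPLICIT = "(implicit)"

def make_instances(pses, gold_parent):
    """pses: pse's in the sentence; gold_parent: pse -> its gold parent.
    Returns (target, candidate_parent, label) triples."""
    instances = []
    # every pse paired with the writer's implicit pse
    for target in pses:
        instances.append((target, IMPLICIT, gold_parent[target] == IMPLICIT))
    # every ordered pair of distinct pse's
    for target, parent in permutations(pses, 2):
        instances.append((target, parent, gold_parent[target] == parent))
    return instances

# the slide's example: (implicit) -> angry -> claim -> unhappy
gold = {"angry": IMPLICIT, "claim": "angry", "unhappy": "claim"}
insts = make_instances(["angry", "claim", "unhappy"], gold)
# 3 implicit-parent pairs + 6 ordered pse pairs = the 9 pairs on the slide
```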
Decision Combination
The classifier scores every candidate parent for each pse; each pse attaches to its highest-scoring candidate:
angry: 0.9 <angry, (implicit)>, 0.1 <angry, claim>, 0.1 <angry, unhappy>
claim: 0.5 <claim, (implicit)>, 0.4 <claim, angry>, 0.3 <claim, unhappy>
unhappy: 0.7 <unhappy, claim>, 0.5 <unhappy, (implicit)>, 0.2 <unhappy, angry>
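A minimal sketch of this greedy combination, using the scores shown on the slides. The function name and data layout are illustrative; note that picking each target's top-scoring parent independently could in principle create a cycle, which a full implementation would have to guard against.

```python
# Greedy combination: each pse attaches to its highest-scoring candidate parent.
def combine(scores):
    """scores: {target: {candidate_parent: score}} -> {target: chosen parent}"""
    return {t: max(cands, key=cands.get) for t, cands in scores.items()}

# scores from the slide's example
scores = {
    "angry":   {"(implicit)": 0.9, "claim": 0.1, "unhappy": 0.1},
    "claim":   {"(implicit)": 0.5, "angry": 0.4, "unhappy": 0.3},
    "unhappy": {"claim": 0.7, "(implicit)": 0.5, "angry": 0.2},
}
parents = combine(scores)
# angry -> (implicit), claim -> (implicit), unhappy -> claim
```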
Features (1)
• All features based on error analysis
• Parse-based features
  • Domination + variants
• Positional features
  • Relative position of pse_parent and pse_target
Features (2)
• Lexical features
  • writer's implicit pse
  • "said"
  • "according to"
  • part of speech
• Genre-specific features
  • Charlie, she noted, dislikes Chinese food.
  • "Alice disagrees with me," Bob said.
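To make the feature slides concrete, here is a toy extractor for one <target, parent> pair. Everything here is a guessed simplification: the function name, the token-index representation, and especially the crude quote-counting test stand in for the real parse-based and genre-specific features.

```python
# Toy feature extraction for one <target, parent> candidate pair.
# target/parent are token indices; parent=None means the writer's implicit pse.
def pair_features(target, parent, tokens):
    feats = {}
    feats["parent_implicit"] = parent is None
    feats["parent_word_said"] = parent is not None and tokens[parent].lower() == "said"
    feats["parent_before_target"] = parent is not None and parent < target
    # crude in-quotes test: odd number of quote marks before the target token
    quotes = sum(tok == '"' for tok in tokens[:target])
    feats["target_in_quotes"] = quotes % 2 == 1
    return feats

# the genre-specific example from the slide
tokens = '" Alice disagrees with me , " Bob said .'.split()
f = pair_features(target=2, parent=8, tokens=tokens)
# "disagrees" sits inside quotes; its candidate parent is "said"
```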
Resources
• GATE toolkit (Cunningham et al., 2002) - part-of-speech, tokenization, sentence boundaries
• Collins parser (1999) - extracted dependency parses
• CASS partial parser (Abney, 1997)
• IND decision trees (Buntine, 1993)
Data
• From the NRRC Multi-Perspective Question Answering workshop (Wiebe, 2002)
• 535 newswire documents (66 for development, 469 for evaluation)
• All pse's annotated, along with sources and other information
• Hierarchical pse structure annotated for each sentence*
Example (truncated) model
• One learned tree, truncated to depth 3:
  • pse0 is the parent of pse1 iff
    • pse0 is (implicit) and pse1 is not in quotes
    • OR pse0 is "said"
• Typical trees on development data: depth ~20, ~700 leaves
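The truncated tree above can be written out directly as a predicate. The dict shape (`text`, `in_quotes`) is an assumption for illustration; the actual learned tree is far deeper than this three-level fragment.

```python
# The slide's depth-3 learned tree, as a parenthood predicate.
def is_parent(pse0, pse1):
    """pse0/pse1: dicts with 'text' and 'in_quotes' fields (assumed shape)."""
    # rule 1: the writer's implicit pse parents anything not inside quotes
    if pse0["text"] == "(implicit)" and not pse1["in_quotes"]:
        return True
    # rule 2: "said" parents anything
    if pse0["text"] == "said":
        return True
    return False
```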
Evaluation
• Dependency-based metric (Lin, 1995)
  • Percentage of pse's whose parents are identified correctly
• Percentage of sentences with perfectly identified structure
• Performance of binary classifier
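The first two evaluation measures can be sketched as a small scoring function. This is my paraphrase of the dependency-based metric, not the authors' evaluation script; the data layout is assumed.

```python
# Dependency-based scoring: per-pse parent accuracy, plus the fraction of
# sentences whose entire structure is recovered perfectly.
def evaluate(sentences):
    """sentences: list of (gold_parents, pred_parents) dict pairs, one per sentence."""
    correct = total = perfect = 0
    for gold, pred in sentences:
        hits = sum(pred.get(t) == p for t, p in gold.items())
        correct += hits
        total += len(gold)
        perfect += hits == len(gold)
    return correct / total, perfect / len(sentences)

# toy check on the running example: one wrong parent out of three
gold = {"angry": "(implicit)", "claim": "angry", "unhappy": "claim"}
pred = {"angry": "(implicit)", "claim": "(implicit)", "unhappy": "claim"}
acc, sent_acc = evaluate([(gold, pred)])
# acc = 2/3, sent_acc = 0.0
```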
Results
Learned model: 78% of pse parents identified correctly, vs. 66% (Baseline 1) and 72% (Baseline 2)
Error Analysis
• Pairwise decisions prevent the model from learning larger structure
• Speech events and perspective expressions behave differently
• Treebank-style parses don't always have the structure we need
Future Work
• Identify pse's
• Identify sources
• Evaluate alternative structure-learning methods
• Use the structure to generate perspective-oriented summaries
Conclusions
• Understanding pse structure is important for understanding text
• Automated analysis of pse structure is possible
Thank you!