Putting Meaning Into Your Trees
Martha Palmer
Paul Kingsbury, Olga Babko-Malaya, Scott Cotton, Nianwen Xue, Shijong Ryu, Ben Snyder
PropBanks I and II site visit, University of Pennsylvania, October 30, 2003
Proposition Bank: From Sentences to Propositions
• Powell met Zhu Rongji
• Powell and Zhu Rongji met
• Powell met with Zhu Rongji
• Powell and Zhu Rongji had a meeting
Proposition: meet(Powell, Zhu Rongji)
Other verbs shown on the slide: battle, wrestle, join, debate, consult
meet(Somebody1, Somebody2) . . .
When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane.
meet(Powell, Zhu)
discuss([Powell, Zhu], return(X, plane))
Capturing semantic roles*
• JK broke [ARG1 the LCD Projector].
• [ARG1 The windows] were broken by the hurricane.
• [ARG1 The vase] broke into pieces when it toppled over.
(On the slide, a SUBJ label marks the subject of each sentence.)
*See also Framenet, http://www.icsi.berkeley.edu/~framenet/
Outline • Introduction • Proposition Bank • Starting with Treebanks • Frames files • Annotation process and status • PropBank II • Automatic labelling of semantic roles • Chinese Proposition Bank
A TreeBanked Sentence
Analysts have been expecting a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company.
(S (NP-SBJ Analysts)
   (VP have
       (VP been
           (VP expecting
               (NP (NP a GM-Jaguar pact)
                   (SBAR (WHNP-1 that)
                         (S (NP-SBJ *T*-1)
                            (VP would
                                (VP give
                                    (NP the U.S. car maker)
                                    (NP (NP an eventual (ADJP 30 %) stake)
                                        (PP-LOC in (NP the British company))))))))))))
The same sentence, PropBanked
(S Arg0 (NP-SBJ Analysts)
   (VP have
       (VP been
           (VP expecting
               Arg1 (NP (NP a GM-Jaguar pact)
                        (SBAR (WHNP-1 that)
                              (S Arg0 (NP-SBJ *T*-1)
                                 (VP would
                                     (VP give
                                         Arg2 (NP the U.S. car maker)
                                         Arg1 (NP (NP an eventual (ADJP 30 %) stake)
                                                  (PP-LOC in (NP the British company))))))))))))
expecting: Arg0 = Analysts; Arg1 = a GM-Jaguar pact that would give ...
give: Arg0 = *T*-1 (a GM-Jaguar pact); Arg2 = the US car maker; Arg1 = an eventual 30% stake in the British company
expect(Analysts, GM-J pact)
give(GM-J pact, US car maker, 30% stake)
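The two propositions read off the PropBanked tree can be held in a very small data structure. The sketch below is only an illustration: the `Proposition` class and its `render` method are hypothetical names, not part of any PropBank tool, and arguments are printed in the order they are given so that the output matches the notation on the slide.

```python
from dataclasses import dataclass


@dataclass
class Proposition:
    """One predicate plus its labelled arguments (hypothetical container)."""
    rel: str
    args: dict  # label such as "Arg0" -> argument text, in surface order

    def render(self) -> str:
        # Print arguments in insertion order, as the slide does.
        return f"{self.rel}({', '.join(self.args.values())})"


expect = Proposition("expect", {"Arg0": "Analysts", "Arg1": "GM-J pact"})
give = Proposition(
    "give",
    {"Arg0": "GM-J pact", "Arg2": "US car maker", "Arg1": "30% stake"},
)

print(expect.render())
print(give.render())
```

Keeping the labels alongside the text means the same object can be printed in slide notation or queried by role, for example `give.args["Arg2"]`.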
Frames File Example: expect
Roles:
Arg0: expecter
Arg1: thing expected
Example (transitive, active): Portfolio managers expect further declines in interest rates.
Arg0: Portfolio managers
REL: expect
Arg1: further declines in interest rates
Frames File Example: give
Roles:
Arg0: giver
Arg1: thing given
Arg2: entity given to
Example (double object): The executives gave the chefs a standing ovation.
Arg0: The executives
REL: gave
Arg2: the chefs
Arg1: a standing ovation
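As a rough illustration of how the role inventories in the two frames-file examples can be used, the sketch below stores them in a dictionary and replaces each numbered label in an annotation with its verb-specific description. The names `FRAMES` and `gloss` are hypothetical, and the actual frames files are not claimed to use this layout.

```python
# Role inventories copied from the "expect" and "give" slides.
FRAMES = {
    "expect": {"Arg0": "expecter", "Arg1": "thing expected"},
    "give": {"Arg0": "giver", "Arg1": "thing given", "Arg2": "entity given to"},
}


def gloss(lemma, labelled_args):
    """Map each numbered label to the verb-specific role description."""
    roles = FRAMES[lemma]
    return {roles[label]: text for label, text in labelled_args.items()}


ovation = gloss(
    "give",
    {
        "Arg0": "The executives",
        "Arg2": "the chefs",
        "Arg1": "a standing ovation",
    },
)
for role, text in ovation.items():
    print(f"{role}: {text}")
```

The point of the lookup is that Arg2 means "entity given to" only for give; the same number can carry a different description under another verb.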
Trends in Argument Numbering • Arg0 = agent • Arg1 = direct object / theme / patient • Arg2 = indirect object / benefactive / instrument / attribute / end state • Arg3 = start point / benefactive / instrument / attribute • Arg4 = end point
Ergative/Unaccusative Verbs
Roles (no ARG0 for unaccusative verbs):
Arg1 = Logical subject, patient, thing rising
Arg2 = EXT, amount risen
Arg3* = start point
Arg4 = end point
Examples:
• Sales rose 4% to $3.28 billion from $3.16 billion.
• The Nasdaq composite index added 1.01 to 456.6 on paltry volume.
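Reading the roles above onto the first example gives one plausible labelling, sketched here in Python. The assignment of phrases to labels is my reading of the slide rather than a quoted annotation, and `amount` is a hypothetical helper. As a sanity check, the extent (Arg2) should roughly equal the relative change from the start point (Arg3) to the end point (Arg4).

```python
# Assumed labelling of "Sales rose 4% to $3.28 billion from $3.16 billion."
rise = {
    "REL": "rose",
    "Arg1": "Sales",            # thing rising
    "Arg2-EXT": "4%",           # amount risen
    "Arg4": "$3.28 billion",    # end point
    "Arg3": "$3.16 billion",    # start point
}


def amount(text):
    """Pull the leading number out of a string like '$3.28 billion'."""
    return float(text.strip("$").split()[0])


start = amount(rise["Arg3"])
end = amount(rise["Arg4"])
percent_change = (end - start) / start * 100

print(f"{percent_change:.1f}% change, reported extent {rise['Arg2-EXT']}")
```

Note that no Arg0 appears in the dictionary, in line with the slide's remark about unaccusative verbs.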
Function tags for English/Chinese (arguments or adjuncts?)
Variety of ArgM's (Arg# > 4):
• TMP - when?
• LOC - where at?
• DIR - where to?
• MNR - how?
• PRP - why?
• TPC - topic
• PRD - this argument refers to or modifies another
• ADV - others
• CND - conditional
• DGR - degree
• FRQ - frequency
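A label such as ArgM-TMP or Arg3-PRD combines an argument slot with one of the function tags listed above. The following sketch splits such a label and looks the tag up; `FUNCTION_TAGS` and `describe` are hypothetical names, and the glosses are copied from the slide.

```python
# Function tags and their glosses, as listed on the slide.
FUNCTION_TAGS = {
    "TMP": "when?",
    "LOC": "where at?",
    "DIR": "where to?",
    "MNR": "how?",
    "PRP": "why?",
    "TPC": "topic",
    "PRD": "this argument refers to or modifies another",
    "ADV": "others",
    "CND": "conditional",
    "DGR": "degree",
    "FRQ": "frequency",
}


def describe(label):
    """Split 'ArgM-TMP' into slot and tag, and attach the tag's gloss."""
    slot, _, tag = label.partition("-")
    if not tag:
        return slot  # plain numbered argument, no function tag
    return f"{slot} ({tag}: {FUNCTION_TAGS.get(tag, 'unknown tag')})"


print(describe("ArgM-TMP"))
print(describe("Arg1"))
```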
Inflection • Verbs also marked for tense/aspect • Passive/Active • Perfect/Progressive • Third singular (is has does was) • Present/Past/Future • Infinitives/Participles/Gerunds/Finites • Modals and negation marked as ArgMs
Word Senses in PropBank
• Orders to ignore word sense not feasible for 700+ verbs
• Mary left the room
• Mary left her daughter-in-law her pearls in her will
Frameset leave.01 "move away from":
Arg0: entity leaving
Arg1: place left
Frameset leave.02 "give":
Arg0: giver
Arg1: thing given
Arg2: beneficiary
How do these relate to traditional word senses as in WordNet?
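The two framesets for leave differ in their role sets, which is why the sense cannot simply be ignored during annotation. The sketch below stores both framesets and lists those whose roles cover a given set of labels. It is a simple filter for illustration, not a sense-disambiguation method, and `FRAMESETS` and `compatible_framesets` are hypothetical names.

```python
# The two framesets for "leave", copied from the slide.
FRAMESETS = {
    "leave.01": {
        "gloss": "move away from",
        "roles": {"Arg0": "entity leaving", "Arg1": "place left"},
    },
    "leave.02": {
        "gloss": "give",
        "roles": {"Arg0": "giver", "Arg1": "thing given", "Arg2": "beneficiary"},
    },
}


def compatible_framesets(lemma, labels):
    """Return ids of framesets for `lemma` whose roles cover every label."""
    return [
        frameset_id
        for frameset_id, frame in FRAMESETS.items()
        if frameset_id.split(".")[0] == lemma
        and set(labels) <= set(frame["roles"])
    ]


# "Mary left her daughter-in-law her pearls": three numbered arguments.
print(compatible_framesets("leave", ["Arg0", "Arg2", "Arg1"]))
# "Mary left the room": two numbered arguments fit either role set.
print(compatible_framesets("leave", ["Arg0", "Arg1"]))
```

The second call shows the limit of the filter: a two-argument use fits both role sets, so choosing leave.01 for "Mary left the room" still depends on the meaning.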
Overlap between Groups and Framesets – 95%
[Figure: WordNet senses of "develop" (WN1 to WN14, WN19, WN20) arranged under Frameset1 and Frameset2]
Palmer, Dang & Fellbaum, NLE 2004
English PropBank Status - (w/ Paul Kingsbury & Scott Cotton)
• Create Frame File for that verb - DONE
  • 3282 lemmas, 4400+ framesets
• First pass: Automatic tagging (Joseph Rosenzweig)
• Second pass: Double blind hand correction
  • 118K predicates – all but 300 done
• Third pass: Solomonization (adjudication)
  • Betsy Klipple, Olga Babko-Malaya – 400 left
• Frameset tags
  • 700+, double blind, almost adjudicated, 92% ITA
• Quality Control and general cleanup
Quality Control and General Cleanup • Frame File consistency checking • Coordination with NYU • Ensuring compatibility of frames and format • Leftover tasks • have, be, become • Adjectival usages • General cleanup • Tense tagging • Finalizing treatment of split arguments (e.g., say) and symmetric arguments (e.g., match) • Supplementing sparse data w/ Brown for selected verbs
Summary of English PropBank (Paul Kingsbury, Olga Babko-Malaya, Scott Cotton)
PropBank II
• Nominalizations (NYU)
  • Lexical Frames DONE
• Event Variables (including temporals and locatives)
• More fine-grained sense tagging
  • Tagging nominalizations w/ WordNet sense
  • Selected verbs and nouns
• Nominal Coreference
  • not names
• Clausal Discourse connectives – selected subset
PropBank II
Annotation layers shown: sense tags; discourse connectives; event variables; nominal reference
Also, [Arg0 substantially lower Dutch corporate tax rates] helped [Arg1 [Arg0 the company] keep [Arg1 its tax outlay] [Arg3-PRD flat] [ArgM-ADV relative to earnings growth]].
Sense tags: help (2,5); tax rate (1); keep (1); company (1)
REL: help
Arg0: tax rates
Arg1: the company keep its tax outlay flat
REL: keep
Arg0: the company
Arg1: its tax outlay
Arg3-PRD: flat
ArgM-ADV: relative to earnings…
Summary of Multilingual TreeBanks, PropBanks * Also 1M word English monolingual PropBank
Agenda
• PropBank I 10:30 – 10:50
  • Automatic labeling of semantic roles
  • Chinese Proposition Bank
• Proposition Bank II 10:50 – 11:30
  • Event variables – Olga Babko-Malaya
  • Sense tagging – Hoa Dang
  • Nominal coreference – Edward Loper
  • Discourse tagging – Aravind Joshi
• Research Areas – 11:30 – 12:00
  • Moving forward – Mitch Marcus
  • Alignment improvement via dependency structures – Yuan Ding
  • Employing syntactic features in MT – Libin Shen
• Lunch 12:00 – 1:30 White Dog
• Research Area - 1:30 – 1:45
  • Clustering – Paul Kingsbury
• DOD Program presentation – 1:45 – 2:15
• Discussion 2:15 – 3:00