
English Proposition Bank: Status Report


  1. English Proposition Bank: Status Report • Olga Babko-Malaya, Paul Kingsbury, Scott Cotton, Martha Palmer, Mitch Marcus • March 25, 2003

  2. Outline • Overview • Status Report • Mapping of Propbank Framesets to other sense distinctions

  3. Example • He sent merchants around the country a form asking them to check one of three answers. • Arg0: He • REL: sent • Arg2: merchants around the country • Arg1: a form asking them to check one of three answers

  4. Predicate-argument structure • send — Agent: He, Goal: merchants, Theme: form • He sent merchants around the country a form asking them to check one of three answers. • (diagram: NP argument positions in the sentence)
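The predicate-argument structure on this slide can be represented as a simple data structure; this is an illustrative sketch (the class name and fields are assumptions for exposition, not the actual PropBank file format):

```python
# Illustrative sketch: a PropBank-style predicate-argument annotation
# represented as a predicate (REL) plus a map from numbered argument
# labels to text spans.
from dataclasses import dataclass, field

@dataclass
class PredArgStructure:
    predicate: str                              # the verb (REL)
    args: dict = field(default_factory=dict)    # role label -> text span

sent = PredArgStructure(
    predicate="sent",
    args={
        "Arg0": "He",                             # Agent
        "Arg2": "merchants around the country",   # Goal
        "Arg1": "a form asking them to check one of three answers",  # Theme
    },
)

print(sent.predicate, sorted(sent.args))
```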

  5. Used At • MITRE, Xerox PARC, Sheffield University, BBN, Syracuse University, IBM, NYU, SRA, CMU, MIT, University of Texas at Dallas, University of Toronto, Columbia University, SPAWAR, and the JHU summer workshop. Also provided to JK Davis, John Josef Costandi, and Steve Maiorano. • Improvements in IE reported in ACL’03 submission

  6. Annotation procedure • Extraction of all sentences with a given verb • First pass: automatic tagging (Joseph Rosenzweig) http://www.cis.upenn.edu/~josephr/TIDES/index.html#lexicon • Second pass: double-blind hand annotation • Third pass: adjudication — tagging tool highlights inconsistencies

  7. Projected delivery dates • Financial subcorpus • alpha release: December, 2001--DONE! • beta release: July, 2002--DONE! • adjudicated release: summer 2003 • Propbank corpus • beta release: Summer 2003 • adjudicated release: December 2003

  8. English PropBank - Current Status • 3183 frame files, corresponding to 3625 distinct predicates (including phrasal variants) - finished! • At least single-annotated: 2915 verbs, 94.5K instances (80% of the Treebank) • At least double-annotated: 2250 verbs, 60K instances (67% of the Treebank) • Adjudicated: 1032 verbs, 25K instances (20% of the Treebank) • Coordinating with NYU on nominalizations – using Penn tagger and Frames files

  9. Word Sense in Propbank • Original plan to ignore word sense not feasible for 700+ verbs • Mary left the room • Mary left her daughter-in-law her pearls in her will • Frameset leave.01 "move away from": Arg0: entity leaving, Arg1: place left • Frameset leave.02 "give": Arg0: giver, Arg1: thing given, Arg2: beneficiary • How do these relate to traditional word senses, as in WordNet?
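The two framesets for 'leave' above can be sketched as a small lookup table; this is a hypothetical encoding for illustration, not the real frame-file XML schema:

```python
# Hypothetical sketch of the two 'leave' framesets as Python data; the
# field names ("gloss", "roles") are illustrative assumptions.
framesets = {
    "leave.01": {"gloss": "move away from",
                 "roles": {"Arg0": "entity leaving", "Arg1": "place left"}},
    "leave.02": {"gloss": "give",
                 "roles": {"Arg0": "giver", "Arg1": "thing given",
                           "Arg2": "beneficiary"}},
}

def roles_for(frameset_id):
    """Look up the role labels defined for a given frameset."""
    return framesets[frameset_id]["roles"]

print(roles_for("leave.02"))
```

Note that the two framesets differ not only in gloss but in role inventory: only leave.02 licenses an Arg2 beneficiary.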

  10. Fine-grained WordNet Senses • Senseval-2 – WSD bakeoff, using WordNet 1.7 • Verb ‘develop’ • WN1: CREATE, MAKE SOMETHING NEW — They developed a new technique • WN2: CREATE BY MENTAL ACT — They developed a new theory of evolution; develop a better way to introduce crystallography techniques

  11. WN Senses: verb ‘develop’ • (diagram: fine-grained WordNet senses WN1–WN14, WN19, WN20, shown individually)

  12. Sense Groups: verb ‘develop’ • (diagram: the same WordNet senses clustered into coarser sense groups)

  13. Propbank Framesets for verb ‘develop’ • Frameset 1 (sense: create/improve) – Arg0: agent, Arg1: thing developed. Example: They developed a new technique • Frameset 2 (sense: come about) – Arg1: non-intentional theme. Example: The plot develops slowly

  14. Mapping between Groups and Framesets • (diagram: the WordNet senses of ‘develop’ partitioned between Frameset1 and Frameset2)
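The mapping the diagram depicts is many-to-one: each fine-grained WordNet sense collapses to a single coarse frameset. A minimal sketch, with the caveat that the specific sense-to-frameset assignments below are illustrative assumptions rather than the actual partition from the diagram:

```python
# Many-to-one mapping from fine-grained WordNet senses of 'develop' to
# coarse Propbank framesets. The individual assignments are assumed for
# illustration only.
wn_to_frameset = {
    "WN1": "Frameset1",   # create/improve
    "WN2": "Frameset1",   # create by mental act -> same coarse frameset
    "WN6": "Frameset2",   # come about (assumed assignment)
}

def frameset_of(wn_sense):
    """Collapse a fine-grained sense to its coarse frameset, if mapped."""
    return wn_to_frameset.get(wn_sense)

print(frameset_of("WN1"))
```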

  15. Sense Hierarchy • Framesets – coarse-grained distinctions • Sense Groups (Senseval-2) – intermediate level (includes Levin classes); 95% overlap • WordNet – fine-grained distinctions

  16. Sense-Tagging of Propbank • Sense tagging is primarily confined to the financial subcorpus, covers about 90% of the polysemous instances in that corpus, and spans 415 verbs • Single-tagged 12K polysemous instances with roleset identifiers • Double-tagged 3K polysemous instances • 94% agreement between annotators
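The 94% figure is raw inter-annotator agreement over the double-tagged instances: the fraction on which both annotators chose the same roleset identifier. A minimal sketch of that computation (the example tag sequences are invented for illustration):

```python
# Raw inter-annotator agreement: fraction of instances on which two
# annotators assigned the same roleset identifier.
def agreement(tags_a, tags_b):
    """tags_a, tags_b: parallel lists of roleset ids, one per instance."""
    assert len(tags_a) == len(tags_b)
    same = sum(a == b for a, b in zip(tags_a, tags_b))
    return same / len(tags_a)

# Toy example: annotators disagree on one of four instances.
a = ["leave.01", "leave.02", "leave.01", "leave.01"]
b = ["leave.01", "leave.02", "leave.02", "leave.01"]
print(agreement(a, b))  # 0.75
```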

  17. Training Automatic Taggers • Stochastic tagger (Dan Gildea) • Results: Gold Standard parses – 73.5 P, 71.7 R; Automatic parses – 59.0 P, 55.4 R • New results: • Using argument labels as features for WSD • EM clustering for assigning argument labels
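The P/R figures above are standard precision and recall over predicted argument labels. A minimal sketch of the computation, treating each annotation decision as a (span, label) pair; the toy prediction/gold sets are invented for illustration:

```python
# Precision and recall over labeled argument decisions, each decision a
# (span, label) pair compared against the gold annotation.
def precision_recall(predicted, gold):
    """predicted, gold: sets of (span, label) pairs."""
    tp = len(predicted & gold)                       # exact matches
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall

pred = {("He", "Arg0"), ("merchants", "Arg2"), ("a form", "Arg0")}
gold = {("He", "Arg0"), ("merchants", "Arg2"), ("a form", "Arg1")}
print(precision_recall(pred, gold))
```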
