Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates

Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates. Matthew Gerber and Joyce Y. Chai. Language & Interaction Research, Department of Computer Science, Michigan State University, East Lansing, Michigan, USA.


Presentation Transcript


  1. Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates. Matthew Gerber and Joyce Y. Chai. Language & Interaction Research, Department of Computer Science, Michigan State University, East Lansing, Michigan, USA

  2. A Motivating Example Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs. • What can traditional SRL systems tell us?

  3. A Motivating Example Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs. • What can traditional SRL systems tell us? • Who is the producer? • What is produced?

  4. A Motivating Example Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs. • What can traditional SRL systems tell us? • Who is the producer? • What is produced? • What is manufactured? • But that’s not the whole story… • Who is the manufacturer?

  5. A Motivating Example Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs. • What can traditional SRL systems tell us? • Who is the producer? • What is produced? • What is manufactured? • But that’s not the whole story… • Who is the manufacturer? • Who ships?

  6. A Motivating Example Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs. • What can traditional SRL systems tell us? • Who is the producer? • What is produced? • What is manufactured? • But that’s not the whole story… • Who is the manufacturer? • Who ships what?

  7. A Motivating Example Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs. • What can traditional SRL systems tell us? • Who is the producer? • What is produced? • What is manufactured? • But that’s not the whole story… • Who is the manufacturer? • Who ships what to whom? (implicit arguments)

  8. Nominal SRL ...saving shipping costs. • Nominal predicates are not new • NomBank (Meyers, 2007) • 115k manual SRL analyses for 4,700 predicates • Identifying NomBank arguments • Jiang and Ng (2006); Liu and Ng (2007) • Many predicates lack arguments in NomBank • 2008 CoNLL Shared Task (Surdeanu et al., 2008) • Gerber et al. (NAACL, 2009) • Predicate filter improves performance • These systems do not address the recovery of implicit arguments

  9. Implicit Argument Identification • Research questions • Where are implicit arguments? • Can we recover them? • Related work • Japanese anaphora • Indirect anaphora (Sasano et al., 2004) • Zero-anaphora (Imamura et al., 2009) • Implicit Arguments • Fine-grained domain model (Palmer et al., 1986) • SemEval Task 10 (Ruppenhofer et al., 2010)

  10. Outline • Implicit Argument Annotation and Analysis • Model Formulation and Features • Evaluation • Conclusions and Future Work

  11. Data Annotation • Ten most prominent NomBank predicates • Derived from verbal role set (ship → shipment) • Frequency of nominal predicate • Difference between verbal/nominal argument counts • [John] shipped [the package]. (argument count: 2) • Shipping costs will decline. (argument count: 0) • Predicate instances annotated: 1,254 • Independently annotated by two annotators • Cohen’s Kappa: 67% • Agreement: both unfilled or both filled identically
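The agreement statistic on this slide can be illustrated with a minimal sketch of Cohen's kappa for two annotators. The toy labels below are hypothetical and are not the study's data; each label stands for how one annotator filled one argument position ("unfilled" or a candidate id), mirroring the slide's agreement criterion.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: sum over labels of the product of marginal rates.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used a single identical label
    return (p_o - p_e) / (1.0 - p_e)

# Hypothetical toy annotations, one entry per argument position.
annotator_1 = ["unfilled", "c1", "c2", "unfilled", "c1", "unfilled"]
annotator_2 = ["unfilled", "c1", "c3", "unfilled", "unfilled", "unfilled"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

Kappa discounts the raw agreement rate by the agreement expected from the annotators' label frequencies alone, which matters here because "unfilled" is a very common label.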

  12. Annotation Analysis [Figure: average number of expressed arguments; series: pre-annotation, post-annotation, verb form]

  13. Annotation Analysis [Figure: average number of expressed arguments; series: pre-annotation, post-annotation, verb form]

  14. Annotation Analysis • Average arguments across all predicates • Pre-annotation: 1.1 • Post-annotation: 1.8 • Verb form: 2.0 • Overall percentage of possible roles filled • Pre-annotation: 28.0% • Post-annotation: 46.2% (a 65% relative increase)
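The 65% figure on this slide is consistent with a relative gain in role coverage rather than a difference in percentage points. A short check using only the two percentages from the slide (the variable names are illustrative):

```python
# Percent of possible roles filled, taken from the slide.
pre_filled = 28.0
post_filled = 46.2

# Relative gain: the increase expressed as a fraction of the starting coverage.
relative_gain = (post_filled - pre_filled) / pre_filled
print(f"{relative_gain:.0%}")
```

The absolute gain is 18.2 percentage points; dividing by the 28.0% starting point gives the relative figure.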

  15. Annotation Analysis • 90% of implicit arguments are within the current or previous three sentences • Only 55% are within the current sentence

  16. Outline • Implicit Argument Annotation and Analysis • Model Formulation and Features • Evaluation • Conclusions and Future Work

  17. Model Formulation Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs. (Candidates c1, c2 and c3 are marked in the example.) • Candidate selection • Core PropBank/NomBank arguments • Two-sentence candidate window • Coreference chaining • Binary classification function
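The candidate selection step can be sketched as a filter over SRL arguments. This is a minimal sketch under stated assumptions, not the authors' implementation: the `Argument` record and function names are hypothetical, "core" is taken to mean numbered roles (arg0, arg1, ...), and the window is read as the predicate's sentence plus the two sentences before it. Coreference chaining and the classifier itself are left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Argument:
    sentence: int   # index of the sentence containing the argument
    predicate: str  # predicate the argument belongs to, e.g. "produce"
    role: str       # PropBank/NomBank-style label, e.g. "arg0", "argm-loc"
    text: str

def is_core(role):
    # Assumption: core roles are the numbered labels (arg0, arg1, ...).
    return role.startswith("arg") and role[3:].isdigit()

def select_candidates(arguments, predicate_sentence, window=2):
    """Core arguments in the predicate's sentence or the `window` sentences before it."""
    first = predicate_sentence - window
    return [a for a in arguments
            if is_core(a.role) and first <= a.sentence <= predicate_sentence]

# Hypothetical SRL output for the slide's two-sentence example.
srl = [
    Argument(0, "produce", "arg0", "Georgia-Pacific and Nekoosa"),
    Argument(0, "produce", "arg1", "market pulp, containerboard and white paper"),
    Argument(1, "manufacture", "arg1", "The goods"),
    Argument(1, "manufacture", "argm-loc", "closer to customers"),
]
candidates = select_candidates(srl, predicate_sentence=1)
```

Each returned candidate would then be paired with an unfilled argument position of the nominal predicate and passed to the binary classifier.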

  18. Model Features • SRL structure • Discourse structure • Other

  19. Model Features: SRL Structure • VerbNet role transition Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs. c3

  20. Model Features: SRL Structure • VerbNet role transition Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs. • Candidate c3 is arg1 of "manufactured" (VN class: create, VN role: product) • Is it also arg1 of "shipping" (VN class: send, VN role: theme)? • Feature value: create.product → send.theme • Captures script-like properties of events • Multiple values are possible
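A minimal sketch of how a role transition feature value such as the one on this slide could be assembled. The function name and the `->` separator are illustrative choices, and the VerbNet lookups are assumed to have happened already; the input is a set of (class, role) pairs because a candidate can contribute more than one value.

```python
def role_transition_features(candidate_roles, target_role):
    """Build 'class.role->class.role' strings from VerbNet (class, role) pairs.

    candidate_roles: (vn_class, vn_role) pairs the candidate already fills.
    target_role: (vn_class, vn_role) of the unfilled argument position.
    """
    target_class, target_vn_role = target_role
    return sorted({
        f"{vn_class}.{vn_role}->{target_class}.{target_vn_role}"
        for vn_class, vn_role in candidate_roles
    })

# "The goods" is the product of a create event; the open slot is the theme of send.
features = role_transition_features([("create", "product")], ("send", "theme"))
```

Treating each transition string as a separate binary feature lets the classifier learn which role-to-role transitions are typical.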

  21. Model Features: SRL Structure • Narrative event chains (Chambers and Jurafsky, 2008) • PMI(manufacture.arg1, ship.arg1) • Computed from SRL output over Gigaword (Graff, 2003) • Advantages: better coverage + relationship strength Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs. (Candidate c3 is highlighted in the example.)
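The PMI score named on this slide can be sketched from co-occurrence counts. This is an illustrative sketch, not the authors' pipeline: it assumes the input is a list of argument-slot pairs observed to share a coreferent entity, and the toy counts below are invented.

```python
import math
from collections import Counter

def build_pmi(pairs):
    """Return a pmi(x, y) function estimated from (slot_x, slot_y) co-occurrences."""
    pair_counts = Counter()
    slot_counts = Counter()
    for x, y in pairs:
        pair_counts[tuple(sorted((x, y)))] += 1  # order-insensitive pair
        slot_counts[x] += 1
        slot_counts[y] += 1
    total_pairs = sum(pair_counts.values())
    total_slots = sum(slot_counts.values())

    def pmi(x, y):
        joint = pair_counts[tuple(sorted((x, y)))] / total_pairs
        if joint == 0:
            return float("-inf")  # never observed together
        p_x = slot_counts[x] / total_slots
        p_y = slot_counts[y] / total_slots
        return math.log(joint / (p_x * p_y))

    return pmi

# Invented toy counts; real counts would come from SRL output over a large corpus.
observed = (
    [("manufacture.arg1", "ship.arg1")] * 3
    + [("manufacture.arg1", "produce.arg0")]
    + [("produce.arg0", "ship.arg0")] * 2
)
pmi = build_pmi(observed)
strong = pmi("manufacture.arg1", "ship.arg1")
weak = pmi("manufacture.arg1", "produce.arg0")
```

A higher score means the two slots share fillers more often than their individual frequencies would predict, which is the "relationship strength" the slide mentions.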

  22. Model Features: Discourse Structure • Penn Discourse TreeBank (Prasad et al., 2008) • Feature value: Contingency.Cause.Result • Might help identify salient discourse segments Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs. (Candidate c2 is highlighted in the example.)

  23. Outline • Implicit Argument Annotation and Analysis • Model Formulation and Features • Evaluation • Conclusions and Future Work

  24. Evaluation Setting • Data processing • Gold SRL labels, OpenNLP coreference, Gigaword • Training (sections 2-21, 24) • 816 annotated predicates • 650 implicitly filled argument positions • LibLinear logistic regression • Testing (section 23) • 437 annotated predicates • 246 implicitly filled argument positions • Baseline heuristic: matching argument positions • Example: Armstrong agreed to sell its carpet operations to Shaw Industries. The sale could help Armstrong.
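One possible reading of the "matching argument positions" baseline, sketched under assumptions: for each unfilled role of a nominal predicate, copy the filler of the same role from the nearest earlier predicate with the same verb form. The `VERB_FORM` table, the function name, and the role assignments in the example are hypothetical; the slide does not spell out the heuristic in this detail.

```python
# Hypothetical mapping from nominal predicates to their verb forms.
VERB_FORM = {"sale": "sell", "shipment": "ship"}

def baseline_fill(nominal, missing_roles, prior_predicates):
    """Fill each missing role from the nearest earlier predicate sharing the verb form.

    prior_predicates: list of (verb_lemma, {role: text}) in document order.
    """
    verb = VERB_FORM.get(nominal, nominal)
    filled = {}
    for role in missing_roles:
        # Walk backwards so the nearest matching predicate wins.
        for lemma, args in reversed(prior_predicates):
            if lemma == verb and role in args:
                filled[role] = args[role]
                break
    return filled

# Assumed SRL analysis of the first sentence in the slide's example.
prior = [
    ("agree", {"arg0": "Armstrong"}),
    ("sell", {"arg0": "Armstrong", "arg1": "its carpet operations",
              "arg2": "Shaw Industries"}),
]
filled = baseline_fill("sale", ["arg0", "arg1", "arg2"], prior)
```

The heuristic only works when the same predicate appears explicitly nearby, which is why it leaves positions unfilled in cases like the manufacture/ship example.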


  26. Evaluation Setting • Methodology (Ruppenhofer et al., 2010) • Ground-truth implicit arguments: • Predicted implicit argument: • Prediction score: • P: total prediction score / prediction count • R: total prediction score / true implicit positions Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs.
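The precision and recall definitions on this slide can be sketched as follows. The slide's formulas for the prediction score did not survive in the transcript, so the overlap measure here is an assumption: the Dice coefficient over token sets, taking the best match among the acceptable ground-truth mentions. All names and the toy data are hypothetical.

```python
def dice(predicted_tokens, true_tokens):
    """Dice coefficient between two token collections (assumed overlap measure)."""
    p, t = set(predicted_tokens), set(true_tokens)
    if not p or not t:
        return 0.0
    return 2 * len(p & t) / (len(p) + len(t))

def evaluate(predictions, gold):
    """predictions: {position: tokens}; gold: {position: [tokens, ...]}."""
    total_score = sum(
        max((dice(tokens, truth) for truth in gold.get(pos, [])), default=0.0)
        for pos, tokens in predictions.items()
    )
    # P: total prediction score / prediction count.
    precision = total_score / len(predictions) if predictions else 0.0
    # R: total prediction score / number of true implicit positions.
    recall = total_score / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical gold standard: two implicit positions for "shipping".
gold = {
    ("shipping", "arg0"): [["Georgia-Pacific", "and", "Nekoosa"]],
    ("shipping", "arg1"): [["The", "goods"], ["market", "pulp"]],
}
predictions = {("shipping", "arg1"): ["goods"]}
precision, recall, f1 = evaluate(predictions, gold)
```

Scoring by overlap gives partial credit for a prediction that covers only part of the true argument, while recall still penalizes positions the system left unfilled.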

  27. Evaluation Results • Overall F1 • Baseline: 26.5% • Discriminative: 42.3% • Human annotator • Two-sentence window: 58.4% • Unlimited window: 67.0%

  28. Evaluation Results

  29. Feature Ablation • Ablation sets • SRL structure (e.g., VerbNet role transition) • Non-SRL information • Discourse structure

  30. Improvements Versus Baseline Olivetti has denied that it violated the rules, asserting that the shipments were properly licensed. However, the legality of these sales is still an open question. Olivetti supplies... • Who is the seller? • Two key pieces of information • Coreference chain for Olivetti (exporting/supplying) • Relationships between exporting, supplying, and sales Olivetti exports...

  31. Outline • Implicit Argument Annotation and Analysis • Model Formulation and Features • Evaluation • Conclusions and Future Work

  32. Conclusions and Future Work • Implicit arguments are prevalent • Add 65% to the coverage of NomBank • Most implicit arguments are near the predicate • 55% in current sentence • 90% within three sentences • Implicit arguments can be automatically extracted • SRL structure is currently the most informative • This is a difficult task and much work remains • Ongoing investigations • Global inference instead of local classification • Unsupervised knowledge acquisition • Data: http://www.cse.msu.edu/~gerberm2/implicit.html
