Language & Interaction Research Semantic Role Labelingof Implicit Arguments forNominal Predicates Matthew Steven Gerber Department of Computer Science Michigan State University East Lansing, Michigan, USA
The Semantic Role • The relationship between a noun phrase and a predicate • John threw a ball to Mary. • A ball was thrown to Mary by John. • … • John: the thrower • a ball: entity thrown • Mary: recipient • Semantic roles semantically normalize this syntactic variation
Semantic Role Labeling (SRL) • Automatically identify semantic roles • Example • John shipped a package from Michigan to California. • Benefits downstream processes • Question answering (Pizzato and Molla, 2008) • Information extraction (Banko and Etzioni, 2008) • Machine translation (Liu and Gildea, 2010) [Sender John] [Predicate shipped] [Thing shipped a package] [Source from Michigan] [Destination to California].
Semantic Role Labeling (SRL) • 2002 – present: mostly focused on verbal SRL • Noun-based (nominal) SRL is also possible • Example • John made a shipment from Michigan to California. [Sender John] made a [Predicate shipment] [Source from Michigan] [Destination to California].
Nominal SRL Based on NomBank • Lexicon entry • Distribution: send a package • Arg0: distributor • Arg1: thing distributed • Arg2: recipient • Annotation • Searle will give pharmacists brochures on the use of prescription drugs for distribution in their stores. • Automatic nominal SRL (Liu and Ng, 2007) • Supervised machine learning • Test over predicates known to take arguments • Searle will give [Arg0 pharmacists] [Arg1 brochures on the use of prescription drugs] for [Predicate distribution] [Location in their stores].
Problems due to Implicit Arguments • Example: John threw a ball to Mary. The catch was amazing. • Evaluation methodology • Test predicates with local arguments • Not all predicates have local arguments • Does not reflect practical system performance • Implicit arguments • Not annotated by NomBank • Often exist in surrounding text • Can they be automatically recovered?
Research Contributions Investigate implicit arguments (IAs) in nominal SRL • Examine the effect of IAs on nominal SRL performance • Improve SRL performance by taking IAs into account • Recover implicit arguments from discourse
Nominal SRL Model • Full syntactic analysis • [Parse tree over the example sentence] Judge Curry ordered Edison to make average [Predicate refunds] of about $45.
Nominal SRL Model • Full syntactic analysis • 23-class classification problem over parse tree nodes • [Parse tree over the example sentence] Judge Curry ordered [Arg0 Edison] to make average [Predicate refunds] [Arg1 of about $45].
Nominal SRL Model • Features • Primarily lexical and syntactic • Most important feature: parse tree path
Nominal SRL Model • Parse tree path illustrated: from the predicate node (NNS) up through the tree and back down to the candidate argument node (NP) • [Parse tree over the example sentence] Judge Curry ordered [? Edison] to make average [Predicate refunds] of about $45.
Nominal SRL Model • Other notable features • Parse path concatenated with the predicate (e.g., the path lexicalized as NNS:refund followed by the node labels) • First word of candidate argument • Greedy forward feature selection • 31 selected features • 2 numeric features • Other features are binarized
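For concreteness, the sketch below shows how a parse tree path of the kind described above could be read off an NLTK-style constituency tree. The function name, the path separators, and the toy sentence are illustrative assumptions rather than the exact encoding used in this work.

```python
# A minimal sketch of parse-tree-path extraction over an NLTK-style
# constituency tree. The function name, the ^/_ separators, and the toy
# sentence are illustrative assumptions.
from nltk import Tree

def parse_tree_path(tree, pred_leaf, arg_pos):
    """Path from the predicate's POS node up to the lowest common ancestor,
    then down to the candidate argument node, e.g. 'NN^VP^S_NP'."""
    pred_pos = tree.leaf_treeposition(pred_leaf)[:-1]  # POS node above the word
    common = 0
    while (common < min(len(pred_pos), len(arg_pos))
           and pred_pos[common] == arg_pos[common]):
        common += 1
    up = [tree[pred_pos[:i]].label() for i in range(len(pred_pos), common - 1, -1)]
    down = [tree[arg_pos[:i]].label() for i in range(common + 1, len(arg_pos) + 1)]
    return "^".join(up) + "_" + "_".join(down)

tree = Tree.fromstring(
    "(S (NP (NNP John)) (VP (VBD made) (NP (DT a) (NN sale))))")
# Path from the predicate 'sale' (leaf 3) up to the root and down to the NP over 'John'.
print(parse_tree_path(tree, 3, (0,)))  # NN^NP^VP^S_NP
```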
Evaluation of Nominal SRL Model • Training data • Sections 2-21 of NomBank • 3.7 million training instances • Development data • Section 24 • Feature selection • Threshold tuning • Testing data • Section 23
Evaluation of Nominal SRL Model • Results • Previous best argument F1: 72.8% (Liu and Ng, 2007) • Our best argument F1: 75.7% (Gerber et al., 2009) • Key differences • New features (e.g., lexicalized parse path) • Single-stage classification pipeline • Problem: ground-truth predicates • In a realistic setting, we won’t have this information
Impact of Implicit Arguments • Extended evaluation • Process each token in test data • Attempt SRL for known nouns • Results: a performance loss of 8.5% when predicates without local arguments are included (see Summary)
Impact of Implicit Arguments • Example error: [Arg0 Canadian] [Predicate investment] rules require that big foreign takeovers meet that standard. • In this sentence all arguments of investment are actually implicit, so the predicted Arg0 is a false positive. • Compare to: [Arg0 Canadian] [Predicate investment] [Arg1 in the US] has declined.
Summary • Nominal SRL system for unstructured text • Supervised argument identification • Works for all 4,700+ distinct NomBank predicates • Argument F1: 75.7% • IAs pose a serious problem for nominal SRL • Performance loss of 8.5% • False positive argument predictions
Research Contributions Investigate implicit arguments (IAs) in nominal SRL • Examine the effect of IAs on nominal SRL performance • Improve SRL performance by taking IAs into account • Recover implicit arguments from discourse
Prevalence of Implicit Arguments • Percentage of predicates with local arguments: 57%
Nominal Classification Model • Filter out predicates whose arguments are implicit • Binary classification over token nodes • Positive class: apply standard SRL system • Negative class: ignore
Nominal Classification Model • Features • Primarily syntactic and lexical • Most important feature: ancestor grammar rules • Example (John made a sale): AGR1: NP -> Det, N; AGR2: VP -> V, NP; AGR3: S -> NP, VP
Nominal Classification Model • Features • Primarily syntactic and lexical • Most important feature: ancestor grammar rules • Other notable features • Candidate token stem • Syntactic category of token’s right sibling • … • Quite different from the SRL argument features
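The ancestor-grammar-rule feature can be illustrated with a short sketch: walk from the token's part-of-speech node up to the root and emit the production at each ancestor. The AGR1/AGR2/AGR3 naming follows the slide's example; everything else (NLTK trees, the toy sentence) is an assumption for illustration.

```python
# A minimal sketch of the ancestor-grammar-rule feature over an NLTK-style
# constituency tree; the encoding of the rules is an illustrative assumption.
from nltk import Tree

def ancestor_grammar_rules(tree, token_leaf):
    """Return the production at each ancestor of the token, nearest first."""
    pos = tree.leaf_treeposition(token_leaf)[:-1]  # POS node above the word
    rules = []
    for depth in range(len(pos) - 1, -1, -1):      # parent of the POS node up to the root
        node = tree[pos[:depth]]
        rhs = ", ".join(child.label() for child in node)
        rules.append(f"{node.label()} -> {rhs}")
    return rules

tree = Tree.fromstring(
    "(S (NP (N John)) (VP (V made) (NP (Det a) (N sale))))")
for i, rule in enumerate(ancestor_grammar_rules(tree, 3), start=1):
    print(f"AGR{i}: {rule}")
# AGR1: NP -> Det, N
# AGR2: VP -> V, NP
# AGR3: S -> NP, VP
```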
Nominal Classification Evaluation • Training data • Tokens from sections 2-21 of TreeBank • 780,000 training instances • Development data • Section 24 • Feature selection • Threshold tuning • Testing data • Section 23
Nominal Classification Evaluation • Baseline systems • Naïve: all known nouns should be labeled for arguments • MLE: MLE scoring of nouns and a tuned prediction threshold • Results: the classification model reaches a predicate F1 of 88% (see Summary)
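As a rough illustration of the MLE baseline, the sketch below scores each noun by the training-set proportion of its occurrences that carry local arguments and tunes a decision threshold on development data. The data layout and the accuracy criterion for tuning are simplifying assumptions.

```python
# A minimal sketch of the MLE baseline: per-noun probability of taking local
# arguments, plus a threshold tuned on development data.
from collections import Counter

def train_mle(training_nouns):
    """training_nouns: iterable of (noun, has_local_args) pairs."""
    totals, positives = Counter(), Counter()
    for noun, has_args in training_nouns:
        totals[noun] += 1
        positives[noun] += int(has_args)
    return {noun: positives[noun] / totals[noun] for noun in totals}

def predict(scores, noun, threshold):
    """Label the noun as argument-taking if its MLE score clears the threshold."""
    return scores.get(noun, 0.0) >= threshold

def tune_threshold(scores, dev_nouns):
    """Pick the threshold that maximizes accuracy on development data
    (dev_nouns: iterable of (noun, has_local_args) pairs)."""
    def accuracy(t):
        return sum(predict(scores, n, t) == y for n, y in dev_nouns) / len(dev_nouns)
    return max((i / 100 for i in range(1, 100)), key=accuracy)
```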
Nominal Classification Evaluation • Results by nominal distribution
Combined Nominal-Argument Evaluation • Task: identify predicate, then identify arguments • Previous best score using ground-truth predicates: 72.8
Combined Nominal-Argument Evaluation • Results by nominal distribution
Summary • IAs pose a serious problem for nominal SRL • Performance loss of 8.5% • Filter nominal predicates whose arguments are implicit • Predicate F1: 88% • End-to-end argument F1: 71% • Problem: Implicit arguments are not recovered!
Research Contributions Investigate implicit arguments (IAs) in nominal SRL • Examine the effect of IAs on nominal SRL performance • Improve SRL performance by taking IAs into account • Recover implicit arguments from discourse
A Motivating Example Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs. • What can traditional SRL systems tell us?
A Motivating Example Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs. • What can traditional SRL systems tell us? • Who is the producer? • What is produced?
A Motivating Example Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs. • What can traditional SRL systems tell us? • Who is the producer? • What is produced? • What is manufactured? • But that’s not the whole story… • Who is the manufacturer?
A Motivating Example Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs. • What can traditional SRL systems tell us? • Who is the producer? • What is produced? • What is manufactured? • But that’s not the whole story… • Who is the manufacturer? • Who ships?
A Motivating Example Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs. • What can traditional SRL systems tell us? • Who is the producer? • What is produced? • What is manufactured? • But that’s not the whole story… • Who is the manufacturer? • Who ships what?
A Motivating Example Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs. • What can traditional SRL systems tell us? • Who is the producer? • What is produced? • What is manufactured? • But that’s not the whole story… • Who is the manufacturer? • Who ships what to whom? Implicit arguments
Implicit Argument Identification • Research questions • Where are implicit arguments? • Can we recover them? • Related work • Japanese anaphora • Indirect anaphora (Sasano et al., 2004) • Zero-anaphora (Imamura et al., 2009) • Implicit arguments • Fine-grained domain model (Palmer et al., 1986) • SemEval Task 10 (Ruppenhofer et al., 2010)
Data Annotation • Ten most prominent NomBank predicates • Derived from a verbal role set (ship → shipment) • Frequency of nominal predicate • Difference between verbal/nominal argument counts • [John] shipped [the package]. (argument count: 2) • Shipping costs will decline. (argument count: 0) • Predicate instances annotated: 1,247 • Independently annotated by two annotators • Cohen's Kappa: 64% • Agreement: both unfilled or both filled identically
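The agreement figure corresponds to a standard Cohen's kappa computation over per-role labels, where a label is either "unfilled" or the identity of the chosen filler, matching the agreement definition above. The sketch below is illustrative; the data layout is an assumption.

```python
# A minimal sketch of the inter-annotator agreement computation. Each item is
# one role slot; its label is 'unfilled' or the id of the chosen filler.
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators over the same items."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both annotators independently pick the same label.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)
```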
Annotation Analysis • [Chart: average number of expressed arguments per predicate, comparing pre-annotation, post-annotation, and verb form]
Annotation Analysis • Average arguments across all predicates • Pre-annotation: 1.1 • Post-annotation: 1.9 • Verb form: 2.0 • Overall percentage of possible roles filled • Pre-annotation: 28.0% • Post-annotation: 47.8% (a relative increase of roughly 71%)
Annotation Analysis • Location of implicit argument fillers • Only 55% are within the current sentence • 90% are within the current or previous three sentences
Model Formulation • Example passage (the original figure marks three candidate fillers c1, c2, c3; c3 is "The goods"): Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs. • Candidate selection • PropBank/NomBank arguments • Two-sentence candidate window • Coreference chaining • Binary classification function
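A minimal sketch of the candidate-selection step is given below: collect PropBank/NomBank arguments from a small sentence window around the predicate and expand each through its coreference chain. The Sentence objects with .arguments and .coref_chain attributes are hypothetical placeholders, and reading "two-sentence candidate window" as the current sentence plus the two preceding ones is an assumption.

```python
# A minimal sketch of candidate selection; data structures are hypothetical
# placeholders for annotated sentences, arguments, and coreference chains.
def candidate_fillers(sentences, pred_sent_idx, window=2):
    """Return candidate constituents that might fill an implicit argument."""
    candidates = []
    for idx in range(max(0, pred_sent_idx - window), pred_sent_idx + 1):
        for arg in sentences[idx].arguments:      # annotated PropBank/NomBank arguments
            candidates.append(arg)
            candidates.extend(arg.coref_chain)    # coreferent mentions of the same entity
    return candidates
```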
Model Features • SRL structure • Discourse structure • Other
Model Features: SRL Structure • VerbNet role transition • Example: Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. [c3 The goods] could be manufactured closer to customers, saving shipping costs.
Model Features: SRL Structure • VerbNet role transition • Example: Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. [c3 The goods] could be manufactured closer to customers, saving shipping costs. • c3 fills arg1 of manufactured (VN class: create, VN role: product); the implicit role is arg1 of shipping (VN class: send, VN role: item) • Feature value: create.product → send.item • Multiple values are possible
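The role-transition feature value can be illustrated with a tiny sketch that maps each predicate/role pair to a VerbNet class.role string and concatenates the candidate side with the missing-role side. The lookup table below is a stand-in for a real PropBank-to-VerbNet mapping such as SemLink; its entries simply mirror the example above.

```python
# A minimal sketch of the VerbNet role-transition feature value. The lookup
# table is an illustrative stand-in for a full role mapping (e.g., SemLink).
VN_ROLE = {
    ("manufacture", "arg1"): ("create", "product"),
    ("ship", "arg1"): ("send", "item"),
}

def role_transition(cand_pred, cand_role, target_pred, target_role):
    """Feature value such as 'create.product -> send.item'; one value per role
    the candidate fills, so multiple values are possible."""
    c_class, c_role = VN_ROLE[(cand_pred, cand_role)]
    t_class, t_role = VN_ROLE[(target_pred, target_role)]
    return f"{c_class}.{c_role} -> {t_class}.{t_role}"

# c3 ("The goods") fills arg1 of "manufactured"; the implicit role is arg1 of "shipping".
print(role_transition("manufacture", "arg1", "ship", "arg1"))  # create.product -> send.item
```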
Model Features: SRL Structure • Narrative event chains (Chambers and Jurafsky, 2008) • PMI(manufacture.arg1, ship.arg1) • Computed from SRL output over Gigaword (Graff, 2003) • Advantages: better coverage + relationship strength • Example: Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. [c3 The goods] could be manufactured closer to customers, saving shipping costs.
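The narrative-event-chain score is a pointwise mutual information between two &lt;predicate, role&gt; events. The sketch below shows the computation, assuming the co-occurrence counts have already been harvested from SRL output over a large corpus; the counting itself is not shown.

```python
# A minimal sketch of the PMI score behind the narrative-event-chain feature,
# given pre-computed counts of <predicate, role> events and event pairs.
import math

def pmi(joint_count, count_x, count_y, total_pairs, total_events):
    """Pointwise mutual information between two <predicate, role> events,
    e.g. manufacture.arg1 and ship.arg1."""
    p_xy = joint_count / total_pairs
    p_x = count_x / total_events
    p_y = count_y / total_events
    return math.log(p_xy / (p_x * p_y))
```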
Model Features: Discourse Structure • Penn Discourse TreeBank (Prasad et al., 2008) • Feature value: Contingency.Cause.Result • Might help identify salient discourse segments • Example: Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs.