SRL via Generalized Inference

SRL via Generalized Inference • Vasin Punyakanok, Dan Roth, Wen-tau Yih, Dav Zimak, Yuancheng Tu • Department of Computer Science, University of Illinois at Urbana-Champaign

Semantic Role Labeling • For each verb in a sentence • identify all constituents that fill a semantic role • determine their roles • Agent, Patient or Instrument, … • Their adjuncts, e.g., Locative, Temporal or Manner • The PropBank project [Kingsbury & Palmer 02] provides a large human-annotated corpus of semantic verb-argument relations. • CoNLL-2004 shared task [Carreras & Marquez 04]

Example • A0 represents the leaver, • A1 represents the thing left, • A2 represents the benefactor, • AM-LOC is an adjunct indicating the location of the action, • V determines the verb.

Argument Types • A0-A5 and AA have different semantics for each verb as specified in the PropBank Frame files. • 13 types of adjuncts labeled as AM-XXX where XXX specifies the adjunct type. • C-XXX is used to specify the continuity of the argument XXX. • In some cases, the actual agent is labeled as the appropriate argument type, XXX, while the relative pronoun is instead labeled as R-XXX.

Examples • C-XXX • R-XXX

Outline • Find potential argument candidates • Classify arguments to types • Inference for Argument Structure • Cost Function • Constraints • Integer linear programming (ILP) • Results & Discussion

Find Potential Arguments • Example sentence: I left my nice pearls to her • An argument can be any sequence of consecutive words • Restrict potential arguments • BEGIN(word) • BEGIN(word) = 1 $\Leftrightarrow$ “word begins argument” • END(word) • END(word) = 1 $\Leftrightarrow$ “word ends argument” • Argument • $(w_i, \ldots, w_j)$ is a potential argument iff • BEGIN($w_i$) = 1 and END($w_j$) = 1 • Reduce set of potential arguments

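Details – Word-level Classifier • BEGIN(word) • Learn a function • $B(\text{word}, \text{context}, \text{structure}) \to \{0,1\}$ • END(word) • Learn a function • $E(\text{word}, \text{context}, \text{structure}) \to \{0,1\}$ • $\text{POTARG} = \{\text{arg} \mid \text{BEGIN}(\text{first}(\text{arg})) \text{ and } \text{END}(\text{last}(\text{arg}))\}$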
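The POTARG definition above can be sketched in a few lines. This is only an illustration: the BEGIN/END values below are made up rather than produced by a trained classifier, and the function name `potential_arguments` is hypothetical.

```python
from typing import List, Tuple

def potential_arguments(begin: List[int], end: List[int]) -> List[Tuple[int, int]]:
    """Return every span (i, j), i <= j, with begin[i] == 1 and end[j] == 1."""
    return [
        (i, j)
        for i, b in enumerate(begin) if b == 1
        for j in range(i, len(end)) if end[j] == 1
    ]

words = ["I", "left", "my", "nice", "pearls", "to", "her"]
# Hypothetical word-level predictions (assumed for illustration only).
begin = [1, 0, 1, 0, 0, 1, 0]
end = [1, 0, 0, 0, 1, 0, 1]

spans = potential_arguments(begin, end)
for i, j in spans:
    print((i, j), " ".join(words[i:j + 1]))
```

Only spans that start at a predicted BEGIN word and finish at a predicted END word survive, which is how the candidate set is reduced compared with all consecutive word sequences.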

Arguments Type Likelihood • Example sentence: I left my nice pearls to her • Assign type-likelihood • How likely is it that arg a is type t? • For all $a \in \text{POTARG}$, $t \in T$ • $P(\text{argument } a = \text{type } t)$ • Example likelihoods over the types (A0, C-A1, A1, Ø): (0.3, 0.2, 0.2, 0.3) for one candidate and (0.6, 0.0, 0.0, 0.4) for another

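Details – Phrase-level Classifier • Learn a classifier • ARGTYPE(arg) • $P(\text{arg}) \to \{\text{A0}, \text{A1}, \ldots, \text{C-A0}, \ldots, \text{AM-LOC}, \ldots\}$ • $\arg\max_{t \in \{\text{A0}, \text{A1}, \ldots, \text{C-A0}, \ldots, \text{AM-LOC}, \ldots\}} w_t \cdot P(\text{arg})$ • Estimate Probabilities • Softmax • $P(a = t) = \exp(w_t \cdot P(a)) / Z$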
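The softmax step can be illustrated as follows. This is a minimal sketch: the raw scores stand in for the products $w_t \cdot P(a)$ and are invented for the example, and `"NULL"` is used here as a plain-text name for the Ø type.

```python
import math
from typing import Dict

def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Turn per-type scores into probabilities: exp(score) / Z."""
    m = max(scores.values())  # subtract the max for numerical stability
    exps = {t: math.exp(s - m) for t, s in scores.items()}
    z = sum(exps.values())    # normalizer Z
    return {t: e / z for t, e in exps.items()}

# Hypothetical classifier activations for one candidate argument.
scores = {"A0": 1.2, "A1": 0.3, "C-A1": -0.5, "NULL": 0.0}
probs = softmax(scores)
best = max(probs, key=probs.get)
print(best, probs)
```

Softmax preserves the ranking of the raw scores, so the argmax type is unchanged; what it adds is a normalized likelihood per type that the later inference step can sum.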

What is a Good Assignment? • Likelihood of being correct • $P(\text{Arg } a = \text{Type } t)$ • if t is the correct type for argument a • For a set of arguments $a_1, a_2, \ldots, a_n$ • Expected number of arguments that are correct • $\sum_i P(a_i = t_i)$ • We search for the assignment with the maximum expected number of correct arguments.

Inference • Maximize expected number correct • $T^* = \arg\max_T \sum_i P(a_i = t_i)$ • Subject to some constraints • Structural and Linguistic (R-A1 $\Rightarrow$ A1) • Example sentence: I left my nice pearls to her • Type likelihoods for four candidate arguments: (0.3, 0.2, 0.2, 0.3), (0.6, 0.0, 0.0, 0.4), (0.1, 0.3, 0.5, 0.1), (0.1, 0.2, 0.3, 0.4) • Independent Max: Cost = 0.3 + 0.6 + 0.5 + 0.4 = 1.8 • Non-Overlapping: Cost = 0.3 + 0.4 + 0.5 + 0.4 = 1.6 • Blue $\Rightarrow$ Red & N-O (colors refer to the original slide figure): Cost = 0.3 + 0.4 + 0.3 + 0.4 = 1.4
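The three costs on this slide are sums of one likelihood per candidate. The sketch below recomputes them. The column indices chosen for the two constrained cases are one reading of which entries the slide summed (an assumption), and `expected_correct` is a hypothetical helper name.

```python
from typing import List

# One row of type likelihoods per candidate argument, as listed on the slide.
rows = [
    [0.3, 0.2, 0.2, 0.3],
    [0.6, 0.0, 0.0, 0.4],
    [0.1, 0.3, 0.5, 0.1],
    [0.1, 0.2, 0.3, 0.4],
]

def expected_correct(rows: List[List[float]], choice: List[int]) -> float:
    """Sum of P(a_i = t_i) for the chosen column index of each row."""
    return sum(row[k] for row, k in zip(rows, choice))

# Independent max: each candidate takes its own best type.
independent = [row.index(max(row)) for row in rows]
independent_cost = expected_correct(rows, independent)

# Assumed column choices matching the slide's constrained sums.
non_overlap_cost = expected_correct(rows, [0, 3, 2, 3])
all_constraints_cost = expected_correct(rows, [0, 3, 1, 3])

print(round(independent_cost, 2), round(non_overlap_cost, 2), round(all_constraints_cost, 2))
```

Each added constraint can only shrink the feasible set, so the best achievable cost never increases as constraints are added.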

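LP Formulation – Linear Cost • Cost function • $\sum_{a \in \text{POTARG}} P(a = t) = \sum_{a \in \text{POTARG},\, t \in T} P(a = t)\, x_{\{a=t\}}$ • Indicator variables $x_{\{a_1=\text{A0}\}}, x_{\{a_1=\text{A1}\}}, \ldots, x_{\{a_4=\text{AM-LOC}\}}, x_{\{a_4=\text{Ø}\}} \in \{0,1\}$ • Total Cost $= p(a_1=\text{A0}) \cdot x_{\{a_1=\text{A0}\}} + p(a_1=\text{Ø}) \cdot x_{\{a_1=\text{Ø}\}} + \ldots + p(a_4=\text{Ø}) \cdot x_{\{a_4=\text{Ø}\}}$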
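To make the indicator-variable encoding concrete, the sketch below builds the 0/1 variables for a small assignment and evaluates the linear cost. The candidate names, the type order and the use of `"NULL"` for Ø are assumptions made for illustration.

```python
from typing import Dict, Tuple

TYPES = ["A0", "C-A1", "A1", "NULL"]  # assumed column order; "NULL" stands for Ø

# Likelihoods for two candidates (illustrative values).
probs = {
    "a1": [0.3, 0.2, 0.2, 0.3],
    "a2": [0.6, 0.0, 0.0, 0.4],
}
p: Dict[Tuple[str, str], float] = {
    (a, t): probs[a][k] for a in probs for k, t in enumerate(TYPES)
}

def indicators(assignment: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """x[(a, t)] = 1 if candidate a is assigned type t, else 0."""
    return {(a, t): int(assignment[a] == t) for (a, t) in p}

def total_cost(x: Dict[Tuple[str, str], int]) -> float:
    """Linear cost: sum over all (a, t) of p(a = t) * x[(a, t)]."""
    return sum(p[key] * x[key] for key in p)

x = indicators({"a1": "A0", "a2": "NULL"})
cost = total_cost(x)
print(round(cost, 2))
```

Because exactly one indicator per candidate is set, the weighted sum over all (candidate, type) pairs collapses to the sum of the chosen likelihoods, which is what makes the objective linear in x.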

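Linear Constraints (1/2) • Binary values: $\forall a \in \text{POTARG},\, t \in T,\; x_{\{a=t\}} \in \{0,1\}$ • Unique labels: $\forall a \in \text{POTARG},\; \sum_{t \in T} x_{\{a=t\}} = 1$ • No overlapping or embedding: $a_1$ and $a_2$ overlap $\Rightarrow x_{\{a_1=\text{Ø}\}} + x_{\{a_2=\text{Ø}\}} \geq 1$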

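Linear Constraints (2/2) • No duplicate argument classes: $\sum_{a \in \text{POTARG}} x_{\{a=\text{A0}\}} \leq 1$ • R-XXX: $\forall a_2 \in \text{POTARG},\; \sum_{a \in \text{POTARG}} x_{\{a=\text{A0}\}} \geq x_{\{a_2=\text{R-A0}\}}$ • C-XXX: $\forall a_2 \in \text{POTARG},\; \sum_{(a \in \text{POTARG}) \wedge (a \text{ is before } a_2)} x_{\{a=\text{A0}\}} \geq x_{\{a_2=\text{C-A0}\}}$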
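The effect of these constraints can be checked on a tiny instance by exhaustive search. The sketch below is a brute-force stand-in for an ILP solver, not the solver used in the talk. The candidate spans, the type order and the name `"NULL"` for Ø are assumptions, and the R-XXX constraint is omitted because the toy type set has no R- labels.

```python
from itertools import product

TYPES = ["A0", "C-A1", "A1", "NULL"]  # assumed order; "NULL" stands for Ø
CORE = ["A0", "A1"]                   # classes that may appear at most once

# Hypothetical candidate spans (word indices, inclusive) and their likelihoods.
cands = {"a1": (0, 0), "a2": (2, 3), "a3": (2, 4), "a4": (5, 6)}
scores = {
    "a1": [0.3, 0.2, 0.2, 0.3],
    "a2": [0.6, 0.0, 0.0, 0.4],
    "a3": [0.1, 0.3, 0.5, 0.1],
    "a4": [0.1, 0.2, 0.3, 0.4],
}

def overlaps(s, t):
    """True if two inclusive spans share a word (covers embedding too)."""
    return s[0] <= t[1] and t[0] <= s[1]

def feasible(assign):
    names = list(cands)
    # No overlapping or embedding: at least one of each overlapping pair is NULL.
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if overlaps(cands[a], cands[b]) and "NULL" not in (assign[a], assign[b]):
                return False
    # No duplicate argument classes.
    if any(sum(assign[a] == t for a in names) > 1 for t in CORE):
        return False
    # C-A1 needs an A1 that ends before it starts.
    for a2 in names:
        if assign[a2] == "C-A1" and not any(
            assign[a] == "A1" and cands[a][1] < cands[a2][0] for a in names
        ):
            return False
    return True

def cost(assign):
    return sum(scores[a][TYPES.index(assign[a])] for a in cands)

# Unique labels hold by construction: each candidate gets exactly one type.
assignments = (dict(zip(cands, labels)) for labels in product(TYPES, repeat=len(cands)))
best = max((a for a in assignments if feasible(a)), key=cost)
print(best, round(cost(best), 2))
```

Enumeration is exponential in the number of candidates, so it only works for toy inputs; the linear formulation above is what lets an off-the-shelf ILP solver handle realistic sentences.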

Results on Perfect Boundaries • Assume the boundaries of arguments (in both training and testing) are given. • Development Set

Results • Overall F1 on Test Set: 66.39

Discussion • Data analysis is important! • F1: ~45% → ~65% • Feature engineering, parameter tuning, … • Global inference helps! • Using all constraints gains more than 1% F1 compared to using only the non-overlapping constraints • Easy and fast: 15~20 minutes • Performance difference? • Not from word-based vs. chunk-based

Thank you yih@uiuc.edu
