SRL via Generalized Inference Vasin Punyakanok, Dan Roth, Wen-tau Yih, Dav Zimak, Yuancheng Tu Department of Computer Science University of Illinois at Urbana-Champaign

Semantic Role Labeling
• For each verb in a sentence
  • identify all constituents that fill a semantic role
  • determine their roles
    • Agent, Patient or Instrument, …
    • Their adjuncts, e.g., Locative, Temporal or Manner
• PropBank project [Kingsbury & Palmer 02] provides a large human-annotated corpus of semantic verb-argument relations.
• CoNLL-2004 shared task [Carreras & Marquez 04]

Example
• A0 represents the leaver,
• A1 represents the thing left,
• A2 represents the benefactor,
• AM-LOC is an adjunct indicating the location of the action,
• V marks the verb.

Argument Types
• A0-A5 and AA have different semantics for each verb, as specified in the PropBank Frame files.
• 13 types of adjuncts are labeled AM-XXX, where XXX specifies the adjunct type.
• C-XXX marks the continuation of a split argument XXX.
• In some cases, the actual agent is labeled with the appropriate argument type, XXX, while the relative pronoun referring to it is labeled R-XXX.

Examples
• C-XXX
• R-XXX

Outline
• Find potential argument candidates
• Classify arguments to types
• Inference for Argument Structure
  • Cost Function
  • Constraints
  • Integer linear programming (ILP)
• Results & Discussion

Find Potential Arguments
[Figure: the sentence "I left my nice pearls to her" with brackets marking candidate argument boundaries]
• An argument can be any sequence of consecutive words
• Restrict potential arguments
  • BEGIN(word)
    • BEGIN(word) = 1: "word begins argument"
  • END(word)
    • END(word) = 1: "word ends argument"
• Argument
  • $(w_i, \ldots, w_j)$ is a potential argument iff
  • $\mathrm{BEGIN}(w_i) = 1$ and $\mathrm{END}(w_j) = 1$
• Reduce set of potential arguments

Details – Word-level Classifier
• BEGIN(word)
  • Learn a function
  • $B(\mathrm{word}, \mathrm{context}, \mathrm{structure}) \to \{0,1\}$
• END(word)
  • Learn a function
  • $E(\mathrm{word}, \mathrm{context}, \mathrm{structure}) \to \{0,1\}$
• $\mathrm{POTARG} = \{\mathrm{arg} \mid \mathrm{BEGIN}(\mathrm{first}(\mathrm{arg})) \text{ and } \mathrm{END}(\mathrm{last}(\mathrm{arg}))\}$
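
The candidate-generation step above can be sketched in a few lines of Python. This is only an illustration: the function name `potential_arguments` is hypothetical, and the BEGIN/END values are made up by hand rather than produced by the learned classifiers B and E.

```python
from typing import List, Tuple


def potential_arguments(begin: List[int], end: List[int]) -> List[Tuple[int, int]]:
    """Return every span (i, j), i <= j, with begin[i] == 1 and end[j] == 1."""
    if len(begin) != len(end):
        raise ValueError("begin and end must have one entry per word")
    return [
        (i, j)
        for i, b in enumerate(begin)
        if b == 1
        for j in range(i, len(end))
        if end[j] == 1
    ]


words = "I left my nice pearls to her".split()

# Hypothetical word-level predictions, invented for illustration only.
begin = [1, 0, 1, 0, 0, 1, 0]
end = [1, 0, 0, 0, 1, 0, 1]

spans = potential_arguments(begin, end)
for i, j in spans:
    print((i, j), " ".join(words[i : j + 1]))
```

With these made-up predictions, 6 spans survive out of the 28 possible spans of consecutive words in a 7-word sentence, which is the "reduce set of potential arguments" effect the slide describes.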
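
Argument Type Likelihood
[Figure: the sentence "I left my nice pearls to her" with bracketed candidate arguments]
• Assign type-likelihood
  • How likely is it that arg a is type t?
  • For all $a \in \mathrm{POTARG}$, $t \in T$
  • $P(\text{argument } a = \text{type } t)$

Type probabilities for two of the candidates (one row per candidate):
  A0    C-A1   A1    Ø
  0.3   0.2    0.2   0.3
  0.6   0.0    0.0   0.4

Details – Phrase-level Classifier
• Learn a classifier
  • ARGTYPE(arg)
  • $P(\mathrm{arg}) \to \{\text{A0}, \text{A1}, \ldots, \text{C-A0}, \ldots, \text{AM-LOC}, \ldots\}$
  • $\arg\max_{t \in \{\text{A0}, \text{A1}, \ldots, \text{C-A0}, \ldots, \text{AM-LOC}, \ldots\}} w_t \cdot P(\mathrm{arg})$
• Estimate Probabilities
  • Softmax
  • $P(a = t) = \exp(w_t \cdot P(a)) / Z$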
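
As a rough sketch of the softmax step, the snippet below turns per-type scores into probabilities. The scores stand in for the activations $w_t \cdot P(a)$ and are invented for illustration; `Z` is the usual normalizer, the sum of the exponentiated scores over all types. Subtracting the maximum score before exponentiating is a numerical-stability habit added here, not something the slides mention, and it leaves the result unchanged. The label "NULL" stands in for Ø.

```python
import math
from typing import Dict


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Map raw per-type scores to probabilities that sum to 1."""
    m = max(scores.values())  # shift for numerical stability only
    exps = {t: math.exp(s - m) for t, s in scores.items()}
    z = sum(exps.values())  # the normalizer Z
    return {t: e / z for t, e in exps.items()}


# Hypothetical activations for one candidate argument.
scores = {"A0": 1.2, "A1": 0.3, "C-A1": -0.5, "NULL": 0.0}

probs = softmax(scores)
best = max(probs, key=probs.get)  # same type as the argmax over raw scores
print(best, {t: round(p, 3) for t, p in probs.items()})
```

Because softmax is monotonic in the scores, the most probable type is always the type with the highest activation; the probabilities only matter once several candidates are combined in the inference step.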

What is a Good Assignment?
• Likelihood of being correct
  • $P(\text{Arg } a = \text{Type } t)$
  • if t is the correct type for argument a
• For a set of arguments $a_1, a_2, \ldots, a_n$
  • Expected number of arguments that are correct
  • $\sum_i P(a_i = t_i)$
• We search for the assignment with the maximum expected number of correct arguments.

Inference
• Maximize expected number correct
  • $T^* = \arg\max_T \sum_i P(a_i = t_i)$
• Subject to some constraints
  • Structural and Linguistic ($\text{R-A1} \Rightarrow \text{A1}$)

[Figure: the sentence "I left my nice pearls to her" with four candidate arguments and their type probabilities]

Type probabilities (one row per candidate):
  A0    C-A1   A1    Ø
  0.3   0.2    0.2   0.3
  0.6   0.0    0.0   0.4
  0.1   0.3    0.5   0.1
  0.1   0.2    0.3   0.4

• Independent Max: Cost = 0.3 + 0.6 + 0.5 + 0.4 = 1.8
• Non-Overlapping: Cost = 0.3 + 0.4 + 0.5 + 0.4 = 1.6
• Blue ⇒ Red & N-O: Cost = 0.3 + 0.4 + 0.3 + 0.4 = 1.4
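
The first two costs on this slide can be reproduced with a small brute-force search. This is a sketch under stated assumptions, not the ILP the talk goes on to use: the probabilities are the four rows from the slide, but which candidates overlap is only visible in the original figure, so the single overlapping pair below (second and third candidate) is an assumption chosen because it is consistent with the slide's numbers. "NULL" stands in for Ø.

```python
from itertools import product

TYPES = ["A0", "C-A1", "A1", "NULL"]

# One row per candidate argument, columns in the order of TYPES.
PROBS = [
    [0.3, 0.2, 0.2, 0.3],
    [0.6, 0.0, 0.0, 0.4],
    [0.1, 0.3, 0.5, 0.1],
    [0.1, 0.2, 0.3, 0.4],
]

# ASSUMPTION: candidates 1 and 2 (0-based) overlap; not recoverable from the text.
OVERLAPS = [(1, 2)]


def score(assignment):
    """Expected number of correct arguments: sum of the chosen probabilities."""
    return sum(PROBS[i][TYPES.index(t)] for i, t in enumerate(assignment))


def non_overlapping(assignment):
    """At least one argument in every overlapping pair must be NULL."""
    return all(assignment[i] == "NULL" or assignment[j] == "NULL" for i, j in OVERLAPS)


def best_assignment(feasible=lambda a: True):
    """Exhaustively search all labelings and keep the best feasible one."""
    candidates = [a for a in product(TYPES, repeat=len(PROBS)) if feasible(a)]
    return max(candidates, key=score)


independent = best_assignment()
constrained = best_assignment(non_overlapping)
print(independent, round(score(independent), 1))
print(constrained, round(score(constrained), 1))
```

Exhaustive search is exponential in the number of candidates, which is why a solver-based formulation is attractive once sentences have many candidate arguments.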
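
LP Formulation – Linear Cost
• Cost function
  • $\sum_{a \in \mathrm{POTARG}} P(a = t) = \sum_{a \in \mathrm{POTARG},\, t \in T} P(a = t)\, x_{\{a=t\}}$
• Indicator variables
  • $x_{\{a_1=\text{A0}\}}, x_{\{a_1=\text{A1}\}}, \ldots, x_{\{a_4=\text{AM-LOC}\}}, x_{\{a_4=\varnothing\}} \in \{0,1\}$
• Total Cost $= p(a_1=\text{A0}) \cdot x_{\{a_1=\text{A0}\}} + p(a_1=\varnothing) \cdot x_{\{a_1=\varnothing\}} + \ldots + p(a_4=\varnothing) \cdot x_{\{a_4=\varnothing\}}$

Linear Constraints (1/2)
• Binary values
  • $\forall a \in \mathrm{POTARG},\, t \in T:\quad x_{\{a=t\}} \in \{0,1\}$
• Unique labels
  • $\forall a \in \mathrm{POTARG}:\quad \sum_{t \in T} x_{\{a=t\}} = 1$
• No overlapping or embedding
  • $a_1$ and $a_2$ overlap $\Rightarrow\; x_{\{a_1=\varnothing\}} + x_{\{a_2=\varnothing\}} \ge 1$

Linear Constraints (2/2)
• No duplicate argument classes
  • $\sum_{a \in \mathrm{POTARG}} x_{\{a=\text{A0}\}} \le 1$
• R-XXX
  • $\forall a_2 \in \mathrm{POTARG}:\quad \sum_{a \in \mathrm{POTARG}} x_{\{a=\text{A0}\}} \ge x_{\{a_2=\text{R-A0}\}}$
• C-XXX
  • $\forall a_2 \in \mathrm{POTARG}:\quad \sum_{a \in \mathrm{POTARG},\ a \text{ is before } a_2} x_{\{a=\text{A0}\}} \ge x_{\{a_2=\text{C-A0}\}}$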
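
To make the constraints concrete, the sketch below encodes a labeling as 0/1 indicator variables and checks each inequality from the two constraint slides directly. It is a feasibility checker, not an ILP solver, and several details are assumptions: spans are inclusive word-index pairs, "a is before a2" is read as "a ends before a2 starts", "NULL" stands in for Ø, and the duplicate, R-XXX and C-XXX checks are applied to a single core type (A0, as on the slides). All function and variable names are hypothetical.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (first word index, last word index), inclusive
NULL = "NULL"           # stands in for the empty label


def overlaps(a: Span, b: Span) -> bool:
    """True if the spans share a word (this also covers embedding)."""
    return a[0] <= b[1] and b[0] <= a[1]


def indicators(assignment: Dict[Span, str], types: List[str]) -> Dict[Tuple[Span, str], int]:
    """x[(a, t)] = 1 if argument a is labeled t, else 0."""
    return {(a, t): int(assignment[a] == t) for a in assignment for t in types}


def satisfies_constraints(assignment: Dict[Span, str], types: List[str], core: str = "A0") -> bool:
    x = indicators(assignment, types)
    spans = list(assignment)
    # Unique labels: each argument gets exactly one type from `types`.
    if any(sum(x[a, t] for t in types) != 1 for a in spans):
        return False
    # No overlapping or embedding: one of each overlapping pair must be NULL.
    for i, a1 in enumerate(spans):
        for a2 in spans[i + 1:]:
            if overlaps(a1, a2) and x[a1, NULL] + x[a2, NULL] < 1:
                return False
    # No duplicate argument classes (checked for `core` only).
    if sum(x[a, core] for a in spans) > 1:
        return False
    r_type, c_type = "R-" + core, "C-" + core
    for a2 in spans:
        # R-XXX: a reference needs the referent type somewhere in the sentence.
        if r_type in types and sum(x[a, core] for a in spans) < x[a2, r_type]:
            return False
        # C-XXX: a continuation needs the same type in an earlier span.
        earlier = sum(x[a, core] for a in spans if a[1] < a2[0])
        if c_type in types and earlier < x[a2, c_type]:
            return False
    return True


TYPES = ["A0", "R-A0", "C-A0", "A1", NULL]
ok = {(0, 0): "A0", (2, 4): "A1", (5, 6): NULL}
duplicate = {(0, 0): "A0", (2, 4): "A0", (5, 6): NULL}
embedded = {(0, 4): "A0", (2, 4): "A1"}
dangling_ref = {(0, 0): NULL, (2, 4): "R-A0"}
cont_ok = {(0, 0): "A0", (2, 4): "C-A0"}
cont_first = {(0, 0): "C-A0", (2, 4): "A0"}

for name, labeling in [("ok", ok), ("duplicate", duplicate), ("embedded", embedded),
                       ("dangling_ref", dangling_ref), ("cont_ok", cont_ok),
                       ("cont_first", cont_first)]:
    print(name, satisfies_constraints(labeling, TYPES))
```

Because every check is a linear inequality over the indicator variables, the same conditions can be handed unchanged to an integer linear programming solver together with the linear cost.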

Results on Perfect Boundaries
Assume the boundaries of arguments (in both training and testing) are given.
Development Set

Results
• Overall F1 on Test Set: 66.39

Discussion
• Data analysis is important!!
  • F1: ~45% → ~65%
  • Feature engineering, parameter tuning, …
• Global inference helps!
  • Using all constraints gains more than 1% F1 compared to just using non-overlapping constraints
  • Easy and fast: 15-20 minutes
• Performance difference?
  • Not from word-based vs. chunk-based

Thank you yih@uiuc.edu