A Structured Model for Joint Learning of Argument Roles and Predicate Senses. Yotaro Watanabe Masayuki Asahara Yuji Matsumoto. Tohoku University Nara Institute of Science and Technology. ACL 2010 Uppsala, Sweden July 12, 2010.
Predicate-Argument Structure Analysis (Semantic Role Labeling)
• The task of analyzing predicates and their arguments
• A predicate represents a state or an event, and its arguments have relations to the predicate
• Each argument has a particular semantic role (Agent, Theme, etc.)
• In recent years, predicate sense disambiguation has been included in predicate-argument structure analysis [Surdeanu+ 08, Hajič+ 09]
• ‘sell.01’ means that ‘sold’ is an instance of the first sense of ‘sell’
• Important for many NLP applications
• MT, QA, RTE, etc.
Example (figure): "The luxury auto maker last year sold 1,214 cars in the U.S." Role labels shown: Temporal, Product, Agent, Theme, Agent, Location. Predicate senses shown: maker.01, sell.01.
Two Types of Dependencies of Elements in Predicate-Argument Structures
• Inter-dependencies between a predicate and its arguments
• A1: car => we can infer that the correct sense is drive.01
• Non-local dependencies among arguments
• No two arguments of a predicate should have the same role
• In general, the obligatory roles of the predicate should appear in the sentence
Example (figure): "Paul drove his car", with dependency labels SBJ, NMOD, OBJ, roles A0 and A1, and sense drive.01
drive.01: drive a vehicle (A0: driver, A1: vehicle)
drive.02: cause to move (A0: driver, A1: things in motion)
To realize robust predicate-argument structure analysis, it is necessary to deal with both types of dependencies.
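The two non-local dependencies among arguments can be illustrated as simple checks over a candidate role assignment. This is only a minimal sketch: the `OBLIGATORY_ROLES` table, the function names, and the decision to treat both A0 and A1 as obligatory are assumptions made for illustration, not part of the presented system.

```python
from collections import Counter

# Hypothetical frame inventory based on the two senses of "drive" on the slide.
# Treating both A0 and A1 as obligatory is an assumption for this sketch.
OBLIGATORY_ROLES = {
    "drive.01": {"A0", "A1"},
    "drive.02": {"A0", "A1"},
}

def duplicated_roles(roles):
    """Return the roles assigned to more than one argument (NONE is ignored)."""
    counts = Counter(r for r in roles.values() if r != "NONE")
    return {role for role, n in counts.items() if n > 1}

def missing_obligatory_roles(sense, roles):
    """Return the obligatory roles of `sense` that no argument fills."""
    return OBLIGATORY_ROLES[sense] - set(roles.values())

# Two candidate assignments for "Paul drove his car".
good = {"Paul": "A0", "his": "NONE", "car": "A1"}
bad = {"Paul": "A0", "his": "NONE", "car": "A0"}

print(duplicated_roles(bad))                      # roles used twice in `bad`
print(missing_obligatory_roles("drive.01", bad))  # roles left unfilled in `bad`
```

Checks of this kind look at the whole assignment at once, which is why they cannot be scored one argument at a time.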
Previous Work
• Non-local dependencies among arguments: Re-ranking [Johansson and Nugues 2008, etc.]
• Generate N-best assignments of argument roles, extract global features for each assignment, then select the argmax using the re-ranker
• Cannot explicitly capture inter-dependencies between a predicate and its arguments
• Inter-dependencies between a predicate and its arguments: Markov Logic Networks [Meza-Ruiz and Riedel 2009, etc.]
• Jointly learn and classify predicate senses and argument roles
• MLNs cannot deal with particular types of global features
Currently, no existing (discriminative) approach sufficiently handles both types of dependencies.
We propose a structured model that can capture both types of dependencies simultaneously.
The proposed model
Example (figure): "Paul drove his car", with dependency labels SBJ, NMOD, OBJ
drive.01: drive a vehicle (A0: driver, A1: vehicle)
drive.02: cause to move (A0: driver, A1: things in motion)
• Expand the possible labels of predicate senses and argument roles: drove => drive.01, drive.02; Paul, car => A0, A1, …, NONE
• We use four types of factors, which score the labels of elements in predicate-argument structures
• Each factor is defined as a linear model
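The formula for the factors did not survive in this transcript. A plausible form, assuming each factor is an inner product between a shared weight vector and a factor-specific feature vector, is sketched below. The symbols w, Φ, x (the input sentence), p (a predicate sense), a (an argument role), and A (the full role assignment) are notation introduced here, not taken from the slides.

```latex
\begin{align*}
F_{P}(x, p) &= \mathbf{w} \cdot \Phi_{P}(x, p) \\
F_{A}(x, a) &= \mathbf{w} \cdot \Phi_{A}(x, a) \\
F_{PA}(x, p, a) &= \mathbf{w} \cdot \Phi_{PA}(x, p, a) \\
F_{G}(x, p, \mathbf{A}) &= \mathbf{w} \cdot \Phi_{G}(x, p, \mathbf{A})
\end{align*}
```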
The proposed model
• FP: a factor that scores the sense labels of the predicate
Scores shown in the figure for the candidate senses drive.01 and drive.02: 1.4754, 0.7268
The proposed model
• FA: a factor that scores the role labels of each argument
Scores shown in the figure for the candidate roles of Paul and car: 0.238, 0.876, 1.784, -1.665, -1.235, -1.482
The proposed model
• FPA: a factor that scores label pairs consisting of a predicate sense and the semantic role of an argument
Scores shown in the figure: 0.261, 0.764
The proposed model
• FG: a factor that captures the plausibility of the whole predicate-argument structure (uses global features)
Global feature shown in the figure: A0,drive01,A1 with score 1.865
The predicate ‘drive’ has all of its obligatory roles, A0 and A1 => the weight corresponding to this feature is high, so FG assigns a higher score to this structure
The proposed model
• The proposed model combines these four types of factors (FP, FA, FPA, FG)
• The highest-scoring assignment is returned by the proposed model; in the example: A0, A1, drive.01
Scores shown in the figure: FG (A0,drive01,A1): 1.865; FP: 1.4754, 0.7268; FPA: 0.425, 0.764; FA: 0.238, 0.876, 1.784, -1.665, -1.235, -1.482
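How the four factors might combine into a single score can be sketched with a brute-force search over all joint assignments. Everything here is illustrative: the score tables hold made-up values (not the numbers from the figure), the toy global factor simply rewards structures containing both A0 and A1, and exhaustive enumeration stands in for the actual inference procedure.

```python
from itertools import product

SENSES = ["drive.01", "drive.02"]
ROLES = ["A0", "A1", "NONE"]
ARGS = ["Paul", "car"]

# All scores below are made-up illustrative values, not those on the slides.
FP = {"drive.01": 1.5, "drive.02": 0.7}
FA = {
    ("Paul", "A0"): 1.8, ("Paul", "A1"): -1.7, ("Paul", "NONE"): 0.2,
    ("car", "A0"): -1.2, ("car", "A1"): 0.9, ("car", "NONE"): -1.5,
}
FPA = {("drive.01", "car", "A1"): 0.8, ("drive.02", "car", "A1"): 0.3}

def fg(sense, roles):
    """Toy global factor: reward structures that contain both A0 and A1."""
    return 1.9 if {"A0", "A1"} <= set(roles) else 0.0

def score(sense, roles):
    """Sum of the four factor scores for one joint assignment."""
    total = FP[sense] + fg(sense, roles)
    for arg, role in zip(ARGS, roles):
        total += FA[(arg, role)] + FPA.get((sense, arg, role), 0.0)
    return total

def best_assignment():
    """Enumerate every (sense, roles) pair and return the highest scoring one."""
    candidates = product(SENSES, product(ROLES, repeat=len(ARGS)))
    return max(candidates, key=lambda c: score(*c))

print(best_assignment())
```

Because the sense and the roles are scored together, the pairwise and global terms let evidence about one label influence the choice of the other.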
Dealing with global (non-local) features
• Adopt the fundamental idea of [Kazama and Torisawa 2007]
• Features are divided into local features and global features
• Inference: N-best based approach
(1) Generate N-best assignments using only local features
(2) Extract global features from the N-best assignments
(3) Select the argmax
• Learning: train parameters with two margin constraints
• All: train parameters so as to ensure a sufficient margin using all features (both local features and global features)
• Local only: when the constraint All is satisfied, train parameters so as to ensure a sufficient margin using only local features
• K&T proposed a Margin-Perceptron Learning Algorithm
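The three inference steps can be sketched as follows. This is an illustration under stated assumptions: the function names, the score tables, and the duplicate-role penalty used as the global score are all hypothetical, and the N-best list is built by brute-force enumeration rather than by an efficient search.

```python
import heapq
from itertools import product

def local_score(local_scores, assignment):
    """Sum of per-argument local scores for one role assignment."""
    return sum(s[r] for s, r in zip(local_scores, assignment))

def n_best_local(local_scores, n):
    """Step 1: the n best assignments ranked by local scores only.

    Enumerates every assignment, so it is exponential in the number of
    arguments; acceptable only for a small illustration.
    """
    roles_per_arg = [list(s) for s in local_scores]
    return heapq.nlargest(
        n, product(*roles_per_arg),
        key=lambda a: local_score(local_scores, a),
    )

def infer(local_scores, global_score, n=3):
    """Steps 2 and 3: add the global score to each candidate, take the argmax."""
    candidates = n_best_local(local_scores, n)
    return max(candidates,
               key=lambda a: local_score(local_scores, a) + global_score(a))

def duplicate_penalty(assignment):
    """Hypothetical global score: penalize a role used by two arguments."""
    used = [r for r in assignment if r != "NONE"]
    return -1.0 if len(used) != len(set(used)) else 0.0

# Made-up local scores for two arguments.
scores = [
    {"A0": 1.0, "A1": 0.2, "NONE": 0.0},
    {"A0": 0.6, "A1": 0.5, "NONE": 0.0},
]
print(n_best_local(scores, 3))
print(infer(scores, duplicate_penalty))
```

With these values the locally best assignment gives both arguments A0, and the global score moves the argmax to a different candidate. The correct answer can only be recovered if it appears in the N-best list, which is the main limitation of this style of inference.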
Inference and Learning Algorithm of the Proposed Model
• Inference: generate N-best assignments for each predicate sense
• Learning: the online Passive-Aggressive Algorithm [Crammer 2006]
• The parameters are trained by solving the optimization problem used in PA with the two margin constraints: (1) All (local + global) and (2) Local only
Figure: for each of the two constraints, the positive assignment is separated from the other assignments by a margin.
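A simplified sketch of a Passive-Aggressive update under the two margin constraints is given below. It uses the standard PA-I step size, min(C, loss / squared norm of the feature difference). Checking the All constraint first and the Local only constraint second is an assumption based on the wording of the slides, not the authors' exact optimization, and all function and feature names are hypothetical.

```python
def dot(w, feats):
    """Inner product between a sparse weight dict and a sparse feature dict."""
    return sum(w.get(k, 0.0) * v for k, v in feats.items())

def diff(a, b):
    """Sparse feature difference a - b."""
    return {k: a.get(k, 0.0) - b.get(k, 0.0) for k in set(a) | set(b)}

def pa_step(w, gold, other, margin=1.0, c=1.0):
    """One PA-I step enforcing score(gold) - score(other) >= margin.

    Returns True if the weights were changed.
    """
    delta = diff(gold, other)
    loss = margin - dot(w, delta)
    norm = sum(v * v for v in delta.values())
    if loss <= 0.0 or norm == 0.0:
        return False
    tau = min(c, loss / norm)
    for k, v in delta.items():
        w[k] = w.get(k, 0.0) + tau * v
    return True

def update(w, gold, rival_all, rival_local):
    """Two-constraint update. Each argument after w is a (local, global) pair
    of feature dicts; local and global feature names must not collide."""
    # (1) All: margin over the rival using local and global features together.
    if pa_step(w, {**gold[0], **gold[1]}, {**rival_all[0], **rival_all[1]}):
        return "all"
    # (2) Local only: checked only once constraint (1) is satisfied.
    if pa_step(w, gold[0], rival_local[0]):
        return "local"
    return "none"

# Made-up features for a gold structure and one competing structure.
gold = ({"car:A1": 1.0}, {"has_A0_A1": 1.0, "no_dup_roles": 1.0})
rival = ({"car:A0": 1.0}, {})
w = {}
print([update(w, gold, rival, rival) for _ in range(3)])
```

In this toy run the first update satisfies the All constraint, the second repairs the Local only constraint, and the third leaves the weights unchanged.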
Results on the CoNLL-2009 ST Dataset (average)
• The best performance is obtained by using all the factors
• Our model achieved results competitive with the top system in the CoNLL-2009 Shared Task without any feature selection procedure
• By adding the two types of factors FPA and FG, we obtained performance improvements in both tasks (predicate sense disambiguation and argument role labeling) => Succeeded in joint learning
Figure: the factors FP, FA, FPA, and FG over the variables sense and role1, role2, …, roleN
Summary
• We proposed a structured model that can capture two types of dependencies
• Non-local dependencies among arguments
• Inter-dependencies between a predicate and its arguments
• The proposed model achieved results competitive with state-of-the-art SRL systems without any feature selection procedure
• By adding two types of factors, we obtained performance improvements on both predicate sense disambiguation and argument role labeling => succeeded in joint learning
Future Work
• Exploiting unlabeled data (unsupervised or semi-supervised predicate-argument structure analysis)