SEMANTIC ROLE LABELING BY TAGGING SYNTACTIC CHUNKS
Kadri Hacioglu1, Sameer Pradhan1, Wayne Ward1, James H. Martin1, Daniel Jurafsky2
1 The Center for Spoken Language Research, University of Colorado at Boulder
2 Stanford NLP Group, Stanford University
OUTLINE • Semantic Role Labeling (SRL) • Nature of Shared Task Data • Our Strategy • System Description & Features • Experiments • Concluding Remarks
SEMANTIC ROLE LABELING • Based on predicate-argument structure • First explored by (Gildea & Jurafsky, 2000)

[A0 We] are prepared to [PRED pursue] [AM-MNR aggressively] [A1 completion of this transaction], he says

Predicate: pursue

PropBank label   Thematic role   Argument
A0               Agent           we
A1               Theme           completion of this transaction
AM-MNR           Manner          aggressively
EXAMPLE OF SHARED TASK DATA

Words     POS tags  BP tags (IOB2)  Clause tags  Predicate info  Semantic labels
Sales     NNS       B-NP            (S*          -               (A1*A1)
declined  VBD       B-VP            *            decline         (V*V)
10        CD        B-NP            *            -               (A2*
%         NN        I-NP            *            -               *A2)
to        TO        B-PP            *            -               *
$         $         B-NP            *            -               (A4*
251.2     CD        I-NP            *            -               *
million   CD        I-NP            *            -               *A4)
from      IN        B-PP            *            -               *
$         $         B-NP            *            -               (A3*
287.7     CD        I-NP            *            -               *
million   CD        I-NP            *            -               *A3)
.         .         O               *S)          -               *
OUTLINE OF OUR STRATEGY • Change the shared task representation • make sure that it is reversible • Engineer additional features • use intuition, experience, and data analysis • Optimize system settings • context size • SVM parameters: polynomial degree d, C
CHANGE IN REPRESENTATION • Restructure the available information • - words are collapsed into their respective BPs • - only headwords are retained (rightmost words) • - exceptions: VPs containing the predicate; Outside (O) chunks • Modify the semantic role labels • - IOB2 scheme instead of the bracketing scheme
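The conversion from the bracketed role labels to IOB2 tags can be sketched roughly as follows (a minimal illustration, not the shared-task conversion code; nested or discontinuous arguments are ignored):

```python
def brackets_to_iob2(labels):
    """Convert bracketed role labels, e.g. ["(A1*", "*", "*A1)"],
    into IOB2 tags, e.g. ["B-A1", "I-A1", "I-A1"]."""
    tags, current = [], None
    for lab in labels:
        if lab.startswith("("):                # opens a role, e.g. "(A1*"
            current = lab.lstrip("(").split("*")[0]
            tags.append("B-" + current)
        elif current is not None:              # inside an open role span
            tags.append("I-" + current)
        else:                                  # outside any role
            tags.append("O")
        if lab.endswith(")"):                  # the role closes on this token
            current = None
    return tags
```

Because each bracketed span maps to exactly one B-/I- run, the mapping can be inverted, which is the reversibility requirement stated above.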
NEW REPRESENTATION

BPs  Headwords  POS tags  BP tags (IOB2)  Clause tags  Predicate info  Semantic labels (IOB2)
NP   Sales      NNS       B-NP            (S*          -               B-A1
VP   declined   VBD       B-VP            *            decline         B-V
NP   %          NN        I-NP            *            -               B-A2
PP   to         TO        B-PP            *            -               O
NP   million    CD        I-NP            *            -               B-A4
PP   from       IN        B-PP            *            -               O
NP   million    CD        I-NP            *            -               B-A3
O    .          .         O               *S)          -               O
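The collapse into base phrases can be sketched as below (a simplified illustration: the headword is taken as the rightmost word of each chunk, O tokens are kept one per token, and the slide's exception for VPs containing the predicate is omitted):

```python
def collapse_chunks(tokens, bp_tags):
    """Collapse each base phrase to its rightmost word (taken as the
    headword); tokens outside any chunk (tag "O") are kept as-is."""
    phrases, current = [], None
    for tok, tag in zip(tokens, bp_tags):
        if tag.startswith("B-"):               # a new chunk starts
            if current:
                phrases.append(current)
            current = [tag[2:], tok]           # [chunk type, headword so far]
        elif tag.startswith("I-") and current:
            current[1] = tok                   # rightmost word wins
        else:                                  # an "O" token
            if current:
                phrases.append(current)
                current = None
            phrases.append(["O", tok])
    if current:
        phrases.append(current)
    return phrases
```

Run on the example sentence, this yields the NP/VP/PP rows shown in the table above (e.g. "10 %" collapses to the single headword "%").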
DIFFERENCES BETWEEN REPRESENTATIONS
SYSTEM DESCRIPTION • Phrase-by-phrase • Left-to-right • Binary feature encoding • Discriminative • Deterministic • SVM-based (YamCha toolkit, developed by Taku Kudo) • Simple post-processing (for consistent bracketing)
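The slide only says "simple post-processing (for consistent bracketing)"; one plausible repair, sketched here as an assumption rather than the authors' actual rule, is to rewrite any I- tag that does not continue a matching chunk:

```python
def fix_iob2(tags):
    """Repair an inconsistent IOB2 sequence: an I-X that does not
    follow B-X or I-X is rewritten as B-X."""
    fixed, prev = [], "O"
    for tag in tags:
        if tag.startswith("I-") and prev not in ("B-" + tag[2:], "I-" + tag[2:]):
            tag = "B-" + tag[2:]               # open the chunk properly
        fixed.append(tag)
        prev = tag
    return fixed
```

After this repair every I- run is preceded by a B-, so the output always converts back to well-formed brackets.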
BASE FEATURES • Words • Predicate lemmas • Part of speech tags • Base phrase IOB2 tags • Clause bracketing tags • Named Entities
ADDITIONAL FEATURES • Token level • - Token position • - Path • - Clause bracket patterns • - Clause position • - Headword suffixes • - Distance • - Length • Sentence level • - Predicate POS tag • - Predicate frequency • - Predicate context (POS, BP) • - Predicate argument frames • - Number of predicates
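An illustrative subset of the token-level features can be sketched as a feature dictionary per chunk (the names and exact encodings below are assumptions for illustration, not the paper's feature templates):

```python
def chunk_features(i, chunks, pred_idx, pred_lemma):
    """Illustrative token-level features for chunk i;
    `chunks` is a list of (bp_type, headword, pos) triples."""
    bp, head, pos = chunks[i]
    return {
        "bp": bp,                              # base phrase type
        "head": head,                          # headword
        "pos": pos,                            # POS tag of the headword
        "pred": pred_lemma,                    # predicate lemma
        "position": "before" if i < pred_idx
                    else ("after" if i > pred_idx else "at"),
        "distance": abs(i - pred_idx),         # distance to the predicate
        "suffix2": head[-2:],                  # headword suffixes, length 2-4
        "suffix3": head[-3:],
        "suffix4": head[-4:],
        "length": len(head),
    }
```

Each such dictionary would then be binarized (one indicator per feature=value pair) before being fed to the SVMs, matching the binary feature encoding noted above.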
EXPERIMENTAL SET-UP • Corpus: Flattened PropBank (2004 release) • Training set: Sections 15-18 • Dev set: Section 20 • Test set: Section 21 • SVMs: 78 one-vs-all (OVA) classes, polynomial kernel, d=2, C=0.01 • Context: sliding window of ±2 tokens
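The ±2 sliding context window can be sketched as follows (a minimal illustration; the real system pads with dedicated boundary symbols rather than None):

```python
def context_window(features, i, k=2):
    """Return the features of chunks i-k .. i+k, padded with
    None where the window runs off either end of the sentence."""
    pad = [None] * k
    padded = pad + list(features) + pad
    return padded[i : i + 2 * k + 1]
```

The classifier for chunk i then sees the concatenated features of five chunks at once, which is what gives the phrase-by-phrase tagger its wide context.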
RESULTS

Base features, W-by-W & P-by-P approaches, dev set:

Method   Precision  Recall   F1
W-by-W   68.34%     45.16%   54.39
P-by-P   69.04%     54.68%   61.02

All features, P-by-P approach:

Data      Precision  Recall   F1
Dev set   74.17%     69.42%   71.72
Test set  72.43%     66.77%   69.49
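The F1 column is the harmonic mean of precision and recall, so the table can be checked directly (the tabulated scores are rounded to two decimals):

```python
def f1(p, r):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

dev_f1 = f1(74.17, 69.42)    # ≈ 71.72 (dev set, all features)
test_f1 = f1(72.43, 66.77)   # ≈ 69.49 (test set, all features)
```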
CONCLUSIONS • We have performed SRL by tagging base phrase chunks • - the original representation was changed • - additional features were engineered • - SVMs were used • Improved performance with the new representation and additional features • Compared to the W-by-W approach, our method • - classifies larger units • - uses wider context • - runs faster • - performs better
THANK YOU!
CLAUSE FEATURES

Word         POS   BP    Clause (CL) markers  Predicate  Position  CL pattern to predicate  CL pattern to sentence begin/end
One          CD    B-NP  (S*                  -          OUT       (S*(S**S)                -
troubling    VBG   I-NP  *                    -          OUT       (S**S)                   (S*
aspect       NN    I-NP  *                    -          OUT       (S**S)                   (S*
of           IN    B-PP  *                    -          OUT       (S**S)                   (S*
DEC          NNP   B-NP  *                    -          OUT       (S**S)                   (S*
's           POS   B-NP  *                    -          OUT       (S**S)                   (S*
results      NNS   I-NP  *                    -          OUT       (S**S)                   (S*
,            ,     O     *                    -          OUT       (S**S)                   (S*
analysts     NNS   B-NP  (S*                  -          IN        (S**S)                   (S*
said         VBD   B-VP  *S)                  say        IN        -                        -
,            ,     O     *                    -          OUT       *S)                      *S)
was          VBD   B-VP  *                    -          OUT       *S)                      *S)
its          PRP$  B-NP  *                    -          OUT       *S)                      *S)
performance  NN    I-NP  *                    -          OUT       *S)                      *S)
in           IN    B-PP  *                    -          OUT       *S)                      *S)
Europe       NNP   B-NP  *                    -          OUT       *S)                      *S)
.            .     O     *S)                  -          OUT       *S)*S)                   -
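One plausible reading of the "CL pattern to predicate" feature, sketched here as an assumption rather than the paper's exact definition, is the concatenation of the clause bracket tags between a token and the predicate:

```python
def clause_pattern(clause_tags, i, j):
    """Concatenate the clause bracket tags between positions i and j
    (e.g. token -> predicate), dropping the no-bracket "*" fillers."""
    lo, hi = sorted((i, j))
    return "".join(t for t in clause_tags[lo : hi + 1] if t != "*")
```

On the example sentence ("One troubling aspect ... analysts said ..."), the pattern from "One" to the predicate "said" comes out as (S*(S**S), i.e. the token sits outside the clause that contains the predicate.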
SUFFIXES • The confusion: B-AM-MNR vs. B-AM-TMP • single-word cases: fetchingly, tacitly, provocatively • suffixes of length 2-4 of the headwords were tried as features