Improved Semantic Role Parsing
Kadri Hacioglu, Sameer Pradhan, Valerie Krugler, Steven Bethard, Ashley Thornton, Wayne Ward, Dan Jurafsky, James Martin
Center for Spoken Language Research, University of Colorado, Boulder, CO
AQUAINT Workshop – June 2003
What is Semantic Role Tagging?
• Assigning semantic labels to sentence elements.
• Elements are arguments of some predicate or participants in some event.
• Who did What to Whom, How, When, Where, Why
[TEMPORAL In 1901] [THEME President William McKinley] [TARGET was shot] [AGENT by anarchist Leon Czolgosz] [LOCATION at the Pan-American Exposition]
Parsing Algorithm
From Gildea and Jurafsky (2002):
• Generate syntactic parse of sentence (Charniak)
• Specify predicate (verb)
• For each constituent node in the parse tree:
  • Extract features relative to the predicate: Path, Voice, Headword, Position, Phrase Type, Sub-Cat
  • Estimate P(Role | features) for each role and normalize
• Assign the role with the highest probability
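The normalize-and-argmax step above can be sketched as follows. Here `role_probs` is a hypothetical stand-in for the learned estimator P(Role | features), not the authors' actual model; a real system would estimate these scores from annotated training data.

```python
# Minimal sketch of the role-assignment step: score every role for one
# constituent's features, normalize the scores into a distribution, and
# return the highest-probability role.

def assign_role(features, role_probs):
    """Normalize P(role | features) over all roles and return the argmax."""
    scores = {role: score(features) for role, score in role_probs.items()}
    total = sum(scores.values())
    normalized = {role: s / total for role, s in scores.items()}
    best = max(normalized, key=normalized.get)
    return best, normalized
```

In the full system this runs once per constituent node, with Null competing against the true roles.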
SVM Classifier
• Same basic procedure as Gildea & Jurafsky (2000)
• Same features, except the predicate is also included as a feature
• Classification step changed to use an SVM
  • TinySVM software [Kudo & Matsumoto 2000]
• Prune constituents with P(Null) > 0.98
  • For efficiency in training; prunes ~80% of constituents
• For each role, train a one-vs-all classifier (including the Null role)
SVM Classification
• Generate syntactic parse (Charniak parser)
• For each target (verb):
  • Prune constituents with P(Null) > 0.98
  • Run each one-vs-all classifier on the remaining constituents
  • Convert SVM outputs to probabilities by fitting a sigmoid (Platt 2000)
  • Generate N-best labels for each constituent
  • Pick the highest-probability sequence of non-overlapping roles
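The sigmoid fit mentioned above (Platt 2000) can be sketched as below. The coefficients A and B are illustrative placeholders; in practice they are fitted on held-out data by maximizing the likelihood of the labels.

```python
import math

# Platt scaling: map a raw SVM margin f(x) to an approximate probability
# via P(role | x) = 1 / (1 + exp(A * f(x) + B)).
# A and B here are illustrative values, not fitted parameters.

def platt_probability(svm_margin, A=-1.7, B=0.0):
    """Convert an SVM decision value to a probability estimate."""
    return 1.0 / (1.0 + math.exp(A * svm_margin + B))
```

With A negative, larger (more confidently positive) margins map monotonically to probabilities nearer 1, which is what the N-best sequence search needs.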
Features
• Target word (verb)
• Cluster for target word (64 clusters)
• Path from constituent to target
• Phrase Type
• Position (before/after target)
• Voice
• Head Word
• Sub-categorization
Example (subject NP of the target verb): Path: NP<-S->VP->VB; Head Word: He; Sub-cat: VP->VB->NP
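The Path feature above can be sketched as follows, using a simplified child-to-parent dictionary encoding of the parse tree in which each node is a unique category label. This is a hypothetical simplification for illustration; real constituents carry more structure.

```python
# Sketch of the Path feature: the chain of categories from a constituent up
# to the lowest common ancestor and back down to the target verb, written
# with "<-" for upward steps and "->" for downward steps.

def path_feature(node, target, parent):
    """Return the category path from `node` to `target` in tree `parent`."""
    def ancestors(n):
        chain = [n]
        while n in parent:
            n = parent[n]
            chain.append(n)
        return chain

    up, down = ancestors(node), ancestors(target)
    common = next(a for a in up if a in down)          # lowest common ancestor
    up_part = up[: up.index(common) + 1]               # node ... LCA
    down_part = list(reversed(down[: down.index(common)]))  # LCA+1 ... target
    suffix = "->" + "->".join(down_part) if down_part else ""
    return "<-".join(up_part) + suffix
```

For the tree S -> (NP, VP -> VB) this yields NP<-S->VP->VB for the subject NP, matching the Path example on this slide.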
Role Labels
PropBank Arguments: Arg0, Arg1, Arg2, Arg3, Arg4, Arg5, ArgA, ArgM; ArgM subtypes: ArgM-ADV, ArgM-CAU, ArgM-DIR, ArgM-DIS, ArgM-EXT, ArgM-LOC, ArgM-MNR, ArgM-MOD, ArgM-NEG, ArgM-PRD, ArgM-PRP, ArgM-REC, ArgM-TMP
Thematic Roles: Agent, Actor, Beneficiary, Cause, Degree, Experiencer, Goal, Instrument, Location, Manner, Means, Proposition, Result, State, Stimulus, Source, Temporal, Theme, Topic, Type, Other
Data
• PropBank data: WSJ section of the Penn TreeBank, annotated with predicate–argument structure
• Train on the PropBank training set (sections 00 and 23 withheld): 72,000 annotated roles
• Test on PropBank section 23: 3,800 annotated roles
SVM Performance
Annotating PropBank arguments, using gold-standard parses from the TreeBank. [Results table omitted]
Using Real Parses
Annotating PropBank arguments and annotating thematic roles. [Results table omitted]
ID and Label
Identifying and annotating thematic roles using the Charniak parse; Top-N classification. [Results table omitted]
Hard vs. Soft Pruning
• Soft Pruning
  • Train Null-vs-Role classifier on all data
  • Prune constituents with P(Null) > 0.98
  • Train one-vs-all classifiers (including Null) on the remaining constituents
• Hard Pruning
  • Train Null-vs-Role classifier on all data
  • Make a Null-vs-Role classification for each constituent
  • Train one-vs-all classifiers (no Null) on the Role constituents
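The difference between the two regimes can be sketched as follows. `p_null` is a hypothetical Null-vs-Role scorer, and the 0.5 hard-decision threshold is an assumption standing in for whatever binary decision rule the filter uses.

```python
# Sketch of the two pruning regimes described above.
# Soft: drop only near-certain Nulls (P(Null) > 0.98) and keep Null as one
#       of the one-vs-all classes, so the role classifiers can still veto.
# Hard: commit to the binary Null-vs-Role decision up front; the one-vs-all
#       classifiers never see a Null class.

def training_pool(constituents, p_null, mode):
    """Select the constituents (and class inventory) for ova training."""
    if mode == "soft":
        kept = [c for c in constituents if p_null(c) <= 0.98]
        include_null_class = True
    else:  # hard
        kept = [c for c in constituents if p_null(c) <= 0.5]
        include_null_class = False
    return kept, include_null_class
```

Soft pruning trades a larger training set for a second chance at filter mistakes; hard pruning is cheaper but makes the Null decision irrevocable.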
Segment & Classify with SVM
• Initial system used the Charniak parser to segment; the SVM classified the segmented constituents
• Now: use the SVM to both segment and classify chunks
• Features:
  • Window of 5 words (target word ± 2)
  • POS tags for the words
  • Syntactic phrase position tags (B, I, O)
  • Path from word to target
  • Class assignments for previous words
• Assign a semantic phrase position tag to each word
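The word-level feature set above can be sketched as follows. The feature names and the "_" padding token for out-of-sentence positions are illustrative choices, not the authors' exact encoding.

```python
# Sketch of per-word chunking features: a 5-word window centered on the
# current word, plus its POS tag, syntactic B/I/O tag, path to the target,
# and the classes already assigned to preceding words.

def word_features(i, words, pos_tags, bio_tags, paths, target, prev_classes):
    """Build the feature dict for word i of one sentence."""
    def window(seq):
        # positions outside the sentence are padded with "_"
        return [seq[j] if 0 <= j < len(seq) else "_"
                for j in range(i - 2, i + 3)]
    return {
        "window": window(words),
        "pos": pos_tags[i],
        "spp": bio_tags[i],                 # syntactic phrase position tag
        "path": paths[i],
        "target": target,
        "prev_classes": prev_classes[-2:],  # classes of the previous words
    }
```

Each word then receives a semantic phrase position tag (e.g. B-agent, I-topic, O), so role segmentation reduces to sequential word classification.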
SVM Chunking Parser I
Pipeline: the input sentence goes to a syntactic parser; a target-word detector supplies the target word, and a path finder computes the path from each word to the target. The chunker receives the words, POS tags, word positions, and per-word paths, plus voice from an active/passive detector.
Example I
But analysts say IBM is a special case

Word      POS  SPP   Path                           Pr   B/A  V  Class
But       CC   O     CC<-S->VP->VBP                 say  B    A  O
analysts  NNS  B-NP  NNS<-NP<-S->VP->VBP            say  B    A  B-agent
IBM       NNP  B-NP  VBP<-VP->SBAR->S->NP->NNP      say  A    A  B-topic
is        AUX  O     VBP<-VP->SBAR->S->VP->AUX      say  A    A  I-topic
a         DT   B-NP  VBP<-VP->SBAR->S->VP->NP->DT   say  A    A  I-topic
special   JJ   I-NP  VBP<-VP->SBAR->S->VP->NP->JJ   say  A    A  I-topic
case      NN   I-NP  VBP<-VP->SBAR->S->VP->NP->NN   say  A    A  I-topic

But [AGENT analysts] [TARGET say] [TOPIC IBM is a special case]
SVM Chunking Parser II
Pipeline: the input sentence goes to a POS tagger and the Yamcha chunker; a target-word detector supplies the target word, a path finder computes the path for each word, and an active/passive detector supplies voice. The chunker receives the words, POS tags, and word positions.
Example II
But analysts say IBM is a special case
POS-tagged & chunked (only NP and VP):
But_CC [NP analysts_NNS ] (VP say_VBP ) [NP IBM_NNP ] (VP is_VBZ ) [NP a_DT special_JJ case_NN ]

Word      POS  SPP   Path                      Pr   B/A  V  Class
But       CC   O     CC->NP->VP->VBP           say  B    A  O
analysts  NNS  B-NP  NNS->NP->VP->VBP          say  B    A  B-agent
IBM       NNP  B-NP  NNP->NP->VP->VBP          say  A    A  B-topic
is        VBZ  B-VP  VBZ->VP->NP->VP->VBP      say  A    A  I-topic
a         DT   B-NP  DT->NP->VP->NP->VP->VBP   say  A    A  I-topic
special   JJ   I-NP  JJ->NP->VP->NP->VP->VBP   say  A    A  I-topic
case      NN   I-NP  NN->NP->VP->NP->VP->VBP   say  A    A  I-topic
Performance
Segment & annotate thematic roles; trained on only the first 3,000 sentences of PropBank data.
• Chunker-I: syntax features derived from the Charniak parse
• Chunker-II: syntax features from the syntactic SVM chunker
[Results table omitted]
Summary and Future Work
• The project has shown continued improvement in semantic parsing
• Goals:
  • Improve accuracy through new features
  • Improve robustness across data sets by improving word-sense robustness
  • Continue experiments without a full syntactic parse
  • Apply to Question Answering