Forest-based Semantic Role Labeling
Hao Xiong, Haitao Mi, Yang Liu and Qun Liu
Institute of Computing Technology, Chinese Academy of Sciences
AAAI 2010, Atlanta

Semantic Role Labeling
• Given a sentence and its verbs
• Identify the arguments of the verbs
• Assign semantic labels (the roles they play)
Example: This company last year sold 1000 cars in the U.S.
• Agent: This company
• ArgMod-TeMPoral: last year
• Patient: 1000 cars
• ArgMod-LOCation: in the U.S.
PropBank (Kingsbury and Palmer 2002)

One Conventional Approach
the role of Celimene is played by Kim Cattrall
Roles: Agent, Patient

One Conventional Approach
(S (NP (NP the role) (PP of Celimene)) (VP (AUX is) (VP (VBN played) (PP by Kim Cattrall))))
Labels on the tree, left to right: Patient, Agent

One Conventional Approach
more than 15%
[Figure: a 1-best parse of "the role of Celimene is played by Kim Cattrall" with one constituent marked "?"; labels Patient and Agent]

… Solution
• k-best parses:
  • limited scope: k
  • too much redundancy
[Figure: k-best list of parse trees, numbered 1, 2, 3, …, k; annotation: 25<50<26]

Our Solution
• Forest
  • A compact representation of many parses
  • By sharing common sub-derivations
  • Polynomial-space encoding of exponentially large set
[Figure: a packed forest and the individual parse trees obtained by unpacking it]
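The "polynomial-space encoding of exponentially large set" point can be illustrated with a small sketch. It assumes a forest is stored as a map from each node to its incoming hyperedges (tuples of child nodes); `count_trees` and `chain_forest` are hypothetical names, and the chain-shaped toy forest is invented for illustration and is not the forest from the slides.

```python
from functools import lru_cache

def count_trees(forest, root):
    """Count the distinct trees packed under `root`.

    `forest` maps a node to its incoming hyperedges; each hyperedge is a
    tuple of child nodes. Nodes missing from the map are leaves.
    """
    @lru_cache(maxsize=None)
    def count(node):
        edges = forest.get(node, [])
        if not edges:
            return 1  # a leaf is exactly one (trivial) tree
        total = 0
        for children in edges:
            ways = 1
            for child in children:
                ways *= count(child)  # shared sub-derivations are counted once
            total += ways
        return total

    return count(root)

def chain_forest(n):
    """Toy forest (illustrative only): n stacked nodes, each buildable in two ways."""
    forest = {}
    for i in range(n):
        nxt = f"X{i + 1}"
        forest[f"X{i}"] = [(f"A{i}", nxt), (f"B{i}", nxt)]
    return forest

forest = chain_forest(20)
nodes = set(forest) | {c for edges in forest.values() for e in edges for c in e}
print(len(nodes), count_trees(forest, "X0"))
```

In this toy case 61 nodes pack 2^20 trees, because every alternative shares the sub-forest below it instead of copying it.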

Outline
• Tree-based Semantic Role Labeling
  • Parsing
  • Selecting candidates
  • Extracting features
  • Classifying
• Forest-based Semantic Role Labeling
• Experiments
• Conclusion

Parsing
(S (NP (DT This) (NN company)) (NP (JJ last) (NN year)) (VP (VBD sold) (NP (CD 1000) (NNS cars)) (PP (IN in) (NP (DT the) (NNP U.S.)))))

Selecting Candidates
[Figure: the parse tree of "This company last year sold 1000 cars in the U.S." with candidate constituents marked]

Extracting Features
• Path to the predicate: NP↑S↓VP↓VBD

Extracting Features
• Position: left

Extracting Features
• Head word: company

Extracting Features
• Head POS tag: NN
• …
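The four features on these slides (path, position, head word, head POS tag) can be sketched on the example tree as follows. This is a minimal illustration, not the authors' extractor: the `Node` class and function names are hypothetical, the attachment of "last year" under S is read off the slide figure, and the head rule (rightmost word of the constituent) is a crude stand-in for real head-finding rules.

```python
class Node:
    """A parse-tree node; leaves carry a word, internal nodes carry children."""
    def __init__(self, label, children=(), word=None):
        self.label, self.word, self.parent = label, word, None
        self.children = list(children)
        for child in self.children:
            child.parent = self

def ancestors(node):
    """Return [node, parent, ..., root]."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain

def leaves(node):
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]

def path_feature(arg, pred):
    """Labels from the candidate up to the lowest common ancestor, then down."""
    up, down = ancestors(arg), ancestors(pred)
    index_in_down = {id(n): k for k, n in enumerate(down)}
    for i, node in enumerate(up):
        if id(node) in index_in_down:
            j = index_in_down[id(node)]
            break
    rising = "↑".join(n.label for n in up[:i + 1])
    falling = "".join("↓" + n.label for n in reversed(down[:j]))
    return rising + falling

def position_feature(arg, pred, root):
    order = [id(leaf) for leaf in leaves(root)]
    first_arg, first_pred = leaves(arg)[0], leaves(pred)[0]
    return "left" if order.index(id(first_arg)) < order.index(id(first_pred)) else "right"

def head_feature(arg):
    head = leaves(arg)[-1]  # ASSUMPTION: rightmost word is the head
    return head.word, head.label

subj = Node("NP", [Node("DT", word="This"), Node("NN", word="company")])
time = Node("NP", [Node("JJ", word="last"), Node("NN", word="year")])
sold = Node("VBD", word="sold")
obj = Node("NP", [Node("CD", word="1000"), Node("NNS", word="cars")])
loc = Node("PP", [Node("IN", word="in"),
                  Node("NP", [Node("DT", word="the"), Node("NNP", word="U.S.")])])
root = Node("S", [subj, time, Node("VP", [sold, obj, loc])])

print(path_feature(subj, sold), position_feature(subj, sold, root), head_feature(subj))
```

For the candidate "This company" this yields the path NP↑S↓VP↓VBD, position left, head word company and head POS tag NN.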

Classifying
Computing scores using a trained classifier, one set of scores per candidate constituent:
• S(Agent)=0.8 S(Patient)=0.1 S(None)=0.1 …
• S(AM-LOC)=0.9 S(Agent)=0.1 S(None)=0.1 …
• S(Agent)=0.1 S(Patient)=0.1 S(None)=0.5 …
• S(AM-TMP)=0.9 S(Patient)=0.1 S(None)=0.1 …
• S(Agent)=0.2 S(Patient)=0.8 S(None)=0.1 …

Classifying
• Best score for each constituent: S(Agent)=0.8, S(AM-LOC)=0.9, S(None)=0.5, S(AM-TMP)=0.9, S(Patient)=0.8
• Simply sort them
• Choose the best label sequence

Classifying
Result: [Agent This company] [AM-TMP last year] [V sold] [Patient 1000 cars] [AM-LOC in the U.S.]
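The three steps on the slide (best score per constituent, sort, choose a label sequence) could look roughly like the sketch below. Several details are assumptions rather than something the slides state: candidates labelled None are dropped, overlapping spans are resolved greedily in score order, and the mapping of the slide's score sets to word spans (including the None-scoring "the U.S.") is a guess for illustration. `select_labels` is a hypothetical name.

```python
def select_labels(candidates):
    """Pick a non-overlapping labelling from scored candidate spans.

    `candidates` is a list of (start, end, {label: score}) with `end` exclusive.
    """
    best = []
    for start, end, scores in candidates:
        label = max(scores, key=scores.get)   # best label for this constituent
        if label != "None":
            best.append((scores[label], start, end, label))
    best.sort(reverse=True)                   # highest-scoring candidates first

    chosen = []
    for _, start, end, label in best:
        # ASSUMPTION: keep a candidate only if it overlaps nothing chosen so far
        if all(end <= s or start >= e for s, e, _ in chosen):
            chosen.append((start, end, label))
    return sorted(chosen)

# Words: This(0) company(1) last(2) year(3) sold(4) 1000(5) cars(6) in(7) the(8) U.S.(9)
candidates = [
    (0, 2, {"Agent": 0.8, "Patient": 0.1, "None": 0.1}),
    (2, 4, {"AM-TMP": 0.9, "Patient": 0.1, "None": 0.1}),
    (5, 7, {"Agent": 0.2, "Patient": 0.8, "None": 0.1}),
    (7, 10, {"AM-LOC": 0.9, "Agent": 0.1, "None": 0.1}),
    (8, 10, {"Agent": 0.1, "Patient": 0.1, "None": 0.5}),
]
labels = select_labels(candidates)
print(labels)
```

On these inputs the selection reproduces the labelling of the final slide: Agent, AM-TMP, Patient and AM-LOC around the predicate "sold".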

Outline
• Tree-based Semantic Role Labeling
• Forest-based Semantic Role Labeling
  • Parsing into a forest
  • Selecting candidates
  • Extracting features on forest
  • Classifying
• Experiments
• Conclusion

Forest
• Hyper-graph
• Hyper-edge
• Node
[Figure: parse forest over "the role of Celimene is played by Kim Cattrall"]

Selecting Candidates
[Figure: parse forest over "the role of Celimene is played by Kim Cattrall"]

Extracting features
• Path to the predicate: NP↑NP↑S↓VP↓VP↓VBN

Extracting features
• Path to the predicate: NP↑NP↑S↓VP↓VP↓VBN
• Shortest path: NP↑S↓VP↓VP↓VBN

Extracting features
• Parent label (shortest path: NP↑S↓VP↓VP↓VBN)

Extracting features
• Parent label in the shortest path: S
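In a forest a node may have several parents, so there can be several paths to the predicate; the slides keep the shortest. One way to compute it is sketched below under stated assumptions: nodes are (label, start, end) tuples, the forest maps each node to its hyperedges, and the path must turn at a single hyperedge whose distinct children dominate the candidate and the predicate. The toy forest, including the flat S hyperedge, is hypothetical and only mimics the slide example.

```python
from collections import deque

def up_distances(forest, start):
    """BFS from `start` towards the root: {node: (distance, previous node)}."""
    parents = {}
    for head, edges in forest.items():
        for children in edges:
            for child in children:
                parents.setdefault(child, set()).add(head)
    seen = {start: (0, None)}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for head in sorted(parents.get(node, ())):
            if head not in seen:
                seen[head] = (seen[node][0] + 1, node)
                queue.append(head)
    return seen

def labels_up_to(seen, node):
    """Labels along the BFS route from the start node up to `node`."""
    out = []
    while node is not None:
        out.append(node[0])
        node = seen[node][1]
    return out[::-1]

def shortest_path(forest, arg, pred):
    up_a, up_p = up_distances(forest, arg), up_distances(forest, pred)
    best = None
    for head, edges in forest.items():
        for children in edges:
            for a in children:          # child dominating the candidate
                for p in children:      # different child dominating the predicate
                    if a != p and a in up_a and p in up_p:
                        cost = up_a[a][0] + up_p[p][0]
                        if best is None or cost < best[0]:
                            best = (cost, head, a, p)
    if best is None:
        return None
    _, head, a, p = best
    rising = labels_up_to(up_a, a) + [head[0]]
    falling = labels_up_to(up_p, p)[::-1]
    return "↑".join(rising) + "".join("↓" + label for label in falling)

# Hypothetical toy forest for "the role of Celimene is played by Kim Cattrall"
S, NP04, NP02 = ("S", 0, 9), ("NP", 0, 4), ("NP", 0, 2)
PP24, VP49, VP59 = ("PP", 2, 4), ("VP", 4, 9), ("VP", 5, 9)
AUX, VBN, PP69 = ("AUX", 4, 5), ("VBN", 5, 6), ("PP", 6, 9)
forest = {
    S: [(NP04, VP49), (NP02, PP24, VP49)],
    NP04: [(NP02, PP24)],
    VP49: [(AUX, VP59)],
    VP59: [(VBN, PP69)],
}
print(shortest_path(forest, NP02, VBN))
```

On this toy forest the longer route NP↑NP↑S↓VP↓VP↓VBN is also available, and the search returns the shorter NP↑S↓VP↓VP↓VBN; the parent label feature is then the second label on that path.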

New Features
• Parsing score (fractional value (Mi et al., 2008))
  • Inside-outside
  • Marginal probability
Features of the example candidate so far: NP↑S↓VP↓VP↓VBN, S, f(NP3)
[Figure: parse forest over "the role of Celimene is played by Kim Cattrall"]
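A node's marginal probability can be computed with the inside-outside algorithm on the hypergraph. The sketch below assumes each hyperedge carries a probability and takes the fractional value of a node to be inside × outside divided by the inside score of the root, which is one reading of the slide and not confirmed by it. The function names, the toy forest and its probabilities are all hypothetical.

```python
from math import prod

def inside_outside(forest, root):
    """`forest` maps node -> list of (probability, children) hyperedges."""
    order, seen = [], set()

    def visit(node):                      # post-order: children before parents
        if node in seen:
            return
        seen.add(node)
        for _, children in forest.get(node, []):
            for child in children:
                visit(child)
        order.append(node)

    visit(root)

    inside = {}
    for node in order:
        edges = forest.get(node, [])
        if not edges:
            inside[node] = 1.0            # leaves
        else:
            inside[node] = sum(p * prod(inside[c] for c in ch) for p, ch in edges)

    outside = {node: 0.0 for node in order}
    outside[root] = 1.0
    for node in reversed(order):          # parents before children
        for p, children in forest.get(node, []):
            for i, child in enumerate(children):
                siblings = prod(inside[c] for k, c in enumerate(children) if k != i)
                outside[child] += outside[node] * p * siblings
    return inside, outside

def fractional_value(node, inside, outside, root):
    # ASSUMPTION: normalised inside-outside product of the node
    return inside[node] * outside[node] / inside[root]

# Hypothetical toy forest with made-up hyperedge probabilities
forest = {
    "S": [(0.6, ("NP[0,4]", "VP[4,9]")), (0.4, ("NP[0,2]", "PP[2,4]", "VP[4,9]"))],
    "NP[0,4]": [(1.0, ("NP[0,2]", "PP[2,4]"))],
    "VP[4,9]": [(1.0, ("AUX", "VP[5,9]"))],
    "VP[5,9]": [(1.0, ("VBN", "PP[6,9]"))],
}
inside, outside = inside_outside(forest, "S")
for name in ("NP[0,4]", "NP[0,2]"):
    print(name, round(fractional_value(name, inside, outside, "S"), 3))
```

Here the larger NP appears only in the 0.6-probability analysis, while the smaller NP appears in both, so a classifier can see how strongly the parser supports each candidate.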

Classifying
Scores per candidate node:
• S(Patient)=0.8 S(Agent)=0.1 S(None)=0.2 …
• S(Agent)=0.8 S(Patient)=0.1 S(None)=0.2 …
• S(Patient)=0.5 S(Agent)=0.1 S(None)=0.3 …
[Figure: parse forest over "the role of Celimene is played by Kim Cattrall"]

Classifying
• Selected: S(Patient)=0.8 …, S(Agent)=0.8 …
[Figure: the parse forest with the selected nodes labeled Patient and Agent]

Outline
• Tree-based Semantic Role Labeling
• Forest-based Semantic Role Labeling
• Experiments
• Conclusion

Experiments
• Corpus: CoNLL-2005 shared task
  • Sections 02-21 of PropBank for training
  • Section 24 for development set
  • Section 23 for test set
• Total
  • 43,594 sentences
  • 262,281 arguments

Experiments
• Training sentences
  • Parse into 1-best and forest
  • Prune forest using inside-outside algorithm
  • Train classifiers
• Decoding sentences
  • Parse into 1-best and forest
  • Prune forest using inside-outside algorithm
  • Use classifiers
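The pruning step can be sketched as follows. The slides only say that the forest is pruned with the inside-outside algorithm; this sketch assumes the common Viterbi variant over costs (negative log-probabilities), where a hyperedge is dropped if the best derivation through it is worse than the overall best by more than a threshold. The function name, toy forest and costs are hypothetical.

```python
def prune_forest(forest, root, threshold):
    """`forest` maps node -> list of (cost, children); lower cost is better."""
    order, seen = [], set()

    def visit(node):                      # post-order: children before parents
        if node in seen:
            return
        seen.add(node)
        for _, children in forest.get(node, []):
            for child in children:
                visit(child)
        order.append(node)

    visit(root)

    # Viterbi inside: cost of the best sub-derivation below each node
    inside = {}
    for node in order:
        edges = forest.get(node, [])
        inside[node] = min((c + sum(inside[x] for x in ch) for c, ch in edges),
                           default=0.0)

    # Viterbi outside: cost of the best context around each node
    outside = {node: float("inf") for node in order}
    outside[root] = 0.0
    for node in reversed(order):
        for cost, children in forest.get(node, []):
            total = sum(inside[x] for x in children)
            for child in children:
                rest = outside[node] + cost + total - inside[child]
                outside[child] = min(outside[child], rest)

    pruned = {}
    for node in order:
        kept = [(cost, ch) for cost, ch in forest.get(node, [])
                if outside[node] + cost + sum(inside[x] for x in ch)
                - inside[root] <= threshold]
        if kept:
            pruned[node] = kept
    return pruned

# Hypothetical toy forest: the second S analysis is 3.5 worse than the best one
forest = {
    "S": [(0.5, ("NP[0,4]", "VP[4,9]")), (4.0, ("NP[0,2]", "PP[2,4]", "VP[4,9]"))],
    "NP[0,4]": [(0.0, ("NP[0,2]", "PP[2,4]"))],
    "VP[4,9]": [(0.0, ("AUX", "VP[5,9]"))],
}
print(len(prune_forest(forest, "S", 3)["S"]), len(prune_forest(forest, "S", 5)["S"]))
```

A tighter threshold keeps only analyses close to the 1-best parse, while a looser one keeps more alternatives at the price of a larger forest.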

Features
• Predicate lemma
• Path to predicate
• Path length
• Partial path
• Position
• Voice
• Head word/POS tag
• …

Results on Dev Set
[Figure: precision, recall and F for 1-best, forest (p3), 50-best and forest (p5); annotated with 9.63×10^5 and 5.78×10^6]

Results on Test Set

Outline
• Tree-based Semantic Role Labeling
• Forest-based Semantic Role Labeling
• Experiments
• Conclusion

Conclusion
• Forest
  • Compactly encodes exponentially many parses
  • Enlarges the candidate space
  • Allows richer features to be explored
  • Improves the quality significantly
  • A very large forest is not necessary
  • A k-best list can NOT simulate a forest
• Future work
  • Features on forest

Thank you!
