Progress update Lin Ziheng
Components – Connective classifier
• Features from Pitler and Nenkova (2009):
  • Connective: because
  • Self category: IN
  • Parent category: SBAR
  • Left sibling category: none
  • Right sibling category: S
  • Right sibling contains a VP: yes
Components – Connective classifier
• New features:
  • Conn POS
  • Prev word + conn: even though, particularly since
  • Prev word POS
  • Prev word POS + conn POS
  • Conn + next word
  • Next word POS
  • Conn POS + next word POS
  • All lemmatized verbs in the sentence containing the conn
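A minimal sketch of how the syntactic and lexical context features above could be collected from an NLTK-style constituency parse; the single-token connective assumption, the ParentedTree input, and all helper names are illustrative, not the actual implementation:

```python
# Sketch: connective features from an NLTK ParentedTree plus the token/POS lists
# of the sentence. Assumes the connective covers a single leaf at index ci.
from nltk.tree import ParentedTree

def connective_features(ptree, tokens, pos_tags, ci):
    leaf_pos = ptree.leaf_treeposition(ci)
    self_node = ptree[leaf_pos[:-1]]            # preterminal covering the connective
    parent = self_node.parent()
    left, right = self_node.left_sibling(), self_node.right_sibling()

    def has_vp(node):
        return node is not None and any(st.label() == 'VP' for st in node.subtrees())

    def tok(j): return tokens[j].lower() if 0 <= j < len(tokens) else 'NONE'
    def pos(j): return pos_tags[j] if 0 <= j < len(pos_tags) else 'NONE'

    conn = tok(ci)
    return {
        'conn': conn,
        'self_cat': self_node.label(),
        'parent_cat': parent.label() if parent else 'NONE',
        'left_sib_cat': left.label() if left else 'NONE',
        'right_sib_cat': right.label() if right else 'NONE',
        'right_sib_has_vp': has_vp(right),
        'conn_pos': pos(ci),
        'prev+conn': tok(ci - 1) + '_' + conn,
        'prev_pos': pos(ci - 1),
        'prev_pos+conn_pos': pos(ci - 1) + '_' + pos(ci),
        'conn+next': conn + '_' + tok(ci + 1),
        'next_pos': pos(ci + 1),
        'conn_pos+next_pos': pos(ci) + '_' + pos(ci + 1),
    }
```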
Argument labeler – Argument position classifier
• Relative positions of Arg1:
  • Arg1 and Arg2 in the same sentence: SS (60.9%)
  • Arg1 in the immediately previous sentence: IPS (30.1%)
  • Arg1 in some non-adjacent previous sentence: NAPS (9.0%)
  • Arg1 in some following sentence: FS (0%, only 8 instances)
• FS is ignored
Argument labeler – Argument position classifier
• Features:
  • Connective string
  • Conn POS
  • Conn position in the sentence: first, second, third, third last, second last, or last
  • Prev word
  • Prev word POS
  • Prev word + conn
  • Prev word POS + conn POS
  • Second prev word
  • Second prev word POS
  • Second prev word + conn
  • Second prev word POS + conn POS
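A possible shape for this feature vector, sketched in Python; the 'middle' fallback bucket for positions not named on the slide is an assumption:

```python
# Sketch of the argument-position feature set; ci is the connective's token
# index within its sentence, tokens/pos_tags are parallel lists.
def position_bucket(i, n):
    buckets = {0: 'first', 1: 'second', 2: 'third',
               n - 1: 'last', n - 2: 'second_last', n - 3: 'third_last'}
    return buckets.get(i, 'middle')   # fallback bucket: an assumption

def argpos_features(tokens, pos_tags, ci):
    def tok(j): return tokens[j].lower() if 0 <= j < len(tokens) else 'NONE'
    def pos(j): return pos_tags[j] if 0 <= j < len(pos_tags) else 'NONE'
    conn, cpos = tok(ci), pos(ci)
    return {
        'conn': conn,
        'conn_pos': cpos,
        'position': position_bucket(ci, len(tokens)),
        'prev': tok(ci - 1), 'prev_pos': pos(ci - 1),
        'prev+conn': tok(ci - 1) + '_' + conn,
        'prev_pos+conn_pos': pos(ci - 1) + '_' + cpos,
        'prev2': tok(ci - 2), 'prev2_pos': pos(ci - 2),
        'prev2+conn': tok(ci - 2) + '_' + conn,
        'prev2_pos+conn_pos': pos(ci - 2) + '_' + cpos,
    }
```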
Argument labeler – Argument extractor
• SS cases: handcrafted a set of syntactically motivated rules to extract Arg1 and Arg2
Argument labeler – Argument extractor
• An example (parse tree figure omitted)
Argument labeler – Argument extractor
• IPS cases: label the sentence containing the connective as Arg2 and the immediately previous sentence as Arg1
• NAPS cases:
  • Arg1 is located in the second previous sentence in 45.8% of the NAPS cases
  • Use the majority decision and assume Arg1 is always in the second previous sentence
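A small sketch of the sentence-level assignment for the IPS and NAPS cases just described; `sentences`, `i`, and the label strings are illustrative names:

```python
# Sketch: choose Arg1/Arg2 sentences from the position classifier's output.
# sentences is a list of token lists; i indexes the connective's sentence.
def extract_args_by_position(sentences, i, position_label):
    arg2 = sentences[i]                    # Arg2 is the sentence holding the connective
    if position_label == 'IPS':            # immediately previous sentence
        arg1 = sentences[i - 1]
    elif position_label == 'NAPS':         # majority heuristic: second previous sentence
        arg1 = sentences[max(i - 2, 0)]
    else:                                  # SS cases go through the rule-based extractor
        raise ValueError('SS cases are handled by the syntactic rules')
    return arg1, arg2
```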
Components – Explicit classifier
• Prasad et al. (2008) reported human agreement of 94% on Level 1 classes and 84% on Level 2 types
• A baseline using only connectives as features gives 95.7% and 86% on Sec. 23
• Difficult to improve accuracy on the test section
• 3 types of features:
  • Connective string
  • Conn POS
  • Conn + prev word
Components – Non-explicit classifier
• Non-explicit: Implicit, AltLex, EntRel, NoRel
• 11 Level 2 types for Implicit/AltLex, plus EntRel and NoRel, giving 13 types
• 4 feature sets from Lin et al. (2009):
  • Contextual features
  • Constituent parse features
  • Dependency parse features
  • Word-pair features
• 3 features to capture AltLex: Arg2_word1, Arg2_word2, Arg2_word3
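Two of the feature groups named above lend themselves to a short sketch: the Arg1 x Arg2 word-pair features of Lin et al. (2009) and the three AltLex word features taken from the start of Arg2. The tokenisation and lowercasing choices here are assumptions:

```python
# Sketch: word-pair features (all cross-product pairs of Arg1 and Arg2 tokens)
# and the three AltLex features built from the first words of Arg2.
from itertools import product

def word_pair_features(arg1_tokens, arg2_tokens):
    return {'wp=%s_%s' % (w1.lower(), w2.lower()): 1
            for w1, w2 in product(arg1_tokens, arg2_tokens)}

def altlex_features(arg2_tokens):
    padded = [t.lower() for t in arg2_tokens[:3]] + ['NONE'] * 3
    return {'arg2_word1': padded[0],
            'arg2_word2': padded[1],
            'arg2_word3': padded[2]}
```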
Components – Attribution span labeler
• Two steps: split the text into clauses, then decide which clauses are attribution spans
• Rule-based clause splitter:
  • first split a sentence into clauses by punctuation
  • for each clause, further split it if one of the following productions is found: VP → SBAR, S → SINV, S → S, SINV → S, S → SBAR, VP → S
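A simplified reading of the second splitting step, assuming each clause comes with an nltk.Tree constituency parse; the exact split position used in the real system may differ:

```python
# Sketch: find token offsets inside a clause where one of the listed
# parent -> child productions occurs, and split the clause there.
from nltk.tree import Tree

SPLIT_PRODUCTIONS = {('VP', 'SBAR'), ('S', 'SINV'), ('S', 'S'),
                     ('SINV', 'S'), ('S', 'SBAR'), ('VP', 'S')}

def split_points(clause_tree):
    """Token offsets (within the clause) at which to cut it further."""
    points = set()

    def walk(node, start):
        child_start = start
        for child in node:
            if isinstance(child, Tree):
                if ((node.label(), child.label()) in SPLIT_PRODUCTIONS
                        and child_start > 0):
                    points.add(child_start)     # split at the child's left edge
                walk(child, child_start)
                child_start += len(child.leaves())
            else:
                child_start += 1

    walk(clause_tree, 0)
    return sorted(points)
```

The returned offsets would then be used to cut the clause's token sequence into smaller candidate clauses.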
Components – Attribution span labeler
• Attr span classifier features (from the curr, prev and next clauses):
  • Unigrams of curr
  • Lowercased and lemmatized verbs in curr
  • The first and last terms of curr
  • The last term of prev
  • The first term of next
  • The last term of prev + the first term of curr
  • The last term of curr + the first term of next
  • The position of curr in the sentence
  • Punctuation features extracted from curr
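A rough sketch of assembling these clause-context features; the WordNet lemmatizer, the 'NONE' padding, the verb-POS approximation, and the quote-based punctuation cue are assumptions standing in for details not given on the slide:

```python
# Sketch: features for one candidate clause, given as token lists for the
# current, previous and next clauses.
from nltk.stem import WordNetLemmatizer

_lemmatizer = WordNetLemmatizer()   # requires the WordNet data to be installed

def attribution_features(curr, prev=None, nxt=None, position_in_sentence=0):
    prev = prev or ['NONE']
    nxt = nxt or ['NONE']
    feats = {'unigram=%s' % w.lower(): 1 for w in curr}
    # Approximation: lemmatize every token with a verb POS; a real
    # implementation would first filter tokens by their POS tags.
    feats.update({'verb=%s' % _lemmatizer.lemmatize(w.lower(), 'v'): 1 for w in curr})
    feats.update({
        'curr_first': curr[0].lower(), 'curr_last': curr[-1].lower(),
        'prev_last': prev[-1].lower(), 'next_first': nxt[0].lower(),
        'prev_last+curr_first': prev[-1].lower() + '_' + curr[0].lower(),
        'curr_last+next_first': curr[-1].lower() + '_' + nxt[0].lower(),
        'position': position_in_sentence,
        'has_quote': any(w in {'"', "''", '``'} for w in curr),  # assumed punctuation cue
    })
    return feats
```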
Evaluation
• Train: 02-21, dev: 22, test: 23
• Each component is tested:
  • without and with error propagation (EP) from the previous components
  • with gold standard (GS) parse trees and sentence boundaries, and with an automatic (Auto) parser and sentence splitter
Evaluation – Connective classifier
• GS: the new features increase accuracy and F1 by 2.05% and 3.05%
• Auto: the new features increase accuracy and F1 by 1.71% and 2.54%
• Contextual info is helpful
Evaluation – Argument position classifier
• Able to accurately label SS
• But performs badly on the NAPS class
• Due to the similarity between the IPS and NAPS classes
Evaluation – Argument extractor
• Human agreement on partial and exact matches: 94.5% and 90.2%
• Exact F1 is much lower than partial F1
• Due to small portions of text being deleted
Evaluation – Explicit classifier
• Baseline using only connective strings: 86%
• GS + no EP: F1 increased by 0.44%
Evaluation – Non-explicit classifier
• Majority baseline: all classified as EntRel
• Adding EP degrades F1 by ~13%, but still outperforms the baseline by ~6%
Evaluation – Attribution span labeler
• When EP is added: the decrease in F1 is largely due to the drop in precision
• When Auto is added: the decrease in F1 is largely due to the drop in recall
Evaluation – The whole pipeline
• Definition: a relation is correct if its relation type is classified correctly, and both Arg1 and Arg2 are partially or exactly matched
• GS + EP:
  • Partial match: 46.38% F1
  • Exact match: 31.72% F1
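A sketch of this relation-level scoring; representing a relation as a (type, Arg1 tokens, Arg2 tokens) tuple and treating any token overlap as a partial match are simplifying assumptions:

```python
# Sketch: relation-level F1 where a predicted relation is correct if its type
# matches gold and both arguments match (exactly, or by overlap when partial).
def relation_f1(gold, predicted, partial=True):
    def arg_match(a, b):
        return bool(set(a) & set(b)) if partial else a == b

    def rel_match(g, p):
        return g[0] == p[0] and arg_match(g[1], p[1]) and arg_match(g[2], p[2])

    precision = (sum(any(rel_match(g, p) for g in gold) for p in predicted)
                 / len(predicted)) if predicted else 0.0
    recall = (sum(any(rel_match(g, p) for p in predicted) for g in gold)
              / len(gold)) if gold else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)
```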
On-going changes
• Joint learning
• Change the rule-based argument extractor to a machine learning approach