Automatic classification for implicit discourse relations

Automatic classification for implicit discourse relations Lin Ziheng

PDTB and discourse relations • Explicit relations • Arg1: The bill intends to restrict the RTC to Treasury borrowings only, Arg2: unless the agency receives specific congressional authorization. (Alternative) (wsj_2200) • Implicit relations • Arg1: The loss of more customers is the latest in a string of problems. • Arg2: [for instance] Church's Fried Chicken Inc. and Popeye's Famous Fried Chicken Inc., which have merged, are still troubled by overlapping restaurant locations. (Instantiation) (wsj_2225)

PDTB and discourse relations (2) • PDTB hierarchy of relation classes, types and subtypes

PDTB and discourse relations (3) • Level-2 relation types in the implicit relations of the training sections (sec. 2–21) • Remove Condition, Pragmatic Condition, Pragmatic Contrast, Pragmatic Concession and Exception • 11 relation types remain • Dominating types: • Cause • Conjunction • Restatement

Contextual features • r1: Arg1: Tokyu Department Store advanced 260 to 2410. Arg2: [and] Tokyu Corp. was up 150 at 2890. (List) (wsj_0374) • r2: Arg1: Tokyu Department Store advanced 260 to 2410. Tokyu Corp. was up 150 at 2890. Arg2: [and] Tokyu Construction gained 170 to 1610. (List) (wsj_0374) • [Figure: shared argument; relations r1 and r2 over r1.Arg1, r1.Arg2, r2.Arg1, r2.Arg2] • [Figure: fully embedded argument; relations r1 and r2 over r1.Arg1, r1.Arg2, r2.Arg1, r2.Arg2]

Contextual features (2) • For each relation curr, look at the two surrounding relations, prev and next, giving a total of six features • First figure on the previous slide, where curr = r2 • Second figure on the previous slide, where curr = r2
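The slide text does not list the six features, so the following is only a minimal sketch of one plausible set: two shared-argument flags and four fully-embedded flags. The `Span` and `Relation` types, the feature names, and the use of sentence indices as argument spans are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional

class Span(NamedTuple):
    start: int  # index of the first sentence in the argument (inclusive)
    end: int    # index of the last sentence in the argument (inclusive)

class Relation(NamedTuple):
    arg1: Span
    arg2: Span

def covers(outer: Span, rel: Relation) -> bool:
    """True if both arguments of rel lie inside the span outer."""
    lo = min(rel.arg1.start, rel.arg2.start)
    hi = max(rel.arg1.end, rel.arg2.end)
    return outer.start <= lo and hi <= outer.end

def contextual_features(prev: Optional[Relation], curr: Relation,
                        nxt: Optional[Relation]) -> dict:
    """Six binary features (hypothetical names) describing how curr
    relates to its neighbouring relations prev and nxt."""
    return {
        # shared argument
        "prev.Arg2=curr.Arg1": prev is not None and prev.arg2 == curr.arg1,
        "curr.Arg2=next.Arg1": nxt is not None and curr.arg2 == nxt.arg1,
        # fully embedded argument
        "prev_in_curr.Arg1": prev is not None and covers(curr.arg1, prev),
        "next_in_curr.Arg2": nxt is not None and covers(curr.arg2, nxt),
        "curr_in_prev.Arg2": prev is not None and covers(prev.arg2, curr),
        "curr_in_next.Arg1": nxt is not None and covers(nxt.arg1, curr),
    }

# The Tokyu example (wsj_0374), with sentences numbered 0, 1, 2.
r1 = Relation(Span(0, 0), Span(1, 1))
r2 = Relation(Span(0, 1), Span(2, 2))
features = contextual_features(prev=r1, curr=r2, nxt=None)
```

With curr = r2, only the flag for prev being embedded in curr.Arg1 fires, which matches the fully embedded reading of the Tokyu example.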

Syntactic Features • Arg1: "The HUD budget has dropped by more than 70% since 1980," argues Mr. Colton. Arg2: [so] "We've taken more than our fair share. (Cause) (wsj_2227)

Syntactic Features (2) • Collect all production rules: • Ignore function tags, such as -TPC, -SBJ, -EXT • From Arg1: S → NP VP, NP → DT NNP NN, VP → VBZ VP, VP → VBN PP PP, PP → IN NP, NP → QP NN, QP → JJ IN CD, NP → CD, DT → The, NNP → HUD, NN → budget, VBZ → has, VBN → dropped, IN → by, JJ → more, IN → than, CD → 70, NN → %, IN → since, CD → 1980 • From Arg2: S → `` NP VP ., NP → PRP, VP → VBP VP, VP → VBN NP, NP → NP PP, NP → JJR, PP → IN NP, NP → PRP$ JJ NN, `` → ``, PRP → We, VBP → 've, VBN → taken, JJR → more, IN → than, PRP$ → our, JJ → fair, NN → share, . → .
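As a rough illustration of the rule collection above, the sketch below reads a Penn Treebank-style bracketed parse, strips function tags, and emits one rule per internal node. The helper names and the simplified tree fragment are assumptions for illustration; the slides do not say which tooling was used.

```python
import re

def parse_tree(text):
    """Parse a bracketed tree into nested (label, children) tuples.
    Leaves (words) are plain strings."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                      # skip ")"
        return (label, children)

    return node()

def strip_function_tags(label):
    """NP-SBJ -> NP, NP-TPC-1 -> NP; labels such as -NONE- are kept."""
    if label.startswith("-"):
        return label
    return re.split(r"[-=]", label)[0]

def production_rules(tree):
    """Return rules such as 'S -> NP VP' and 'DT -> The' in pre-order."""
    label, children = tree
    rhs = [c if isinstance(c, str) else strip_function_tags(c[0])
           for c in children]
    rules = [f"{strip_function_tags(label)} -> {' '.join(rhs)}"]
    for child in children:
        if not isinstance(child, str):
            rules.extend(production_rules(child))
    return rules

# Simplified fragment of the Arg1 parse (the PPs under "dropped" are omitted).
fragment = ("(S (NP-SBJ (DT The) (NNP HUD) (NN budget)) "
            "(VP (VBZ has) (VP (VBN dropped))))")
rules = production_rules(parse_tree(fragment))
```

Each distinct rule string can then serve as one feature for the argument it came from. The ASCII arrow `->` stands in for the arrow used on the slide.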

Dependency features

Dependency features (2) • Collect all words with dependency types from their dependents • From Arg1: budget ← det nn, dropped ← nsubj aux prep prep, by ← pobj, than ← advmod, 70 ← quantmod, % ← num, since ← pobj, argues ← ccomp nsubj, Colton ← nn • From Arg2: taken ← nsubj aux dobj, more ← prep, than ← pobj, share ← poss amod
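A minimal sketch of how such rules could be assembled from a dependency parse given as (head, type, dependent) triples. The triple format, the function name, and the specific dependents assigned to each edge of Arg2 are assumptions; only the resulting head-and-types strings come from the slide.

```python
from collections import defaultdict

def dependency_rules(edges):
    """Group dependency types by head word.

    edges: iterable of (head_word, dep_type, dependent_word) triples,
    listed in the order the dependents should appear in the rule.
    Returns strings such as 'taken <- nsubj aux dobj'.
    """
    types_by_head = defaultdict(list)
    for head, dep_type, _dependent in edges:
        types_by_head[head].append(dep_type)
    return [f"{head} <- {' '.join(types)}"
            for head, types in types_by_head.items()]

# Arg2: "We've taken more than our fair share."
# The dependents below are inferred from the sentence for illustration.
arg2_edges = [
    ("taken", "nsubj", "We"),
    ("taken", "aux", "'ve"),
    ("taken", "dobj", "more"),
    ("more", "prep", "than"),
    ("than", "pobj", "share"),
    ("share", "poss", "our"),
    ("share", "amod", "fair"),
]
arg2_rules = dependency_rules(arg2_edges)
```

Only words that have at least one dependent produce a rule, which is why leaf words such as "We" and "fair" do not appear on the slide.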

Lexical features • Collect all word pairs from Arg1 and Arg2, i.e., all (wi, wj) where wi is a word from Arg1 and wj is a word from Arg2
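The word-pair features amount to a Cartesian product of the two arguments' tokens. The sketch below assumes whitespace-tokenized input and leaves out any normalization (lowercasing, stemming), since the slide does not specify it.

```python
from itertools import product

def word_pairs(arg1_tokens, arg2_tokens):
    """Return the set of all (w_i, w_j) with w_i from Arg1 and w_j from Arg2."""
    return set(product(arg1_tokens, arg2_tokens))

arg1 = "Tokyu Department Store advanced 260 to 2410 .".split()
arg2 = "Tokyu Corp. was up 150 at 2890 .".split()
pairs = word_pairs(arg1, arg2)
```

Because every Arg1 word pairs with every Arg2 word, the feature space grows quickly, which is why a selection step over word pairs is needed before training.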

Experiments • Classifier: OpenNLP MaxEnt • Training data: sections 2–21 • Test data: section 23 • Use Mutual Information (MI) to rank features for production rules, dependency rules and word pairs separately • Majority baseline: 26.1%, obtained by classifying all instances as Cause
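The slide does not give the MI formula, so the sketch below assumes the standard mutual information between a feature's presence and the relation label, I(F; C) = Σ p(f, c) log2[p(f, c) / (p(f) p(c))], and also shows how a majority baseline is computed. The function names and the toy instances are made up for illustration and are not PDTB data.

```python
import math
from collections import Counter

def mutual_information(instances, feature):
    """MI in bits between presence of `feature` and the class label.

    instances: list of (feature_set, label) pairs.
    """
    n = len(instances)
    joint = Counter((feature in feats, label) for feats, label in instances)
    presence_counts, label_counts = Counter(), Counter()
    for (present, label), count in joint.items():
        presence_counts[present] += count
        label_counts[label] += count
    return sum(
        (count / n) * math.log2(
            count * n / (presence_counts[present] * label_counts[label]))
        for (present, label), count in joint.items()
    )

def rank_features(instances, top_k):
    """Return the top_k features by MI; ties are broken alphabetically."""
    vocab = set().union(*(feats for feats, _ in instances))
    return sorted(
        vocab, key=lambda f: (-mutual_information(instances, f), f))[:top_k]

def majority_baseline(labels):
    """Return the most frequent label and the accuracy of always predicting it."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)

# Toy data, invented for this example.
toy = [
    ({"a"}, "Cause"),
    ({"a"}, "Cause"),
    ({"b"}, "Conjunction"),
    ({"b", "c"}, "Conjunction"),
]
top_two = rank_features(toy, top_k=2)
baseline = majority_baseline(["Cause", "Cause", "Conjunction"])
```

Ranking each feature class separately, as the slide describes, would mean calling `rank_features` once per class (production rules, dependency rules, word pairs) rather than on their union.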

Experiments (2) • Use contextual features and one other feature class • context + production rules • context + dependency rules • context + word pairs

Experiments (3) • With large numbers of features • context + all production rules: 36.68% • context + all dependency rules: 27.94% • context + 10,000 word pairs: 35.25%

Experiments (4) • Combining all feature classes gives an accuracy of 40.21% • The following shows that all feature classes contribute to the performance
