Detecting Anaphoricity and Antecedenthood for Coreference Resolution. Olga Uryupina (uryupina@gmail.com), Institute of Linguistics, RAS. 13.11.08
Overview • Anaphoricity and Antecedenthood • Experiments • Incorporating A&A detectors into a CR system • Conclusion
A&A: example Shares in Loral Space will be distributed to Loral shareholders. The new company will start life with no debt and $700 million in cash. Globalstar still needs to raise $600 million, and Schwartz said that the company would try to raise the money in the debt market.
Anaphoricity Likely anaphors: - pronouns, definite descriptions Unlikely anaphors: - indefinites Unknown: - proper names Poesio & Vieira: more than 50% of definite descriptions in newswire text are not anaphoric!
Antecedenthood • Related to referentiality (Karttunen, 1976): "no debt", etc. • Antecedenthood vs. referentiality: corpus-based decision
Experiments • Can we learn anaphoricity/antecedenthood classifiers? • Do they help for coreference resolution?
Methodology • MUC-7 dataset • Anaphoricity/antecedenthood induced from the MUC annotations • Ripper, SVM
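One plausible way to induce the two labels from coreference chains is sketched below. It assumes the common convention that a markable is anaphoric if it has an earlier mention in its chain, and is an antecedent if it has a later one; the slides do not spell out the exact rule. The function name `induce_labels` and the toy chain are hypothetical and are not taken from the MUC-7 gold annotation.

```python
from typing import Dict, List, Tuple


def induce_labels(
    markables: List[str], chains: List[List[str]]
) -> Dict[str, Tuple[bool, bool]]:
    """Map each markable to (is_anaphoric, is_antecedent).

    `markables` lists all markables in document order.
    `chains` lists coreference chains, each a list of markables.
    Assumption: every mention but the first in a chain is anaphoric,
    and every mention but the last is an antecedent.
    """
    position = {m: i for i, m in enumerate(markables)}
    # Markables outside any chain are neither anaphors nor antecedents.
    labels = {m: (False, False) for m in markables}
    for chain in chains:
        ordered = sorted(chain, key=lambda m: position[m])
        last = len(ordered) - 1
        for i, m in enumerate(ordered):
            labels[m] = (i > 0, i < last)
    return labels


# Illustrative only: a guessed chain over the example text.
markables = ["no debt", "$700 million", "$600 million", "the money"]
chains = [["$600 million", "the money"]]
labels = induce_labels(markables, chains)
```

Under this convention "no debt" ends up as a negative example for both classifiers, which matches the referentiality intuition from Karttunen (1976).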
Features • Surface form (12) • Syntax (20) • Semantics (3) • Salience (10) • „same-head“ (2) • From Karttunen, 1976 (7) 49 features – 123 boolean/continuous
Integrating A&A into a CR system Apply A&A prefiltering before CR starts: • Saves time • Improves precision Problem: we can filter out good candidates: - Will lose some recall
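The prefiltering idea can be sketched as a pair generator that consults the two detectors before a pair ever reaches the coreference classifier. This is a minimal illustration, not the system described in the talk: `candidate_pairs` and the two toy predicates are hypothetical stand-ins for learned A&A classifiers.

```python
from typing import Callable, List, Tuple


def candidate_pairs(
    markables: List[str],
    is_anaphor: Callable[[str], bool],
    is_antecedent: Callable[[str], bool],
) -> List[Tuple[str, str]]:
    """Return (antecedent, anaphor) pairs that survive A&A prefiltering.

    `markables` are in document order. Candidates are collected from
    right to left, starting with the closest preceding markable.
    """
    pairs = []
    for j, anaphor in enumerate(markables):
        if not is_anaphor(anaphor):
            continue  # discarded: no pairs are built for this markable
        for i in range(j - 1, -1, -1):
            if is_antecedent(markables[i]):
                pairs.append((markables[i], anaphor))
    return pairs


markables = ["no debt", "$600 million", "the money"]

# Toy detectors (assumptions for illustration only).
filtered = candidate_pairs(
    markables,
    is_anaphor=lambda m: m.startswith("the "),
    is_antecedent=lambda m: not m.startswith("no "),
)

# Baseline without prefiltering: every markable passes both detectors.
unfiltered = candidate_pairs(markables, lambda m: True, lambda m: True)
```

Here prefiltering reduces three candidate pairs to one, which is the source of the time saving. A detector that wrongly rejects a true anaphor or antecedent removes its pairs for good, which is the source of the recall loss.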
Oracle-based A&A prefiltering • Take MUC-based A&A classifier ("gold standard") • CR system: Soon et al. (2001) with SVMs • MUC-7 validation set (3 "training" documents)
Automatically induced classifiers • Precision is more crucial than recall • Learn Ripper classifiers with different values of L (loss ratio)
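To make the precision/recall trade-off concrete, the sketch below scores two hypothetical filters with an asymmetric loss. The `loss_ratio` parameter here is an assumption defined as the cost of a false positive relative to a false negative; it illustrates the idea of a loss ratio and is not necessarily Ripper's exact definition of L. The positive class stands for the decision the filter acts on.

```python
from typing import Sequence, Tuple


def confusion(gold: Sequence[bool], pred: Sequence[bool]) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives)."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    return tp, fp, fn


def precision_recall(gold: Sequence[bool], pred: Sequence[bool]) -> Tuple[float, float]:
    tp, fp, fn = confusion(gold, pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def weighted_loss(gold: Sequence[bool], pred: Sequence[bool], loss_ratio: float) -> float:
    """Total loss where one false positive costs `loss_ratio` false negatives."""
    _, fp, fn = confusion(gold, pred)
    return loss_ratio * fp + fn


gold = [True, True, False, False]
cautious = [True, False, False, False]  # high precision, lower recall
eager = [True, True, True, False]       # full recall, lower precision
```

With `loss_ratio=3` the cautious filter has the lower loss (1 versus 3), while with `loss_ratio=0.5` the eager one does (0.5 versus 1). Varying the ratio therefore moves the learned classifier along the precision/recall curve.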
Conclusion Automatically induced detectors: • Reliable for anaphoricity • Much less reliable for antecedenthood (a corpus explicitly annotated for referentiality could help) A&A prefiltering: • Ideally, should help • In practice – substantial optimization required