Improving Machine Learning Approaches to Coreference Resolution
Vincent Ng and Claire Cardie, Cornell Univ., ACL 2002
Slides prepared by Ralph Grishman
Goal
Improve on Soon et al. by:
• better preprocessing (chunking, names, …)
• a better search procedure for the antecedent
• better selection of positive training examples
• many more features (41 added)
Better search for the antecedent
• Soon et al. use a decision tree as a binary classifier and take the nearest antecedent classified as positive.
• Ng & Cardie use the same sort of classifier, but count the positive and negative training examples at each leaf and use those counts to compute a probability.
• Ng & Cardie then take the highest-ranking antecedent (if its probability > 0.5).
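A minimal sketch of this selection rule, assuming the positive/negative counts at each leaf come from an already-trained decision tree (the Candidate record and all other names here are illustrative, not from the paper):

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    np_id: int   # index of the candidate antecedent NP
    pos: int     # positive training examples at the decision-tree leaf it reaches
    neg: int     # negative training examples at that leaf

def leaf_probability(c: Candidate) -> float:
    """Turn the example counts at a leaf into an estimate of P(coreferent)."""
    total = c.pos + c.neg
    return c.pos / total if total else 0.0

def select_antecedent(candidates: list[Candidate]) -> int | None:
    """Best-first selection: take the highest-probability candidate,
    but only if its probability exceeds 0.5; otherwise the anaphor
    is left without an antecedent."""
    if not candidates:
        return None
    best = max(candidates, key=leaf_probability)
    return best.np_id if leaf_probability(best) > 0.5 else None
```

Soon et al.'s right-to-left, first-positive search roughly corresponds to stopping at the nearest candidate whose probability exceeds 0.5 instead of taking the maximum over all candidates.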
Better choice of positive examples
• Soon et al. always use the most recent antecedent as the positive training example.
• Ng & Cardie, when the anaphor is not a pronoun, use the most recent antecedent that is also not a pronoun.
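A small sketch of this training-example selection, assuming each mention carries an is_pronoun flag (an illustrative representation, not the paper's markable format):

```python
from dataclasses import dataclass

@dataclass
class Mention:
    text: str
    is_pronoun: bool

def positive_example(anaphor: Mention, preceding: list[Mention]) -> Mention | None:
    """Pick the antecedent used as the positive training instance.
    `preceding` is ordered nearest-first; for a non-pronoun anaphor,
    pronoun antecedents are skipped."""
    for antecedent in preceding:
        if anaphor.is_pronoun or not antecedent.is_pronoun:
            return antecedent   # nearest suitable antecedent
    return None                 # no suitable antecedent found
```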
More features #1
• Soon et al. have a single ‘same string’ feature.
• Ng & Cardie split this into 3 features: one each for pronominals, nominals, and names.
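One way to realize the split, sketched with hypothetical feature names (the paper's actual feature names are not reproduced here):

```python
def string_match_features(np1: dict, np2: dict) -> dict:
    """Split the single 'same string' flag into three type-specific flags.
    Each NP is a dict with 'text' and 'kind' in {'pronoun', 'nominal',
    'name'} (illustrative representation, not the paper's)."""
    same = np1["text"].lower() == np2["text"].lower()
    return {
        "pronoun_str": same and np1["kind"] == np2["kind"] == "pronoun",
        "nominal_str": same and np1["kind"] == np2["kind"] == "nominal",
        "name_str":    same and np1["kind"] == np2["kind"] == "name",
    }
```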
More features
Added 41 more features:
• lexical
• grammatical
• semantic
Lexical features (examples)
• non-empty overlap between the words of the two NPs
• the prenominal modifiers of one NP are a subset of the prenominal modifiers of the other
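A sketch of these two lexical features over pre-tokenized NPs (tokenization and prenominal-modifier extraction are assumed to happen upstream; not the paper's code):

```python
def lexical_features(np1_words, np2_words, np1_premods, np2_premods):
    """Compute two lexical features on token lists."""
    w1, w2 = set(np1_words), set(np2_words)
    m1, m2 = set(np1_premods), set(np2_premods)
    return {
        "word_overlap": bool(w1 & w2),            # NPs share at least one word
        "modifier_subset": m1 <= m2 or m2 <= m1,  # one NP's prenominal modifiers
                                                  # are contained in the other's
    }
```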
Grammatical features (examples)
• the two NPs form a predicate nominal construction
• one NP spans the other
• NP1 is a quoted string
• one of the NPs is a title
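These are all binary tests; a sketch over an illustrative NP record (the predicate-nominal test needs a parse, so it is shown here as a precomputed input):

```python
from dataclasses import dataclass

@dataclass
class NP:
    text: str
    start: int       # character span of the NP in the document
    end: int
    is_title: bool

def grammatical_features(np1: NP, np2: NP, pred_nominal: bool) -> dict:
    """Binary grammatical feature tests (illustrative record, not the paper's)."""
    spans = (np1.start <= np2.start and np2.end <= np1.end) or \
            (np2.start <= np1.start and np1.end <= np2.end)
    return {
        "pred_nominal": pred_nominal,   # "X is Y" construction, from the parse
        "span": spans,                  # one NP contains the other
        "quoted": np1.text.startswith('"') and np1.text.endswith('"'),
        "title": np1.is_title or np2.is_title,
    }
```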
Semantic features (examples)
For nominals with different heads:
• a direct or indirect hypernym relation in WordNet
• the distance of the hypernym relation
• the sense number at which the hypernym relation holds
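A sketch of these WordNet tests using NLTK, purely as an illustration (the paper predates NLTK, and its exact sense-selection policy is not reproduced here):

```python
from nltk.corpus import wordnet as wn   # requires: nltk.download('wordnet')

def hypernym_features(head1: str, head2: str, max_senses: int = 5) -> dict:
    """Does some sense of head1 stand in a (possibly indirect) hypernym
    relation to some sense of head2? If so, report the path distance
    and the sense number of head1 at which the relation holds."""
    for i, s1 in enumerate(wn.synsets(head1, pos=wn.NOUN)[:max_senses], 1):
        for s2 in wn.synsets(head2, pos=wn.NOUN)[:max_senses]:
            # Is one synset in the transitive hypernym closure of the other?
            if s1 in s2.closure(lambda s: s.hypernyms()) or \
               s2 in s1.closure(lambda s: s.hypernyms()):
                dist = s1.shortest_path_distance(s2)
                return {"hypernym": True, "distance": dist, "sense": i}
    return {"hypernym": False, "distance": None, "sense": None}
```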
Selecting features
• The full feature set yielded very low precision on nominal anaphors.
• Diagnosis: overtraining, i.e. too many features for too little data.
• So they (manually) eliminated many features that led to low precision (measured on the training data).
• Caveat: there was no ‘development set’ separate from the training and test sets.
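The pruning was done by hand, but the idea can be sketched as a greedy loop that drops whichever feature removal most improves training-set precision; train_precision is an assumed callback that retrains and scores the classifier on a given feature subset:

```python
def prune_features(features, train_precision):
    """Greedy illustration of precision-driven feature elimination.
    Repeatedly remove a feature if doing so improves precision on the
    training data; stop when no single removal helps."""
    current = set(features)
    best = train_precision(current)
    improved = True
    while improved and len(current) > 1:
        improved = False
        for f in sorted(current):
            p = train_precision(current - {f})
            if p > best:
                best, current, improved = p, current - {f}, True
                break   # greedily accept the first improving removal
    return current
```

Because precision is measured on the training data, this procedure can itself overfit the feature set; that is exactly the risk a separate development set would normally guard against.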