This ACL 2002 study by Vincent Ng and Claire Cardie of Cornell University aims to improve coreference resolution through better preprocessing, a better search procedure for the antecedent, better selection of positive training examples, and additional lexical, grammatical, and semantic features. The presentation discusses how these changes lead to higher F-scores and better identification of coreferent mentions.
Improving Machine Learning Approaches to Coreference Resolution Vincent Ng and Claire Cardie Cornell Univ. ACL 2002 slides prepared by Ralph Grishman
Goal Improve on Soon et al. by • better preprocessing (chunking, names, …) • better search procedure for antecedent • better selection of positive examples • more features • more features • more features ...
Better search for antecedent • Soon et al. use a decision tree as a binary classifier and take the nearest antecedent classified as +ve • Ng&Cardie use the same sort of classifier, but count the +ve and -ve examples at each leaf and use those counts to compute a probability • Ng&Cardie then take the highest-ranking antecedent (if its probability > 0.5)
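The two search strategies on this slide can be contrasted with a minimal sketch. The function names, the toy leaf counts, and the unsmoothed ratio used as the probability are all assumptions made for illustration; the slides do not say how the leaf counts are converted to a probability or how ties are broken.

```python
from typing import Callable, Optional, Sequence


def leaf_probability(pos: int, neg: int) -> float:
    """Fraction of positive training examples at a decision-tree leaf.

    Assumption: a plain ratio with no smoothing.
    """
    total = pos + neg
    return pos / total if total else 0.0


def closest_first(candidates: Sequence[str],
                  is_positive: Callable[[str], bool]) -> Optional[str]:
    """Soon et al. style: scan from the nearest candidate backwards and
    stop at the first one the classifier labels positive."""
    for cand in reversed(candidates):
        if is_positive(cand):
            return cand
    return None


def best_first(candidates: Sequence[str],
               prob: Callable[[str], float],
               threshold: float = 0.5) -> Optional[str]:
    """Ng & Cardie style: pick the candidate with the highest probability,
    provided it exceeds the threshold. Ties go to the nearer candidate
    (an assumption of this sketch)."""
    best, best_p = None, threshold
    for cand in reversed(candidates):
        p = prob(cand)
        if p > best_p:
            best, best_p = cand, p
    return best


# Hypothetical candidates in document order, with made-up (+ve, -ve)
# counts for the leaf each candidate/anaphor pair falls into.
candidates = ["Mr. Smith", "the company", "the chairman"]
leaf_counts = {"Mr. Smith": (9, 1), "the company": (1, 9), "the chairman": (3, 2)}


def prob(cand: str) -> float:
    return leaf_probability(*leaf_counts[cand])


print(closest_first(candidates, lambda c: prob(c) > 0.5))  # the chairman
print(best_first(candidates, prob))                        # Mr. Smith
```

In this toy case the nearest positive candidate has probability 0.6, while a more distant one has 0.9, so the two strategies pick different antecedents.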
Better choice of positive examples • Soon et al. always use most recent antecedent • For Ng&Cardie, if anaphor is not a pronoun, they use most recent antecedent that is not a pronoun
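A rough sketch of the two ways of picking the positive training antecedent follows. The `Mention` class, its fields, and the `skip_pronouns` flag are hypothetical names introduced here; the sketch assumes gold coreference chains are available as integer ids.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mention:
    text: str
    chain: int        # gold coreference chain id (hypothetical field)
    is_pronoun: bool


def positive_antecedent(mentions: List[Mention], j: int,
                        skip_pronouns: bool) -> Optional[Mention]:
    """Return the antecedent paired with mentions[j] as a positive example.

    skip_pronouns=False: most recent coreferent mention (Soon et al.).
    skip_pronouns=True: if the anaphor is not a pronoun, the most recent
    coreferent mention that is not a pronoun (Ng & Cardie).
    """
    anaphor = mentions[j]
    for i in range(j - 1, -1, -1):
        cand = mentions[i]
        if cand.chain != anaphor.chain:
            continue
        if skip_pronouns and not anaphor.is_pronoun and cand.is_pronoun:
            continue
        return cand
    return None


mentions = [
    Mention("Bill Clinton", 1, False),
    Mention("he", 1, True),
    Mention("the president", 1, False),
]

print(positive_antecedent(mentions, 2, skip_pronouns=False).text)  # he
print(positive_antecedent(mentions, 2, skip_pronouns=True).text)   # Bill Clinton
```

For the non-pronominal anaphor "the president", the second setting skips the intervening "he" and pairs it with "Bill Clinton" instead.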
More features #1 • Soon et al. have a ‘same string’ feature • Ng&Cardie split this up into 3 features, for pronominals, nominals, and names
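The split can be illustrated as below. The feature names, the NP type labels, and the normalization (lowercasing and dropping one leading determiner) are assumptions of this sketch, as is the choice to fire a feature only when both NPs have the same type.

```python
from typing import Dict

DETERMINERS = {"a", "an", "the", "this", "that", "these", "those"}
NP_TYPES = ("pronominal", "nominal", "name")


def normalize(np_text: str) -> str:
    """Lowercase and drop one leading determiner (assumed normalization).

    A single-word NP is left alone so that a bare pronoun such as "that"
    is not reduced to an empty string.
    """
    words = np_text.lower().split()
    if len(words) > 1 and words[0] in DETERMINERS:
        words = words[1:]
    return " ".join(words)


def same_string(np1: str, np2: str) -> bool:
    """Single 'same string' feature, in the spirit of Soon et al."""
    return normalize(np1) == normalize(np2)


def split_same_string(np1: str, type1: str,
                      np2: str, type2: str) -> Dict[str, bool]:
    """Three type-specific features: each fires only if the strings match
    and both NPs have that type."""
    match = same_string(np1, np2)
    return {
        f"{kind}_same_string": match and type1 == kind and type2 == kind
        for kind in NP_TYPES
    }


print(split_same_string("the plan", "nominal", "The plan", "nominal"))
print(split_same_string("he", "pronominal", "he", "pronominal"))
```

Splitting lets the learner weight a string match between two names differently from a match between two pronouns, which a single feature cannot express.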
More features Added 41 more features: • lexical • grammatical • semantic
Lexical features (examples) • Non-empty overlap of words of two NPs • Prenominal modifiers of one NP are a subset of prenominal modifiers of other
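Both example features reduce to set operations over the words of the two NPs, as in this small sketch. It assumes whitespace tokenization, ignores determiners, and treats the last word of each NP as its head; that head rule is a crude stand-in for a real parser or chunker.

```python
from typing import List, Set

DETERMINERS = {"a", "an", "the", "this", "that", "these", "those"}


def content_words(np_text: str) -> List[str]:
    """Lowercased tokens with determiners removed (assumed preprocessing)."""
    return [w for w in np_text.lower().split() if w not in DETERMINERS]


def word_overlap(np1: str, np2: str) -> bool:
    """True if the two NPs share at least one word."""
    return bool(set(content_words(np1)) & set(content_words(np2)))


def prenominal_modifiers(np_text: str) -> Set[str]:
    """Everything before the head, taking the last word as the head."""
    return set(content_words(np_text)[:-1])


def modifiers_subset(np1: str, np2: str) -> bool:
    """True if one NP's prenominal modifiers are a subset of the other's.

    Note that an NP with no modifiers trivially satisfies this.
    """
    m1, m2 = prenominal_modifiers(np1), prenominal_modifiers(np2)
    return m1 <= m2 or m2 <= m1


print(word_overlap("a new plan", "the old plan"))           # True
print(modifiers_subset("the big red car", "the red car"))   # True
print(modifiers_subset("a new plan", "the old plan"))       # False
```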
Grammatical features (examples) • NPs are in predicate nominal construct • One NP spans the other • NP1 is a quoted string • One of the NPs is a title
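Two of these features can be approximated without a parser, as sketched below. The `NP` class with character offsets is a hypothetical representation; the predicate nominal and title features need syntactic and lexical resources and are not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NP:
    text: str
    start: int  # character offset of the first character (hypothetical)
    end: int    # offset one past the last character


def one_spans_other(a: NP, b: NP) -> bool:
    """True if either NP's span contains the other's (identical spans count)."""
    a_in_b = b.start <= a.start and a.end <= b.end
    b_in_a = a.start <= b.start and b.end <= a.end
    return a_in_b or b_in_a


# Opening quote mark -> expected closing quote mark.
QUOTE_PAIRS = {'"': '"', "'": "'", "\u201c": "\u201d", "\u2018": "\u2019"}


def is_quoted(np: NP) -> bool:
    """True if the NP text begins and ends with a matching pair of quotes."""
    t = np.text.strip()
    return len(t) >= 2 and QUOTE_PAIRS.get(t[0]) == t[-1]


sentence = 'the president of "Acme Corp"'
inner_text = '"Acme Corp"'
inner_start = sentence.index(inner_text)

outer = NP(sentence, 0, len(sentence))
inner = NP(inner_text, inner_start, inner_start + len(inner_text))

print(one_spans_other(outer, inner))  # True
print(is_quoted(inner))               # True
print(is_quoted(outer))               # False
```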
Semantic features (examples) For nominals with different heads • direct or indirect hypernym relation in WordNet • distance of hypernym relation • sense number for hypernym relation
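The hypernym relation and its distance can be sketched over a toy taxonomy. The `PARENT_OF` dictionary below is a made-up, single-parent stand-in for WordNet, and the function names are hypothetical; the sense-number feature depends on WordNet's sense inventory and is left out.

```python
from typing import Dict, Optional

# Toy child -> parent (hypernym) links; an illustrative stand-in for WordNet.
PARENT_OF: Dict[str, str] = {
    "dog": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
}


def hypernym_distance(word: str, ancestor: str,
                      parent_of: Dict[str, str]) -> Optional[int]:
    """Number of hypernym links from word up to ancestor, or None."""
    steps = 0
    current = word
    seen = set()
    while current in parent_of and current not in seen:
        seen.add(current)  # guards against cycles in a malformed taxonomy
        current = parent_of[current]
        steps += 1
        if current == ancestor:
            return steps
    return None


def hypernym_features(head1: str, head2: str,
                      parent_of: Dict[str, str]) -> Dict[str, object]:
    """Check for a hypernym relation in either direction between two heads."""
    dist = hypernym_distance(head1, head2, parent_of)
    if dist is None:
        dist = hypernym_distance(head2, head1, parent_of)
    return {
        "hypernym": dist is not None,  # direct or indirect relation
        "direct": dist == 1,
        "distance": dist,
    }


print(hypernym_features("dog", "animal", PARENT_OF))
print(hypernym_features("dog", "cat", PARENT_OF))
```

Here "dog" and "animal" are related at distance 3, while "dog" and "cat" share an ancestor but neither is a hypernym of the other, so no relation is reported.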
Selecting features • Full feature set yielded very low precision on nominal anaphors • overtraining: too many features for too little data • So they manually eliminated many of the features that led to low precision on the training data • no ‘development set’ separate from training and test sets