A Deterministic Co-reference System with Rich Syntactic Features and Semantic Knowledge
Heeyoung Lee & Sudarshan Rangarajan
Collaborators: Karthik Raghunathan, under the guidance of Mihai Surdeanu, Nate Chambers & Dan Jurafsky
The Problem
• boolean isCoreferent(Mention A, Mention B)
‘More important to the future of 8mm is Sony's success in the $2.3 billion camcorder market. The Japanese company already has 12% of the total camcorder market, ranking it third behind the RCA and Panasonic brands.’
• isCoreferent(‘Sony’, ‘The Japanese Company’) : TRUE
• isCoreferent(‘The Japanese Company’, ‘it’) : TRUE
• isCoreferent(‘Sony’, ‘it’) : TRUE
• isCoreferent(‘it’, ‘camcorder market’) : FALSE
• isCoreferent(‘it’, ‘RCA’) : FALSE
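The TRUE/FALSE answers above are consistent with grouping mentions into clusters: once ‘Sony’ is linked to ‘The Japanese company’ and that mention to ‘it’, the third TRUE follows by transitivity. A minimal sketch of that idea is given here, using a union-find structure. The class `CorefClusters` and its methods are hypothetical names chosen for illustration, and mentions are represented as plain strings for brevity (a real system would use mention spans, since the same string can occur more than once).

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: mentions are plain strings, which a real system would
// replace with span objects (document position, head word, and so on).
final class CorefClusters {
    // Maps each mention to its parent; a mention that is its own parent is a cluster root.
    private final Map<String, String> parent = new HashMap<>();

    // Returns the cluster root of a mention, registering unseen mentions as singletons.
    String find(String mention) {
        parent.putIfAbsent(mention, mention);
        String p = parent.get(mention);
        if (!p.equals(mention)) {
            p = find(p);
            parent.put(mention, p); // path compression
        }
        return p;
    }

    // Records a positive pairwise decision by merging the two clusters.
    void link(String a, String b) {
        parent.put(find(a), find(b));
    }

    // Two mentions are coreferent when they share a cluster root.
    boolean isCoreferent(String a, String b) {
        return find(a).equals(find(b));
    }

    public static void main(String[] args) {
        CorefClusters clusters = new CorefClusters();
        clusters.link("Sony", "The Japanese company");
        clusters.link("The Japanese company", "it");
        System.out.println(clusters.isCoreferent("Sony", "it"));
        System.out.println(clusters.isCoreferent("it", "RCA"));
    }
}
```

Only the two direct links are stored; the ‘Sony’/‘it’ decision is derived from them, which is why a single wrong link can affect many pairwise answers.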
Baseline System
• Simple Coreference Resolution with Rich Syntactic and Semantic Features, by Aria Haghighi & Dan Klein (EMNLP 2009)
• Deterministic, single-pass, constraint-based system
• Includes syntactic salience & agreement constraint checking
• Lacks semantic knowledge in decision making
‘President Bush and his colleague had different opinions. However, the person who has the right to make the final decision is the president.’
Preliminary Error Analysis
• Corpus for error analysis: MUC-6 (train set)
• Corpora for experiments: MUC-6 & ACE
Simple Knowledge Extraction System (SKES)
[Diagram: Seed & Mention Pairs -> (yield) -> Semantic Patterns -> (yield)]
• Metrics used to refine pattern yield
Construct passes
• Sort decision features: highest precision first.
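The ordering step can be sketched as a sort on measured precision. The snippet below is an illustration under stated assumptions, not the authors' code: `PassOrdering`, `Feature`, the feature names, and the counts in `main` are all hypothetical placeholders (the counts are not measurements from MUC-6 or ACE). Precision is taken here as correct links divided by proposed links on a development set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class PassOrdering {
    // A decision feature with counts gathered on a development set (hypothetical).
    static final class Feature {
        final String name;
        final int correctLinks;   // proposed links that were right
        final int proposedLinks;  // all links the feature proposed

        Feature(String name, int correctLinks, int proposedLinks) {
            this.name = name;
            this.correctLinks = correctLinks;
            this.proposedLinks = proposedLinks;
        }

        // A feature that proposed nothing is treated as precision 0 so it sorts last.
        double precision() {
            return proposedLinks == 0 ? 0.0 : (double) correctLinks / proposedLinks;
        }
    }

    // Returns a new list with the highest-precision feature first.
    static List<Feature> orderByPrecision(List<Feature> features) {
        List<Feature> sorted = new ArrayList<>(features);
        sorted.sort(Comparator.<Feature>comparingDouble(Feature::precision).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        // Placeholder names and counts, invented for this example.
        List<Feature> features = Arrays.asList(
            new Feature("pronounAgreement", 60, 100),
            new Feature("exactStringMatch", 95, 100),
            new Feature("headMatch", 80, 100));
        for (Feature f : orderByPrecision(features)) {
            System.out.println(f.name + " " + f.precision());
        }
    }
}
```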
Multi-pass Coreference System
• Deterministic, multi-pass, constraint-based system.
• Decisions on the most confident mention pairs are made first.
• Later decisions build on the knowledge about mentions accumulated in earlier passes.
‘President Bush and his colleague had different opinions. However, the person who has the right to make the final decision is the president.’
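A minimal sketch of the multi-pass idea follows, assuming each pass is a deterministic predicate that decides whether a mention may join a cluster built by earlier passes. Everything here is hypothetical: the class `MultiPassSieve`, the `Pass` interface, and the two toy passes (a hand-listed predicate-nominative pair and a crude "last token" head match) are stand-ins invented for illustration, not the passes of the actual system.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class MultiPassSieve {
    // One deterministic pass: may `mention` join a cluster built so far?
    interface Pass {
        boolean links(List<String> antecedentCluster, String mention);
    }

    private final List<String> mentions;
    private final int[] clusterId; // cluster label per mention index

    MultiPassSieve(List<String> mentions) {
        this.mentions = new ArrayList<>(mentions);
        this.clusterId = new int[mentions.size()];
        for (int i = 0; i < clusterId.length; i++) clusterId[i] = i; // singletons
    }

    private List<String> members(int id) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < clusterId.length; i++) {
            if (clusterId[i] == id) out.add(mentions.get(i));
        }
        return out;
    }

    private void merge(int from, int into) {
        for (int i = 0; i < clusterId.length; i++) {
            if (clusterId[i] == from) clusterId[i] = into;
        }
    }

    // Runs the passes in the given order; each pass sees clusters left by earlier ones.
    void run(List<Pass> passesHighestPrecisionFirst) {
        for (Pass pass : passesHighestPrecisionFirst) {
            for (int j = 1; j < clusterId.length; j++) {
                for (int i = j - 1; i >= 0; i--) { // nearest antecedent first
                    if (clusterId[i] == clusterId[j]) continue;
                    if (pass.links(members(clusterId[i]), mentions.get(j))) {
                        merge(clusterId[j], clusterId[i]);
                        break;
                    }
                }
            }
        }
    }

    boolean isCoreferent(int a, int b) {
        return clusterId[a] == clusterId[b];
    }

    // Toy stand-in for a syntactic pass: links one hand-listed pair.
    static Pass predicateNominative(String subject, String complement) {
        return (cluster, mention) -> mention.equals(complement) && cluster.contains(subject);
    }

    // Toy head match: the mention's last token occurs in some cluster member.
    static Pass headMatch() {
        return (cluster, mention) -> {
            String[] tokens = mention.toLowerCase(Locale.ROOT).split("\\s+");
            String head = tokens[tokens.length - 1];
            for (String member : cluster) {
                String[] memberTokens = member.toLowerCase(Locale.ROOT).split("\\s+");
                if (Arrays.asList(memberTokens).contains(head)) return true;
            }
            return false;
        };
    }

    public static void main(String[] args) {
        MultiPassSieve sieve = new MultiPassSieve(
            Arrays.asList("President Bush", "his colleague", "the person", "the president"));
        sieve.run(Arrays.asList(
            predicateNominative("the person", "the president"),
            headMatch()));
        System.out.println(sieve.isCoreferent(0, 2));
        System.out.println(sieve.isCoreferent(0, 1));
    }
}
```

In this toy run no pass ever compares ‘the person’ with ‘President Bush’ directly; the two end up together only because the first pass had already attached ‘the person’ to ‘the president’. The same mechanism is what lets an early wrong link carry over into later passes.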
Result
• The multi-pass system is more sensitive to error propagation -> needs high-precision passes.
• Higher precision, but lower recall and F1.
• Needs more passes to increase recall -> future work (& a co-reference decision re-ranker)
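Why higher precision can still mean lower F1: F1 is the harmonic mean of precision and recall, so a large enough drop in recall outweighs a gain in precision. The helper below is a small sketch of that arithmetic. The class name `F1Score` and the numbers in `main` are invented for illustration; they are not the system's reported scores, and the slides do not say which coreference metric was used.

```java
final class F1Score {
    // Harmonic mean of precision and recall; defined as 0 when both are 0.
    static double f1(double precision, double recall) {
        double sum = precision + recall;
        return sum == 0.0 ? 0.0 : 2.0 * precision * recall / sum;
    }

    public static void main(String[] args) {
        // Invented values: precision rises, recall falls, and F1 falls with it.
        System.out.println(f1(0.80, 0.70));
        System.out.println(f1(0.90, 0.50));
    }
}
```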
Questions? Thank You!