230 likes | 305 Views
ACL - 2006. A Fast, Accurate Deterministic Parser for Chinese. MengqiuWang Kenji Sagae Teruko Mitamura Language Technologies Institute School of Computer Science Carnegie Mellon University {mengqiu,sagae,teruko}@cs.cmu.edu. Advisor: Hsin-Hsi Chen Speaker: Yong-Sheng Lo Date: 2007/07/26.
E N D
ACL - 2006 A Fast, Accurate Deterministic Parser for Chinese MengqiuWang Kenji Sagae Teruko Mitamura Language Technologies Institute School of Computer Science Carnegie Mellon University {mengqiu,sagae,teruko}@cs.cmu.edu Advisor: Hsin-Hsi Chen Speaker: Yong-Sheng Lo Date: 2007/07/26
Agenda Word Segmentation • Introduction • Deterministic parsing model • The parsing task the classification task • The shift/reduce decision • Four classifiers • Feature selection • POS tagging • Using gold-standard POS tags • A simple POS tagger using an SVM classifier • Experiments • Conclusion POS tagging Parsing
Introduction • Traditional statistical approaches • To build models which assign probabilities to every possible parse tree for a sentence • Techniques • Such as dynamic programming, beam-search, and best-first-search are then employed to find the parse tree with the highest probability • Disadvantage • Too slow for many practical applications
Introduction • Deterministic parsing model • The parsing task the classification task • The shift/reduce decision • Four classifiers • Feature selection • POS tagging • Using gold-standard POS tags • A simple POS tagger using an SVM classifier • Experiments • Conclusion
Deterministic Parsing Model • Deterministic parsing model : • Input is a sentence • has already been segmented and tagged with part-of-speech (POS) information • Data structure • Queue : To store the input word-POS tag pairs (ex.上海-NR) • Stack : To hold the partial trees that are built during parsing • At each parse state • The classifier makes shift/reduce decision based on contextual features • Output is a full parsing tree
Introduction • Deterministic parsing model • The parsing task the classification task • The shift/reduce decision • Four classifiers • Feature selection • POS tagging • Using gold-standard POS tags • A simple POS tagger using an SVM classifier • Experiments • Conclusion
The shift/reduce decision • Four parsing actions : (Sagae and Lavie, 2005) • Shift • To remove the first item on the queue and put it onto the stack • Reduce-Unary-X • To remove one item from the stack • X is the label of a new tree node that will be dominating the removed item • Reduce-Binary-X-Left • To remove two item from the stack • To take the head-node of the left sub-tree to be the head of the new tree • Reduce-Binary-X-Right • To remove two item from the stack • To take the head-node of the right sub-tree to be the head of the new tree
For example 1/5 • Input : 布朗(NR) 訪問(VV) 上海(NR) • The parse state : Initialization • Action : • Shift
For example 2/5 • The parse state : 3 • Action : • Shift • The parse state : 2 • Action : • Reduce-Unary-NP
For example 3/5 • The parse state : 5 • Action : • Reduce-Unary-NP • The parse state : 4 • Action : • Shift
For example 4/5 • The parse state : 7 • Action : • Reduce-Binary-IP-Right • The parse state : 6 • Action : • Reduce-Binary-VP-Left
For example 5/5 • The parse state : final
Introduction • Deterministic parsing model • The parsing task the classification task • The shift/reduce decision • Four classifiers • Feature selection • POS tagging • Using gold-standard POS tags • A simple POS tagger using an SVM classifier • Experiments • Conclusion
Four classifiers • Support Vector Machine • The TinySVM toolkit (Kudo andMatsumoto,2000) • Maximum-Entropy Classifier • The Le’s Maxent toolkit (Zhang, 2004) • Decision Tree Classifier • The C4.5 decision tree classifier • Memory-Based Learning • The TiMBL toolkit (Daelemans et al., 2004)
Introduction • Deterministic parsing model • The parsing task the classification task • The shift/reduce decision • Four classifiers • Feature selection • POS tagging • Using gold-standard POS tags • A simple POS tagger using an SVM classifier • Experiments • Conclusion
Introduction • Deterministic parsing model • The parsing task the classification task • The shift/reduce decision • Four classifiers • Feature selection • POS tagging • Using gold-standard POS tags • A simple POS tagger using an SVM classifier • Experiments • Conclusion
Word Segmentation POS tagging POS tagging • Using gold-standard POS tags • A simple POS tagger using an SVM classifier • Using gold-standard POS tags to train SVM • Using a simple POS tagger • Two passes • Pass 1 : To extract features from the two words and POS tags that came before the current word, the two words following the current word, and the current word itself • Then the tag is assigned to the word according to SVM classifier’s output • Pass 2 : Additional features such as the POS tags of the two words following the current word, and the POS tag of the current word (assigned in the first pass) are used • This tagger had a measured precision of 92.5% for sentences <= 40 words. Parsing
Experiments • Corpus • Penn Chinese Treebank • Training : Sections 001-270 (3484 sentences, 84,873 words) • Development : 271-300 (348 sentences, 7980 words) • Testing : 271-300 (348 sentences, 7980 words) • 99629 words • Evaluation • Labeled recall (LR) • Labeled precision (LP) • F1 score (harmonic mean of LR and LP)
Experiments • Results of different classifiers • On development set for sentence <= 40 words
Experiments • Comparison with Related work • On the test set
Stacked classifier model Using Maxent, DTree and TiNBL outputs as features, in addition to the original feature set, to train a new SVM model on the original training set Experiments • Using gold-standard POS
Conclusion • To present a novel classifier-based deterministic parser for Chinese constituency parsing • The best model runs in linear time and has labeled precision and recall above 88% using gold-standard part-of-speech tags • The SVM parser is 2-13 times faster than state-of-the-art parsers, while producing more accurate results • The Maxent and DTree parsers run at speeds 40-270 times faster than state-of-the-art parsers, but with 5-6% losses in accuracy