An SVMs Based Multi-lingual Dependency Parsing
Yuchang CHENG, Masayuki ASAHARA and Yuji MATSUMOTO
Nara Institute of Science and Technology
Approaches to Dependency Parsing
• Bottom-up deterministic (local discrimination)
  • Iterative, projective [Kudo & Matsumoto 02] [Yamada & Matsumoto 03] [Cheng, Asahara, Matsumoto 04]
  • Shift-reduce, projective [Nivre, Scholz 04]
  • Shift-reduce, pseudo-projective [Nivre, Nilsson 05]
• N-best + large-margin discrimination (global discrimination)
  • Projective [McDonald, Crammer, Pereira 05]
  • Non-projective [McDonald, Pereira, Ribarov, Hajic 05]
Comparison between Iterative and Shift-reduce methods
• Nivre algorithm (shift-reduce)
  • depth first
  • O(n)
• Iterative
  • breadth first
  • O(n²) worst case, empirically near linear
+ efficient
- limited look-ahead
Training and parsing are done in the same process ⇒ number of training instances = number of parsing steps
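The claim that the number of training instances equals the number of parsing steps can be illustrated with a small sketch. It assumes an arc-eager transition set (SHIFT, LEFT-ARC, RIGHT-ARC, REDUCE) as the shift-reduce variant, a simple static oracle, a projective gold tree, and a toy feature pair (stack-top word, next input word). The function name `oracle_transitions`, the features, and the gold heads chosen for the example sentence are illustrative assumptions, not taken from the slides.

```python
def oracle_transitions(words, gold):
    """Replay a gold tree as arc-eager transitions (sketch, projective trees only).

    words[0] is an artificial ROOT token; gold[i] is the head index of token i
    (gold[0] is -1). Returns the predicted heads and one training instance
    (features, action) per parsing step.
    """
    stack = [0]                              # starts with ROOT
    buffer = list(range(1, len(words)))      # remaining input tokens
    heads = [-1] * len(words)
    instances = []
    while buffer:
        s, b = stack[-1], buffer[0]
        # Static oracle: choose the action that is consistent with the gold tree.
        if s != 0 and gold[s] == b:
            action = "LEFT-ARC"
        elif gold[b] == s:
            action = "RIGHT-ARC"
        elif any(gold[k] == b or gold[b] == k for k in stack[:-1]):
            action = "REDUCE"
        else:
            action = "SHIFT"
        # One training instance per step (toy features: a hypothetical choice).
        instances.append(((words[s], words[b]), action))
        if action == "LEFT-ARC":
            heads[s] = b                     # next input word becomes head of stack top
            stack.pop()
        elif action == "RIGHT-ARC":
            heads[b] = s                     # stack top becomes head of next input word
            stack.append(buffer.pop(0))
        elif action == "REDUCE":
            stack.pop()
        else:                                # SHIFT
            stack.append(buffer.pop(0))
    return heads, instances


words = ["ROOT", "I", "saw", "a", "girl", "with", "a", "telescope"]
gold = [-1, 2, 0, 4, 2, 2, 7, 5]             # assumes "with" attaches to "saw"
heads, instances = oracle_transitions(words, gold)
for feats, action in instances:
    print(feats, action)
```

Each loop iteration emits exactly one instance and performs exactly one transition, and every token is pushed at most once and popped at most once, so the number of steps stays within 2n for n words.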
Limited right-side contextual info.
• A configuration in Nivre method
• A configuration in Y&M method
[Figure: the consulted context in each configuration, illustrated on the sentence "I saw a girl with a telescope."]
Preliminary comparison
• English dependency parsing (Penn Treebank sections 02-06: training, section 23: test)
  • right context = 2
  • right context = 4
Chinese case: almost no difference / a slightly better result with the Nivre method
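To make the "right context" setting concrete, here is a minimal sketch of a feature extractor whose look-ahead window is a parameter. It assumes that right context counts the words to the right of the next input word; the slides do not define this precisely. The function name `context_features`, the feature keys, and the `<PAD>` symbol are hypothetical.

```python
def context_features(words, stack_top, next_input, right_context):
    """Collect word features for one configuration (illustrative sketch).

    stack_top and next_input are token indices; right_context is the number
    of look-ahead words taken to the right of next_input.
    """
    feats = {"s0": words[stack_top], "n0": words[next_input]}
    for offset in range(1, right_context + 1):
        pos = next_input + offset
        # Pad when the window runs past the end of the sentence.
        feats[f"n{offset}"] = words[pos] if pos < len(words) else "<PAD>"
    return feats


sentence = "I saw a girl with a telescope .".split()
narrow = context_features(sentence, stack_top=1, next_input=2, right_context=2)
wide = context_features(sentence, stack_top=1, next_input=2, right_context=4)
print(narrow)
print(wide)
```

A wider window gives the classifier more right-side evidence at each step, at the cost of a larger feature space.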
Common Disadvantage
• Local discrimination
• Single model throughout the whole sentence
  • local (near leaves) and long-distance (near top) parsing should use different models
• Distinct model at the lowest level
  • dependency between adjacent words
  • implemented as a pre-processing step
Shallow pre-processing + Nivre method
• Preprocessing of adjacent words
• Then, apply Nivre method
• Labels are decided by MaxEnt classifiers
[Figure: consulted context, illustrated on the sentence "I saw a girl with a telescope."]
Speed-up of Kernel SVM
Fast methods for kernel-based text analysis [Kudo & Matsumoto 04]
• Training with 3rd degree polynomial kernel
• Mining of feature combinations in positive/negative support vectors
• Linearization with obtained feature combinations (20-200 times speed up)
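The linearization idea can be sketched as follows. For binary features, a 3rd degree polynomial kernel (1 + x·z)^3 depends only on m, the number of shared features, and (1 + m)^3 = 1 + 7·C(m,1) + 12·C(m,2) + 6·C(m,3). The kernel score can therefore be rewritten as a weighted sum over feature combinations of size up to 3. The sketch below enumerates every combination exhaustively rather than mining a selected subset as the cited method does, and it omits the bias term. The function names, the toy support vectors, and their weights are illustrative assumptions.

```python
from itertools import combinations
from math import isclose

# (1 + m)^3 = sum over r of COEF[r] * C(m, r), where m = number of shared binary features.
COEF = {0: 1, 1: 7, 2: 12, 3: 6}


def kernel_score(support_vectors, x):
    """Kernel form: one kernel evaluation per support vector."""
    return sum(a * (1 + len(sv & x)) ** 3 for sv, a in support_vectors)


def linearize(support_vectors):
    """Expand the support vectors into weights over feature combinations (size <= 3)."""
    weights = {}
    for sv, a in support_vectors:
        feats = sorted(sv)                   # sorted so each combination has one canonical key
        for r in range(4):
            for combo in combinations(feats, r):
                weights[combo] = weights.get(combo, 0.0) + a * COEF[r]
    return weights


def linear_score(weights, x):
    """Linear form: look up the weight of each combination present in x."""
    feats = sorted(x)
    return sum(weights.get(combo, 0.0)
               for r in range(4)
               for combo in combinations(feats, r))


# Toy support vectors: (feature set, alpha_i * y_i). Values are made up for illustration.
support_vectors = [
    (frozenset({"s0=saw", "n0=girl", "n1=with"}), 0.5),
    (frozenset({"s0=saw", "n0=a"}), -0.25),
    (frozenset({"s0=a", "n0=girl", "n1=with"}), 0.75),
]
x = frozenset({"s0=saw", "n0=girl", "n1=with", "n2=a"})

weights = linearize(support_vectors)
print(kernel_score(support_vectors, x), linear_score(weights, x))
```

After linearization, the cost of classifying an example depends on the number of feature combinations in that example rather than on the number of support vectors, which is where a speed-up of this kind would come from.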