Dependency Parsing with Reference to Slovene, Spanish and Swedish

Simon Corston-Oliver and Anthony Aue, Microsoft Research

Noteworthy results
• Slovene
  • Labeled DA = 72.42% (second)
  • Not significantly different from #1 (73.44%)
• Swedish
  • #1 for unlabeled DA (89.54%)
  • Much worse than #3 for labeled DA (79.69% vs 82.31%)
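The labeled and unlabeled scores above can be illustrated with a minimal sketch. It assumes that "DA" denotes per-token dependency accuracy (the slides do not define it): unlabeled DA counts tokens whose predicted head is correct, and labeled DA additionally requires the correct dependency label. The function name, the toy sentence, and the label strings are hypothetical.

```python
def dependency_accuracy(gold, predicted):
    """Return (unlabeled, labeled) accuracy.

    gold and predicted are equal-length lists of (head_index, label) pairs,
    one pair per token. This is an assumed reading of "DA", not the
    official scoring script.
    """
    if len(gold) != len(predicted) or not gold:
        raise ValueError("need two non-empty analyses of equal length")
    n = len(gold)
    # Unlabeled: only the head has to match.
    unlabeled = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    # Labeled: both head and label have to match.
    labeled = sum(g == p for g, p in zip(gold, predicted))
    return unlabeled / n, labeled / n


# Toy four-token sentence; head 0 marks the root (hypothetical labels).
gold = [(2, "SUBJ"), (0, "ROOT"), (2, "OBJ"), (3, "MOD")]
pred = [(2, "SUBJ"), (0, "ROOT"), (2, "MOD"), (2, "MOD")]
uda, lda = dependency_accuracy(gold, pred)
print(uda, lda)
```

A head can be right while the label is wrong (third token), which is how a system can rank first on unlabeled DA and lower on labeled DA.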

Outline
Two-stage pipeline
• Identify unlabeled directed dependencies
• Label the dependencies
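The two stages can be sketched as a simple composition. This is only an outline under stated assumptions: `predict_heads` and `predict_label` are hypothetical stand-ins for the trained parser and labeler, and the root is encoded as head index -1.

```python
def parse_and_label(tokens, predict_heads, predict_label):
    """Stage 1 picks a head for every token; stage 2 labels each arc."""
    heads = predict_heads(tokens)  # one head index per token, -1 for the root
    return [(head, predict_label(tokens, head, dep))
            for dep, head in enumerate(heads)]


# Toy stand-ins (hypothetical): attach everything to the first token.
def toy_heads(tokens):
    return [-1] + [0] * (len(tokens) - 1)


def toy_label(tokens, head, dep):
    return "ROOT" if head == -1 else "DEP"


result = parse_and_label(["a", "b", "c"], toy_heads, toy_label)
print(result)
```

Because the labeler only sees the arcs chosen in stage 1, any attachment error made there is carried into the labeled output.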

Parser
• Unlabeled directed dependencies
• Discriminatively trained linear classifier
• Projective dependencies only
• Parse features
  • Case-normalized surface form and lemma
  • POS of each token
  • POS of intervening and neighboring tokens
  • Combinations of these
  • Direction and distance of attachment
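The parse features listed above could be extracted for a candidate head-dependent pair roughly as follows. This is a sketch, not the authors' feature set: the feature names, the `<pad>` symbol, the token dictionary layout, and the POS tags in the example are all assumptions made for illustration.

```python
def pos_at(tokens, i):
    """POS tag at position i, or a padding symbol outside the sentence."""
    return tokens[i]["pos"] if 0 <= i < len(tokens) else "<pad>"


def arc_features(tokens, head, dep):
    """String features for a candidate arc from tokens[head] to tokens[dep]."""
    h, d = tokens[head], tokens[dep]
    direction = "R" if dep > head else "L"  # dependent right or left of head
    distance = abs(head - dep)
    lo, hi = sorted((head, dep))
    feats = [
        # Case-normalized surface form and lemma
        f"h_form={h['form'].lower()}", f"h_lemma={h['lemma'].lower()}",
        f"d_form={d['form'].lower()}", f"d_lemma={d['lemma'].lower()}",
        # POS of each token, and one combination of the two
        f"h_pos={h['pos']}", f"d_pos={d['pos']}",
        f"h_pos|d_pos={h['pos']}|{d['pos']}",
        # POS of neighboring tokens
        f"h_left_pos={pos_at(tokens, head - 1)}",
        f"h_right_pos={pos_at(tokens, head + 1)}",
        f"d_left_pos={pos_at(tokens, dep - 1)}",
        f"d_right_pos={pos_at(tokens, dep + 1)}",
        # Direction and distance of attachment
        f"dir={direction}", f"dist={distance}",
    ]
    # POS of intervening tokens, combined with the endpoint tags
    for t in tokens[lo + 1:hi]:
        feats.append(f"between={h['pos']}|{t['pos']}|{d['pos']}")
    return feats


# Toy Spanish sentence with made-up coarse tags.
tokens = [
    {"form": "El", "lemma": "el", "pos": "d"},
    {"form": "perro", "lemma": "perro", "pos": "n"},
    {"form": "come", "lemma": "comer", "pos": "v"},
]
feats = arc_features(tokens, head=2, dep=1)
print(feats)
```

A linear classifier would then score each candidate arc as a weighted sum over these binary features.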

POS features
• Use fine POS tags for all languages except Dutch and Turkish
• Swedish: Normalize tags for auxiliaries
  • Orig: “vara” (be) = AV; “måst” (must) = MV
  • Replace with “aux”
  • Unlabeled DA: 89.23% → 89.45%
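The Swedish tag normalization amounts to a small lookup. The sketch below covers only the two tags named on the slide (AV and MV); whether further auxiliary tags were collapsed is not stated, so the tag set is an assumption, and "NN" is used purely as an illustrative non-auxiliary tag.

```python
# Only the two tags mentioned on the slide; the full set is an assumption.
AUX_TAGS = {"AV", "MV"}


def normalize_pos(tag):
    """Collapse word-specific auxiliary tags into a single 'aux' tag."""
    return "aux" if tag in AUX_TAGS else tag


normalized = [normalize_pos(t) for t in ["NN", "AV", "MV"]]
print(normalized)
```

Collapsing the tags lets the parser share statistics across auxiliaries instead of learning separate weights for each word-specific tag.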

Root identification features
• Many errors identifying root in periphrastic constructions with aux and participle
  • E.g. German aux/modal in second position in declarative main clause; initial with subj-aux inversion
• New features:
  • POS sequence to left of each token
  • “Leftmost finite verb and not preceded by subordinating conj or relative pron”
  • “Sentence does (not) contain finite verb”
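The three new root features can be sketched as below. Several details are assumptions: the tag names (`VFIN`, `SCONJ`, `RELPRO`, and so on) are invented for the example, and "not preceded by" is read here as "no such tag anywhere to the left", which the slide does not specify.

```python
def root_features(pos_tags, i, finite_tags, blocker_tags):
    """Root-identification features for the token at position i.

    finite_tags: tags treated as finite verbs (assumed inventory).
    blocker_tags: tags for subordinating conjunctions and relative pronouns.
    """
    left = pos_tags[:i]
    is_finite = pos_tags[i] in finite_tags
    no_finite_to_left = not any(t in finite_tags for t in left)
    # Assumed reading: a blocker anywhere to the left disqualifies the token.
    no_blocker_to_left = not any(t in blocker_tags for t in left)
    return {
        "left_pos_seq": "_".join(left) or "<start>",
        "leftmost_finite_unblocked":
            is_finite and no_finite_to_left and no_blocker_to_left,
        "sentence_has_finite_verb": any(t in finite_tags for t in pos_tags),
    }


# German-like pattern "Er hat das Buch gelesen" with hypothetical tags.
tags = ["PRON", "VFIN", "DET", "NOUN", "VPART"]
aux_feats = root_features(tags, 1, {"VFIN"}, {"SCONJ", "RELPRO"})
part_feats = root_features(tags, 4, {"VFIN"}, {"SCONJ", "RELPRO"})
print(aux_feats)
print(part_feats)
```

In the example the auxiliary in second position fires the "leftmost finite verb" feature while the participle does not, which is the distinction the periphrastic cases need.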

Root identification features
• Danish improved
  • RA 94.12% → 94.72%
• Spanish improved
  • RA 80.08% → 83.57%

Labeling dependencies
• Use a maximum entropy classifier (Berger et al., 1996)
  • Fast to train
  • Good probability estimates
• Intended to jointly model sets of labels
  • Actually labeled independently
• Better results with SVM?
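Independent labeling with a maximum entropy model can be illustrated as follows: each arc's label distribution is a softmax over summed feature weights, and the most probable label is chosen for each arc on its own. The weights, feature names, and labels are made up for the example; a real model would learn the weights from a treebank.

```python
import math


def maxent_probs(features, weights, labels):
    """P(label | features) under a maximum entropy (softmax) model.

    weights maps (feature, label) pairs to real values; missing pairs
    count as 0.0.
    """
    scores = {y: sum(weights.get((f, y), 0.0) for f in features)
              for y in labels}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}


# Hypothetical weights for two labels.
weights = {
    ("d_pos=n", "SUBJ"): 1.5,
    ("dir=L", "SUBJ"): 0.5,
    ("d_pos=n", "OBJ"): 1.0,
}
probs = maxent_probs(["d_pos=n", "dir=L"], weights, ["SUBJ", "OBJ"])
best = max(probs, key=probs.get)  # each arc is labeled independently
print(best, probs)
```

Since every arc is decided in isolation, nothing prevents a head from receiving, say, two subjects; modeling sets of labels jointly would address that.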

Swedish using SVMs

Japanese using SVMs

Conclusion
• Two-stage pipeline
• Feature engineering important
  • For predicting dependencies
  • For labeling dependencies
• Replacing maxent classifier with SVM gave boost
