Machine Translation via Dependency Transfer
Philip Resnik
University of Maryland
DoD MURI award in collaboration with JHU: Bootstrapping Out of the Multilingual Resource Bottleneck
Start date: May 2001

Current statistical MT
• IBM models 1-5
    • Fail to model syntactic dependencies
    • Don’t take advantage of morphological features
• Bilingual grammar approaches
    • Have not evolved into a stochastic setting (SyTAG)
    • Model constituency rather than dependency (SITG)
• Dependency transduction models
    • Are linguistically underconstrained
    • Don’t take advantage of asymmetrical resources

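Modeling richer linguistic features: syntactic dependency
[Figure: parallel dependency analyses of an English sentence and its Basque translation, with arcs labeled subj and obj]
English:  I got a wedding gift for my brother
POS tags: prp vbd dt nn nn in prp$ nn
Basque:   nik nire anaiari ezkontza opari bat erosi nion
Gloss:    I-erg MY BROTHER-dat WEDDING GIFT a BUY-past
POS tags: prp prp$ nn nn nn vbd
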
New Statistical MT Models
[Figure: diagram with stages labeled analysis, projection, training, and translation and analysis of new data]
• English side: [National laws] applying in [Hong Kong], tagged JJ NNS VBG IN NNP NNP, with dependency labels subj, mod, mod, pobj and the entity label PLACE
• Other-language side, shown by its English gloss words: In, Hong Kong, implementing, of, national law(s); tagged IN NNP NNP VBG JJ NNS, with labels mod, subj, mod and PLACE
• New-data example: The urgent response to ... (JJ JJ NN NN)

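The projection step in the figure can be illustrated with a minimal sketch. It assumes a one-to-one word alignment and simply copies each English dependency across it; the token order of the target gloss, the alignment, and the attachment of each label are assumptions made for illustration, not details recoverable from the slide.

```python
from typing import Dict, List, Set, Tuple

# A dependency is (label, head index, dependent index).
Triple = Tuple[str, int, int]


def project_dependencies(src_deps: Set[Triple],
                         alignment: Dict[int, int]) -> Set[Triple]:
    """Copy each source dependency to the target side when both of its
    words are aligned (one-to-one alignment assumed)."""
    projected: Set[Triple] = set()
    for label, head, dep in src_deps:
        if head in alignment and dep in alignment:
            t_head, t_dep = alignment[head], alignment[dep]
            if t_head != t_dep:  # skip arcs that collapse onto one word
                projected.add((label, t_head, t_dep))
    return projected


# English tokens from the slide (multiword names treated as one token).
english: List[str] = ["National", "laws", "applying", "in", "Hong_Kong"]
# Gloss tokens of the other language; this ordering is an assumption.
target: List[str] = ["In", "Hong_Kong", "implementing", "of",
                     "national", "law(s)"]

# Assumed attachments for the four labels shown on the slide.
src_deps: Set[Triple] = {
    ("mod", 1, 0),   # National -> laws
    ("subj", 2, 1),  # laws -> applying
    ("mod", 2, 3),   # in -> applying
    ("pobj", 3, 4),  # Hong_Kong -> in
}

# Hypothetical word alignment: English index -> target index.
alignment: Dict[int, int] = {0: 4, 1: 5, 2: 2, 3: 0, 4: 1}

projected = project_dependencies(src_deps, alignment)
for label, head, dep in sorted(projected):
    print(f"{label}({target[head]}, {target[dep]})")
```

The unaligned target word (index 3, "of") receives no head under this scheme, which is one source of the noise that the projected training data would contain.
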
Baseline Dependency Transfer Architecture
[Figure: architecture diagram with the following labeled components]
• Parallel text, aligned with GIZA++
• English-specific processing: Minipar (Lin), Collins parser
• Example analysis: [National laws] applying in [Hong Kong] (JJ NNS VBG IN NNP NNP; labels subj, mod, mod, pobj), paired with the gloss words In, Hong Kong, implementing, of, national law(s) (IN NNP VBG JJ NNS; labels mod, pobj, subj, mod)
• Training of the Parser
• Language-specific processing: Source text, Parser, Lexical selection, Linearization, Target text

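As a rough illustration of the lexical selection and linearization stages named in the diagram, the toy sketch below translates a small source dependency tree word by word and then orders each head with its dependents. The `Node` class, the lexicon, the simplified tree (built from the Basque example earlier in the deck), and the ordering rule are all hypothetical; the slides do not specify how either stage is implemented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Node:
    """A dependency tree node: a word plus labeled child subtrees."""
    word: str
    children: List[Tuple[str, "Node"]] = field(default_factory=list)


def select_lexical(node: Node, lexicon: Dict[str, str]) -> Node:
    """Replace every source word with its lexicon entry (toy lookup);
    unknown words are passed through unchanged."""
    return Node(
        lexicon.get(node.word, node.word),
        [(label, select_lexical(child, lexicon))
         for label, child in node.children],
    )


def linearize(node: Node, before_head: Set[str]) -> List[str]:
    """Order the tree: dependents whose label is in before_head precede
    the head, all other dependents follow it."""
    left: List[str] = []
    right: List[str] = []
    for label, child in node.children:
        side = left if label in before_head else right
        side.extend(linearize(child, before_head))
    return left + [node.word] + right


# Simplified, assumed source tree: verb with a subject and an object.
source_tree = Node("erosi_nion", [("subj", Node("nik")),
                                  ("obj", Node("opari"))])
# Hypothetical one-word-per-entry lexicon.
lexicon = {"nik": "I", "opari": "gift", "erosi_nion": "got"}

target_tree = select_lexical(source_tree, lexicon)
# Assumed English ordering rule: subjects before the head, the rest after.
target_words = linearize(target_tree, before_head={"subj"})
print(" ".join(target_words))
```

Keeping word choice and word order as separate functions mirrors the separation of the two stages in the diagram; a real system would presumably replace both the dictionary lookup and the fixed ordering rule with statistical models.
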
Parser Gold Standard Development
• Test set: 188 English-Chinese sentence pairs from the Penn Chinese Treebank
• Two bilingual annotators
• Independent extraction of dependency triples
• Precision/recall
    • Inter-annotator
    • Projected Chinese dependency trees
    • Trained parser dependency tree output

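Precision and recall over dependency triples, as listed above, could be computed along the following lines. This is a minimal sketch that assumes exact matching of (head, label, dependent) triples for one sentence; the triples are made-up illustrative data, not figures from the evaluation.

```python
from typing import Set, Tuple

# A dependency triple: (head word, label, dependent word).
Triple = Tuple[str, str, str]


def precision_recall(predicted: Set[Triple],
                     gold: Set[Triple]) -> Tuple[float, float]:
    """Precision and recall of predicted triples against gold triples,
    counting a triple as correct only on an exact match."""
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Made-up gold-standard triples for one sentence.
gold: Set[Triple] = {
    ("implementing", "subj", "law(s)"),
    ("law(s)", "mod", "national"),
    ("implementing", "mod", "In"),
    ("In", "pobj", "Hong_Kong"),
}
# Made-up system output: two correct triples, one wrong attachment.
predicted: Set[Triple] = {
    ("implementing", "subj", "law(s)"),
    ("law(s)", "mod", "national"),
    ("implementing", "mod", "Hong_Kong"),
}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

The same function applies to each comparison on the slide: one annotator against the other, projected trees against the gold standard, or trained-parser output against the gold standard.
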
Research Targets
• Rapid, automatic creation of resources and tools
• Models for effective use of noisy training data
• Non-direct transfer: improving alignment models using lexical decomposition
• Monolingual parsing performance versus effective transfer to English
• Evaluation of dependency transfer on MT performance