Training a Parser for Machine Translation Reordering Jason Katz-Brown, Slav Petrov, Ryan McDonald, Franz Och, David Talbot, Hiroshi Ichikawa, Masakazu Seno, Hideto Kazawa
Dependency Parsing • Given a sentence, label the dependencies between its words • (example parse figure from nltk.org) • Output is useful for downstream tasks like machine translation • Also of interest to NLP researchers
Overview of Paper • Motivation • Targeted Self Training Algorithm • MT experiments • Domain adaptation
Motivation - Evaluation • Intrinsic • How well does system replicate gold annotations? • Precision/recall/F1, accuracy, BLEU, ROUGE, etc. • Extrinsic • How useful is system for some downstream task? • High performance on one doesn’t necessarily mean high performance on the other • Can be hard to evaluate extrinsically
Motivation • Parsing is not a stand-alone task • Useful as part of a larger system • High-fidelity replication of gold parses won’t necessarily yield the best downstream performance • Try to train a model that will yield better downstream performance than a model trained to replicate gold standard • Maximize extrinsic quality, rather than intrinsic
Targeted Self Training Algorithm • For each sentence S in a corpus • Parse S with a baseline parser and get its k-best parses • Choose the parse of S that optimizes some function F, add it to the training data • Retrain the parser • F measures the extrinsic quality of the parse • Finding a good F can be challenging! • Standard self training: just choose the 1-best parse
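To make the loop concrete, here is a minimal Python sketch of targeted self training. The parser interface, the extrinsic scorer F, and the retraining function are assumptions passed in as callables; this is not the authors' implementation.

```python
from typing import Any, Callable, Iterable, List, Tuple

def targeted_self_training(
    parse_kbest: Callable[[str, int], List[Any]],     # sentence -> k candidate parses
    extrinsic_score: Callable[[Any], float],          # F: parse -> downstream quality
    retrain: Callable[[List[Tuple[str, Any]]], Any],  # labeled data -> new parser
    gold_data: List[Tuple[str, Any]],
    unlabeled_sentences: Iterable[str],
    k: int = 8,
) -> Any:
    """Augment gold training data with the k-best parse that maximizes F, then retrain."""
    augmented = list(gold_data)
    for sentence in unlabeled_sentences:
        candidates = parse_kbest(sentence, k)         # k-best parses from the baseline parser
        best = max(candidates, key=extrinsic_score)   # let F pick the winner
        augmented.append((sentence, best))            # add the chosen parse to the training data
    return retrain(augmented)                         # retrain the parser

# Standard self training is the special case where F is the parser's own
# model score, so the 1-best parse is always chosen.
```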
Reordering • Reordering changes source-language word order into target-language word order • Here: English (SVO) to Japanese (SOV) • MT metrics that account for word order correlate better with human judgment than those that reward only word choice • Can use manually or automatically derived tree transforms to reorder • Reordering is useful as a preprocessing step before translation
Reordering • Reordering is its own step, evaluated separately from parsing • Function to evaluate reordering quality, given a gold reordering: 1 – ((# chunks – 1) / (# words – 1)) • Chunks are maximal spans that are contiguous (and in the same order) in both the predicted and the gold ordering • Prediction: A B E C D; Gold: A B C D E → 3 chunks (A B | E | C D) • Score: 1 – ((3 – 1) / (5 – 1)) = 0.5
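A small sketch of this chunk-based score, assuming the predicted and gold sequences are permutations of the same (unique) tokens; not the authors' exact implementation.

```python
def reordering_score(predicted, gold):
    """1 - (#chunks - 1) / (#words - 1), where a chunk is a maximal run of tokens
    that is contiguous, and in the same order, in both sequences."""
    gold_pos = {tok: i for i, tok in enumerate(gold)}
    chunks = 1
    for prev, cur in zip(predicted, predicted[1:]):
        if gold_pos[cur] != gold_pos[prev] + 1:   # break in contiguity starts a new chunk
            chunks += 1
    return 1 - (chunks - 1) / (len(predicted) - 1)

print(reordering_score(list("ABECD"), list("ABCDE")))   # 3 chunks -> 0.5
```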
Parsing and Reordering • Different parses yield different reorderings • MT systems tend to be sensitive to parse errors
MT Experiment Setup • Train a baseline Nivre dependency parser (and a Berkeley parser) on the WSJ • English/Japanese corpus with literal translations and manual word alignments • 6,268 training / 7,327 test sentences • Annotators need very little training to produce word alignments • Makes this data relatively cheap to collect • Annotating dependency parses, by contrast, requires a lot of annotator training
MT Experiment Setup • Use hand-crafted rules for reordering • Phrase-based MT system • Train parser in 3 ways: • Baseline • Standard self-training • Targeted self-training • Look at: • Labeled attachment score (LAS; intrinsic) • Reordering score • MT quality (BLEU and human)
Results • Evaluated MT quality with BLEU and human raters • Varied the training of the dependency parser that feeds into the reordering component • Experiments in Korean, Japanese and Turkish (all SOV languages) • In all cases BLEU and human opinion improve with targeted self training (10x) compared to the baseline parser • Humans still put the translation quality in the “some meaning/grammar” range (~2.5/6) • Improvement is not drastic
Domain Adaptation Experiment • Use the Question Treebank (QTB) to make the MT system translate questions better than the baseline system • Have 2k questions parsed • Have 2k questions translated and annotated for reordering • Compare translation output from systems whose parsers were trained in different ways
Results • BLEU score and human opinion of Japanese translations of QTB test sentences were higher with targeted self training than with the baseline parser • Gold QTB parses yielded a better reordering score, but gold parses are more expensive to produce than word alignments • BLEU/human opinion on the resulting translations wasn't reported