MEMT: Multi-Engine Machine Translation
Faculty: Alon Lavie, Robert Frederking, Ralf Brown, Jaime Carbonell
Students: Shyamsundar Jayaraman, Satanjeev Banerjee
Goals and Approach
• Combine the output of multiple MT engines into a synthetic output that outperforms the originals in translation quality
• Synthetic combination of the originals, NOT selection of the best single system
• Experimented with two approaches:
  • Approach-1: merging of lattice outputs + joint decoding
    • Each MT system produces a lattice of translation fragments, indexed by source word positions
    • Lattices are merged into a single common lattice
    • A statistical MT decoder selects a translation “path” through the lattice
  • Approach-2: align best outputs from the engines + new decoder
    • Each MT system produces a sentence translation output
    • Establish an explicit word matching between all words of the various MT engine outputs
    • “Decoding”: create a collection of synthetic combinations of the original strings based on matched words, a target LM, and constraints, with re-combination and pruning
    • Score the resulting hypotheses and select a final output
Approach-2: Sentence MEMT
• Idea:
  • Start with the output sentences of the various MT engines
  • Explicitly align the words that are common between any pair of systems, and apply transitivity
  • Use the alignments as reinforcement and as indicators of possible locations for the words
  • Each engine has a “weight” that is applied to the words it contributes
• The decoder searches for an optimal synthetic combination of words and phrases, optimizing a scoring function that combines the alignment weights and a LM score
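To make the transitivity step concrete, here is a minimal Python sketch, not the project's code: pairwise matches are closed under transitivity with a union-find structure. The naive exact-match pairwise step and the name `align_outputs` are illustrative assumptions; the real system uses the METEOR sentence matcher described on the next slide.

```python
from itertools import combinations

def align_outputs(outputs):
    """Group word occurrences that are aligned across engine outputs.

    outputs: list of tokenized sentences, one per MT engine.
    Returns a dict mapping each (engine, position) to a cluster id.
    Pairwise alignment here is naive exact matching on lowercased
    tokens; transitivity comes from the union-find structure."""
    parent = {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    def union(a, b):
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    # Align each pair of outputs; union-find applies transitivity.
    for (i, sent_i), (j, sent_j) in combinations(enumerate(outputs), 2):
        used_j = set()
        for pi, w in enumerate(sent_i):
            for pj, v in enumerate(sent_j):
                if pj not in used_j and w.lower() == v.lower():
                    union((i, pi), (j, pj))
                    used_j.add(pj)
                    break

    return {node: find(node) for node in parent}
```

Each resulting cluster groups word occurrences that the decoder can treat as mutually aligned alternatives.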
The Sentence Matcher
• Developed by Satanjeev Banerjee as a component in our METEOR automatic MT evaluation metric
• Finds the maximal alignment match with minimal “crossing branches”
• Implementation: clever search algorithm for the best match, using pruning of sub-optimal sub-solutions
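A simplified sketch of such a search, assuming exact string matching only (the real matcher also uses METEOR's other matching stages): it maximizes the number of matched words and, among equal-sized matches, minimizes crossing links, pruning partial solutions that cannot beat the best complete match found so far.

```python
def match_sentences(sent_a, sent_b):
    """Return the largest alignment between two token lists, breaking
    ties by the smallest number of crossing links."""
    # Candidate positions in sent_b for each position in sent_a.
    cands = [[j for j, w in enumerate(sent_b) if w.lower() == a.lower()]
             for a in sent_a]
    best = {"matches": -1, "crossings": 0, "alignment": []}

    def crossings(alignment):
        # Links (i1, j1), (i2, j2) with i1 < i2 cross iff j1 > j2.
        return sum(1 for x in range(len(alignment))
                   for y in range(x + 1, len(alignment))
                   if alignment[x][1] > alignment[y][1])

    def search(i, used_b, alignment):
        remaining = sum(1 for c in cands[i:] if c)
        if len(alignment) + remaining < best["matches"]:
            return  # prune: cannot beat the best complete match
        if i == len(sent_a):
            c = crossings(alignment)
            if (len(alignment), -c) > (best["matches"], -best["crossings"]):
                best.update(matches=len(alignment), crossings=c,
                            alignment=list(alignment))
            return
        for j in cands[i]:
            if j not in used_b:
                search(i + 1, used_b | {j}, alignment + [(i, j)])
        search(i + 1, used_b, alignment)  # leave word i unaligned

    search(0, frozenset(), [])
    return best
```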
Matcher Example
IBM: the sri lankan prime minister criticizes head of the country's
ISI: The President of the Sri Lankan Prime Minister Criticized the President of the Country
CMU: Lankan Prime Minister criticizes her country
The MEMT Algorithm
• The algorithm builds collections of partial hypotheses of increasing length
• Partial hypotheses are extended by selecting the “next available” word from one of the original systems
• Sentences are assumed synchronous:
  • Each word is either aligned with another word or is an alternative of another word
  • Extending a partial hypothesis with a word “pulls” and “uses” its aligned words with it, and marks its alternatives as “used”; “vectors” keep track of this
• Partial hypotheses are scored and ranked
• Pruning and re-combination
• A hypothesis can end if any original system proposes an end of sentence as the next word
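A compact sketch of the extension step, under simplifying assumptions: `clusters` maps each (engine, position) to an alignment cluster id as in the earlier union-find sketch, `score_word` is a placeholder for the scoring described on the next slide, and alternatives-marking plus the lingering/lookahead constraints are omitted.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Hypothesis:
    words: tuple      # synthetic translation built so far
    used: frozenset   # (engine, position) pairs already consumed
    score: float

def extend(hyp, outputs, clusters, score_word):
    """Yield all one-word extensions of a partial hypothesis.

    For each engine, the 'next available' word is its leftmost position
    not yet marked used. Taking that word also consumes ('pulls') its
    aligned words in the other engines, i.e. every position sharing its
    alignment cluster."""
    for e, sent in enumerate(outputs):
        nxt = next((p for p in range(len(sent))
                    if (e, p) not in hyp.used), None)
        if nxt is None:
            continue  # this engine's words are exhausted
        cluster = clusters.get((e, nxt))
        pulled = {(e, nxt)} if cluster is None else \
                 {pos for pos, c in clusters.items() if c == cluster}
        yield Hypothesis(words=hyp.words + (sent[nxt],),
                         used=hyp.used | pulled,
                         score=hyp.score + score_word(sent[nxt], hyp.words))
```

Pruning, re-combination, and end-of-sentence handling would then sit in an outer loop that repeatedly applies `extend` to a beam of ranked hypotheses.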
The MEMT Algorithm
• Scoring:
  • Alignment score based on reinforcement from the alignments of the words
  • LM score based on a trigram LM
  • Sum the logs of the alignment score and the LM score (equivalent to a product of probabilities)
• Select the best-scoring hypothesis based on:
  • Total score (biases towards shorter hypotheses)
  • Average score per word
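As a sketch of the combination for a non-empty hypothesis (both probability interfaces are assumptions, not the system's actual models), the two selection criteria fall out naturally: the raw sum favours shorter hypotheses, while dividing by length removes that bias.

```python
import math

def hypothesis_score(words, align_prob, trigram_prob):
    """Sum of log alignment score and log trigram-LM score, i.e. the
    log of a product of probabilities.

    align_prob(w): assumed per-word alignment confidence in (0, 1],
    e.g. reflecting engine weights and cross-engine reinforcement.
    trigram_prob(w, u, v): assumed conditional probability P(w | u, v)."""
    log_align = sum(math.log(align_prob(w)) for w in words)
    padded = ("<s>", "<s>") + tuple(words)
    log_lm = sum(math.log(trigram_prob(padded[i + 2], padded[i], padded[i + 1]))
                 for i in range(len(words)))
    total = log_align + log_lm
    return total, total / len(words)  # total score vs. average per word
```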
The MEMT Algorithm
• Parameters:
  • “Lingering word” horizon: how long is a word allowed to linger when words following it have already been used?
  • “Lookahead” horizon: how far ahead can we look for an alternative for a word that is not aligned?
  • “POS matching”: limit the search for an alternative to words of the same POS
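These three knobs could be bundled as a simple configuration object; a sketch, with default values that are invented placeholders rather than the settings used in the experiments:

```python
from dataclasses import dataclass

@dataclass
class DecoderParams:
    """Illustrative container for the decoder's tunables."""
    lingering_horizon: int = 2   # max following words used before a
                                 # still-unused word is discarded
    lookahead_horizon: int = 3   # how far ahead to look for an
                                 # alternative to an unaligned word
    pos_matching: bool = True    # restrict alternatives to same POS
```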
Example
IBM: korea stands ready to allow visits to verify that it does not manufacture nuclear weapons (0.7407)
ISI: North Korea Is Prepared to Allow Washington to Verify that It Does Not Make Nuclear Weapons (0.8007)
CMU: North Korea prepared to allow Washington to the verification of that is to manufacture nuclear weapons (0.7668)
Selected MEMT sentence: north korea is prepared to allow washington to verify that it does not manufacture nuclear weapons . (0.8894, -2.75135)
Example
IBM: victims russians are one man and his wife and abusing their eight year old daughter plus a ( 11 and 7 years ) man and his wife and driver , egyptian nationality . (0.6327)
ISI: The victims were Russian man and his wife, daughter of the most from the age of eight years in addition to the young girls ) 11 7 years ( and a man and his wife and the bus driver Egyptian nationality. (0.7054)
CMU: the victims Cruz man who wife and daughter both critical of the eight years old addition to two Orient ( 11 ) 7 years ) woman , wife of bus drivers Egyptian nationality . (0.5293)
Selected MEMT sentence: the victims were russian man and his wife and daughter of the eight years from the age of a 11 and 7 years in addition to man and his wife and bus drivers egyptian nationality . (0.7647, -3.25376)
Oracle: the victims were russian man and wife and his daughter of the eight years old from the age of a 11 and 7 years in addition to the man and his wife and bus drivers egyptian nationality young girls . (0.7964, -3.44128)
Example
IBM: the sri lankan prime minister criticizes head of the country's (0.8862)
ISI: The President of the Sri Lankan Prime Minister Criticized the President of the Country (0.8660)
CMU: Lankan Prime Minister criticizes her country (0.6615)
Selected MEMT sentence: the sri lankan prime minister criticizes president of the country . (0.9353, -3.27483)
Oracle: the sri lankan prime minister criticizes president of the country's . (0.9767, -3.75805)
Current System
• Some features of the decoding algorithm and the final scoring are still under experimentation
• Initial development tests performed on TIDES 2003 Arabic-to-English MT data, using IBM, ISI and CMU SMT system output
• Further development tests performed on Arabic-to-English EBMT, Apptek and SYSTRAN system output, and on three Chinese-to-English COTS systems
• Integrated within the CACI REFLEX Demonstration Platform
Other Examples
http://www-2.cs.cmu.edu/afs/cs/user/alavie/Students/Shyam/Comps100
Conclusions
• New sentence-level MEMT approach with promising performance
• Easy to run on both research and COTS systems
• Tuning of the parameter space for hypothesis generation: too tuned to METEOR?
• Decoding is still suboptimal
  • Oracle scores show there is much room for improvement
• Need for additional discriminant features
  • Some ideas currently under investigation
Approach-1: Lattice MEMT
• Approach:
  • Multiple MT systems each produce a lattice of output segments
  • Create a “union” lattice of the various systems
  • Decode the joint lattice and select the best synthetic output
Approach-1: Lattice MEMT
• Lattice decoder from CMU’s SMT system:
  • Lattice arcs are scored uniformly using word-to-word translation probabilities, regardless of which engine produced the arc
  • The decoder searches for the path that optimizes a combination of translation model score and language model score
  • The decoder can also reorder words or phrases (up to 4 positions ahead)
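A stripped-down sketch of joint-lattice decoding with a uniform translation model and a bigram LM; the actual decoder uses richer models and can reorder, and all names here are illustrative. With non-negative arc costs (negative log probabilities), the first time a (node, previous-word) state is popped its score is optimal, which makes the early exit at the end node safe.

```python
import heapq
from itertools import count

def decode_lattice(lattice, start, end, tm_logprob, lm_logprob):
    """Best-first search for the highest-scoring path through a merged
    lattice. Every arc is scored by tm_logprob(word), regardless of
    which engine contributed it, plus a bigram LM term.

    lattice: dict mapping node -> list of (next_node, target_word)."""
    tie = count()  # tie-breaker so the heap never compares payloads
    heap = [(0.0, next(tie), start, "<s>", ())]
    settled = set()
    while heap:
        cost, _, node, prev, words = heapq.heappop(heap)
        if node == end:
            return -cost, words  # best total log score and its words
        if (node, prev) in settled:
            continue
        settled.add((node, prev))
        for nxt, word in lattice.get(node, ()):
            step = -tm_logprob(word) - lm_logprob(word, prev)
            heapq.heappush(heap, (cost + step, next(tie),
                                  nxt, word, words + (word,)))
    return None
```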
Initial Experiment: Hindi-to-English Systems
• Put together a scenario with “miserly” data resources:
  • Elicited data corpus: 17,589 phrases
  • Cleaned portion (top 12%) of the LDC dictionary: ~2,725 Hindi words (23,612 translation pairs)
  • Manually acquired resources during the DARPA SLE:
    • 500 manual bigram translations
    • 72 manually written phrase transfer rules
    • 105 manually written postposition rules
    • 48 manually written time expression rules
  • No additional parallel text!
Initial Experiment: Hindi-to-English Systems
• Tested on a section of the JHU-provided data: 258 sentences with four reference translations
• SMT system (stand-alone)
• EBMT system (stand-alone)
• XFER system (naïve decoding)
• XFER system with “strong” decoder:
  • No grammar rules (baseline)
  • Manually developed grammar rules
  • Automatically learned grammar rules
• XFER+SMT with strong decoder (MEMT)
Further Experiments: Arabic-to-English Systems
• Combined:
  • CMU’s SMT system
  • CMU’s EBMT system
  • UMD rule-based system
  • (IBM didn’t work out)
• TM scores from the CMU SMT system
• Built a large new English LM
• Tested on the TIDES 2003 test set
Lattice MEMT
• Main drawbacks:
  • Requires the MT engines to provide lattice output: difficult to obtain!
  • Lattice output from all engines must be compatible, with common indexing based on source word positions: difficult to standardize!
  • The common TM used for scoring edges may not work well for all engines
  • Decoding does not take into account any reinforcement from multiple engines proposing the same translation for a portion of the input
Demonstration