
Multi-Engine Machine Translation




Presentation Transcript


  1. Multi-Engine Machine Translation Stephan Vogel, Alon Lavie, Jaime Carbonell, Bob Frederking, Ralf Brown Language Technologies Institute Carnegie Mellon University 5000 Forbes Ave. Pittsburgh, PA 15213 Email: {vogel+, alavie, jgc}@cs.cmu.edu

  2. MEMT Scenario
  • Text from different knowledge sources
  • Translated by different translation systems (MT)
  • Selection of best translation; creation of better translation
  • Distributed to different NLP systems or analysts
  [Diagram: multiple MT engines (MT 2–MT 5) feeding a MEMT component, alongside NE tagging, OCR, TDT, summarization, clustering, and XLIR modules]
  ITIC MT Integration Meeting

  3. Objective and Approach
  • Provide improved translation quality by combining output from different translation systems
  • MEMT granularity
    • Sentence level: select 1 out of n translations based on a translation model (scoring scheme)
    • Sub-sentential level: find a better translation by combining translation fragments (build a translation lattice; single-best search)
  • Baseline: rescore all translations with the same statistical translation model, using an SMT decoder
  • Extensions:
    • Take scores assigned by the different MT engines into account
    • Classification methods for scoring translation fragments
    • Confidence measures
    • Optimization according to an MT metric
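The sentence-level granularity above can be sketched in a few lines: rescore every engine's output with one shared model and keep the highest-scoring hypothesis. This is a minimal illustration, not the slides' actual system; a length-normalized unigram log-probability table stands in for the statistical translation model, and all names and example sentences are hypothetical.

```python
def score_translation(hypothesis, unigram_logprobs, oov_logprob=-10.0):
    """Toy stand-in for the shared statistical model: average
    unigram log-probability per token (length-normalized so longer
    hypotheses are not penalized for having more terms)."""
    toks = hypothesis.split()
    total = sum(unigram_logprobs.get(t, oov_logprob) for t in toks)
    return total / max(len(toks), 1)

def select_best(hypotheses, unigram_logprobs):
    """Sentence-level MEMT: rescore all n engine outputs with the
    same model and select 1 of them."""
    return max(hypotheses, key=lambda h: score_translation(h, unigram_logprobs))

# Hypothetical outputs from three MT engines for one source sentence
hyps = ["the meeting is today", "meeting today is", "the today meeting"]
lp = {"the": -1.0, "meeting": -2.0, "is": -1.2, "today": -1.5}
best = select_best(hyps, lp)  # → "the meeting is today"
```

The key property is that a single model arbitrates between engines, so their (mutually incomparable) internal scores never need to be reconciled directly.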

  4. Input/Output: the more information, the better
  • Input: translations from multiple systems
    • Translation lattice: partial translations annotated with scores
    • N-best list annotated with scores
    • First-best or only translation, with/without scores
  • Output: single best translation (N-best list possible)
  • Format: text format, as simple as possible
    • String
    • Lattice:
      BEGIN_LATTICE
      0 1 "source word" "target words" total_score score_1 score_2 …
      0 3 "source words" "target words" total_score score_1 score_2 …
      …
      END_LATTICE
  • Segmentation issues: same input for all MT systems
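A reader of the lattice exchange format above could load it as follows. This is a sketch under the assumption that each edge line carries a start node, end node, quoted source and target phrases, and then numeric scores; the field layout and sample scores are illustrative, not specified by the slides.

```python
import shlex

def parse_lattice(text):
    """Parse the sketched MEMT lattice format into edge tuples:
    (from_node, to_node, source, target, [total_score, score_1, ...])."""
    edges = []
    inside = False
    for line in text.splitlines():
        line = line.strip()
        if line == "BEGIN_LATTICE":
            inside = True
        elif line == "END_LATTICE":
            inside = False
        elif inside and line:
            # shlex keeps the quoted multi-word phrases together
            parts = shlex.split(line)
            frm, to = int(parts[0]), int(parts[1])
            src, tgt = parts[2], parts[3]
            scores = [float(s) for s in parts[4:]]
            edges.append((frm, to, src, tgt, scores))
    return edges

sample = '''BEGIN_LATTICE
0 1 "source word" "target words" -2.5 -1.0 -1.5
1 2 "another word" "more words" -3.0 -1.8 -1.2
END_LATTICE'''
edges = parse_lattice(sample)
```

Keeping the format line-oriented and quoted like this is what makes "as simple as possible" workable: any engine can emit it, and the combiner only needs a trivial tokenizer to read it.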

  5. Progress Tracking
  • Compare performance of MEMT output to performance of the individual MT systems
  • Automatic (objective) MT evaluation metrics
    • General: BLEU and modified BLEU metrics (METEOR, NIST MTeval)
    • Specific: NE evaluation
  • Oracle: how well can an "ideal" MEMT system hope to perform? Find the best-scoring translation in the lattice (according to the automatic metrics)
  • Human evaluation
    • To ascertain that the automatic evaluation metrics are measuring the right thing
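The oracle bullet above amounts to letting the evaluation metric itself pick among the candidates, which bounds how much any selection scheme could gain. A minimal sketch, assuming a simple unigram-overlap F1 as a stand-in for BLEU/METEOR-style metrics (the candidates and reference are hypothetical):

```python
def unigram_f1(hypothesis, reference):
    """Unigram-overlap F1 against a reference; a toy stand-in for
    BLEU/METEOR-style automatic metrics."""
    hyp, ref = hypothesis.split(), reference.split()
    overlap = len(set(hyp) & set(ref))
    if not hyp or not ref or overlap == 0:
        return 0.0
    p, r = overlap / len(hyp), overlap / len(ref)
    return 2 * p * r / (p + r)

def oracle_best(candidates, reference):
    """Oracle upper bound: the candidate the metric itself prefers.
    No real MEMT system can beat this under the same metric."""
    return max(candidates, key=lambda c: unigram_f1(c, reference))

# Hypothetical engine outputs and reference
cands = ["the cat sat", "a cat sleeps", "the dog sat on mat"]
ref = "the cat sat on the mat"
upper_bound = oracle_best(cands, ref)  # → "the dog sat on mat"
```

In the slides' setting the candidates would be paths through the translation lattice rather than whole-sentence outputs, but the bounding logic is the same: the gap between the oracle and the actual MEMT output is the remaining headroom for the combination method.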

  6. Recent Results: Baseline System
  • Hindi-to-English translation
  • 3 MT engines: transfer-based, example-based, statistical
