TURKALATOR: A Suite of Tools for English to Turkish MT

This presentation describes Turkalator, an English-to-Turkish statistical machine translation system built to address the data-scarcity problems that stem from rich Turkish inflectional morphology. It splits Turkish words into sub-lexical units and customizes the translation-model building heuristics to handle them, with the stated goal of outperforming a general-purpose baseline.

bmcdonough

Presentation Transcript


  1. TURKALATOR: A Suite of Tools for English to Turkish MT. Siddharth Jonathan and Gorkem Ozbek. CS224n Final Project, June 14, 2006.

  2. English - Turkish MT
     • The challenge
       • Traditionally, statistical MT research has focused on language pairs with rich resources
       • Ambitious goal – a complete English-to-Turkish MT system on par with those on the Web (Google, Systran, etc.)
       • Realistic goal – outperform the general-purpose baseline
     • The focus
       • Address scarcity issues stemming from rich Turkish inflectional morphology
     • The strategy
       • Approximate a morphological analysis by exploiting certain aspects of Turkish morphology to get sub-lexical units
       • Customize translation model building heuristics to deal correctly with these units
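The strategy of approximating morphological analysis can be illustrated with a purely surface-based suffix stripper. This is a minimal sketch, not the project's method: the slides do not say which aspects of Turkish morphology were exploited, and the suffix list, the `min_stem` threshold, and the `+suffix` token format are all assumptions made here for illustration.

```python
# Hypothetical, tiny suffix inventory (plural, ablative, locative variants).
# It is NOT the list used in the project; it only illustrates the idea.
SUFFIXES = ("lar", "ler", "dan", "den", "da", "de")


def segment(word, suffixes=SUFFIXES, min_stem=2):
    """Split a word into [stem, '+suffix', ...] by peeling suffixes off the right.

    A suffix is only stripped if at least `min_stem` characters remain,
    which keeps very short words intact.
    """
    ordered = sorted(suffixes, key=len, reverse=True)  # prefer longer suffixes
    stem, tail = word, []
    stripped = True
    while stripped:
        stripped = False
        for suf in ordered:
            if stem.endswith(suf) and len(stem) - len(suf) >= min_stem:
                tail.insert(0, "+" + suf)
                stem = stem[: -len(suf)]
                stripped = True
                break
    return [stem] + tail


print(segment("evlerde"))      # ['ev', '+ler', '+de']
print(segment("kitaplardan"))  # ['kitap', '+lar', '+dan']
```

Splitting "evlerde" ("in the houses") into a stem and two suffix tokens lets the aligner see "ev" every time it occurs, whatever its inflection, which is the scarcity argument the slide makes. Because the sketch only looks at surface strings, it will also over-split words that merely end in a listed suffix; a real segmenter would need further constraints.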

  3. Baseline English to Turkish MT System
     • Translation model: sentence-aligned English-Turkish corpus → GIZA++ (aligner) → word-aligned English-Turkish corpus → phrase building heuristics → phrase translation table
     • Language model: Turkish corpus (training set) → SRILM → Turkish language model
     • Decoding: English sentences → Pharaoh (decoder), using the phrase translation table and the Turkish language model → Turkish translations
     • Corpus: approx. 22,000 aligned sentence pairs covering several genres
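The slide does not spell out the baseline's phrase building heuristics. One criterion commonly used in phrase-based statistical MT is to keep only phrase pairs that are consistent with the word alignment, meaning no word inside the pair is aligned to a word outside it. The sketch below illustrates that criterion under stated assumptions: the function name, the `max_len` limit, and the toy sentence pair and alignment are hypothetical, and the extension to unaligned boundary words is omitted.

```python
def extract_phrases(src, tgt, links, max_len=3):
    """Return the phrase pairs that are consistent with a word alignment.

    src, tgt : lists of tokens
    links    : set of (src_index, tgt_index) alignment points
    """
    pairs = set()
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            # Target positions aligned to any word in the source span.
            tgt_pos = [t for (s, t) in links if i1 <= s <= i2]
            if not tgt_pos:
                continue  # a fully unaligned source span yields no pair
            j1, j2 = min(tgt_pos), max(tgt_pos)
            if j2 - j1 + 1 > max_len:
                continue
            # Consistency check: no target word in the span may be aligned
            # to a source word outside the source span.
            if any(j1 <= t <= j2 and not (i1 <= s <= i2) for (s, t) in links):
                continue
            pairs.add((" ".join(src[i1:i2 + 1]), " ".join(tgt[j1:j2 + 1])))
    return pairs


# Toy example (hypothetical alignment): "the" is left unaligned.
src = ["the", "red", "house"]
tgt = ["kırmızı", "ev"]
links = {(1, 0), (2, 1)}
for pair in sorted(extract_phrases(src, tgt, links)):
    print(pair)
```

In a full system each extracted pair would then be scored, typically by relative frequency over the whole corpus, to fill the phrase translation table.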

  4. The Turkalator Way…
     • Pipeline diagram components:
       • Inputs: Turkish text, English text
       • Stages: segmentation, stem alignment, general word alignment, phrase extraction and scoring
       • Decoding: Pharaoh (decoder) with the phrase translation table and the Turkish language model

  5. Evaluation
     • Quantitative results
     • Qualitative results
       • Scarcity reduced greatly: many more Turkish words are now translated
       • An example:
         • English input: “She thought it over.”
         • Reference translation: “Julia bunu iyice düşündü.”
         • Baseline translation: “Başvuran düşünce bu over.”
         • Turkalator translation: “Julia onun üzerinde düşündü.”
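In the example above, the baseline output still contains the English word "over", while the Turkalator output does not. One rough way to quantify that kind of effect is to measure how many output tokens were copied unchanged from the source sentence. The sketch below is an illustrative proxy only: the slides do not say which metric the project used, and the function names and tokenization are assumptions.

```python
import re


def tokens(text):
    """Lowercase and split on non-word characters (Unicode-aware in Python 3)."""
    return re.findall(r"\w+", text.lower())


def passthrough_rate(source, output):
    """Fraction of output tokens that also occur in the source sentence."""
    src = set(tokens(source))
    out = tokens(output)
    if not out:
        return 0.0
    return sum(tok in src for tok in out) / len(out)


source = "She thought it over."
baseline = "Başvuran düşünce bu over."
turkalator = "Julia onun üzerinde düşündü."

print(passthrough_rate(source, baseline))    # 0.25
print(passthrough_rate(source, turkalator))  # 0.0
```

The proxy is crude: names and numbers are legitimately copied from source to output, so it only makes sense as a relative comparison between two systems on the same input.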
