Modern MT Systems and the Myth of Human Translation: Real World Status Quo • Intro • MT & HT Definitions • Comparison MT vs. HT • Evaluation Methods • FAE Framework • Conclusion • Discussion
Is This for Me?
This talk may be of benefit for:
• (Freelance) translators and agencies
• Developers and vendors of MT systems
• People concerned with MT evaluation
• People concerned with HT evaluation
Not for interpreters and speech/non-text based issues
Introduction • What is Machine Translation (MT)? „MT is the automatic translation of human language by computers.“ • What is [Human] Translation (HT)? „The process of transforming text from one language into another language.“ „A written communication in a second language having the same meaning as the written communication in a first language.“
Introduction II • Is there such a thing as HT? „Pure Human Translation“ „Machine Aided Human Translation“ „Human Aided Machine Translation“ • Is HT equal to HT? „Native Speaker“ „Speaks Language X“ „[Trained] Professional“ „Trained Prof. specialized in X“
HT/MT Examples & Quiz Show
Original: Einzigartiger Freizeitpark für Groß und Klein
T1: Singular recreational park for large and small
T2: Unique leisure time park for largely and small
T3: Ein Fantastische DinoPark ferrcoitung
T4: Unique Freizeitpark at big and little
T5: Unique amusement park for great and Klein
T6: Unique leisure park for big and little
Sources:
T1: Babelfish/SYSTRAN
T2: SDL FreeTranslation.com
T3: Human
T4: InterTran
T5: Linguatex eTranslation
T6: PetaMem LangSuite MT
Summary HT Quality • Not all HTs are equal • Significant amount done by untrained people • Better performance of good(!) MT systems on these examples suggests rising MT competitiveness
Issues with MT & HT Evaluation
• Evaluation vs. Similarity
• N-grams do work? Why?
• Reference Translations:
  • Cost & Availability
  • Multiples – which?
  • „Axiomatic Truth“
• Judging: expensive, questionable results
• Using MT-eval methods: limitations just mentioned
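The "Evaluation vs. Similarity" point can be illustrated with a minimal sketch of clipped n-gram precision, the kind of overlap score that reference-based MT metrics build on. This is not the method from the talk. Treating T6 as the reference is an assumption made purely for illustration, and the function names are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against a single reference."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count per n-gram (clipping).
    matched = sum((cand & ref).values())
    return matched / total


# Assumption: T6 serves as the reference only for this illustration.
reference = "Unique leisure park for big and little"
t1 = "Singular recreational park for large and small"
t2 = "Unique leisure time park for largely and small"

for name, cand in [("T1", t1), ("T2", t2)]:
    print(name,
          round(ngram_precision(cand, reference, 1), 3),
          round(ngram_precision(cand, reference, 2), 3))
```

With this reference, T2 scores higher than T1 on both unigram and bigram overlap although "for largely and small" is ungrammatical. The score measures similarity to one chosen reference, not translation quality as such.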
Mission Impossible? • Fully automatic evaluation method for both MT & HT, with no human intervention? • Purpose: automatic QA of translations, or at least safe rejection of bad results • Part of an iterative process (with faith in the translator)
Let's Try Anyway!
Extract Information
• Text Metrics
  • Length
  • Word/Sentence/Paragraph count
• Statistics
  • Character/Word occurrence
  • N-gram
  • Collocations
• Translator Parameters
Reference Data
• Monolingual Corpora for SL & TL: statistical reference
• Dictionaries & Thesauri: adequacy check, translation distance
• Parallel Corpora: sentence alignment, translation length ratio
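A few of the text metrics named on this slide (length, word and sentence counts, translation length ratio) can be sketched as follows. This is a rough illustration, not the framework's implementation. The function names and the rejection bounds `low` and `high` are hypothetical; in practice such bounds would have to be estimated from parallel corpora for the language pair.

```python
import re


def text_metrics(text):
    """Basic surface metrics: character, word, and sentence counts."""
    words = re.findall(r"\w+", text)
    # Naive sentence split on terminal punctuation; enough for a sketch.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {"chars": len(text), "words": len(words), "sentences": len(sentences)}


def length_ratio(source, target):
    """Translation length ratio in characters (target / source)."""
    if not source:
        raise ValueError("source text is empty")
    return len(target) / len(source)


def reject(source, target, low=0.5, high=2.0):
    """Return True if the translation fails a coarse sanity check.

    low/high are placeholder bounds (an assumption), not values from the talk.
    """
    if text_metrics(target)["words"] == 0:
        return True
    ratio = length_ratio(source, target)
    return ratio < low or ratio > high


source = "Einzigartiger Freizeitpark für Groß und Klein"
t6 = "Unique leisure park for big and little"

print(text_metrics(source))
print(round(length_ratio(source, t6), 3))
print(reject(source, t6), reject(source, "Unique"))
```

A check like this can only reject obviously broken output, such as a truncated or empty translation. It says nothing about adequacy, which is where the dictionary and corpus-based reference data come in.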
Conclusion • Translation results of the best contemporary MT systems can be considered on par with the average HT • The presented evaluation framework is just the beginning of an automatic evaluation method for both MT & HT • It is a robust and reliable validation method with safe rejection of invalid/bad translations • In production Q1/2005
Thanks! Q & A