610 likes | 619 Views
Explore phrase-based translation, log-linear models, and automatic tuning for decoding in statistical machine translation. Learn about language models, complexity management, recombination, pruning, and future cost estimation.
E N D
Statistical Machine TranslationPart III – Phrase-based SMT / Decoding Alex Fraser Institute for Natural Language Processing University of Stuttgart 2008.07.23 EMA Summer School
Outline • Phrase-based translation • Log-linear model • Tuning log-linear model • Decoding
Language Model • Usually a trigramlanguage model isusedfor p(e) • P(the man wenthome) = p(the | START) p(man | START the) p(went | the man) p(home | man went) • Language modelswork well forcomparingthegrammaticalityofstringsofthesame length • However, whencomparingshortstringswithlongstringstheyfavorshortstrings • Forthisreason, a veryimportantcomponentofthelanguage model isthelengthbonus • Thisis a constant > 1 multipliedforeach English word in thehypothesis
d Modified from Koehn 2008
Outline • Phrase-based translation • Log-linear model • Tuning log-linear model • Decoding
Outline • Phrase-based translation model • Log-linear model • Tuning log-linear model automatically • Decoding
Outline • Phrase-basedtranslation model • Log-linear model • Tuning log-linear model automatically • Decoding • Basic phrase-baseddecoding • Dealingwithcomplexity • Recombination • Pruning • Future costestimation • Decoding output