Postgraduate Diploma in Translation
Introduction to Machine Translation IV: The Translator’s Workstation
Intro to MT IV
Recap: MT Methods
• MT
  • Direct MT
  • Rule-Based MT
    • Transfer
    • Interlingua
  • Data-Driven MT
    • EBMT
    • SMT
Different Styles of MT
• FAMT: fully automatic machine translation
  • FAHQMT: fully automatic high-quality machine translation
  • FALQMT: fully automatic low-quality machine translation
• MAHT: machine aided human translation
• HAMT: human aided machine translation
The Proper Place of Men and Machines in Language Translation
• Martin Kay, 1980 [1997]
• Machine translation is an excellent research vehicle but stands no chance of filling actual needs for translators.
• The answer is to develop cooperative man-machine systems.
• Start with word processing and add translation-specific enhancements to approach the goal of automatic translation.
• Be modest; be humble.
The Translator’s Workstation: Origins & Development
• Main idea of the TW attributed to Martin Kay, author of “The Proper Place of Men and Machines in Language Translation” (1980)
• Basic ingredients include
  • Glossaries
  • Multilingual termbanks
  • Translation Memories (TM)
• Built on a word processing environment
• Progressive automation of dictionary lookup and access to TM
Standard Word Processing Environment includes
• Spell Check
• Grammar Check
• Thesaurus
• Word Counting
• Archiving and retrieval of documents
Translation-Oriented Editing
• Basic idea: add a certain level of linguistic awareness to editing functions.
• Translation-oriented word substitution
  • e.g. replace “purchase” with “buy”; the system also changes the inflected forms:
    • purchasing → buying
    • purchased → bought
  • e.g. replace “brume” with “brouillard”; the system also adjusts gender agreement:
    • brume épaisse → brouillard épais
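
The “purchase” → “buy” substitution above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular workstation implements it: the `PARADIGMS` table and the `substitute` function are hypothetical names, and a real system would draw its inflected forms from a full morphological lexicon rather than a hand-written table.

```python
import re

# Hypothetical inflection table (an assumption for this sketch):
# each lemma maps a grammatical slot to its surface form.
PARADIGMS = {
    "purchase": {"base": "purchase", "3sg": "purchases",
                 "ing": "purchasing", "past": "purchased"},
    "buy": {"base": "buy", "3sg": "buys",
            "ing": "buying", "past": "bought"},
}


def substitute(text, old_lemma, new_lemma):
    """Replace every inflected form of old_lemma with the matching form of new_lemma."""
    old_forms = PARADIGMS[old_lemma]
    new_forms = PARADIGMS[new_lemma]
    # Pair each surface form of the old word with the same slot of the new word.
    form_map = {old_forms[slot]: new_forms[slot] for slot in old_forms}
    # Try longer forms first so "purchasing" is not matched as "purchase".
    alternatives = sorted(map(re.escape, form_map), key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(alternatives) + r")\b", re.IGNORECASE)

    def repl(match):
        word = match.group(0)
        new = form_map[word.lower()]
        # Preserve an initial capital letter.
        return new.capitalize() if word[0].isupper() else new

    return pattern.sub(repl, text)


print(substitute("She purchased it while purchasing others.", "purchase", "buy"))
```

The French example needs more than a lookup table, because the adjective has to agree with the new noun, so a sketch of that case would also need gender information for each noun.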
Integration with Desktop Publishing: Translation of Captions
Mark Up Languages
• Markup is anything added to the content of the document that describes the text.
• Formatting instructions: typeface, fonts, paragraphs, bulleted lists.
  • HTML
• More abstract levels of content description.
  • XML
TMX
• TMX (Translation Memory eXchange) is the vendor-neutral open XML standard for the exchange of translation memory data.
• The purpose of TMX is to allow easier exchange of translation memory data between tools and/or translation vendors.
• http://www.lisa.org/tmx/specification.html
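
To make the format concrete, the sketch below reads a tiny TMX-style fragment with Python's standard `xml.etree.ElementTree` module. The fragment follows the general TMX layout (translation units `<tu>` holding one `<tuv>` per language, each with a `<seg>`), but the sample content, the `SAMPLE_TMX` constant and the `read_tmx` function are made up for illustration; the specification remains the authority on required attributes.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Illustrative fragment only (invented content, XML declaration omitted).
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="0.1" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>thick fog</seg></tuv>
      <tuv xml:lang="fr"><seg>brouillard épais</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_tmx(tmx_text, src_lang, tgt_lang):
    """Return (source, target) segment pairs for the two requested languages."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text inside inline markup elements.
                segments[tuv.get(XML_LANG)] = "".join(seg.itertext())
        if src_lang in segments and tgt_lang in segments:
            pairs.append((segments[src_lang], segments[tgt_lang]))
    return pairs


print(read_tmx(SAMPLE_TMX, "en", "fr"))
```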
Access to Lexical Resources
• Online dictionaries
  • On-screen version of a traditional printed dictionary
  • Exploitation of hypertext links
  • Editing facilities, cf. the French Assistant system from Lernout & Hauspie
• Term banks
• Gazetteers
• Encyclopaedic knowledge
• World Wide Web
Commercially Available Systems
• Typically designed for non-linguists
• ... as an extension of a familiar word processing environment
A Typical MAHT
• Separate windows for source and target text
• Source text initially shown in target window, to be overwritten by translation
• User highlights a portion of text to be machine translated.
• Draft translation is then pasted in, ready for post-editing.
• User decides what will be translated by machine, and can develop a modus operandi.
Interactive Translation
• Most systems allow the user a choice of interactive translation, in which the system stops and asks the translator to make choices.
• Can be annoying: the machine may keep asking the same question.
• Difficult to resolve this problem in the general case.
Translation Memory
• First proposed in the 1970s, but not generally available until the 1990s.
• Database of previous translations
• Sentence by sentence translation
• If an exact match for a new sentence is found, it is pasted in.
• If not, the TM may highlight those parts of the new sentence which differ from the stored one.
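
The two behaviours just listed, pasting in an exact match and otherwise highlighting what differs, can be sketched as follows. This is a minimal illustration using Python's standard `difflib`, not the method of any commercial tool. `MEMORY`, `lookup` and `highlight_differences` are hypothetical names, and the stored translation is invented for the example.

```python
import difflib

# Hypothetical translation memory: source sentence -> stored translation
# (the French text is invented for illustration).
MEMORY = {
    "Remove the paper tray.": "Retirez le bac à papier.",
}


def lookup(memory, sentence):
    """Return the stored translation on an exact match, else None."""
    return memory.get(sentence)


def highlight_differences(stored, new):
    """Return `new` with the words that differ from `stored` put in [brackets]."""
    stored_words, new_words = stored.split(), new.split()
    matcher = difflib.SequenceMatcher(None, stored_words, new_words)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        chunk = " ".join(new_words[j1:j2])
        if tag == "equal":
            out.append(chunk)
        elif chunk:
            # "replace" or "insert": words that occur only in the new sentence.
            out.append("[" + chunk + "]")
        # "delete" opcodes (words only in the stored sentence) are not shown.
    return " ".join(out)


print(lookup(MEMORY, "Remove the paper tray."))
print(highlight_differences("remove the paper tray", "remove the ink tray"))
```

Comparing word by word keeps the highlighted spans readable; a character-level comparison would flag fragments of words instead.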
Translation Memory: Highlighting Differences
Translation Memory
• Keys to success are
  • Efficient storage of sentences
  • Efficient matching scheme
• Most current commercial systems are based on character string similarity
Similarity between sentences
• When the paper tray is empty, remove it and refill it with paper of the appropriate size
• When the tray is empty, remove it and fill it with the appropriate paper.
• When the bulb remains unlit, remove it and replace with a new bulb
• You have to remove the paper tray in order to refill it when it is empty.
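
A character-string similarity measure can be tried on these four sentences. The sketch below uses the ratio from Python's standard `difflib.SequenceMatcher` as one possible measure; commercial systems may well use something different (edit distance, for instance). The `similarity` and `best_match` functions and the 0.7 threshold are assumptions made for the example.

```python
from difflib import SequenceMatcher

NEW = ("When the paper tray is empty, remove it and refill it "
       "with paper of the appropriate size")

STORED = [
    "When the tray is empty, remove it and fill it with the appropriate paper.",
    "When the bulb remains unlit, remove it and replace with a new bulb",
    "You have to remove the paper tray in order to refill it when it is empty.",
]


def similarity(a, b):
    """Character-level similarity in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(new_sentence, stored_sentences, threshold=0.7):
    """Return (sentence, score) for the closest stored sentence.

    The sentence is None when nothing reaches the (arbitrary) threshold.
    """
    if not stored_sentences:
        return None, 0.0
    score, sentence = max((similarity(new_sentence, s), s) for s in stored_sentences)
    return (sentence, score) if score >= threshold else (None, score)


for s in STORED:
    print(round(similarity(NEW, s), 2), s)
print(best_match(NEW, STORED))
```

A purely character-based score rewards shared wording rather than shared meaning: the last sentence says much the same thing as the new one but in a different word order, so it scores poorly.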
Other Corpus Based Resources
• Concordance: a list of the occurrences of a word (the keyword, e.g. ‘sin’) taken from a corpus, with each occurrence displayed in the centre of the line and shown in the context in which it occurs
  • Monolingual
  • Bilingual
• Other corpus tools
  • Word sense profilers: WASPS
Monolingual Concordance Example
 1 hed it off. * * * ‘What a curious feeling!’ said Alice; ‘I must b
 1 against herself, for this curious child was very fond of pretendi
 2                          ‘Curiouser and curiouser!’ cried Alice (
 2            ‘Curiouser and curiouser!’ cried Alice (she was so muc
 2 Eaglet, and several other curious creatures. Alice led the way,
 4  -- and yet – it’s rather curious, you know, this sort of life!
 6  eir heads. She felt very curious to know what it was all about,
 6  out a cat! It’s the most curious thing I ever saw in my life!’ S
 7  ht into it. ‘That’s very curious!’ she thought. ‘But everything’
 7 hought. ‘But everything’s curious today. I think I may as well g
 8 Alice thought this a very curious thing, and she went nearer to w
 8 she had never seen such a curious croquet-ground in her life; it
 8  seen, when she noticed a curious appearance in the air: it puzz
 9 next, and so on.’ ‘What a curious plan!’ exclaimed Alice. ‘That’s
10  : ‘and I do so like that curious song about the whiting!’ ‘Oh,
10 th, and said ‘That’s very curious.’ ‘It’s all about as curious a
10  ous.’ ‘It’s all about as curious as it can be,’ said the Gryphon
11  moment Alice felt a very curious sensation, which puzzled her a
11 er the list, feeling very curious to see what the next witness wo
12 ad!’ ‘Oh, I’ve had such a curious dream!’ said Alice, and she tol
12  her, and said, ‘It was a curious dream, dear, certainly: but no
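
A keyword-in-context listing like the one above is straightforward to produce. The following is a minimal sketch, assuming plain text input. The `kwic` function name and the `width` parameter are hypothetical, and the chapter numbers shown on the slide are left out for simplicity.

```python
import re


def kwic(text, keyword, width=30):
    """Return one line per word starting with `keyword`, centred on that word."""
    flat = " ".join(text.split())  # collapse line breaks and repeated spaces
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\w*", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(flat):
        left = flat[max(0, match.start() - width):match.start()]
        right = flat[match.start():match.end() + width]
        # Right-align the left context so every keyword starts in the same column.
        lines.append(left.rjust(width) + right)
    return lines


sample = "‘Curiouser and curiouser!’ cried Alice. ‘What a curious feeling!’ said Alice."
for line in kwic(sample, "curious", width=20):
    print(line)
```

Matching on the start of the word means that “curiouser” is listed alongside “curious”, as in the slide's example.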
Bilingual Concordance
WASPS
• A Semi-Automatic Lexicographer's Workbench for Writing Word Sense ProfileS
• Adam Kilgarriff, David Tugwell et al., ESRC 1999-2002
• Remit was to explore the synergy between the lexicographer's task of identifying and describing word senses, and the computational task of word sense disambiguation (WSD).
Summary
• The translator’s workstation represents the most cost-effective facility for the professional translator working in a large organisation.
• It offers a range of integrated services that are relevant to translation.
• The translator remains in control.