Motivations for transfer-based translation • lexical ambiguity • structural differences See further Ingo 91
Example 1 Sv. Fyll på olja i växellådan. En. Fill gearbox with oil. (from the Scania corpus) • fyll på → fill • obj → adv • adv → obj
Example 2 Sv. I oljefilterhållaren sitter en överströmningsventil. En. The oil filter retainer has an overflow valve. (from the Scania corpus) • sitter → has • adv → subj • subj → obj
Transfer-based translation • intermediary sentence structure • basic processes • analysis • transfer • generation (synthesis) • language modules • dictionary and grammar of SL • transfer dictionary and transfer rules • dictionary and grammar of TL
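The three basic processes can be pictured with a minimal Python sketch, using Example 1 from the earlier slide. Everything here is a toy assumption for illustration: the function names, the hard-coded parse, the lexicon entries, and the fixed output template are hypothetical and are not taken from any actual system.

```python
# Toy illustration of analysis -> transfer -> generation.
# All names and data below are hypothetical.

def analyse(sentence):
    """Analysis: map an SL sentence to an intermediary structure (hard-coded lookup)."""
    toy_parses = {
        "Fyll på olja i växellådan.": {
            "verb": "fylla_på", "obj": "olja", "adv": "växellåda",
        },
    }
    return toy_parses[sentence]

def transfer(sl):
    """Transfer: lexical transfer plus one structural change (obj and adv swap roles)."""
    lexicon = {"fylla_på": "fill", "olja": "oil", "växellåda": "gearbox"}
    return {
        "verb": lexicon[sl["verb"]],
        "obj": lexicon[sl["adv"]],  # SL adverbial becomes TL object
        "adv": lexicon[sl["obj"]],  # SL object becomes TL adverbial
    }

def generate(tl):
    """Generation: linearise the TL structure with a fixed toy template."""
    return f"{tl['verb'].capitalize()} {tl['obj']} with {tl['adv']}."

def translate(sentence):
    return generate(transfer(analyse(sentence)))

print(translate("Fyll på olja i växellådan."))
```

The point of the sketch is only the separation of concerns: each step consults its own language module, and the structural difference is handled entirely in transfer.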
[Figure: translation strategies between SL and TL. Labels: Direct translation, Transfer, Interlingua; systems placed in the diagram: Metal, Multra]
Levels of intermediary structure • cf. J&M, Chapter 21 • word order
Metal • See H&S
MULTRA Multilingual Support for Translation and Writing • translation engine • transfer-based • shake-and-bake • modular • unification-based • preference machinery • traceable
Analysis • chart parser (Lisp C) • procedural formalism • unification and other kinds of operations • sentence structure • feature structure • grammatical relations • surface order implicit via grammatical relations See further Sågvall Hein & Starbäck (99), Weijnitz (02), Dahllöf (89)
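As a reminder of what unification over feature structures does, here is a minimal Python sketch. It assumes feature structures are plain nested dictionaries with atomic string values, and it ignores reentrancy (shared substructures), so it is a simplification and not the parser's actual operation.

```python
def unify(a, b):
    """Return the unification of two feature structures, or None on a clash.

    Feature structures are nested dicts; atomic values are compared for equality.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, b_value in b.items():
            if feature in result:
                sub = unify(result[feature], b_value)
                if sub is None:
                    return None  # conflicting values somewhere below
                result[feature] = sub
            else:
                result[feature] = b_value  # information only present in b
        return result
    return a if a == b else None

np1 = {"PHR.CAT": "NP", "NUMB": "SING"}
np2 = {"PHR.CAT": "NP", "DEF": "DEF"}
print(unify(np1, np2))
print(unify({"NUMB": "SING"}, {"NUMB": "PLUR"}))
```

Compatible structures merge their information; a single conflicting atomic value makes the whole unification fail.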
Transfer • unification-based • declarative formalism • Multra transfer formalism (Beskow 93) • lexical and structural rules • rules are partially ordered • a more specific rule takes precedence over a less specific one • specificity in terms of number of transfer equations • all applicable rules are applied • written in Prolog
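The precedence principle can be sketched in Python as follows. This is an illustration under stated assumptions: rules are represented as dictionaries, a rule's source side is a flat mapping from feature paths to required values, and specificity is approximated as the number of those equations. The rule names and the flat path notation are hypothetical; the actual formalism is described in Beskow 93.

```python
def is_applicable(rule, source):
    """A rule applies if every one of its equations holds in the source structure."""
    return all(source.get(path) == value for path, value in rule["equations"].items())

def ordered_applicable_rules(rules, source):
    """Applicable rules, most specific (most equations) first."""
    applicable = [r for r in rules if is_applicable(r, source)]
    return sorted(applicable, key=lambda r: len(r["equations"]), reverse=True)

# Hypothetical rules: a general lexical rule and a more specific contextual one.
rules = [
    {"label": "tänka-think", "equations": {"verb.lex": "tänka.vb.1"}},
    {"label": "tänka.på-think.about",
     "equations": {"verb.lex": "tänka.vb.1", "obj.prep.lex": "på.pp.1"}},
]
source = {"verb.lex": "tänka.vb.1", "obj.prep.lex": "på.pp.1"}

print([r["label"] for r in ordered_applicable_rules(rules, source)])
```

Both rules remain applicable here; the ordering only decides which one takes precedence.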
Generation • syntactic generation • Multra syntactic generation formalism (Beskow 97a) • PATR-like style • unification • concatenation • typed features • morphological generation (Beskow 97b) • lexical insertion rules • morphological realisation and phonological finish in Prolog • written in Prolog
An example: Tippa hytten. Tippa hytten. : (* = (PHR.CAT = CL MODE = IMP SUBJ = 2ND VERB = (WORD.CAT = VERB INFF = IMP DIAT = ACT LEX = TIPPA.VB.1 VSURF = +) OBJ.DIR = (PHR.CAT = NP NUMB = SING GENDER = UTR CASE = BASIC DEF = DEF HEAD = (LEX = HYTT.NN.1 WORD.CAT = NOUN))) REG = (V1.LEM = TIPPA.VB) SEP = (WORD.CAT = SEP LEX = STOP.SR.0)))
Transfer structure [VERB : [WORD.CAT : VERB LEX : TILT.VB.0 DIAT : ACT INFF : IMP] OBJ.DIR : [PHR.CAT : NP DEF : DEF NUMB : SING HEAD : [WORD.CAT : NOUN LEX : CAB.NN.0]] MODE : IMP SUBJ: 2ND VSURF: + SEP : [WORD.CAT : SEP LEX : STOP.SR.0] PHR.CAT : CL]
Generation Tilt the cab.
A grammar rule defrule legal.obj { <?1 phr.cat> = 'np, not <?1 case> = 'gen, not <?1 case> = 'subj }
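One way to read this rule is as a predicate over a feature structure. The Python sketch below assumes that each `not <?1 case> = 'x` clause means "the case feature is not x", and that a structure without a case feature passes the test; both readings are assumptions about the formalism, not documented behaviour.

```python
def legal_obj(fs):
    """True if fs is an NP whose case is neither genitive nor subject form."""
    return fs.get("phr.cat") == "np" and fs.get("case") not in ("gen", "subj")

print(legal_obj({"phr.cat": "np", "case": "basic"}))  # NP in basic case
print(legal_obj({"phr.cat": "np", "case": "gen"}))    # genitive NP
```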
Transfer rules • copy feature • delete feature • transfer feature • assign feature
Copy feature LABEL mode SOURCE <* mode> = ?x1 TARGET <* mode> = ?x2 TRANSFER
Delete feature LABEL REG SOURCE <* REG> = ANY TARGET <*> = <*> TRANSFER
Transfer feature LABEL OBJ.DIR SOURCE <* OBJ.DIR> = ?x1 TARGET <* OBJ.DIR> = ?x2 TRANSFER ?x1 <=> ?x2
Define feature LABEL trycka.in-press SOURCE <* lex sym>=trycka.vb+in.ab.1 <* word.cat>=VERB TARGET <* lex>=press.vb.1 <* word.cat>=VERB TRANSFER
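The combined effect of the four rule types can be approximated by a recursive walk over a nested dictionary. The Python sketch below is a procedural stand-in for what the declarative rules express, not the Multra implementation. The lexical pairs come from the "Tippa hytten" example; the set of deleted features (REG, GENDER, CASE) is an assumption inferred from comparing the analysis and transfer structures on the slides.

```python
# Lexical correspondences taken from the "Tippa hytten" example.
LEXICON = {"TIPPA.VB.1": "TILT.VB.0", "HYTT.NN.1": "CAB.NN.0"}
# Assumption: features absent from the transfer structure are deleted.
DELETE = {"REG", "GENDER", "CASE"}

def transfer(source):
    """Map a source feature structure (nested dicts) to a target structure."""
    target = {}
    for feature, value in source.items():
        if feature in DELETE:
            continue                                     # delete feature
        if feature == "LEX":
            target[feature] = LEXICON.get(value, value)  # define (lexical) feature
        elif isinstance(value, dict):
            target[feature] = transfer(value)            # transfer feature (recurse)
        else:
            target[feature] = value                      # copy feature
    return target

source = {
    "PHR.CAT": "CL", "MODE": "IMP",
    "VERB": {"WORD.CAT": "VERB", "LEX": "TIPPA.VB.1"},
    "OBJ.DIR": {"PHR.CAT": "NP", "GENDER": "UTR", "CASE": "BASIC",
                "HEAD": {"WORD.CAT": "NOUN", "LEX": "HYTT.NN.1"}},
    "REG": {"V1.LEM": "TIPPA.VB"},
}
print(transfer(source))
```

Lexical items without an entry in the toy lexicon pass through unchanged, which is a convenience of the sketch only.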
A generation rule LABEL CL.IMP X1 ---> X2 X3 X4 : <X1 PHR.CAT> = CL <X1 VERB> = <X2> <X1 TYPE> = IMP <X1 OBJ.DIR> = <X3> <X1 SEP> = <X4>
A contextual lexical rule LABEL tänka.på-think.about SOURCE <* verb lex sym> = tänka.vb.1 <* obj.prep phr.cat> = pp <* obj.prep prep> = ?prep <* obj.prep prep lex sym> = på.pp.1 <* obj.prep rect> = ?rect1 TARGET <* obj.prep phr.cat> = pp <* obj.prep prep word.cat> = PREP <* obj.prep prep lex> = about.pp.1 <* obj.prep rect> = ?rect2 TRANSFER ?rect1<=>?rect2
A generation trace 1-Applying Rule cl-sep 1- Applying Rule cl.imp 1- Applying Rule subj2nd-verb-obj.dir 1- Applying Rule verb.main.act 1- Applying Rule np.the-df 1- Applying Rule ng.noun-def 1-Success!
Language resources in the MATS system • dictionary in a database with different views • analysis grammar • transfer grammar • incl. contextually defined lexical rules • generation grammar
The MATS system Frozen demo…
Assignment 2: Working with MATS http://stp.ling.uu.se/~evapet/mt04/assignment2.html
Lexicalistic translation • Identify (lexical) translation units in the source sentence • Translate each unit separately (considering the context) • Order the result in agreement with a model of the target language Formulation due to Lars Ahrenberg; see further AH (reading list); see also Beaven, John L., Shake-and-Bake Machine Translation. COLING-92, Nantes, 23-28 August 1992.
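The three steps can be illustrated with a small Python sketch. The lexicon, the bigram set standing in for the target-language model, and the exhaustive search over permutations are all toy assumptions; a real system would use context-sensitive lexical choice and a far more efficient ordering step.

```python
from itertools import permutations

# Hypothetical unit lexicon and target-language bigram model.
LEXICON = {"tippa": "tilt", "hytten": "the cab"}
BIGRAMS = {("<s>", "tilt"), ("tilt", "the"), ("the", "cab"), ("cab", "</s>")}

def score(words):
    """Count how many adjacent word pairs the toy target model accepts."""
    padded = ["<s>", *words, "</s>"]
    return sum((a, b) in BIGRAMS for a, b in zip(padded, padded[1:]))

def translate(sentence):
    units = sentence.lower().rstrip(".").split()             # 1. identify units
    bag = [w for u in units for w in LEXICON[u].split()]     # 2. translate each unit
    return " ".join(max(permutations(bag), key=score))       # 3. order by target model

print(translate("Tippa hytten."))
```

Note that step 3 never consults the source word order: the target model alone decides the arrangement of the translated units.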
T4F – a lexicalistic system • processes in T4F • tokenisation • tagging • transfer • transposition • filtering See further AH (in the reading list)
Interlingua translation • See SN
Applications of alignment • translation memories • translation dictionaries • lexicalistic translation • statistical machine translation • example-based translation
Translation memories • based on sentence links • optionally, sub-sentence links See further Macklovitch, E. (2000)
Translation dictionaries • based on word links • refinement of word links
Refinement of word alignment data • neutralise capital letters where appropriate • lemmatise or tag source and target units • identify ambiguities • search for criteria to resolve them • identify partial links • compounds? • remove or complete them • manual revision?
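Two of these steps, neutralising capitals and identifying ambiguities, can be sketched in Python. The input format (a list of source/target word pairs) and the blanket lowercasing are simplifying assumptions; the slide's "where appropriate" would need a smarter test, for instance to preserve proper nouns.

```python
from collections import Counter, defaultdict

def refine(links):
    """Lowercase word links, count them, and report ambiguous source units.

    links: iterable of (source_word, target_word) pairs (hypothetical format).
    Returns (dictionary, ambiguous), both mapping source -> {target: frequency}.
    """
    counts = Counter((s.lower(), t.lower()) for s, t in links)
    dictionary = defaultdict(dict)
    for (s, t), n in counts.items():
        dictionary[s][t] = n
    ambiguous = {s: ts for s, ts in dictionary.items() if len(ts) > 1}
    return dict(dictionary), ambiguous

links = [("Olja", "oil"), ("olja", "oil"), ("sitter", "has"), ("sitter", "sits")]
dictionary, ambiguous = refine(links)
print(dictionary)
print(ambiguous)
```

The ambiguous entries are the ones for which disambiguation criteria, or manual revision, would then be needed.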
Informally about statistical MT • build a translation dictionary based on word alignment • aim for as big fragments as possible • keep information on link frequency • build an n-gram model of the target language • implement a direct translation strategy • including alternatives ordered by length and frequency • process the output by the n-gram model filtering out the best alternatives and adjust the translation accordingly
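The informal recipe can be sketched in Python. The fragment dictionary, the link frequencies, and the bigram counts are invented toy data, and several simplifications are made: segmentation is greedy longest-match, alternatives are ordered by frequency only, and the n-gram model is a raw bigram count sum with no smoothing.

```python
from itertools import product

def segment(tokens, dictionary, max_len=3):
    """Greedy longest-match segmentation: prefer as big fragments as possible."""
    i, fragments = 0, []
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            fragment = tuple(tokens[i:i + n])
            if fragment in dictionary:
                fragments.append(fragment)
                i += n
                break
        else:
            fragments.append((tokens[i],))  # unknown word: pass through
            i += 1
    return fragments

def alternatives(fragment, dictionary):
    """Target alternatives for a fragment, most frequent link first."""
    options = dictionary.get(fragment, {" ".join(fragment): 0})
    return sorted(options, key=options.get, reverse=True)

def best_translation(tokens, dictionary, bigram_counts):
    """Direct translation, with the bigram model choosing among alternatives."""
    fragments = segment(tokens, dictionary)
    candidates = product(*(alternatives(f, dictionary) for f in fragments))

    def lm_score(candidate):
        words = " ".join(candidate).split()
        return sum(bigram_counts.get(pair, 0) for pair in zip(words, words[1:]))

    return " ".join(max(candidates, key=lm_score))

# Hypothetical data: fragment -> {target: link frequency}, and bigram counts.
dictionary = {
    ("fyll", "på"): {"fill": 5, "top up": 2},
    ("fyll",): {"fill": 3},
    ("olja",): {"oil": 9, "grease": 1},
}
bigram_counts = {("fill", "oil"): 4, ("up", "oil"): 1}
print(best_translation(["fyll", "på", "olja"], dictionary, bigram_counts))
```

Because the strategy is direct, the output keeps the source order of fragments; only the choice among alternatives is left to the target-language model.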
Example-based MT • See HS (in the reading list)