The translation of examples, citations, definitions and glosses in the Papillon project
PAPILLON-02 international seminar, NII, Tokyo, 16-18 July 2002
Christian Boitet
GETA, CLIPS, IMAG, CNRS, INPG & UJF, Grenoble, France
Christian.Boitet@imag.fr
Translation in Papillon (Ch. Boitet)

Outline
• The problem:
  • given the “pivot” architecture of monolingual dictionaries
  • translate all “free language elements” into all languages
  • & store the results, respecting the overall structure
• Proposed solutions:
  • Storing: use auxiliary lexies and axies
  • Translating-1: shared tools for human translation
  • Translating-2: partial MT using UNL
• Perspectives

Internal architecture of the database
[Diagram]
• French Dictionary: Vocable carte (n.f.), with Lexie "carte à jouer" and Lexie "carte géographique"
• Japanese Dictionary: カード, 地図
• Interlingual Dictionary: Acception 343, UNL: card(icl>play); Acception 345, UNL: map(fld>geography)
Architecture derived from Gilles Sérasset’s Ph.D. thesis

PAPILLON scenario & diagram
[Diagram]
• French DiCo: Vocable carte (n.f.); lexie carte.1 "carte à jouer"; lexie carte.2 "carte géographique"
• Japanese DiCo: カード, 地図
• English DiCo: Vocable card (N); lexie card.1 "playing card"; lexie card.2 "money card"
• Thai DiCo
• map: a vocable = lexie
• Interlingual links: Acception 343, UNL: card(icl>play), card(icl>thing)…; Acception 345, UNL: map(fld>geography); Acception 1002, UNL: card(fld>money)
• Interlingual links motivated by translations = "AXIEs"
• Possibility to link 1 lexie to >1 acception
• Links to other representations: AXIE -1--n-> UW
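
The pivot structure in the diagram can be sketched as a small data model. This is only an illustration: the class names, the language codes ("fra", "jpn", "eng"), and the exact lexie-to-acception links are assumptions read off the diagram, not the Papillon database schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Lexie:
    """One word sense in a monolingual dictionary (hypothetical minimal form)."""
    lang: str    # assumed language code, e.g. "fra"
    ident: str   # e.g. "carte.1"
    gloss: str   # e.g. "carte à jouer"


@dataclass
class Axie:
    """Interlingual acception: a pure link object, holding no language text."""
    number: int
    lexies: list = field(default_factory=list)  # linked Lexie objects
    uws: list = field(default_factory=list)     # linked UNL UWs (1 axie, n UWs)


def translations(lexie, axies, target_lang):
    """Lexies of target_lang sharing at least one axie with `lexie`.

    A lexie may be linked to more than one axie, so every axie is scanned.
    """
    found = []
    for axie in axies:
        if lexie in axie.lexies:
            for other in axie.lexies:
                if other.lang == target_lang and other not in found:
                    found.append(other)
    return found


# Illustrative data; the links are an assumed reading of the diagram.
carte1 = Lexie("fra", "carte.1", "carte à jouer")
carte2 = Lexie("fra", "carte.2", "carte géographique")
kaado = Lexie("jpn", "カード", "カード")
chizu = Lexie("jpn", "地図", "地図")
card1 = Lexie("eng", "card.1", "playing card")

axies = [
    Axie(343, [carte1, kaado, card1], ["card(icl>play)", "card(icl>thing)"]),
    Axie(345, [carte2, chizu], ["map(fld>geography)"]),
]

print([lx.ident for lx in translations(carte1, axies, "jpn")])
```

Because the monolingual lexies never point at each other directly, adding a language only requires linking its lexies to existing axies.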

A monolingual DiCo entry (again)
• Name of the lexical unit: MEURTRE
• Grammatical properties: nom, masc
• Semantic formula: action de tuer: ~ PAR L'individu X DE L'individu Y
• Government pattern: X = I = de N, A-poss; Y = II = de N, A-poss
• (Quasi-)synonyms: {QSyn} assassinat, homicide#1; crime
• Semantic derivations & collocations:
  • {V0} tuer
  • {A0} meurtrier-adj /*Nom pour X*/
  • {S1} auteur [de ART Ø] //meurtrier-n /*Nom pour Y*/
  • {S2} victime [de ART Ø] /*Très choquant*/
• Examples: La mésentente pourrait être le mobile du meurtre.
• Full idioms:
  • appel au meurtre
  • crier au meurtre
Structure derived from Alain Polguère’s work on DiCo

Fixed and free language elements
• Fixed
  • Stereotyped definition in semantic formula: action de tuer:
  • Logical argument frame: ~ PAR L'individu X DE L'individu Y
  • Grammatical properties: nom, masc
• Free
  • Examples: La mésentente pourrait être le mobile du meurtre.
  • Citations (e.g. for SPIRIT): the spirit is willing, but the flesh is weak (Bible, ref. XXX)
  • Free definitions in semantic formula (e.g. for a medical noun such as LEUCOCYTE): sort of cell contained in the blood and attacking infectious agents
  • Glosses (sometimes = quasi-synonyms): character (mood)

The problem (1)
• Necessity to translate all free language elements
• The translation into L2 of an example for X (in L1) is not in general a good example for Y, the translation of X in L2
  • Il utilise souvent des cartes IGN
  • *He often uses IGN roadmaps/maps
  • He often uses AA maps
  • IGN = Institut Géographique National
  • AA = Automobile Association
• Hence, the size of the problem is quadratic in the number of languages!
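
One way to read the "quadratic" claim: if every language contributes its own examples and each must be translated into every other language, the number of (source, target) directions grows as n(n-1). The function name below is hypothetical and the count ignores how many examples each language holds.

```python
def direct_translation_directions(n_languages: int) -> int:
    """Ordered (source, target) pairs when each language's own free
    elements must be translated into every other language."""
    if n_languages < 0:
        raise ValueError("number of languages must be non-negative")
    return n_languages * (n_languages - 1)


for n in (2, 5, 10):
    print(n, direct_translation_directions(n))
```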

The problem (2)
• Where to store these translations?
  • Not in the lexies, which must remain monolingual
  • Not in the axies, which must remain pure links

Solution for the storing problem
• Use auxiliary lexies and axies
  • terminology: x-lexie, x-axie
  • x ∈ {def, cit, ex, glo}
• Each free language element becomes an x-lexie
  • cit-lexies and ex-lexies are simpler than normal lexies
• X-lexies are linked through x-axies
• An x-axie contains lists of x-lexies and, in case of an external reference to UNL,
  • a UNL graph (if x ≠ glo), or a UW (glo-axie)

Multilingual links = AXIES
• Normal axies
  • for each language L, 0:n links to lexies of L
  • for each semantic system S available, 0:n links to entities of S
    • UNL UWs, WordNet synsets, NTT SemCat, Ontos concepts, LexiQuest Lex-concepts…
• Auxiliary axies for examples, citations…
  • for each language L, 0:n links to lexies of L
  • if UNL-annotated, 1 UNL graph
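
The storage rules for auxiliary lexies and axies can be sketched as follows. The class and field names are hypothetical, the language codes are assumed, and the UNL graph is kept as an opaque string; only the constraints themselves (x ∈ {def, cit, ex, glo}, a UNL graph for non-glo axies, a UW for glo-axies) come from the slides.

```python
from dataclasses import dataclass, field
from typing import Optional

KINDS = {"def", "cit", "ex", "glo"}


@dataclass
class XLexie:
    """A free language element stored as its own monolingual unit."""
    kind: str
    lang: str
    text: str

    def __post_init__(self):
        if self.kind not in KINDS:
            raise ValueError(f"unknown x-lexie kind: {self.kind}")


@dataclass
class XAxie:
    """Links x-lexies of one kind across languages (0:n per language)."""
    kind: str
    lexies: dict = field(default_factory=dict)  # lang -> list of XLexie
    unl_graph: Optional[str] = None             # allowed only if kind != "glo"
    uw: Optional[str] = None                    # allowed only if kind == "glo"

    def __post_init__(self):
        if self.kind not in KINDS:
            raise ValueError(f"unknown x-axie kind: {self.kind}")
        if self.kind == "glo" and self.unl_graph is not None:
            raise ValueError("a glo-axie refers to a UW, not a UNL graph")
        if self.kind != "glo" and self.uw is not None:
            raise ValueError("only a glo-axie refers to a single UW")

    def add(self, x: XLexie) -> None:
        if x.kind != self.kind:
            raise ValueError("x-lexie and x-axie kinds must match")
        self.lexies.setdefault(x.lang, []).append(x)


ex_axie = XAxie("ex")
ex_axie.add(XLexie("ex", "fra", "Il utilise souvent des cartes IGN."))
ex_axie.add(XLexie("ex", "eng", "He often uses AA maps."))
print(sorted(ex_axie.lexies))
```

Keeping the translated text in x-lexies leaves both ordinary lexies (monolingual) and ordinary axies (pure links) untouched.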

A « Montaigne » environment for Human Translation
• Idea: let users SHARE translation memory & tools on a server
• Specs in 1995 around Eurolang Optimizer™
  • no funding although « Francophony » interested…
• Internet version: see www.yakushite.net (OKI)
• First version built for Lao
  • see www.laosoftware.com (V. Berment)
• Future:
  • bilingual editor as applet
  • use of Papillon server architecture (private spaces etc.)

Scenario & possible GUI
[Figure: typical layout of a bilingual editor in a TSS]

Design & implementation issues
• Peer-to-peer architecture
  • Papillon ↔ Montaigne
• Possibility to modify input text
  • & segmentation
• Integrate with private lexicon(s)
• Open to plug-ins (voice input…)

Automating translation using UNL
• UNL =
  • a project
  • a language to represent NL utterance meanings
  • a format for multilingual documents (html/xml)
• Elements of the UNL language
  • UWs: headword(restrictions), e.g. book(icl>do)
  • Attributes: @future, @past, @complete…, @entry
  • Relations: agt, aoj, mod, obj, tim…
  • (Hyper)graphs: a subgraph is connected & has an entry node
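
The elements listed above can be sketched as a toy data structure. This is not a UNL parser: the class names, the regular expression, the sample UWs, and the reading of "has an entry node" as "exactly one node carries @entry" are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field

# headword, optionally followed by a parenthesised restriction list
UW_PATTERN = re.compile(r"^(?P<head>[^()]+)(?:\((?P<restr>.*)\))?$")


def parse_uw(uw: str):
    """Split 'book(icl>do)' into ('book', ['icl>do'])."""
    match = UW_PATTERN.match(uw)
    if match is None:
        raise ValueError(f"not a UW: {uw!r}")
    restr = match.group("restr")
    return match.group("head"), (restr.split(",") if restr else [])


@dataclass
class Node:
    uw: str
    attrs: set = field(default_factory=set)  # e.g. {"@entry", "@future"}


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)  # node id -> Node
    arcs: list = field(default_factory=list)   # (relation, from_id, to_id)


def is_well_formed(graph: Graph) -> bool:
    """True if exactly one node has @entry and every node is reachable
    from it when arcs are read as undirected (connectedness)."""
    entries = [i for i, n in graph.nodes.items() if "@entry" in n.attrs]
    if len(entries) != 1:
        return False
    neighbours = {i: set() for i in graph.nodes}
    for _relation, a, b in graph.arcs:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, stack = set(), [entries[0]]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(neighbours[current] - seen)
    return seen == set(graph.nodes)


# Hypothetical sample graph: "a city will retrieve a zone"
g = Graph()
g.nodes["n1"] = Node("retrieve(icl>do)", {"@entry", "@future"})
g.nodes["n2"] = Node("city(icl>place)")
g.nodes["n3"] = Node("zone(icl>place)")
g.arcs += [("agt", "n1", "n2"), ("obj", "n1", "n3")]

print(parse_uw("book(icl>do)"), is_well_formed(g))
```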

[Figure: a simple UNL input graph]

[Figure: possible interactive disambiguation at analysis time]

Interactive disambiguation (2)
- gives a correct unique multilevel concrete (UMC) tree
- then a correct unique multilevel abstract (UMA) tree
- and finally a correct UNL graph

Possible text-graph « coedition » at reading time
• applicable if there is a UNL graph associated with a segment one wants to modify
• goal: share the revisions across languages, by reflecting them on the UNL graph
• Ex: FB2204 (Forum Barcelona 2004)
  • « Une cité retrouvera une zone côtière après un forum »
    • add ".@def" on the nodes for "city", "forum"
    • transform "forum" into "Forum"
    • replace "retrieve" by "recover"
    • add ".@complete" on the node containing it
  • « La cité récupérera une zone côtière après le Forum »
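
The four revisions in the example can be replayed on a toy graph. The node ids, the bare headwords used as UWs, the arcs, and the helper names are hypothetical; the actual FB2204 graph is not given in the slides.

```python
# Toy graph for « Une cité retrouvera une zone côtière après un forum ».
graph = {
    "nodes": {
        "n1": {"uw": "retrieve", "attrs": {"@entry", "@future"}},
        "n2": {"uw": "city", "attrs": set()},
        "n3": {"uw": "zone", "attrs": set()},  # "coastal" modifier omitted
        "n4": {"uw": "forum", "attrs": set()},
    },
    "arcs": [("agt", "n1", "n2"), ("obj", "n1", "n3"), ("tim", "n1", "n4")],
}


def find_node(graph, uw):
    """Return the first node whose UW equals `uw`."""
    for node in graph["nodes"].values():
        if node["uw"] == uw:
            return node
    raise KeyError(uw)


def add_attr(graph, uw, attr):
    find_node(graph, uw)["attrs"].add(attr)


def replace_uw(graph, old, new):
    find_node(graph, old)["uw"] = new


# The revisions listed on the slide, applied to the graph only.
add_attr(graph, "city", "@def")
add_attr(graph, "forum", "@def")
replace_uw(graph, "forum", "Forum")
replace_uw(graph, "retrieve", "recover")
add_attr(graph, "recover", "@complete")

for node_id, node in graph["nodes"].items():
    print(node_id, node["uw"], sorted(node["attrs"]))
```

The arcs are left untouched: every revision in this example changes a node's UW or its attributes, not the relations between nodes.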

Principles of coedition (1)
• It is impossible in principle to deduce the modification on the graph from a modification on the text
  • For example, replacing "un" ("a") by "le" ("the") does not entail that the following noun is definite (.@def), because it can also be generic
  • "il aime la montagne" = "he likes mountains"
• Revision is not done by directly modifying the text, but by using a menu system
• Menu items have a "language side" and a hidden "UNL side"

Principles of coedition (2)
• when a menu item is chosen,
  • only the graph is transformed,
  • the action to be done on the text is delayed and shown
• at any time, the new graph may be deconverted
  • If it is satisfactory, that shows that errors were due to the graph and not to the deconverter, and the graph may be sent to deconverters in other languages.
• Versions in some other languages known by the user may be displayed, so that improvement sharing is visible and encouraging.
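
The workflow described above might be organised as in the sketch below. Everything here is a stand-in: `MenuItem`, `CoeditionSession`, and the two one-phrase "deconverters" are hypothetical, and clearing the pending list on deconversion is an assumption about when the delayed text action is considered done.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class MenuItem:
    label: str                          # "language side", shown to the user
    unl_action: Callable[[dict], None]  # hidden "UNL side", edits the graph


@dataclass
class CoeditionSession:
    graph: dict                                   # node id -> set of attributes
    pending: list = field(default_factory=list)   # delayed text actions

    def choose(self, item: MenuItem) -> None:
        """Transform only the graph; the text action is delayed and shown."""
        item.unl_action(self.graph)
        self.pending.append(item.label)

    def deconvert(self, deconverter: Callable[[dict], str]) -> str:
        """Regenerate text from the current graph (assumed to settle pending actions)."""
        text = deconverter(self.graph)
        self.pending.clear()
        return text


# Toy deconverters for a single noun phrase, for illustration only.
def to_french(graph):
    return "le forum" if "@def" in graph["forum"] else "un forum"


def to_english(graph):
    return "the forum" if "@def" in graph["forum"] else "a forum"


session = CoeditionSession(graph={"forum": set()})
make_definite = MenuItem(
    label='un forum -> le forum',
    unl_action=lambda g: g["forum"].add("@def"),
)

session.choose(make_definite)
print(session.pending)                # delayed action, shown to the user
print(session.deconvert(to_french))   # check the result in the user's language
print(to_english(session.graph))      # same graph, shared with another language
```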

Conclusion
• Need for a translation task (#7) in Papillon
• Seamless integration with x-lexies + x-axies
• Possible combination of TA & MT
  • Mutualization spirit (Papillon, Montaigne) for TA
  • Use of UNL (2 « pivot » architectures) for MT
• Mutualization again in MT part (humans involved)
  • Interactive disambiguation
  • Coedition text ↔ UNL graph
• … & of course lexical data contribution through Papillon!