Translation memory systems to enhance the quality and productivity of localization of teaching materials. Sam Joachim. 6th Workshop Software Engineering Education and Reverse Engineering, Ravda (Nessebar), Bulgaria, 18 – 23 September 2006.
Last year's suggestions for future development of the S-Bahn Tool
• Transformation of our PowerPoint material into an independent XML format (perhaps based on <ML>3)
  • With support for different learning paths
  • With the possibility of easily exchanging parts of the content (case studies, examples, …)
  • With different outputs (ppt, pdf, html, textbook, …)
  • For different target groups, perhaps with diverse knowledge levels
• Building a kind of repository / pool system for the learning objects and an authoring system for adaptable study packs or (e-)learning material
• Support for ‘Authoring by Aggregation’ in the system
Translation Memory Systems
Authoring by aggregation: main idea
• A new learning object is composed of some parts from LO A (a picture), LO B (some text) and two new modules Na and Nb
[Figure: learning object A and learning object B contribute information objects / modules to a new learning object, which also contains the new modules Na and Nb]
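The aggregation idea on this slide can be illustrated with a small data-structure sketch. The class names `Module` and `LearningObject`, the `pick` helper and all contents are hypothetical, chosen only for illustration; the slides do not prescribe any implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Module:
    """An information object / module, e.g. a picture or a block of text."""
    name: str
    kind: str     # illustrative values: "picture", "text"
    content: str


@dataclass
class LearningObject:
    title: str
    modules: list[Module] = field(default_factory=list)

    def pick(self, name: str) -> Module:
        """Return the module with the given name so it can be reused."""
        for module in self.modules:
            if module.name == name:
                return module
        raise KeyError(name)


# Two existing learning objects (contents are made up for illustration).
lo_a = LearningObject("LO A", [
    Module("a-picture", "picture", "waterfall.png"),
    Module("a-text", "text", "Text of A"),
])
lo_b = LearningObject("LO B", [Module("b-text", "text", "Text of B")])

# The new object reuses one part of each and adds the new modules Na and Nb.
new_lo = LearningObject("New learning object", [
    lo_a.pick("a-picture"),
    lo_b.pick("b-text"),
    Module("Na", "text", "New module Na"),
    Module("Nb", "text", "New module Nb"),
])

print([module.name for module in new_lo.modules])
```

Reused modules are shared rather than copied in this sketch, so a corrected picture in LO A would also appear in the new object; whether sharing or copying is wanted is a design decision the slides leave open.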
Some problems to solve
• Learning object models adapted to ‘Authoring by aggregation’
• XML representation of our slides / material
• S-Bahn Tool problems with respect to localization (quality, efficiency of translation)
Learning object models adapted to ‘Authoring by aggregation’ & XML representation of our slides / material
Learnativity Content Model (Duval & Hodgins 2003): Modular Content Hierarchy
[Figure: the modular content hierarchy, annotated with examples from our material: shapes, pictures, textfields, diagrams; grouped elements, single or associated slides; sections, topic slides; the JCSE course]
Source: E. Duval and W. Hodgins, A LOM research agenda. In Proceedings of the twelfth international conference on World Wide Web, pages 1–9. ACM Press, 2003.
Modular content objects explained on a typical slide
[Figure: a typical slide with its parts labelled: raw media objects / textfields; learning object / group of elements; information object / remark for the audience]
Generation of fine-granular objects
• Use / adaptation of an existing XML teaching material language
• Automatic generation of
  • Raw media objects (shapes, textfields)
  • Information objects (groups of objects, graphics)
• Semi-automatic by selecting from the automatically generated elements
  • Learning objects (associated slides, …)
• Higher level objects like
  • Aggregate assemblies (topics)
  • Collections (whole JCSE)
Example of generated objects
[Figure: Waterfall model]
Schematic of a transformation process
[Figure: transformation pipeline. Labels: source formats .doc, .pdf, .ppt; Tool, Tool 2, Tool 3, Tool 4; stages "Essence", "Half-baked essence", "Final document"; targets Ahyco, Moodle, eLesson, Something, Something else; "Automated. Some AI?"; "Interactivity. Some NI"]
• What is the essence? Text? Pictures? Style? XML? …
• Teaching / Learning Object or Material Repository
• Interactive “Authoring by Aggregation” process uses fragments / modules to generate new material
• Automatic generation of “raw media objects”
• Automatic grouping of connected objects
• Raw data and media elements in XML format
• Objects in some Teaching Material Language (LMML / <ML3>)
• Knowledge grid
• Information objects (groups of objects, graphics)
Two ways of representing ppt in XML to preserve our ppt layout
• Embedding the original ppt code in container elements of the XML language

  <learning object>
    <container ppt>
      binary ppt representation
    </container ppt>
  </learning object>

• Building a transformation from the ppt elements to the elements of the XML language and vice versa (easier modification of existing material)

  <learning object>
    <slide Heading="The classical waterfall model (1970)">
      <graphic name="Waterfall Model">
        <textfield pos=(10,40) … > Analysis and Definition </textfield>
      </graphic>
    </slide>
  </learning object>
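The two variants on this slide are schematic. As a hedged illustration, the sketch below builds well-formed counterparts with Python's standard `xml.etree.ElementTree`. The element and attribute names (`learning_object`, `container`, `format`, `heading`, `x`, `y`) are assumptions made so that the result is valid XML; they are not taken from LMML or <ML>3, and Base64 is only one possible way to embed binary data.

```python
import base64
import xml.etree.ElementTree as ET


def container_variant(ppt_bytes: bytes) -> ET.Element:
    """Variant 1: embed the original binary ppt data in a container element."""
    learning_object = ET.Element("learning_object")
    container = ET.SubElement(learning_object, "container", {"format": "ppt"})
    # Binary data cannot appear raw in XML, so it is Base64-encoded here.
    container.text = base64.b64encode(ppt_bytes).decode("ascii")
    return learning_object


def structured_variant() -> ET.Element:
    """Variant 2: map ppt elements onto elements of the XML language."""
    learning_object = ET.Element("learning_object")
    slide = ET.SubElement(
        learning_object, "slide",
        {"heading": "The classical waterfall model (1970)"},
    )
    graphic = ET.SubElement(slide, "graphic", {"name": "Waterfall Model"})
    # The position (10,40) from the slide is split into two attributes.
    textfield = ET.SubElement(graphic, "textfield", {"x": "10", "y": "40"})
    textfield.text = "Analysis and Definition"
    return learning_object


print(ET.tostring(structured_variant(), encoding="unicode"))
```

Only the second variant exposes the text of each textfield as a separate node, which is what a translation tool would need to reach individual segments.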
Translation Memory Systems
S-Bahn Tool problems
• Word-by-word translation
• Repeated translation of phrases
• Difficult use of the feature for the content slides / slides with reappearing content
• Possible solution:
  • The use of a translation memory system
TM Systems: Introduction
• Two main types of translation support
  • Machine Translation (fully automatic)
    • Tries to translate autonomously; in most cases the result is unsatisfactory
  • Machine-Assisted Human Translation (computer-aided)
    • Mainly with Translation Memory Systems
Background - History
• Who: target group is professional translators
• Where: professional translation agencies / specialized companies
• What: big translation projects with lots of different media and/or documents and document versions
• Documents to translate:
  • Technical / project / product documentation
  • Texts with a technical / natural science background
  • Software in different language variants and across different program versions
    • Including GUI elements, user / product documentation
TM System
• Is a database
• Records sentences or word groups (segments) with the corresponding translation
• During translation, the TM system searches the already translated segments for similarities with the current segment
• The translator can easily reuse already translated segments
TM System (cont.)
• Every sentence / segment is checked against the database whenever other texts are translated
• If it is already in the database, the translation can be adopted as it stands
• No segment / sentence should be translated twice (similar to the S-Bahn Tool content slides feature, but much more generic in practice)
• Highly effective TM systems ease the routine jobs of the translator, who can concentrate more on the creative task of translating
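The database idea described on the last two slides can be sketched in a few lines. This is a minimal illustration only: the class `TranslationMemory`, the naive sentence splitter and the German sample translation are assumptions, and real TM systems use far more careful segmentation rules and persistent storage.

```python
import re
from typing import Optional


def split_segments(text: str) -> list[str]:
    """Naive segmentation: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


class TranslationMemory:
    """Stores source segments together with their translations."""

    def __init__(self) -> None:
        self._entries: dict[str, str] = {}

    @staticmethod
    def _key(segment: str) -> str:
        # Ignore case and repeated whitespace when comparing segments.
        return " ".join(segment.split()).casefold()

    def add(self, source: str, target: str) -> None:
        self._entries[self._key(source)] = target

    def lookup(self, source: str) -> Optional[str]:
        """Return the stored translation for an exact match, else None."""
        return self._entries.get(self._key(source))


tm = TranslationMemory()
# Sample entry; the German sentence is illustrative, not from the JCSE material.
tm.add("The waterfall model has five phases.",
       "Das Wasserfallmodell hat fünf Phasen.")

text = "The waterfall model has five phases. Each phase ends with a review."
for segment in split_segments(text):
    print(segment, "->", tm.lookup(segment))
```

Segments already in the memory are pre-translated, and the translator only has to deal with those for which `lookup` returns `None`.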
Fuzzy searches
• In real-world texts, a sentence or segment is very seldom repeated exactly
• Most TM systems search not only for exact matches but also for matches with a certain similarity, the so-called fuzzy matches
• Fuzzy matches: only marginal differences, e.g. numbers, names, additional words, …
• Of course, the translator has to adapt the translation manually when taking over a fuzzy match
• Example from JCSE Topic 3, Slide 26:
  • Already known: Part of the phase ‘Analysis and Design’, in which the basic use cases of the systems will be detected: use case diagrams
  • New: Part of the phase ‘Analysis and Design’, in which the basic classes of the problem will be detected: class diagrams
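A fuzzy search of this kind can be approximated with Python's standard `difflib`. The sketch below compares segments word by word and uses the JCSE example from this slide. The similarity measure, the 0.7 threshold and the German target texts are assumptions for illustration; commercial TM systems use their own, usually undisclosed, scoring.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Word-level similarity between two segments, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def fuzzy_matches(segment: str, memory: dict[str, str], threshold: float = 0.7):
    """Return (score, source, target) for stored segments above the threshold."""
    scored = [(similarity(segment, source), source, target)
              for source, target in memory.items()]
    hits = [match for match in scored if match[0] >= threshold]
    return sorted(hits, key=lambda match: match[0], reverse=True)


KNOWN = ("Part of the phase 'Analysis and Design', in which the basic "
         "use cases of the systems will be detected: use case diagrams")
NEW = ("Part of the phase 'Analysis and Design', in which the basic "
       "classes of the problem will be detected: class diagrams")

# Illustrative memory; the German targets are made up for this sketch.
memory = {
    KNOWN: ("Teil der Phase 'Analyse und Entwurf', in der die grundlegenden "
            "Anwendungsfälle des Systems ermittelt werden: Anwendungsfalldiagramme"),
    "Two windows for source and target language":
        "Zwei Fenster für Quell- und Zielsprache",
}

for score, source, target in fuzzy_matches(NEW, memory):
    print(f"{score:.0%} match, suggested translation to adapt: {target}")
```

Comparing word lists instead of characters keeps the score closer to what a translator perceives as "a few words changed"; a character-level comparison would be an equally valid choice.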
Existing TM Systems
• Trados: http://www.trados.com
• SDLX: http://www.sdlint.com
• Transit: http://www.star-ag.ch/products/transit
• DéjàVu: http://www.atril.com
• OmegaT: http://www.omegat.org/
• Transolution: http://transolution.python-hosting.com/
Parts of the TM System
• GUI
• Translation Memory DB
  • With sentences / segments and their translations
• Terminology / Glossary DB
  • Dictionary functions and sometimes a glossary
  • Collects and provides technical terms during translation
  • Can easily be changed and expanded
  • Quick online search
• Alignment Tool
  • Inserts already translated texts
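The terminology / glossary part can be pictured as a simple lookup that runs alongside the translation memory. The following sketch is an assumption about how such a lookup might work, not a description of any of the listed products; the prefix matching is a crude stand-in for real morphological handling, and the function name `find_terms` is hypothetical.

```python
import re

# Small English-German glossary for illustration.
GLOSSARY = {
    "specification": "Spezifikation",
    "component": "Komponente",
    "relation": "Beziehung",
}


def find_terms(segment: str, glossary: dict[str, str]) -> dict[str, str]:
    """Return the glossary entries whose term occurs in the segment."""
    words = re.findall(r"\w+", segment.casefold())
    hits = {}
    for term, translation in glossary.items():
        # Prefix match so that "components" also finds "component".
        if any(word.startswith(term) for word in words):
            hits[term] = translation
    return hits


segment = "Specification of components and their relations"
for term, translation in find_terms(segment, GLOSSARY).items():
    print(f"{term}: {translation}")
```

Because the glossary is a plain mapping, it can be changed and expanded during translation without touching the memory itself.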
Alignment Tool: Example GPL
Typical user interface
• Two windows for source and target language
• Fuzzy index with matches
• Terminology dictionary
TM System: Transolution
• Text to translate
• Suggested translation (from fuzzy match) to be adapted by the translator
• Fuzzy match: explanation of the suggested translation
• Display of the context of the text to translate
TM System: OmegaT
• Text to translate embedded in its context
• Suggested translation (from fuzzy match) and glossary
TM System: Transit
• Text to translate
• Suggested translation (from fuzzy match) to be adapted by the translator
• Fuzzy match: explanation of the suggested translation
• Terminology
Possible User Interface
• Terminology / Glossary
• Search:
• Functions
• Translation:
• Fuzzy match:
  • (45%) Specification of the structure of the software
  • (50%) specification of components and their relations
  • Spezifikation der Struktur der Softwa
  • Spezifikation der Komponenten und ihrer Beziehungen
• Terminology: Specification of components; relation
• Other languages
  • Macedonian: Специфицирање на структурата на софтверот (софтверска архитектура), специфицирање на компонентите и нивните односи
  • Romanian: Specificarea structurii SW (arhitectura software), specificarea componentelor şi a relaţiilor între ele
  • Serbian Cyrillic: Спецификација структуре софтвера (софтверска архитектура), спецификација компоненти и њихових веза
Use of a TM system for localization of teaching materials
• Applicable especially to PPT materials because of
  • The high number of reappearing text segments
  • Few complete sentences, mostly phrases and segments
• Better support for translators → higher productivity
• Not restricted to PPT, open for any other document format
• Possibility of a TM system web service for Software Engineering materials
• Necessary adaptation: not only two languages but many
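The last point, many languages instead of a fixed pair, mainly affects how entries are stored. One possible layout, given here only as a hedged sketch, keeps all language variants of a segment in one translation unit so that any language can serve as source or target. The class `MultilingualMemory` and its methods are hypothetical; the German text is taken from the "Possible User Interface" slide, and the Romanian text is the matching part of that slide's Romanian sentence.

```python
from typing import Optional


class MultilingualMemory:
    """Each unit maps language codes to the variants of one segment."""

    def __init__(self) -> None:
        self._units: list[dict[str, str]] = []

    def add_unit(self, **variants: str) -> None:
        """Store one segment in several languages, e.g. en=..., de=..."""
        self._units.append(dict(variants))

    def translate(self, segment: str, source: str, target: str) -> Optional[str]:
        """Exact lookup from any stored source language to any target language."""
        for unit in self._units:
            if unit.get(source) == segment:
                return unit.get(target)
        return None


memory = MultilingualMemory()
memory.add_unit(
    en="specification of components and their relations",
    de="Spezifikation der Komponenten und ihrer Beziehungen",
    ro="specificarea componentelor şi a relaţiilor între ele",
)

# A German segment can be looked up directly in Romanian, without going via English.
print(memory.translate("Spezifikation der Komponenten und ihrer Beziehungen",
                       source="de", target="ro"))
```

A unit that lacks the requested target language simply yields `None`, which signals that this language variant still has to be translated.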
Thank You. Space for Questions