560 likes | 780 Views
Applied XLIFF. Bryan Schnabel Content Management Architect, Tektronix, Chair, OASIS XLIFF TC @bryanschnabel. I’d like to know how well you know XLIFF . . . The three parts of applied XLIFF. Part 1: A quick tutorial on XLIFF 1.2 Part 2: XLIFF demonstrated with other open standards
E N D
Applied XLIFF Bryan Schnabel Content Management Architect, Tektronix, Chair, OASIS XLIFF TC @bryanschnabel
The three parts of applied XLIFF • Part 1: A quick tutorial on XLIFF 1.2 • Part 2: XLIFF demonstrated with other open standards • XLIFF and DITA • XLIFF and SVG • Part 3: XLIFF implemented and supported in tools • Examples of commercial and open source Localization (L10N) tools that support XLIFF • Examples of non-L10N tools that include XLIFF
XLIFF 1.2 • XML Localization Interchange File Format is the OASIS open standard for the exchange of Localization/Translation content • XLIFF 1.2 specification passed in February 2008 • Specification • XML Schema • Representation Guides
XLIFF features Standardized . . . • Processing translatable text • Leveraging Translation Memory (TM) • Leveraging glossaries • Controlling Segmentation • Managing L10n workflow • Managing word count • Interacting with CAT (Computer Assisted Translation), MT (Machine Translation), TMS (Translation Management System) applications
XLIFF translation model XLIFF supports the extract & merge paradigm, an established model for efficient translation Extract localizable text and format information Open XLIFF file with translation tool Source document XLIFF file Translate text Convert XLIFF file into original format Save translated text into XLIFF Translated document Bilingual XLIFF file
XLIFF translation model • Isolate the translatable text in translation units <trans-unit> <source>Hello</source> <target>Hello</target> <trans-unit> • Retain the source document’s structure • External to the XLIFF file, via a skeleton file - or - • Internally with <group> elements
But first, why DITA? • DITA is the open standard for Topic-based authoring/publishing • Can be nimble at the topic level (single-source, re-use, etc.) • Can offer more precise information to our customers (i.e., web), just-in-time • Breaks us out of the document-centric constraint • Supported by tools; best practices exist
But first, why XLIFF? • XLIFF is the open standard for Translation file interchange • Supported by tools (translation tools, TMS) • Easily understood and processed by LSPs • Encapsulates 10, 100, 1000 topics into a single file/workflow (supported by standard) • Standardized/supported automation with key translation elements (TM, glossaries, word-count, segmentation, etc.)
Problem: lots of DITA files can be tedious to process, 1 by 1
Translating can be difficult or translating can be easy • The Paradox • DITA’s strength its ability to harness many topics for a variety of outputs • DITA’s difficulty for the LSP is its many files • Strategy: • Take advantage of DITA’s map file in order to manage the many topic files • Create an XSLT to read the map file, and compile each of the referenced files to a single XLIFF file
DITA/XLIFF Roundtrip Open Source DITA OT Plugin Demonstration
Choosing the right version of the XLIFF tool Use my DITA OT Plugin http://sourceforge.net/projects/ditaxliff/files/ Do not use my other tool, the document-centric xliffRoundTrip Tool http://sourceforge.net/projects/xliffroundtrip/
A quick sketch of the DITA OT Skip ahead to SVG
DITA-XLIFF DITA OT Step 1 Tell the DITA OT to create an XLIFF file by pointing at the root map file
Result of Step 1 You will receive an XLIFF file with all topics & maps, plus a PDF file
DITA-XLIFF DITA OT Step 2 Translate the XLIFF file into the target language
DITA-XLIFF DITA OT Step 3 Tell the DITA OT to transform the translated XLIFF into a DITA project
Result of Step 3 You will receive a translated DITA project (maps, topics, hierarchy) Skip ahead to SVG
The Classic Problem: Translating Text in Graphics Technical publishers and LSPs have managed to translate text in graphics. But it has been problematic, expensive, and error prone.
Traditional Method 1 • Manually extract strings from each graphic • Provide the text strings to the translator in a word processor, or spreadsheet file • Translator translates text strings • Publisher pastes translated strings into graphic file
Drawbacks of the Copy and Paste Method • This method adds a lot of overhead for the project manager • Requires extensive involvement of the translator, graphic artist, and technical writer • To try to save money and time, sometimes groups attempt to use the copy and paste method with fewer participants. This often leads to errors and restarts.
Mishap 4. Wish for a correct paste, like this 1. Start with English source 2. Copy/Paste into spreadsheet 3. Supply Japanese translation 5. But operator misses the テ
But first, why SVG? • SVG is an XML language for describing two-dimensional graphics and graphical applications. • SVG files can be easily rendered, created, and edited in several commercial and open source software packages • SVG files are XML; that means the text strings are easily identified, extracted, and transformed.
SVG text strings are accessible as <tspan> elements <text transform="matrix(1 0 0 1 249 29.3076)"> <tspan x="0" y="0" font-family="'Myriad-Roman'" font-size="12">Head</tspan></text> <text transform="matrix(1 0 0 1 243 116.3076)"> <tspan x="0" y="0" font-family="'Myriad-Roman'" font-size="12">Neck</tspan></text> <text transform="matrix(1 0 0 1 243 227.8076)"> <tspan x="0" y="0" font-family="'Myriad-Roman'" font-size="12">Body</tspan></text> <text transform="matrix(1 0 0 1 0 92.3076)"> <tspan x="0" y="0" font-family="'Myriad-Roman'" font-size="12">Electronics consists of</tspan> <tspan x="0" y="14.4" font-family="'Myriad-Roman'" font-size="12">three single-coil </tspan> <tspan x="0" y="28.8" font-family="'Myriad-Roman'" font-size="12">pick-ups, which can</tspan> <tspan x="0" y="43.2" font-family="'Myriad-Roman'" font-size="12">be selected over a</tspan> <tspan x="0" y="57.6" font-family="'Myriad-Roman'" font-size="12">rocker switch.</tspan> </text>
Extract the text strings into XLIFF <trans-unit> elements <trans-unit id="d0e445" restype="x-tspan" xmrk:locate_id="0"> <source xml:lang="en">Head</source> <target state="needs-translation" xml:lang="DE">Head</target></trans-unit> <trans-unit id="d0e451" restype="x-tspan" xmrk:locate_id="1"> <source xml:lang="en">Neck</source> <target state="needs-translation" xml:lang="DE">Neck</target></trans-unit> <trans-unit id="d0e457" restype="x-tspan" xmrk:locate_id="2"> <source xml:lang="en">Body</source> <target state="needs-translation" xml:lang="DE">Body</target></trans-unit> <trans-unit id="d0e463" restype="x-tspan" xmrk:locate_id="3"> <source xml:lang="en">Electronics consists</source> <target state="needs-translation" xml:lang="DE">Electronics consists</target></trans-unit> <trans-unit id="d0e466" restype="x-tspan" xmrk:locate_id="4"> <source xml:lang="en">three single-coil</source> <target state="needs-translation" xml:lang="DE">three single-coil</target></trans-unit> <trans-unit id="d0e469" restype="x-tspan" xmrk:locate_id="5"> <source xml:lang="en">pick-ups, which can</source> <target state="needs-translation" xml:lang="DE">pick-ups, which can</target></trans-unit> <trans-unit id="d0e472" restype="x-tspan" xmrk:locate_id="6"> <source xml:lang="en">be selected over a</source> <target state="needs-translation" xml:lang="DE">be selected over a</target></trans-unit> <trans-unit id="d0e475" restype="x-tspan" xmrk:locate_id="7"> <source xml:lang="en">rocker switch.</source> <target state="needs-translation" xml:lang="DE">rocker switch.</target></trans-unit>
Run XLIFF against the existing TMX file to use already translated strings <tu segtype="block" tuid="g7"> <tuv xml:lang="en-US"> <seg>Neck</seg> </tuv> <tuv xml:lang="de-DE"> <seg>Hals</seg> </tuv> </tu> <tu segtype="block" tuid="g8"> <tuv xml:lang="en-US"> <seg>Head</seg> </tuv> <tuv xml:lang="de-DE"> <seg>Kopf</seg> </tuv> </tu> <tu segtype="block" tuid="g9"> <tuv xml:lang="en-US"> <seg>Body</seg> </tuv> <tuv xml:lang="de-DE"> <seg>Korpus</seg> </tuv> </tu> <tu segtype="sentence" tuid="g22"> <tuv xml:lang="en-US"> <seg>Electronics consists three single-coil pick-ups, which can be selected over a rocker switch.</seg> </tuv> <tuv xml:lang="de-DE"> <seg>Die Elektronik besteht aus drei Single-Coil-Tonabnehmern, die über einen Kippschalter angewählt werden können</seg> </tuv> </tu>
Create the translated XLIFF file <trans-unit id="d0e445" restype="x-tspan" xmrk:locate_id="0"> <source xml:lang="en">Head</source> <target state="needs-review-translation" xml:lang="DE">Kopf</target></trans-unit> <trans-unit id="d0e451" restype="x-tspan" xmrk:locate_id="1"> <source xml:lang="en">Neck</source> <target state="needs-review-translation" xml:lang="DE">Hals</target></trans-unit> <trans-unit id="d0e457" restype="x-tspan" xmrk:locate_id="2"> <source xml:lang="en">Body</source> <target state="needs-review-translation" xml:lang="DE">Korpus</target></trans-unit> <trans-unit id="d0e463" restype="x-tspan" xmrk:locate_id="3"> <source xml:lang="en">Electronics consists</source> <target state="needs-review-translation" xml:lang="DE">Die Elektronik besteht aus</target></trans-unit> <trans-unit id="d0e466" restype="x-tspan" xmrk:locate_id="4"> <source xml:lang="en">three single-coil</source> <target state="needs-review-translation" xml:lang="DE">drei Single-Coil-Tonabnehmern, </target></trans-unit> <trans-unit id="d0e469" restype="x-tspan" xmrk:locate_id="5"> <source xml:lang="en">pick-ups, which can</source> <target state="needs-review-translation" xml:lang="DE"> die über einen</target></trans-unit> <trans-unit id="d0e472" restype="x-tspan" xmrk:locate_id="6"> <source xml:lang="en">be selected over a</source> <target state="needs-review-translation" xml:lang="DE"> Kippschalter angewählt</target></trans-unit> <trans-unit id="d0e475" restype="x-tspan" xmrk:locate_id="7"> <source xml:lang="en">rocker switch.</source> <target state="needs-review-translation" xml:lang="DE"> werden können</target></trans-unit>
Then transform the translated XLIFF file to a translated SVG file
Advantage of grouping SVG files into a single XLIFF file <xliff> <file id="strat"> . . . </file> <file id="tele"> . . . </file> <file id="j-bass"> . . . </file> </xliff>
XLIFF with L10N and TMS tools • These are some examples of tools with which I am familiar • There are many more – I do not intend to endorse the following tools any more so, or less so, than the tools I do not mention