
Mastering XML Content Localization & Unicode Representation

Explore XML internationalization, character encoding, language identification, content rendering, and more. Learn about XML Localization Interchange, Unicode deployment, and best practices for global content management.



Presentation Transcript


  1. XML Content Localization and Unicode • 21st IUC, Dublin, Ireland, May 2002 • Ultan Ó Broin (ultan.obroin@oracle.com), Globalization Analyst, Oracle Applications Technology Group

  2. Agenda • XML Internationalization • Character set encoding and representation • Language identification • Content presentation and rendering • Language features • XML Localization • Defining the content • XML Localisation Interchange File Format • Summary

  3. XML Content • “Content, Content, All is Content” (Apologies to Ecclesiastes 1:2 and Lady Margaret Thatcher) • Software user interface strings, help, documentation text, marketing collateral, procurement catalogs … • Any data stored in a database or content management system

  4. XML and Character Set Encoding • Any character set • Encoding Declaration • IANA values • Unicode for global deployment • UTF-8 Default • UTF-8 versus UTF-16 • UTF-16 requires 0xFEFF/0xFFFE Byte Order Mark
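
As a minimal sketch of the encoding points on this slide, the following Python snippet serializes the same small document as UTF-8 and as UTF-16 and checks that the UTF-16 bytes start with a Byte Order Mark. The sample text and the `<msg>` element are illustrative assumptions, not part of the presentation.

```python
import codecs
import xml.etree.ElementTree as ET

text = "Fáilte ™"  # sample content (assumption, for illustration only)

# UTF-8 is the XML default, so no BOM is needed.
utf8_doc = f'<?xml version="1.0" encoding="UTF-8"?><msg>{text}</msg>'.encode("utf-8")

# Python's "utf-16" codec prepends a BOM in the platform's byte order.
utf16_doc = f'<?xml version="1.0" encoding="UTF-16"?><msg>{text}</msg>'.encode("utf-16")

# The BOM is the byte pair FE FF (big-endian) or FF FE (little-endian).
has_bom = utf16_doc[:2] in (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)

# Both byte streams decode to the same character content.
parsed_utf8 = ET.fromstring(utf8_doc).text
parsed_utf16 = ET.fromstring(utf16_doc).text
```

The encoding changes only the bytes on disk; a conforming parser hands the application the same Unicode characters either way.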

  5. XML Character Representation • Unicode character representation • Numeric character references • TM = &#x2122; • Warning about character entities • TM = &trade; • Use Unicode normalized characters
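
The difference between numeric character references and named character entities can be sketched in Python as follows. The sample strings are assumptions, and NFC is assumed as the normalization form; the slide itself only says "normalized".

```python
import unicodedata
import xml.etree.ElementTree as ET

# A numeric character reference needs no DTD: &#x2122; is U+2122 TRADE MARK SIGN.
elem = ET.fromstring("<p>Oracle&#x2122;</p>")
trademark = elem.text[-1]

# A named entity such as &trade; is not predefined in XML, so a parser
# with no DTD declaring it rejects the document.
try:
    ET.fromstring("<p>Oracle&trade;</p>")
    entity_ok = True
except ET.ParseError:
    entity_ok = False

# Normalization: "Ó" may be stored precomposed (U+00D3) or as
# "O" followed by a combining acute accent (U+0301).
decomposed = "O\u0301 Broin"
composed = unicodedata.normalize("NFC", decomposed)
```

Normalizing before storage means that string comparison and searching treat both spellings of "Ó" as the same text.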

  6. Language Expression • Language identifier xml:lang attribute • Declaration <!ATTLIST p xml:lang NMTOKEN #IMPLIED> • Language and country values • Warning about multilingual documents
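
One way to see why multilingual documents need care is that xml:lang is inherited by child elements unless overridden. The sketch below resolves the effective language of each text node; the `<doc>` structure, the `effective_lang` helper, and the language tags are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

doc = ET.fromstring(
    '<doc xml:lang="en-IE">'
    '<p>Database manager</p>'
    '<p xml:lang="ga-IE">Feighlí feasa</p>'
    '</doc>'
)

def effective_lang(root):
    """Map each element's text to its language, inheriting from ancestors."""
    result = {}

    def walk(elem, inherited):
        lang = elem.get(XML_LANG, inherited)
        if elem.text and elem.text.strip():
            result[elem.text] = lang
        for child in elem:
            walk(child, lang)

    walk(root, None)
    return result

langs = effective_lang(doc)
```

The first paragraph has no xml:lang of its own and picks up "en-IE" from the root, while the second overrides it with "ga-IE".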

  7. Presentation and Rendering • Extensible Stylesheet Language (XSL-FO) • Cascading Style Sheets (CSS) • International presentation • Fonts • Quotation marks • Lists • Eliminate conflict with Unicode markup • UTR #20 “Unicode in XML and Other Markup Languages”

  8. Presentation and Rendering • Bi-directional language support

  9. Presentation and Rendering • Language support • Ruby text <h3>Example Ruby text (albeit in English)</h3> <p> <ruby> <rb>This is the Base Language Text Position</rb> <rt>This is the Ruby Language Text Position</rt> </ruby> </p>

  10. Presentation and Rendering • Vertical writing: writing-mode properties <p style="writing-mode: tb;">Example of vertical text</p> • Combined text: XSL and CSS text-combine properties span.kumimoji { text-combine: letters; } span.warichu { text-combine: lines; } • White space delimiters: xml:space attribute • Emphasis: font-emphasis-style and font-emphasis-position properties • Different browsers and operating systems

  11. Presentation and Rendering • Sorting • <xsl:sort/> element • Ascending and descending order • lang attribute • Caution • Numbers • <xsl:number/> • Date and Time • Locale independent • XML/ISO Schema date and time of day values
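
The "locale independent" point for dates can be sketched as storing one ISO 8601 value (the lexical form that XML Schema dateTime is based on) and formatting it only at presentation time. The stored value and the per-locale patterns below are hypothetical; a real application would take its patterns from locale data rather than a hand-written table.

```python
from datetime import datetime

# One stored, locale-independent value in ISO 8601 form.
stored = "2002-05-14T09:30:00+00:00"
moment = datetime.fromisoformat(stored)

# Hypothetical display patterns, for illustration only.
PATTERNS = {
    "en-US": "%m/%d/%Y",
    "en-IE": "%d/%m/%Y",
    "ja-JP": "%Y/%m/%d",
}

# Format the same instant for each locale at rendering time.
rendered = {locale: moment.strftime(pattern) for locale, pattern in PATTERNS.items()}
```

Because the stored value never changes, content can be exchanged between systems without guessing whether 05/14 or 14/05 was meant.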

  12. Localization of XML • Single content format : many media • Authors define data for Localization • Provide DTD or schema definition to the Localization Group

  13. Localization of XML Content • Define information • Localization-friendly element names • Persistent Identifier • Context • Expansion • Localization notes • Non-localizable element names and attributes

  14. Localization of XML Content • XML Localisation Interchange File Format (XLIFF) • Oracle, Novell, IBM/Lotus, Sun Microsystems, Alchemy, Berlitz, LionBridge, Moravia-IT, and the RWS Group • Requires XML conversion (XSLT, other) • Open standard DTD • Designed for the localization process • Localization tools support • SDLx, Trados Tag Editor, Star Transit, Alchemy Catalyst, ForeignDesk or any tool that defines localizable XML elements

  15. XLIFF Example <header> <phase-group> <phase phase-name="translationedit" process-name="translation" date="2002-01-12T12:11:21Z" /> </phase-group> </header>

  16. XLIFF Example <trans-unit id="bigirishcolumn_145" restype="title" maxwidth="90" size-unit="byte"> <source xml:lang="EN">Database manager</source> <target xml:lang="GA">Feighlí feasa</target> <alt-trans> <target xml:lang="GA">Gocamán na ngiotán</target> </alt-trans> <note>The Term Manager means administration tool - not a person</note> </trans-unit>
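
A localization tool might use the maxwidth and size-unit attributes from this example to validate a translation. The sketch below is one possible check, not how any particular tool works; the `fits` helper is hypothetical and UTF-8 is assumed as the storage encoding for the byte count.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the trans-unit from the slide, with straight quotes.
unit = ET.fromstring(
    '<trans-unit id="bigirishcolumn_145" restype="title" '
    'maxwidth="90" size-unit="byte">'
    '<source xml:lang="EN">Database manager</source>'
    '<target xml:lang="GA">Feighlí feasa</target>'
    '</trans-unit>'
)

def fits(unit, encoding="utf-8"):
    """Return (size, within_limit) for the unit's target text.

    When size-unit is "byte", size is the encoded byte length;
    otherwise it is the number of characters.
    """
    target = unit.find("target").text
    limit = int(unit.get("maxwidth"))
    if unit.get("size-unit") == "byte":
        size = len(target.encode(encoding))
    else:
        size = len(target)
    return size, size <= limit

size, ok = fits(unit)
```

The target is 13 characters but 14 bytes in UTF-8 because "í" takes two bytes, which is why a byte-based limit has to be checked against the encoded form rather than the character count.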

  17. Summary XML and Unicode • Unicode for all content • Global storage • XLIFF for suppliers and vendors • One localization tool set • Globalization as a commodity • “XLIFF provides for the separation of content and process. It allows a focus on automation, stops a proliferation of internal XML formats, and turns localization into a commodity for all players. Software publishers focus on producing international products and vendors focus on localizing this content without managing multiple translation tools or file formats.” Paul Quigley, i18n consultant (paul_quigley_ie@hotmail.com)

  18. References • XML Specifications: http://www.w3.org/XML/ • XLIFF: http://www.oasis-open.org/ • Tools, templates and more: http://www.opentag.com • XML Internationalization and Localization by Yves Savourel (ISBN:0-672-32096-7, Jul-2001) • Localization Institute Seminars on XML i18n and l10n: http://www.localizationinstitute.com
