
Open-Source Approaches to Unicode Enablement


Presentation Transcript


  1. Open-Source Approaches to Unicode Enablement Panel Discussion

  2. Agenda • Panel Introductions • Library Descriptions and Demos • What is Open Source? • What is the Open Source experience? • Q and A (Amsterdam, the Netherlands, March 2000)

  3. Today’s Panel • Arnt Gulbrandsen • Bob Verbrugge • Frank Tang • Helena Shih • Mark Leisher • Steven Loomis • Steven Watt • Tex Texin • Yves Arrouye

  6. Library Descriptions and Demos • Troll: Qt Free Edition • CRL: Assorted Unicode Support • Mozilla: International Library of Mozilla • IBM: International Components for Unicode

  7. Troll’s Qt Free Edition Arnt Gulbrandsen Troll Tech

  8. CRL’s Unicode Support Mark Leisher Computing Research Laboratory New Mexico State University

  9. CRL’s Unicode Support • Goal: Provide example resources usable on Unix. • Fonts. • Encoding mapping tables. • Unicode character information. • Algorithms. • Other resources. • Resource availability.

  10. CRL’s Unicode Support • Fonts. Three bitmap fonts in BDF format were developed and made available. • Arabic • Devanagari • ClearlyU

  11. CRL’s Unicode Support • Encoding mapping tables. The Unicode Consortium provides mapping tables for converting many of the more common character sets to Unicode. The CSets archive provides supplementary mapping tables for character sets and encodings that are not supplied by the Unicode Consortium.
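
A mapping table of this kind can be loaded with very little code. The sketch below assumes the plain-text layout commonly used for such tables (a byte value, a Unicode code point, then a `#` comment). The three sample rows are ISO 8859-7 values chosen for illustration, and the names `load_mapping` and `decode` are hypothetical, not part of the CSets archive.

```python
# Sample rows in an assumed "byte<TAB>code point<TAB># name" layout.
SAMPLE_TABLE = """\
# Illustrative excerpt (ISO 8859-7 values), not a complete table.
0x41\t0x0041\t# LATIN CAPITAL LETTER A
0xE1\t0x03B1\t# GREEK SMALL LETTER ALPHA
0xE2\t0x03B2\t# GREEK SMALL LETTER BETA
"""


def load_mapping(text):
    """Parse mapping-table lines into a {byte value: character} dict."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # byte listed without a Unicode mapping
        table[int(fields[0], 16)] = chr(int(fields[1], 16))
    return table


def decode(data, table, replacement="\ufffd"):
    """Convert bytes to a Unicode string, one table lookup per byte."""
    return "".join(table.get(byte, replacement) for byte in data)


TABLE = load_mapping(SAMPLE_TABLE)
print(decode(b"A\xe1\xe2", TABLE))
```

Bytes missing from the table decode to U+FFFD REPLACEMENT CHARACTER here; a real converter might instead raise an error or consult a fallback table.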

  12. CRL’s Unicode Support • Unicode character information. To facilitate development of Unicode-capable software, a simple API and library for character information and partial bi-directional reordering was developed early on, before standardization efforts really gained momentum. It comprises the UCData package and the Pretty Good Bidi Algorithm.
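
To give a feel for what a character information API provides, here is a minimal sketch using Python's standard `unicodedata` module. It is a stand-in for illustration only and does not reproduce the UCData API; the helper name `describe` is hypothetical.

```python
import unicodedata


def describe(ch):
    """Collect a few per-character properties that a bidi or text
    layout routine typically needs."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),          # e.g. Lu, Lo, Mn
        "bidi_class": unicodedata.bidirectional(ch),   # e.g. L, R, NSM
        "combining_class": unicodedata.combining(ch),  # 0 for base characters
    }


# A Latin letter, a Hebrew letter, and a combining accent.
for ch in "A\u05d0\u0301":
    print(describe(ch))
```

The bidirectional class is the input a reordering algorithm works from: runs of `R` characters are displayed right to left, while non-spacing marks (`NSM`) follow the character they attach to.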

  13. CRL’s Unicode Support • Algorithms. To further encourage independent development of Unicode-capable software, a few basic text search algorithms were converted to use Unicode text. These include: • A Boyer-Moore string search routine. • A glob matching routine called Wildmat. • An almost minimal DFA regular expression routine.
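
The main adjustment when moving a Boyer-Moore style search to Unicode is the shift table: a fixed 256-entry array no longer covers the alphabet, so a sparse table is a natural choice. The sketch below is not CRL's routine. It uses the simpler Boyer-Moore-Horspool variant (bad-character rule only), and the function name is hypothetical.

```python
def horspool_search(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1.

    Boyer-Moore-Horspool: only the bad-character shift is used, stored in
    a dict so that any code point can serve as a key.
    """
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    if m > n:
        return -1
    # Shift for each pattern character except the last; later (rightmost)
    # occurrences overwrite earlier ones, giving the smallest safe shift.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        # Characters absent from the pattern allow a full-length jump.
        pos += shift.get(text[pos + m - 1], m)
    return -1


sample = "Ελληνικά και हिन्दी"
print(horspool_search(sample, "हिन्दी"))
```

Python strings are indexed by code point, so this compares code points exactly. It does not treat canonically equivalent sequences as equal; text would have to be normalized first for that.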

  14. CRL’s Unicode Support • Other resources. Some of the other resources made available by CRL are: • Code to test wchar_t type support in C/C++ compilers. • Keyboard arrangements for various languages that have been collected over the years. • Resource Availability. All of the resources mentioned are freeware and can be found at http://crl.nmsu.edu/~mleisher/.

  15. International Library for Mozilla Frank Tang Netscape Communications Mozilla

  16. International Components for Unicode (ICU) Helena Shih and Steven Loomis IBM Unicode Technology Center

  17. Unicode Support in the Industry • Lack of a complete set of features in most implementations. • Inconsistent behavior across environments (Win32 vs. POSIX, for example). • Poor portability. • Unable to share resources with other products. • Almost no extensibility or customization. • Not a concern for most companies when a product is first designed.

  18. [Diagram labels: ICU • Netfinity Server • Apple G3 Macintosh • IBM’s DB/2 Product • AS/400 e-Server 720 • Microsoft NT Workstation • World Wide Web • Sun Ultra 60 Workstation • S/390 Server]

  19. ICU Objectives • Quality Unicode & I18N support across platforms • Consistent results in both C/C++ and Java • Powerful, portable API available to the Open-Source development community • Important resource-sharing mechanism • Outside feedback & contributions improve quality and feature set

  20. ICU Features • Parallel to the i18n architecture in JDK • All components multi-thread safe • Full Unicode string manipulation • Complete locale support (more than 145 locales) • Fast and flexible character set conversion • Efficient data loading mechanism • Hierarchical resource bundles with Unicode data • Extensive calendar and timezone support • Date, time, currency, number and message formatting
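
The idea behind hierarchical resource bundles can be sketched in a few lines: a lookup starts at the most specific locale and falls back through its parents to a root bundle. This is an illustration of the concept only, not ICU's API or data format. The bundle contents and the names `BUNDLES`, `fallback_chain` and `lookup` are hypothetical.

```python
# Hypothetical bundles: each locale only stores what differs from its parent.
BUNDLES = {
    "root": {"greeting": "Hello", "farewell": "Goodbye"},
    "nl": {"greeting": "Hallo", "farewell": "Tot ziens"},
    "nl_BE": {"greeting": "Dag"},
}


def fallback_chain(locale):
    """List the bundles to search, most specific first, ending at root."""
    if locale == "root":
        return ["root"]
    parts = locale.split("_")
    chain = ["_".join(parts[:i]) for i in range(len(parts), 0, -1)]
    return chain + ["root"]


def lookup(locale, key):
    """Return the first value found along the fallback chain."""
    for name in fallback_chain(locale):
        bundle = BUNDLES.get(name, {})
        if key in bundle:
            return bundle[key]
    raise KeyError(key)


print(fallback_chain("nl_BE"))
print(lookup("nl_BE", "greeting"), "/", lookup("nl_BE", "farewell"))
```

Because children inherit from their parents, a regional bundle stays small, and a locale with no bundle at all still resolves to the root values.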

  21. ICU Features • Locale sensitive sorting (including Thai) • Locale sensitive text boundary detection • Customizable transliteration interface • Unicode text compression algorithm • Fast and compliant Unicode 3.0 Bidi algorithm • Unicode 3.0 normalization support • Most up-to-date Unicode 3.0 character properties
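
Normalization matters because the same visible text can be stored as different code point sequences. The sketch below shows the effect using Python's standard `unicodedata` module as a stand-in. It does not use ICU, and Python ships data for a later Unicode version than 3.0. The helper name `canonically_equal` is hypothetical.

```python
import unicodedata

composed = "\u00e9"     # "é" as a single precomposed code point
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT


def canonically_equal(a, b):
    """Compare two strings after bringing both to the same normal form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(composed == decomposed)                   # raw code point comparison
print(canonically_equal(composed, decomposed))  # comparison after NFC
print(len(unicodedata.normalize("NFD", composed)))
```

Searching, sorting and comparison routines generally assume their inputs are in one agreed form, which is why a normalization step usually sits in front of them.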

  22. Platform Support • Reference Platforms: • AIX • OS/390 • AS/400 • RedHat Linux • Solaris • Windows 98, NT4.0 and Win2000 • HP-UX • Working Partners: Sun, IBM, NCR, Xerox, Netscape, Progress, RealNames, Versant, Compuware, GlobalSight, Hotmail, Lotus ...

  23. ICU Documentation • API Documentation • Updated from header files (like javadoc) • Available on external web site • User Guide • Work in progress, feedback welcome • Initial draft available

  24. ICU4J - ICU for Java • IBM developed extensive I18N library • I18N code added to Java JDK 1.1 • Java code ported to C++ -> ICU • ICU available on alphaWorks • Both ICU and Java classes continue development • Sometimes “leapfrogging” each other with features • ICU open source, moves to developerWorks • March 2000: Java code open-sourced as “ICU4J”

  25. ICU4J Features • Builds on Java 2 feature set • Feature summary: • Advanced text boundary detection • Calendars: Hebrew, Hijri/Islamic, Japanese Gengou, Thai Buddhist • Spelled-out numbers • Normalization • Transliteration • Standard Unicode compression

  26. Reference Information • ICU Web Sites • http://oss.software.ibm.com/icu/ • developerWorks Unicode site • http://www.ibm.com/developer/unicode/ • The Unicode Standard • http://www.unicode.org/ • developerWorks Java site • http://www.ibm.com/developer/java/

  27. Demos • Locale Explorer • xliterate-It! • Qt Demo

  28. Agenda • Panel Introductions • Library Descriptions and Demos • What is Open Source? • What is the Open Source experience? • Q and A

  29. ICU OpenSource Objectives • Promotes a cross-platform Unicode strategy • Produces a Unicode technology implementation • Supports important OpenSource products: Linux, Apache, Mozilla, XML, etc.

  30. Open-Source Models • The Apache model • Web access for CVS repository • Technical committees • Developer community support • icu4c@us.ibm.com support account • news.alphaworks.ibm.com discussion newsgroup • Commercial product partnership • RealNames, Versant, GE ...

  31. Open-Source Models • The Troll Tech model • Free and Professional Editions • Distinguish private, open source use from commercial, closed source use • All contributions accepted and used in both versions. • Source updated daily

  32. Why contribute to Open Source? • Bob Verbrugge: • Requires robust I18n and portability • Implementing alone, cost is considerable • Sharing development is cost effective • Shared knowledge with experts • Ability to influence the end-result

  33. Why contribute to Open Source? • Steve Watt: • Requires portability and interoperability • Upgrading existing library to Unicode version 3.0 is a sizable effort • Commercial libraries did not meet our needs • Shared effort means our development focus is now aligned with our needs

  34. Why contribute to Open Source? • Steve Watt’s concerns: • Giving away proprietary technology • Design by committee • Will release schedules fit product schedules? • Will library and product stay in sync? • Do all participants have common objectives?

  35. Why contribute to Open Source? • Yves Arrouye: • Share expertise, give something • Benefits from features developed by others • Normalization, optimized algorithms • Character set conversions • Access to source code • Using multiple Open Source products

  36. Why contribute to Open Source? • Yves Arrouye’s concerns: • Management perceptions: “If it’s free, it must be for play…” • Entry requirements and qualifications to be able to affect direction or design • Patch integration, release control and schedules • Build stability

  37. Agenda • Panel Introductions • Library Descriptions and Demos • What is Open Source? • What is the Open Source experience? • Q and A
