Open-Source Approaches to Unicode Enablement Panel Discussion
Agenda • Panel Introductions • Library Descriptions and Demos • What is Open Source? • What is the Open Source experience? • Q and A Amsterdam, the Netherlands, March 2000
Today’s Panel • Arnt Gulbrandsen • Bob Verbrugge • Frank Tang • Helena Shih • Mark Leisher • Steven Loomis • Steven Watt • Tex Texin • Yves Arrouye
Library Descriptions and Demos • Troll: Qt Free Edition • CRL: Assorted Unicode Support • Mozilla: International Library of Mozilla • IBM: International Components for Unicode
Troll’s Qt Free Edition Arnt Gulbrandsen Troll Tech
CRL’s Unicode Support Mark Leisher Computing Research Laboratory New Mexico State University
CRL’s Unicode Support • Goal: Provide example resources usable on Unix. • Fonts. • Encoding mapping tables. • Unicode character information. • Algorithms. • Other resources. • Resource availability.
CRL’s Unicode Support • Fonts. Three bitmap fonts in BDF format were developed and made available. • Arabic • Devanagari • ClearlyU
CRL’s Unicode Support • Encoding mapping tables. The Unicode Consortium provides mapping tables for converting many of the more common character sets to Unicode. The CSets archive provides supplementary mapping tables for character sets and encodings that are not supplied by the Unicode Consortium.
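The table-driven conversion described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from CRL or the CSets archive: the three sample rows are KOI8-R entries written in the "byte, code point, comment" layout used by the Unicode Consortium's mapping files, and the names `SAMPLE_TABLE`, `load_table`, and `to_unicode` are hypothetical. Treating ASCII bytes as mapping to themselves is an assumption of this sketch.

```python
# Illustrative rows in the "0xXX<TAB>0xXXXX<TAB># name" layout used by the
# Unicode Consortium's mapping files; these three are KOI8-R entries.
SAMPLE_TABLE = """\
# byte\tUnicode\tname
0xC1\t0x0430\t# CYRILLIC SMALL LETTER A
0xC2\t0x0431\t# CYRILLIC SMALL LETTER BE
0xD7\t0x0432\t# CYRILLIC SMALL LETTER VE
"""


def load_table(text):
    """Parse mapping-table text into a {byte value: Unicode character} dict."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        byte_hex, code_hex = line.split()[:2]
        table[int(byte_hex, 16)] = chr(int(code_hex, 16))
    return table


def to_unicode(data, table):
    """Convert bytes to a Unicode string using the table.

    Assumption of this sketch: ASCII bytes map to themselves, and any other
    unmapped byte becomes U+FFFD (REPLACEMENT CHARACTER).
    """
    return "".join(
        table.get(b, chr(b) if b < 0x80 else "\ufffd") for b in data
    )


table = load_table(SAMPLE_TABLE)
print(to_unicode(b"\xc1\xc2\xd7", table))
```

A real converter would load the complete table for the character set and would also need the reverse mapping to convert from Unicode back to the legacy encoding.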
CRL’s Unicode Support • Unicode character information. To facilitate development of Unicode-capable software, a simple API and library for character information and partial bidirectional reordering were developed early on, before standardization efforts really gained momentum. These are the UCData package and the Pretty Good Bidi Algorithm.
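As a rough illustration of the kind of per-character information such a library exposes, the sketch below uses Python's standard `unicodedata` module as a stand-in. It is not the UCData API; the helper name `describe` is hypothetical. The bidirectional class shown here is the property a bidi reordering algorithm takes as its input.

```python
import unicodedata


def describe(ch):
    """Return a few Unicode Character Database properties for one character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),         # e.g. Lu, Nd, Lo
        "bidi_class": unicodedata.bidirectional(ch),  # e.g. L, R, AL, EN
    }


# Latin capital A, digit one, Hebrew alef, Arabic alef
for ch in "A1\u05d0\u0627":
    print(describe(ch))
```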
CRL’s Unicode Support • Algorithms. To further encourage independent development of Unicode-capable software, a few basic text search algorithms were converted to use Unicode text. These include: • A Boyer-Moore string search routine. • A glob matching routine called Wildmat. • An almost minimal DFA regular expression routine.
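To show what converting a string search to Unicode text involves, here is a minimal sketch of the Horspool simplification of Boyer-Moore, operating on code points. It is not CRL's routine, and the function name `horspool_search` is hypothetical. The main change from a byte-oriented version is the shift table: a 256-entry array is no longer enough, so this sketch uses a sparse dictionary keyed by character.

```python
def horspool_search(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1.

    Boyer-Moore-Horspool: after each failed alignment, skip ahead based on
    the text character aligned with the last pattern position.
    """
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # Bad-character shifts for every pattern character except the last one.
    # A dict keeps the table sparse across the whole Unicode range.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        # Characters absent from the table allow a full-length skip.
        pos += shift.get(text[pos + m - 1], m)
    return -1


sample = "Unicode: Ελληνικά, кириллица, עברית"
print(horspool_search(sample, "кириллица"))
```

Python strings are sequences of code points, so the sketch sidesteps surrogate pairs; a C implementation working on UTF-16 would have to decide whether to search by code unit or by code point.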
CRL’s Unicode Support • Other resources. Some of the other resources made available by CRL are: • Code to test wchar_t type support in C/C++ compilers. • Keyboard arrangements for various languages that have been collected over the years. • Resource Availability. All of the resources mentioned are freeware and can be found at http://crl.nmsu.edu/~mleisher/.
International Library for Mozilla Frank Tang Netscape Communications Mozilla
International Components for Unicode (ICU) Helena Shih and Steven Loomis IBM Unicode Technology Center
Unicode Support in the Industry • Lack of a complete set of features in most implementations. • Inconsistent across different environments (Win32 vs. POSIX, for example). • Poor portability. • Unable to share the resources with other products. • Almost no extensibility and customization. • Not a concern for most companies when a product is first designed.
[Diagram: ICU shared across platforms and products. Labels: Netfinity Server, Apple G3 Macintosh, IBM’s DB2 product, AS/400 e-Server 720, Microsoft NT Workstation, World Wide Web, Sun Ultra 60 Workstation, S/390 Server]
ICU Objectives • Quality Unicode & I18N support across platforms • Consistent results in both C/C++ and Java • Powerful, portable API available to the Open-Source development community • Important resource-sharing mechanism • Outside feedback & contributions improve quality and feature set
ICU Features • Parallel to the i18n architecture in JDK • All components multi-thread safe • Full Unicode string manipulation • Complete locale support (more than 145 locales) • Fast and flexible character set conversion • Efficient data loading mechanism • Hierarchical resource bundles with Unicode data • Extensive calendar and timezone support • Date, time, currency, number and message formatting
ICU Features • Locale sensitive sorting (including Thai) • Locale sensitive text boundary detection • Customizable transliteration interface • Unicode text compression algorithm • Fast and compliant Unicode 3.0 Bidi algorithm • Unicode 3.0 normalization support • Most up-to-date Unicode 3.0 character properties
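The normalization support listed above addresses the fact that the same text can be encoded as different code point sequences. The sketch below illustrates the idea with Python's standard `unicodedata` module rather than the ICU API, and the helper `canonically_equal` is hypothetical. Note that Python ships a later Unicode version than the 3.0 data mentioned on the slide.

```python
import unicodedata

decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
composed = "\u00e9"     # LATIN SMALL LETTER E WITH ACUTE


def canonically_equal(a, b):
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(decomposed == composed)                   # raw comparison
print(canonically_equal(decomposed, composed))  # comparison after NFC
```

Sorting, searching, and string comparison routines typically normalize first (or normalize incrementally) so that canonically equivalent strings behave identically.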
Platform Support • Reference Platforms: • AIX • OS/390 • AS/400 • Red Hat Linux • Solaris • Windows 98, NT 4.0 and Windows 2000 • HP-UX • Working Partners: Sun, IBM, NCR, Xerox, Netscape, Progress, RealNames, Versant, Compuware, GlobalSight, Hotmail, Lotus ...
ICU Documentation • API Documentation • Updated from header files (like javadoc) • Available on external web site • User Guide • Work in progress, feedback welcome • Initial draft available
ICU4J - ICU for Java • IBM developed an extensive I18N library • I18N code added to Java JDK 1.1 • Java code ported to C++ -> ICU • ICU available on alphaWorks • Both ICU and Java classes continue development • Sometimes “leapfrogging” each other with features • ICU becomes open source, moves to developerWorks • March 2000: Java code becomes open source as “ICU4J”
ICU4J Features • Builds on Java 2 feature set • Feature summary: • Advanced text boundary detection • Calendars: Hebrew, Hijri/Islamic, Japanese Gengou, Thai Buddhist • Spelled-out numbers • Normalization • Transliteration • Standard Unicode compression
Reference Information • ICU Web Sites • http://oss.software.ibm.com/icu/ • developerWorks Unicode site • http://www.ibm.com/developer/unicode/ • The Unicode Standard • http://www.unicode.org/ • developerWorks Java site • http://www.ibm.com/developer/java/
Demos • Locale Explorer • xliterate-It! • Qt Demo
Agenda • Panel Introductions • Library Descriptions and Demos • What is Open Source? • What is the Open Source experience? • Q and A
ICU Open Source Objectives • Promotes a cross-platform Unicode strategy • Produces a Unicode technology implementation • Supports important Open Source products: Linux, Apache, Mozilla, XML, etc.
Open-Source Models • The Apache model • Web access for CVS repository • Technical committees • Developer community support • icu4c@us.ibm.com support account • news.alphaworks.ibm.com discussion newsgroup • Commercial product partnership • RealNames, Versant, GE ...
Open-Source Models • The Troll Tech model • Free and Professional Editions • Distinguish private, open source use from commercial, closed source use • All contributions accepted and used in both versions. • Source updated daily
Why contribute to Open Source? • Bob Verbrugge: • Requires robust I18n and portability • Implementing alone, cost is considerable • Sharing development is cost effective • Shared knowledge with experts • Ability to influence the end-result
Why contribute to Open Source? • Steve Watt: • Requires portability and interoperability • Upgrading existing library to Unicode version 3.0 is a sizable effort • Commercial libraries did not meet our needs • Shared effort means our development focus is now aligned with our needs
Why contribute to Open Source? • Steve Watt’s concerns: • Giving away proprietary technology • Design by committee • Will release schedules fit product schedules? • Will library and product stay in synch? • Do all participants have common objectives?
Why contribute to Open Source? • Yves Arrouye: • Share expertise, give something • Benefits from features developed by others • Normalization, optimized algorithms • Character set conversions • Access to source code • Using multiple Open Source products
Why contribute to Open Source? • Yves Arrouye’s concerns: • Management perceptions: “If it’s free, it must be for play…” • Entry requirements and qualifications to be able to affect direction or design • Patch integration, release control and schedules • Build stability
Agenda • Panel Introductions • Library Descriptions and Demos • What is Open Source? • What is the Open Source experience? • Q and A