This Mecca We Call Kansas: Internationalisation and Localisation of Library Systems Interfaces. Susan M. Johns 1999 CULS Conference. Pittsburg State University Axe Library Pittsburg KS USA suzyq@mail.pittstate.edu. February 6. April 12. April 23. May 5. May 17. July 10.
Cross-Cultural Communication • Develop user interfaces for products with a global market • When outsourcing to other countries, we work and communicate with people we have never met in person • Work culture values and views differ from our own
“Sundials perform as clocks in sunny climates -- they are more useful in Phoenix than in Boston and of no use at all during the Arctic winter.” Herbert Simon, The Sciences of the Artificial, MIT Press, 1981
The Tale of Three Interfaces (Nakakoji, 1996) • There are no generic cultural guidelines • Issues cannot be solved by using generalized characterizations of user populations, and ... • Unless representations are mathematical, there is always a risk of misunderstanding in human communication, and...
The Tale of Three Interfaces (Nakakoji, 1996) • Do users know what they want? • Do users recognize what they have designed (or requested)? • Is the user the best indicator the vendor has for developing the best design?
“Don’t boil the ocean.” Malcolm Frank, Be Quick or Be Dead, Software Magazine, March 1997
The International Need • Customers want systems that use their own language and meet their own cultural conventions • Some countries require products to reflect their culture and language • Internationally competitive companies must consider cultural preferences of their customers
PeopleSoft Goes Global • Global customers have more in common than differences • Vendor must understand what is different and what is similar • Everybody (vendors) is “Embarking”
What is Internationalisation • The process of providing a computer system that handles a variety of language, country, and cultural conventions
Internationalisation (I18N) • Eliminate cultural specifics • Design culture-independent user information and interfaces
What is Localisation • A locale is an operating system database of language and country conventions • Developing software to support multiple locales
Localisation (L10N) • Localisation of product for each user culture • Language, date and number formats • Graphical representations/icons • Color • Physical flow of objects
System I18N • Uses multilingual products instead of monolingual or bilingual products • Allows switching between different locales and languages • Provides software that meets international standards
System I18N Challenges • Treat English as just another language • Use one program source for all languages to reduce costs for maintenance and documentation
System I18N Challenges • Plan for extra disk space needed. To save space, ship only the languages purchased by a customer • What is the delay from when the package is available in the vendor’s local country to when it is available in other languages?
System I18N Challenges • Monitor acronyms and mnemonics for negative meanings in different languages • Understand differences among U.S., British, and global English • Be aware of different dialects in the same language
System I18N Challenges • Use care when sorting lists • Use numeric indexes instead of sorted alphabetic indexes whenever possible • Keep illustrations, tables, and figures simple • Verify translations back into English
Standards and the World of Uni- and Zed- • Unicode • UNIMARC • Z39.50 • Z39.69 • Z39.70 • Zzzzz...
History of Unicode • ASCII, a “U.S.” standard (ISO 646) • DBCS - double-byte character set (some chars 1 byte, some 2 bytes) • Unicode - all chars 2 bytes (16 bits)
History of Unicode • Unicode is a subset of ISO 10646, as are ASCII and Latin-1 (8-bit ASCII) • Unicode eliminates duplicate Han characters in Chinese, Japanese and Korean (CJK) • ISO 10646 stores chars in 4 bytes; Unicode stores chars in 2 bytes
Unicode Problems • Universal standards for dates, measurements, and money • Simplified encoding of Chinese characters does not depict “classical” Chinese • Storage (twice as much?) • Transmissions (twice as long?)
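The storage question above can be checked directly. This is a minimal sketch, assuming Python's built-in codecs stand in for the encodings on the slide: `utf-16-le` for 2-byte Unicode and `latin-1` for 8-bit text. UTF-8 is not mentioned on the slide and is included only for comparison; the function name and sample strings are illustrative.

```python
def encoded_sizes(text):
    """Return the byte length of text under several encodings.

    A value of None means the encoding cannot represent the text at all.
    """
    sizes = {}
    for name in ("latin-1", "utf-16-le", "utf-8"):
        try:
            sizes[name] = len(text.encode(name))
        except UnicodeEncodeError:
            sizes[name] = None
    return sizes


english = encoded_sizes("Library")   # 7 Latin letters
chinese = encoded_sizes("图书馆")     # 3 Han characters
```

For plain English text the 2-byte form does take twice the space of the 8-bit form, while the Han characters cannot be stored in the 8-bit form at all.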
UNIMARC Definition • implementation of ISO 2709 for the structure of records containing bibliographic data • intended to be a carrier format for exchange purposes • does not stipulate form, content, or record structure of data *within* individual systems
UNIMARC Problems • Software developers must rewrite their existing software • The existing MARC formats use a unique definition of extended ASCII • How do you convert 40 million MARC records without anyone noticing?
UNIMARC Benefits • Allows addition of foreign titles without transliterating the data • Users able to search library catalogs in all languages rather than just by call number or ISBN • Assumes software/virtual keyboards and other input devices needed to generate the CJK characters
Sorting and Conditional Formatting • English: A-Z, a-z • German: Characters with an umlaut sort directly after characters without an umlaut • Swedish: Ö sorts last in the alphabet after Z • Spanish: double characters (ll and ch) that sort as single characters
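The German and Swedish rules above can be sketched as two sort keys. This is a simplified illustration, not a full collation implementation: real systems would normally rely on a collation library, and the function names and word list here are hypothetical.

```python
import unicodedata

# Swedish alphabet order: å, ä, ö are separate letters placed after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"


def german_key(word):
    """Sort umlauted letters with their base letter (ö sorts next to o)."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # The original word breaks ties, so "Ofen" comes just before "Öfen".
    return (base, word)


def swedish_key(word):
    """Sort by position in the Swedish alphabet (ö comes after z)."""
    composed = unicodedata.normalize("NFC", word.lower())
    return [
        SWEDISH_ALPHABET.index(ch) if ch in SWEDISH_ALPHABET
        else len(SWEDISH_ALPHABET) + ord(ch)  # unknown characters sort last
        for ch in composed
    ]


words = ["Zebra", "Öl", "Ofen"]
german_order = sorted(words, key=german_key)    # Ofen, Öl, Zebra
swedish_order = sorted(words, key=swedish_key)  # Ofen, Zebra, Öl
```

The same three words come out in a different order for each locale, which is why a single hard-coded sort routine cannot serve every catalogue.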
Other Issues • Upper and lower case, subtract 32 no more! • Wild card symbols in search/find boxes • Hyphenation of long words and word breaks • Gender in language • Tense and case
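The "subtract 32" remark refers to the old ASCII trick for uppercasing a letter. A small sketch, with a hypothetical `ascii_upper` helper, shows where the trick stops working and a Unicode-aware conversion is needed:

```python
def ascii_upper(text):
    """Old ASCII trick: subtract 32 from the code of each a-z letter."""
    return "".join(
        chr(ord(ch) - 32) if "a" <= ch <= "z" else ch
        for ch in text
    )


# Letters outside a-z are left untouched by the ASCII trick.
naive_swedish = ascii_upper("öl")       # "öL"
correct_swedish = "öl".upper()          # "ÖL"

# Some conversions even change the length of the string.
naive_german = ascii_upper("straße")    # "STRAßE"
correct_german = "straße".upper()       # "STRASSE"
```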
Message Catalogs • Files used to store program input and output strings • All program strings used interactively by the user should be contained in one or more message catalogs • Messages stored in database locales • Makes messages more customizable
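The message catalog idea above can be sketched with plain dictionaries. This is an assumption-laden illustration: the `CATALOGS` table, the keys, and the `message` function are hypothetical, and a production system would more likely load catalogs from files or a database as the slide suggests.

```python
# One catalog per language; the program refers to messages only by key.
CATALOGS = {
    "en": {"search": "Search", "renew": "Renew", "due_date": "Due date"},
    "de": {"search": "Suchen", "renew": "Verlängern"},
}


def message(key, language, default_language="en"):
    """Look up a user-visible string, falling back to the default language."""
    catalog = CATALOGS.get(language, {})
    if key in catalog:
        return catalog[key]
    # Fall back to the default catalog, then to the key itself.
    return CATALOGS[default_language].get(key, key)


label = message("search", "de")       # translated string
fallback = message("due_date", "de")  # not yet translated, uses default
```

Because no string is hard-coded in the program logic, adding a language means adding a catalog rather than editing source code. The fallback language is a parameter, so English is not given special status.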
Menu Space • 30-200% extra space depending on the number of English characters • Ex: “Preferences” translates to “Bildschirmeinstellungen” • Boxes should be self-sizing and movable
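The expansion in the example above can be worked out with a short calculation. The helper name is hypothetical, and counting characters is only a rough proxy for on-screen width, which also depends on the font.

```python
def expansion_percent(source, translated):
    """Return how much longer the translated label is, in percent."""
    extra = len(translated) - len(source)
    return round(extra / len(source) * 100)


# 11 characters in English, 23 in German.
growth = expansion_percent("Preferences", "Bildschirmeinstellungen")
```

Here the German label needs roughly 109% more space than the English one, well inside the 30-200% range quoted on the slide.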
Conventions and Format Differences • Dates: May 12, 1959 is 12/5/59; 5/12/59; 1959-05-12 • Calendars: Gregorian, Hebrew, Islamic, Japanese Imperial Era • Times: 8:32 p.m. is 20:32; 20,32,00; 20.32; kl 20.32
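The three date renderings above can be produced from one stored value. This is a minimal sketch; the `format_date` function and its style names are hypothetical, and a real system would usually take the pattern from the user's locale settings.

```python
from datetime import date


def format_date(d, style):
    """Render one date in a named convention."""
    yy = f"{d.year % 100:02d}"  # two-digit year
    if style == "day-month-year":
        return f"{d.day}/{d.month}/{yy}"
    if style == "month-day-year":
        return f"{d.month}/{d.day}/{yy}"
    if style == "iso":
        return d.isoformat()
    raise ValueError(f"unknown style: {style}")


birthday = date(1959, 5, 12)
renderings = [
    format_date(birthday, "day-month-year"),
    format_date(birthday, "month-day-year"),
    format_date(birthday, "iso"),
]
```

Storing the date once and formatting it only at display time avoids the ambiguity between 12/5/59 and 5/12/59.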
Conventions and Format Differences • Numbers: 3,912.45; 3.912,45; 3 912,45 • Currency: $2,456.78; 2.456,78 DM; 2.456$78 • Don’t forget £ and ¥ • Paper sizes: A3, A4, A5, JIS-B4, JIS-B5 • Punctuation: « » ; ¡ ¿
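The number formats above differ only in their grouping and decimal separators, which suggests keeping the separators as data. A small sketch with a hypothetical `format_number` helper:

```python
from decimal import Decimal


def format_number(value, group_sep, decimal_sep):
    """Format value with two decimals using the given separators."""
    us_style = f"{Decimal(value):,.2f}"  # comma grouping, point decimal
    integer_part, fraction = us_style.split(".")
    return integer_part.replace(",", group_sep) + decimal_sep + fraction


amount = "3912.45"
comma_point = format_number(amount, ",", ".")
point_comma = format_number(amount, ".", ",")
space_comma = format_number(amount, " ", ",")
```

The value is passed as a string and held as a `Decimal` so that no binary floating-point rounding creeps into the displayed digits.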
Icons • Trashcan icon can look like a postal box in Britain • If you use books, make sure they open in the proper direction for the target market • Email icon of a rural post box with a red flag has no meaning outside rural America
Icons • Colors within icons may be culturally insensitive • Try not to use text: think in terms of international driving symbols • Think: what is the symbol for ISBN other than ISBN?
Formats for Patrons: Z39.69 and Z39.70 • NISO standards for patron personal data and patron transaction data • I18N and L10N aspects of patron data need to be considered • Not limited to address, postal code, phone, ID, and confidentiality issues around the world
Serial Implications • Summer and Winter issues require different check-in patterns in the Southern Hemisphere, including where the volume starts • Vendor information needs to correctly identify currency and diacritically correct mailing information • Donations: how to begin to represent CJK subscriptions and show access?
Acquisition Implications • Currency - ability to pay an invoice with multiple types of currency depending both on publisher and on funding source • Currency - need more than two digits to the right of the decimal • Diacritically correct vendor names • Shipping addresses meeting multiple country postal format regulations
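The point about needing more than two digits to the right of the decimal can be sketched by storing the number of minor-unit digits per currency. The table below is a small illustrative subset and the `round_to_currency` helper is hypothetical; the digit counts shown (2 for USD, 0 for JPY, 3 for KWD) follow ISO 4217.

```python
from decimal import Decimal, ROUND_HALF_UP

# Digits to the right of the decimal point, per currency code.
MINOR_UNITS = {"USD": 2, "JPY": 0, "KWD": 3}


def round_to_currency(amount, currency):
    """Round an amount to the precision its currency actually uses."""
    digits = MINOR_UNITS[currency]
    quantum = Decimal(1).scaleb(-digits)  # 0.01, 1, or 0.001
    return Decimal(amount).quantize(quantum, rounding=ROUND_HALF_UP)


invoice = "1234.5678"
usd = round_to_currency(invoice, "USD")
jpy = round_to_currency(invoice, "JPY")
kwd = round_to_currency(invoice, "KWD")
```

A fund accounting field fixed at two decimal places would silently lose the third digit of the KWD amount.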
Acquisition Implications • Ability to assess VAT (UK) and GST (Australia), particularly at the voucher level • Exchange rate verified on a daily basis and indicated as such
Circulation Implications • Patron names diacritically correct • Telecirc pronunciations phonetically correct • Ability to pay fines and fees in multiple currencies • Due dates/due times in multiple formats • Acceptance of ID digitized photos as part of patron record
Cataloging Implications • UNIMARC accessibility in all indexes • Ability to edit all diacritics with keyboard or pen input (CJK) • Ability to load multiple UNIMARC formats with minimal impact on profiles • Ability to retro diacritics back into records which no longer have them
Cataloging Implications • Subject terms are not just “alternatives”, but equivalents (e.g., Railroads is US-speak for Railways) • Frames of reference regarding name formats (the English student will use J.I.M. Stewart to find Michael Innes; the detective will go the other way round)
Cataloging Implications • AACR3? • Off-standard data which has the option to remain separate from standardised databases like OCLC, ABN, etc. • Global mapping from cross references would allow local choice of headings over incoming catalogue copy from non-local sources
Cataloging Implications • There is more than one National Library in the World