
Internationalization



Presentation Transcript


  1. Internationalization

  2. Contents • Internationalization • Language rendering • Rendering objects • Cultural issues

  3. Globalization • The process of worldwide economic, political, technological, and social integration • The process of making the necessary technical, managerial, personnel, marketing and other enterprise decisions to support localization

  4. Internationalization • The adaptation of products for use throughout the globe • Supporting multiple languages • Supporting multiple character sets • Supporting different formats for numbers, dates, currency, etc. • Printing on different paper sizes

  5. Localization • The addition of special features to allow a product to be used in a specific locale • Local language support • Local currency support • Local cultural concerns • Local symbols • Order of sorting

  6. The Need for I18N • 8-10% of the world's population uses English as its primary language • Even in the USA large fractions of the population use other languages • Miami – 78% • Los Angeles – 45% • San Francisco – 42% • New York City – 25% of subway riders speak no English

  7. The Need for I18N • US based Palm Computing has a 68% share of the Latin American market • Personal computer suppliers • USA – 38.8% • Europe – 25% • Asia – 12% • Huge amounts of sales are now for the international markets

  8. Contents • Internationalization • Language rendering • Rendering objects • Cultural issues

  9. Language Rendering • The rendering of language is one of the most visible aspects of software • One common misconception is that each country has one language • Often one country will have several languages and one language will be spoken in several countries

  10. Language Rendering • For example • Canada – French & English • Belgium – Dutch & French • Switzerland – Italian, French, & German • In addition, there are regional differences in the languages • E.g. U.S. English differs from British English in spelling and in some vocabulary

  11. Language Rendering • As a result, language is usually specified by both the country and the language • EN.US – American English • FR.CA – Canadian French • Sometimes flags are used to indicate which language to select • This is a poor choice due to the lack of a one-to-one relationship between countries and languages
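
The language-plus-country convention above can be sketched in a few lines of Python. This is only an illustration: the `LocaleTag` type and `parse_locale` function are hypothetical names, and the parser simply accepts the dotted form used on the slide alongside the hyphen and underscore separators that many platforms use.

```python
from typing import NamedTuple, Optional


class LocaleTag(NamedTuple):
    language: str            # lowercase language code, e.g. "fr"
    region: Optional[str]    # uppercase country code, e.g. "CA", or None


def parse_locale(tag: str) -> LocaleTag:
    """Split a tag such as 'EN.US', 'fr-CA' or 'fr_CA' into language and country."""
    # Normalize the separators we accept to a single one.
    for sep in ("_", "."):
        tag = tag.replace(sep, "-")
    parts = tag.split("-")
    language = parts[0].lower()
    region = parts[1].upper() if len(parts) > 1 and parts[1] else None
    return LocaleTag(language, region)


if __name__ == "__main__":
    for raw in ("EN.US", "fr-CA", "de"):
        print(raw, "->", parse_locale(raw))
```

Keeping language and country as separate fields is what lets an application offer "French" to users in Canada, Belgium and Switzerland without tying the choice to a flag.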

  12. Rendering Japanese • Japanese illustrates most of the problems rendering languages • Japanese text is a mixture of 3 scripts which can be combined in one sentence • This results in a huge number of characters to render • This is still simpler than the situation in countries with multiple dialects and scripts

  13. Japanese Scripts • Kanji • The complete written language with 50,000 characters • Kana • Symbols representing sounds, broken into two groups • Hiragana – native Japanese sounds • Katakana – represents foreign words other than Chinese and Korean • Romaji – Letters from the Roman alphabet to use for untranslated foreign words

  14. Character Sets • Different character sets are used to render the characters in all the scripts • A character set is the encoding of a series of letters and other symbols into numeric codes • Since there are so many scripts, there are a number of character sets as well

  15. ASCII Character Set • American Standard Code for Information Interchange • Represents 128 characters in 7 bits with a parity bit • Extended ASCII uses all 8 bits for 256 chars • Has control characters, Roman alphabet, punctuation and some special characters • Supports English, Swahili, and Hawaiian
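
A short sketch of the 7-bit limit and the parity bit described above. The helper names are hypothetical, and even parity is an assumption chosen for the example; the slide does not say which parity scheme is meant.

```python
def to_ascii_bytes(text: str) -> bytes:
    """Encode text as ASCII; raises UnicodeEncodeError for anything outside 0-127."""
    return text.encode("ascii")


def with_even_parity(code: int) -> int:
    """Put an even-parity bit in the high (8th) bit of a 7-bit ASCII code."""
    if not 0 <= code < 128:
        raise ValueError("not a 7-bit ASCII code")
    ones = bin(code).count("1")
    # Set the high bit only when the count of 1 bits is odd, making it even.
    return code | 0x80 if ones % 2 else code


if __name__ == "__main__":
    print(list(to_ascii_bytes("Hello")))       # every value is below 128
    print(hex(with_even_parity(ord("A"))))     # 'A' has two 1 bits, so it stays 0x41
    print(hex(with_even_parity(ord("C"))))     # 'C' has three 1 bits, so it becomes 0xc3
    try:
        to_ascii_bytes("café")
    except UnicodeEncodeError as err:
        print("cannot encode:", err.object[err.start:err.end])
```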

  16. ISO 8859 Character Sets • This is a family of 8 bit character sets used to represent European languages • ISO 8859-1 was historically the default character set on the web; UTF-8 is now the usual choice • It supports several language specific groupings • ISO 8859-1 – western Europe (French, Italian, German, Spanish…)

  17. ISO 8859 Character Sets • ISO 8859-2 – Central/Eastern Europe (Hungarian, Polish,…) • ISO 8859-3 – Southern Europe (Esperanto, Maltese) • ISO 8859-4 – Northern Europe (Estonian, Latvian,…) • ISO 8859-5 – Cyrillic • ISO 8859-6 – Arabic • ISO 8859-7 – Greek • ISO 8859-8 – Hebrew • ISO 8859-9 – Turkish • ISO 8859-10 – Nordic • ISO 8859-11 – Thai • ISO 8859-12 – unused • ISO 8859-13 – Baltic Rim • ISO 8859-14 – Celtic
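
Because every part of ISO 8859 reuses the same 8-bit range, a byte only has meaning once the part is known. The sketch below decodes one byte under three parts using Python's built-in codecs; the choice of byte 0xE9 is arbitrary.

```python
RAW = bytes([0xE9])  # a single byte in the upper half of the 8-bit range

PARTS = {
    "iso-8859-1": "western Europe",
    "iso-8859-5": "Cyrillic",
    "iso-8859-7": "Greek",
}


def decode_under(part: str) -> str:
    """Interpret RAW according to one part of ISO 8859."""
    return RAW.decode(part)


if __name__ == "__main__":
    for part, label in PARTS.items():
        ch = decode_under(part)
        print(f"{part:12} {label:15} U+{ord(ch):04X} {ch}")
```

The same byte comes out as a Latin, a Cyrillic and a Greek letter, which is why text stored without its character set label is so easily garbled.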

  18. EBCDIC • Extended Binary Coded Decimal Interchange Code • An 8 bit encoding • Used mainly on IBM machines and some Fujitsu, Unisys, and HP • EBCDIC is incompatible with ASCII and must be translated • Some characters are not translated exactly!!
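
The incompatibility with ASCII can be seen with Python's `cp037` codec, one of several EBCDIC code pages. Using `cp037` is an assumption for the example; the slides do not say which EBCDIC variant they have in mind.

```python
def to_ebcdic(text: str) -> bytes:
    """Translate text to EBCDIC code page 037."""
    return text.encode("cp037")


def from_ebcdic(data: bytes) -> str:
    """Translate EBCDIC code page 037 bytes back to text."""
    return data.decode("cp037")


if __name__ == "__main__":
    # The same letter has different codes in the two encodings.
    print("A".encode("ascii").hex(), to_ebcdic("A").hex())

    # Sorting by code value gives a different order in each encoding.
    print(sorted("aA0"))                  # ASCII: digits, upper case, lower case
    print(sorted("aA0", key=to_ebcdic))   # EBCDIC: lower case, upper case, digits

    # A character missing from the code page cannot be translated exactly.
    lossy = from_ebcdic("€".encode("cp037", errors="replace"))
    print(repr(lossy))
```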

  19. Code Page • A code page is simply the mapping of characters to numeric values • This is the same as a character set • The EBCDIC code page is shown here

  20. Windows Character Sets • Windows has their own character sets which are slightly different from the ISO character sets • Windows 1250 – Central European • Windows 1251 – Cyrillic • Windows 1252 – Western Languages • Windows 1253 – Greek • Windows 1254 – Turkish • Windows 1255 – Hebrew • Windows 1256 – Arabic • Windows 1257 – Baltic • Windows 1258 – Vietnamese
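
One place where the Windows sets differ from their ISO counterparts is the byte range 0x80 to 0x9F. The sketch below compares Windows 1252 with ISO 8859-1 on two such bytes; the helper name is hypothetical.

```python
import unicodedata


def compare(byte_value: int) -> tuple:
    """Decode one byte as Windows 1252 and as ISO 8859-1."""
    raw = bytes([byte_value])
    return raw.decode("cp1252"), raw.decode("iso-8859-1")


if __name__ == "__main__":
    for value in (0x80, 0x93, 0xE9):
        win, iso = compare(value)
        # Category "Cc" marks a control character with no visible glyph.
        print(hex(value), repr(win), repr(iso), unicodedata.category(iso))
```

Bytes such as 0xE9 agree in both sets, while 0x80 and 0x93 are printable in Windows 1252 (the euro sign and a curly quote) but control characters in ISO 8859-1.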

  21. Unicode • This is a modern character set which aims to incorporate all other existing character sets • So far, it represents more than 100,000 characters • It has various encodings which can be used depending on the amount of storage available • The UTF-16 encoding uses 16 bit units; one unit covers 65,536 characters and a pair of units (a surrogate pair) covers the rest

  22. Unicode • UTF-8 • Represents all universal characters in 1 – 4 bytes • Backwards compatible with ASCII which is represented as 1 byte • Prefix bits on the first byte indicate how many bytes the character occupies • More compact than 16 bit or 32 bit encodings for text that is mostly ASCII
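
The variable-length scheme can be checked directly. In this sketch `utf8_length_from_lead` is a hypothetical helper that reads the leading bits of the first byte, and the sample characters are chosen to need 1, 2, 3 and 4 bytes.

```python
def utf8_length_from_lead(byte: int) -> int:
    """Return how many bytes a UTF-8 sequence occupies, judged from its first byte."""
    if byte < 0x80:            # 0xxxxxxx: plain ASCII
        return 1
    if byte >> 5 == 0b110:     # 110xxxxx
        return 2
    if byte >> 4 == 0b1110:    # 1110xxxx
        return 3
    if byte >> 3 == 0b11110:   # 11110xxx
        return 4
    raise ValueError("continuation byte or invalid lead byte")


if __name__ == "__main__":
    for ch in ("A", "é", "日", "😀"):
        encoded = ch.encode("utf-8")
        print(ch, encoded.hex(" "), utf8_length_from_lead(encoded[0]))

    # Storage needed for ASCII-only text in three Unicode encodings.
    text = "Hello"
    for codec in ("utf-8", "utf-16-le", "utf-32-le"):
        print(codec, len(text.encode(codec)))
```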

  23. Unicode • The use of Unicode simplifies many problems with internationalization • Unicode is supported by many modern software platforms • Java • XML • Modern Operating Systems

  24. Fonts • A character set connects an abstract description of a character with a code • A font connects a glyph with a code • There can be several fonts for each character set allowing for different shapes and styles of symbols • Different languages have fonts with different shapes and requirements

  25. Font Recommendations • Provide enough space between lines for different ascenders and descenders • Do not use ornate fonts that will obscure accents used by some languages • Choose fonts that support required accents • Do not assume that any particular font will be available on the target platform • Many non-Latin languages require proportional spacing (e.g. Arabic)

  26. Text Direction • There are 3 types of directionality • Left-to-right • European languages, Thai • Left-to-right/vertical • Chinese, Japanese, Korean • Bidirectional • Hebrew & Arabic

  27. Bidirectional Text • Although the text is right-to-left, embedded numbers and other languages are left-to-right • If the number “123-4567” is embedded • As a phone number it is read left-to-right • As a subtraction it means 4567 minus 123 • Bidirectional text is right justified
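
Unicode assigns every character a bidirectional class, which is how rendering software tells right-to-left letters from embedded digits. The sketch below uses Python's `unicodedata` module; `base_direction` is a hypothetical helper that looks only at the first strongly directional character, which is a simplification of full bidirectional layout.

```python
import unicodedata


def bidi_classes(text: str) -> list:
    """Pair each character with its Unicode bidirectional class."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


def base_direction(text: str, default: str = "ltr") -> str:
    """Guess paragraph direction from the first strongly directional character."""
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            return "ltr"
        if cls in ("R", "AL"):   # Hebrew-style and Arabic-style letters
            return "rtl"
    return default


if __name__ == "__main__":
    sample = "שלום 123-4567"  # Hebrew word followed by a phone number
    for ch, cls in bidi_classes(sample):
        print(repr(ch), cls)
    print(base_direction(sample))
```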

  28. Paper Size • Most of the world uses metric paper sizes based on the ISO 216 standard
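
The ISO 216 A series is built by repeatedly halving the A0 sheet (841 mm × 1189 mm) and rounding down to whole millimetres, so the sizes a print routine needs can be computed instead of hard-coded. A minimal sketch, with `iso_a_size` as a hypothetical helper name:

```python
def iso_a_size(n: int) -> tuple:
    """Return (width, height) in millimetres of ISO 216 size A<n>, portrait."""
    if n < 0:
        raise ValueError("n must be 0 or greater")
    width, height = 841, 1189  # A0
    for _ in range(n):
        # Cut the long side in half; the old short side becomes the long side.
        width, height = height // 2, width
    return width, height


if __name__ == "__main__":
    for n in range(7):
        w, h = iso_a_size(n)
        print(f"A{n}: {w} x {h} mm")
```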

  29. Hardware Concerns • Hardware around the world is different • Make sure the keyboard can enter the needed character sets • Make sure the printer can handle the paper sizes required • Make sure the printer and display can handle the needed character sets

  30. Translation Issues • Accurate translation is critical to internationalization • Hire good translators • Provide them with the text to translate • Provide a glossary defining each of the words to translate so that they understand the intended meaning

  31. Contents • Internationalization • Language rendering • Rendering objects • Cultural issues

  32. Sorting • Different countries have different rules for sorting • Be able to handle accents • Watch for letter sequences • In Spanish, “cho” comes after “co” • In many languages upper case is sorted after lower case • ISO/IEC 14651:2000 provides guidelines on international collation
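
Two of the points above, handling accents and watching for letter sequences, can be sketched with sort keys. This is a simplified illustration and not an implementation of ISO/IEC 14651: the function names are hypothetical, and the alphabet table encodes the traditional Spanish rule in which "ch" and "ll" count as single letters.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks so that 'é' compares like 'e'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accent_insensitive_key(word: str) -> tuple:
    """Sort by the unaccented form first, then by the original as a tie-breaker."""
    return (strip_accents(word).casefold(), word)


TRADITIONAL_SPANISH = [
    "a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k", "l", "ll",
    "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z",
]
RANK = {letter: i for i, letter in enumerate(TRADITIONAL_SPANISH)}


def traditional_spanish_key(word: str) -> list:
    """Map a word to a list of letter ranks, treating 'ch' and 'll' as one letter."""
    w = unicodedata.normalize("NFC", word).casefold()
    key, i = [], 0
    while i < len(w):
        pair = w[i:i + 2]
        if pair in ("ch", "ll"):
            key.append(RANK[pair])
            i += 2
        else:
            ch = w[i] if w[i] == "ñ" else strip_accents(w[i])
            key.append(RANK.get(ch, len(RANK)))  # unknown symbols sort last
            i += 1
    return key


if __name__ == "__main__":
    print(sorted(["zebra", "éclair", "eclipse"]))
    print(sorted(["zebra", "éclair", "eclipse"], key=accent_insensitive_key))
    print(sorted(["cuna", "chico", "cosa", "dado"]))
    print(sorted(["cuna", "chico", "cosa", "dado"], key=traditional_spanish_key))
```

In production code a collation library driven by locale data is the safer choice; the point here is only that code-point order is not alphabetical order.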

  33. Date and Time Formats • ISO 8601 specifies a format for international representation of time and date • Most locales prefer to use their own format • Your software should be able to format in the local manner • This information is usually available as a locale in your programming language (Java, C, .NET)
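
A minimal sketch of the difference between the ISO 8601 form and local forms. The pattern table is an illustrative assumption: a real application would read these patterns from the locale data of its platform, not from a hand-written dictionary.

```python
from datetime import date

# Hypothetical, hand-written patterns for illustration only.
DATE_PATTERNS = {
    "en-US": "%m/%d/%Y",   # month first
    "en-GB": "%d/%m/%Y",   # day first
    "de-DE": "%d.%m.%Y",   # day first, dots as separators
    "ja-JP": "%Y/%m/%d",   # year first
}


def format_date(d: date, locale_tag: str) -> str:
    """Format a date in the local manner, falling back to ISO 8601."""
    pattern = DATE_PATTERNS.get(locale_tag)
    if pattern is None:
        return d.isoformat()
    return d.strftime(pattern)


if __name__ == "__main__":
    day = date(2024, 3, 5)
    print("ISO 8601:", day.isoformat())
    for tag in DATE_PATTERNS:
        print(tag, format_date(day, tag))
```

The same date prints as 03/05/2024 in one locale and 05/03/2024 in another, which is why the unambiguous ISO 8601 form is preferred for storage and interchange.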

  34. Date and Time Formats

  35. Addresses • Addresses have different formats throughout the world • In Mexico • Name (paternal surname, maternal surname, first name) • Street & number • Building, floor, suite • Colonia (neighbourhood) • City, state • Postal code

  36. Telephone Numbers • Telephone Numbers also differ throughout the world • Different numbers of digits • Different separators • Different groupings

  37. Currency & Numbers • Currency & numbers are formatted differently in each country • Different currency symbols • Different location of +, - signs • Different thousands separators • Different definition of billion • Most of this information will come from the locale
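
The separator differences can be sketched as follows. The table of separators is a simplified assumption written by hand for three locales; as the slide notes, real applications should take this information from the locale.

```python
# Hypothetical table: (thousands separator, decimal separator) per locale.
SEPARATORS = {
    "en-US": (",", "."),
    "de-DE": (".", ","),
    "fr-FR": (" ", ","),   # simplified: a plain space as the group separator
}


def format_number(value: float, locale_tag: str, decimals: int = 2) -> str:
    """Format a number with the grouping and decimal separators of a locale."""
    group, decimal = SEPARATORS[locale_tag]
    text = f"{value:,.{decimals}f}"  # e.g. "1,234,567.89"
    # Swap separators through a placeholder so the two replacements do not collide.
    return text.replace(",", "\0").replace(".", decimal).replace("\0", group)


if __name__ == "__main__":
    for tag in SEPARATORS:
        print(tag, format_number(1234567.891, tag))
```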

  38. Contents • Internationalization • Language rendering • Rendering objects • Cultural issues

  39. Cultural User Interface Design • Culture has an influence on what people expect from an interface and how they interpret the interface • Geert Hofstede studied IBM employees in 53 countries and developed 5 dimensions on which cultures could be measured • We will now look at these 5 dimensions

  40. Power Distance • Measures how well a society accepts large or small differences in power in a social hierarchy • For example, whether an employee of a large organization has easy, informal access to the boss • Cultures with easy access to powerful figures are assigned a low power distance index

  41. Individualism vs. Collectivism • This measures whether a culture favours individual achievement or whether it favours group efforts • A high collectivist rating is assigned to cultures which favour group efforts over individual ones

  42. Masculinity vs. Femininity • Measures the degree to which a culture separates traditional gender roles • Traditional male role • Tough, task-oriented warriors • Traditional female role • Tender, gentle home makers • More male-oriented cultures score higher

  43. Uncertainty Avoidance • Measures the degree to which a culture is uncomfortable with and tries to reduce uncertainty • Cultures which emphasize punctuality, formality, and explicit communication rate high in uncertainty avoidance

  44. Long Term Orientation • This is prevalent in cultures with Confucian thought, which emphasizes patience • Usually found in Asian cultures • Higher values indicate more long term orientation

  45. Hofstede’s Cultural Indexes

  46. Hofstede’s Cultural Indexes

  47. Hofstede’s Cultural Indexes

  48. Hofstede’s Cultural Indexes

  49. Designing for Culture • Studies were done to see what type of UI components were used in different cultures • The results can be used as a guideline for how to design user interfaces that meet cultural expectations

  50. Power Distance • Metaphors • High • Images of government or corporate institutions and buildings, schools, monuments, etc. • Low • Informal or popular institutions or buildings, Montessori schools, public parks, etc.
