Learn about globalizing software for worldwide distribution. Understand localization obstacles, solutions, formats, and locales. Helpful tips on managing messages, numbers, dates, and more.
GLOBALISATION
• It is quite natural to dream about writing software that is sold around the world.
• However, there may be some obstacles on the way to selling your software worldwide.
• Today we study potential problems and solutions.
• Terms:
  - Localisation = adjusting software to local needs.
  - Globalisation, internationalisation = creating software in such a way that it is easy to localise for different countries.
Software Engineering 2003 Jyrki Nummenmaa

History
• As a little warm-up, consider the situation about 20 years ago.
• At that time there was an increasing interest in creating software products that could be sold to different customers within Finland.
• For customising the software, it was important to put all user interface constants in a place where they are easy to change.
• Program code is not such a place -> parameter files or a database are a much better choice.

Simple example
• http://java.sun.com/docs/books/tutorial/i18n/index.html deals with internationalisation issues.
• We have a quick look at their example.
• In the example, multilingual texts are managed using
  - a locale, identified by a (language, country) pair,
  - resource bundles, one per locale,
  - property files, where strings are identified by keys.
• Within a locale, each string is looked up by its key.

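The key-based lookup can be sketched in a few lines of Java. This is a minimal sketch, not the tutorial's own code: the class name `BundleSketch`, the keys and the texts are made up for illustration, and the two property "files" are kept as in-memory strings so that the example is self-contained. In a real program the files would sit on the classpath and be loaded with `ResourceBundle.getBundle(baseName, locale)`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

final class BundleSketch {
    // Stand-ins for two property files (hypothetical content), one per locale.
    private static final Map<Locale, String> FILES = new HashMap<>();
    static {
        FILES.put(Locale.US, "greeting=Hello\nfarewell=Goodbye\n");
        FILES.put(Locale.forLanguageTag("fi-FI"), "greeting=Hei\nfarewell=Heippa\n");
    }

    // Picks the bundle for the locale, falling back to US English if there is none.
    static ResourceBundle bundleFor(Locale locale) {
        String content = FILES.getOrDefault(locale, FILES.get(Locale.US));
        try {
            return new PropertyResourceBundle(new StringReader(content));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // The program only knows the key; the text comes from the bundle.
    static String message(Locale locale, String key) {
        return bundleFor(locale).getString(key);
    }

    public static void main(String[] args) {
        System.out.println(message(Locale.US, "greeting"));
        System.out.println(message(Locale.forLanguageTag("fi-FI"), "greeting"));
    }
}
```

The point of the design is that the program code never contains the user-visible text, only the key.
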
What should be globalised?
• Our example in the previous slide only dealt with a simple message!
• Labels can also be managed in a fairly straightforward manner, if enough space is reserved for them.
• Now let's have a look at the rest:
  - Messages
  - Labels on GUI components
  - Online help
  - Sounds
  - Colors
  - Graphics
  - Icons
  - Dates
  - Times
  - Numbers
  - Currencies
  - Measurements
  - Phone numbers
  - Honorifics and personal titles
  - Postal addresses
  - Page layouts

Locales
• As we saw, globalisation in Java is based on the use of Locales.
• A locale is identified by a language (compulsory), country (optional) and variant (optional).
• A class whose behaviour is based on the use of a locale is called locale-sensitive.
• You can find the locales available to a locale-sensitive class by using its getAvailableLocales() method.
• There is also a default locale for a Java Virtual Machine, and it can be accessed by Locale.getDefault().
• Different objects may use different locales.

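A small sketch of these calls, assuming a Java 7 or later runtime. The class name `LocaleSketch` and its helper methods are made up for illustration; `Locale.Builder`, `Locale.forLanguageTag`, `Locale.getDefault` and `NumberFormat.getAvailableLocales` are standard library calls.

```java
import java.text.NumberFormat;
import java.util.Locale;

final class LocaleSketch {
    // A locale with language and country.
    static Locale canadianFrench() {
        return new Locale.Builder().setLanguage("fr").setRegion("CA").build();
    }

    // A locale with a language only; the country part stays empty.
    static Locale languageOnly() {
        return Locale.forLanguageTag("fi");
    }

    // Asks a locale-sensitive class (here NumberFormat) whether it supports a locale.
    static boolean numberFormatSupports(Locale wanted) {
        for (Locale available : NumberFormat.getAvailableLocales()) {
            if (available.equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("JVM default locale: " + Locale.getDefault());
        System.out.println("Built locale: " + canadianFrench().toLanguageTag());
        System.out.println("NumberFormat supports en-US: " + numberFormatSupports(Locale.US));
    }
}
```
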
Identify what needs to be managed through locales
• As you think about locales, you will find out that you have
  - data items such as messages and sounds, which change altogether with the locale,
  - data items which remain the same, but whose formatting changes, e.g. dates and numbers, and
  - possibly data items that are not to be localised (internal use, interface to another application, ...).
• Design the globalisation: identify which is which.
• Arrange your data items into resource bundles (e.g. items for the same form in the same bundle, so that you will not need to load unnecessary objects).

Formats - numbers
• Numbers are formatted differently in different countries, e.g.:
  - 345 987,246 (France)
  - 345.987,246 (Germany)
  - 345,987.246 (US)
• Java includes a NumberFormat class that can be used to format numbers, currencies (no exchange rates, though :) and percentages.
• You can use the NumberFormat class both to create formatted strings and to parse strings.
• You can also provide your own patterns, if this is not enough for you.

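The following sketch shows formatting and parsing with NumberFormat. The class name `NumberFormatSketch` and its helpers are made up for illustration. The exact French output is only printed, not relied on, because the grouping character used for French (some kind of space) depends on the locale data shipped with the Java runtime.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

final class NumberFormatSketch {
    // Formats a number with the conventions of the given locale.
    static String format(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    // Parses a string written with the conventions of the given locale.
    static double parse(String text, Locale locale) {
        try {
            return NumberFormat.getNumberInstance(locale).parse(text).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number in " + locale + ": " + text, e);
        }
    }

    public static void main(String[] args) {
        double value = 345987.246;
        System.out.println(format(value, Locale.US));      // 345,987.246
        System.out.println(format(value, Locale.GERMANY)); // 345.987,246
        System.out.println(format(value, Locale.FRANCE));  // space-like grouping, comma decimal
        System.out.println(parse("345.987,246", Locale.GERMANY));
        // Currencies and percentages have their own factory methods.
        System.out.println(NumberFormat.getCurrencyInstance(Locale.US).format(9.99));
        System.out.println(NumberFormat.getPercentInstance(Locale.US).format(0.75));
    }
}
```
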
Dates and Times
• As with numbers, dates and times are represented differently in different countries.
• Similarly, there is a DateFormat class, which you can use to create standard date and time formats.
• Here again, you may customise, and you may also define your own names for things such as weekdays.

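A minimal sketch of DateFormat in use. The class name `DateFormatSketch` and its helpers are made up for illustration, and the time zone is fixed to UTC only to make the output independent of the machine the code runs on. The exact wording of a localised date can vary slightly with the locale data of the Java runtime.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

final class DateFormatSketch {
    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    // Noon on 1 January 1998, UTC.
    static Date newYear1998() {
        Calendar calendar = new GregorianCalendar(UTC);
        calendar.clear();
        calendar.set(1998, Calendar.JANUARY, 1, 12, 0, 0);
        return calendar.getTime();
    }

    // Standard "long" date style for the given locale.
    static String formatLong(Date date, Locale locale) {
        DateFormat format = DateFormat.getDateInstance(DateFormat.LONG, locale);
        format.setTimeZone(UTC);
        return format.format(date);
    }

    // A custom pattern, for when the standard styles are not enough.
    static String formatCustom(Date date) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        format.setTimeZone(UTC);
        return format.format(date);
    }

    public static void main(String[] args) {
        Date date = newYear1998();
        System.out.println(formatLong(date, Locale.US));
        System.out.println(formatLong(date, Locale.GERMANY));
        System.out.println(formatCustom(date));
    }
}
```
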
Messages containing variable parts
• Examples:
  - 405,390 people have visited your website since January 1, 1998. (1)
  - The <devicename> number <devicenumber> has been activated. (2)
• Word order may change between languages. This may make it impossible to translate message (1) correctly, if the translatable text is assumed to be only the fragment between the number and the date.
• In message (2) the word "activated" may require a different translation in some languages (e.g. French), depending on the gender of the word for the device name.
• Basic rule of thumb: if you can avoid messages containing these variable parts, then do so!

Class MessageFormat
• With the MessageFormat class you can define a message template, which gives the message text and shows where to insert the changing data and how to format it.
• With ChoiceFormat, you can choose between strings based on a number you give as a parameter (this is particularly handy for managing plurals).

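A sketch of both ideas, using the visitor message from the earlier slide as the template. The class name `MessageFormatSketch`, the method names and the "files" message are made up for illustration. In a globalised program the template strings would come from a resource bundle, so that a translator can move the placeholders to wherever the target language needs them.

```java
import java.text.MessageFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

final class MessageFormatSketch {
    // {0} and {1} mark the variable parts; the format type after the comma says how to format them.
    static String visitors(long count, Date since) {
        MessageFormat template = new MessageFormat(
                "{0,number,integer} people have visited your website since {1,date,long}.",
                Locale.US);
        return template.format(new Object[] { count, since });
    }

    // A choice format selects the text by number: 0, exactly 1, or more than 1.
    static String files(int count) {
        MessageFormat template = new MessageFormat(
                "There {0,choice,0#are no files|1#is one file|1<are {0,number,integer} files}.",
                Locale.US);
        return template.format(new Object[] { count });
    }

    public static void main(String[] args) {
        Date since = new GregorianCalendar(1998, Calendar.JANUARY, 1, 12, 0, 0).getTime();
        System.out.println(visitors(405390, since));
        for (int n = 0; n <= 2; n++) {
            System.out.println(files(n));
        }
    }
}
```

Note that a choice format only covers simple plural rules; languages with several plural forms need more branches in the translated template.
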
Characters
• US ASCII: 7 bits.
• ISO 8859-X, where X is a part number: an 8-bit system. If the 8th bit is 0, then the first 7 bits represent a US ASCII character.
• Windows 125x code pages: similar to ISO 8859-X, but not the same of course. A typical Windows interoperability nightmare...
• Unicode: meant to represent all characters from all languages. Needs more bits (often done with 16), and there are several encoding schemes. Some, for instance, use one byte (8 bits) for some characters and two bytes (16 bits) or more for others.
• http://www.unicode.org/index.html

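The difference between the encodings can be made visible by counting bytes. This is a small sketch with a made-up class name, `EncodingSketch`; the non-ASCII characters are written as Unicode escapes so that the example does not depend on the encoding of the source file itself.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingSketch {
    // Number of bytes the text needs in the given encoding.
    static int byteCount(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String a = "a";          // plain ASCII letter
        String aUml = "\u00e4";  // a with diaeresis (U+00E4), present in ISO 8859-1
        String euro = "\u20ac";  // euro sign (U+20AC), not present in ISO 8859-1

        System.out.println("a in UTF-8: " + byteCount(a, StandardCharsets.UTF_8));
        System.out.println("U+00E4 in ISO 8859-1: " + byteCount(aUml, StandardCharsets.ISO_8859_1));
        System.out.println("U+00E4 in UTF-8: " + byteCount(aUml, StandardCharsets.UTF_8));
        System.out.println("U+20AC in UTF-8: " + byteCount(euro, StandardCharsets.UTF_8));
        System.out.println("a in UTF-16: " + byteCount(a, StandardCharsets.UTF_16BE));

        // Encoding into a character set that lacks the character silently loses it:
        // the unmappable character is replaced by a question mark.
        byte[] lossy = aUml.getBytes(StandardCharsets.US_ASCII);
        System.out.println("U+00E4 in US ASCII becomes: " + (char) lossy[0]);
    }
}
```
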
Chinese and Japanese
• Thousands of symbols.
• Unicode can represent them, but you need more pixels on the screen as well.
• In Japanese there are several writing systems.
• Text input can be done as follows:
  1. The user types in the word in some phonetic writing system based on Latin characters.
  2. The system shows the characters (there may be many) matching the phonetic writing.
  3. The user picks the right character.

Korean
• In the Korean writing system (hangul), each character is a syllable block composed from parts, and how the parts are combined depends on which part follows which.
• There is a limited number of building blocks, i.e. character parts: 24 basic letters (14 consonants and 10 vowels).

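Unicode makes this composition visible: a precomposed hangul syllable can be decomposed into its parts and composed back with the standard `java.text.Normalizer` class. The sketch below uses a made-up class name, `HangulSketch`, and the syllable U+D55C ("han") as an arbitrary example.

```java
import java.text.Normalizer;

final class HangulSketch {
    // Splits precomposed syllables into their parts (Unicode normalisation form NFD).
    static String decompose(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFD);
    }

    // Combines the parts back into precomposed syllables (normalisation form NFC).
    static String compose(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String syllable = "\ud55c"; // one syllable block, "han"
        String parts = decompose(syllable);
        System.out.println("Number of parts: " + parts.length());
        for (char part : parts.toCharArray()) {
            System.out.println("U+" + Integer.toHexString(part).toUpperCase());
        }
        System.out.println("Round trip ok: " + compose(parts).equals(syllable));
    }
}
```
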
Writing order
• Latin: left to right.
• In Chinese and Japanese, the traditional writing order is top-down, with columns running from right to left.
• Nowadays horizontal left-to-right writing is also commonly used.
• In Arabic and Hebrew, the text itself is written from right to left, but all Latin names (like yours, probably) are written left to right in the middle of the right-to-left text.

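Java can tell you whether a piece of text mixes writing directions, using the standard `java.text.Bidi` class and `Character.getDirectionality`. The sketch below uses a made-up class name, `BidiSketch`, and a Hebrew word followed by a Latin name as sample text; it only analyses the text, the actual rendering is done by the GUI toolkit.

```java
import java.text.Bidi;

final class BidiSketch {
    // Hebrew word (written with escapes) followed by a Latin name.
    static final String SAMPLE = "\u05e9\u05dc\u05d5\u05dd Jyrki";

    // Analyses the text; the base direction is taken from the first strongly directional character.
    static Bidi analyse(String text) {
        return new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
    }

    static boolean isRightToLeftLetter(char ch) {
        return Character.getDirectionality(ch) == Character.DIRECTIONALITY_RIGHT_TO_LEFT;
    }

    public static void main(String[] args) {
        Bidi mixed = analyse(SAMPLE);
        System.out.println("Mixed directions: " + mixed.isMixed());
        System.out.println("Base direction is left-to-right: " + mixed.baseIsLeftToRight());
        System.out.println("Number of directional runs: " + mixed.getRunCount());
        System.out.println("Pure Latin text is left-to-right: " + analyse("Jyrki").isLeftToRight());
    }
}
```
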
Character properties
• Don't do:
    if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z')) // ch is a letter
• In Java, char represents a Unicode character (more precisely, a 16-bit UTF-16 code unit).
• You can use class Character to check for things such as white space, digits, upper and lower case.
• E.g.: Character.isDigit(ch), Character.isLetter(ch), Character.isLowerCase(ch)
• You can also use Character.getType() and predefined constants to check things like:
    if (Character.getType('a') == Character.LOWERCASE_LETTER)

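The following sketch shows why the range check fails outside English. The class name `CharacterSketch` and the method `naiveIsLetter` are made up for illustration; the test characters are written as Unicode escapes (a with diaeresis, and the Arabic-Indic digit three).

```java
final class CharacterSketch {
    // The check from the slide: only right for unaccented Latin letters.
    static boolean naiveIsLetter(char ch) {
        return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
    }

    public static void main(String[] args) {
        char aUml = '\u00e4';        // a with diaeresis, common in Finnish and German
        char arabicThree = '\u0663'; // Arabic-Indic digit three

        System.out.println("Naive check: " + naiveIsLetter(aUml));
        System.out.println("Character.isLetter: " + Character.isLetter(aUml));
        System.out.println("Character.isLowerCase: " + Character.isLowerCase(aUml));
        System.out.println("Character.isDigit: " + Character.isDigit(arabicThree));
        System.out.println("Lowercase type: "
                + (Character.getType(aUml) == Character.LOWERCASE_LETTER));
    }
}
```
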
Comparing characters and strings
• You can use the Collator class, e.g.:
    Collator myCollator = Collator.getInstance();
    if (myCollator.compare("abc", "ABC") < 0)
        System.out.println("abc is less than ABC");
    else
        System.out.println("abc is greater than or equal to ABC");
• getInstance() can also take a locale as a parameter.
• You can customise the rules used in the comparisons.

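A Collator is also a Comparator, so it can be used directly for sorting. The sketch below, with a made-up class name `CollatorSketch`, compares locale-aware sorting with the plain character-code order of `String.compareTo`, and shows one simple customisation: lowering the strength so that case differences are ignored.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class CollatorSketch {
    // Sorts a copy of the list with the collation rules of the locale.
    static List<String> sortFor(Locale locale, List<String> words) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(Collator.getInstance(locale));
        return copy;
    }

    // With PRIMARY strength only base letters matter, so case differences are ignored.
    static boolean sameIgnoringCase(Locale locale, String a, String b) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(Collator.PRIMARY);
        return collator.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("b", "A", "a", "B");

        List<String> byCode = new ArrayList<>(words);
        byCode.sort(null); // natural order, i.e. String.compareTo: all capitals first
        System.out.println("By character code: " + byCode);

        System.out.println("By collator: " + sortFor(Locale.US, words));
        System.out.println("abc equals ABC: " + sameIgnoringCase(Locale.US, "abc", "ABC"));
    }
}
```
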
Finding boundaries of words, sentences, etc.
• The boundaries may, of course, be defined differently in different languages.
• Initialise a BreakIterator with one of these methods:
  - getCharacterInstance
  - getWordInstance
  - getSentenceInstance
  - getLineInstance
• E.g. BreakIterator sentenceIterator = BreakIterator.getSentenceInstance(currentLocale);
• One BreakIterator only works with one type of boundary.

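Once created, the iterator is given a text and then walked from boundary to boundary. A minimal sketch, with a made-up class name `BreakIteratorSketch` and a made-up sample text:

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class BreakIteratorSketch {
    // Splits the text into sentences using the rules of the given locale.
    static List<String> sentences(String text, Locale locale) {
        BreakIterator iterator = BreakIterator.getSentenceInstance(locale);
        iterator.setText(text);

        List<String> result = new ArrayList<>();
        int start = iterator.first();
        // Each call to next() returns the following boundary, or DONE at the end of the text.
        for (int end = iterator.next(); end != BreakIterator.DONE; start = end, end = iterator.next()) {
            result.add(text.substring(start, end).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        for (String sentence : sentences("Hello world. How are you? Fine.", Locale.US)) {
            System.out.println(sentence);
        }
    }
}
```
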
Colors, gestures, other symbols
• E.g. in the Far East there is a lot of symbolism in colors, names, numbers, etc. (e.g. red is a good color, 4 is a bad number).
• Also, hand gestures, for instance, vary from one place to another: what is good here may be bad elsewhere.
• Even in Europe there is variance. Consider tick marks: x (good here, bad in the UK), √ (not exactly like this, but good in the UK, bad here).

Higher cultural issues
• General customs
• How to do business
• How to be polite
• How to say no
• How to avoid "losing face" in the Far East
• What to avoid in particular
• These issues may have an impact on software as well.

Conclusions
• The final conclusion is: "This is all quite complicated, and if you have to get deeper into these things, find someone who really knows."
• When you start writing your software, think a bit about the need for globalisation.
• If you know that English (or Finnish) is sufficient, then it makes life easier.
• If you know that globalisation is needed, you should start globalising when you start writing your software!
• Java offers lots of resources. If you want to re-invent the wheel, this may not be the best place.