International Domain Names: facts and dilemmas • Elisabeth Porteneuve • ICANN Accra, Ghana, 9-14 March 2002
IDN - why we are here • Competition among gTLD registries for new customers • Natural need of people across the worldwide Internet for easier access to Internet domain names • General dissatisfaction of the worldwide Internet community with ICANN and its inability to become an international body, triggering strong reactions from various quarters
Background to IDN • ASCII subset « LDH » • Unicode • CJK languages and the Traditional Chinese vs Simplified Chinese issue • IETF work
1. ASCII subset « LDH » • 26 letters, a-z, upper and lower case treated alike • 10 digits, 0-9 • Hyphen-minus « - » • Label-separating period « . » Additional rules: • no hyphen-minus at the beginning or at the end of a label • no empty label, …
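The LDH rules listed on this slide can be sketched as a small Python check. This is a minimal illustration under stated assumptions, not a complete DNS validator: the names `is_ldh_label` and `is_ldh_name` are hypothetical, and only the rules named on the slide are checked (the rules the slide elides with "…" are deliberately left out).

```python
import string

# The LDH repertoire: letters (case-insensitive), digits, hyphen-minus.
# The period is handled separately because it separates labels.
LDH_CHARS = set(string.ascii_letters + string.digits + "-")


def is_ldh_label(label: str) -> bool:
    """Check one label against the rules named on the slide."""
    if not label:                       # no empty label
        return False
    if label[0] == "-" or label[-1] == "-":
        return False                    # no hyphen-minus at either end
    return all(ch in LDH_CHARS for ch in label)


def is_ldh_name(name: str) -> bool:
    """Check a dotted name label by label."""
    return all(is_ldh_label(label) for label in name.split("."))


if __name__ == "__main__":
    for candidate in ["icann.org", "-bad.org", "a..b", "caf\u00e9.fr"]:
        print(candidate, is_ldh_name(candidate))
```

Only the first candidate passes; the others break the hyphen rule, the empty-label rule, and the character repertoire respectively.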
1. ASCII subset « LDH » (cont) • Need for a common means of communication: small, easy to remember, practical • Postal addresses (not every postman in the world can speak every language) • Airline, airport and landing-strip signage (safety) • Lingua franca for international gatherings
2. Unicode • The only existing table of all international characters, developed for the printing industry in the late 1980s, at a time when computer memory was scarce and processing slow, so that new features required a lot of ingenuity • The origins of Unicode are rooted in work on unified Han
2. Unicode – unified Han • Subset of Chinese, Japanese and Korean (CJK) characters, which • Have an identical internal code point • Print in a Chinese, Japanese or Korean design, according to the language context • May or may not have a similar meaning, according to the language context
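The "identical internal code point" property can be observed directly. The sketch below is illustrative only: the helper name `describe` is hypothetical, and the character U+76F4 is an assumed example (it is commonly cited as an ideograph drawn differently in Chinese and Japanese typefaces). The point is that the string carries one code point and no indication of the language.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and the Unicode character name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    # One unified Han ideograph: the same code point whatever the language
    # context. The glyph design is chosen by the font, not by the text.
    print(describe("\u76f4"))
```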
3. CJK languages • Not alphabet-based • Ideographs • More than 100 thousand of them • Each ideograph is a concept or a word
3. Unicode vs ISO 10646 • Unicode Consortium • The ISO working group responsible for ISO/IEC 10646 is JTC1/SC2/WG2 • The Unicode and ISO 10646 tables are equivalent
3. Simplified vs Traditional Chinese • The People’s Republic of China has worked on the simplification of characters since the 1950s, a very long and complex process • Simplified Chinese is used by a nation of one billion people • Traditional Chinese is still extensively used in social life, for its long history and artistic value • Development of a standardized transliteration into Latin script (cf. Pekin / Beijing)
4. The IETF work on IDN • Based on Unicode (there is nothing else) • Technical scope: expand today's « LDH » set of 38 characters into several tens of thousands of code points • Discovery of many problems • Combinatorial effects (no one will be able to use printed information without knowing which language script is used) • Mutual incompatibility in Unicode between Unified Han and the Chinese language, including Simplified Chinese
Summary of problems and political dilemmas • Chinese, Japanese, Korean (CJK) • Latin, Cyrillic, Greek • … others, still unknown
Problems • If the use of mixed letters (code points) is allowed, and the IETF work on Unicode cannot exclude it, there will no longer be any unambiguous printed URL • Consumer confusion • Many doubts about safe electronic commerce • Combinatorial possibilities will multiply the cost of a domain name by a factor of a hundred or a thousand for companies wanting complete protection
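The mixed-letters problem can be made concrete with a short Python sketch. It is a rough heuristic, not a standard algorithm: Python's `unicodedata` module does not expose the Unicode script property, so the script is approximated here by the first word of each character's Unicode name, and the function names `scripts_in_label` and `is_mixed_script` are hypothetical.

```python
import unicodedata


def scripts_in_label(label: str) -> set[str]:
    """Approximate the scripts used by the letters of a label.

    Assumption: the first word of the Unicode character name
    (LATIN, CYRILLIC, GREEK, CJK, ...) stands in for the script.
    Digits and hyphens are ignored because they are shared by all labels.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            name = unicodedata.name(ch, "UNKNOWN")
            scripts.add(name.split()[0])
    return scripts


def is_mixed_script(label: str) -> bool:
    """True when the letters of a label come from more than one script."""
    return len(scripts_in_label(label)) > 1


if __name__ == "__main__":
    plain = "example"
    spoof = "ex\u0430mple"   # U+0430 CYRILLIC SMALL LETTER A replaces Latin "a"
    print(plain == spoof)                    # the two strings differ
    print(sorted(scripts_in_label(spoof)))   # although they print alike
    print(is_mixed_script(plain), is_mixed_script(spoof))
```

The two labels look the same in many typefaces, which is exactly why a printed URL stops being unambiguous once mixing is allowed.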
More problems • The « language » for IDN is undefined • Printed information (on paper or on screen) is undefined without knowing which script has been used • Printed information does not allow a company to be identified unambiguously • How will a consumer be able to contact a company if the only information he has is printed?
Political problem • It is impossible to satisfy both Traditional/Simplified Chinese (TC/SC) and unified Han. The Chinese are signatories to ISO 10646: does that mean they favour unified Han over TC/SC? No • The IETF work on Unicode demonstrates that there is a clash between the Chinese language on one side and Korean and Japanese on the other • Accepting Unicode is equivalent to taking a position against the Chinese language • Can only the Chinese solve it, by asking for Unicode changes?
Not enough work on languages • Unicode is a recent potpourri, initially defined for the printing industry, gathering not only languages but anything that may be printed • But there is nothing else
Questions • Do we agree that Unicode is unsuitable for Internet domain names? • If yes, what do we do now?