Internationalization of Domain Names • James Seng <jseng@i-dns.net> • CTO, i-DNS.net International • Co-chair, IETF IDN Working Group
Internationalized Domain Names • 华人.公司.cn • 華人.商業.tw • 高島屋.会社.jp • 삼성.회사.kr • 三星.회사.kr • الاهرام.م • viagénie.qc.ca • ישראל.קום • ทีเอชนิค.พาณิชย์.ไทย • 現代.com • ヤフー.com
Internet Engineering Task Force • IETF is a large open international community of network designers, operators, vendors, and researchers concerned with the evolution of the Internet architecture and the smooth operation of the Internet. • It is open to any interested individual.
IETF IDN Working Group • Mailing List • General Discussion: idn@ops.ietf.org • To subscribe: idn-request@ops.ietf.org • Archive: ftp://ops.ietf.org/pub/lists/idn* • Website: http://www.i-d-n.net/ • Co-chairs • James Seng <jseng@pobox.org.sg> • Marc Blanchet <Marc.Blanchet@viagenie.qc.ca>
IDN WG Charter • The goal of the group is to specify the requirements for internationalized access to domain names and to specify a standards track protocol based on the requirements. • A fundamental requirement in this work is to not disturb the current use and operation of the domain name system, and for the DNS to continue to allow any system anywhere to resolve any domain name. • The group will not address the question of what, if any, body should administer or control usage of names that use this functionality.
Confusion • Language and Script • A language is a way that humans interact • A script is the written form of a language • Many written languages share the same script • Some written languages use more than one script • Example: 現代.com • Is this Chinese, Japanese or Korean?
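The 現代.com question can be made concrete with a short Python sketch. It uses only the standard `unicodedata` module; the helper name `describe` is hypothetical and chosen for illustration. The point it tries to show is that a code point records which character was written, not which language the writer had in mind.

```python
import unicodedata

def describe(label):
    """Return (character, code point, Unicode name) for each character."""
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in label]

# The Unicode names identify unified CJK ideographs; nothing in them says
# whether the label was meant as Chinese, Japanese or Korean.
for ch, code_point, name in describe("現代"):
    print(ch, code_point, name)
```

Any language tag would have to come from outside the name itself, which is one reason the slides separate script from language.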
Confusion • Name and Identifier • A Name is a word or phrase that constitutes the distinctive designation of a person or thing • An Identifier is a string of characters that uniquely identifies a person or thing • Example • “James Seng” is a Name • “jseng” is an Identifier • Is jseng.com a Name or an Identifier? • A Domain Name is an Identifier, not a Name
Confusion • Internationalization, Localization and Multilingualism • Internationalization makes the protocol able to handle more scripts • Localization involves tailoring interaction with users to the languages they know • Multilingualism makes the protocol able to handle multiple languages • The “I” in IDN is Internationalization
Problems • Snapshot of the network layers that provide some Internet services (figure) • WWW, Email • HTTP, URI, Mail Format, SMTP • DNS (IDN is here) • TCP/UDP • Internet Protocol (IP)
Problems • Stability • IDN changes a fundamental service in the Internet architecture • Backward Compatibility • Many existing Internet protocols and applications use domain names and assume they contain only A-Z, 0-9 and “-” • Ease of Use • It should be transparent to the user if possible
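The backward-compatibility assumption can be written down as a check. The following Python sketch tests whether a name uses only letters, digits and hyphen; the 63-character label limit and the ban on a leading or trailing hyphen are the traditional hostname rules and are an assumption added here, not something stated on the slide. The function name `is_ldh_name` is hypothetical.

```python
import re

# One label: letters, digits and hyphen only, 1 to 63 characters,
# no hyphen at the start or end (assumed traditional hostname rule).
LDH_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")

def is_ldh_name(name):
    """True if every dot-separated label of the name is a valid LDH label."""
    labels = name.rstrip(".").split(".")
    return all(LDH_LABEL.fullmatch(label) is not None for label in labels)

print(is_ldh_name("yahoo.com"))
print(is_ldh_name("華人.com"))
```

Software that applies a check of this kind rejects 華人.com outright, so any IDN scheme that must pass through such software has to present names in this restricted alphabet.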
Problems • Internationalization • Use one universal character set or multiple character sets? • What encoding to use? UTF-8? UTF-16? • Matching • yahoo.com = YAHOO.com • 華人.com = 华人.com ?
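Both questions on this slide can be illustrated in a few lines of Python. This is only a sketch: `byte_lengths` and `naive_match` are hypothetical helper names, and the matching rule shown is plain case-insensitive comparison, the rule ASCII names already rely on, not a proposed IDN rule.

```python
def byte_lengths(name):
    """Size of the same name in two candidate encodings, in bytes."""
    return {
        "utf-8": len(name.encode("utf-8")),
        "utf-16": len(name.encode("utf-16-be")),  # big-endian, no byte order mark
    }

def naive_match(a, b):
    """Case-insensitive comparison, which is all that ASCII names need."""
    return a.lower() == b.lower()

print(byte_lengths("華人.com"))
print(naive_match("yahoo.com", "YAHOO.com"))
print(naive_match("華人.com", "华人.com"))
```

The first comparison succeeds and the second fails: lowercasing knows nothing about Traditional and Simplified Chinese forms, so whether the two names should match is a policy question rather than something the encoding settles.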
Work in Progress • Agreed that ISO 10646/Unicode will be the base character set for IDN • ASCII Compatible Encoding (ACE) • A transformation encoding of ISO 10646 whose resulting string uses only LDH (Letter, Digit, Hyphen) characters • Limited compression to produce a shorter ACE • AMC-ACE-Z: draft-ietf-idn-amc-ace-z-01 • e.g. 新加坡.com → zq--3bs3aw5wpa2a.com
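To get a feel for what an ACE does, the sketch below uses Python's built-in `punycode` codec. Punycode (RFC 3492) is the form in which AMC-ACE-Z was later published, and `xn--` is the prefix that was eventually standardized, so the strings produced here are not expected to match the draft-01 `zq--` example on the slide. The helper names `to_ace` and `from_ace` are hypothetical, and no Nameprep step is applied.

```python
def to_ace(label, prefix="xn--"):
    """Encode one label into an LDH-only form; ASCII labels pass through."""
    if label.isascii():
        return label
    return prefix + label.encode("punycode").decode("ascii")

def from_ace(label, prefix="xn--"):
    """Reverse to_ace for labels that carry the ACE prefix."""
    if label.lower().startswith(prefix):
        return label[len(prefix):].encode("ascii").decode("punycode")
    return label

ace = to_ace("新加坡")
print(ace)            # an ASCII-only label starting with xn--
print(from_ace(ace))  # back to the original characters
```

The round trip is lossless, which is what lets an ACE label travel through LDH-only software and still be shown to the user in the original script.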
Work in Progress • Nameprep/Stringprep • draft-ietf-idn-nameprep • Based on UTR#15 (Normalization) & UTR#21 (Case Mapping) with additional prohibited code point and folding rules • Precise and well-defined rules to normalize an IDN before it is used • Allows accurate matching of IDNs
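Python's standard library exposes a Nameprep implementation as `encodings.idna.nameprep`. It follows the later published profile (RFC 3491) rather than the draft cited on the slide, so treat the sketch below as an approximation of the behaviour under discussion. The helper name `idn_equal` is hypothetical.

```python
from encodings.idna import nameprep

def idn_equal(label_a, label_b):
    """Compare two labels after preparing both with Nameprep."""
    return nameprep(label_a) == nameprep(label_b)

print(nameprep("YAHOO"))             # case folding
print(idn_equal("ＹＡＨＯＯ", "yahoo"))  # fullwidth letters fold to ASCII
print(idn_equal("華人", "华人"))       # Traditional vs Simplified Chinese
```

Case and compatibility variants compare equal after preparation, while the Traditional and Simplified labels do not; that kind of equivalence is outside what normalization and case mapping cover.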
Work in Progress • Internationalizing Host Names in Applications (IDNA) • draft-ietf-idn-idna • IDNA requires upgrading only applications to handle IDN • Considers legacy encodings and interoperability • Enforces Nameprep in applications • Uses Nameprep’ed, ACE-encoded IDNs over the wire
Work in Progress • IDNA-Nameprep-ACE Model (figure) • Components: Users, Application, DNS, Resolver, User Interface • Encoding labels: No defined encoding, probably local or UTF-8 • No defined encoding, probably native Unicode • Nameprep’ed AMC-Z • Nameprep-AMC
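The application-side part of this model can be approximated in Python: take whatever the user typed, apply Nameprep to each non-ASCII label, then ACE-encode it before handing the name to the resolver. This is a sketch under stated assumptions, not the draft's algorithm: it uses the standard library's `nameprep` and `punycode` codec (the later RFC 3491 and RFC 3492 forms) and the `xn--` prefix, and the function name `to_wire` is hypothetical.

```python
from encodings.idna import nameprep

ACE_PREFIX = "xn--"  # assumption: the prefix standardized later, not the slide's zq--

def to_wire(domain):
    """Turn a user-entered domain name into the LDH form sent to the resolver."""
    wire_labels = []
    for label in domain.split("."):
        if label.isascii():
            wire_labels.append(label)      # legacy labels pass through untouched
            continue
        prepared = nameprep(label)         # normalize before encoding
        if prepared.isascii():
            wire_labels.append(prepared)   # preparation alone produced ASCII
        else:
            encoded = prepared.encode("punycode").decode("ascii")
            wire_labels.append(ACE_PREFIX + encoded)
    return ".".join(wire_labels)

print(to_wire("新加坡.com"))
print(to_wire("ＹＡＨＯＯ.com"))
```

Because only the application changes, the resolver and the DNS servers see nothing but ordinary LDH names, which is how this design meets the charter's requirement not to disturb existing DNS operation.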
Outstanding issues • Localization • draft-ietf-idn-tsconv – Traditional/Simplified Chinese • draft-ietf-idn-hangeulchar – Hangeul normalization • draft-ietf-idn-jpchar – Japanese issues • Enhanced efficiency of ACE • draft-ietf-idn-lsb-ace – Reordering
Conclusion • Many confusions • It took us many months to understand the issues • Many problems • No solution is perfect • Engineering compromises • Cultural considerations • Many feel very strongly about their own languages (but IDN can’t handle language)
Lastly • The IDN Working Group has been around for nearly 20 months • The direction has become clearer and there is strong support for various solutions • It is coming! • Join the working group if you are interested! • Send mail to idn-request@ops.ietf.org with the word “subscribe”