
International Domain Name

International Domain Name. TWNIC Nai-Wen Hsu snw@twnic.net.tw. Domain name. RFC 1035. A label cannot be longer than 63 characters. A domain name cannot be longer than 255 characters. Maximum labels: 127. Only a-z, 0-9, and '-' are accepted in a domain name.


Presentation Transcript


  1. International Domain Name TWNIC Nai-Wen Hsu snw@twnic.net.tw

  2. Domain name • RFC 1035 • A label cannot be longer than 63 characters • A domain name cannot be longer than 255 characters • Maximum labels: 127 • Only a-z, 0-9, and '-' are accepted in a domain name • Limited to 37 ASCII code points: LDH (Letter-Digit-Hyphen)
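The limits on this slide can be checked mechanically. The following is a minimal sketch that applies only the rules listed above (63 characters per label, 255 per name, 127 labels, LDH characters); the function name `is_valid_ldh_name` is hypothetical, and other host-name rules such as hyphen placement are deliberately not checked.

```python
import re

# One label: 1 to 63 characters drawn from the 37 LDH code points.
LDH_LABEL = re.compile(r"[a-z0-9-]{1,63}")

def is_valid_ldh_name(name: str) -> bool:
    """Check a domain name against the limits listed on the slide."""
    if len(name) > 255:
        return False
    # A single trailing dot (the root) is allowed and ignored.
    if name.endswith("."):
        name = name[:-1]
    labels = name.lower().split(".")
    if len(labels) > 127:
        return False
    return all(LDH_LABEL.fullmatch(label) for label in labels)

if __name__ == "__main__":
    for candidate in ["twnic.net.tw", "慎昌鐘錶.tw", "xn--ciun9hb52c2za.tw"]:
        print(candidate, is_valid_ldh_name(candidate))
```

A non-ASCII name fails this check, which is the gap that IDNA fills by converting such names to an LDH-only form.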

  3. International Domain Name • The IETF IDN WG adopted Unicode 3.2 • Greek, Cyrillic, Armenian, Hebrew, Arabic, Syriac, Thaana, Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu, Kannada, Malayalam, Sinhala, Thai, … • 95,156 characters

  4. International Domain Name sample • レコード会社.jp • gwmöbler.com • 慎昌鐘錶.tw • 阿克苏诺贝尔油漆公司.cn • 소프트웨어.kr • לארשי . םוק

  5. IETF IDN Standard • IDNA (RFC 3490): Internationalizing Domain Names in Applications • NAMEPREP (RFC 3491): A Stringprep Profile for Internationalized Domain Names • PUNYCODE (RFC 3492): A Bootstring encoding of Unicode for Internationalized Domain Names in Applications • STRINGPREP (RFC 3454): Preparation of Internationalized Strings

  6. IDNA components and interfaces • User input and display: local interface methods (pen, keyboard, ...) • End system: IDNA-aware application (ToASCII and ToUnicode operations may be called here) • Call to resolver: ACE • Resolver to DNS servers: DNS protocol, ACE • Application to application servers: application-specific protocol, ACE unless the protocol is updated to handle other encodings • Sample ACE label: xn--de-jg4avhby1noc0d • The "application" is where a host name is split into labels, the appropriate flags are set, and the ToASCII and ToUnicode operations are performed.

  7. IDNA Structure • User input (Unicode) • IDNA: NAMEPREP (a Stringprep profile for Internationalized Domain Names): Mapping, Normalization, Prohibit • ToASCII / ToUnicode • ACE (Punycode) • ACE to resolver
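The flow above (split the host name into labels, run Nameprep, convert to ACE) can be tried with Python's standard `encodings.idna` module, which implements the RFC 3490 `ToASCII` and `ToUnicode` operations. This is a sketch only; the helper names `to_ace` and `from_ace` are hypothetical and not part of the presentation.

```python
from encodings import idna

def to_ace(hostname: str) -> str:
    """Apply ToASCII to each label, as an IDNA-aware application would."""
    labels = hostname.split(".")
    return ".".join(idna.ToASCII(label).decode("ascii") for label in labels)

def from_ace(hostname: str) -> str:
    """Apply ToUnicode to each label, for display to the user."""
    return ".".join(idna.ToUnicode(label) for label in hostname.split("."))

if __name__ == "__main__":
    for name in ["gwmöbler.com", "慎昌鐘錶.tw"]:
        ace = to_ace(name)
        print(name, "->", ace, "->", from_ace(ace))
```

Only the ACE form is passed to the resolver, so DNS servers need no changes.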

  8. NAMEPREP • A Stringprep Profile for Internationalized Domain Names • Mapping • Stringprep tables B.1, B.2 • Normalization • Form KC • Prohibited Output • Stringprep tables C.1.2, C.2.2, C.3, C.4, C.5, C.6, C.7, C.8, C.9

  9. NAMEPREP -- Mapping • Commonly mapped to nothing: 27 • Ex: • Mapping for case-folding used with NFKC: 1371 • Ex: A → a (U+0041 → U+0061), Ϋ → ϋ (U+03AB → U+03CB), ㍱ → hpa (U+3371 → U+0068 U+0070 U+0061)
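The mapping step can be reproduced with Python's standard `stringprep` module, which exposes the RFC 3454 tables. The sketch below covers only the mapping step (tables B.1 and B.2), not full Nameprep; the function name `nameprep_map` is hypothetical, and U+00AD SOFT HYPHEN is used as an example of a character in table B.1.

```python
import stringprep

def nameprep_map(text: str) -> str:
    """Apply the Nameprep mapping step: drop B.1 characters, fold case with B.2."""
    out = []
    for ch in text:
        if stringprep.in_table_b1(ch):
            # Table B.1: commonly mapped to nothing.
            continue
        # Table B.2: case folding for use with NFKC (may yield several characters).
        out.append(stringprep.map_table_b2(ch))
    return "".join(out)

if __name__ == "__main__":
    for sample in ["A", "\u03ab", "\u3371", "ab\u00adc"]:
        mapped = nameprep_map(sample)
        print([f"U+{ord(c):04X}" for c in sample], "->",
              [f"U+{ord(c):04X}" for c in mapped])
```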

  10. NAMEPREP -- Normalization • Unicode normalization with form KC

  11. NAMEPREP -- Normalization • ‘u’ + ‘◌̈’ (U+0075 U+0308) → ‘ü’ (U+00FC) • ‘a’ → ‘a’
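Form KC both composes combining sequences and replaces compatibility characters. A short sketch using Python's `unicodedata.ucd_3_2_0`, the Unicode 3.2 database that matches the version adopted by the IDN WG; the fullwidth letter U+FF41 is an assumed example of a compatibility character and is not taken from the slide.

```python
import unicodedata

# Unicode 3.2 data, the version Nameprep is defined against.
ucd32 = unicodedata.ucd_3_2_0

def nfkc(text: str) -> str:
    """Normalize text to Unicode Normalization Form KC."""
    return ucd32.normalize("NFKC", text)

if __name__ == "__main__":
    # Composition: u + COMBINING DIAERESIS becomes a single precomposed character.
    print(nfkc("u\u0308") == "\u00fc")
    # Compatibility mapping: FULLWIDTH LATIN SMALL LETTER A becomes plain 'a'.
    print(nfkc("\uff41") == "a")
```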

  12. NAMEPREP – Prohibited output • Non-ASCII space characters: 17 • Ex: U+00A0 (NO-BREAK SPACE) • Non-ASCII control characters: 54 • Ex: U+0090 (DEVICE CONTROL STRING) • Private use: 133371 • Non-character code points: 49 • Surrogate codes: 2048

  13. NAMEPREP – Prohibited output • Inappropriate for plain text: 4 • Inappropriate for canonical representation: 12 • Change display properties or are deprecated: 13 • Tagging characters: 97
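The prohibited-output categories on the two slides above correspond to Stringprep tables C.1.2, C.2.2 and C.3 through C.9, all of which are available in Python's `stringprep` module. The sketch below reports which table rejects each character; the helper name `prohibited_reasons` and the short table labels are hypothetical.

```python
import stringprep

# Nameprep prohibited-output tables, keyed by their RFC 3454 table number.
PROHIBITED_TABLES = {
    "C.1.2": stringprep.in_table_c12,  # non-ASCII space characters
    "C.2.2": stringprep.in_table_c22,  # non-ASCII control characters
    "C.3": stringprep.in_table_c3,     # private use
    "C.4": stringprep.in_table_c4,     # non-character code points
    "C.5": stringprep.in_table_c5,     # surrogate codes
    "C.6": stringprep.in_table_c6,     # inappropriate for plain text
    "C.7": stringprep.in_table_c7,     # inappropriate for canonical representation
    "C.8": stringprep.in_table_c8,     # change display properties or are deprecated
    "C.9": stringprep.in_table_c9,     # tagging characters
}

def prohibited_reasons(text: str) -> list:
    """Return (character, table) pairs for every prohibited character in text."""
    return [
        (ch, table)
        for ch in text
        for table, check in PROHIBITED_TABLES.items()
        if check(ch)
    ]

if __name__ == "__main__":
    for ch, table in prohibited_reasons("tw\u00a0nic\u0090"):
        print(f"U+{ord(ch):04X} is prohibited by table {table}")
```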

  14. PUNYCODE • A Bootstring encoding of Unicode for IDNA • One form of ACE (ASCII Compatible Encoding) • Translates non-ASCII characters to ASCII characters • Prefix: xn-- • Ex: 慎昌鐘錶.tw → xn--ciun9hb52c2za.tw
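The encoding step on its own can be seen with Python's built-in `punycode` codec. This sketch skips Nameprep entirely and only adds the `xn--` prefix to labels that contain non-ASCII characters; the helper names `punycode_label` and `unpunycode_label` are hypothetical.

```python
ACE_PREFIX = "xn--"

def punycode_label(label: str) -> str:
    """Encode one label with Punycode and add the ACE prefix if needed."""
    if label.isascii():
        return label  # already LDH-compatible, nothing to encode
    return ACE_PREFIX + label.encode("punycode").decode("ascii")

def unpunycode_label(label: str) -> str:
    """Reverse punycode_label for a single label."""
    if not label.lower().startswith(ACE_PREFIX):
        return label
    return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")

if __name__ == "__main__":
    for label in ["bücher", "慎昌鐘錶", "tw"]:
        ace = punycode_label(label)
        print(label, "->", ace, "->", unpunycode_label(ace))
```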

  15. Insufficiency of the IDN standard • The current IDN standards (IDNA, NAMEPREP, PUNYCODE) cannot meet the requirements of Chinese domain names • Traditional/Simplified Chinese mapping • Ex: 台 ↔ 臺 • Writing variant mapping • Ex: 峰 ↔ 峯

  16. Insufficiency of the IDN standard • The same meaning is written with a different character in different countries • In China: • 劝 (U+529D) • In Japan: • 勧 (U+52E7) • In Taiwan: • 勸 (U+52F8)
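The gap described on these two slides can be observed directly: Nameprep leaves such variant characters untouched, so each one yields a different ACE label and therefore a different domain name. A small sketch using Python's `encodings.idna` module; the helper name `distinct_after_nameprep` is hypothetical.

```python
from encodings import idna

def distinct_after_nameprep(chars) -> bool:
    """True if Nameprep and ToASCII keep every character in chars distinct."""
    prepared = {idna.nameprep(ch) for ch in chars}
    aces = {idna.ToASCII(ch) for ch in chars}
    return len(prepared) == len(chars) and len(aces) == len(chars)

if __name__ == "__main__":
    # Same meaning, three code points (China, Japan, Taiwan).
    print(distinct_after_nameprep(["劝", "勧", "勸"]))
    # Simplified/traditional pair.
    print(distinct_after_nameprep(["台", "臺"]))
```

Because the protocol itself does not relate these characters, the relation has to be handled by registration policy.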

  17. IDN administration guideline • A registration policy is used to solve the problems listed above • Every language has a variant table with 3 fields: • valid code point (VCP) • recommended variant (RV) • character variant (CV)

  18. Variant Table sample

  19. Variant Table sample

  20. Variant Table • Singular-relation characters (VCP = twRV = cnRV = CV): 13,888 (66.4%) • VCP = twRV ≠ cnRV: 2,783 (13.3%) • VCP = cnRV ≠ twRV: 2,453 (11.7%) • VCP ≠ (twRV = cnRV): 333 (1.6%) • VCP ≠ twRV ≠ SCR: 387 (1.9%)

  21. Variant Table

  22. Variant Table • The draft table was prepared by the CCMT Task Force, organized by TWNIC since January 2002. • The task force has 9 members: linguists, computer experts, and DNS experts. • The draft table has been submitted to the Bureau of Standards, Ministry of Economic Affairs, for final review.

  23. Registration procedure • A registrant should select the language(s) • The Registry should activate the requested domain name(s) and reserve the equivalent name(s) within the language-based character set • The registrant can request activation of the reserved equivalent domain name(s) at any time

  24. Registration sample • A user selects the zh-tw and zh-cn languages for the domain name 丁上萬.com • 丁上萬.com (Recommended variant for zh-tw) • 丁上万.com (Recommended variant for zh-cn) • 丁丄万.com (Character Variant) • 丁丄萬.com (Character Variant)
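The sample above can be reproduced with a toy variant table. The sketch below is an assumption-laden illustration, not the CCMT table: the three rows in `VARIANT_TABLE` are reconstructed only from this sample, the function name `registration_bundle` is hypothetical, and it assumes that the recommended variants for the selected languages are activated while the remaining character-variant combinations are reserved.

```python
from itertools import product

# Hypothetical rows: valid code point -> recommended variant per language
# and the list of character variants (reconstructed from the sample only).
VARIANT_TABLE = {
    "丁": {"zh-tw": "丁", "zh-cn": "丁", "cv": ["丁"]},
    "上": {"zh-tw": "上", "zh-cn": "上", "cv": ["上", "丄"]},
    "萬": {"zh-tw": "萬", "zh-cn": "万", "cv": ["萬", "万"]},
}

def registration_bundle(label: str, languages: list):
    """Return (activated, reserved) labels for a requested label."""
    rows = [VARIANT_TABLE[ch] for ch in label]
    # One recommended variant per selected language, without duplicates.
    activated = []
    for lang in languages:
        name = "".join(row[lang] for row in rows)
        if name not in activated:
            activated.append(name)
    # Every combination of character variants that is not activated is reserved.
    combos = {"".join(c) for c in product(*(row["cv"] for row in rows))}
    reserved = sorted(combos - set(activated))
    return activated, reserved

if __name__ == "__main__":
    activated, reserved = registration_bundle("丁上萬", ["zh-tw", "zh-cn"])
    print("activated:", [name + ".com" for name in activated])
    print("reserved: ", [name + ".com" for name in reserved])
```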

  25. Q & A
