Unicode

aurek
Presentation Transcript


  1. Unicode: A short introduction

  2. Outline • Writing systems • Computers and Alphabets • ASCII versus Unicode • Examples of Unicode • For more information

  3. Writing systems • Each letter of the alphabet has • A name (sometimes several) • E.g., ‘a’ is called ‘eh?’ • E.g., ‘a’ is also called ‘lower-case a’ • A pronunciation • E.g., in Cayuga, ‘a’ usually sounds like ‘aaah’

  4. Computers and Alphabets • Computers also have names for letters • The letter ‘a’ is called ‘61’ in one ‘language’ • The letter ‘a’ is called ‘U+0061’ in another ‘language’
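The two "names" on this slide can be checked directly. The following is a minimal Python sketch; the variable names are illustrative, and it assumes that the "61" on the slide is the hexadecimal form of the number 97.

```python
# Look up the number a computer uses as the "name" of the letter 'a'.
letter = "a"

code_point = ord(letter)               # the number behind the letter, in decimal
ascii_name = format(code_point, "x")   # the same number written in hexadecimal
unicode_name = f"U+{code_point:04X}"   # the conventional Unicode notation

print(code_point)    # 97
print(ascii_name)    # 61
print(unicode_name)  # U+0061
```

Both names refer to the same number; only the way of writing it differs.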

  5. Computers and Alphabets • Why do computers use numbers for the names for letters? • Because they store all information in number form. • Technical detail: they store information as ‘bytes’; each ‘byte’ consists of 8 ‘bits’; each ‘bit’ is either ‘0’ or ‘1’
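To make the byte-and-bit detail concrete, here is a small Python sketch that shows the single byte used to store ‘a’ and its eight bits. The variable names are illustrative only.

```python
# Store the letter 'a' as bytes, then show the 8 bits of that byte.
letter = "a"

raw = letter.encode("ascii")    # the stored form: a sequence of bytes
bits = format(raw[0], "08b")    # the first (and only) byte, written as 8 bits

print(len(raw))  # 1
print(bits)      # 01100001
```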

  6. Computers and Alphabets • Computer programs ‘translate’ the names into letters on the screen or in print • Result: you see a, a, a, a, a, a, etc. (different fonts, but the same letter)

  7. ASCII versus Unicode • ASCII (American Standard Code for Information Interchange) • ASCII and Unicode are two computer ‘languages’ for naming letters • The ASCII name for ‘a’ is ‘61’ (in hexadecimal; 97 in decimal) • The Unicode name for ‘a’ is ‘U+0061’

  8. ASCII • With a single byte, computer systems can represent up to 256 letters • Technical detail: one 8-bit byte has 2^8 = 256 possible values • Another technical detail: ASCII only uses 7 bits (2^7 = 128 values) • The numbers 32-126 are the printable ASCII letters (characters); 127 is the ‘delete’ control character

  9. ASCII • On all computers, the ASCII letters named 32-126 look the same. • E.g., ‘65’ (decimal) looks like upper-case ‘A’ on all modern computers • Technical detail: why only up to 127? Counting from 0, that’s all you can represent with 2^7 = 128 combinations of bits. • Another technical detail: what happened to 0-31? Those are for control characters like tab and line feed.
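The ASCII ranges can be listed with a short Python sketch. It relies on Python's own definition of "printable" (`str.isprintable`), which is an assumption of this example rather than something the slides define.

```python
# Split the 128 ASCII numbers (0-127) into control and printable characters.
control = [n for n in range(128) if not chr(n).isprintable()]
printable = [n for n in range(128) if chr(n).isprintable()]

print(printable[0], printable[-1])  # 32 126
print(len(control))                 # 33  (numbers 0-31, plus 127)
print(chr(65))                      # A
```

By this definition the space character (32) counts as printable, and 127 ("delete") counts as a control character.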

  10. ASCII Problem • While computers can represent 256 names for letters, no one agrees on what letters the numbers 128-255 stand for. • That’s why your Mac Cayuga font shows up as gibberish on a Windows computer. • The number ‘250’ doesn’t mean the same thing on a Mac as it does on a Windows PC.
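The disagreement over the number 250 can be reproduced in Python. This sketch assumes the two systems use the Mac OS Roman and Windows-1252 encodings (Python's `mac_roman` and `cp1252` codecs); the slide does not say which encodings it has in mind, and a custom Cayuga font may well have used a different mapping.

```python
# One byte, two interpretations: decode the number 250 under two legacy encodings.
byte_250 = bytes([250])

on_mac = byte_250.decode("mac_roman")    # assumed classic Mac encoding
on_windows = byte_250.decode("cp1252")   # assumed Windows encoding

print(repr(on_mac), repr(on_windows))
print(on_mac == on_windows)  # False
```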

  11. Unicode: fixing the ASCII problem • Unicode aims to provide a unique name for every letter ever used…on the planet. • It has room for more than 1,000,000 names (1,114,112, to be exact). • Everyone agrees on what letters the names stand for. • Technical details not discussed here: getting from names like ‘U+0061’ to the letter ‘a’ on your computer.
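The size of the Unicode name space can be checked from Python, which exposes the largest code point as `sys.maxunicode`. This is a rough sketch of the arithmetic only; the variable names are illustrative.

```python
import sys

# Unicode names run from U+0000 to U+10FFFF.
code_space = sys.maxunicode + 1    # total number of available names
planes = code_space // 0x10000     # the space is divided into blocks of 65,536

print(f"{code_space:,}")  # 1,114,112
print(planes)             # 17
```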

  12. Unicode • Many letters have already been given a Unicode name. • Modern computers can display any letter that has a Unicode name, as long as a font containing that letter is installed.

  13. Unicode and Syllabics • The Cherokee syllabary is represented in the Unicode character block U+13A0 - U+13FF. • Cherokee letter representing the syllable ‘tay’ Ꮦ

  14. Unicode and Syllabics • Unified Canadian Aboriginal Syllabics are represented in the Unicode character block U+1400 - U+167F. • The ‘ee’ sound in most Canadian syllabics systems: ᐃ
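The two syllabics examples can be looked up with Python's standard `unicodedata` module. In this sketch, `in_block` is a hypothetical helper written for the illustration, and the code points U+13D6 and U+1403 are assumed to be the two characters shown on the slides.

```python
import unicodedata

def in_block(ch, first, last):
    """Return True if the character's code point lies in the range first..last."""
    return first <= ord(ch) <= last

cherokee_te = "\u13d6"   # assumed to be the 'tay' character from slide 13
syllabics_i = "\u1403"   # assumed to be the 'ee' character from slide 14

print(unicodedata.name(cherokee_te))            # CHEROKEE LETTER TE
print(in_block(cherokee_te, 0x13A0, 0x13FF))    # True
print(unicodedata.name(syllabics_i))            # CANADIAN SYLLABICS I
print(in_block(syllabics_i, 0x1400, 0x167F))    # True
```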

  15. Unicode and Cayuga • There’s no special character block for Cayuga • That’s because all the Cayuga characters can be made up from already existing Unicode characters • The Unicode Consortium won’t let you duplicate already existing characters.

  16. Unicode and Cayuga • Sgę:nǫ:⁷ swagwé:gǫh
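One way to see how Cayuga letters are built from existing Unicode characters is to decompose the greeting with Python's `unicodedata.normalize`. This is a sketch under the assumption that the text is spelled with the code points written out below; the slide itself may store the letters in a different but equivalent form.

```python
import unicodedata

# The greeting from the slide, spelled out with explicit code points.
greeting = "Sg\u0119:n\u01eb:\u2077 swagw\u00e9:g\u01ebh"

# NFD splits each accented letter into a base letter plus combining marks.
decomposed = unicodedata.normalize("NFD", greeting)

o_ogonek = unicodedata.normalize("NFD", "\u01eb")
for ch in o_ogonek:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
# U+006F LATIN SMALL LETTER O
# U+0328 COMBINING OGONEK

print(len(greeting), len(decomposed))  # 19 23
```

Recomposing with `unicodedata.normalize("NFC", decomposed)` gives back the original string, so both spellings stand for the same text.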

  17. Advantages of Unicode • (Not quite yet, but in the near future) when you type in Cayuga, it will appear as Cayuga on any other computer. • The same goes for web pages…

  18. For More Information • Lots of technical details not discussed here. • Take one of the CDs provided if you want a more extensive introduction.
