Alphabets of Languages with Bidirectional Scripts and their Support Israel Ervin Gidali IBM Globalization Centre of Competency - Complex Text Languages
Agenda • The predecessors of the first true alphabets • The first alphabets • Direction of writing • The “modern” RTL scripts • Bidirectionality • Bidi: some of the challenges • Implementation aspects
The Alphabet Predecessors The predecessors of the first true alphabets: • Egyptian hieroglyphs (since 3000 BCE) • Mesopotamian cuneiform (since 3100 BCE)
The Egyptian Hieroglyphs • Pictograms • Logograms • Phonograms
The Cuneiform Writing Systems logo-syllabic syllabic words
The First Semitic Alphabets Proto-Sinaitic and Proto-Canaanite. • Originated around the 18th or 17th century BCE, under the influence of Egyptian hieroglyphs.
The First Semitic Alphabets • The revolution: purely phonetic (only consonants without vowels). • Influenced originally by the polyphony practice in hieroglyphic and cuneiform scripts.
Proto-Canaanite • Limits the set of sounds to 22 consonants, still without vowels. • Acrophonic: each letter is named after a word that begins with its sound. • Letters are easy to distinguish and remember (their shapes resemble familiar objects).
The alphabet success Proto-Canaanite, Phoenician and Greek
Direction of Writing • Hieroglyphs were written in both directions. • Starting from the 11th century BCE, the writing direction of all Semitic scripts (except Ethiopic) is from right to left.
The “Modern” RTL scripts אין כל-חדש תחת השמש (קהלת פרק א פסוק ט') There is nothing new under the sun. (Qohelet/Ecclesiastes 1:9)
Arabic Script – the Script of Islam The Arabic script, the script of the Quran, used for: • Arabic • Persian (Farsi) • Urdu • Ottoman Turkish (until 1929) • Uighur, Kazakh, Uzbek, Tajik, Kirghiz • Old Malay, Swahili, Hausa, Baluchi, Kashmiri, Sindhi, Pashto, Lahnda, Dargwa, Moroccan Arabic, Adighe, Ingush, Berber, Kurdish, Jawi/Javanese…
Hebrew Script עברית שפה יפה • Used for: • Hebrew • Yiddish • Ladino (Judezmo) • Arabic • Karaite/Karaim • Turkish
Hebrew script and diacritics Hebrew text: בראשית ברא אלהים את השמים ואת הארץ Vocalized with “points” and cantillation marks:
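In encoded text, the points and cantillation marks are stored as separate combining characters after the letter they modify, so vocalized text can be reduced to its consonantal form by dropping them. A minimal Python sketch, assuming Unicode-encoded input; the helper name strip_hebrew_marks is hypothetical and chosen for illustration only:

```python
import unicodedata

def strip_hebrew_marks(text: str) -> str:
    """Drop Hebrew points and cantillation marks, keeping letters and punctuation.

    Assumption: the marks of interest are the nonspacing marks (category "Mn")
    in the range U+0591..U+05C7 of the Hebrew block.
    """
    return "".join(
        ch
        for ch in text
        if not ("\u0591" <= ch <= "\u05C7" and unicodedata.category(ch) == "Mn")
    )

# First word of Genesis, vocalized, written with escapes to keep the marks visible.
vocalized = "\u05D1\u05B0\u05BC\u05E8\u05B5\u05D0\u05E9\u05C1\u05B4\u05D9\u05EA"
plain = strip_hebrew_marks(vocalized)
```

Punctuation such as the maqaf (U+05BE) is not a nonspacing mark and is therefore left in place.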
Decimal digit forms • European digits (also called Arabic digits): 0123456789 (used in Hebrew script and in some Arabic-speaking countries) • Arabic-Indic digits: ٠١٢٣٤٥٦٧٨٩ (used in Arabic) • Numbers are written from left to right regardless of their digit form and regardless of regional variety
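Because both digit forms are stored most-significant digit first, converting between them is a character-for-character substitution. A minimal Python sketch, assuming the Arabic-Indic digits at U+0660..U+0669; the function names are illustrative, not part of any library:

```python
# Arabic-Indic digits occupy U+0660..U+0669; European digits are ASCII 0-9.
EUROPEAN = "0123456789"
ARABIC_INDIC = "".join(chr(0x0660 + d) for d in range(10))

TO_ARABIC_INDIC = str.maketrans(EUROPEAN, ARABIC_INDIC)
TO_EUROPEAN = str.maketrans(ARABIC_INDIC, EUROPEAN)

def to_arabic_indic(text: str) -> str:
    """Replace European digits with Arabic-Indic digits; other characters are kept."""
    return text.translate(TO_ARABIC_INDIC)

def to_european(text: str) -> str:
    """Replace Arabic-Indic digits with European digits; other characters are kept."""
    return text.translate(TO_EUROPEAN)

year = to_arabic_indic("2024")
```

The digit order is never reversed by the conversion; only the glyph set changes, which matches the left-to-right rule for numbers stated above.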
Bidirectionality • National-language text (Arabic, Hebrew, etc.) is written from right to left: TXET CIBARA • Numbers and English (or French, Russian, etc.) text are written from left to right: english text 123 TXET CIBARA
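The split described above is reflected in the bidirectional class that Unicode assigns to every character: letters are strongly left-to-right (L) or right-to-left (R, AL), while digits are weak types (EN, AN). The Python sketch below reads those classes with the standard unicodedata module and guesses a paragraph's base direction from its first strong character. It is a simplified illustration, not a full implementation of the Unicode Bidirectional Algorithm, and the function names are hypothetical:

```python
import unicodedata

STRONG_LTR = {"L"}
STRONG_RTL = {"R", "AL"}  # R: Hebrew and similar scripts, AL: Arabic letters

def bidi_classes(text: str) -> list[str]:
    """Return the Unicode bidirectional class of each character."""
    return [unicodedata.bidirectional(ch) for ch in text]

def base_direction(text: str, default: str = "ltr") -> str:
    """Guess 'rtl' or 'ltr' from the first strong character.

    Simplification: digits, spaces and punctuation are skipped, and explicit
    directional control characters are not interpreted.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in STRONG_RTL:
            return "rtl"
        if bidi_class in STRONG_LTR:
            return "ltr"
    return default

# Latin letter, Hebrew alef, European digit, Arabic-Indic digit, space.
sample_classes = bidi_classes("a" + "\u05D0" + "1" + "\u0661" + " ")
```

Text that starts with a number followed by Hebrew is still classified as right-to-left, because digits carry no strong direction of their own.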
Bidi Appearance Aspects: Directionality • Mixed direction of text segments • Page alignment on the right • Book binding on the right • Mirroring of GUI elements (only when translated)
Bidi Data Processing Aspects – some of the challenges • Bidirectional text data entry • Visual versus Logical text type • The Paragraph Orientation • Arabic script cursiveness: shaping and ligatures • Variety of text layouts in use
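The shaping challenge listed above can be made concrete with Unicode's Arabic presentation forms. Each base letter has separately encoded isolated, initial, medial and final shapes, plus ligatures such as lam-alef, and all of them carry a compatibility mapping back to the base letters. The Python sketch below, with hypothetical helper names, reads that mapping. It only illustrates the relationship between shaped and unshaped text and is not a shaping engine:

```python
import unicodedata

def form_of(ch: str) -> str:
    """Return the contextual form tag of a presentation-form character.

    unicodedata.decomposition() yields strings such as "<final> 0627";
    characters without a tagged compatibility mapping are reported as "nominal".
    """
    decomposition = unicodedata.decomposition(ch)
    if decomposition.startswith("<"):
        return decomposition[1:decomposition.index(">")]
    return "nominal"

def unshape(text: str) -> str:
    """Fold presentation forms and ligatures back to base letters via NFKC.

    Caveat: NFKC also folds unrelated compatibility characters, so this is
    only appropriate when that side effect is acceptable.
    """
    return unicodedata.normalize("NFKC", text)

ALEF_FINAL = "\uFE8E"         # ARABIC LETTER ALEF FINAL FORM
LAM_ALEF_ISOLATED = "\uFEFB"  # ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM
```

Data exchanged between systems is usually kept in the unshaped base letters, with the shapes chosen at display time; legacy environments that store shaped text need a conversion step such as the one above.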
The Bidi Layout Challenges • Bidirectional text in different systems and applications has multiple possible layouts • In heterogeneous environments, proper layout transformations must be performed when text moves between systems • Integration with higher-order protocols
The Challenge of GUI Mirroring • When translating the interface of an application into a language with a bidirectional script, provisions must be made to ensure that the GUI is properly mirrored.
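Geometrically, mirroring a GUI means reflecting each element's horizontal position inside its container while leaving its size and vertical position unchanged. A minimal Python sketch, assuming simple rectangle-based layout with the origin at the top-left corner; the Rect type and mirror_horizontally function are hypothetical and not taken from any GUI toolkit:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rect:
    x: int       # distance of the left edge from the container's left edge
    y: int
    width: int
    height: int

def mirror_horizontally(rect: Rect, container_width: int) -> Rect:
    """Reflect a rectangle across the vertical centre line of its container.

    The new left edge sits as far from the container's left side as the old
    right edge sat from its right side.
    """
    return Rect(container_width - rect.x - rect.width, rect.y, rect.width, rect.height)

# A button near the left edge of a 200-unit-wide dialog moves to the right edge.
button = Rect(x=10, y=5, width=30, height=20)
mirrored = mirror_horizontally(button, container_width=200)
```

Applying the function twice returns the original rectangle, which is a convenient sanity check for mirrored layouts.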
Implementation Aspects • Almost all platforms and Operating Systems provide support for Bidirectional text entry and processing • New platforms should react to this challenge too • Except for adequately engraved keyboards, there is no need for special hardware for Bidi text support.
Last Word • RTL scripts are not a novelty. As a matter of fact, they preceded the scripts used in the Western world today. • Their support is different but not necessarily much more complex, as long as one is prepared for it.
The End Thank You