Office and Unicode • Chris Pratley, Group Program Manager, Microsoft Word
Overview • Office Unicode history and strategy • Implementation • Benefits of Unicode to Office users • Demo of Word
Office97 Unicode Strategy • Office97: first Unicode release (Jan. 94 - Nov. 96) • Office97 driving factors • Customers operate world-wide (US only 40%) • Need to handle multiple code pages in Europe • Office97 goals • Enable lossless file exchange world-wide • Solve code page problems in Europe • Development efficiency for Asian and Euro versions • Unified source code base – but still different executables • Unified development process • Delta between language versions shrinks from 18 to 2 months • Lay foundation for future
Office2000 Unicode Strategy • Office2000 goals • Reduce Total Cost of Ownership for large corporations • Single version to deploy and administer globally • Configurable interface to handle local needs • Language of User Interface can be changed • Additional language features can be enabled as needed • Emulate any localized version • “Français”, “日本語”, “한글”, “עברית”, “عربي”, etc. • Streamline development process further • Core “US” team ships global product • Integrate bi-directional version team (Arabic, Hebrew) • Focus on needs of bilingual and multilingual users
Officexp Unicode Strategy • Officexp goals • Finish the globalization work begun in Office2000 • Extend functionality to all applications • Integrate Complex Scripts support (Indic, Thai, Vietnamese) • हिन्दी, தமிழில், ภาษาไทย, Việt • Streamline development process further • Single build process from start to finish • Integrate complex scripts team • Deepen Unicode support • Unicode 3.0 languages (ᐃᓄᒃᑎᑐᑦ, አማርኛ, etc.) • UTF-16 (esp. plane 2: 𧆓𨣓𨲄𪀒) • More complex script and limited combining diacritic coverage
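The "UTF-16 (esp. plane 2)" bullet refers to characters above U+FFFF, which UTF-16 stores as a pair of 16-bit surrogate code units. A minimal Python sketch of that arithmetic follows. The function names are made up for illustration, and U+20000 (the first plane 2 code point) stands in for the sample characters on the slide.

```python
def to_surrogate_pair(code_point: int) -> tuple:
    """Split a supplementary-plane code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF are stored as surrogates")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def utf16_code_units(text: str) -> list:
    """Return the 16-bit code units that UTF-16 uses for `text`."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


# U+20000 needs two code units: 0xD840 followed by 0xDC00.
pair = to_surrogate_pair(0x20000)
units = utf16_code_units("\U00020000")
```

Code that counts "characters" as 16-bit units will see two units here, which is why plane 2 support needs explicit work in an application that is otherwise already Unicode.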
The Word Family Tree (diagram) • Product lines: Bi-Di, Thai/Indic, US/Euro, JPN, KOR, CHT, CHS • Versions 2.0/5.0: SBCS (Bi-Di, Thai/Indic, US/Euro), DBCS (JPN, KOR, CHT, CHS) • Versions 6.0/95: SBCS (Bi-Di, Thai/Indic, US/Euro), Wide (JPN, KOR, CHT, CHS) • Versions 97 and 2000: Unicode (one branch marked “now w/ Indic”) • Version 2002 (“xp”): A single Unicode release!
Implementation • Core applications are Unicode internally • Word, Excel, PowerPoint (Office97) • Access, Publisher (new in Office2000) • Databases and drivers are Unicode • FrontPage (new in Officexp) • Outlook – still only Unicode in mail • Uses Word as mail editor by default
Implementation • Difficulties encountered with Unicode • Lack of full system support in Win9x • Every app needed different solution • MFC-based apps were hardest • Missing system services (e.g. font-linking) • Interoperation with code-page based systems • Educating test team about Unicode • Testing issues different vs. MBCS • Lack of expertise in uncommon languages
Implementation • Office shared code services • Central Win32 Unicode text API “wrappers” • Simulate nearly full support on Win9x • ExtTextOutW and others • Provide optional font-linked output • Hardcode “preferred fonts” by script, style • User-specified font fallbacks via registry key (if any) • Font categorization by script range (use MLANG.DLL) • Font substituted if glyph not available • Word modifies font settings in the document • Other apps substitute only at display time • Insert Symbol dialog (Unicode 3.1 support)
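The font-linking bullets describe classifying text by script range and substituting a preferred font when the requested font lacks a glyph. The Python sketch below illustrates that general idea only; it is not Office's implementation and does not call MLANG.DLL. The range table, the font assignments, and the function names are all hypothetical.

```python
from itertools import groupby

# Hypothetical table: (first code point, last code point, script, preferred font).
SCRIPT_RANGES = [
    (0x0000, 0x024F, "Latin", "Times New Roman"),
    (0x0590, 0x05FF, "Hebrew", "David"),
    (0x0600, 0x06FF, "Arabic", "Traditional Arabic"),
    (0x0E00, 0x0E7F, "Thai", "Angsana New"),
    (0x4E00, 0x9FFF, "CJK", "MS Mincho"),
]


def classify(ch: str):
    """Return (script, preferred font) for a character, or (None, None) if unknown."""
    cp = ord(ch)
    for first, last, script, font in SCRIPT_RANGES:
        if first <= cp <= last:
            return script, font
    return None, None


def font_for(ch: str, requested_font: str, covered_scripts: set) -> str:
    """Keep the requested font when it covers the script, else fall back."""
    script, preferred = classify(ch)
    if script is None or script in covered_scripts:
        return requested_font
    return preferred


def split_into_font_runs(text: str, requested_font: str, covered_scripts: set) -> list:
    """Group consecutive characters that end up in the same font."""
    key = lambda ch: font_for(ch, requested_font, covered_scripts)
    return [(font, "".join(chars)) for font, chars in groupby(text, key=key)]
```

For example, `split_into_font_runs("abc \u05e9\u05dc\u05d5\u05dd", "Times New Roman", {"Latin"})` yields one Latin run in the requested font and one Hebrew run in the fallback font. Whether the substitution is written back into the document or applied only at display time is the difference the slide notes between Word and the other apps.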
Office Users Benefit • Single binary world-wide • Shared world-wide file formats • Multilingual word/data processing • Unicode HTML • Unicode e-mail (HTML, RTF, plain)
Single Binary • Easier to deploy, administer • One set-up image to install world-wide • One set of service packs for all machines • All features available in all “versions” • Still have local version packages • Multilingual users can use “foreign” features • User Interface language is configurable • Your language follows you when you travel • Major cost savings for customers • Less testing of corporate solutions • Lower internal tech support costs
Single File Format • Multinational corporations use Office • Need to exchange documents company-wide • Office unified file formats via Unicode • Word95 had 7 different file formats • Word97 had 1 file format but no editing or layout for languages covered by other versions • Word2000 adds editing, layout, and full round-trip • Word2002 adds full complex script support
Multilingual Usage • English Officexp: input/display/edit/layout of • European languages • any similar left-to-right scripts if fonts/NLS available • E.g. Canadian Syllabics (Inuktitut), Ethiopic, Cherokee • Some combining diacritic support (African languages) • East Asian languages • Chinese (Traditional and Simplified), Japanese, Korean, Yi • Hong Kong supplemental characters, CNS 11643, GB18030 (via UTF-16 “surrogates”) • Complex Script and Bi-directional scripts (need enabled system) • Arabic (incl. Farsi, Urdu), Hebrew • Thai • Hindi, Tamil, Oriya, Telugu, Punjabi, Bengali, Gujarati, etc.
Multilingual Usage • Most documents are monolingual • Most users are bilingual • Local language • English • Optimize UI for using one, two or three languages • Over 200 supported – rare usage • Detect 20+ languages while typing (Word) • Automatically install and use the correct proofing tools • Plain text I/O in any encoding (Word, Excel) • Includes GB18030 on properly configured system
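"Plain text I/O in any encoding" amounts to decoding bytes from a named encoding into Unicode on the way in and encoding back out on save. A minimal Python sketch using the standard codecs follows. The function name is an assumption, and this shows the general technique, not how Word or Excel implement it.

```python
def transcode(data: bytes, source_encoding: str, target_encoding: str) -> bytes:
    """Decode `data` from one encoding to Unicode, then encode it in another.

    Raises UnicodeDecodeError or UnicodeEncodeError when the bytes are not valid
    in the source encoding or the text cannot be represented in the target.
    """
    text = data.decode(source_encoding)
    return text.encode(target_encoding)


# GB18030 covers all of Unicode, so it round-trips even a plane 2 character.
sample = "\u4e2d\u6587 \U00020000"
gb_bytes = sample.encode("gb18030")
utf8_bytes = transcode(gb_bytes, "gb18030", "utf-8")
```

Going through Unicode in the middle is what makes any-to-any conversion possible. A conversion only fails when the target code page has no slot for a character, for example Chinese text saved as cp1252.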
Multilingual Word Processing • Proofing tool interfaces are Unicode • SDKs available for 3rd party development • Tools for over 35 languages available • European languages, Japanese, Chinese, Korean, Arabic, Hebrew, Thai, Hindi… • Spelling, Grammar, Hyphenation, Thesaurus • Traditional/Simplified Chinese conversion • Japanese character usage consistency checker • Hangul/Hanja conversion • Translation dictionaries (available offline) • Automatic translation web services
Multilingual Data Processing • Access databases are Unicode • Hook up to SQL7.x/2000 Unicode databases • Excel workbooks are Unicode • Hook up to Unicode databases using OLE-DB • Create Pivot lists and manipulate Unicode data • PowerPoint creates multilingual multimedia • Web sites, animations
Web sites • URLs transmitted in UTF-8 (before the “?”) • FrontPage • Create and edit web pages in Unicode • Word • WYSIWYG Web pages • Save in full or “filtered” HTML • IE5.x, IE6 • Display Unicode 3.1 pages
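The first bullet says only the part of a URL before the "?" is sent as UTF-8. A minimal Python sketch of that split follows, using `urllib.parse.quote` from the standard library. The function name and the example URL are made up for illustration, and the sketch assumes the input path is not already percent-encoded.

```python
from urllib.parse import quote


def encode_path_as_utf8(url: str) -> str:
    """Percent-encode everything before the first "?" as UTF-8.

    The query string (after the "?") is passed through unchanged, since its
    encoding is not covered by the UTF-8 rule.
    """
    path, separator, query = url.partition("?")
    # Keep "/" and ":" so the scheme and path structure survive.
    return quote(path, safe="/:", encoding="utf-8") + separator + query


encoded = encode_path_as_utf8("http://example.com/docs/日本語.html?q=日本語")
# encoded == "http://example.com/docs/%E6%97%A5%E6%9C%AC%E8%AA%9E.html?q=日本語"
```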
Mail and PIM • Outlook • Send/receive mail in any encoding • Use Word to edit mail for richest experience
Unicode HTML • HTML is a companion file format • Roundtrip all formatting • Optional HTML Filter cuts file size for publishing • Save to web servers directly • Roundtrip Unicode data in any encoding • UTF-8 and UTF-16 are supported too • HTML is tagged with encoding
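"HTML is tagged with encoding" means the saved file declares its own character set, so a reader can decode it without guessing. A minimal Python sketch of writing and reading such a tag follows. The function names are hypothetical, the markup is far simpler than what Office emits, and the reader assumes an ASCII-compatible encoding (it would not sniff a UTF-16 file).

```python
import html
import re


def save_html(body_text: str, encoding: str) -> bytes:
    """Build a small HTML page that declares `encoding`, encoded in that encoding."""
    page = (
        '<html><head><meta http-equiv="Content-Type" '
        f'content="text/html; charset={encoding}"></head>'
        f"<body>{html.escape(body_text)}</body></html>"
    )
    return page.encode(encoding)


def load_html(data: bytes) -> str:
    """Decode a page using the charset it declares, defaulting to UTF-8."""
    head = data[:1024].decode("ascii", errors="ignore")  # sniff the ASCII-only tag
    match = re.search(r"charset=([\w-]+)", head)
    encoding = match.group(1) if match else "utf-8"
    return data.decode(encoding)
```

The same text round-trips whether it is saved as `utf-8` or as a legacy encoding such as `shift_jis`, provided the encoding can represent every character in the body.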
Unicode e-mail • Office2000/XP provides fully multilingual email • HTML mail uses internet standards • All Unicode content preserved • Plugs into Outlook, Exchange • Use Word to compose replies and new messages • Send in plain text, RTF, or HTML • All applications can mail documents as HTML
Future Directions • Help Windows build a worldwide platform • Ensure system support is useful to app writers • Unicode 3.1 languages too • Extend Unicode support to more apps • Outlook • Combining diacritics, OpenType support
Questions and Answers