SSML Extensions for Chinese Voice Browsing
Helen MENG, Wai-Kit LO, Tien-Ying FUNG, Yuk-Chi LI and Zhiyong WU
Human-Computer Communications Laboratory, Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong
2nd November, 2005
Outline
• Characteristics of Chinese
• Proposed attributes for existing elements
  • “dialect-accent”
• Proposed elements
  • <phrase> and <word>
  • <tone>
• Proposed attribute values
  • for the “interpret-as” attribute of the <say-as> element
• Summary
Characteristics of Chinese
• Rich in dialects, e.g., Cantonese, Shanghainese, Mandarin
• Written alike, spoken differently
  • similar writing systems, e.g., 中国 and 中國
  • significantly different pronunciations
  • Mandarin is spoken with different accents
• No explicit phrase and word boundaries
  • e.g., 我們現在在開電話會議: 我們 (we are) 現在 (now) 在開 (having) 電話會議 (a teleconference)
  • proper segmentation is critical for prosodic control, for pronunciation selection for homographs and for resolving semantic ambiguity
• Monosyllabic and tonal
  • syllable + lexical tone → lexical meaning of a Chinese character
  • tone can change according to meaning, context and mode of speaking
Phonetic Transcription Schemes
• Pronunciation of a character = tonal syllable = syllable + tone
• Many transcription schemes have been developed for different dialects
  • syllable written in the Roman alphabet
  • tone written as a one-digit Arabic numeral
• Popular schemes are
  • pinyin (for Mandarin): 銀行 (bank) is /yin2 hang2/
  • jyutping (for Cantonese): 銀行 (bank) is /ngan4 hong4/
Chinese Tone Systems
(Figure axes: level of F0, time)
Figure 1. Mandarin tone system (4 tones + 1 ‘light’ tone)
• (1) 陰平 /yin ping/, high level, e.g., 媽
• (2) 陽平 /yang ping/, low level, e.g., 麻
• (3) 上 /shang/, rising, e.g., 馬
• (4) 去 /qu/, going, e.g., 罵
Figure 2. Cantonese tone system (9 tones, specified in 6 classes)
• (1) 陰平, high level, e.g., 詩
• (2) 陰上, high rising, e.g., 史
• (3) 陰去, high going, e.g., 試
• (4) 陽平, low level, e.g., 時
• (5) 陽上, low rising, e.g., 市
• (6) 陽去, low going, e.g., 事
• 7 (class 1) 陰入, high entering, e.g., 色
• 8 (class 3) 中入, middle entering, e.g., 舌
• 9 (class 6) 陽入, low entering, e.g., 蝕
“dialect-accent”
• Beijing Mandarin
• Guangdong Mandarin
• Hong Kong Cantonese
Proposed “dialect-accent” Attribute
• Specifies dialects and accents within a language
  • used together with xml:lang [XML1.0]
• dialect-accent = primary-subtag ["-" optional-subtag]
• primary-subtag = 2ALPHA
  • specifies the dialect
  • e.g., MD for Mandarin, CT for Cantonese
• optional-subtag = 2ALPHA
  • specifies the accent
  • e.g., BJ for Beijing, GD for Guangdong, HK for Hong Kong
  • follows the abbreviations of Chinese provinces, autonomous regions and special administrative regions listed in the EDU.CN Domain Policy (中國教育和科研計算機網 EDU.CN 網絡域名註冊辦法) [1]
• Examples
  • Mandarin with Beijing and Guangdong accents: MD-BJ, MD-GD
  • Cantonese with Hong Kong and Guangdong accents: CT-HK, CT-GD
[1] Defined by the China Education and Research Network Information Centre (CERNET 網絡信息中心)
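The grammar on this slide can be checked mechanically. The following is a minimal sketch in Python, not part of the proposal: the function name `parse_dialect_accent` and the lookup tables are illustrative assumptions, and the tables hold only the codes quoted on the slide.

```python
import re

# Codes quoted on the slide; a real registry would be larger (assumption).
DIALECTS = {"MD": "Mandarin", "CT": "Cantonese"}
ACCENTS = {"BJ": "Beijing", "GD": "Guangdong", "HK": "Hong Kong"}

# dialect-accent = primary-subtag ["-" optional-subtag], each subtag 2ALPHA
_PATTERN = re.compile(r"([A-Za-z]{2})(?:-([A-Za-z]{2}))?")

def parse_dialect_accent(value):
    """Split a dialect-accent value into (dialect, accent); accent may be None."""
    match = _PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid dialect-accent value: {value!r}")
    return match.group(1), match.group(2)

def describe(value):
    """Return a readable label, falling back to the raw code when unknown."""
    dialect, accent = parse_dialect_accent(value)
    label = DIALECTS.get(dialect, dialect)
    if accent is not None:
        label += " with " + ACCENTS.get(accent, accent) + " accent"
    return label
```

For example, `describe("MD-BJ")` gives "Mandarin with Beijing accent", while a value such as "MD-BEI" is rejected because the accent subtag is not exactly two letters.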
“dialect-accent” Attribute (continued)
xml:lang value   Dialect     Accent      “dialect-accent” value
zh-HK            Cantonese   Hong Kong   CT-HK
                 Cantonese   Guangdong   CT-GD
                 Mandarin    Hong Kong   MD-HK
                 Mandarin    Beijing     MD-BJ
                 Mandarin    Taiwan      MD-TW
Example
<p>Hello, where are you from?</p>
<p xml:lang="zh-CN" dialect-accent="MD-BJ">我從北京來的。</p>
  Mandarin with Beijing accent: 我 (I am) 從 (from) 北京 (Beijing) 來的
<p xml:lang="zh-CN" dialect-accent="MD-GD">我從廣東來的。</p>
  Mandarin with Guangdong accent: 我 (I am) 從 (from) 廣東 (Guangdong) 來的
<p xml:lang="zh-CN" dialect-accent="CT-HK">我從香港來的。</p>
  Cantonese with Hong Kong accent: 我 (I am) 從 (from) 香港 (Hong Kong) 來的
Enrich <p>, <s> with <phrase>, <word>
• Current SSML 1.0: <p> and <s>
• Proposed elements: <phrase> and <word>
  • serve as cues for prosodic control (e.g., pauses)
  • assist correct pronunciation selection for homographs
• A Cantonese example: the character 行 has five pronunciations
  • /haang4/ 行山 (hiking)
  • /hang6/ 品行 (conduct)
  • /hong2/ 洋行 (foreign trading company)
  • /hong4/ 銀行 (bank)
  • /hang4/ 行人 (pedestrian)
Proposed <phrase> Element
• Definition
  • marks the extent of a Chinese phrase
  • no attributes
  • occurs within <s>
• These elements can be nested within <phrase>
  • <audio>, <break>, <emphasis>, <mark>, <phoneme>, <prosody>, <say-as>, <sub>, <voice>, <word>
• Example (an ancient poem): 終年倒運少有餘財
  • Pessimistic phrasing
    <phrase>終年倒運</phrase> <phrase>少有餘財</phrase>
    (unlucky the whole year) (not much money left)
  • Optimistic phrasing
    <phrase>終年倒運少</phrase> <phrase>有餘財</phrase>
    (only a few unlucky events in the year) (have money left)
Proposed <word> Element
• Definition
  • marks the extent of a Chinese word
  • no attributes
  • occurs within <s> and <phrase>
• These elements can be nested within <word>
  • <audio>, <break>, <emphasis>, <mark>, <phoneme>, <prosody>, <say-as>, <sub>, <voice>
• Example: 這一晚會如常舉行
  • Segmentation 1, with 會 read as /wui2/
    <word>這一</word> <word>晚會</word> <word>如常</word> <word>舉行</word>
    (this) (banquet) (as usual) (hold): This banquet is held as usual
  • Segmentation 2, with 會 read as /wui3/
    <word>這一晚</word> <word>會</word> <word>如常</word> <word>舉行</word>
    (tonight) (will) (as usual) (hold): Tonight will be held as usual
Proposed <tone> Element
• Tone
  • important in Chinese pronunciation
  • tones can vary according to differences in meaning, context and mode of speaking
  • e.g., 相 in tone 2 means photo; in tone 3 it means facial appearance or minister
• Current SSML 1.0: <phoneme>
  • requires a full pronunciation transcription
  • Example
    <phoneme alphabet="x-lshk-jyutping" ph="soeng2">相</phoneme>
    <phoneme alphabet="x-lshk-jyutping" ph="soeng3">相</phoneme>
• Proposed <tone> element
  • with the required “value” attribute
    <tone value="2">相</tone> (photo)
    <tone value="3">相</tone> (facial appearance)
  • the alphabet attribute is inherited or specified explicitly
Examples of Using the <tone> Element
• Tone changes with meaning
  • 糖 (candy / sugar)
    <tone value="2">糖</tone> (tone 2, /tong2/: means candy)
    <tone value="4">糖</tone> (tone 4, /tong4/: means sugar)
• Tone changes with context
  • 爺 (grandfather)
    阿<tone value="4">爺</tone> (tone 4, /je4/: preceded by 阿)
    爺<tone value="2">爺</tone> (tone 2, /je2/: preceded by 爺)
• Tone changes with mode of speaking
  • 英文 (English)
    英<tone value="4">文</tone> (tone 4, /man4/: formal mode)
    英<tone value="2">文</tone> (tone 2, /man2/: colloquial mode)
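One way to read the relationship between <tone> and <phoneme> is that <tone> supplies only the tone digit, while the toneless syllable comes from the processor's own lexicon. The sketch below illustrates that reading in Python. It is an assumption about how a processor might behave, not something the proposal defines: `SYLLABLES` is a hypothetical lexicon holding only the jyutping syllables quoted on these slides, and `tone_to_phoneme` is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Hypothetical lexicon: toneless jyutping syllables taken from the slides.
SYLLABLES = {"相": "soeng", "糖": "tong", "爺": "je", "文": "man"}

def tone_to_phoneme(tone_xml, alphabet="x-lshk-jyutping"):
    """Rewrite one <tone value="N">char</tone> as an equivalent <phoneme>."""
    element = ET.fromstring(tone_xml)
    if element.tag != "tone":
        raise ValueError("expected a <tone> element")
    value = element.get("value")
    if value is None:
        raise ValueError('<tone> requires the "value" attribute')
    char = element.text or ""
    if char not in SYLLABLES:
        raise KeyError(f"no syllable known for {char!r}")
    # Tonal syllable = toneless syllable + one-digit tone.
    phoneme = ET.Element("phoneme", {"alphabet": alphabet, "ph": SYLLABLES[char] + value})
    phoneme.text = char
    return ET.tostring(phoneme, encoding="unicode")
```

Under these assumptions, `tone_to_phoneme('<tone value="2">相</tone>')` yields a <phoneme> element with ph="soeng2", matching the SSML 1.0 form shown on the previous slide.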
Proposed Legal Values for the “interpret-as” Attribute
• VoiceXML 2.0 Appendix P
  • boolean, date, digits, currency, number, phone, time
• SSML 1.0 <say-as> attribute values (W3C Working Group Note 2005)
  • date, time, telephone, characters, cardinal, ordinal
• Six new values are proposed
  • Chinese-name
  • fraction
  • measure
  • net
  • percentage
  • ratio
“Chinese-name” Value
• Marks the text as a name to aid pronunciation selection
  • e.g., 單明明
    單: /daan1/ in general, /sin6/ as a surname
    明明: /ming4 ming4/ in general, /ming4 ming2/ as a given name
• Format: S*G*
  • S: surname character, G: given-name character
• Examples
  • <say-as interpret-as="Chinese-name" format="SG">姚明</say-as> (Yao Ming)
  • <say-as interpret-as="Chinese-name" format="SGG">單明明</say-as> (Sin Ming Ming)
  • <say-as interpret-as="Chinese-name" format="SSG">歐陽修</say-as> (Au-yeung Sau)
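The S*G* format can be applied to a name as follows. This is a minimal Python sketch that assumes one format letter per character of the name, which is consistent with the three examples above but is not stated explicitly on the slide; `split_chinese_name` is an illustrative name.

```python
import re

def split_chinese_name(name, fmt):
    """Split a name into (surname, given name) using an S*G* format string."""
    if re.fullmatch(r"S*G*", fmt) is None:
        raise ValueError(f"format must match S*G*: {fmt!r}")
    # Assumption: each letter of the format covers exactly one character.
    if len(fmt) != len(name):
        raise ValueError("format length must equal the number of characters")
    surname_length = fmt.count("S")
    return name[:surname_length], name[surname_length:]
```

For the double-character surname example, `split_chinese_name("歐陽修", "SSG")` returns `("歐陽", "修")`, so a synthesizer could select surname readings for the first part and given-name readings for the second.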
“fraction” Value
• Marks the text as a fraction
  • e.g., 3/4
• Verbalization of fractions in Chinese
  • uses an additional word: 分之 (out of)
  • A/B (A out of B) is read as B 分之 A (note that the order is reversed)
  • e.g., 3/4 is verbalized as 四 (four) 分之 (out of) 三 (three)
• “format” and “detail” attributes are not required
• Example
  • 我吃了3/4個橙: 我 (I) 吃了 (ate) 橙 (orange)
  • 我吃了<say-as interpret-as="fraction">3/4</say-as>個橙
  • read as 我吃了四分之三個橙 (I ate three quarters of the orange)
“measure” Value
• Marks the text as a measurement
  • e.g., 10cm, 30ml
• measurement = number + unit
  • number [VoiceXML2.0]; e.g., 10 is read as ten (not one zero)
  • unit: translated and pronounced in Chinese, e.g., cm is 厘米, g is 克, oz is 安士, yd is 碼
• “format” and “detail” attributes are not required
• Example
  • 他的身高是180cm: 他的 (his) 身高 (height) 是 (is)
  • 他的身高是<say-as interpret-as="measure">180cm</say-as>
  • read as 他的身高是一百八十厘米 (his height is 180cm)
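The "number + unit" decomposition can be sketched in Python as below. The unit table contains only the four translations given on the slide; the function name `split_measure` and the decision to return the number as an unparsed string (leaving its verbalization to a separate number reader) are assumptions made for illustration.

```python
import re

# Unit translations quoted on the slide; other units are not covered here.
UNITS = {"cm": "厘米", "g": "克", "oz": "安士", "yd": "碼"}

_MEASURE = re.compile(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*")

def split_measure(text):
    """Split e.g. '180cm' into the number string and the Chinese unit name."""
    match = _MEASURE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a measurement: {text!r}")
    number, unit = match.groups()
    if unit not in UNITS:
        raise KeyError(f"no Chinese translation known for unit {unit!r}")
    return number, UNITS[unit]
```

With this sketch, `split_measure("180cm")` returns `("180", "厘米")`; the number part would then be read as a cardinal (一百八十), as in the slide's example.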
“net” Value
• Marks the text as a URI or an email address
• Possible ways to verbalize a URI
  1. Read the whole string in English, including punctuation
  2. Omit http:// (ftp://, etc.) and read the rest in English
  3. Read letters in English and punctuation in Chinese
• “format” attribute value: “email” or “uri”
• Example
  • 詳情請瀏覽http://www.w3.org: 詳情 (for details) 請 (please) 瀏覽 (browse)
  • 詳情請瀏覽<say-as interpret-as="net" format="uri"> http://www.w3.org </say-as>
• Possible verbalizations
  1. H T T P colon slash slash W W W dot W three dot O R G
  2. W W W dot W three dot O R G
  3. W W W 點 W 三 點 O R G (點: dot, 三: three)
  [Similarly, keeping the protocol part is another option for the third style]
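The three verbalization styles can be reproduced with a small character-by-character mapping. The Python sketch below is illustrative only: `verbalize_uri` and its two flags are assumed names, and the Chinese table covers just the dot and the digits, since the slide gives no Chinese readings for other punctuation.

```python
EN_DIGITS = "zero one two three four five six seven eight nine".split()
ZH_DIGITS = "零一二三四五六七八九"

# Spoken forms per character; anything not listed is passed through unchanged.
EN_TABLE = {".": "dot", ":": "colon", "/": "slash"}
EN_TABLE.update({str(i): word for i, word in enumerate(EN_DIGITS)})
ZH_TABLE = {".": "點"}
ZH_TABLE.update({str(i): char for i, char in enumerate(ZH_DIGITS)})

def verbalize_uri(uri, drop_scheme=False, chinese=False):
    """Spell out a URI: letters one by one, digits and punctuation as words."""
    uri = uri.strip()
    if drop_scheme and "://" in uri:
        uri = uri.split("://", 1)[1]  # omit "http://", "ftp://", etc.
    table = ZH_TABLE if chinese else EN_TABLE
    tokens = []
    for char in uri:
        if char.isalpha():
            tokens.append(char.upper())  # letters are read in English
        else:
            tokens.append(table.get(char, char))
    return " ".join(tokens)
```

Calling `verbalize_uri("http://www.w3.org")` gives style 1, adding `drop_scheme=True` gives style 2, and `drop_scheme=True, chinese=True` gives style 3 ("W W W 點 W 三 點 O R G").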
“percentage” Value
• Marks the text as a percentage
• Verbalization of percentages in Chinese
  • uses an additional word: 百分之 (out of a hundred)
  • A% is read as 百分之 A
  • e.g., 70% is verbalized as 百分之 (out of a hundred) 七十 (seventy)
• “format” and “detail” attributes are not required
• Example
  • 海洋約佔全球總面積的70%: 海洋 (ocean) 約佔 (covers) 全球 (global) 總面積 (surface)
  • 海洋約佔全球總面積的<say-as interpret-as="percentage">70%</say-as>
  • read as 海洋約佔全球總面積的百分之七十 (the ocean covers 70% of the global surface)
“ratio” Value
• Marks the text as a ratio
  • e.g., 1:3
• Verbalization of ratios in Chinese
  • uses an additional word: 比 (to)
  • A:B (A to B) is read as A 比 B
  • e.g., 1:99 is verbalized as 一 (one) 比 (to) 九十九 (ninety-nine)
• “format” and “detail” attributes are not required
• Example
  • 用1:99 的稀釋漂白水: 用 (use) 稀釋 (diluted) 漂白水 (bleach water)
  • 用<say-as interpret-as="ratio">1:99</say-as>的稀釋漂白水
  • read as 用一比九十九的稀釋漂白水 (use diluted bleach at a ratio of 1:99)
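The verbalization rules for the “fraction”, “percentage” and “ratio” values share one building block, reading a number in Chinese. The Python sketch below puts them together. It is an illustration rather than the authors' implementation: all function names are assumptions, and `to_chinese` handles only whole numbers from 0 to 9999.

```python
DIGITS = "零一二三四五六七八九"
PLACES = ["", "十", "百", "千"]  # units, tens, hundreds, thousands

def to_chinese(n):
    """Read a whole number from 0 to 9999 as a Chinese cardinal."""
    if not 0 <= n <= 9999:
        raise ValueError("this sketch only covers 0 to 9999")
    if n == 0:
        return DIGITS[0]
    text = str(n)
    parts = []
    skipped_zero = False
    for index, char in enumerate(text):
        digit = int(char)
        if digit == 0:
            skipped_zero = True
            continue
        if skipped_zero:
            parts.append(DIGITS[0])  # one 零 stands for a run of inner zeros
            skipped_zero = False
        parts.append(DIGITS[digit] + PLACES[len(text) - 1 - index])
    result = "".join(parts)
    # 10 to 19 are read 十, 十一, ... rather than 一十, 一十一, ...
    return result[1:] if result.startswith("一十") else result

def verbalize_fraction(text):
    """A/B is read as B 分之 A (denominator first)."""
    numerator, denominator = text.split("/")
    return to_chinese(int(denominator)) + "分之" + to_chinese(int(numerator))

def verbalize_percentage(text):
    """A% is read as 百分之 A."""
    return "百分之" + to_chinese(int(text.rstrip("%")))

def verbalize_ratio(text):
    """A:B is read as A 比 B."""
    left, right = text.split(":")
    return to_chinese(int(left)) + "比" + to_chinese(int(right))
```

The slide examples come out as expected under these assumptions: `verbalize_fraction("3/4")` gives 四分之三, `verbalize_percentage("70%")` gives 百分之七十, and `verbalize_ratio("1:99")` gives 一比九十九.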
Summary
• “dialect-accent” attribute to enrich the xml:lang attribute
• <phrase> and <word> for text processing
• <tone> for pronunciation
• 6 values for the “interpret-as” attribute
  • Chinese-name
  • fraction
  • measure
  • net
  • percentage
  • ratio