LOU Xiaoyan, LI Jian
Research and Development Center, Toshiba China
Suggestions on Tone and Word Boundary of Mandarin for SSML
Outline
• Tone
• Word boundary
Tone (cont…)
• Importance
  • Tone is as important as phonemes in a tonal language
  • The same syllable with different tones has different meanings: 妈 (mā), 麻 (má), 马 (mǎ), 骂 (mà)
  • Tone sandhi occurs in tonal languages: 你好 ni3 hao3 → ni2 hao3
  • Synthesis with correct tones helps the listener catch the meaning of the speech
• Non-markup behavior
  • Tones can be obtained by looking up a dictionary or applying rules
  • Errors may occur, especially when dealing with sandhi
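
The rule-based behavior mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes syllables are written as numbered Pinyin (e.g. "ni3"), and the hypothetical helper `apply_third_tone_sandhi` covers only the simplest rule, where a third tone directly before another third tone is pronounced as a second tone. Longer runs of third tones depend on phrasing, which is one reason rule-based results can be wrong.

```python
def apply_third_tone_sandhi(syllables):
    """Return a copy in which every tone-3 syllable that directly
    precedes another tone-3 syllable is changed to tone 2.

    Simplified assumption: the decision looks only at the original
    tones of adjacent syllables and ignores word and phrase structure.
    """
    result = list(syllables)
    for i in range(len(syllables) - 1):
        if syllables[i].endswith("3") and syllables[i + 1].endswith("3"):
            result[i] = syllables[i][:-1] + "2"
    return result


print(apply_third_tone_sandhi(["ni3", "hao3"]))  # ['ni2', 'hao3']
```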
Suggestion on Tone (cont…)
• Our suggestions
  • Use a Pinyin sequence as the value of the phoneme element
  • Use the numbers 1, 2, 3, 4 and 5 for the Mandarin tones “yin ping”, “yang ping”, “shang sheng”, “qu sheng” and the neutral tone:
    Text: 大都 (dàdōu)
    Pinyin sequence + tone: /da 4/dou 1/
  • Solution 1: a new tone element (optional) with a required attribute detail:
    <tone detail="4 1">大都</tone>
  • Solution 2: new values “t” and “pt” for the alphabet attribute of the phoneme element:
    <phoneme alphabet="t" ph="4 1">大都</phoneme>
    <phoneme alphabet="pt" ph="da 4/dou 1">大都</phoneme>
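
As a rough illustration of the two proposed notations, the sketch below builds both from a Pinyin-plus-tone sequence written in the /da 4/dou 1/ form used on this slide. The function names are hypothetical, and the tone element and the "t"/"pt" alphabet values are the proposals made here, not part of SSML 1.0.

```python
from xml.sax.saxutils import escape, quoteattr


def parse_pinyin_tone(sequence):
    """Split '/da 4/dou 1/' into [('da', 4), ('dou', 1)]."""
    pairs = []
    for chunk in sequence.strip("/").split("/"):
        syllable, tone = chunk.split()
        pairs.append((syllable, int(tone)))
    return pairs


def tone_element(text, pairs):
    """Solution 1 (proposed): <tone detail="...">text</tone>."""
    detail = " ".join(str(tone) for _, tone in pairs)
    return f"<tone detail={quoteattr(detail)}>{escape(text)}</tone>"


def phoneme_element(text, pairs, alphabet="pt"):
    """Solution 2 (proposed): alphabet 't' carries tones only,
    'pt' carries Pinyin syllables together with their tones."""
    if alphabet == "t":
        ph = " ".join(str(tone) for _, tone in pairs)
    else:
        ph = "/".join(f"{syllable} {tone}" for syllable, tone in pairs)
    return (f"<phoneme alphabet={quoteattr(alphabet)} "
            f"ph={quoteattr(ph)}>{escape(text)}</phoneme>")


pairs = parse_pinyin_tone("/da 4/dou 1/")
print(tone_element("大都", pairs))
print(phoneme_element("大都", pairs, alphabet="t"))
print(phoneme_element("大都", pairs, alphabet="pt"))
```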
Note on Tone Markup
• Possible influence on SSML 1.0
  • Solution 1: the tone element cannot contain other elements, and can be enclosed by the p, s and w (if defined) elements
  • Solution 2: the phoneme element is modified; its relation to other elements should not change
• The tone strings given by markup must not be changed
  • in the text normalization step
  • by the result of looking up the lexicon
• Tone markup should be ignored when
  • a tone value is invalid
  • the length of the tone sequence does not match the text
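
The two conditions for ignoring tone markup could be checked roughly as below. This is a sketch under stated assumptions: the hypothetical function `usable_tones` assumes the contained text consists only of Chinese characters, that each character carries exactly one tone, and that valid tone values are the digits 1 to 5 proposed earlier.

```python
VALID_TONES = {"1", "2", "3", "4", "5"}


def usable_tones(detail, text):
    """Return the tones as a list of ints, or None when the markup
    should be ignored (invalid value or unmatched length)."""
    tones = detail.split()
    if any(tone not in VALID_TONES for tone in tones):
        return None  # value error
    if len(tones) != len(text):
        return None  # unmatched length (one tone per character assumed)
    return [int(tone) for tone in tones]


print(usable_tones("4 1", "大都"))  # [4, 1]
print(usable_tones("4 6", "大都"))  # None: 6 is not a valid tone
print(usable_tones("4", "大都"))    # None: one tone for two characters
```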
Outline
• Tone
• Word boundary
Word Boundary (cont…)
• The word is the basic unit for sentence parsing and understanding.
• Chinese sentences are sequences of Chinese characters, with no blanks or spaces to mark word boundaries.
• Difficulties:
  • Complex words, such as reduplications and derived words, e.g. “简简单单” (very simple), “非物质” (non-material)
  • Proper nouns, such as location names and person names
  • Ambiguous word segmentations:
    A: 上海 是 个 大都会。 (Shanghai is a metropolis)
    B: 上海人 大都 会 那么 说。 (Most Shanghainese would say that)
• Non-markup behavior
  • Word boundaries are determined using language-specific knowledge
  • Errors may occur
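
To make the ambiguity concrete, the sketch below runs forward maximum matching, a common dictionary-based segmentation heuristic chosen here purely for illustration (the slides do not name a specific method). The lexicon is a toy assumption. The heuristic segments sentence A as intended but greedily picks 大都会 in sentence B, where the intended reading is 大都 + 会.

```python
def forward_maximum_matching(text, lexicon, max_len=3):
    """Greedy left-to-right segmentation: at each position take the
    longest lexicon entry (up to max_len characters); fall back to a
    single character when nothing matches."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Toy lexicon, assumed for this example only.
LEXICON = {"上海", "上海人", "大都", "大都会", "会", "那么", "说", "是", "个"}

print(forward_maximum_matching("上海是个大都会", LEXICON))
# ['上海', '是', '个', '大都会']  (matches segmentation A)
print(forward_maximum_matching("上海人大都会那么说", LEXICON))
# ['上海人', '大都会', '那么', '说']  (differs from segmentation B)
```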
Suggestions on Word Boundary (cont…)
• A new element w is suggested:
  <w>都会</w>
• An optional attribute detail is also recommended for marking phrases:
  <w detail="3 2 1">上海人大都会</w>
  Here the phrase is split into three words containing 3, 2 and 1 Chinese characters respectively.
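
Going from a known segmentation to the proposed markup is mechanical; a minimal sketch follows. The function name `w_element` is hypothetical, and the w element with its detail attribute is the proposal of these slides rather than existing SSML 1.0 markup.

```python
from xml.sax.saxutils import escape, quoteattr


def w_element(words):
    """Wrap a segmented phrase in the proposed <w> element, with the
    character count of each word listed in the detail attribute."""
    detail = " ".join(str(len(word)) for word in words)
    text = "".join(words)
    return f"<w detail={quoteattr(detail)}>{escape(text)}</w>"


print(w_element(["上海人", "大都", "会"]))
# <w detail="3 2 1">上海人大都会</w>
```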
Suggestion on Word Boundary (cont…)
• Legal values of the optional attribute detail
  • Not bigger than the length of the contained text:
    <w detail="3">上海</w>
  • The default value is the length of the contained text:
    <w>上海</w>
  • When the sum of the values is smaller than the length of the contained text, the remaining part is regarded as one word:
    <w detail="3">上海人大都会</w>
    The first 3 Chinese characters “上海人” are regarded as one word and the remaining “大都会” as another word.
  • When the sum of the values is bigger than the length of the contained text, the markup should be ignored.
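
The rules for the detail attribute can be sketched as a small function. This is an illustrative reading of the rules, not a reference implementation: the name `split_by_detail` is hypothetical, and treating empty, non-integer or non-positive values as "markup ignored" is an added assumption that the slides do not address.

```python
def split_by_detail(text, detail=None):
    """Split the text contained in a <w> element according to detail.

    Returns a list of words, or None when the markup should be ignored.
    """
    if detail is None:
        return [text]  # default: the whole contained text is one word
    try:
        lengths = [int(value) for value in detail.split()]
    except ValueError:
        return None  # assumption: non-integer values invalidate the markup
    if not lengths or any(n <= 0 for n in lengths):
        return None  # assumption: empty or non-positive values are invalid
    if sum(lengths) > len(text):
        return None  # sum bigger than the text length: ignore the markup
    words, pos = [], 0
    for n in lengths:
        words.append(text[pos:pos + n])
        pos += n
    if pos < len(text):
        words.append(text[pos:])  # the remaining part forms one more word
    return words


print(split_by_detail("上海人大都会", "3 2 1"))  # ['上海人', '大都', '会']
print(split_by_detail("上海人大都会", "3"))      # ['上海人', '大都会']
print(split_by_detail("上海"))                   # ['上海']
```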
Possible Influence on SSML 1.0
• Influence on speech synthesis steps
  • Word segmentation should be done before text parsing and structure analysis
• Relation between SSML 1.0 markup and the word segmentation markup w (needs more discussion)
  • The p and s elements can contain the w element:
    <p> <w detail="2">上海</w> </p>
  • The w element can contain audio, emphasis, phoneme, prosody, say-as, sub, voice and t (if defined):
    <w detail="2"><prosody rate="-10%">上海</prosody></w>大都会
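
Because w may wrap other elements such as prosody, a processor would need the text of all descendants when comparing detail against the contained text. The sketch below shows one way to collect it with Python's standard xml.etree.ElementTree. The surrounding speak root is added only to make the fragment well-formed, namespaces are omitted for brevity, and w is the element proposed in these slides.

```python
import xml.etree.ElementTree as ET


def contained_text(element):
    """Concatenate the text of an element and all of its descendants
    (text following the element's closing tag is not included)."""
    return "".join(element.itertext())


doc = (
    '<speak>'
    '<p><w detail="2">上海</w></p>'
    '<w detail="2"><prosody rate="-10%">上海</prosody></w>大都会'
    '</speak>'
)

root = ET.fromstring(doc)
words = [(w.get("detail"), contained_text(w)) for w in root.iter("w")]
print(words)  # [('2', '上海'), ('2', '上海')]
```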