This presentation discusses the challenges of handling diacritics, tokenisation, and custom word stress in TTS (Text-to-Speech) system development. It explores the usefulness of tokenisation and the impact of word stress on language rendering. Examples and considerations from the FT R&D position paper are presented.
0x0141 0x0142 Przemysław Zdroik [ pshemýswuf zdróik ] IPA: pʃɛ.mɨ’.swaf zdrɔ’.ik [ pshemek ]
TTS system development France Telecom owns Polish Telecom Speech technologies testing and evaluating „advanced user”
FT R&D France / Poland • Missing diacritics • Tokenisation • Custom word (lexical) stress Przemysław Zdroik, Paul Bagshaw 2nd Workshop on Internationalizing SSML, Crete, 31st May 2006
Plan of the presentation • A few words about diacritics • Usefulness of tokenisation • Word stress • Discussion Regarding: FT R&D position paper
Text stripped of diacritics • Occurs in SMS messages, e-mails, IM, newsgroup posts • Additional pronunciation ambiguities appear: French: cure => cure vs. curé Polish: maki => maki vs. mąki Czech, Slovak, Slovenian … Similar problems (incorrect TTS input): • Lack of accents in Russian texts • Informal „romanisations” used in SMS messages (Greek, Russian – volapuk encoding)
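The ambiguity described on this slide can be reproduced with a minimal sketch, assuming diacritics are lost through canonical decomposition followed by removal of combining marks. The function name `strip_diacritics` is hypothetical; the word pairs are the ones from the slide.

```python
import unicodedata

def strip_diacritics(text: str) -> str:
    """Decompose to NFD and drop every combining mark (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Word pairs from the slide: two distinct words share one stripped spelling.
pairs = [("cure", "curé"), ("maki", "mąki")]
collisions = [(a, b) for a, b in pairs
              if strip_diacritics(a) == strip_diacritics(b)]

for plain, accented in collisions:
    print(f"{accented!r} becomes indistinguishable from {plain!r}")
```

Letters without a canonical decomposition, such as Polish ł (U+0142), pass through this function unchanged, so text stripped by hand or by a legacy system may lose more information than the sketch models.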
Question: Should we allow incorrect text as input for TTS? In other words: should we expect a TTS system to try to correct input marked as incorrect (e.g. as „sms_content” or „email”)? IMHO: it is a „nice to have” feature of TTS/SSML 1.1, but not an essential one.
Examples of tokenisation usefulness (1/3) French: • The word pair “bien que” may be either a locution (to be considered as a single word; POS=conjunction) or two separate words (bien POS=adverb; que POS=conjunction). Il continue <token>bien que</token> ça soit perdu. Il faut <token>bien</token> <token>que</token> jeunesse se passe. • The phrase “rendez-vous” may also be considered as one word (POS=noun) or two words (rendez POS=verb; vous POS=pronoun). <token>Rendez-vous</token> de la semaine dernière. <token>Rendez</token><token>-vous</token> au 1 avril?
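How a processor might read such markup can be sketched as follows. This is only an illustration: the wrapping `<s>` root and the helper name `token_texts` are assumptions added to make the fragments well-formed XML, and `<token>` is used exactly as written on the slide.

```python
import xml.etree.ElementTree as ET

def token_texts(fragment: str) -> list:
    """Return the text of each <token> element in a marked-up fragment."""
    # Hypothetical <s> wrapper: a bare fragment is not a complete XML document.
    root = ET.fromstring(f"<s>{fragment}</s>")
    return [tok.text or "" for tok in root.iter("token")]

# "bien que" marked as one locution vs. two separate words (slide examples).
locution = token_texts("Il continue <token>bien que</token> ça soit perdu.")
separate = token_texts(
    "Il faut <token>bien</token> <token>que</token> jeunesse se passe."
)

print(locution)
print(separate)
```

With explicit tokens, a front end could look up “bien que” as a single lexicon entry in the first sentence and as two entries in the second, instead of guessing from whitespace.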
Examples of tokenisation usefulness (2/3) Arabic: The word فقد (fqd) can be segmented and vowelised in several ways: • Segmentation 1: f + qd = ‘faqad’ (conjunction ‘fa’ + particle ‘qad’) • Segmentation 2: fqd = ‘faqada’ (he has lost), active verb; ‘faqida’ (he was lost), passive verb; ‘fuqdi’ (the loss), noun; etc. <token>ف</token><token>قد</token> لعب في أكبر النوادي (He has played in the greatest clubs) In general, segmentation does not have to resolve the ambiguities, but it can at least reduce the number of possible pronunciations.
Examples of tokenisation usefulness (3/3) English: <token>with</token><token>or</token>without your help Polish: <token>z</token><token>lub</token>bez twojej pomocy
Question: How should tokenisation markup affect the rendering of languages that use spaces for word demarcation?
Word (lexical) stress • Czech – almost exception-free first-syllable stress • Slovak – first-syllable stress, some exceptions • Polish – general rule of penultimate stress, many exceptional rules • Russian • „moving” stress • no accentuation rules • acute as a stress indicator: Unicode combining acute accent U+0301 (e.g. ы́ э́ ю́ я́ ) • In most texts, the acute is omitted
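The Russian convention above can be illustrated with a small sketch, assuming the stress mark is written as a combining acute (U+0301) directly after the stressed vowel. The helper name `split_stress` is hypothetical, and the word pair за́мок (castle) / замо́к (lock) is an added illustration, not taken from the slides.

```python
COMBINING_ACUTE = "\u0301"

def split_stress(word: str):
    """Return (word without stress marks, index of the stressed letter or None)."""
    plain = []
    stressed = None
    for ch in word:
        if ch == COMBINING_ACUTE:
            # The mark follows its base letter, so it refers to the last one seen.
            if plain:
                stressed = len(plain) - 1
        else:
            plain.append(ch)
    return "".join(plain), stressed

castle = split_stress("за\u0301мок")  # stress on the first vowel
lock = split_stress("замо\u0301к")    # stress on the second vowel
unmarked = split_stress("замок")      # typical text: acute omitted

print(castle, lock, unmarked)
```

Once the acute is omitted, both words reduce to the same string, which is why a TTS engine cannot recover the stress position from plain text alone.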
Word stress – Polish TTS engines In the three commercial TTS systems, custom stress can be indicated in a non-standard way, by annotating the accented vowel/syllable with special character(s), e.g. gramatyka (grammar), irregular stress on the third syllable from the end: RealSpeak, Sayso: grama’tyka Ivona: gram~!atyka (None of the TTS engines supports the IPA alphabet)
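A converter from an engine-neutral stress position to the two annotation styles might look like the sketch below. It generalises from the single example on the slide, so the placement rules are assumptions: the apostrophe style is taken to put its mark after the stressed vowel, and the tilde style to put `~!` before it. The function and style names are hypothetical, and the apostrophe is reproduced as the character shown on the slide; the engines themselves may expect a plain ASCII apostrophe.

```python
APOSTROPHE_MARK = "\u2019"  # character as it appears on the slide (assumption)
TILDE_MARK = "~!"

def annotate_stress(word: str, vowel_index: int, style: str) -> str:
    """Insert an engine-specific stress mark around the stressed vowel."""
    if not 0 <= vowel_index < len(word):
        raise ValueError("vowel_index is outside the word")
    if style == "apostrophe":  # RealSpeak / Sayso style: mark after the vowel
        cut = vowel_index + 1
        return word[:cut] + APOSTROPHE_MARK + word[cut:]
    if style == "tilde":  # Ivona style: mark before the vowel
        return word[:vowel_index] + TILDE_MARK + word[vowel_index:]
    raise ValueError(f"unknown style: {style}")

# gramatyka: the stressed vowel is the second "a", at index 4.
realspeak_form = annotate_stress("gramatyka", 4, "apostrophe")
ivona_form = annotate_stress("gramatyka", 4, "tilde")

print(realspeak_form, ivona_form)
```

Keeping the stress position as a plain index and generating the engine-specific spelling at the last step is one way to avoid storing three incompatible annotations of the same word.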
Question: How can custom word (lexical) stress be represented in SSML? • By a dedicated tag? <stress primary="~" secondary="*">gram~atyka</stress> • By using a special „phonetic” alphabet within the <phoneme> tag? • Other proposals?
Dziękujemy Thank you & let’s discuss
Prepared by:
Name: Przemyslaw Zdroik • Division: Vocal Services Section • Department: TP S.A. Research and Development Centre • Phone: (+48) 22 699 56 06 • E-mail: Przemyslaw.Zdroik@telekomunikacja.pl
Name: Krzysztof Majewski • Division: Vocal Services Section • Department: TP S.A. Research and Development Centre • Phone: (+48) 22 699 55 64 • E-mail: Krzysztof.Majewski@telekomunikacja.pl