
Przemysław Zdroik

This presentation discusses the challenges of handling diacritics, tokenisation, and custom word stress in TTS (Text-to-Speech) system development. It explores the usefulness of tokenisation and the impact of word stress on language rendering. Examples and considerations from the FT R&D position paper are presented.


Presentation Transcript

  1. 0x0141 0x0142 Przemysław Zdroik [ pshemýswuf zdróik ] IPA: pʃɛ.mɨ’.swaf zdrɔ’.ik [ pshemek ]

  2. TTS system development France Telecom owns Polish Telecom Speech technologies testing and evaluating „advanced user”

  3. FT R&D France / Poland Missing diacritics Tokenisation Custom word (lexical) stress Przemysław Zdroik, Paul Bagshaw 2nd Workshop on Internationalizing SSML, Crete, 31st May 2006

  4. Plan of the presentation • A few words about diacritics • Usefulness of tokenisation • Word stress • Discussion Regarding: FT R&D position paper

  5. Text stripped of diacritics • Occurs in SMS messages, e-mails, IM, newsgroup posts • Additional pronunciation ambiguities appear: French: cure => cure vs. curé Polish: maki => maki vs. mąki Czech, Slovak, Slovenian … Similar problems (incorrect TTS input): • Lack of accents in Russian texts • Informal „romanisations” used in SMS messages (Greek, Russian – volapuk encoding)
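The ambiguity described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny `LEXICON` and the helper names `strip_diacritics` and `candidates` are hypothetical, and the table of non-decomposing letters covers only the Polish ł as an example.

```python
import unicodedata

# Letters with no canonical decomposition, so NFD alone leaves them intact.
# Illustrative assumption: only the Polish "ł" is listed here.
NON_DECOMPOSING = str.maketrans({"ł": "l", "Ł": "L"})

def strip_diacritics(text):
    """Return text with combining marks removed, as often seen in SMS input."""
    decomposed = unicodedata.normalize("NFD", text)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.translate(NON_DECOMPOSING)

def candidates(stripped_word, lexicon):
    """List the lexicon entries that collapse to the given stripped form."""
    return sorted(w for w in lexicon if strip_diacritics(w) == stripped_word)

# Hypothetical mini-lexicon built from the slide's examples.
LEXICON = {"cure", "curé", "maki", "mąki"}

print(candidates("cure", LEXICON))
print(candidates("maki", LEXICON))
```

Each stripped form maps back to two lexicon entries, which is the extra ambiguity a TTS front end would have to resolve from context.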

  6. Question Should we allow incorrect text as input for TTS? In other words: should we expect a TTS system to try to correct input marked as incorrect (e.g. as „sms_content” or „email”)? IMHO: It is a „nice to have” feature of TTS/SSML 1.1, but not an essential one.

  7. Examples of tokenisation usefulness (1/3) French: • The word couple “bien que” may be either a locution (to be considered as a single word; POS=conjunction) or two separate words (bien POS=adverb; que POS=conjunction). Il continue <token>bien que</token> ça soit perdu. Il faut <token>bien</token> <token>que</token> jeunesse se passe. • The phrase “rendez-vous” may also be considered as one word (POS=noun) or two words (rendez POS=verb; vous POS=pronoun) <token>Rendez-vous</token> de la semaine dernière. <token>Rendez</token><token>-vous</token> au 1 avril?
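As a rough sketch of how a processor might read the French markup above, the following Python uses the standard `xml.etree.ElementTree` module to list the explicitly marked tokens. The helper name `explicit_tokens` and the wrapping `<s>` element are assumptions made for the illustration, not part of the position paper.

```python
import xml.etree.ElementTree as ET

def explicit_tokens(marked_up):
    """Return the text of every <token> element in a marked-up sentence."""
    # Wrap the fragment in a single root element so it parses as XML.
    root = ET.fromstring(f"<s>{marked_up}</s>")
    return [tok.text for tok in root.iter("token")]

# Locution: one token spanning a space.
one_word = "Il continue <token>bien que</token> ça soit perdu."
# Adverb + conjunction: two separate tokens.
two_words = "Il faut <token>bien</token> <token>que</token> jeunesse se passe."

print(explicit_tokens(one_word))
print(explicit_tokens(two_words))
```

The same surface string "bien que" yields one token in the first sentence and two in the second, so the markup, not the whitespace, carries the segmentation decision.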

  8. Examples of tokenisation usefulness (2/3) Arabic: The word فقد (fqd) can be segmented and vowelised in several ways: • Segmentation 1: f + qd = ‘faqad’ (conjunction ‘fa’ + particle ‘qad’) • Segmentation 2: fqd = ‘faqada’ (he has lost), active verb; ‘faqida’ (he was lost), passive verb; ‘fuqdi’ (the loss), noun; etc. <token>ف</token><token>قد</token> لعب في أكبر النوادي (He has played in the greatest clubs) Segmentation, in general, does not have to solve the ambiguities, but at least it can decrease the number of possible pronunciations.

  9. Examples of tokenisation usefulness (3/3) English: <token>with</token><token>or</token>without your help Polish: <token>z</token><token>lub</token>bez twojej pomocy

  10. Question How should tokenisation markup affect the rendering of languages that use spaces for word demarcation?

  11. Word (lexical) stress • Czech – almost exception-free first-syllable stress • Slovak – first-syllable stress, some exceptions • Polish – general rule of penultimate stress, many exceptional rules • Russian • „moving” stress • no accentuation rules • acute as a stress indicator: the Unicode combining acute accent U+0301 (e.g. ы́ э́ ю́ я́ ) • In most texts, the acute is omitted
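The Russian convention of marking stress with the combining acute can be illustrated with a short Python sketch. The helper name `split_stress` is hypothetical, and the example pair (the same letters with stress on different syllables) is chosen for the illustration rather than taken from the slides.

```python
COMBINING_ACUTE = "\u0301"  # Unicode combining acute accent

def split_stress(word):
    """Return (plain_word, index_of_stressed_letter or None).

    Assumes at most one combining acute, placed right after the
    stressed vowel, as in annotated Russian text.
    """
    pos = word.find(COMBINING_ACUTE)
    plain = word.replace(COMBINING_ACUTE, "")
    # The accent follows its base letter, so the stressed letter is at pos - 1.
    return plain, (pos - 1 if pos > 0 else None)

torment = "му\u0301ка"   # stress on the first syllable
flour = "мука\u0301"     # stress on the second syllable

print(split_stress(torment))
print(split_stress(flour))
print(split_stress("мука"))  # accent omitted, as in most texts
```

Once the acute is dropped, both words reduce to the same letter sequence and the stress position is no longer recoverable from the text alone.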

  12. Word stress – Polish TTS engines In the three commercial TTS systems, custom stress can be indicated in a non-standard way, by annotating the accented vowel/syllable with special character(s), e.g. gramatyka (grammar), irregular stress on the third syllable from the end: RealSpeak, Sayso: grama’tyka Ivona: gram~!atyka (None of these TTS systems supports the IPA alphabet)

  13. Question: How can custom word (lexical) stress be represented in SSML? • By a dedicated tag? <stress primary="~" secondary="*">gram~atyka</stress> • By using a special „phonetic” alphabet within the <phoneme> tag • Other proposals?
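For the second option, a minimal sketch of what a `<phoneme>` element with IPA stress could look like, built with Python's standard `xml.etree.ElementTree`. The helper name `phoneme` and the IPA transcription of "gramatyka" are assumptions made for this illustration; the `alphabet` and `ph` attributes are those defined for `<phoneme>` in SSML.

```python
import xml.etree.ElementTree as ET

def phoneme(text, ipa):
    """Build an SSML <phoneme> element carrying an IPA transcription."""
    el = ET.Element("phoneme", {"alphabet": "ipa", "ph": ipa})
    el.text = text
    # encoding="unicode" returns a str and keeps IPA characters unescaped.
    return ET.tostring(el, encoding="unicode")

# Assumed transcription: the primary stress mark (U+02C8) precedes
# the stressed syllable "ma", the third syllable from the end.
GRAMATYKA_IPA = "\u0261ra\u02c8mat\u0268ka"

print(phoneme("gramatyka", GRAMATYKA_IPA))
```

This route needs no new element, but it only helps if the engine accepts IPA, which the slide on Polish TTS engines reports is not the case for the three systems discussed.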

  14. Dziękujemy Thank you & let’s discuss

  15. Prepared by: • Name: Przemyslaw Zdroik; Division: Vocal Services Section; Department: TP S.A. Research and Development Centre; Phone: (+48) 22 699 56 06; E-mail: Przemyslaw.Zdroik@telekomunikacja.pl • Name: Krzysztof Majewski; Division: Vocal Services Section; Department: TP S.A. Research and Development Centre; Phone: (+48) 22 699 55 64; E-mail: Krzysztof.Majewski@telekomunikacja.pl
