
Nixon Patel and Kishore Prahallad Bhrigus Inc. Hyderabad, India IIIT Hyderabad, India

SSML Extensions for TTS in Indian Languages. II Workshop on Internationalizing SSML, 30-31 May 2006, Greece.


Presentation Transcript


  1. SSML Extensions for TTS in Indian Languages • II Workshop on Internationalizing SSML, 30-31 May 2006, Greece • Nixon Patel and Kishore Prahallad • Bhrigus Inc., Hyderabad, India • IIIT Hyderabad, India

  2. Topics • About Bhrigus • Collaborative Efforts between Bhrigus and IIIT Hyderabad • Nature of Indian language scripts – convergence and divergence • Issues across TTS rendering in all these languages • Proposed solutions/tags: • Syllable Element • Alien Element • Dialect Element

  3. Bhrigus voice & data solutions http://www.bhrigus.com

  4. About Bhrigus • Established: 2002 • Business: Providing IVR, speech and enterprise solutions to BFSI, telcos, contact centers and manufacturing companies • Key customers: Hewitt Associates, AT&T, Pfizer, Merrill Lynch, Union Pacific Railroad, CDIA, Southwestern Energy, Orange County, Stryker • SEI CMM Level 4 process implementation under way; ISO 9001:2000 – KPMG certified

  5. Speech and Language Technology Lab @ Bhrigus • Playing a leadership role in the development of ASR and TTS for all official Indian languages, to provide voice solutions for the Indian market • Collaborations: IIIT Hyderabad and Carnegie Mellon University • 10-member team + board of advisors • 3 PhDs and 4 Masters • Synthesis team, Recognition team, Linguist team and Language resources team • Initiating SSML and VXML chapters in India

  6. Collaborative Efforts • Bhrigus Inc., Hyderabad – voice-based solution provider • IIIT Hyderabad – one of the leading universities in India doing speech research • Telugu TTS – a collaborative effort between Bhrigus Inc. and IIIT Hyderabad • Goal: Develop ASR and TTS for all official Indian languages

  7. Nature of Indian Language (IL) Scripts • Basic units of the writing system are Aksharas • An Akshara is an orthographic representation of a speech sound • An Akshara is syllabic in nature; typical forms are V, CV, CCV and CCCV (C – consonant, V – vowel) • An Akshara always ends with a vowel (or nasalized vowel) in written form • ~1652 dialects/native languages • 22 languages officially recognized
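The V, CV, CCV and CCCV forms listed on this slide can be checked mechanically. The sketch below is only an illustration: the `VOWELS` set is a hypothetical, simplified romanized inventory (the actual Itrans-3 / OM table is not reproduced in this transcript), nasalized vowels are not modelled, and the function names are invented for the example.

```python
import re

# Hypothetical, simplified vowel inventory in a romanized notation.
# This is an assumption for illustration, not the real Itrans-3 / OM table.
VOWELS = {"a", "aa", "i", "ii", "u", "uu", "e", "ee", "o", "oo", "ai", "au"}

# Zero to three consonants followed by exactly one vowel: V, CV, CCV, CCCV.
AKSHARA_SHAPE = re.compile(r"C{0,3}V")


def akshara_shape(phones):
    """Map a list of phones to a C/V pattern, e.g. ['k', 'r', 'aa'] -> 'CCV'."""
    return "".join("V" if p in VOWELS else "C" for p in phones)


def is_typical_akshara(phones):
    """Return True if the phones form one of the shapes V, CV, CCV or CCCV."""
    return AKSHARA_SHAPE.fullmatch(akshara_shape(phones)) is not None
```

Under these assumptions, `["n", "aa"]` is accepted as CV, while `["n", "aa", "t"]` (CVC) is rejected because it does not end with a vowel.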

  8. Convergence of IL Scripts • Aksharas are syllabic in nature • Common phonetic base • Share a common set of speech sounds across all languages • Fairly good (though not exact) correspondence between sequence of Aksharas and the corresponding sequence of sounds • Often referred to as Letter-to-sound rules • Written from left-to-right as in European languages • Words are separated by space as in European languages

  9. Divergence of IL Scripts • Each IL has its own script • All ILs share a common phonetic base – however, the phonotactics of each IL differ from those of the others • ILs are non-tonal languages, unlike eastern languages such as Chinese

  10. How to represent Indian language Scripts • Unicode • Useful for *rendering* the Indian language scripts • Not suitable for keying in through a QWERTY keyboard • Not suitable for building modules such as text normalization (Unicode characters cannot be displayed in many editors) • Itrans-3 / OM – a transliteration scheme by IISc Bangalore, India and Carnegie Mellon University • Useful for *keying in and storing* the scripts of Indian languages using QWERTY keyboards • Useful for processing and for writing modules/rules for letter-to-sound, text normalization etc.

  11. Itrans-3 / OM Notation

  12. Why Itrans-3/OM? • Developed with user readability in mind – easier to read and type • It is case-insensitive • The scheme is phonetic in nature: the characters correspond to the actual sounds being spoken • Thus a single transliteration scheme is used for all the Indian languages, as they share the same set of sounds • Each character (corresponding to a phone/sound) is no more than three letters long • Adopted across universities in India and abroad, and by some industrial labs such as Bhrigus Inc.

  13. Issues in TTS rendering in IL • TTS should be able to pronounce words Akshara (syllable) by Akshara (syllable) • The languages are heavily influenced by English (alien) words • Alien words occur within sentences • Each language has its own dialects

  14. SSML Tag: Phoneme Element <phoneme> • <phoneme alphabet="itrans-3" ph="n aa t oo"> naatoo </phoneme> • The ph attribute specifies the phoneme/phone string • Rendering “n” “aa” “t” “oo” individually does not make sense to native speakers of Indian languages • Sounds need to be rendered in terms of syllables

  15. Syllable Element <syllable> • <syllable alphabet="itrans-3" syl="naa too"> naatoo </syllable> • Render “naa” and “too”, which are Aksharas (syllables)
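The relation between the `ph` string of `<phoneme>` and the `syl` string of the proposed `<syllable>` element can be sketched as a grouping step: consonants accumulate until a vowel closes the Akshara. This is a minimal sketch, not the authors' implementation. The `VOWELS` set is a hypothetical simplified inventory, and attaching trailing consonants to the last syllable is an assumption made here for loan-like words.

```python
# Hypothetical, simplified vowel inventory (an assumption for illustration,
# not the real Itrans-3 / OM table).
VOWELS = {"a", "aa", "i", "ii", "u", "uu", "e", "ee", "o", "oo", "ai", "au"}


def phones_to_syllables(ph):
    """Group a space-separated phone string into vowel-final syllables."""
    syllables = []
    pending = []  # phones collected since the last completed syllable
    for phone in ph.split():
        pending.append(phone)
        if phone in VOWELS:
            # A vowel closes the current Akshara.
            syllables.append("".join(pending))
            pending = []
    if pending:
        # Assumption: trailing consonants join the previous syllable,
        # or stand alone if the input contained no vowel at all.
        tail = "".join(pending)
        if syllables:
            syllables[-1] += tail
        else:
            syllables.append(tail)
    return " ".join(syllables)
```

With the slide's example, `phones_to_syllables("n aa t oo")` yields `"naa too"`, the value shown in the `syl` attribute.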

  16. Motivation for Loan Word <alien> • Informal experiments suggested that 33% of TTS errors in ILs occur while rendering alien (non-native) words • Such alien words could be detected automatically because of the syllabic properties of the Indian languages

  17. Example of loan word • BANK has to be pronounced as /B/ /AE/ /N/ /K/ • The /AE/ phoneme does not exist in the Indian language phone set • <alien> baank </alien> • Alien (non-native) words could be rendered using different pronunciation dictionaries or letter-to-sound rules
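One way to picture "different pronunciation dictionaries" for alien words is a front end that routes text inside `<alien>` to a separate lexicon. The sketch below uses Python's standard `xml.etree.ElementTree`. The two toy lexicons, their entries, the `<speak>` wrapper in the usage example and the function name are all hypothetical; `<alien>` is the element proposed in this talk, not part of standard SSML.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy lexicons, for illustration only.
NATIVE_LEXICON = {"naatoo": "n aa t oo"}
ALIEN_LEXICON = {"baank": "b ae n k"}  # uses /ae/, absent from the native set


def plan_pronunciations(ssml):
    """Return (word, source, phones) tuples, choosing a lexicon per word."""
    root = ET.fromstring(ssml)
    plan = []

    def add_words(text, lexicon, label):
        # Words missing from the chosen lexicon get None as their phones.
        for word in (text or "").split():
            plan.append((word, label, lexicon.get(word)))

    add_words(root.text, NATIVE_LEXICON, "native")
    for child in root:
        if child.tag == "alien":
            add_words(child.text, ALIEN_LEXICON, "alien")
        else:
            add_words(child.text, NATIVE_LEXICON, "native")
        # Text following the child element belongs to the native stream.
        add_words(child.tail, NATIVE_LEXICON, "native")
    return plan
```

For example, `plan_pronunciations('<speak>naatoo <alien>baank</alien></speak>')` looks up `naatoo` in the native lexicon and `baank` in the alien one. A real system could equally switch letter-to-sound rules instead of dictionaries, as the slide notes.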

  18. Dialect Element <dialect> • Each language has its own dialects • TTS should be able to handle dialects without unloading the language resources

  19. Dialect Element <dialect> • <?xml version="1.0"?><speak version="1.0" xml:lang="tel-in"> • <voice gender="female"> • <dialect name="andhra"> yekkad’iki vel’laali </dialect> • <dialect name="telengana" pro="yaad’iki poovaale"> yekkad’iki vel’laali </dialect> • </voice></speak>
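The dialect example above can be read as follows: when a `pro` attribute is present it supplies the dialect-specific rendering, otherwise the element text is spoken as written. That reading of `pro` is an assumption, since the slide does not define the attribute. The sketch parses the same markup with Python's standard `xml.etree.ElementTree`; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# The slide's example, with plain apostrophes in place of the typographic
# ones (an assumption about the intended transliteration).
SSML = """<?xml version="1.0"?>
<speak version="1.0" xml:lang="tel-in">
  <voice gender="female">
    <dialect name="andhra">yekkad'iki vel'laali</dialect>
    <dialect name="telengana" pro="yaad'iki poovaale">yekkad'iki vel'laali</dialect>
  </voice>
</speak>"""


def dialect_renderings(ssml):
    """Return (dialect name, text to speak) for each <dialect> element."""
    root = ET.fromstring(ssml)
    renderings = []
    for element in root.iter("dialect"):
        written = (element.text or "").strip()
        # Assumed semantics: 'pro' overrides the written form when present.
        spoken = element.get("pro", written)
        renderings.append((element.get("name"), spoken))
    return renderings
```

Here both elements carry the same written text, but the second one would be spoken using its `pro` value.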

  20. Conclusions • Bhrigus Inc., Hyderabad is taking a lead position in developing ASR and TTS for Indian languages • Proposed <syllable>, <alien> and <dialect> elements as SSML extensions

  21. References • Prahallad Lavanya, Prahallad Kishore and GanapathiRaju Madhavi, A Simple Approach for Building Transliteration Editors for Indian Languages, Journal of Zhejiang University Science, vol. 6A, no. 11, pp. 1354-1361, Oct 2005.
