SSML Extensions for TTS in Indian Languages. II Workshop on Internationalizing SSML, 30-31 May 2006, Greece. Nixon Patel and Kishore Prahallad. Bhrigus Inc., Hyderabad, India; IIIT Hyderabad, India.
Topics • About Bhrigus • Collaborative Efforts between Bhrigus and IIIT Hyderabad • Nature of Indian language scripts – convergence and divergence • Issues across TTS rendering in all these languages • Proposed solutions/tags: • Syllable Element • Alien Element • Dialect Element
Bhrigus voice & data solutions http://www.bhrigus.com
About Bhrigus Established: 2002 Business: Providing IVR, speech and enterprise solutions to BFSI, telcos, contact centers and manufacturing companies. Key Customers: Hewitt Associates, AT&T, Pfizer, Merrill Lynch, Union Pacific Railroad, CDIA, Southwestern Energy, Orange County, Stryker • SEI CMM Level 4 process implementation under way; ISO 9001:2000, KPMG certified.
Speech and Language Technology Lab @ Bhrigus • Playing a leadership role in the development of ASR and TTS for all official Indian languages, to provide voice solutions for the Indian market • Collaborations: IIIT Hyderabad and Carnegie Mellon University • 10-member team + board of advisors • 3 PhDs and 4 Masters • Synthesis team, Recognition team, Linguist team and Language resources team • Initiating SSML and VXML chapters in India
Collaborative Efforts • Bhrigus Inc. Hyderabad – Voice based solution providers • IIIT Hyderabad – one of the leading universities in India doing speech research • Telugu TTS – Collaborative Efforts between Bhrigus Inc. and IIIT • Goal: Develop ASR and TTS for all official Indian languages
Nature of Indian Language (IL) Scripts • Basic units of the writing system are Aksharas • An Akshara is an orthographic representation of a speech sound • Akshara is syllabic in nature, typical forms are V, CV, CCV and CCCV (C – consonant, V – vowel) • Always ends with a vowel (or nasalized vowel) in written form • ~1652 dialects/native languages • 22 languages officially recognized
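The Akshara shapes listed above (V, CV, CCV, CCCV) can be checked mechanically once a word is split into phones. The following is a minimal sketch, assuming a romanized phone inventory; the `VOWELS` set and both function names are illustrative assumptions, not taken from the slides.

```python
# Hypothetical vowel inventory in a romanized scheme (an assumption for
# illustration; the slides do not list the actual symbol set).
VOWELS = {"a", "aa", "i", "ii", "u", "uu", "e", "ee", "o", "oo", "ai", "au"}

# The typical Akshara forms named on the slide.
TYPICAL_SHAPES = {"V", "CV", "CCV", "CCCV"}


def akshara_shape(phones):
    """Map each phone to 'V' (vowel) or 'C' (anything else) and join them."""
    return "".join("V" if p in VOWELS else "C" for p in phones)


def is_typical_akshara(phones):
    """True when the phone sequence has one of the shapes V, CV, CCV, CCCV."""
    return akshara_shape(phones) in TYPICAL_SHAPES


if __name__ == "__main__":
    print(akshara_shape(["n", "aa"]))                 # shape of "naa"
    print(is_typical_akshara(["s", "t", "r", "ii"]))  # a CCCV candidate
    print(is_typical_akshara(["n"]))                  # no final vowel
```

Treating every non-vowel as a consonant keeps the sketch short; a real front end would validate consonants against its own inventory.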
Convergence of IL Scripts • Aksharas are syllabic in nature • Common phonetic base • Share a common set of speech sounds across all languages • Fairly good (though not exact) correspondence between sequence of Aksharas and the corresponding sequence of sounds • Often referred to as Letter-to-sound rules • Written from left-to-right as in European languages • Words are separated by space as in European languages
Divergence of IL Scripts • Each IL has its own script • All ILs share a common phonetic base; however, the phonotactics of each IL differ from those of the others • ILs are non-tonal languages, unlike eastern languages such as Chinese
How to represent Indian language Scripts • Unicode • Useful for *rendering* the Indian language scripts • Not suitable for keying in through a QWERTY keyboard • Not suitable for building modules such as text normalization (the Unicode characters cannot be seen in many editors) • Itrans-3 / OM - A transliteration scheme by IISc Bangalore, India and Carnegie Mellon University • Useful for *keying in and storing* the scripts of Indian languages using QWERTY keyboards • Useful for processing and for writing modules/rules for letter-to-sound, text normalization etc.
Why Itrans-3/OM? • Developed with user readability in mind: easier to read and type • It is case-insensitive. • The scheme is phonetic in nature: the characters correspond to the actual sounds being spoken. • Thus a single transliteration scheme is used for all the Indian languages, as they share the same set of sounds. • Each character (corresponding to a phone/sound) is no more than three letters long. • Adopted across universities in India and abroad, and by some industrial labs such as Bhrigus Inc.
Issues in TTS rendering in IL • TTS should be able to pronounce words Akshara (syllable) by Akshara (syllable) • Languages are heavily influenced by English (alien) words • Alien words occur within sentences • Each language has its own dialects
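Because each symbol in the scheme is at most three letters long, a transliterated word can be split into phones with a greedy longest-match scan. The sketch below assumes a small, hypothetical symbol table (`SYMBOLS`); it is not the actual Itrans-3/OM inventory, and `tokenize` is an illustrative name.

```python
# Hypothetical subset of symbols (an assumption for illustration only;
# this is NOT the real Itrans-3/OM table).
SYMBOLS = {
    "a", "aa", "i", "ii", "u", "uu", "e", "ee", "o", "oo",
    "k", "kh", "g", "gh", "t", "th", "d", "dh", "n",
    "p", "ph", "b", "bh", "m", "y", "r", "l", "v", "s", "sh", "h",
}
MAX_LEN = 3  # the slide states that no symbol is longer than three letters


def tokenize(word):
    """Split a transliterated word into symbols, longest match first."""
    word = word.lower()  # the scheme is described as case-insensitive
    phones = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, then shorter ones.
        for size in range(min(MAX_LEN, len(word) - i), 0, -1):
            chunk = word[i:i + size]
            if chunk in SYMBOLS:
                phones.append(chunk)
                i += size
                break
        else:
            raise ValueError(f"no symbol matches at position {i} in {word!r}")
    return phones


if __name__ == "__main__":
    print(tokenize("naatoo"))
```

Greedy matching always prefers "aa" over "a" + "a"; whether that is the right choice for every symbol pair depends on the real table, which the slides do not show.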
SSML Tag: Phoneme Element <phoneme> • <phoneme alphabet="itrans-3" ph="n aa t oo"> naatoo </phoneme> • The ph attribute specifies the phoneme/phone string • Rendering “n” “aa” “t” “oo” individually does not make sense to native speakers of Indian languages • Sounds need to be rendered in terms of syllables
Syllable Element <syllable> • <syllable alphabet="itrans-3" syl="naa too"> naatoo </syllable> • Render “naa” and “too”, which are Aksharas (syllables)
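To illustrate how a phone string such as "n aa t oo" relates to the proposed `syl` value "naa too", here is a minimal sketch that closes an Akshara at every vowel and serializes the result. The vowel set, the handling of trailing consonants, and both function names are assumptions; the slides do not specify a syllabification algorithm.

```python
import xml.etree.ElementTree as ET

# Hypothetical vowel inventory (an assumption for illustration).
VOWELS = {"a", "aa", "i", "ii", "u", "uu", "e", "ee", "o", "oo", "ai", "au"}


def to_aksharas(phones):
    """Group phones into Aksharas, closing a group at every vowel."""
    aksharas, current = [], []
    for phone in phones:
        current.append(phone)
        if phone in VOWELS:
            aksharas.append("".join(current))
            current = []
    if current:
        # Assumption: leftover consonants join the previous Akshara,
        # or stand alone if the word contains no vowel at all.
        if aksharas:
            aksharas[-1] += "".join(current)
        else:
            aksharas.append("".join(current))
    return aksharas


def syllable_element(ph, text, alphabet="itrans-3"):
    """Build the proposed <syllable> element from a space-separated phone string."""
    syl = " ".join(to_aksharas(ph.split()))
    elem = ET.Element("syllable", {"alphabet": alphabet, "syl": syl})
    elem.text = text
    return ET.tostring(elem, encoding="unicode")


if __name__ == "__main__":
    print(syllable_element("n aa t oo", "naatoo"))
```

Using `xml.etree.ElementTree` for serialization guarantees a matching close tag and correctly escaped attribute values.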
Motivation for Loan Word <alien> • Informal experiments suggested that 33% of TTS errors in ILs occur while rendering alien (non-native) words • Such alien words could be detected automatically from the syllabic properties of Indian languages
Example of loan word • BANK has to be pronounced as /B/ /AE/ /N/ /K/ • The /AE/ phoneme does not exist in the Indian language phone set • <alien> baank </alien> • Alien (non-native) words could be rendered using different pronunciation dictionaries or letter-to-sound rules
Dialect Element <dialect> • Each language has its own dialects • TTS should be able to handle dialects without unloading the language resources
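The slides do not say how alien words would be detected. One possible heuristic, sketched below purely as an assumption, flags a word when it contains a phone outside the native inventory or when it does not end in a vowel (native written forms are described earlier as always ending in a vowel). The inventories and the name `looks_alien` are hypothetical.

```python
# Hypothetical native inventories (assumptions for illustration only).
NATIVE_VOWELS = {"a", "aa", "i", "ii", "u", "uu", "e", "ee", "o", "oo", "ai", "au"}
NATIVE_CONSONANTS = {
    "k", "kh", "g", "gh", "t", "th", "d", "dh", "n",
    "p", "ph", "b", "bh", "m", "y", "r", "l", "v", "s", "sh", "h",
}


def looks_alien(phones):
    """Heuristic check: True when a phone list is unlikely to be a native word."""
    if not phones:
        return False
    # Rule 1: any phone outside the native inventory (e.g. "ae").
    for phone in phones:
        if phone not in NATIVE_VOWELS and phone not in NATIVE_CONSONANTS:
            return True
    # Rule 2: the written form of a native word ends in a vowel.
    return phones[-1] not in NATIVE_VOWELS


if __name__ == "__main__":
    print(looks_alien(["b", "aa", "n", "k"]))   # "baank" ends in a consonant
    print(looks_alien(["n", "aa", "t", "oo"]))  # "naatoo" ends in a vowel
```

A heuristic like this will misclassify loan words that happen to fit native syllable structure, so it would only be a first-pass filter in front of an explicit `<alien>` tag.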
Dialect Element <dialect> • <?xml version="1.0"?><speak version="1.0" xml:lang="tel-in"> • <voice gender="female"> • <dialect name="andhra"> yekkad'iki vel'laali </dialect> • <dialect name="telengana" pro="yaad'iki poovaale"> yekkad'iki vel'laali </dialect> • </voice></speak>
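A processor consuming this markup might resolve each `<dialect>` element to the string it should speak. The sketch below assumes that `pro`, when present, overrides the element text as the dialect-specific pronunciation; that reading is inferred from the example rather than stated on the slides, and `dialect_renderings` is an illustrative name.

```python
import xml.etree.ElementTree as ET

# The slide's example, written with plain ASCII quotes.
SSML = """<speak version="1.0" xml:lang="tel-in">
  <voice gender="female">
    <dialect name="andhra">yekkad'iki vel'laali</dialect>
    <dialect name="telengana" pro="yaad'iki poovaale">yekkad'iki vel'laali</dialect>
  </voice>
</speak>"""


def dialect_renderings(ssml):
    """Return (dialect name, text to speak) for every <dialect> element."""
    root = ET.fromstring(ssml)
    renderings = []
    for elem in root.iter("dialect"):
        text = (elem.text or "").strip()
        # Assumption: 'pro' replaces the written text when it is present.
        renderings.append((elem.get("name"), elem.get("pro", text)))
    return renderings


if __name__ == "__main__":
    for name, spoken in dialect_renderings(SSML):
        print(name, "->", spoken)
```

Because both dialects are resolved from the same parsed document, switching dialect needs no reload of language resources, which matches the stated goal of the element.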
Conclusions • Bhrigus Inc., Hyderabad is taking a lead position in developing ASR and TTS for Indian languages • Proposed <syllable>, <alien> and <dialect> elements as SSML extensions
References • Prahallad Lavanya, Prahallad Kishore and GanapathiRaju Madhavi, A Simple Approach for Building Transliteration Editors for Indian Languages, Journal of Zhejiang University Science, vol. 6A, no. 11, pp. 1354-1361, Oct 2005.