Speech Technology

Speaker: Prof. B. B. Chaudhuri, Indian Statistical Institute, Kolkata (bbc@isical.ac.in). Indo-German Workshop on Language Technologies, AU-KBC Research Centre, Chennai

Speech Group, Language Technologies Research Centre, International Institute of Information Technology, Hyderabad (sangal@iiit.net). The Speech group at the Language Technologies Research Centre (LTRC) focuses on building Text to Speech (TTS) and Automated Speech Recognition (ASR) systems for Indian languages. We have built very natural-sounding speech synthesis systems for Hindi and Telugu. A number of applications have been built on the core engine, such as travel aid software on a Simputer and Reading Aid software for the Visually Impaired (RAVI). Work is in progress on TTS engines for other Indian languages. Some of this work is described below. Research in speech recognition was initiated recently in collaboration with Carnegie Mellon University.

Language Engineering Research at Resource Centre for Indian Language Technology Solutions, University of Hyderabad. Dr. K. Narayana Murthy, Department of Computer and Information Sciences, University of Hyderabad, Hyderabad - 500 046 (knmuh@yahoo.com)

So far • OCR for Telugu and other Indian Languages • Experimental Text-to-Speech System for Telugu • A Variety of tools

Planned • Speech Technologies • Speech Recognition • Text-to-Speech • Long Term Vision: Speech-to-Speech Translation between English and ILs

English – Telugu – English Speech to Speech Translation (diagram) • English Speech → English ASR → English Text → English-Telugu MT → Telugu Text → Telugu TTS → Telugu Speech • Telugu Speech → Telugu ASR → Telugu Text → Telugu-English MT → English Text → English TTS → English Speech
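The chain of stages in this diagram (ASR, then MT, then TTS) can be sketched as a simple composition of three functions. This is a minimal illustration only: the class name, the stage signatures, and the toy stand-in functions are all assumptions made for the example and do not describe the group's actual system.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures; none of these names come from the slides.
Recognizer = Callable[[bytes], str]   # speech audio -> text (ASR)
Translator = Callable[[str], str]     # source text -> target text (MT)
Synthesizer = Callable[[str], bytes]  # text -> speech audio (TTS)


@dataclass
class SpeechToSpeech:
    """One direction of the pipeline, e.g. English speech to Telugu speech."""
    asr: Recognizer
    mt: Translator
    tts: Synthesizer

    def translate(self, audio: bytes) -> bytes:
        text = self.asr(audio)        # source speech -> source text
        translated = self.mt(text)    # source text -> target text
        return self.tts(translated)   # target text -> target speech


# Toy stand-ins so the sketch runs: "audio" is just UTF-8 bytes of the text.
def toy_english_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def toy_english_telugu_mt(text: str) -> str:
    lexicon = {"hello": "namaskaram"}  # illustrative one-word lexicon
    return " ".join(lexicon.get(word, word) for word in text.lower().split())


def toy_telugu_tts(text: str) -> bytes:
    return text.encode("utf-8")


english_to_telugu = SpeechToSpeech(
    asr=toy_english_asr, mt=toy_english_telugu_mt, tts=toy_telugu_tts
)
```

The reverse direction (Telugu to English) would be a second `SpeechToSpeech` instance built from the Telugu ASR, Telugu-English MT, and English TTS stages.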

What we have Developed: Telugu • Telugu Corpus (10 Million words) • English-Telugu, Telugu-Hindi dictionaries • Telugu Morphological Analyzer • OCR System for Telugu • Telugu Spell Checker • Telugu TTS systems • Electronic versions of several dictionaries

Other Contributions • AKSHARA – Advanced Multi-Lingual Text Processor • VIDYA – Web Based Education system • History-Society-Culture Portal • On-Line Searchable Directory • Language Technology Tool Kits www.LanguageTechnologies.ac.in

Efforts in Language & Speech Technology Natural Language Processing Lab Centre for Development of Advanced Computing (Ministry of Communications & Information Technology) ‘Anusandhan Bhawan’, C 56/1 Sector 62, Noida – 201 307, India karunesharora@cdacnoida.com

Annotated Speech Corpora for Hindi, Punjabi and Marathi languages • Vishleshika Statistical Analysis Tool • Gyan Nidhi Corpus • Phonetically Rich sentence set • Manual Verification and Editing • Studio Recording by Professionals • XML Meta Data Creation • Segmentation and labeling using Praat / Emulabel

Modules under TTS (Module: Description)
• TTS Shell: A multi-threaded interface that calls the different TTS modules and returns messages that the user can process to generate different events.
• Voice Builder: A utility that helps in building the syllable database. It reduces space utilization and helps in performing fast search.
• Query Tool for Voice Builder: A tool for reading the voice file and retrieving information about a “UNIT” from the file, i.e. its wave data.
• Text Parser: Breaks the normalized text into logical units such as sentences, words and syllables.
• Prosody Matching & Syllable Concatenation: The “PSOLA” technique is used for smooth joining of speech samples.
• Synthesizer: Writes wave data directly to a sound card or wave file.
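The Text Parser step can be illustrated with a small sketch that breaks normalized text into sentences and words. The function names and the punctuation set are assumptions made for this example; syllable splitting depends on the script and language, so it is deliberately left out.

```python
import re
from typing import List

# Sentence-final marks assumed for this sketch: ., ?, ! and the danda (U+0964).
_SENTENCE_END = r"(?<=[.?!\u0964])\s+"
_PUNCTUATION = ".,;:?!\u0964\"'()"


def split_sentences(text: str) -> List[str]:
    """Split normalized text at whitespace that follows a sentence-final mark."""
    parts = re.split(_SENTENCE_END, text.strip())
    return [part for part in parts if part]


def split_words(sentence: str) -> List[str]:
    """Split a sentence on whitespace and strip surrounding punctuation."""
    words = (token.strip(_PUNCTUATION) for token in sentence.split())
    return [word for word in words if word]


def parse(text: str) -> List[List[str]]:
    """Return one list of words per sentence."""
    return [split_words(sentence) for sentence in split_sentences(text)]
```

For example, `parse("The train is late. Where is platform two?")` yields two word lists, one per sentence.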

Other Areas of expertise • OCR for Devanagari Script • Digital Library for Indian languages • Word Processing tools like Spell Checker, Transliteration, Terminology Development, Document analysis, Font converters • Indian Language eContent Creation

Areas for future work • Speech Technology • Speech to Speech Translation System • Development of Semi-automated speech annotation tools

Utkal University: We Work On • Image Processing • Speech Processing • Knowledge Management

Image Processing • (A) Optical Character Recognition (OCR): • Converts scanned Oriya content to text • An OCR with TTS: DIVYADRUSTI • Users: Press, Media & Educational Institutes • Operates in command mode, so it is useful for illiterate and visually challenged users • IPR obtained; tested by SQTC, ETDC Bangalore • (B) English Reader System: • Uses an English OCR system • Integrated with the Microsoft Text To Speech Engine • and a Speech To Text Engine, to operate in command mode

Speech Processing • (A) Text To Speech (TTS) System: • The speech synthesizer for the Oriya language is designed using a character-based concatenation technique • The transition between two characters is stored, with the help of Paninian philology, to give a natural shape to the output • In addition to Oriya, we are developing a TTS system for Hindi using syllable-based concatenation • IPR obtained; tested by SQTC, ETDC Bangalore • (B) Speech To Text (STT) System: • Recognises words; designed through a training process on phones, diphones and triphones • Telephone Directory system based on the Oriya character recognition system • Applied for IPR
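The character-based concatenation idea in (A) can be sketched as follows. The slide does not say how the stored transitions are used, so this example assumes that a transition segment, when one exists for a pair of adjacent units, is inserted between them. The function name, the data layout, and the romanized unit labels are hypothetical.

```python
from typing import Dict, List, Tuple

Samples = List[float]  # a waveform segment as a plain list of sample values


def synthesize(
    units: List[str],
    unit_db: Dict[str, Samples],
    transition_db: Dict[Tuple[str, str], Samples],
) -> Samples:
    """Concatenate stored unit waveforms, inserting a stored transition
    between adjacent units when one exists (assumed behaviour)."""
    output: Samples = []
    for index, unit in enumerate(units):
        if index > 0:
            previous = units[index - 1]
            # No stored transition for this pair: join the units directly.
            output.extend(transition_db.get((previous, unit), []))
        output.extend(unit_db[unit])
    return output


# Toy inventory with made-up sample values.
unit_db = {"ka": [0.1, 0.2], "la": [0.3]}
transition_db = {("ka", "la"): [0.25]}
```

With this toy inventory, `synthesize(["ka", "la"], unit_db, transition_db)` places the stored transition segment between the two unit waveforms, while a pair with no stored transition is joined directly.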

Present Interest • Handwriting Recognition of old scripts using Neural Networks • OCR for Brahmi Script • Automatic Speech Recognition System • Speaker Recognition and Accent Analysis using HMM • TTS for other Indian languages, to build a Reader System

Research Activities, Department of Computer Science & Engineering, College of Engineering, Guindy, Chennai – 600025. Participant: Dr. T. V. Geetha. Other members: Dr. Ranjani Parthasarathi, Ms. D. Manjula, Mr. S. Swamynathan

Natural Language Processing, Speech Processing & Knowledge Representation: Work done in the area • Text to Speech Engine • A preliminary version is available. Efforts are on to improve the quality of the speech produced using diphones • Indian Logic for Conceptual Ontology • Ontology based on Nyaya shastra, an Indian Logic System • The notion of associating qualities and values with the definition of concepts, and adding different kinds of negation, brings a new perspective to the interpretation of knowledge • Conceptual Model, Extended Description Logic, Evaluation Model • A knowledge representation system called KRIL has been implemented

Natural Language Processing, Speech Processing & Summarization: Possible Areas of cooperation • Global phonemes neural based approach to Multi-language Speech analysis - Recurrent neural network approach • Concatenation based Text-to-Speech - diphones • Semantic Fragment Extraction based Text Summarization

Natural Language Processing, Translation Support Systems - Possible Areas of cooperation • Tamil Sentence generator • Incorporation of grammatical structures to facilitate sentence formation • Design of a format to be given as input to sentence generator • Generation of complex sentences • Tamil Parser and Semantic Analyzer • Tackling of complex grammatical structures • Case based semantic analysis of simple sentences • Tackling of ambiguous and incorrect sentences by the parser
