FPMS. December 19, 2005. Acapela’s corporate profile.
FPMS December 19, 2005
Group Background
• Babel Technologies
  > Created in 1995 in Mons (Belgium)
  > Spin-off of Mons Polytechnical University
  > In-house TTS & ASR technologies
  > TTS and ASR leader in embedded environments
• Infovox
  > Created in 1983 in Stockholm (Sweden)
  > Spin-off of KTH (Royal Institute of Technology)
  > Integrated into Telia Promotor in 1993
  > Acquired by Babel Technologies in 2001
  > TTS leader in the Nordic countries, Germany and the Netherlands
  > Accessibility and Telecom expertise
• Elan Speech
  > Created in 1980 in Toulouse (France)
  > Focused on TTS since 1996
  > Launch of in-house high-quality TTS in 2002 (Elan Sayso)
  > TTS leader in Telecom and Automotive
Acapela’s locations
• 3 sites: Stockholm (Sweden), Mons (Belgium), Toulouse (France)
• 50 people
• International team
• Local support in each site
• Merged organization
Acapela’s multilingual offer ASR & TTS components in 23 languages
Technologies (TTS)
• Architecture
  > Text Preprocessor: set of rules
  > Tagger: dictionary based
  > Phonetizer: phonetic tree + dictionary
  > Prosody: prosodic patterns
  > Synthesizer: database (voice)
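The architecture above is a chain of stages, each consuming the output of the previous one. A minimal sketch of that data flow follows. Every stage body here is a placeholder invented for illustration (letters stand in for phonemes, and the "synthesizer" returns a duration instead of audio); none of it reflects Acapela's actual modules.

```python
from typing import List, Tuple

def preprocess(text: str) -> List[str]:
    # Stand-in for the rule-based text preprocessor: lowercase and split.
    return text.lower().split()

def tag(words: List[str]) -> List[Tuple[str, str]]:
    # Stand-in for the optional tagger: every word gets a dummy tag.
    return [(word, "UNK") for word in words]

def phonetize(tagged: List[Tuple[str, str]]) -> List[str]:
    # Stand-in for the phonetizer: one "phoneme" per letter.
    return [letter for word, _ in tagged for letter in word]

def add_prosody(phonemes: List[str]) -> List[Tuple[str, int, int]]:
    # Stand-in for the prosodic module: fixed 80 ms duration, 120 Hz pitch.
    return [(p, 80, 120) for p in phonemes]

def synthesize(units: List[Tuple[str, int, int]]) -> int:
    # Stand-in for the synthesizer: total duration in ms, not audio samples.
    return sum(duration for _, duration, _ in units)

# Stage order follows the slide: preprocessor, tagger, phonetizer, prosody, synthesizer.
TTS_STAGES = [preprocess, tag, phonetize, add_prosody, synthesize]

def run_tts(text: str):
    data = text
    for stage in TTS_STAGES:
        data = stage(data)
    return data
```

Because each stage only depends on the previous stage's output, a stage such as the tagger can be swapped out or skipped per language without touching the others.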
Text Preprocessor
Pipeline: Text Prepro. → Tagger → Phonetizer → Prosody → Speech Synth
> Function
  • Generation of standard text
> Examples
  • Numbers: 100 → one hundred
  • Currencies: $20 → twenty dollars
  • Abbreviations: tel. → telephone
> Implementation
  • Rules are defined in a standard format (BNF format)
> Size of data
  • 20 Kbytes
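The three examples on this slide can be reproduced with a small rule table. The sketch below uses regular expressions rather than the BNF rule format mentioned on the slide, handles only numbers from 0 to 999, and uses names (`normalize`, `ABBREVIATIONS`) that are invented for illustration.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]

# Hypothetical abbreviation table; a real system would hold many more entries.
ABBREVIATIONS = {"tel.": "telephone"}

def number_to_words(n: int) -> str:
    """Spell out an integer between 0 and 999 in English."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else TENS[tens] + " " + ONES[ones]
    hundreds, rest = divmod(n, 100)
    head = ONES[hundreds] + " hundred"
    return head if rest == 0 else head + " " + number_to_words(rest)

def _currency(match: "re.Match") -> str:
    amount = int(match.group(1))
    unit = "dollar" if amount == 1 else "dollars"
    return number_to_words(amount) + " " + unit

def normalize(text: str) -> str:
    # Currency rule runs first so "$20" is not consumed by the number rule.
    text = re.sub(r"\$(\d{1,3})\b", _currency, text)
    # Plain numbers of up to three digits.
    text = re.sub(r"\b\d{1,3}\b", lambda m: number_to_words(int(m.group(0))), text)
    # Abbreviations, matched only at the start of a word.
    for abbr, full in ABBREVIATIONS.items():
        text = re.sub(r"\b" + re.escape(abbr), full, text)
    return text
```

Rule order matters here: applying the number rule before the currency rule would leave a stray "$" in the output.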
Tagger (optional)
> Function
  • Generation of the grammatical function of each word
  • Optional: not necessary for all languages
> Examples
  • "To read" vs. "I have read" (same spelling, different pronunciation)
  • "Les poules du couvent couvent" ("The convent's hens are brooding": "couvent" is first a noun, then a verb)
> Implementation
  • Dictionary based + set of rules
> Size of data
  • 0 to 20 Kbytes
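The "read" example shows why a tagger is needed: the pronunciation depends on the grammatical role. A minimal sketch of the "dictionary + rules" idea follows, assuming a single hand-written rule (a preceding auxiliary selects the past participle). The tag names and the SAMPA-style transcriptions are illustrative assumptions, not Acapela's data.

```python
from typing import List

AUXILIARIES = {"have", "has", "had"}

# Dictionary of ambiguous words: one transcription per grammatical tag.
AMBIGUOUS = {
    "read": {"VERB_BASE": "r i: d", "VERB_PAST_PART": "r E d"},
}

def tag(words: List[str]) -> List[str]:
    """Assign one tag per word; only dictionary words get a real decision."""
    tags = []
    for i, word in enumerate(words):
        if word.lower() in AMBIGUOUS:
            previous = words[i - 1].lower() if i > 0 else ""
            # Rule: after an auxiliary, the verb is a past participle.
            tags.append("VERB_PAST_PART" if previous in AUXILIARIES else "VERB_BASE")
        else:
            tags.append("OTHER")
    return tags

def pronounce_ambiguous(sentence: str) -> List[str]:
    """Return the transcription chosen for each ambiguous word in the sentence."""
    words = sentence.split()
    return [AMBIGUOUS[w.lower()][t]
            for w, t in zip(words, tag(words))
            if w.lower() in AMBIGUOUS]
```

For example, `pronounce_ambiguous("to read")` and `pronounce_ambiguous("I have read")` select different transcriptions for the same spelling.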
Phonetizer
> Function
  • Generation of a phonetic transcription for each word
> Examples
  • Babel: b a b E l
> Implementation
  • Decision tree + exception dictionary
> Size of data (language dependent)
  • 5 to 350 Kbytes
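The "decision tree + exception dictionary" scheme can be sketched as a lookup followed by a fallback. In the toy version below the tree is hand-written and looks only at a letter and its right neighbour, whereas a production tree would typically be trained on a pronunciation lexicon. The exception entry and the letter rules are illustrative assumptions.

```python
from typing import Dict, List

# Exception dictionary: words the letter rules would get wrong.
# The entry below is a hypothetical illustration.
EXCEPTIONS: Dict[str, List[str]] = {"one": ["w", "V", "n"]}

VOWELS = set("aeiouy")

def letter_to_phoneme(word: str, i: int) -> str:
    """Toy decision tree on the current letter and the letter that follows."""
    letter = word[i]
    nxt = word[i + 1] if i + 1 < len(word) else ""
    if letter == "c":
        return "s" if nxt in ("e", "i", "y") else "k"
    if letter == "e":
        # Open "E" before a consonant, plain "e" otherwise.
        return "E" if nxt and nxt not in VOWELS else "e"
    return letter

def phonetize(word: str) -> str:
    word = word.lower()
    if word in EXCEPTIONS:
        phonemes = EXCEPTIONS[word]          # dictionary wins over the tree
    else:
        phonemes = [letter_to_phoneme(word, i) for i in range(len(word))]
    return " ".join(phonemes)
```

The split explains the wide size range quoted on the slide: languages with regular spelling need few exceptions, while irregular ones need a large dictionary.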
Prosodic module
> Function
  • Generation of intonation:
    • Phoneme duration
    • Pitch markers
> Examples
  • See MBROLI application
> Implementation
  • Prosodic patterns extracted from speech corpus
> Size of data (language dependent)
  • 30 to 300 Kbytes
Synthesizer
> Function
  • Generation of speech samples from phoneme sequence + intonation
> Implementation: 3 technologies
  • Formant-based = rules
  • Diphone concatenation
  • Unit Selection
> Size of data: depends on
  • Technology
  • Sampling frequency
  • Compression rate
  • From 50 Kbytes to 50 Mb
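For the two concatenative technologies, the dependence of data size on sampling frequency and compression rate is simple arithmetic over the amount of stored speech. The sketch below assumes uncompressed PCM as the baseline. The function name and the figures in the usage note are hypothetical, not Acapela's numbers.

```python
def voice_footprint_bytes(speech_seconds: float,
                          sample_rate_hz: int,
                          bytes_per_sample: int,
                          compression_ratio: float) -> float:
    """Estimate the storage needed for a recorded voice database.

    speech_seconds    amount of recorded speech kept in the database
    sample_rate_hz    sampling frequency (e.g. 8000, 16000, 22050)
    bytes_per_sample  2 for 16-bit PCM
    compression_ratio raw size divided by stored size (1.0 = uncompressed)
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    raw_bytes = speech_seconds * sample_rate_hz * bytes_per_sample
    return raw_bytes / compression_ratio
```

As a hypothetical example, 600 seconds of 16 kHz, 16-bit speech compressed 4:1 gives `voice_footprint_bytes(600, 16000, 2, 4.0)`, that is 600 × 16000 × 2 / 4 = 4,800,000 bytes. Halving the sampling frequency halves the footprint, and a unit-selection database that keeps far more speech than a diphone inventory scales up accordingly.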
Technologies (ASR)
• Speech Recognition
• Hybrid models: Hidden Markov Models / Neural Networks
• Diagram blocks: acoustic analysis; neural network; discrimination; dynamic programming (decoder); HMM
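In a hybrid HMM/neural-network recognizer, a common arrangement is that the network outputs per-frame phoneme posteriors, these are divided by the phoneme priors to obtain scaled likelihoods, and a dynamic-programming (Viterbi) decoder finds the best state path through the HMM. The sketch below illustrates that general scheme with toy numbers standing in for the network output. It is an assumption about the usual hybrid approach, not a description of Babear's internals.

```python
import math
from typing import Dict, List, Tuple

def viterbi(posteriors: List[Dict[str, float]],
            priors: Dict[str, float],
            trans: Dict[Tuple[str, str], float],
            start: Dict[str, float]) -> List[str]:
    """Best HMM state path given per-frame network posteriors (all > 0)."""
    states = list(priors)

    def emit(t: int, s: str) -> float:
        # Scaled log-likelihood: log P(state | frame) - log P(state).
        return math.log(posteriors[t][s] / priors[s])

    score = {s: math.log(start[s]) + emit(0, s) if start.get(s, 0) > 0 else -math.inf
             for s in states}
    backpointers = []
    for t in range(1, len(posteriors)):
        new_score, pointers = {}, {}
        for s in states:
            best_prev, best = None, -math.inf
            for p in states:
                a = trans.get((p, s), 0.0)
                if a > 0 and score[p] > -math.inf:
                    candidate = score[p] + math.log(a)
                    if candidate > best:
                        best_prev, best = p, candidate
            pointers[s] = best_prev
            new_score[s] = best + emit(t, s) if best_prev is not None else -math.inf
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]

# Toy left-to-right model for the phoneme sequence d, i, s (hypothetical values).
PRIORS = {"d": 1 / 3, "i": 1 / 3, "s": 1 / 3}
START = {"d": 1.0}
TRANS = {("d", "d"): 0.5, ("d", "i"): 0.5,
         ("i", "i"): 0.5, ("i", "s"): 0.5,
         ("s", "s"): 1.0}
FRAMES = [
    {"d": 0.8, "i": 0.1, "s": 0.1},
    {"d": 0.7, "i": 0.2, "s": 0.1},
    {"d": 0.1, "i": 0.8, "s": 0.1},
    {"d": 0.1, "i": 0.7, "s": 0.2},
    {"d": 0.1, "i": 0.1, "s": 0.8},
]
```

Calling `viterbi(FRAMES, PRIORS, TRANS, START)` aligns the five frames to the three phoneme states, which is the frame-level discrimination step feeding the decoder.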
Recognition
• Vocabulary
• Phonetic transcription
  Ex: reconnaissance: R [@] k O n E s a~ s
• Consider all transcriptions!
  Ex: 10 (dix) = dis / diz / di
• Consider synonyms!
  Ex: Oui, ouais, ok, c’est cela, … (yes, yeah, ok, that's right)
  Ex: Télévision, TV, poste de télévision (television, TV, television set)
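Both recommendations on this slide amount to many-to-one mappings: several transcriptions map to one word, and several phrasings map to one meaning. A minimal sketch using the slide's own examples follows. The table and function names, and the concept labels `YES` and `TELEVISION`, are invented for illustration.

```python
from typing import Dict, List, Optional

# Every acceptable transcription of a word (from the slide: 10 = dis, diz, di).
PRONUNCIATIONS: Dict[str, List[str]] = {
    "dix": ["d i s", "d i z", "d i"],
}

# Every phrasing that should trigger the same action.
# Straight apostrophes are used in the keys for simplicity.
SYNONYMS: Dict[str, List[str]] = {
    "YES": ["oui", "ouais", "ok", "c'est cela"],
    "TELEVISION": ["télévision", "tv", "poste de télévision"],
}

# Inverted indexes built once, so lookups are a single dictionary access.
PHONES_TO_WORD = {phones: word
                  for word, variants in PRONUNCIATIONS.items()
                  for phones in variants}
PHRASE_TO_CONCEPT = {phrase: concept
                     for concept, phrases in SYNONYMS.items()
                     for phrase in phrases}

def word_from_phones(phones: str) -> Optional[str]:
    """Map a space-separated phoneme string to a vocabulary word, if any."""
    return PHONES_TO_WORD.get(" ".join(phones.split()))

def concept_from_text(text: str) -> Optional[str]:
    """Map a recognized phrase to its concept, ignoring case and extra spaces."""
    return PHRASE_TO_CONCEPT.get(" ".join(text.lower().split()))
```

Anything absent from both tables returns `None`, which corresponds to the out-of-vocabulary case the application has to handle separately.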
Recognition (continued): difficulties
• Noise
• Accents
• Hesitations
• Users
• Incorrect syntax
• Out-of-vocabulary words
Acapela’s Technologies Overview
3 Technologies
> High-Quality TTS: the pleasant and natural-sounding voice
  • Voice enabled by Sayso and BrightSpeech
  • Based on Unit Selection technology
> High-Density TTS: the right choice for high density and small footprints
  • Voice enabled by Tempo and Babil
  • Based on Diphone technology
> ASR: the robust speech recognizer
  • Voice enabled by Babear
  • Speaker-independent ASR based on Hidden Markov Models and Artificial Neural Networks
High Density TTS
Voice enabled by Tempo & Babil (two TTS technologies)
• Diphone-based concatenative TTS
• Advantages:
  • Small footprint (2 to 6 Mb)
  • Flexibility (pitch and speed adjustment, prosody copying)
  • High intelligibility
  • 21 languages supported
• Disadvantage:
  • Less natural sounding
• Markets/applications targeted:
  • Automotive & consumer electronics (low footprint)
  • High-density, short-ROI server-based TTS (telephony)
  • Multimedia software products
High Quality TTS
Voice enabled by Sayso & BrightSpeech
• Unit selection concatenative TTS
• Advantages:
  • Very high quality
  • Highly natural
  • Flexibility (pitch and speed adjustment, timbre alteration, whispering feature)
  • Support for custom voices (“SpeechBrand” Program)
• Disadvantage:
  • Larger footprint (16 to 70 Mb)
• Markets/applications targeted:
  • High-end telephony applications (voice portal, news)
  • New generation of navigation terminals
  • Public address
ASR
Voice enabled by Babear
• Hybrid technology of Hidden Markov Models and Artificial Neural Networks
• Advantages:
  • Very high accuracy in difficult contexts
  • High dialog flexibility
  • Lip-sync and language learning capabilities through phoneme-level discrimination
  • Speaker independent
  • Accurate voice activation for noisy environments
• Markets/applications targeted:
  • Industrial data collection: inventories, picking…
  • Automotive
  • Name dialing
  • Multimedia command & control / language learning
Acapela’s Markets: solutions for Telecom, Automotive, Accessibility, Mobility, Industry, Multimedia, Consumer Electronics.
Acapela’s Markets: leading 3 major and mature markets: Telecom, Automotive, Accessibility
Acapela’s main Markets
• Telecom: server-based vocalization of content for multiple users over the phone
  • For companies: unified messaging, auto attendant, CRM
  • For telcos: unified messaging, voice portal, SMS2Voice, directory and reverse directory
  • For contact centers: call automation, FAQ
Acapela’s main Markets
• Automotive: on-board and off-board speech solutions
  • On-board & off-board car navigation systems
  • Traffic information
  • PDA-based applications
  • Telematics
Acapela’s main Markets • Accessibility • Assistive technologies • Screen readers • Reading machines • Voice-controlled mobile phones
Acapela’s Markets: creating new speech market opportunities in
>> Mobility
  • Cell phones
  • Navigation on PDAs
Acapela’s Markets: creating new speech market opportunities in
>> Industry
  • Public address
  • Alarm & supervision
  • Warehousing, production line
Acapela’s Markets: creating new speech market opportunities in
>> Multimedia
  • Edutainment
  • Education
  • Language learning
  • E-learning
Acapela’s Markets: creating new speech market opportunities in
>> Consumer Electronics, …
  • Talking dictionary devices
  • Toys