December 19, 2005

FPMS December 19, 2005

Acapela’s corporate profile

Group Background
• Babel Technologies
  > Created in 1995 in Mons (Belgium)
  > Spin-off of Mons Polytechnical University
  > In-house TTS & ASR technologies
  > TTS and ASR leader in embedded environments
• Infovox
  > Created in 1983 in Stockholm (Sweden)
  > Spin-off of KTH (Royal Institute of Technology)
  > Integrated into Telia Promotor in 1993
  > Acquired by Babel Technologies in 2001
  > TTS leader in the Nordic countries, Germany and the Netherlands
  > Accessibility and Telecom expertise
• Elan Speech
  > Created in 1980 in Toulouse (France)
  > Focused on TTS since 1996
  > Launch of in-house high quality TTS in 2002 (Elan Sayso)
  > TTS leader in Telecom and Automotive

Acapela’s locations
• 3 sites: Stockholm (Sweden), Mons (Belgium), Toulouse (France)
• 50 people
• International team
• Local support at each site
• Merged organization

Acapela’s multilingual offer
• ASR & TTS components in 23 languages

Acapela’s technologies

Technologies (TTS)
• Architecture (module: resource it relies on)
  > Text Preprocessor: set of rules
  > Tagger: dictionary based
  > Phonetizer: phonetic tree + dictionary
  > Prosody: prosodic patterns
  > Synthesizer: database (voice)

Text Preprocessor
> Function
  • Generation of standard text
> Examples
  • Numbers: 100 → one hundred
  • Currencies: $20 → twenty dollars
  • Abbreviations: tel. → telephone
> Implementation
  • Rules are defined in a standard format (BNF format)
> Size of data
  • 20 Kbytes
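The three examples above can be sketched as rewrite rules. The slide states that the real rules are written in BNF format; this sketch swaps in Python regular expressions instead, and every name in it (`ABBREVIATIONS`, `spell_number`, `normalize`) is hypothetical. It only handles English integers from 0 to 999.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

# Hypothetical abbreviation table; a real system would load this from data.
ABBREVIATIONS = {"tel.": "telephone"}


def spell_number(n: int) -> str:
    """Spell out an integer in the range 0..999 in English."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only handles 0..999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" " + spell_number(rest) if rest else "")


def spell_dollars(match: re.Match) -> str:
    """Rewrite a "$<amount>" match as words, with singular/plural unit."""
    amount = int(match.group(1))
    unit = "dollar" if amount == 1 else "dollars"
    return spell_number(amount) + " " + unit


def normalize(text: str) -> str:
    """Rewrite currencies, numbers and abbreviations as plain words."""
    # Currencies first, so "$20" is not consumed by the plain-number rule.
    text = re.sub(r"\$(\d{1,3})\b", spell_dollars, text)
    # Plain numbers of up to three digits; longer ones are left untouched.
    text = re.sub(r"\b\d{1,3}\b", lambda m: spell_number(int(m.group())), text)
    # Abbreviations, matched only at the start of a word.
    for abbr, full in ABBREVIATIONS.items():
        text = re.sub(r"\b" + re.escape(abbr), full, text)
    return text
```

Rule order matters in this kind of design: the currency rule has to run before the generic number rule, otherwise the amount would be spelled out and the dollar sign left behind.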

Tagger (optional)
> Function
  • Generation of the grammatical function of each word
  • Optional: not necessary for all languages
> Examples
  • To read – I have read (same spelling, different pronunciation)
  • Les poules du couvent couvent (French: "the convent's hens are brooding"; the noun and the verb "couvent" are pronounced differently)
> Implementation
  • Dictionary based + set of rules
> Size of data
  • 0 to 20 Kbytes
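A minimal sketch of the "dictionary based + set of rules" idea, using the "to read / I have read" example. The lexicon entries, tag names and the single contextual rule are all assumptions made for illustration; they are not taken from the Acapela tagger.

```python
# Hypothetical lexicon: each word maps to the tags it may carry,
# with the default tag listed first.
LEXICON = {
    "to": ["PARTICLE"],
    "i": ["PRONOUN"],
    "have": ["AUX"],
    "read": ["VERB_BASE", "VERB_PAST_PART"],
}


def tag(words):
    """Return (word, tag) pairs: dictionary lookup, then one contextual rule."""
    tags = []
    for word in words:
        candidates = LEXICON.get(word.lower(), ["UNKNOWN"])
        choice = candidates[0]
        # Contextual rule: after an auxiliary, prefer the past participle.
        if tags and tags[-1] == "AUX" and "VERB_PAST_PART" in candidates:
            choice = "VERB_PAST_PART"
        tags.append(choice)
    return list(zip(words, tags))
```

The phonetizer can then choose the pronunciation of "read" from its tag rather than from its spelling alone.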

Phonetizer
> Function
  • Generation of phonetic transcription for each word
> Examples
  • Babel: b a b E l
> Implementation
  • Decision tree + exception dictionary
> Size of data (language dependent)
  • 5 to 350 Kbytes
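The "decision tree + exception dictionary" combination can be sketched as a lookup with a rule-based fallback. The tree here is a tiny hand-written stand-in for a trained decision tree, and the exception entry and letter rules are assumptions chosen only so that the slide's "Babel" example comes out right.

```python
# Hypothetical exception dictionary: words whose pronunciation the
# letter rules would get wrong.
EXCEPTIONS = {"mons": ["m", "o~", "s"]}


def letter_to_phoneme(word: str, i: int) -> str:
    """Toy decision tree: branch on the letter, then on the next letter."""
    letter = word[i]
    nxt = word[i + 1] if i + 1 < len(word) else ""
    if letter == "c":
        return "s" if nxt in ("e", "i") else "k"
    if letter == "e":
        # In this toy rule set a word-final "e" is silent.
        return "" if nxt == "" else "E"
    return letter


def phonetize(word: str) -> list:
    """Exception dictionary first, letter-by-letter tree otherwise."""
    key = word.lower()
    if key in EXCEPTIONS:
        return list(EXCEPTIONS[key])
    phonemes = [letter_to_phoneme(key, i) for i in range(len(key))]
    return [p for p in phonemes if p]
```

Checking the exception dictionary first keeps the tree small: irregular words do not have to be encoded as extra branches.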

Prosodic module
> Function
  • Generation of intonation:
    phoneme duration, pitch markers
> Examples
  • See MBROLI application
> Implementation
  • Prosodic patterns extracted from speech corpus
> Size of data (language dependent)
  • 30 to 300 Kbytes

Synthesizer
> Function
  • Generation of speech samples from phoneme sequence + intonation
> Implementation: 3 technologies
  • Formant-based = rules
  • Diphone concatenation
  • Unit Selection
> Size of data: depends on
  • Technology
  • Sampling frequency
  • Compression rate
  • From 50 Kbytes to 50 MB
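How sampling frequency and compression rate drive the size of a concatenative voice database can be shown with simple arithmetic. The figures in the usage note are illustrative assumptions, not Acapela data, and `voice_db_bytes` is a hypothetical helper.

```python
def voice_db_bytes(seconds: float, sample_rate_hz: int,
                   bits_per_sample: int = 16,
                   compression_ratio: float = 1.0) -> float:
    """Approximate storage for recorded speech units, in bytes.

    raw size = duration * samples per second * bytes per sample;
    the compression ratio then divides that raw size.
    """
    raw = seconds * sample_rate_hz * bits_per_sample / 8
    return raw / compression_ratio
```

For example, one minute of 16-bit speech at 16 kHz takes 1,920,000 bytes uncompressed; halving the sampling frequency to 8 kHz halves that, and a 4:1 codec divides it by four.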

Technologies (ASR)
• Speech Recognition
• Hybrid models: Hidden Markov Models / Neural Networks
• Diagram labels: acoustic analysis, neural network, discrimination, dynamic programming (decoder), HMM
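A minimal sketch of how a hybrid HMM/neural-network recognizer is commonly decoded, assuming the usual scheme in which the network outputs per-frame state posteriors, these are divided by the state priors to get scaled likelihoods, and dynamic programming (Viterbi) finds the best HMM state path. Nothing here is taken from Babear; the function name and argument layout are hypothetical.

```python
import math


def viterbi(posteriors, priors, trans, initial):
    """Return the most likely HMM state path for a sequence of frames.

    posteriors[t][s]: network output P(state s | frame t)
    priors[s]:        P(state s), assumed strictly positive
    trans[i][j]:      P(next state j | current state i)
    initial[s]:       P(first state is s)
    """
    def log(x):
        return math.log(x) if x > 0 else float("-inf")

    n_states = len(priors)

    def emit(t, s):
        # Scaled likelihood in the log domain: posterior / prior.
        return log(posteriors[t][s]) - log(priors[s])

    score = [log(initial[s]) + emit(0, s) for s in range(n_states)]
    back = []  # back[t-1][j] = best predecessor of state j at frame t
    for t in range(1, len(posteriors)):
        new_score, pointers = [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: score[i] + log(trans[i][j]))
            new_score.append(score[best_i] + log(trans[best_i][j]) + emit(t, j))
            pointers.append(best_i)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

With a two-state left-to-right model, four frames whose posteriors favour state 0 twice and then state 1 twice decode to the path `[0, 0, 1, 1]`.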

Recognition
• Vocabulary
• Phonetic transcription
  Ex: "reconnaissance" (French for "recognition"): R [@] k O n E s a~ s
• Consider all possible transcriptions!
  Ex: French "10" (dix) = dis – diz – di
• Consider synonyms!
  Ex: oui, ouais, ok, c’est cela, … ("yes", "yeah", "ok", "that's right")
  Ex: télévision, TV, poste de télévision ("television", "TV", "television set")
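The two pieces of advice above (list every transcription, list every synonym) can be sketched as two lookup tables. The transcriptions for "dix" come from the slide; the other transcriptions, the command labels and all function names are assumptions for illustration.

```python
# Hypothetical recognition vocabulary: every accepted pronunciation per word.
PRONUNCIATIONS = {
    "dix": ["d i s", "d i z", "d i"],
    "oui": ["w i"],
    "ouais": ["w E"],
}

# Hypothetical synonym table: surface forms mapped to one application command.
SYNONYMS = {
    "oui": "YES", "ouais": "YES", "ok": "YES", "c'est cela": "YES",
    "télévision": "TV", "tv": "TV", "poste de télévision": "TV",
}


def build_index(pronunciations):
    """Invert the vocabulary: transcription -> set of words it may stand for."""
    index = {}
    for word, variants in pronunciations.items():
        for variant in variants:
            index.setdefault(variant, set()).add(word)
    return index


def interpret(phrase):
    """Map a recognized phrase to its command, or None if it is unknown."""
    return SYNONYMS.get(phrase.lower().strip())
```

Keeping synonyms separate from pronunciations means the application only has to handle one label per concept, however the user phrases it.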

Recognition (continued): difficulties
• Noise
• Accents
• Hesitations
• Users
• Incorrect syntax
• Out-of-vocabulary words

ASR : advantage of NN

Acapela’s product overview

Acapela’s Technologies Overview
3 Technologies
> High-Quality TTS: the pleasant and natural sounding voice
  • voice enabled by Sayso and BrightSpeech
  • based on Unit Selection technology
> High-Density TTS: the right choice for high density and small footprints
  • voice enabled by Tempo and Babil
  • based on Diphone technology
> ASR: the robust speech recognizer
  • voice enabled by Babear
  • Speaker Independent ASR based on Hidden Markov Models and Artificial Neural Networks

High Density TTS
Voice enabled by Tempo & Babil
Two TTS technologies
• Diphone-based concatenative TTS
• Advantages:
  > Small footprint (2 to 6 MB)
  > Flexibility (pitch, speed adjustment, prosody copying)
  > High intelligibility
  > 21 languages supported
• Disadvantage:
  > Less natural sounding
• Markets/applications targeted:
  > Automotive & consumer electronics (low footprint)
  > High density, short ROI server-based TTS (telephony)
  > Multimedia software products

High Density TTS language availability

High Quality TTS
Voice enabled by Sayso & BrightSpeech
• Unit selection concatenative TTS
• Advantages:
  > Very high quality
  > Highly natural
  > Flexibility (pitch, speed adjustment, timbre alteration, whispering feature)
  > Support for custom voices (“SpeechBrand” Program)
• Disadvantage:
  > Larger footprint (16 to 70 MB)
• Markets/applications targeted:
  > High-end telephony applications (voice portal, news)
  > New generation of navigation terminals
  > Public address

High Quality TTS language availability

ASR
Voice enabled by Babear
• Hybrid technology of Hidden Markov Models and Artificial Neural Networks
• Advantages:
  > Very high accuracy in difficult contexts
  > High dialog flexibility
  > Lip-sync and language learning capabilities through phoneme-level discrimination
  > Speaker independent
  > Accurate voice activation for noisy environments
• Markets/applications targeted:
  > Industrial data collection: inventories, picking…
  > Automotive
  > Name dialing
  > Multimedia command & control / language learning

ASR language availability

Acapela’s market coverage

Acapela’s Markets
Solutions for Telecom, Automotive, Accessibility, Mobility, Industry, Multimedia, Consumer Electronics.

Acapela’s Markets
Leading 3 major and mature markets: Telecom, Automotive, Accessibility

Acapela’s main Markets
• Telecom: server-based vocalization of content for multiple users over the phone
  > for Companies: Unified messaging, Auto attendant, CRM
  > for Telcos: Unified messaging, Voice portal, SMS2Voice, directory and reverse directory
  > for Contact centers: call automation, FAQ

Acapela’s main Markets
• Automotive: on-board and off-board speech solutions
  > On-board & off-board car navigation systems
  > Traffic information
  > PDA-based applications
  > Telematics

Acapela’s main Markets
• Accessibility
  > Assistive technologies
  > Screen readers
  > Reading machines
  > Voice-controlled mobile phones

Acapela’s Markets
Creating new speech market opportunities in
>> Mobility
• Cell phones
• Navigation on PDAs

Acapela’s Markets
Creating new speech market opportunities in
>> Industry
• Public Address
• Alarm & Supervision
• Warehousing, Production Line

Acapela’s Markets
Creating new speech market opportunities in
>> Multimedia
• Edutainment
• Education
• Language learning
• E-learning

Acapela’s Markets
Creating new speech market opportunities in
>> Consumer Electronics, …
• Talking dictionary devices
• Toys

giving you the say
