Speech Technology Group Cambridge Research Lab Toshiba Research Europe Ltd

Kate Knill
Manager, Interaction Technology
kate.knill@crl.toshiba.co.uk
12 January 2010

Toshiba
• World leader in high technology
• 3 key areas:
  • Digital media
  • Electronic devices and components
  • Social infrastructure systems
• 197,000 employees worldwide
• Sales over US$70 billion
• Strong ecological commitment

Toshiba R&D: Toward the Innovation Driven Company
• TARI Branch Office in Silicon Valley, San Jose
• Toshiba China R&D Center, Beijing
• Toshiba Corporate R&D Center
• Toshiba America Research, Inc., Piscataway, New Jersey
• Toshiba Research Europe Limited
  ◆ Cambridge Research Laboratory (CRL)
  ◆ Telecommunications Research Laboratory, Bristol

Toshiba Cambridge Research Lab
• Established 1991: Semiconductor Physics for the 21st Century
  • Quantum Information
  • Nano-biotechnology
• Speech Technology Group added 2002
• Computer Vision Group added 2006

Toshiba Speech and Language R&D
• Toshiba Research Europe Ltd, Cambridge
• Toshiba China R&D, Beijing
• Toshiba Corporate R&D Center, Kawasaki

CRL Speech Technology Group
• Focus on embedded ASR and TTS
• Core technology research and development
  • Noise and speaker robustness
  • LVCSR
  • HMM-TTS
• European and North American languages
• Approx. 15 researchers
  • Multinational team
  • Mix of engineers, computer scientists and linguists

Vision of Toshiba Speech Research
• Enhance the human-machine interface
  • Interact with devices how, when and where you want
• Create a paradigm shift
  • Input/output → communication

Speech Recognition Challenges
Task Robustness, Noise Robustness, Speaker Robustness
• Current ASR engines still suffer from lack of robustness
• Major limitation in deploying speech recognition systems

Text-to-Speech Synthesis Challenges
• Increase in naturalness of synthesis
  • Same or even smaller footprint!
• Increase in voice variety
  • Faster, cheaper addition
  • Non-professional voices
[Figure labels: neutral, emotional, friendly, expressive; large corpus professional voice, small corpus professional voice, small corpus amateur voices]

Toshiba in SCALE: Second Supervisor
• Recognition
  • Kate Knill
  • KK Chin
  • Projects:
    • RS-3 Hierarchical Trajectory Models for Speech Recognition, Heyun Huang, Lou Boves
    • AHSR-2 Data Association Multisource Acoustic Models, Liang Lu, Steve Renals
• Synthesis
  • Heiga Zen
  • Projects:
    • RS-1 Trajectory HMMs for Reactive Speech Synthesis, Cassia Valentini, Simon King
    • RS-4 Speech Synthesis by Analysis, Mauro Nicolao, Roger Moore
