Low footprint high quality Text-to-Speech
Ron Hoory
Dec 11, 2001
IBM Research Lab in Haifa
Concatenative Text to Speech
• Current embedded TTS solutions
  • mostly "formant" based, with a metallic, unnatural sound.
• Concatenative TTS
  • speech segments are selected from a large database and concatenated together.
  • produces high quality speech that sounds more natural and human.
  • currently part of IBM's "WebSphere Voice Server" offering.
• Implementation for embedded platforms
  • the speech database size (> 50MB) is an obstacle.
  • HRL (the IBM Haifa Research Lab) is currently working on low footprint concatenative TTS, aimed at enabling high quality TTS on embedded platforms.
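To make the "selected from a large database and concatenated together" bullet concrete, here is a minimal sketch in Python. It is not IBM's implementation: the toy database, the unit names, and the functions `crossfade_concat` and `synthesize` are all hypothetical, unit selection is reduced to a plain lookup by name, and a linear crossfade stands in for whatever joint smoothing a real system uses.

```python
from typing import Dict, List


def crossfade_concat(segments: List[List[float]], overlap: int) -> List[float]:
    """Join waveform segments, linearly crossfading `overlap` samples at each joint."""
    if not segments:
        return []
    out = list(segments[0])
    for seg in segments[1:]:
        # Never overlap more samples than either side actually has.
        n = min(overlap, len(out), len(seg))
        start = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight of the incoming segment
            out[start + i] = (1.0 - w) * out[start + i] + w * seg[i]
        out.extend(seg[n:])
    return out


def synthesize(units: List[str], database: Dict[str, List[float]],
               overlap: int = 2) -> List[float]:
    """Hypothetical front end: look each unit up by name, then concatenate."""
    return crossfade_concat([database[u] for u in units], overlap)


# Toy database (assumption): two four-sample "segments".
toy_db = {"a": [1.0, 1.0, 1.0, 1.0], "b": [0.0, 0.0, 0.0, 0.0]}
wave = synthesize(["a", "b"], toy_db, overlap=2)
```

With an overlap of 2, the two four-sample segments share two samples at the joint, so `wave` has six samples rather than eight. A real system would choose among many candidate segments per unit, which is why the database grows so large.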
Activity at HRL
• Goals:
  • adapt the concatenative TTS technology developed at the IBM T.J. Watson Research Center to low footprint operation.
  • speech database size: < 5MB, down from > 50MB (more than a tenfold reduction).
  • maintain the same level of quality as the server-based TTS.
  • integrate into IBM's embedded ViaVoice offering.
• Means:
  • speech segments are stored as compressed feature vectors, which can be reconstructed back into speech after concatenation.
  • novel HRL compression and reconstruction techniques are used.
• Contact points and IBM partners:
  • HRL Audio/Video group: Zohar Sivan, Ron Hoory.
  • IBM Voice Systems: Tom Rutherfoord.
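The "Means" bullets describe storing segments as compressed feature vectors and reconstructing speech only after concatenation. The slides do not say how HRL compresses the features, so the sketch below swaps in plain uniform 8-bit scalar quantization as a stand-in. The function names, the value range `lo`/`hi`, and the toy frames are assumptions for illustration, and the final step of turning decoded features back into a waveform is left out.

```python
from typing import List


def quantize(vec: List[float], lo: float, hi: float) -> bytes:
    """Map each value in [lo, hi] to one byte (uniform 8-bit quantization)."""
    scale = 255.0 / (hi - lo)
    # Values outside [lo, hi] are clamped to the nearest end of the range.
    return bytes(min(255, max(0, round((v - lo) * scale))) for v in vec)


def dequantize(code: bytes, lo: float, hi: float) -> List[float]:
    """Approximate inverse of quantize()."""
    step = (hi - lo) / 255.0
    return [lo + b * step for b in code]


def concat_compressed(segments: List[List[bytes]]) -> List[bytes]:
    """Concatenate segments frame by frame while they are still compressed."""
    return [frame for seg in segments for frame in seg]


# Toy data (assumption): each segment is a list of 3-dimensional feature frames.
LO, HI = 0.0, 1.0
seg_a = [quantize([0.10, 0.25, 0.90], LO, HI), quantize([0.20, 0.30, 0.80], LO, HI)]
seg_b = [quantize([0.40, 0.60, 0.70], LO, HI)]

frames = concat_compressed([seg_a, seg_b])              # still one byte per value
decoded = [dequantize(f, LO, HI) for f in frames]       # features, not yet speech
```

Each value costs one byte here instead of the four or eight bytes of a float, and the decoding error per value is at most half a quantization step. Reaching the < 5MB target while keeping server-level quality is presumably what the novel HRL techniques address; this sketch only illustrates the order of operations: compress, concatenate, then decode.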