
Rhetorical Group plc

Rhetorical Group plc. Marc Moens, January 2001. Focus of Rhetorical Group: speech synthesis, producing natural-sounding speech for talking computers. Voices are modelled on human voices and are often almost indistinguishable from the original voice. Different from speech recognition.


Presentation Transcript


  1. Rhetorical Group plc • Marc Moens • January 2001

  2. Focus of Rhetorical Group • Speech synthesis • Producing natural-sounding speech • Talking computers • Voices modelled on human voices and often almost indistinguishable from the original voice • Different from speech recognition • Speech recognition is used in dictation and voice-controlled operating systems • Different companies, targeting different markets • Core product: rVoice

  3. Technological breakthrough • Old technology: • Formant-based synthesis • Diphone-based synthesis • Lacks vitality • Monotonous • Not suitable for extended use

  4. Technological breakthrough • rVoice: • Unit selection • More natural-sounding • Suitable for extended use • In many applications: almost indistinguishable from a human voice • “Welcome to our new speech synthesiser.”
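The slide names unit selection without describing how rVoice implements it. As a rough illustration of the general idea only, the sketch below picks one recorded unit per target sound so that the sum of a target cost (how well a unit matches what is wanted) and a join cost (how smoothly neighbouring units connect) is minimal, using dynamic programming. The `Unit` class, the single pitch feature, and both cost functions are hypothetical simplifications, not anything taken from rVoice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A recorded speech unit (hypothetical, heavily simplified)."""
    phone: str
    pitch: float  # Hz; stands in for a full acoustic feature set


def target_cost(unit, wanted_pitch):
    # How far the candidate is from the requested pitch.
    return abs(unit.pitch - wanted_pitch)


def join_cost(left, right):
    # How audible the seam between two neighbouring units would be.
    return abs(left.pitch - right.pitch)


def select_units(targets, inventory):
    """Return the cheapest unit sequence for targets = [(phone, pitch), ...]."""
    if not targets:
        return []
    candidates = [inventory[phone] for phone, _ in targets]
    # costs[i][j]: cheapest total cost of a path ending at candidate j of slot i
    costs = [[target_cost(u, targets[0][1]) for u in candidates[0]]]
    back = [[None] * len(candidates[0])]
    for i in range(1, len(targets)):
        row_costs, row_back = [], []
        for unit in candidates[i]:
            prev = candidates[i - 1]
            best = min(range(len(prev)),
                       key=lambda k: costs[i - 1][k] + join_cost(prev[k], unit))
            total = (costs[i - 1][best] + join_cost(prev[best], unit)
                     + target_cost(unit, targets[i][1]))
            row_costs.append(total)
            row_back.append(best)
        costs.append(row_costs)
        back.append(row_back)
    # Trace the cheapest path backwards from the last slot.
    j = min(range(len(costs[-1])), key=costs[-1].__getitem__)
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = back[i][j]
    return path[::-1]


inventory = {
    "h": [Unit("h", 110), Unit("h", 150)],
    "i": [Unit("i", 115), Unit("i", 160)],
}
chosen = select_units([("h", 120), ("i", 120)], inventory)
print([u.pitch for u in chosen])  # [110, 115]
```

A production system would use far richer features and a much larger inventory; the point here is only the shape of the search.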

  5. Speech synthesis • Human voice vs synthesised voice • Under controlled conditions • Mixing the human voice with the synthesised voice • “Previously he was vice president of Eastern Edison.” • “Mrs Hill said many of the 25 countries that she placed under varying degrees of scrutiny had made genuine progress on this touchy issue.”

  6. Rhetorical Team • Senior management: Marc Moens (CEO), Paul Taylor (CTO), Peter Denyer (Chairman) • Other management: Keith Edwards (applications manager), Ian Hodson (product development manager), Art Blokland (consultancy manager) • Full team: 35 people

  7. rVoice outlook • A variety of applications and platforms: • Telephony industry • Games • Internet • Mobile communications • A variety of input mechanisms • Text (TTS) • Concept to speech – in conjunction with language generation • Domain specific applications • A variety of voices and languages • rVoice rapid voice prototyper allows new voices to be added to the system in a matter of weeks • Different accents and languages covered • All within a single generic system

  8. rVoice core capabilities: domain specific synthesis • Flexible, scalable domain specific synthesis • Airline information • Car directions • Financial news

  9. rVoice core capabilities: multi-linguality • Currently only English available • Plans: • German and French by Q2 2001 • Spanish Q3 • Dutch and Italian Q4 • Same engine for all languages

  10. rVoice core capabilities: text analysis • Robust statistical: • Text normalisation ($1.43 → one dollar forty three cents) • POS tagging • Phrase break prediction • Letter-to-sound rule transduction (including automatic training) • Syntactic parsing
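To make the text normalisation example on this slide concrete, here is a minimal sketch that expands small dollar amounts into words. It is a hand-written rule, not the statistical approach the slide describes, and the function names (`number_to_words`, `normalise`) and the restriction to amounts under $1,000 are assumptions made for illustration.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer from 0 to 999, unhyphenated as on the slide."""
    if not 0 <= n <= 999:
        raise ValueError("only 0..999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("" if ones == 0 else " " + ONES[ones])
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words if rest == 0 else words + " " + number_to_words(rest)


# Matches amounts such as $1.43; larger or differently formatted amounts
# are deliberately left untouched.
CURRENCY = re.compile(r"\$(\d{1,3})\.(\d{2})\b")


def _expand(match):
    dollars, cents = int(match.group(1)), int(match.group(2))
    dollar_word = "dollar" if dollars == 1 else "dollars"
    cent_word = "cent" if cents == 1 else "cents"
    return (f"{number_to_words(dollars)} {dollar_word} "
            f"{number_to_words(cents)} {cent_word}")


def normalise(text):
    """Replace every supported dollar amount in text with its spoken form."""
    return CURRENCY.sub(_expand, text)


print(normalise("The fee is $1.43 today."))
# The fee is one dollar forty three cents today.
```

Real input is far messier (dates, abbreviations, ambiguous symbols), which is presumably why the slide stresses robust statistical methods over fixed rules like this one.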

  11. Talking Heads • Ongoing work on rFace • Ability to capture 3D model of any head, and combine it with speech

  12. System Overview • Two systems: • rVoice developer: single-user stand-alone system with scripting language and graphical tools • rVoice run: compact, fast, multithreaded run-time system with client-server architecture and telephony hardware communications

  13. Current Platforms • Solaris 2.5, 2.5.1, 2.6, 2.7 • FreeBSD 2.2, 3.x • Linux (Red Hat 4.1, 5.0, 5.1, 5.2, 6.0 and other Linux distributions) • OSF (DEC Alphas) • SGI (IRIX) • HP (HP-UX) • Windows 95, 98, NT 4.0, 2000 (Visual C++ v5.0 and v6.0)

  14. Speed and Size • rVoice 1.0 aims: • 10 simultaneous channels on a 1 GHz Pentium • 256 MB RAM, of which 15 MB is taken up by each channel and 75 MB by shared resources • A higher number of channels is available with a proportionate reduction in voice quality
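The memory figures on this slide can be checked with simple arithmetic: 75 MB shared plus 15 MB for each of 10 channels is 225 MB, which fits within 256 MB. The short sketch below does that calculation; the function names are illustrative, and it models memory only, not the CPU load that the 1 GHz figure refers to.

```python
def memory_needed_mb(channels, per_channel_mb=15, shared_mb=75):
    """Total memory for a number of simultaneous channels (slide figures)."""
    return shared_mb + channels * per_channel_mb


def max_channels(total_mb, per_channel_mb=15, shared_mb=75):
    """Largest channel count whose memory footprint fits in total_mb."""
    return max(0, (total_mb - shared_mb) // per_channel_mb)


print(memory_needed_mb(10))  # 225
print(max_channels(256))     # 12
```

On these numbers RAM alone would allow 12 channels, so the stated target of 10 presumably reflects processor load or headroom rather than memory; that reading is an inference, not something the slide states.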

  15. System Development Schedule • Basic prototype: January 31, 2001 • Alpha release (single thread): February 28, 2001 • Beta release (multi-threaded): April 15, 2001 • Full release: May, June 2001

  16. Development Schedule: capabilities • First basic British voice: 6th December 2000 • Five British voices: end January 2001 • Five American voices: mid February 2001 • VoiceXML: end January 2001 • Fast unit selection: end January 2001

  17. Future Plans • Extension to new platforms including games and mobile devices • Development and integration with • language generation • information extraction and retrieval

  18. Contact • Peter Denyer, Rhetorical Group plc, 4 Buccleuch Place, Edinburgh EH8 9LW; Tel: 07770 416 699; Fax: 0131 650 4587; Email: P_Denyer@yahoo.com • Marc Moens, Rhetorical Group plc, 4 Buccleuch Place, Edinburgh EH8 9LW; Tel: 0131 650 4427, 07979 596770; Fax: 0131 650 4587; Email: marc@rhetoricalsystems.com

  19. Make rVoice your voice.
