Speech-to-Speech Translation: A New Direction for the Speech Industry

Mark Seligman, CEO. Speech-to-Speech Translation: A New Direction for the Speech Industry. SpeechTEK West February 21-23, 2007.

Presentation Transcript


  1. Mark Seligman, CEO Speech-to-Speech Translation: A New Direction for the Speech Industry SpeechTEK West February 21-23, 2007

  2. Converser for Healthcare is the world’s first commercially available speech-to-speech translation system for wide-ranging conversations. (Input via handwriting, touchscreen, and keyboard is also enabled.) Converser for Healthcare is an affordable, reliable, portable translation system which can improve communication 24/7 between healthcare workers and patients with limited English proficiency.

  3. Overview • Automatic Spoken Language Translation (SLT) • an age-old dream • Practical SLT systems are now coming into use • … but users must cooperate and compromise • History: three classes of SLT systems • categorized by degree of user cooperation and linguistic or topical coverage • Demo • Market • Commercial and research activity

  4. Star Trek? Not! • The goal: speak as usual • freely shift topics • full range of vocabulary, idioms, structures • spontaneous language: fragments, false starts, hesitations • mumble • converse in noisy environments • ignore the translation program • For now: some cooperation, compromise

  5. The scientific problem: component integration • Component technologies (SR, MT, TTS) • imperfect, hard to integrate • Each is usable, but combination may fall below usefulness threshold • error rates combine, compound
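The compounding noted on slide 5 can be illustrated with a rough calculation. The sketch below assumes, purely for illustration, that the three stages fail independently and that the per-stage accuracies are the made-up figures shown; neither assumption comes from the talk.

```python
from math import prod

def pipeline_accuracy(stage_accuracies):
    """Chance that an utterance passes every stage intact,
    assuming (hypothetically) that stage errors are independent."""
    return prod(stage_accuracies)

# Hypothetical per-utterance accuracies, not measured values.
stages = {"SR": 0.90, "MT": 0.85, "TTS": 0.98}

overall = pipeline_accuracy(stages.values())
print(f"end-to-end accuracy: {overall:.3f}")
```

With these invented figures, three components that each look usable on their own yield an end-to-end rate of roughly 75%, which is the kind of drop the slide warns about.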

  6. Class One (example: Phraselator by VoxTec) • Class One: voice-driven phrase book • linguistic coverage: narrow • topical coverage: narrow • cooperation required: low • Fixed expressions or templates only • “I’d like a bottle of [beer, wine, soda], please.” • “I’d like a bottle of [BEVERAGE], please.” • Advantages for user • no need to carry a book -- use telephone • selection of phrase by voice rather than finger • translation output pronounced by native speaker • Technology • Speech recognition: IVR • MT: flat lookup, template or example-based • Engineering exercise: low risk
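The "flat lookup, template" idea on slide 6 might look something like the following minimal sketch. The names (`BEVERAGES`, `TEMPLATE_EN`, `TEMPLATE_FR`, `translate`), the choice of French as the target language, and the use of a regular expression are all assumptions made for illustration; the talk does not describe how any Class One product is actually built.

```python
import re

# Hypothetical phrase-book entry: one English template with a slot,
# paired with a French rendering. The slot fillers are a closed list.
BEVERAGES = {"beer": "bière", "wine": "vin", "soda": "soda"}

TEMPLATE_EN = re.compile(r"i'd like a bottle of (\w+), please\.?", re.IGNORECASE)
TEMPLATE_FR = "Je voudrais une bouteille de {beverage}, s'il vous plaît."

def translate(utterance):
    """Return the French phrase for a recognized template, else None."""
    text = utterance.strip().replace("’", "'")  # normalize curly apostrophes
    match = TEMPLATE_EN.fullmatch(text)
    if match is None:
        return None  # outside the phrase book: no translation offered
    beverage = BEVERAGES.get(match.group(1).lower())
    if beverage is None:
        return None  # slot filler not in the closed list
    return TEMPLATE_FR.format(beverage=beverage)

print(translate("I'd like a bottle of wine, please."))
print(translate("Could I get some water?"))
```

The second call returns `None`, which reflects the narrow coverage of this class: anything outside the fixed templates is simply not translated.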

  7. Other Class One • Sony TalkMan • Pending entries • Sharp • NEC • Future IVR systems?

  8. Class Two • Class Two: robust speech translation in narrow domains • linguistic coverage: broad • topical coverage: narrow • cooperation required: medium • Examples • Uh, could I reserve a double room for next Tuesday, please? • I need to, um, I need a double room please. That’s for next Tuesday. • Hello, I’m calling about reserving a room. I’d be arriving next week on Tuesday. • Advantages • Lots of experience • Can optimize SR, MT: special grammars (patterns) • Interlingua possible for MT • Challenges • Robust parsing still imperfect, so MT input is dirty • Some user frustration inevitable, but balanced by freedom • Risk: medium

  9. Class Two: Worldwide Research • CMU/Univ Karlsruhe (USA/Germany) • ATR (Japan) • IRST (Italy) • ETRI (Korea) • GETA-CLIPS (France) • CAS-NLPR (China) • IBM (USA)

  10. Class Two: Research

  11. Class Three • Class Three: highly interactive speech translation with broad linguistic and topical coverage • linguistic coverage: broad • topical coverage: broad • cooperation required: extensive • User achieves broad coverage by supervising • SR: need dictation for broad coverage • MT: need broad coverage, good quality • Must be modifiable to enable interactive correction

  12. In the beginning … • French: Qu’est-ce que vous étudiez? • (What do you study?) • English: Computer science. • (L’informatique.) • French: Qu'est-ce que vous faites plus tard? • (What are you doing later?) • English: I'm going skiing. • (Je vais faire du ski.) • French: Vous n'avez pas besoin de travailler? • (You don't need to work?) • English: I'll take my computer with me. • (Je prendrai mon ordinateur avec moi.) • French: Où est-ce que vous mettrez l'ordinateur pendant que vous skiez? • (Where will you put the computer while you ski?) • English: In my pocket. • (Dans ma poche.)

  13. Converser Features

  14. Demo

  15. Market: U.S. Healthcare • 200,000 potential customers • Healthcare venues • 6,003 hospitals (2003 www.USNews.com) • 836,156 physicians (2001 www.ama.com) • 15-20 minutes/meeting • $45-$150/hour for human interpreter
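The meeting length and hourly rates on slide 15 imply a per-meeting cost range for human interpreting. The arithmetic below is a rough sketch that assumes pro rata billing by the minute with no minimum charge; that billing model and the function name `cost_per_meeting` are assumptions, not something the talk states.

```python
def cost_per_meeting(minutes, hourly_rate):
    """Interpreter cost for one meeting, billed pro rata by the minute."""
    return hourly_rate * minutes / 60

low = cost_per_meeting(15, 45)    # shortest meeting, lowest rate
high = cost_per_meeting(20, 150)  # longest meeting, highest rate
print(f"${low:.2f} to ${high:.2f} per meeting")
```

Under that assumption, a single interpreted meeting costs between $11.25 and $50.00.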

  16. Value Proposition • Operational • significant ROI • 24/7 access to interpreting • reduced patient waiting time • more efficient use of employees (keep staff in their positions) • patient SAFETY (real and perceived) • reduced liability: bilingual transcripts of interaction with patients • compliance • Communication benefits • privacy • more verifiability, consistency than with human interpreter • Informed consent

  17. Worldwide Market • IDC • Cross-language software: $67 billion (2000) to $237 billion (2005) • Worldwide e-business globalization support: > $540 billion • Multilingual communications, collaboration tools: $5 billion (by 2008) • Allied Business Intelligence, Inc. • Worldwide human translation: $5.7 billion (in 2006) • Global Reach • 70%+ of online population not native English

  18. Markets • Defense and Security • services, intelligence, allies • law enforcement • Travel and Tourism • Language Instruction/Education • Government Service • immigration • welfare, food stamps, etc. • Business • B2C: customer service • B2B: multinational firms, global partners/operations • Consumer • online affinity/personal portals (e.g. online dating)

  19. Some Current Research/Commercial Activity • Spoken Translation, Inc. (Converser) • IBM (Mastor) • Sehda (S-Minds) • SpeechGear (Compadre Interpreter) • VoxTec (Phraselator) • Sony/Sharp/NEC (tourist) • Ectaco (Dictionary +) • MIT (flight domain) • CMU (Arabic for military) • BBN (Arabic for military)

  20. Thank you! To view demo visit: www.ConverserforHealthcare.com
