
Speech in .NET

Speech in .NET. Sphinx CMU, November 2002. Presenter: casey chesnut, brains-N-brawn.com (Web Services, Mobile / Wireless, Speech). Audience: Java / C++ / VB / C#? VoiceXml? SALT / Speech .NET? Outline: MS Technologies, VoiceXml, Demo, Speech .NET, Demo, Future, Questions (throughout).


Presentation Transcript


  1. Speech in .NET Sphinx CMU November 2002

  2. Presenter • casey chesnut • brains-N-brawn.com • Web Services • Mobile / Wireless • Speech

  3. Audience • Java / C++ / VB / C# ? • VoiceXml ? • SALT / Speech .NET ?

  4. Outline • MS Technologies • VoiceXml • Demo • Speech .NET • Demo • Future • Questions (throughout) • ~25 slides

  5. MS Technologies • Tools • Devices • Phone • Desktop PC • Pocket PC • Tablet PC

  6. Tools • MS Agents • SAPI / Speech SDK 5.1 (.NET wrappable) • Office • AutoPC ??? • ASP .NET (VoiceXml) • (beta) Speech .NET / IE Speech Add-In • … SALT Telephony gateway (early 2003) • … Pocket IE Speech Add-In (mid 2003)

  7. Devices • Phone • billions of devices; people are comfortable speaking into them • Desktop PC • large market; speech input is slower and uncomfortable • Pocket PC • small market; opportunities for speech (device limitations) • Tablet PC • new market; speech friendly (slate models don’t have keyboards)

  8. Phone • ASP .NET w/ VoiceXml 2.0 • Production quality now • Multiple vendor support • Speech .NET VoiceOnly • Currently no way to deploy and test over a phone • Speech .NET Beta 2 has telephony simulation • MS target market for Speech .NET

  9. Desktop PC • Web • Speech .NET MultiModal • Beta 2 IE Speech Add-In • Embedded control w/SAPI • MS Agents • Fat • SAPI • MS Agents

  10. Pocket PC • Web • SALT Pocket IE Speech Add-Ins (mid 2003) • Fat • 3rd parties only • MS Reader does not support TTS

  11. Tablet PC - TODAY! • Web • … same as desktop PC • Beta 2 has added support for Tablet PC • Virtual keyboard has speech control • Fat • … same as desktop PC • Virtual keyboard has speech control • MS Reader should be able to support TTS • Digital Ink is currently more compelling to MS

  12. VoiceXml • XML-based language • Declarative – XML tags, grammars • Procedural – Javascript • Telephony Gateway is the client • Event driven – Bargein, Goodbye • Object oriented – Properties
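The aspects listed on slide 12 can be seen together in one small document. The sketch below is not from the talk: it uses Python's standard library to emit a minimal VoiceXML 2.0 form. The field name `city`, the prompt wording, and the file names `city.grxml` and `weather.aspx` are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"  # VoiceXML 2.0 namespace


def build_city_form(submit_url: str, grammar_url: str) -> str:
    """Return a minimal VoiceXML 2.0 document as a string (illustrative only)."""
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})

    # Properties: barge-in is switched on through a platform property.
    ET.SubElement(vxml, "property", {"name": "bargein", "value": "true"})

    form = ET.SubElement(vxml, "form", {"id": "weather"})
    field = ET.SubElement(form, "field", {"name": "city"})

    # Declarative: the grammar and the prompt are plain XML tags.
    ET.SubElement(field, "grammar",
                  {"src": grammar_url, "type": "application/srgs+xml"})
    prompt = ET.SubElement(field, "prompt")
    prompt.text = "Which city?"

    # Event driven: a handler that runs when the caller stays silent.
    noinput = ET.SubElement(field, "noinput")
    noinput.text = "Sorry, I did not hear you."
    ET.SubElement(noinput, "reprompt")

    # Procedural: an ECMAScript expression runs once the field is filled,
    # then the result is sent back to the web server.
    filled = ET.SubElement(field, "filled")
    ET.SubElement(filled, "assign",
                  {"name": "city", "expr": "city.toLowerCase()"})
    ET.SubElement(filled, "submit", {"next": submit_url, "namelist": "city"})

    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_city_form("weather.aspx", "city.grxml"))
```

The telephony gateway, acting as the client, would fetch a document like this over HTTP in the same way a browser fetches HTML.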

  13. Usage • Input • Speech Recognition (Command and Control) • DTMF • Voice recording and posting to a server • Output • Text-To-Speech • Prerecorded audio files • Telephony control • Hang-up, Transfers, …
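As a rough illustration of the usage categories on slide 13, the sketch below holds a small VoiceXML 2.0 document in a Python string and lists which elements it uses. The document is an assumption written for this example, not the demo from the talk; the file names (`welcome.wav`, `save.aspx`) and the phone number are placeholders.

```python
import xml.etree.ElementTree as ET

USAGE_VXML = """\
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Input: dtmf="true" lets callers say a choice or press 1 or 2 -->
  <menu id="main" dtmf="true">
    <prompt>
      <!-- Output: prerecorded audio, with TTS text as the fallback -->
      <audio src="welcome.wav">Welcome.</audio>
      Say sales or message, or press 1 or 2.
    </prompt>
    <choice next="#sales">sales</choice>
    <choice next="#message">message</choice>
  </menu>
  <form id="sales">
    <!-- Telephony control: transfer the call, then hang up -->
    <transfer name="call" dest="tel:+15555550100" bridge="true"/>
    <block><disconnect/></block>
  </form>
  <form id="message">
    <!-- Input: record the caller and post the audio to a server -->
    <record name="msg" beep="true" maxtime="30s">
      <prompt>Leave a message after the beep.</prompt>
    </record>
    <block>
      <submit next="save.aspx" method="post"
              enctype="multipart/form-data" namelist="msg"/>
    </block>
  </form>
</vxml>
"""


def elements_used(document: str) -> set:
    """Parse the document and return the set of element names it contains."""
    root = ET.fromstring(document)
    # Strip the "{namespace}" prefix that ElementTree adds to each tag.
    return {el.tag.split("}", 1)[-1] for el in root.iter()}


if __name__ == "__main__":
    print(sorted(elements_used(USAGE_VXML)))
```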

  14. Architecture

  15. VoiceXml • DEMO • /vxml (VS.NET) • Mobile ADK (menu1.aspx) • BeVocal

  16. VoiceXml - SALT • VoiceXml : ??? :: SALT : Speech .NET • Nuance has some WYSIWYG • SALT is considered lightweight compared to VoiceXml • SALT was submitted to W3C August 2002 • VoiceXml is v2.0 in W3C • Mandatory W3C grammar spec • Beta 2 Speech .NET has moved to W3C SRGS • VoiceXml has complementary specs (ccXml) • VoiceXml is moving to MultiModal as well
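For readers who have not seen the W3C grammar format mentioned on slide 16, here is a minimal sketch of an SRGS grammar in its XML form, generated with Python's standard library. The rule name `city` and the phrases are assumptions for illustration; a real command-and-control grammar would usually be larger and carry semantic tags.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"  # SRGS 1.0 namespace


def build_one_of_grammar(rule_id, phrases, lang="en-US"):
    """Return an SRGS XML grammar whose root rule accepts any one phrase."""
    grammar = ET.Element("grammar", {
        "xmlns": SRGS_NS,
        "version": "1.0",
        "xml:lang": lang,
        "mode": "voice",
        "root": rule_id,   # the rule a recognizer starts from
    })
    rule = ET.SubElement(grammar, "rule", {"id": rule_id, "scope": "public"})
    one_of = ET.SubElement(rule, "one-of")
    for phrase in phrases:
        # Each <item> is one alternative the caller may say.
        ET.SubElement(one_of, "item").text = phrase
    return ET.tostring(grammar, encoding="unicode")


if __name__ == "__main__":
    print(build_one_of_grammar("city", ["Pittsburgh", "Kansas City", "Seattle"]))
```

Because the same grammar format is accepted on both sides, a file like this could in principle be referenced from a VoiceXml `<grammar>` element or from a SALT page.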

  17. VoiceXml - SALT • VoiceXml = AT&T, Motorola, TellMe, (IBM) • SALT = MS, SpeechWorks, Intel, (BeVocal) • VoiceXml has multiple vendor support with venture capital from before the burst • Most vendors will support both specs • VoiceXml has ~ 15,000 developers • SALT has potentially millions

  18. SALT • I have not read the new spec • Remember doing an in-head mapping to VoiceXml when reading an early spec • Why • Common spec for MultiModal operation • Multiple modes of interaction with the same syntax • Speech enabling existing sites • Why not VoiceXml • MultiModal retrofit harder than redo
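To make "speech enabling existing sites" concrete, the sketch below shows roughly what a multimodal SALT page looks like: an ordinary HTML form with a few SALT tags added next to it. It is not taken from the talk. The namespace URI and the element and attribute names follow my recollection of the SALT 1.0 specification and should be checked against it; the ids (`txtCity`, `listenCity`), the grammar file `city.grxml`, and the XPath `//city` are assumptions.

```python
import xml.etree.ElementTree as ET

SALT_NS = "http://www.saltforum.org/2002/SALT"  # assumed SALT 1.0 namespace

SALT_PAGE = """\
<html xmlns:salt="http://www.saltforum.org/2002/SALT">
  <body>
    <!-- The existing visual form: still works with keyboard and mouse -->
    <input name="txtCity" type="text"/>
    <!-- A mouse click starts recognition, so the same page serves both modes -->
    <input name="btnCity" type="button" value="Speak"
           onclick="listenCity.Start();"/>
    <!-- Speech added alongside the form rather than replacing it -->
    <salt:listen id="listenCity">
      <salt:grammar src="city.grxml"/>
      <salt:bind targetelement="txtCity" value="//city"/>
    </salt:listen>
  </body>
</html>
"""


def bound_targets(page: str) -> list:
    """Return the names of the HTML elements that SALT binds results into."""
    root = ET.fromstring(page)
    return [b.get("targetelement") for b in root.iter(f"{{{SALT_NS}}}bind")]


if __name__ == "__main__":
    print(bound_targets(SALT_PAGE))
```

The design point is that the text box exists independently of the speech tags, which is why adding speech to an existing page can be easier with SALT than rewriting the page as a VoiceXml dialog.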

  19. Speech .NET • MS implementation of SALT • (VoiceWebSolutions + DreamWeaver MX) • Some Beta 1 Speech .NET apps still work because SALT has not changed much, but the Speech .NET Beta 2 controls have • VoiceXml is not as portable between vendors as it should be; the Speech .NET controls could help mitigate this for SALT • i.e. a layer of abstraction for the voice browser wars

  20. Architecture

  21. Code • Creating static grammars and prompts • Very little server-side code • Only dynamic grammars / prompts • Server-side code mods to better support speech • Mainly setting properties on Speech controls and tying to client-side javascript • Tie javascript to mouse-click events to avoid redundant code

  22. Impression • Separate app layers to reduce complexity • Voice UI will be less functional, design is key • Learning low level SALT might be easier than high level Speech .NET controls • Application controls change this in Beta 2 • Speech .NET has a great debugger (now server side too), grammar, and prompt tools • Speech Control Editor was needed for dev • IE Audio meter was needed for MultiModal • MultiModal has some time to grow

  23. Speech .NET • DEMO • Speech .NET Beta 2 (VS .NET) • /noHands (VoiceOnly web app)

  24. Industry • Wrote 1st VoiceXml article a year ago • Received 1st proposal request last month • 1 other proposal request since then • Wrote 1st Speech .NET article 5 months ago • Request for an article from MSDN magazine

  25. Voice Recognition • PSTN is less secure than the Internet! • More accessible, and easier to automate a hack • Traditionally a spoken password OR DTMF PIN, also # • Clients always confuse it with speech recognition • Not a part of the VoiceXml or SALT specs • Telephony gateways have proprietary implementations • Not useful for identifying somebody • Useful for confirming somebody is who they say they are • Prints have to change when the device changes

  26. Future (MS Speech) • SALT Telephony gateways • Speech .NET (VoiceOnly then MultiModal) • Pocket IE Speech Add-In • .NET Fat-client Speech APIs • Desktop / Tablet / PPC • MS or 3rd party VS .NET VoiceXml controls • Possibility for Speech .NET controls to render both SALT and VoiceXml

  27. Future • Lots of W3C Voice specs … • VoiceXml MultiModal browser • Auto (hands-free, navigation, radio) • 3G (bridge voice and wireless web) • offload Speech processing • VOIP or PSTN • Pocket PC Phone Edition / SmartPhones • IBM recently announced chip for Speech on mobile devices

  28. Questions
