
A Standard for Developing Multimodal Applications


Presentation Transcript


  1. A Standard for Developing Multimodal Applications
     James A. Larson, Larson Technical Services, jim @ larson-tech.com
     SpeechTEK West, February 23, 2007

  2. Status of W3C Multimodal Interface Languages
     Recommendation: VoiceXML 2.0; Speech Recognition Grammar Specification (SRGS) 1.0; Speech Synthesis Markup Language (SSML) 1.0; Semantic Interpretation for Speech Recognition (SISR) 1.0
     Proposed Recommendation: VoiceXML 2.1
     Candidate Recommendation:
     Last Call Working Draft: Extensible MultiModal Annotation (EMMA) 1.0
     Working Draft: State Chart XML (SCXML) 1.0; InkML 1.0
     Requirements:
     Developing & Delivering Multimodal Applications

  3. Interaction Manager Approaches
     Object-oriented: Interaction Manager (C#); SAPI 5.3
     SALT: Interaction Manager (XHTML); SALT
     X+V: Interaction Manager (XHTML); VoiceXML 2.0 Modules
     W3C: Interaction Manager (SCXML); Data Model; XHTML, VoiceXML 3.0, InkML

  4. [Table comparing four approaches (columns: Object-oriented, SALT, X+V, W3C)]
     Standard languages listed: SRGS, SSML, SISR, VoiceXML, XHTML, SCXML, EMMA, CCXML
     Interaction Manager: C# (Object-oriented), XHTML (SALT), XHTML (X+V), SCXML (W3C)
     Modes: GUI and Speech in all four columns; Ink … under W3C

  5. MMI Architecture: Basic Components
     • Interaction Manager: coordinates modality components and provides application flow
     • Modality Components: provide modality capabilities such as speech, pen, keyboard, mouse
     • Data Model: handles shared data
     [Diagram: Interaction Manager (SCXML), Data Model, and the modality components XHTML, VoiceXML 3.0, and InkML]

  6. Multimodal Architecture and Interfaces
     • A loosely coupled, event-based architecture for integrating multiple modalities into applications
     • All communication is event-based
     • Based on a set of standard life-cycle events
     • Components can also expose other events as required
     • Encapsulation protects component data
     • Encapsulation enhances extensibility to new modalities
     • Can be used outside a Web environment
     [Diagram: Interaction Manager (SCXML), Data Model, and the modality components XHTML, VoiceXML 3.0, and InkML]

  7. Specify Interaction Manager Using Harel State Charts
     • Extension of state transition systems
     • States
     • Transitions
     • Nested state-transition systems
     • Parallel state-transition systems
     • History
     [Diagram: state chart with states PrepareState, StartState, WaitState, EndState, and FailState; transitions labeled Prepare Response (success), Prepare Response (fail), Start Response, Start Fail, Done Success, and Done Fail]

  8. State Chart XML (SCXML)
     …
     <state id="PrepareState">
       <send event="prepare" contentURL="hello.vxml"/>
       <transition event="prepareResponse" cond="status='success'" target="StartState"/>
       <transition event="prepareResponse" cond="status='failure'" target="FailState"/>
     </state>
     …
     [Diagram: Example State Transition System, with states PrepareState, StartState, WaitState, EndState, and FailState and transitions Prepare Response (success), Prepare Response (fail), Start Response, Start Fail, Done Success, and Done Fail]
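The transitions in the SCXML fragment above can be mirrored in a small Python lookup table. This is only a sketch, not an SCXML interpreter: the two PrepareState rows come from the fragment, while the StartState and WaitState rows, and the event names startResponse and done used as keys, are assumptions read off the diagram labels.

```python
# Transition table for the example state chart.
# Keys are (current state, event, status); values are the target state.
# The PrepareState rows come from the SCXML fragment; the other rows are
# assumptions inferred from the diagram labels.
TRANSITIONS = {
    ("PrepareState", "prepareResponse", "success"): "StartState",
    ("PrepareState", "prepareResponse", "failure"): "FailState",
    ("StartState", "startResponse", "success"): "WaitState",  # assumed
    ("StartState", "startResponse", "failure"): "FailState",  # assumed
    ("WaitState", "done", "success"): "EndState",             # assumed
    ("WaitState", "done", "failure"): "FailState",            # assumed
}


def step(state, event, status):
    """Return the next state, or stay in place if no transition matches."""
    return TRANSITIONS.get((state, event, status), state)


def run(events, state="PrepareState"):
    """Feed a sequence of (event, status) pairs through the chart."""
    for event, status in events:
        state = step(state, event, status)
    return state
```

Calling run with a successful prepareResponse, startResponse, and done in that order walks the chart from PrepareState to EndState; any failure along the way lands in FailState.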

  9. Example State Chart with Parallel States
     [Diagram: two parallel regions, Voice and GUI, each with Prepare, Start, Wait, End, and Fail states and transitions labeled Prepare Response Success, Prepare Response Fail, Start Response, Start Fail, Done Success, and Done Fail]
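One way to picture parallel states is as independent regions that each hold their own current state. The sketch below assumes both regions share the same transition wiring, inferred from the diagram labels; the class name ParallelChart, the region keys, and the event names are hypothetical.

```python
from dataclasses import dataclass, field

# Per-region transitions, assumed from the diagram labels.
REGION_TRANSITIONS = {
    ("Prepare", "prepareResponse", "success"): "Start",
    ("Prepare", "prepareResponse", "failure"): "Fail",
    ("Start", "startResponse", "success"): "Wait",
    ("Start", "startResponse", "failure"): "Fail",
    ("Wait", "done", "success"): "End",
    ("Wait", "done", "failure"): "Fail",
}


def _initial_regions():
    # Both regions start in Prepare and advance independently.
    return {"voice": "Prepare", "gui": "Prepare"}


@dataclass
class ParallelChart:
    """Hypothetical chart with one active state per parallel region."""
    regions: dict = field(default_factory=_initial_regions)

    def dispatch(self, region, event, status):
        """Advance only the named region; the other region is untouched."""
        current = self.regions[region]
        key = (current, event, status)
        self.regions[region] = REGION_TRANSITIONS.get(key, current)

    def finished(self):
        """True once every region has reached End or Fail."""
        return all(s in ("End", "Fail") for s in self.regions.values())
```

Because each region advances on its own events, the voice region can reach End while the GUI region is still waiting or has already failed.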

  10. The Life Cycle Events
      prepare / prepareResponse
      start / startResponse
      cancel / cancelResponse
      pause / pauseResponse
      resume / resumeResponse
      [Diagrams: for each pair, SCXML sends the request to the XHTML and VoiceXML components and receives the matching response]
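The request and response pairing listed above can be sketched as a modality stub that answers every request with the matching Response event. The class EchoModality and the message fields source, context, and status are assumptions made for illustration; they are not taken from a W3C schema.

```python
# Life-cycle requests named on this slide.
REQUESTS = ("prepare", "start", "cancel", "pause", "resume")


def response_name(request):
    """Map a life-cycle request to the name of its response event."""
    if request not in REQUESTS:
        raise ValueError(f"not a life-cycle request: {request}")
    return request + "Response"


class EchoModality:
    """Hypothetical modality component that always reports success."""

    def __init__(self, name):
        self.name = name
        self.log = []  # requests received, in order

    def handle(self, request, context):
        """Record the request and build the matching response message."""
        self.log.append(request)
        return {
            "event": response_name(request),
            "source": self.name,    # assumed field
            "context": context,     # assumed field
            "status": "success",    # assumed field
        }
```

A real component would do work between receiving the request and replying, and could report a failure status instead.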

  11. More Life Cycle Events
      newContextRequest / newContextResponse
      data
      done
      clearContext
      [Diagrams: each event shown between SCXML and the XHTML and VoiceXML components]

  12. Synchronization Using the Lifecycle Data Event
      Intent-based events
      • Capture the underlying intent rather than the physical manifestation of user events
      • Independent of the physical characteristics of particular devices
      Data/reset: reset one or more field values to null
      Data/focus: focus on another field
      Data/change: field value has changed
      [Diagram: data events exchanged between SCXML and the XHTML and VoiceXML components]
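A short sketch of what intent-based can mean in practice: device-specific events from different modalities collapse onto the same data event, which is then applied to the shared data. The raw names onChange and filled echo the XHTML and VoiceXML examples elsewhere in this deck; onFocus, onReset, the dotted event names, and the _focus key are assumptions.

```python
# Raw, modality-specific events mapped to intent-based data events.
INTENT_BY_RAW_EVENT = {
    ("gui", "onChange"): "data.change",
    ("voice", "filled"): "data.change",
    ("gui", "onFocus"): "data.focus",  # assumed raw event name
    ("gui", "onReset"): "data.reset",  # assumed raw event name
}


def to_intent(modality, raw_event):
    """Translate a device-level event into its intent-based event."""
    return INTENT_BY_RAW_EVENT[(modality, raw_event)]


def apply_intent(model, intent, field_name, value=None):
    """Return a copy of the shared data with the intent applied."""
    updated = dict(model)
    if intent == "data.change":
        updated[field_name] = value
    elif intent == "data.reset":
        updated[field_name] = None        # reset the field value to null
    elif intent == "data.focus":
        updated["_focus"] = field_name    # hypothetical focus marker
    else:
        raise ValueError(f"unknown intent: {intent}")
    return updated
```

Typing a city into a text box and speaking it into a voice field both arrive as data.change, so the code that updates the shared data never needs to know which device produced the value.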

  13. Lifecycle Events between Interaction Manager and Modality
      [Diagram: the Interaction Manager state chart (PrepareState, StartState, WaitState, EndState, FailState) alongside the events exchanged with the modality: prepare, prepare response (success or failure), start, start response (success or failure), data, and done]

  14. MMI Architecture Principles
      • Interaction Manager communicates with Modality Components through asynchronous events
      • Modality Components don’t communicate directly with each other, but indirectly through the Interaction Manager
      • Components must implement basic life cycle events, may expose other events
      • Modality Components can be nested (e.g. a Voice Dialog component like a VoiceXML <form>)
      • Components need not be markup-based
      • EMMA communicates users’ inputs to the Interaction Manager
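The first two principles can be sketched with asyncio queues: components post events to the Interaction Manager, which relays them to every other component, so no component holds a reference to another. The class, method, and field names here are hypothetical, and the event is a plain dict rather than EMMA.

```python
import asyncio


class InteractionManager:
    """Hypothetical hub that relays events between modality components."""

    def __init__(self):
        self.inbox = asyncio.Queue()  # events arriving from components
        self.components = {}          # component name -> its private inbox

    def register(self, name):
        """Create and return the inbox for a new modality component."""
        self.components[name] = asyncio.Queue()
        return self.components[name]

    async def run_once(self):
        """Take one event and forward it to everyone except the sender."""
        sender, event = await self.inbox.get()
        for name, queue in self.components.items():
            if name != sender:
                await queue.put(event)


async def demo():
    im = InteractionManager()
    voice_inbox = im.register("voice")
    gui_inbox = im.register("gui")
    # The voice component reports a change; it never touches the GUI.
    await im.inbox.put(("voice", {"event": "data.change", "city": "Boston"}))
    await im.run_once()
    return gui_inbox.get_nowait(), voice_inbox.empty()
```

Running asyncio.run(demo()) returns the event as received by the GUI inbox, together with a flag showing that the sender's own inbox stayed empty.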

  15. Modalities
      • GUI Modality (XHTML)
        • Adapter converts lifecycle events to XHTML events
        • XHTML events converted to lifecycle events
      • Voice Modality (VoiceXML 3.0)
        • Lifecycle events are embedded into VoiceXML 3.0
      [Diagram: Interaction Manager (SCXML), Data Model, and the modality components XHTML and VoiceXML 3.0]

  16. Modalities
      VoiceXML supports
      • Events sent from the Interaction Manager
      • Sending events to the Interaction Manager
      <form>
        <catch name="change">
          <assign name="city" value="data"/>
        </catch>
        …
        <field name="city">
          <prompt> Blah </prompt>
          <grammar src="city.grxml"/>
          <filled>
            <send event="data.change" data="city"/>
          </filled>
        </field>
      </form>
      [Diagram: Interaction Manager (SCXML), Data Model, and the modality components XHTML and VoiceXML 3.0]

  17. Modalities
      XHTML is extended to send events to the Interaction Manager.
      <head>
        …
        <ev:listener ev:event="onChange" ev:observer="app1" ev:handler="onChangeHandler()"/>
        …
        <script>
          function onChangeHandler() { post("data", data="city") }
        </script>
      </head>
      …
      <body id="app1">
        <input type="text" id="city" value=""/>
      </body>
      …
      [Diagram: Interaction Manager (SCXML), Data Model, and the modality components XHTML and VoiceXML 3.0]

  18. Modalities
      XHTML is extended to support events received from the Interaction Manager
      <head>
        …
        <handler type="text/javascript" ev:event="data">
          if (event == "change") { document.app1.city.value = data.city }
        </handler>
        …
      </head>
      …
      <body id="app1">
        <input type="text" id="city" value=""/>
      </body>
      …
      [Diagram: Interaction Manager (SCXML), Data Model, and the modality components XHTML and VoiceXML 3.0]
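Both directions of the XHTML adapter can be sketched together in Python. This is an assumption-laden illustration and not the hand-coded adapter the talk refers to: XhtmlAdapter, its method names, and the message fields type, field, and value are hypothetical, and a dict stands in for the form's input elements.

```python
class XhtmlAdapter:
    """Hypothetical adapter between form fields and lifecycle data events."""

    def __init__(self, send_to_im):
        self.fields = {"city": ""}    # stands in for the <input> elements
        self.send_to_im = send_to_im  # callable that delivers to the IM

    def on_change(self, field_name, value):
        """GUI to Interaction Manager: the user edited a field."""
        self.fields[field_name] = value
        self.send_to_im({
            "event": "data",
            "type": "change",
            "field": field_name,
            "value": value,
        })

    def on_lifecycle_event(self, message):
        """Interaction Manager to GUI: another modality changed a field."""
        if message.get("event") == "data" and message.get("type") == "change":
            # Update the field directly; do not re-send to avoid an echo.
            self.fields[message["field"]] = message["value"]
```

Updating the field on an incoming event without calling send_to_im again keeps a change that originated in another modality from bouncing back to the Interaction Manager.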

  19. References
      • SCXML
        • Second working draft available at http://www.w3.org/TR/2006/WD-scxml-20060124/
        • Open source available from http://jakarta.apache.org/commons/sandbox/scxml/
      • Multimodal Architecture and Interfaces
        • Working draft available at http://www.w3.org/TR/2006/WD-mmi-arch-20060414/
      • Voice Modality
        • First working draft of VoiceXML 3.0 scheduled for November 2007
      • XHTML
        • Full recommendation
        • Adapters must be hand-coded
      • Other modalities
        • TBD

  20. Availability
      • SAPI 5.3
        • Microsoft Windows Vista®
      • X+V
        • ACCESS Systems’ NetFront Multimodal Browser for PocketPC 2003 http://www-306.ibm.com/software/pervasive/multimodal/?Open&ca=daw-prod-mmb
        • Opera Software Multimodal Browser for Sharp Zaurus http://www-306.ibm.com/software/pervasive/multimodal/?Open&ca=daw-prod-mmb
        • Opera 9 for Windows http://www.opera.com/
      • W3C
        • First working draft of VoiceXML 3.0 not yet available
        • Working drafts of SCXML are available; some open-source implementations are available
      • Proprietary APIs
        • Available from vendor

  21. Final Advice
      • The W3C is defining a rich collection of languages for authoring multimodal applications
      • SCXML can be used as an Interaction Manager
      • Many languages for modalities: VoiceXML, XHTML, …
      • EMMA may be used to describe data transmitted among modules
      • W3C languages will be available on multiple platforms
      • Avoid getting locked into using proprietary languages available only on a single platform

  22. Web Resources
      • http://www.w3.org/voice
        • Specification of grammar, semantic interpretation, and speech synthesis languages
      • http://www.w3.org/2002/mmi
        • Specification of EMMA and InkML languages
      • http://www.microsoft.com (and query SALT)
        • SALT specification and download instructions for adding SALT to Internet Explorer
      • http://www-306.ibm.com/software/pervasive/multimodal/
        • X+V specification; download Opera and ACCESS browsers
      • http://www.larson-tech.com/SALT/ReadMeFirst.html
        • Student projects using SALT to develop multimodal applications
      • http://www.larson-tech.com/MMGuide.html or http://www.w3.org/2002/mmi/Group/2006/Guidelines/
        • User interface guidelines for multimodal applications
