Learn about the features, design principles, and top-level elements of SALT, a markup language that adds speech and telephony capabilities to web applications. Discover how SALT integrates speech with web pages and enables reuse of components across different devices.
ITCS 6010 SALT
SALT • Speech Application Language Tags (SALT) • Speech interface markup language • Extension of HTML and other markup languages • Adds speech and telephony features to Web applications and services for both voice only and multimodal browsers
SALT Overview • SALT • Small set of XML elements • Elements have: • Attributes • DOM (Document Object Model) object properties • Events • Methods • Applies speech to source page when used in conjunction with source markup document
SALT Design Principles • Clean integration of speech with Web pages • Leverages event-based DOM execution model of Web pages • Integrates cleanly into visual markup pages • Reuses knowledge and skill of Web developers • Does not reinvent page execution or programming models
SALT Design Principles (cont’d) • Separation of speech interface from business logic and data • Individual markup language not directly extended • Provides separate layer extensible across different markup languages • Allows for loose or tight coupling of speech interface to underlying data structure • Enables reuse of speech and dialog components across pages and applications
SALT Design Principles (cont’d) • Power and flexibility of programming model • SALT elements are simple and intuitive • Offer fine-level control of dialog execution through DOM event and scripting model • Leverages benefits of rich and well-understood execution environment
SALT Design Principles (cont’d) • Reuses existing standards for grammar, speech output and semantic results • Range of devices • Designed for range of architectural scenarios • Not for particular device type
SALT Design Principles (cont’d) • Minimal cost of authoring across modes and devices • Enables 2 important classes of application scenario • Multimodal • Visual page enhanced with speech interface on same device • Cross-modal • Single application page reused for different modes on different devices
Top-level Elements • There are 4 main top-level elements: • <prompt …> • For speech synthesis and prompt playing • <listen …> • For speech recognition • <dtmf …> • For configuration and control of DTMF collection • <smex …> • For general purpose communication with platform components
Top-level Elements • listen and DTMF elements • May contain <grammar> and <bind> elements • listen element • May contain <record> element
<listen> Element • Used for speech input • Specifies grammars • Specifies means of dealing with speech recognition results • Used for recording spoken input • Handles speech events and configures recognizer properties • Activates/deactivates grammars • Starts/stops recognition
<listen> Element (cont’d) • <listen> example
<salt:listen id="travel">
  <salt:grammar src="./city.xml" />
  <salt:bind targetElement="txtBoxOriginCity" value="/result/origin_city" />
</salt:listen>
<listen> Element (cont’d) • <listen> element • Can be executed with Start() method in script • Can be executed declaratively in scriptless environment • Handlers include events for: • Successful recognitions • Misrecognitions • Timeouts • Each recognition event can be configured via attributes for: • Timeout periods • Confidence thresholds
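The split between successful recognitions, misrecognitions, and timeouts can be pictured with a small JavaScript model. This is only a sketch, not SALT platform code: `classifyOutcome`, `rejectThreshold`, and `initialTimeoutMs` are hypothetical names chosen for illustration, and the returned labels are placeholders for whatever events the platform actually raises.

```javascript
// Simplified model of how a recognizer outcome might map to listen events.
// All names here are illustrative assumptions, not part of the SALT API.
function classifyOutcome(outcome, config) {
  // Nothing was heard before the timeout period elapsed.
  if (outcome.text === null && outcome.elapsedMs >= config.initialTimeoutMs) {
    return "timeout";
  }
  // Speech was heard, but nothing matched or confidence was below the threshold.
  if (outcome.text === null || outcome.confidence < config.rejectThreshold) {
    return "noreco";
  }
  return "reco";
}

// Assumed settings: a 3-second timeout and a 0.4 confidence threshold.
const config = { initialTimeoutMs: 3000, rejectThreshold: 0.4 };

console.log(classifyOutcome({ text: "London", confidence: 0.55, elapsedMs: 1200 }, config)); // reco
console.log(classifyOutcome({ text: "London", confidence: 0.2, elapsedMs: 1200 }, config));  // noreco
console.log(classifyOutcome({ text: null, confidence: 0, elapsedMs: 3000 }, config));        // timeout
```

In this model, raising the threshold turns more low-confidence results into misrecognitions, which is the trade-off the confidence attributes let an author tune.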
<grammar> Element • Used to specify grammars • Inline or referenced • Multiple grammar elements may be used in single <listen> • Individual grammars may be activated/deactivated before recognition begins • Independent of grammar format • Will support at minimum XML form of W3C Speech Recognition Grammar Specification
<bind> Element • Used to inspect result of recognition • Conditionally copies relevant portions to values in page • Multiple bind elements may be used in single <listen> • Recognition result returned in XML document form • Uses XPath syntax in value attribute • Uses an XML pattern query in test attribute
<bind> Element (cont’d) • Value attribute • To reference particular node of result • Test attribute • To specify binding conditions • If condition evaluates to true, node content bound to page element specified by targetElement attribute
<bind> Element Example • Recognition example
<result text="I'd like to go to London, please" confidence="0.45">
  <dest_city text="to London" confidence="0.55">London</dest_city>
</result>
• <bind> code
<input name="txtBoxDestCity" type="text" />
<salt:listen ….>
  <salt:bind targetElement="txtBoxDestCity" value="/result/dest_city" test="/result/dest_city[@confidence > 0.4]" />
</salt:listen>
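The conditional copy performed by bind can be modelled in a few lines of JavaScript. This is a sketch under simplifying assumptions: the recognition result is a plain object rather than an XML document, the XPath `test` is replaced by a predicate function, and `bind` and `page` are hypothetical stand-ins, not SALT objects.

```javascript
// Simplified model of conditional binding. A real SALT platform returns an
// XML document and evaluates XPath; plain objects are used here instead.
const result = {
  text: "I'd like to go to London, please",
  confidence: 0.45,
  dest_city: { text: "to London", confidence: 0.55, value: "London" },
};

// Hypothetical stand-in for the page: maps element names to their values.
const page = { txtBoxDestCity: "" };

// Copies the selected node into the page only when the test passes,
// mirroring value="/result/dest_city" and test="...[@confidence > 0.4]".
function bind(page, targetElement, node, test) {
  if (node !== undefined && test(node)) {
    page[targetElement] = node.value;
    return true;
  }
  return false;
}

const bound = bind(page, "txtBoxDestCity", result.dest_city, (n) => n.confidence > 0.4);
console.log(bound, page.txtBoxDestCity); // true London
```

With a stricter predicate such as `n.confidence > 0.6`, the same call would return `false` and leave the text box untouched.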
<record> Element • Used to specify audio recording parameters • Results may be processed with bind or scripted code
<prompt> Element • Used to specify system output • Content may include: • Text • Speech output markup • Variable values • Links to audio files • Mix of any of the above
<prompt> Element (cont’d) • Executed in 2 ways: • Declaratively on scriptless browser • By object methods in script • Contains methods to start, stop, pause and resume prompt playback, and alter speed and volume • Handlers include events for user barge-in, prompt-completion and internal ‘bookmarks’
<prompt> Element Example
<salt:prompt id="ConfirmTravel">
  So you want to travel from
  <salt:value targetElement="txtBoxOriginCity" targetAttribute="value" />
  to
  <salt:value targetElement="txtBoxDestCity" targetAttribute="value" />
  ?
</salt:prompt>
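One way to picture how the prompt text is assembled from page values is the JavaScript sketch below. It is an illustration only: `renderPrompt`, the `parts` array, and the sample cities are assumptions made for this example, and a real platform would resolve the value references itself before synthesis.

```javascript
// Sketch of expanding a prompt that mixes literal text with value references.
// String entries are spoken as-is; object entries are looked up in `fields`.
function renderPrompt(parts, fields) {
  return parts
    .map((part) => (typeof part === "string" ? part : fields[part.targetElement]))
    .join(" ")
    .replace(/\s+([?.,!])/g, "$1"); // drop the space before punctuation
}

// Mirrors the structure of the ConfirmTravel prompt.
const confirmTravel = [
  "So you want to travel from",
  { targetElement: "txtBoxOriginCity" },
  "to",
  { targetElement: "txtBoxDestCity" },
  "?",
];

// Sample field values, invented for the example.
const fields = { txtBoxOriginCity: "Seattle", txtBoxDestCity: "London" };

console.log(renderPrompt(confirmTravel, fields));
// So you want to travel from Seattle to London?
```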
<dtmf> Element • Used to specify DTMF grammars in telephony applications • Deals with keypress input and other events • Executed declaratively or programmatically with start and stop commands
<dtmf> Element (cont’d) • Main elements include <grammar> and <bind> • Holds resources for configuring DTMF collection process • Timeouts configured via attributes • Handlers include events for keypresses, valid DTMF sequences and out-of-grammar input
<dtmf> Element Example
<salt:dtmf id="dtmfPhoneNumber">
  <salt:grammar src="7digits.gram" />
  <salt:bind value="/result/phoneNumber" targetElement="iptPhoneNumber" />
</salt:dtmf>
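The collect, validate, and bind sequence for keypresses can be sketched in JavaScript. The contents of `7digits.gram` are not shown in these slides, so the sketch assumes it accepts exactly seven digits; `collectDtmf`, `SEVEN_DIGITS`, and the `page` object are hypothetical names used only here.

```javascript
// Sketch of DTMF collection against a seven-digit grammar.
// Assumption: the grammar accepts exactly seven digits and nothing else.
const SEVEN_DIGITS = /^[0-9]{7}$/;

function collectDtmf(keys, page) {
  const sequence = keys.join("");
  if (SEVEN_DIGITS.test(sequence)) {
    page.iptPhoneNumber = sequence; // models the bind to iptPhoneNumber
    return "valid";
  }
  return "out-of-grammar";
}

// Hypothetical stand-in for the page element that receives the number.
const page = { iptPhoneNumber: "" };

console.log(collectDtmf(["5", "5", "5", "1", "2", "1", "2"], page), page.iptPhoneNumber); // valid 5551212
console.log(collectDtmf(["5", "5", "#"], { iptPhoneNumber: "" })); // out-of-grammar
```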
Event wiring • SALT elements contain methods, properties and event handlers accessible to script • Enable interaction with other events and processes in Web page • Because SALT elements are XML objects in DOM of page
Event wiring (cont’d) • Top-level elements contain asynchronous methods for initiation and completion of execution • Contain properties • For configuration and result storing • Event handlers • For events associated with speech
Event wiring (cont’d) • onReco • Event fired when recognition results successfully returned • onBargein • Event fired on prompt element if user input received during prompt playback
Code Examples
<input name="txtBoxDestCity" type="text" onclick="recoDestCity.Start()" />
<salt:listen id="recoDestCity">
  <salt:grammar src="city.xml" />
  <salt:bind targetElement="txtBoxDestCity" value="/result/city" />
</salt:listen>
Code Examples (cont’d)
<input type="button" onclick="recoFromTo.Start()" value="Say From and To Cities" />
<input name="txtBoxOriginCity" type="text" />
<input name="txtBoxDestCity" type="text" />
<salt:listen id="recoFromTo">
  <salt:grammar src="FromToCity.xml" />
  <salt:bind targetElement="txtBoxOriginCity" value="/result/originCity" />
  <salt:bind targetElement="txtBoxDestCity" value="/result/destCity" />
</salt:listen>
<!-- HTML -->
<html xmlns:salt="urn:saltforum.org/schemas/020124">
<body onload="RunAsk()">
  <form id="travelForm">
    <input name="txtBoxOriginCity" type="text" />
    <input name="txtBoxDestCity" type="text" />
  </form>

  <!-- Speech Application Language Tags -->
  <salt:prompt id="askOriginCity"> Where would you like to leave from? </salt:prompt>
  <salt:prompt id="askDestCity"> Where would you like to go to? </salt:prompt>
  <salt:prompt id="sayDidntUnderstand" onComplete="RunAsk()"> Sorry, I didn't understand. </salt:prompt>
  <salt:listen id="recoOriginCity" onReco="procOriginCity()" onNoReco="sayDidntUnderstand.Start()">
    <salt:grammar src="city.xml" />
  </salt:listen>
  <salt:listen id="recoDestCity" onReco="procDestCity()" onNoReco="sayDidntUnderstand.Start()">
    <salt:grammar src="city.xml" />
  </salt:listen>

  <!-- script -->
  <script>
    // Ask for whichever city is still missing, origin first.
    function RunAsk() {
      if (travelForm.txtBoxOriginCity.value == "") {
        askOriginCity.Start();
        recoOriginCity.Start();
      } else if (travelForm.txtBoxDestCity.value == "") {
        askDestCity.Start();
        recoDestCity.Start();
      }
    }
    // Store the recognized origin city, then continue the dialog.
    function procOriginCity() {
      travelForm.txtBoxOriginCity.value = recoOriginCity.text;
      RunAsk();
    }
    // Store the recognized destination city and submit the form.
    function procDestCity() {
      travelForm.txtBoxDestCity.value = recoDestCity.text;
      travelForm.submit();
    }
  </script>
</body>
</html>
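The control flow of the page above can be traced without a speech platform by replacing the prompts and recognizers with stubs. The sketch below is a simulation only: `simulateDialog` and the scripted `answers` array are hypothetical, a `null` answer stands for a misrecognition, and the calls run synchronously, whereas real SALT prompts and recognitions complete asynchronously through events.

```javascript
// Simulation of the RunAsk dialog flow with stubbed speech objects.
// `answers` is a scripted list of recognition results; null means no match.
function simulateDialog(answers) {
  const form = { txtBoxOriginCity: "", txtBoxDestCity: "", submitted: false };
  const log = []; // prompts "played", in order
  const queue = answers.slice();

  // Stub for a listen element: delivers the next scripted result.
  function listen(onReco) {
    if (queue.length === 0) return; // no more scripted input: stop
    const text = queue.shift();
    if (text === null) {
      log.push("Sorry, I didn't understand."); // onNoReco path
      runAsk(); // the apology prompt's completion handler re-enters the dialog
    } else {
      onReco(text);
    }
  }

  // Same decision logic as RunAsk in the page script.
  function runAsk() {
    if (form.txtBoxOriginCity === "") {
      log.push("Where would you like to leave from?");
      listen((text) => { form.txtBoxOriginCity = text; runAsk(); });
    } else if (form.txtBoxDestCity === "") {
      log.push("Where would you like to go to?");
      listen((text) => { form.txtBoxDestCity = text; form.submitted = true; });
    }
  }

  runAsk();
  return { form, log };
}

// Sample run: the destination is misrecognized once, then understood.
const { form, log } = simulateDialog(["Seattle", null, "London"]);
console.log(log.join("\n"));
console.log(form.txtBoxOriginCity, form.txtBoxDestCity, form.submitted);
```

In this run the destination question is asked twice, because the first answer is scripted as a misrecognition and the dialog re-enters `runAsk` after the apology.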