Creating User Interfaces • Discussion: current speech recognition products • VoiceXML • Homework: [Register as a developer at studio.tellme.com. Do the tutorials.] Come to class with phones (prepare to share) and be ready to start the project!
Discussion • Reports on current speech products
Telephone • Caller to system: speech recognition, • using grammars (limited vocabulary, general audience, no training) • optional use of touch tones (numbers) • System to caller: recorded audio (wav files) plus TTS (text to speech) = speech synthesis • Limited bandwidth, in comparison to other applications, but very familiar, ubiquitous medium • 800 long distance, some airline information systems, others?
studio.tellme.com • Company that provides the ‘engine’ for applications • Provides a development environment • We are using the Tellme version of VoiceXML, which appears to follow the standard. • Register as a developer: • Provide your own id; you are assigned a PIN • Put VoiceXML in the ScratchPad (no audio files) • Call 1-800-555-VXML (8965) • Say your id and then your PIN, or give your phone number. Tellme runs either • the program in ScratchPad OR • the program at the Application URL, for projects with multiple files • To look at someone else's project, change your Application URL • This is called pointing your account to a new source.
Preparation: objects • JavaScript (and other languages) use classes and objects • Objects (aka object instances) are declared (created, instantiated) as members of a class • Objects have • properties ('the data') • methods (functions that you can use 'on' the objects) • static methods • Math.random
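The vocabulary above (class, object, property, method, static method) can be sketched in plain JavaScript. The class CraneTally and its members are hypothetical names made up for this illustration; they are not part of any Tellme library.

```javascript
// Hypothetical class for illustration only.
class CraneTally {
  constructor(goal) {
    this.goal = goal; // property: data held by each object
    this.made = 0;    // property
  }

  // Method: called 'on' a particular object.
  add(n) {
    this.made += n;
    return this;
  }

  // Method: uses the data held in this object.
  remaining() {
    return this.goal - this.made;
  }

  // Static method: called on the class itself, the way Math.random is called on Math.
  static percent(part, whole) {
    return Math.round((part / whole) * 100);
  }
}

const tally = new CraneTally(1000); // create (instantiate) an object of the class
tally.add(250);
const left = tally.remaining();
const done = CraneTally.percent(tally.made, tally.goal);
const r = Math.random(); // built-in static method: a number from 0 up to, but not including, 1
```

Here left and done come from one particular object, while CraneTally.percent and Math.random need no object at all.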
Example: tm_date • var dt = new tm_date; creates a date/time object. • Use methods to extract/manipulate information held 'in' dt. var day = dt.get_day(); • Use static methods supplied to do common tasks: var dn=tm_date.to_day_of_week_name(day); or directly: var dn=tm_date.to_day_of_week_name(dt.get_day());
Outline • Header stuff • script with external reference • script (code) encased in CDATA notation • Form/Block, with text to speech using value produced by script • Closing stuff
<?xml version="1.0"?>
<vxml version="2.0">
<script src="http://resources.tellme.com/lib/code/tm_date.js"/>
Note: the external script supplies the date functions used in the code that follows.
<script>
<![CDATA[
var dt = new tm_date();
var monis = tm_date.to_month_name(dt.get_month());
var dateis = dt.get_date();
var dayis = tm_date.to_day_of_week_name(dt.get_day());
var yearis = tm_date.to_year_name(dt.get_full_year());
// Brute-force correction from GMT: 4 hours behind, wrapped so the hour stays in 0..23
var houris = (dt.get_hours() + 20) % 24;
var minutesis = dt.get_minutes();
var whole = 'The date is ' + monis + ' ' + dateis + '. It is ' + dayis + '. The time is ' + houris + ' ' + minutesis;
]]>
</script>
<form>
<block>Hello. <value expr="whole"/> Good bye.</block>
</form>
</vxml>
Note: a block can also be used for audio.
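Shifting an hour reading from GMT by a fixed number of hours has to wrap around midnight, or early-morning GMT hours come out negative. The following is a minimal sketch in plain JavaScript using the standard Date object rather than tm_date. The function name shiftHour is made up for this illustration, and the offset of 4 hours is taken from the slide; it is an assumption about the caller's time zone.

```javascript
// Shift an hour on the 24-hour clock by a whole-hour offset,
// wrapping around midnight in either direction.
function shiftHour(hour, offset) {
  return (((hour + offset) % 24) + 24) % 24;
}

// Standard JavaScript: the current hour and minutes in GMT/UTC.
var now = new Date();
var localHour = shiftHour(now.getUTCHours(), -4); // assumed offset, as on the slide
var sentence = 'The time is ' + localHour + ' ' + now.getUTCMinutes();

// Worked cases: 2 a.m. GMT is 22 (10 p.m.) the previous evening; 23 GMT is 19.
var early = shiftHour(2, -4);
var late = shiftHour(23, -4);
```

This corrects only the hour. A full solution would also adjust the day and date when the shift crosses midnight, and would account for daylight saving time.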
Example: my family • Directed responses to 3 family members: • Daniel: question/response on activities • Aviva: question/response on number of cranes • Esther: response only • Calculations (arithmetic) done using variables • if tags • The cond attribute is a condition test. • Limited error handling: exit on a noinput or nomatch event • The alternative is to repeat the prompt, generally using the count attribute
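The count attribute mentioned above lets an application respond differently as the same event repeats. The sketch below is a simplified JavaScript model of that selection rule as I understand it from VoiceXML 2.0: among the handlers for an event, the one with the highest count not exceeding the number of occurrences is used. The handler list, the messages, and the name pickHandler are hypothetical.

```javascript
// Hypothetical handlers for a nomatch event: reprompt at first, give up on the third miss.
var handlers = [
  { count: 1, action: 'reprompt', say: "Sorry. I didn't get that." },
  { count: 3, action: 'exit', say: 'Please call back later.' }
];

// Pick the handler with the highest count that is <= the number of occurrences so far.
function pickHandler(list, occurrences) {
  var best = null;
  for (var i = 0; i < list.length; i++) {
    var h = list[i];
    if (h.count <= occurrences && (best === null || h.count > best.count)) {
      best = h;
    }
  }
  return best;
}

var first = pickHandler(handlers, 1);  // reprompt
var second = pickHandler(handlers, 2); // still reprompt
var third = pickHandler(handlers, 3);  // exit
```

In VoiceXML itself the same idea would be written as two catch elements with different count values; this model ignores scoping and the cond attribute.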
<vxml version="2.0">
<form>
<field name="childid">
<prompt>
<audio src="whosthis.wav">Hello. Who is calling?</audio>
</prompt>
<grammar type="application/x-gsl" mode="voice">
<![CDATA[
[
[dan daniel (daniel meyer) (dan meyer)] {<childid "daniel">}
[aviva (aviva meyer)] {<childid "aviva">}
[esther (esther minkin)] {<childid "esther">}
]
]]>
</grammar>
<catch event="noinput nomatch">
<audio src="sorry.wav">Sorry. I didn't get that.</audio>
<exit/>
</catch>
<filled>
<if cond="'daniel'==childid">
<goto next="#danfollowup"/>
<elseif cond="'aviva'==childid"/>
<goto next="#avivafollowup"/>
<elseif cond="'esther'==childid"/>
<goto next="#estherfollowup"/>
<else/>
<reprompt/>
</if>
</filled>
</field>
</form>
Notes: • The else branch never happens, because the grammar returns only the three values tested. • Note the single quote marks inside the double-quoted cond attribute. • Note the double equals sign (==) for comparison.
<form id="danfollowup">
<field name="today">
<prompt>
<audio src="congratsdan.wav">Congratulations on the new job. Did you work on your thesis, or do aikido or jo today?</audio>
</prompt>
<grammar type="application/x-gsl" mode="voice">
<![CDATA[
[
[aikido (i key dough)] {<today "aikido">}
[thesis (work)] {<today "thesis">}
[jo (joe)] {<today "jo">}
[both (all) (everything) ((i key dough) jo)] {<today "both">}
[none nothing (sort of)] {<today "nothing">}
]
]]>
</grammar>
<catch event="noinput nomatch">
<audio>I didn't quite understand. Call or send e-mail.</audio>
<exit/>
</catch>
<filled>
<if cond="today=='aikido'">
<audio>Some aikido is fine.</audio>
<elseif cond="today=='thesis'"/>
<audio>Good, but do other things also.</audio>
<elseif cond="today=='jo'"/>
<audio>Don't get hit in the head.</audio>
<elseif cond="today=='both'"/>
<audio>Doing some of everything is best.</audio>
<elseif cond="today=='nothing'"/>
<audio>You deserve a break, but remember you want to be done by September.</audio>
<else/>
<audio>See you soon.</audio>
</if>
</filled>
</field>
<block>
<audio>Good bye</audio>
</block>
</form>
<form id="avivafollowup">
<var name="rest" expr="1000"/>
<field name="bcount" type="number">
<prompt>
<audio src="howmanycranes.wav">Hello, Aviva. How many cranes have you made?</audio>
</prompt>
<grammar type="application/x-gsl" mode="voice">
<![CDATA[
NATURAL_NUMBER_THRU_9999
]]>
</grammar>
<catch event="noinput nomatch">
<audio src="sorry.wav">Sorry. I didn't get that.</audio>
<exit/>
</catch>
Note: a literal < cannot be used inside an attribute value; write &lt; instead.
<filled>
<assign name="rest" expr="1000-bcount"/>
<audio> <value expr="rest"/> </audio>
<audio src="togo.wav"> to go. </audio>
<if cond="rest &lt; 200">
<audio src="homestretch.wav">You're in the home stretch</audio>
<elseif cond="rest &lt; 500"/>
<audio src="morethanhalf.wav">More than half way</audio>
<elseif cond="rest &lt; 800"/>
<audio src="goodstart.wav">Off to a good start</audio>
<else/>
<audio>Get a move on</audio>
</if>
<audio src="goodbye.wav">Good bye.</audio>
</filled>
</field>
</form>
<form id="estherfollowup">
<block>
<audio>Hello, Mommy. This is all I can do now.</audio>
</block>
</form>
</vxml>
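The arithmetic and threshold tests in the avivafollowup form can be checked outside the phone system with a few lines of plain JavaScript. This is only a sketch of the same logic: the function name craneReport is made up for this illustration, and the text strings stand in for the recorded audio.

```javascript
// Mirror of the form's logic: 1000 cranes is the goal, bcount is the number made so far.
function craneReport(bcount) {
  var rest = 1000 - bcount;
  var comment;
  if (rest < 200) {
    comment = "You're in the home stretch";
  } else if (rest < 500) {
    comment = 'More than half way';
  } else if (rest < 800) {
    comment = 'Off to a good start';
  } else {
    comment = 'Get a move on';
  }
  return rest + ' to go. ' + comment + '. Good bye.';
}

var nearlyDone = craneReport(850); // 150 left
var halfway = craneReport(600);    // 400 left
var justBegun = craneReport(100);  // 900 left
```

Because the tests run in order, each branch only needs an upper bound: a caller with 400 cranes left has already failed the rest < 200 test by the time rest < 500 is checked.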
[again] Application logic • Implicitly in the way menus and grammars work • VoiceXML elements (for example, <if> and <var>) • JavaScript code in attributes (for example, cond, expr) • JavaScript code in <script> </script> • Encase in CDATA to avoid problems with certain characters • External JavaScript code, cited using <script src="file address"/>
Speech recording • Best practice is to use recorded speech (audio) plus text for Text To Speech as a fallback. • You can use Tellme for recording. The audio file is sent to your email for you to upload to a server. • Note: if you set preferences to always go to your account, you need to change them so you can pick the Record option when you call the number.
[advanced] features • Data element: you can construct an XML file on the server and have VoiceXML access it. • Note: you can use src or srcexpr; the latter constructs the URL using information (presumably) just calculated. • Mixed initiative: provides a way to set up subgrammars to get multiple items of information from one response. Tellme Studio presents an example that asks for the airports a caller is flying from and to. • Barge-in: features to determine how much of a prompt the caller heard before interrupting. Uses marks.
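A srcexpr value is a JavaScript expression that evaluates to a URL. A minimal sketch of building such a URL from two values just collected is shown here. The server address, the parameter names, and the function name flightDataUrl are all hypothetical.

```javascript
// Build a request URL from two collected values, escaping spaces and other
// special characters so the result is a valid query string.
// The host and parameter names are placeholders, not a real service.
function flightDataUrl(from, to) {
  return 'http://example.com/flights.xml' +
    '?from=' + encodeURIComponent(from) +
    '&to=' + encodeURIComponent(to);
}

var url = flightDataUrl('New York', 'San Jose');
```

With src the address is fixed when the page is written; with srcexpr an expression such as the function call above is evaluated at run time. When the expression is written inside an XML attribute, any & in it must be escaped as &amp;.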
Class work [if time] • EVERYONE (who hasn't already): sign up at studio.tellme.com • Design SIMPLE applications (you may work in groups): • Ask one question • Detect and respond to each of 2 or 3 answers • Use the examples here as models • All text to speech • Pick (at least) one design and implement it.
Homework • Go to studio.tellme.com • [sign up as a developer] • Try the examples (using ScratchPad) • Record some voice samples • Study the Tellme tutorials! Note: the final project will be a Tellme application and may be done in teams of 2 or 3.