Today and the Future of Wearable Agents
Emmett Coin, Director of Speech Research and Development
SpeechTEK 2007 West, February 21, 2007
what I will talk about
• Definition of a wearable voice agent
• Overview of voice-based agents in logistics
• The subsystems in a real-world voice application
• Some of the more difficult issues
• Futures
• Summary
• Conclusion
what is a “Wearable Voice Agent”?
• NOT just a voice app on a cell phone or PDA
  • Voice dialing
  • Stock quote
  • One-shot use
• Rather, it IS a partner that:
  • Complements the task (a teammate)
  • Adds value (faster, more accurate, fewer injuries, etc.)
  • Is used for extended periods of time (maybe all day long)
  • Requires no (or very little) hand/eye time
  • Is as small as possible
  • Becomes invisible (you forget that the device is there)
some examples
• Currently
  • Battlefield: Translation
  • Inspection: Insurance, Q/C
  • Logistics: Distribution Centers
  • Consumer: GPS route computers
• Very Near Future
  • Retail: Extend Distribution to the Sales Clerk
  • Consumer: Organize lists and errands
  • Industry: Process Control
voice in logistics
• Distribution Centers
  • The way we move the vast majority of products from manufacturer to consumer
  • Moving from many homogeneous collections to many heterogeneous collections
  • Many Suppliers (send product TO the Center)
  • Many Stores (receive product FROM the Center)
  • A massive repackaging task
  • A sizable fraction of the cost of retail products
• One of the biggest sectors of “wearable voice agents” to date
a “Selector” talking to Jennifer
• The agent tells the human:
  • Where to go
  • What to “select”
  • How many to “select”
  • Where to put the item(s)
• The human tells the agent:
  • Location check string
  • Quantity selected
  • If the bin is empty
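The exchange on this slide can be sketched as a small dialog loop. This is only an illustration under stated assumptions, not how Jennifer is implemented: the `Pick` fields, the `run_pick` function, the prompt wording, and the literal reply "empty" are all hypothetical names chosen for the example, and the selector's replies are assumed to arrive as already-recognized text.

```python
from dataclasses import dataclass


@dataclass
class Pick:
    """One pick assignment (all field names are hypothetical)."""
    location: str      # where to go
    check: str         # check string the selector reads back at the location
    item: str          # what to select
    quantity: int      # how many to select
    destination: str   # where to put the item(s)


def run_pick(pick, replies):
    """Walk through one pick.

    `replies` is an iterable of the selector's recognized utterances.
    Returns (agent_prompts, quantity_picked).
    """
    replies = iter(replies)
    prompts = []

    # Agent tells the human where to go, then waits for the check string.
    prompts.append(f"Go to {pick.location}")
    while next(replies) != pick.check:
        prompts.append(f"Wrong location. Go to {pick.location}")

    # Agent tells the human what and how many to select.
    prompts.append(f"Select {pick.quantity} {pick.item}")
    reply = next(replies)
    if reply == "empty":
        # Human reports an empty bin, so nothing was picked.
        prompts.append("Bin empty noted")
        return prompts, 0

    # Human reports the quantity selected; agent says where to put it.
    picked = int(reply)
    prompts.append(f"Put {picked} on {pick.destination}")
    return prompts, picked
```

For example, `run_pick(Pick("aisle 12 slot 4", "37", "cases of soup", 3, "pallet A"), ["37", "3"])` yields three agent prompts and a picked quantity of 3. A real agent would also have to handle repeats, corrections, and recognition errors, which this sketch leaves out.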
some things that just happened
• Selector was directed to product
• Location was verified
• Some products had unique weights entered
• Others had expiration dates to verify
• Selector needed the agent to repeat
• Selector was lifting (80 lbs), walking, driving, reading, etc. while talking
fast interaction
• Overlapped dialog
• Look ahead
• Independent use
  • Eyes
  • Hands
  • Speech
• Natural corrections
• Low cognitive load
accommodation
• Linguistics
  • Finishing each other's sentences
  • The classic “barge-in”
  • The never (but maybe soon) seen “interruption”
• Expectation
  • Predicting dialog flow
  • When the response is marginal but expected
  • Response is “legal” but how probable?
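One way to picture the “marginal but expected” case is to weight each recognition hypothesis by how probable the dialog state makes it. The sketch below is an assumption for illustration, not the method used in any particular product: `rescore`, the confidence values, the expectation table, and the small floor given to unexpected words are all hypothetical.

```python
def rescore(hypotheses, expectation, floor=0.01):
    """Combine recognizer confidence with dialog expectation.

    hypotheses:  {word: recognizer confidence in [0, 1]}
    expectation: {word: probability the dialog state assigns to that word}
    floor:       weight given to words that are "legal" but unexpected

    Returns (best_word, normalized_score), or (None, 0.0) if nothing scores.
    """
    weighted = {
        word: confidence * expectation.get(word, floor)
        for word, confidence in hypotheses.items()
    }
    total = sum(weighted.values())
    if total == 0:
        return None, 0.0
    best = max(weighted, key=weighted.get)
    return best, weighted[best] / total


# The agent just asked for a quantity and expects "three" most of all.
heard = {"three": 0.45, "free": 0.55}
expected = {"three": 0.6, "two": 0.3, "four": 0.1}
best, score = rescore(heard, expected)
```

Here the recognizer alone slightly prefers "free", but the expectation pulls the decision to "three". With an empty expectation table, every word gets the same floor weight and the acoustically stronger "free" wins.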
accommodation example: agent side
• In conventional voice applications the prompts need to be clear and unambiguous.
• But for an agent “co-worker” this would be tedious.
• In the beginning, a natural prompt speed is best for learning the routine.
• Later, however, “natural” will feel like “slow-mo”, and the prompts must be “snappier”.
• Later still, the human and agent know each other well and just cut to the chase, further shortening the prompt.
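The progression on this slide can be sketched as prompt templates chosen by experience. Everything here is an assumption for illustration: the three tier names, the pick-count thresholds, and the prompt wording are hypothetical, and a real agent would likely adjust speaking rate as well as wording.

```python
# Prompt wording per experience tier (hypothetical examples).
PROMPTS = {
    "novice": "Please go to aisle {aisle}, slot {slot}, and select {qty} cases.",
    "practiced": "Aisle {aisle}, slot {slot}, select {qty}.",
    "expert": "{aisle} {slot}, {qty}.",
}


def prompt_level(completed_picks, practiced_after=200, expert_after=2000):
    """Map the selector's experience to a prompt tier (thresholds are made up)."""
    if completed_picks >= expert_after:
        return "expert"
    if completed_picks >= practiced_after:
        return "practiced"
    return "novice"


def make_prompt(completed_picks, aisle, slot, qty):
    """Build the prompt text for one pick at the selector's current tier."""
    template = PROMPTS[prompt_level(completed_picks)]
    return template.format(aisle=aisle, slot=slot, qty=qty)
```

A new selector would hear the full sentence from `make_prompt(0, 12, 4, 3)`, while a veteran with thousands of picks would get only the bare numbers.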
components of a voice agent
• Small device
  • Lightweight, long battery life, rugged
• Speech Technologies
  • Recognition, Text-to-Speech, and recorded waves
  • Multi-modal fits in here too
• Dialog Management
  • A core system that controls the goals of the interaction
• Connectivity
  • The “real” work usually involves information external to the agent
simple view of a generic voice platform
• Most “PDA”-like platforms run some version of Windows CE or Windows Mobile
• They need GOOD-quality, full-duplex audio I/O
• Enough “cycles” to do the ASR and TTS
• Low-level control over “power management”
some industrial hardware platforms
[Image: devicesSmall_3.JPG]
would regular folks “talk” with a computer?
• Obviously, when hands and eyes must be free
  • Grocery shopping
  • Assembling a child’s toy
  • Cooking a new recipe
• We think differently (freely? innovatively?) when we talk
• Talking is a low (perceived) cognitive load
  • People get “writer's block” more often than “talker's block”
• To offload and manage the fussy details of our lives
Futures
• The latest cell phones have the power to support a voice-based agent.
• They cost one tenth as much as a present-day industrial device.
• It is just a matter of time before we talk TO our phone as well as ON it.
Summary
• Wearable voice agents
  • Have been here for a while
  • Are proven and make good business sense
  • Are declining in cost
  • Are expanding the range of worker multi-tasking
  • Can be effortless to use
Conclusions
• They are in more places than you think
• They are REAL TOOLS, not window dressing
• They are just in their infancy
• I am looking forward to my next new synthetic agent!
Thank you!
• Contact:
  • Emmett Coin
  • Director of Speech Research and Development
  • coin@lucasware.com
  • 724 940 7041
  • www.lucasware.com