Applications of distributed dialogue systems: The KTH Connector. Jens Edlund, Anna Hjalmarsson. Aalborg, November 10th, 2005
Why spoken interfaces? Speech is good because it is • Hands free • Eyes free • Intuitive – already known • Robust – e.g. redundancy, grounding • Flexible • Responsive • Efficient • …
What do they do? • Purpose • Problem solving • Information seeking • Transactions • Control • … • Initiative • System • User • Mixed • Modality • Multimodal (input and/or output) • Unimodal (input and/or output) • …
What else do they do? Spoken interfaces often replace or complement existing automated interfaces, for example:
Alternative interface systems • Speech is an alternative or substitute • Commonly built to be as good as or better than the corresponding system • Symmetry often required – what can be done with the original system should be doable with speech and vice versa
Why spoken interfaces? These aspects are often not exploited much in alternative interface systems: • Hands free • Eyes free • Intuitive – already known • Robust – e.g. redundancy, grounding • Flexible • Responsive • Efficient • …
What metaphor to rely on? • The voice as an input device? • “You may use your voice to order” (from a travel booking instruction in Swedish) • “It didn’t give me any alternatives” (from a post-interview with a call routing user) • The computer as a human? • Problem: Turing test
Summary • Speech can be used successfully as an alternative or complementary interface to other interfaces, particularly when hands and/or eyes are occupied, disabled, or otherwise impractical to use • The advantages of speech promised by analogies to human-human communication may not be fully exploited in such domains
The KTH Connector • Background • Domain • System
Background: CHIL • CHIL: Computers in the Human Interaction Loop (EU funded, IP506909) • The dialogue system as an unobtrusive conversational partner in a group of humans
A telephony-based secretary • Wide range of complexity • From answer phone…
A telephony-based secretary • …to meeting assistant
Dialogue setup • Multimodal, multiparty, system barge-in • Multiparty telephony
System highlights • System may dial users • Prosody-enhanced endpointing • System output is included in discourse model (incrementally)
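The last highlight, system output entering the discourse model incrementally, can be illustrated with a minimal sketch. It is not the KTH Connector's implementation: the `DiscourseModel` class, the `speak` helper, the `interrupted_after` parameter, and the example prompt are all hypothetical, and a barge-in is simulated by a word count rather than detected from audio. The point is only that the model records the words the system actually got to say, not the whole planned prompt.

```python
from dataclasses import dataclass, field


@dataclass
class DiscourseModel:
    """Hypothetical discourse model: an ordered record of who said what."""
    entries: list = field(default_factory=list)

    def add_word(self, speaker: str, word: str) -> None:
        # Extend the current contribution if the same speaker is still
        # talking; otherwise start a new contribution.
        if self.entries and self.entries[-1][0] == speaker:
            self.entries[-1][1].append(word)
        else:
            self.entries.append((speaker, [word]))

    def transcript(self) -> list:
        return [(speaker, " ".join(words)) for speaker, words in self.entries]


def speak(model, prompt_words, interrupted_after=None):
    """Add system words to the model one at a time, as they are 'spoken'.

    interrupted_after simulates a user barge-in (an assumption of this
    sketch): only that many words are spoken before the system stops.
    Returns the number of words actually spoken.
    """
    spoken = 0
    for word in prompt_words:
        if interrupted_after is not None and spoken >= interrupted_after:
            break
        model.add_word("system", word)
        spoken += 1
    return spoken


model = DiscourseModel()
planned = "Anna is in a meeting until three".split()  # made-up prompt
n_spoken = speak(model, planned, interrupted_after=4)
model.add_word("user", "okay")
print(model.transcript())
# [('system', 'Anna is in a'), ('user', 'okay')]
```

Because the interrupted prompt is stored only up to the barge-in point, a later dialogue step could see that "until three" was never delivered and decide whether to repeat it.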
Research highlight: Responsiveness • When should we respond? • Turn yielding & turn holding cues: • Prosody • Gaze • …
Research highlight: Incrementality • When can we respond? • After a “long enough” silence? • At some semantic or syntactic completeness? • After any “word”? • Anytime? • What has actually happened? • System and user barge-in • Keep track of what we say, as well as what the user says
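The first option above, responding after a "long enough" silence, can be sketched together with the prosodic turn-yielding cues from the previous slide. This is an illustrative sketch only, not the system's endpointer: the `endpoint` function, the per-frame `(is_speech, cue)` input, the cue labels `"yield"` and `"hold"`, and the 500 ms and 200 ms thresholds are all assumptions chosen for the example. The idea shown is that a turn-yielding cue on the last speech frame lets the system accept a shorter silence before responding.

```python
def endpoint(frames, frame_ms=10, long_ms=500, short_ms=200):
    """Return the index of the frame at which the system may respond.

    frames: list of (is_speech, cue) tuples, one per audio frame, where
    cue is "yield", "hold", or None (hypothetical prosody labels).
    Returns None if no endpoint is found.
    """
    silence_ms = 0
    heard_speech = False
    last_cue = None
    for i, (is_speech, cue) in enumerate(frames):
        if is_speech:
            heard_speech = True
            silence_ms = 0      # any speech resets the silence counter
            last_cue = cue      # remember the cue on the latest speech frame
            continue
        if not heard_speech:
            continue            # ignore silence before the user has spoken
        silence_ms += frame_ms
        # A turn-yielding cue shortens the silence we wait for.
        threshold = short_ms if last_cue == "yield" else long_ms
        if silence_ms >= threshold:
            return i
    return None


speech = [(True, None)] * 30
silence = [(False, None)] * 60

yielding = speech + [(True, "yield")] + silence
holding = speech + [(True, "hold")] + silence

print(endpoint(yielding))  # 50: 200 ms of silence after the last speech frame
print(endpoint(holding))   # 80: 500 ms of silence after the last speech frame
```

With 10 ms frames, the yielding utterance is endpointed 300 ms earlier than the holding one, which is the kind of responsiveness gain a fixed silence threshold cannot give.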
Research highlight: Unobtrusiveness • With what should we respond? • Long prompts can be annoying • Short ones may be insufficient • How? • Efficiently or politely? • Speech, gesture, other? • Do we have to “take the turn”? • Backchannels • Grounding