Multimodal Interfaces • Oviatt, S. Multimodal interfaces • Mankoff, J., Hudson, S.E., & Abowd, G.D. Interaction techniques for ambiguity resolution in recognition-based interfaces
Multimodal Interfaces I • An overview of the state of the art as of 2002 • Definition • 2 or more combined user input modes • Multimedia system output • At least one recognition-based technology • Goal • To support more transparent, flexible, efficient, and powerfully expressive means of human-computer interaction • Questions • Is multimodality more about input than about output? Why or why not? • How do multimodal interfaces relate to ubicomp?
Multimodal Interfaces II • Benefits • Easier for more people to learn and use • Usable in more adverse conditions: adaptability • More robust and stable recognition: better error handling through mutual disambiguation (sketched below)
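A minimal Python sketch of mutual disambiguation, with made-up n-best lists, confidence scores, and compatibility rules: every cross-modal pair of hypotheses is scored jointly, and the most likely semantically compatible pair wins, so one mode can rescue a lower-ranked but correct hypothesis in the other.

```python
from itertools import product

# Hypothetical n-best lists from two recognizers: (hypothesis, confidence).
speech_nbest = [("repeat that", 0.45), ("delete that", 0.35), ("delete map", 0.20)]
pen_nbest    = [("scratch_out:map_object_7", 0.70), ("circle:map_object_7", 0.30)]

# Hypothetical compatibility rule: a scratch-out gesture pairs with a "delete"
# command, a circling gesture pairs with a referring expression such as "that".
def compatible(utterance, gesture):
    if gesture.startswith("scratch_out"):
        return "delete" in utterance
    if gesture.startswith("circle"):
        return "that" in utterance
    return False

# Score every cross-modal pair and keep only the mutually consistent ones.
candidates = [
    (u, g, su * sg)
    for (u, su), (g, sg) in product(speech_nbest, pen_nbest)
    if compatible(u, g)
]
best = max(candidates, key=lambda c: c[2])
print(best)  # ('delete that', 'scratch_out:map_object_7', 0.245)
```

Note that the winning speech hypothesis was only second in its own n-best list; the pen stream disambiguated it.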
Multimodal Interfaces III • Input modes • (Keyboard, mouse,) speech, pen, touch, gestures, gaze, facial expressions… • Active vs. passive • Speech/pen input and speech & lip movements have been the focus; research on other modes is catching up • Question • Will keyboard and mouse stay as non-recognition-based input modes?
Multimodal Interfaces IV • Foundation and driving force • Cognitive science research • High-fidelity automatic simulations • Question • Any other major contributing factors?
Multimodal Interfaces V • Cognitive science underpinnings (1) • Users’ situational choice of input modes • Integration and synchronization • Individual differences vs. universal access • Habit, culture, gender, language, age, personality… • Complementarity vs. redundancy in integration • Questions • What other individual differences should be considered in multimodal interface design? • Any comments on complementarity and redundancy?
Multimodal Interfaces V • Cognitive science underpinnings (2) • Different features of multimodal language • Brevity, semantic content, syntactic complexity, word order, disfluency rate, degree of ambiguity, referring expressions, specification of determiners, anaphora, deixis, and linguistic indirectness • Questions • “The medium is the message.” Is the mode the message? Why or why not? • Will some of these features disappear in the future?
Multimodal Interfaces VI • GUIs vs. Multimodal interfaces • Single event stream vs. parallel streams (of continuous and simultaneous input) • Atomic and unambiguous events vs. context and ambiguity handling • Small vs. large computational and memory requirements; centralized vs. distributed interface • Temporal constraints of mode fusion operations
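The temporal point above is easy to miss in prose, so here is a minimal sketch, under the assumption of a fixed integration window, of how events from parallel streams might be grouped before fusion; the window length, event fields, and greedy pairing are all illustrative rather than drawn from any particular system.

```python
from dataclasses import dataclass

@dataclass
class InputEvent:
    mode: str        # e.g. "speech" or "pen"
    content: str
    t_start: float   # seconds
    t_end: float

FUSION_WINDOW = 4.0  # hypothetical integration window; tuned empirically in real systems

def fuse(events):
    """Greedily pair events from different modes whose times fall within
    FUSION_WINDOW of each other; anything left over is handled unimodally."""
    events = sorted(events, key=lambda e: e.t_start)
    fused, used = [], set()
    for i, a in enumerate(events):
        if i in used:
            continue
        for j in range(i + 1, len(events)):
            b = events[j]
            if j in used or b.mode == a.mode:
                continue
            if b.t_start - a.t_end <= FUSION_WINDOW:
                fused.append((a, b))
                used.update({i, j})
                break
        else:
            fused.append((a,))   # no partner in time: interpret this event alone
            used.add(i)
    return fused

print(fuse([
    InputEvent("speech", "put that there", 0.0, 1.2),
    InputEvent("pen", "point:(10, 20)", 0.8, 0.9),
]))
```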
Multimodal Interfaces VII • Architectures and processing techniques • 2 main subtypes of multimodal architecture • Feature level (or “early fusion”): more appropriate for temporally synchronized input modalities • Semantic level (or “late fusion”): good for less temporally coupled input modes • Processing • Frame-based integration • Logic-based: typed feature structures and unification-based integration • Hybrid symbolic/statistical processing
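A minimal sketch of semantic-level ("late") fusion in the frame-based / unification style listed above: each mode contributes a partial frame, and fusion succeeds only when the frames carry no conflicting values. The frames and slot names are illustrative, not the representation of any particular system.

```python
def unify(frame_a, frame_b):
    """Merge two partial interpretation frames; return None on conflicting slots."""
    merged = dict(frame_a)
    for slot, value in frame_b.items():
        if slot in merged and merged[slot] != value:
            return None            # the frames do not unify
        merged[slot] = value
    return merged

# Speech supplies the action and object type; the pen gesture supplies the
# location, so the two partial frames are complementary rather than redundant.
speech_frame  = {"action": "create", "object": "flood_zone"}
gesture_frame = {"location": (47.6, -122.3)}

command = unify(speech_frame, gesture_frame)
print(command)  # {'action': 'create', 'object': 'flood_zone', 'location': (47.6, -122.3)}
```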
Multimodal Interfaces VIII • Future directions • Multidisciplinary cooperation • Input!!! Both active and passive forms are needed. • Adaptability (and adaptivity/context awareness?) • Question • How can we go beyond “model(ing) human-like sensory perception?”
Interaction Techniques for Ambiguity Resolution in Recognition-Based Interfaces • A discussion of issues surrounding recognition-based interfaces, an integral part of multimodal interfaces • Existing error correction techniques • Repetition • 3 dimensions: modality, undo, and repair granularity • Choice • 5 dimensions: layout, instantiation time, contextual information, interaction, and feedback • Both are called “mediation techniques” (sketched below)
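A minimal sketch contrasting the two mediation styles; the function signatures and the handwriting example are illustrative placeholders, not the paper's taxonomy or code.

```python
def choice_mediator(alternatives, pick):
    """Choice: lay out the recognizer's n-best alternatives and let the user
    select one (layout, instantiation time, contextual information,
    interaction, and feedback are the design dimensions for this style)."""
    selection = pick(alternatives)   # e.g. a menu click in a real interface
    return alternatives[selection] if selection is not None else None

def repetition_mediator(undo, reenter):
    """Repetition: undo the rejected interpretation and have the user repeat
    the input, possibly in another modality or at a different granularity."""
    undo()
    return reenter()

# Usage with stand-in callbacks: the user picks the second alternative from a
# handwriting recognizer's n-best list.
print(choice_mediator(["meet", "meat", "mat"], pick=lambda alts: 1))  # "meat"
```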
Interaction Techniques for Ambiguity Resolution in Recognition-Based Interfaces • OOPS (Organized Option Pruning System) • A toolkit that provides infrastructure for tracking and resolving ambiguity • Consistent, recognizer-independent internal model of ambiguous input that allows the separation of recognition, mediation, and application development • Reusable • Supports both automatic mediation and interactive choice and repetition techniques • Supports guided re-recognition • Has a library of standard mediators
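A minimal sketch of the kind of recognizer-independent ambiguity model the paper describes: an input event carries several candidate interpretations, and a chain of mediators (automatic first, interactive as a fallback) accepts one and rejects the rest. Class and method names are illustrative, not the OOPS toolkit API.

```python
class Interpretation:
    def __init__(self, value, confidence):
        self.value = value
        self.confidence = confidence
        self.accepted = None               # None means still ambiguous

class AmbiguousEvent:
    """A piece of raw input together with its candidate interpretations."""
    def __init__(self, source, interpretations):
        self.source = source               # e.g. "pen_stroke_42"
        self.interpretations = interpretations

class AutomaticMediator:
    """Accept the top interpretation only when it clearly dominates;
    otherwise defer to the next mediator (e.g. an interactive n-best menu)."""
    def __init__(self, margin=0.2):
        self.margin = margin

    def mediate(self, event):
        ranked = sorted(event.interpretations, key=lambda i: -i.confidence)
        if len(ranked) == 1 or ranked[0].confidence - ranked[1].confidence >= self.margin:
            for interp in ranked:
                interp.accepted = interp is ranked[0]
            return ranked[0]
        return None                        # still ambiguous; keep mediating

event = AmbiguousEvent("pen_stroke_42",
                       [Interpretation("delete", 0.8), Interpretation("insert", 0.3)])
print(AutomaticMediator().mediate(event).value)  # "delete"
```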
Interaction Techniques for Ambiguity Resolution in Recognition-Based Interfaces • 4 problems and the new mediation techniques that address them • Fixed set of choices • Adding alternatives • Occlusion in choice mediators • No change to the underlying applications • Target ambiguity • (The other two major classes of ambiguity are recognition ambiguity and segmentation ambiguity) • Magnifying lens in the area of input • Errors of omission • (Some or all of the user’s input is not interpreted at all by the recognizer) • Guided re-recognition (sketched below)
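A minimal sketch of guided re-recognition for errors of omission: strokes the first pass left uninterpreted are re-submitted to the recognizer with a hint that the user considers them meaningful. All names are illustrative; this is not the OOPS API.

```python
def guided_rerecognition(strokes, interpretations, recognize):
    """Re-run recognition on any stroke the first pass left uninterpreted."""
    covered = {s for interp in interpretations for s in interp["strokes"]}
    omitted = [s for s in strokes if s not in covered]
    if not omitted:
        return interpretations
    # Re-invoke the recognizer on just the omitted strokes, flagging that the
    # user has indicated this input was meaningful.
    return interpretations + recognize(omitted, must_interpret=True)

# Stand-in recognizer for illustration: when forced, it interprets every stroke.
def toy_recognize(strokes, must_interpret=False):
    return [{"strokes": [s], "value": f"glyph({s})"} for s in strokes]

first_pass = [{"strokes": ["s1"], "value": "glyph(s1)"}]   # "s2" was skipped
print(guided_rerecognition(["s1", "s2"], first_pass, toy_recognize))
```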
Interaction Techniques for Ambiguity Resolution in Recognition-Based Interfaces • Question • What’s your take on OOPS?