
Multimodal Interfaces


Presentation Transcript


  1. Multimodal Interfaces • Oviatt, S. Multimodal interfaces. • Mankoff, J., Hudson, S.E., & Abowd, G.D. Interaction techniques for ambiguity resolution in recognition-based interfaces.

  2. Multimodal Interfaces I • An overview of the state of the art as of 2002 • Definition • 2 or more combined user input modes • Multimedia system output • At least one recognition-based technology • Goal • To support more transparent, flexible, efficient, and powerfully expressive means of HCI • Questions • Is multimodality more about input than about output? Why or why not? • How do multimodal interfaces relate to ubicomp?

  3. Multimodal Interfaces II • Benefits • Easier to learn and use by more people • Usable in more adverse conditions: adaptability • More robust and stable recognition: better error handling through mutual disambiguation
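Mutual disambiguation means that each recognizer's n-best hypotheses are rescored against the other mode's, so a jointly consistent interpretation can win even when neither recognizer ranked it first. A minimal Python sketch of the idea; the n-best lists, confidence scores, and compatibility table are invented for illustration, not taken from Oviatt's systems.

```python
# Minimal sketch of mutual disambiguation (illustrative only; the n-best
# lists, scores, and compatibility rules below are invented assumptions).

# Each recognizer returns an n-best list of (interpretation, confidence).
speech_nbest = [("ditch", 0.50), ("bridge", 0.45)]
gesture_nbest = [("line", 0.55), ("area", 0.40)]

# Domain knowledge: which speech/gesture pairs form a coherent command.
# A "bridge" is drawn as a line; a "ditch" is dug over an area.
compatible = {("bridge", "line"), ("ditch", "area")}

def fuse(speech, gesture):
    """Pick the joint interpretation with the highest combined confidence,
    restricted to mutually consistent pairs."""
    candidates = [
        (s_conf * g_conf, s, g)
        for s, s_conf in speech
        for g, g_conf in gesture
        if (s, g) in compatible
    ]
    return max(candidates, default=None)

print(fuse(speech_nbest, gesture_nbest))
# -> (0.2475, 'bridge', 'line'): the second-ranked speech hypothesis wins
#    because it is the only one consistent with the top-ranked gesture.
```

Here the second-ranked speech hypothesis is pulled up because it is the only one compatible with the top-ranked gesture, which is exactly the error-handling benefit the slide refers to.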

  4. Multimodal Interfaces III • Input modes • (Keyboard, mouse,) speech, pen, touch, gestures, gaze, facial expressions… • Active vs. passive • Speech/pen input and speech & lip movements have been the focus; research on other modes is catching up • Question • Will keyboard and mouse stay as non-recognition-based input modes?

  5. Multimodal Interfaces IV • Foundation and driving force • Cognitive science research • High-fidelity automatic simulations • Question • Any other major contributing factors?

  6. Multimodal Interfaces V • Cognitive science underpinnings (1) • Users’ situational choice of input modes • Integration and synchronization • Individual differences vs. universal access • Habit, culture, gender, language, age, personality… • Complementarity vs. redundancy in integration • Questions • What other individual differences should be considered in multimodal interface design? • Any comments on complementarity and redundancy?

  7. Multimodal Interfaces V • Cognitive science underpinnings (2) • Different features of multimodal language • Brevity, semantic content, syntactic complexity, word order, disfluency rate, degree of ambiguity, referring expressions, specification of determiners, anaphora, deixis, and linguistic indirectness • Questions • “The medium is the message.” Is the mode the message? Why or why not? • Will some of these features disappear in the future?

  8. Multimodal Interfaces VI • GUIs vs. Multimodal interfaces • Single event stream vs. parallel streams (of continuous and simultaneous input) • Atomic and unambiguous events vs. context and ambiguity handling • Small vs. large computational and memory requirements; centralized vs. distributed interface • Temporal constraints of mode fusion operations
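The “parallel streams” and “temporal constraints” points are concrete enough to sketch: a multimodal integrator must decide which timestamped events from different streams belong to the same construction. A hedged sketch assuming a simple fixed fusion window; the 1.5 s value and the events are illustrative, not a prescribed constant.

```python
# Sketch of temporally constrained fusion over parallel input streams.
# The 1.5 s window and the event data are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Event:
    mode: str      # e.g. "speech" or "pen"
    t: float       # onset timestamp in seconds
    payload: str

FUSION_WINDOW = 1.5  # max onset separation for two events to be fused

def pair_streams(speech, pen):
    """Greedily pair speech and pen events whose onsets fall within
    the fusion window; anything left unpaired is handled unimodally."""
    pairs, used = [], set()
    for s in speech:
        for i, p in enumerate(pen):
            if i not in used and abs(s.t - p.t) <= FUSION_WINDOW:
                pairs.append((s, p))
                used.add(i)
                break
    return pairs

speech = [Event("speech", 2.0, "move that here")]
pen = [Event("pen", 1.2, "tap@(40,60)"), Event("pen", 9.0, "tap@(10,10)")]
print(pair_streams(speech, pen))
# -> only the first pen tap (0.8 s before the speech onset) is fused;
#    the tap at t=9.0 falls outside the window.
```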

  9. Multimodal Interfaces VII • Architectures and processing techniques • 2 main subtypes of multimodal architecture • Feature level (or “early fusion”): more appropriate for temporally synchronized input modalities • Semantic level (or “late fusion”): good for less temporally coupled input modes • Processing • Frame-based integration • Logic-based: typed feature structures and unification-based integration • Hybrid symbolic/statistical processing
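As a rough illustration of semantic-level (“late”) fusion with frame-based integration: each mode is recognized independently into a partial attribute-value frame, and fusion merges the frames slot by slot, in the spirit of unification over feature structures (untyped here, for brevity). The slot names and the “put that there”-style example are assumptions for illustration.

```python
# Sketch of frame-based semantic ("late") fusion: each recognizer emits a
# partial frame, and fusion unifies them slot by slot. Slot names and
# values are illustrative, not from a real system.

def unify(frame_a, frame_b):
    """Merge two partial frames; fail (return None) on conflicting slots.
    This mirrors unification over (untyped) feature structures."""
    merged = dict(frame_a)
    for slot, value in frame_b.items():
        if slot in merged and merged[slot] != value:
            return None  # conflicting information: the frames don't unify
        merged[slot] = value
    return merged

# Speech contributes the action and object; the pen gesture contributes
# the concrete location ("move that there" + a pointing tap).
speech_frame = {"action": "move", "object": "red_chair"}
pen_frame = {"location": (120, 45)}

print(unify(speech_frame, pen_frame))
# -> {'action': 'move', 'object': 'red_chair', 'location': (120, 45)}
```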

  10. Multimodal Interfaces VIII • Future directions • Multidisciplinary cooperation • Input!!! Both active and passive forms are needed. • Adaptability (and adaptivity/context awareness?) • Question • How can we go beyond “model(ing) human-like sensory perception?”

  11. Interaction Techniques for Ambiguity Resolution in Recognition-Based Interfaces • A discussion of issues around an integral part of multimodal interfaces: recognition-based interfaces • Existing error correction techniques • Repetition • 3 dimensions: modality, undo, and repair granularity • Choice • 5 dimensions: layout, instantiation time, contextual information, interaction, and feedback • Both are called “mediation techniques”

  12. Interaction Techniques for Ambiguity Resolution in Recognition-Based Interfaces • OOPS (Organized Option Pruning System) • A toolkit that provides infrastructure for tracking and resolving ambiguity • Consistent, recognizer-independent internal model of ambiguous input that allows the separation of recognition, mediation, and application development • Reusable • Supports both automatic mediation and interactive choice and repetition techniques • Supports guided re-recognition • Has a library of standard mediators
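A hedged sketch of the separation OOPS enforces: input events carry unresolved alternative interpretations, mediators (automatic first, interactive choice as fallback) resolve them, and the application sees only resolved input. The class and method names below are invented for illustration and are not the actual OOPS API.

```python
# Hedged sketch of an OOPS-style separation of recognition, mediation,
# and application. Class/method names are invented and are NOT the real
# OOPS (Mankoff et al.) API.

class AmbiguousEvent:
    def __init__(self, alternatives):
        # alternatives: list of (interpretation, confidence), unresolved
        self.alternatives = alternatives
        self.accepted = None

class AutoMediator:
    """Automatic mediation: accept the top choice if it clearly wins."""
    def mediate(self, event, margin=0.2):
        ranked = sorted(event.alternatives, key=lambda a: -a[1])
        if len(ranked) == 1 or ranked[0][1] - ranked[1][1] >= margin:
            event.accepted = ranked[0][0]
        return event.accepted is not None

class ChoiceMediator:
    """Interactive choice: defer to the user (stubbed as picking #1)."""
    def mediate(self, event):
        print("n-best menu:", [a for a, _ in event.alternatives])
        event.accepted = event.alternatives[0][0]  # stand-in for user pick
        return True

def dispatch(event, mediators, app):
    """Run mediators in order; deliver only resolved input to the app."""
    for m in mediators:
        if m.mediate(event):
            app(event.accepted)
            return

dispatch(AmbiguousEvent([("hello", 0.9), ("hollow", 0.3)]),
         [AutoMediator(), ChoiceMediator()],
         app=lambda text: print("app received:", text))
# -> the auto mediator resolves it (0.9 vs 0.3 clears the margin),
#    so the interactive choice menu is never shown.
```

Because the ambiguity model lives between the recognizer and the application, mediators like these can be swapped or reordered without touching either side, which is what makes the toolkit recognizer-independent and reusable.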

  13. Interaction Techniques for Ambiguity Resolution in Recognition-Based Interfaces • 4 problems and their respective new mediation techniques • Fixed set of choices • Adding alternatives • Occlusion in choice mediators • No change to the underlying applications • Target ambiguity • (The other two major classes of ambiguity are recognition ambiguity and segmentation ambiguity) • Magnifying lens in the area of input • Errors of omission • (Some or all of the user’s input is not interpreted at all by the recognizer) • Guided re-recognition
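Guided re-recognition is concrete enough to sketch: instead of making the user redo the input, mediation gathers a hint (e.g. “this was a word, not a sketch”) and re-invokes the recognizer under that constraint. The recognizer stub and its parameters below are invented for illustration, not the OOPS interface.

```python
# Illustrative sketch of guided re-recognition. The recognizer stub and
# its allowed_types parameter are invented assumptions.

def recognize(strokes, allowed_types=("word", "sketch")):
    """Stand-in recognizer: returns (type, interpretation) guesses that
    respect the allowed_types constraint."""
    guesses = [("sketch", "rectangle"), ("word", "hello")]
    return [g for g in guesses if g[0] in allowed_types]

strokes = ["..."]                 # stand-in for captured ink
print(recognize(strokes))         # first pass: ambiguous, both survive

# User feedback during mediation narrows the event type, so the system
# re-recognizes the *same* ink with the hint rather than discarding it.
print(recognize(strokes, allowed_types=("word",)))  # -> [('word', 'hello')]
```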

  14. Interaction Techniques for Ambiguity Resolution in Recognition-Based Interfaces • Question • What’s your take on OOPS?
