Leveraging Human Capabilities in Perceptual Interfaces George G. Robertson, Microsoft Research
Outline and Goal • What are perceptual interfaces? • Perceptive vs perceptual • Multimodal interfaces • Challenge: Do our interfaces work? • How do we find out? • Challenge: Broaden our scope • Leverage other natural human capabilities
Perceptive to Perceptual • Perceptive UI: aware of user • Input to computer: use human motor skills • Multimodal UI: use communication skills • We use multiple modalities to communicate • Perceptual UI: use many human abilities • Perception, cognition, motor, communication
What are Modalities? • Sensations (hearing or seeing) • Human communication channels
What are Multimodal Interfaces? • Attempts to use human communication skills • Provide user with multiple modalities • May be simultaneous or not • Fusion vs Temporal Constraints • Multiple styles of interaction
Examples • Bolt, SIGGRAPH’80 • Put That There • Speech and gestures used simultaneously
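To make the "Put That There" idea concrete, here is a minimal Python sketch of time-stamped fusion: deictic words in the speech stream are resolved against the pointing event nearest in time. The event structures and the one-second pairing window are illustrative assumptions, not Bolt's implementation.

```python
# A minimal sketch of "Put That There"-style fusion (illustrative, not
# Bolt's implementation): each deictic word in the transcript is
# resolved to the pointing event nearest in time.
from dataclasses import dataclass

@dataclass
class SpeechToken:
    word: str
    time: float   # seconds from start of utterance

@dataclass
class PointEvent:
    x: float      # normalized screen coordinates
    y: float
    time: float

DEICTICS = {"this", "that", "here", "there"}

def fuse(tokens, points, max_gap=1.0):
    """Pair each deictic token with the nearest pointing event in time;
    pairs farther apart than max_gap seconds stay unresolved."""
    out = []
    for tok in tokens:
        if tok.word not in DEICTICS:
            out.append(tok.word)
            continue
        nearest = min(points, key=lambda p: abs(p.time - tok.time), default=None)
        if nearest and abs(nearest.time - tok.time) <= max_gap:
            out.append(f"{tok.word}@({nearest.x:.2f},{nearest.y:.2f})")
        else:
            out.append(f"{tok.word}@?")   # deictic with no gesture to bind to
    return " ".join(out)

# "Put that there", spoken while pointing twice:
tokens = [SpeechToken("put", 0.0), SpeechToken("that", 0.4), SpeechToken("there", 1.2)]
points = [PointEvent(0.2, 0.8, 0.5), PointEvent(0.7, 0.3, 1.3)]
print(fuse(tokens, points))   # put that@(0.20,0.80) there@(0.70,0.30)
```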
Examples (continued) • Buxton and Myers, CHI’86 • Two-handed input • Cohen et al, CHI’89 • Direct manipulation and NL • Hauptmann, CHI’89 • Speech and gestures
Examples (continued) • Bolt, UIST’92 • Two-handed gestures and Gaze • Blattner & Dannenberg, 1992 book • Hanne: text & gestures (interaction styles) • Pausch: selection by multimodal input • Rudnicky: speech, gesture, keyboard • Bier et al, SIGGRAPH’93 • Tool Glass; two-handed input
Examples (continued) • Balboa & Coutaz, Intelligent UI’93 • Taxonomy and evaluation of MMUI • Walker, CHI’94 • Facial expression (multimodal output) • Nigay & Coutaz, CHI’95 • Architecture for fused multimodal input
Why Multimodal Interfaces? • Current interfaces fall far short of human capabilities • Higher bandwidth is possible • Different modalities excel at different tasks • Errors and disfluencies are reduced • Multimodal interfaces are more engaging
Leverage Human Capabilities • Leverage senses and perceptual system • Users perceive multiple things at once • Leverage motor and effector capabilities • Users do multiple things at once
Senses and Perception • Use more of user’s senses • Not just vision • Sound • Tactile feedback • Taste and smell (maybe in the future) • Users perceive multiple things at once • e.g., vision and sound
Motor & Effector Capabilities • Currently: pointing or typing • Much more is possible: • Gesture input • Two-handed input • Speech and NL • Body position, orientation, and gaze • Users do multiple things at once • e.g., speak and use hand gestures
Simultaneous Modalities? • Single modality at a time • Adapt to display characteristics • Let user determine input mode • Redundant, but only one at a time • Multiple simultaneous modalities • Two-handed input • Speech and hand gestures • Graphics and sound
Taxonomy (Balboa, 1993) • Two axes: Fusion (fused vs independent) and Temporal Constraints (sequential vs concurrent) • Fused + sequential: "Put that there" click … click • Fused + concurrent (Synergetic): "Put that" click "there" click • Independent + sequential (Exclusive): multiple menu selections or multiple spoken commands • Independent + concurrent: shortcuts
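The two axes above can be read as a simple classifier. A small sketch, using the slide's cell labels plus Nigay & Coutaz-style names for the two cells the slide leaves unlabeled (those two names are my assumption):

```python
def classify(fused: bool, concurrent: bool) -> str:
    """Classify a multimodal interaction by the taxonomy's two axes.
    'Synergetic' and 'Exclusive' are named on the slide; 'Alternate'
    and 'Concurrent' follow Nigay & Coutaz's design space and are
    assumptions."""
    if fused:
        return "Synergetic" if concurrent else "Alternate"
    return "Concurrent" if concurrent else "Exclusive"

print(classify(fused=True, concurrent=True))    # "Put that <click> there <click>"
print(classify(fused=False, concurrent=False))  # spoken commands issued one at a time
```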
Modality = Style of Interaction • Many styles exist • Command interface • NL • Direct manipulation (WIMP and non-WIMP) • Conversational (with an interface agent) • Collaborative • Mixed styles produce multimodal UI • Direct manipulation and conversational agent
Multimodal versus Multimedia • Multimedia is about media channels • Text, graphics, animation, video: all visual media • Multimodal is about sensory modalities • Visual, auditory, tactile, … • Multimedia is a subset of Multimodal Output
How Do The Pieces Fit? [Diagram: Perceptual UI spans Multimodal Input and Multimodal Output; Perceptive UI sits on the input side; Multimedia sits inside Multimodal Output]
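One way to read that diagram, sketched as a toy Python hierarchy (this structuring is my assumption, not an API):

```python
# A toy Python reading of the diagram (my structuring, not an API):
class MultimodalInput:
    """Input channels: speech, gesture, gaze, two-handed input, ..."""

class MultimodalOutput:
    """Output modalities: visual, auditory, tactile, ..."""

class Multimedia(MultimodalOutput):
    """Text, graphics, animation, video: all visual media, so a
    subset of multimodal output."""

class PerceptiveUI:
    """The input side that is aware of the user and their actions."""

class PerceptualUI:
    """The umbrella: combines awareness of the user with multimodal
    input and output to engage perception, cognition, motor skills,
    and communication."""
    def __init__(self):
        self.awareness = PerceptiveUI()
        self.input = MultimodalInput()
        self.output = Multimedia()   # one multimedia-capable output channel
```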
Challenge • Do our interfaces actually work? • How do we find out?
Why Test For Usability? • Commercial efforts require proof • Cost-benefit analysis before investment • Intuitions are great for design • But intuition is not always right! • Case in point: the Peripheral Lens
Peripheral Vision • Does peripheral vision make navigation easier? • Can we simulate peripheral vision?
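As one hedged illustration of what "simulating peripheral vision" can mean on a flat display, the sketch below maps a wide horizontal field of view onto the screen: the central field maps linearly, and the remaining periphery is compressed into narrow strips at the edges. The mapping and all constants are assumptions, not the published Peripheral Lens geometry.

```python
import math

# A minimal sketch of simulating peripheral vision on a flat display:
# angles inside the central field map to the screen linearly, while a
# wider field is compressed into narrow strips at the edges. The
# specific mapping is an assumption, not the Peripheral Lens geometry.
def angle_to_screen_x(theta, width=1600, central_fov=math.radians(60),
                      full_fov=math.radians(160), lens_frac=0.15):
    """Map a horizontal view angle theta (radians, 0 = straight ahead)
    to a pixel column; |theta| beyond the central FOV lands in the
    compressed peripheral strips."""
    half_central = central_fov / 2
    half_full = full_fov / 2
    center_px = width * (1 - 2 * lens_frac)   # pixels for the central view
    lens_px = width * lens_frac               # pixels per peripheral strip
    if abs(theta) <= half_central:
        # linear mapping inside the central field
        return width / 2 + (theta / half_central) * (center_px / 2)
    # compress the remaining angle range into the edge strip
    sign = 1 if theta > 0 else -1
    frac = (abs(theta) - half_central) / (half_full - half_central)
    return width / 2 + sign * (center_px / 2 + min(frac, 1.0) * lens_px)

print(angle_to_screen_x(0.0))                  # screen center: 800.0
print(angle_to_screen_x(math.radians(75)))     # deep periphery, near the edge
```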
Peripheral Lens Intuitions • Locomotion should be easier • Especially around corners • Wayfinding should be easier • You can see far sooner
Peripheral Lens Findings • Lenses were about the same speed • Harder to use for inexperienced people • Corner turning was not faster
The Lesson • Do not rely solely on intuition • Test for usability!
Challenge • Are we fully using human capabilities? • Perceptive UI is aware of the body • Multimodal UI is aware that we use multiple modalities, sometimes simultaneously • Perceptual UI should go beyond both of these
Research Strategy • Leverage Human Capabilities • Exploit Technology Discontinuities • Compelling Task: Information Access
Engaging Human Abilities • Abilities engaged: communication, perceptual, cognitive, motor • Helps user: understand complexity, take on new classes of tasks, with less effort
Examples: Communication (Language) • Flexible • Robust • Dialogue to resolve ambiguity
Examples: Communication (Gesture) • Hands • Body pose • Facial expression
Camera-Based Conversational Interfaces • Leverage face-to-face communication skills
Examples: Communication (Awareness) • Is anybody there? • Doing what?
Camera-Based Awareness • What is the user doing?
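A minimal sketch of one way to answer "is anybody there, and doing what?" from a camera: frame differencing. The thresholds, frame sizes, and presence categories are illustrative assumptions.

```python
import numpy as np

# A minimal sketch of camera-based awareness via frame differencing:
# "is anybody there, and how much are they moving?" Thresholds and the
# frame source are assumptions; frames are grayscale numpy arrays.
def activity_level(prev_frame: np.ndarray, frame: np.ndarray,
                   pixel_thresh: int = 25) -> float:
    """Fraction of pixels that changed noticeably between two frames."""
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    return float(np.mean(diff > pixel_thresh))

def classify_presence(activity: float) -> str:
    if activity < 0.002:
        return "nobody there (or perfectly still)"
    elif activity < 0.05:
        return "present, mostly idle"
    return "present and active"

rng = np.random.default_rng(0)
prev = rng.integers(0, 256, (120, 160), dtype=np.uint8)
curr = prev.copy()
curr[40:80, 60:100] = 255            # simulate a moving region
print(classify_presence(activity_level(prev, curr)))
```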
Examples: Communication (Emotion) • Social response • Perceived personality
Examples: Communication (Multimodal) • Natural • Choice • Reduces errors • Higher bandwidth
Examples: Motor Skills • Bimanual skills • Muscle memory • Multimodal map manipulation: two hands + speech
Camera-Based Navigation • How do our bodies move when we navigate?
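One plausible mapping, sketched below: treat head displacement from a calibrated rest pose as a navigation command, with leaning forward moving the viewpoint and leaning sideways turning it. The tracker interface and gain constants are assumptions.

```python
# A minimal sketch of body-driven navigation: lean forward/back to move,
# lean sideways to turn, relative to a calibrated rest position. The
# head-tracker interface and gain constants are assumptions.
from dataclasses import dataclass

@dataclass
class HeadPose:
    x: float   # meters right of the rest position
    z: float   # meters in front of the rest position

def navigation_command(pose: HeadPose, dead_zone=0.03,
                       move_gain=2.0, turn_gain=1.5):
    """Return (forward_speed, turn_rate); small leans inside the
    dead zone are ignored so the user can sit still."""
    forward = pose.z if abs(pose.z) > dead_zone else 0.0
    lateral = pose.x if abs(pose.x) > dead_zone else 0.0
    return move_gain * forward, turn_gain * lateral

print(navigation_command(HeadPose(x=0.01, z=0.12)))  # lean forward: move ahead
print(navigation_command(HeadPose(x=-0.10, z=0.0)))  # lean left: turn left
```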
Examples: Perception • Spatial relationships • Pattern recognition • Object constancy • Parallax • Other senses • Example shown: Cone Tree (Xerox PARC Information Visualizer)
Examples: Perception (Parallax) • Key 3D depth cue • Sensor issues • Camera-based head-motion parallax
Camera-Based Head-Motion Parallax • Motion parallax is one of the strongest 3D depth cues
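A simplified sketch of how head tracking yields motion parallax: translate the virtual camera with the tracked head, and near objects shift across the screen more than far ones. A real fish-tank-style setup would also use an off-axis projection; the pinhole model and constants here are assumptions.

```python
# A minimal sketch of head-motion parallax: the virtual camera follows
# the tracked head, so near objects shift across the screen more than
# far ones. The simplified pinhole model and its constants are
# assumptions, not a full off-axis projection.
def project(point, head_x, screen_dist=0.6, focal=800.0):
    """Project a 3D point (x, y, z in meters, z = distance into the
    scene behind the screen) onto screen x-pixels, for a camera that
    has followed the head to horizontal offset head_x."""
    x, y, z = point
    return focal * (x - head_x) / (z + screen_dist)

near, far = (0.0, 0.0, 0.3), (0.0, 0.0, 3.0)
for head_x in (0.0, 0.10):           # move the head 10 cm to the right
    print(project(near, head_x), project(far, head_x))
# The near point shifts ~89 px, the far point only ~22 px:
# that differential shift is the parallax depth cue.
```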
Examples: Perception (Other Senses) • Auditory • Tactile • Kinesthetic • Vestibular • Taste • Olfactory
Examples: Perception (Olfactory) • Maybe soon? • Ferris Productions Olfactory VR Add-on (Time, April 29, 1996) • Barfield & Danas, Olfactory Displays (Presence, Winter 1995)
Examples: Cognition • Spatial memory • Cognitive chunking • Attention • Curiosity • Time constants • Example: Data Mountain
Data Mountain: Favorites Management • Exploits: spatial memory, 3D perception, pattern recognition • Advantages: spatial organization; not page-at-a-time; 3D advantages with 2D interaction
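As a sketch of the "3D advantages with 2D interaction" point: a 2D mouse drag can be mapped onto an inclined plane, so a page dragged upward recedes into the scene and is drawn smaller. The tilt angle and scaling constants below are assumptions, not the actual Data Mountain geometry.

```python
import math

# A minimal sketch of "3D advantages with 2D interaction": a 2D mouse
# position is mapped onto an inclined plane, so pages dragged upward
# recede into the scene and render smaller. The tilt and scaling are
# assumptions, not the actual Data Mountain geometry.
def place_on_mountain(mouse_x, mouse_y, tilt_deg=65.0, depth_scale=0.01):
    """Map mouse (x, y) in pixels (y measured up from the plane's front
    edge) to a 3D point on the inclined plane and a thumbnail scale."""
    tilt = math.radians(tilt_deg)
    dist = mouse_y * depth_scale          # distance up the slope, meters
    x = mouse_x * depth_scale
    y = dist * math.sin(tilt)             # height gained on the slope
    z = dist * math.cos(tilt)             # depth into the scene
    scale = 1.0 / (1.0 + z)               # farther pages draw smaller
    return (x, y, z), scale

pos, scale = place_on_mountain(mouse_x=120, mouse_y=300)
print(pos, scale)   # a page dragged upward lands higher, deeper, smaller
```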
Sample User Reaction • "Strongest cue ... relative size" (study subject) • [Figure: one subject's layout of 100 pages]