Attention and Perception of Attentive Behaviours for Emotive Interaction • Christopher Peters • LINC, University of Paris 8
Virtual Humans • Computer models of people • Can be used as… • substitutes for the real thing in ergonomic evaluations • Conversational agents • Display and Animation: • Two layers: skeletal layer and skin layer • Skeleton is hierarchy of positions and orientations • Skin layer provides the visual appearance
Animating Characters • Animation Methods • Low level: rotate leye01 by 0.3 degrees around axis (0.707,0.707,0.0) at 0.056 seconds into the animation • High level: ‘walk to the shops’ • Character must know where the shop is, avoid obstacles on the way there, etc. Must also be able to walk … • Autonomy • Direct animation vs. automatic generation • Autonomy requires the character to animate itself based on simulation models • Models should result in plausible behaviour
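To make the low-level/high-level contrast concrete, here is a minimal sketch in Python with NumPy of what the quoted low-level command amounts to; the keyframe tuple is a hypothetical representation, not the system's actual format:

```python
# Minimal sketch of the quoted low-level command: rotate joint 'leye01'
# by 0.3 degrees around (0.707, 0.707, 0.0) at t = 0.056 s.
import math
import numpy as np

def axis_angle_to_quaternion(axis, degrees):
    """Convert an axis-angle rotation to a unit quaternion (w, x, y, z)."""
    axis = np.asarray(axis, dtype=float)
    axis /= np.linalg.norm(axis)
    half = math.radians(degrees) / 2.0
    return np.concatenate(([math.cos(half)], math.sin(half) * axis))

# One keyframe: (time in seconds, joint name, rotation quaternion).
keyframe = (0.056, "leye01", axis_angle_to_quaternion((0.707, 0.707, 0.0), 0.3))
print(keyframe)
```

A high-level command like ‘walk to the shops’ would ultimately be compiled down to long sequences of keyframes of exactly this kind, which is why autonomy needs simulation models sitting above this layer.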
Our Focus • Attention and related behaviours • (a) Where to look • (b) How to generate gaze behaviour • Perception of attentive behaviours and emotional significance • How to interpret attention behaviours of others • Conversation initialisation in Virtual Environments…
Why VE? • Cheap! • No need for expensive equipment (facilities, robots, etc) • Duplication at the click of a mouse • Quick • Changes to environment can be made quickly and easily, at no extra cost • But… • Things we take for granted in RL need to be programmed into the virtual environment • Physics • And will only ever be approximations of reality
1. Attention and Gaze • Our character is walking down a virtual street • Where should the character look and how should the looking behaviours be generated?
Foundation • Humans need to look around • An ecological approach to visual perception, J.J. Gibson, 1979. • Eyes in the front of our head • Poor acuity over most of visual field • Even for places where we have been before, memory is far from perfect • Virtual humans should look around too! (Image: iLab, University of Southern California)
Significance to Virtual Humans • Viewer perception • The impact of eye gaze on communication using humanoid avatars, Garau et al., 2001. • Plausibility • “If they don’t look around, then how do they know where things are?”
Significance to Virtual Humans • Functional purposes • Navigation for digital actors based on synthetic vision, memory and learning, Noser et al., 1995. • Autonomy • If they don’t look around, then they won’t know where things are
Our Focus • Gaze shifts versus saccadic eye movements • General looking behaviours • Where to Look? Automating Certain Visual Attending Behaviors of Human Characters, Chopra-Khullar, 1999. • Practical Behavioural Animation Based On Vision and Attention, Gillies, 2001. • Two problems: • Where to look • How to look
Approach • Use appropriately simplified models from areas such as psychology, neuroscience, artificial intelligence … • Appropriate = fast, allowing real-time operation • Capture the high-level salient aspects of such models without the intricate detail • Components • Sensing, Attention and Memory (where to look) • Gaze Generator (how to look)
System Overview • Input environment through synthetic vision component • Process visual field using spatial attention model • Modulate attended object details using memory component • Generate gaze behaviours towards target locations
Visual Sensing • Three renderings taken per visual update • One full-scene rendering (to attention module) • Two false-colour renderings (to memory module)
False-colour Renderings • Approximate acuity of the eye with two renderings • Fovea • Periphery
Renderings • Renderings allow both spatial/image and object based operations to take place
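As a minimal sketch of the object-based side, assuming the false-colour renderings arrive as small uint8 RGB arrays, one can collect the set of object colours visible in each region:

```python
# Collect the object colours visible in the fovea and periphery renderings;
# image-space pixels map straight to object identities via their colours.
import numpy as np

def visible_colours(false_colour_image):
    """Return the unique (r, g, b) triples present in a false-colour render."""
    pixels = false_colour_image.reshape(-1, 3)
    return {tuple(p) for p in np.unique(pixels, axis=0)}

# Toy 2x2 renderings standing in for the real fovea/periphery buffers.
fovea = np.array([[[1, 1, 0], [1, 1, 0]],
                  [[2, 1, 0], [0, 0, 0]]], dtype=np.uint8)
periphery = np.array([[[3, 1, 0], [0, 0, 0]],
                      [[1, 1, 0], [2, 2, 0]]], dtype=np.uint8)

foveated = visible_colours(fovea)                   # seen at high acuity
peripheral = visible_colours(periphery) - foveated  # seen only coarsely
print(foveated, peripheral)
```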
1(a) Where to look • Model of Visual Attention • Two-component theory of attention • “Bottom-up” • Exogenous • Environment appears to ‘grab’ our attention • Colour, intensity, orientation, motion, texture, sudden onset, etc • “Top-down” • Endogenous • Voluntary, task driven • ‘Look for the coke can’
Bottom-up Attention Orientation, intensity and colour contrast
Bottom-up Attention • Model • Cognitive engineering • Itti et al. 2000 • http://ilab.usc.edu/bu/ • Biologically inspired • Inputs an image, outputs encoding of attention allocation • Peters and O’Sullivan 2003
(Figure: input image decomposed into its intensity, red-green (RG) and blue-yellow (BY) colour channels)
Gaussian Pyramid • Each channel acts as the first level in a Gaussian or Gabor pyramid • Each subsequent level is a blurred and decimated version of the previous level • Image processing techniques simulate early visual processing
Center-Surround Processing • Early visual processing • Ganglion cells • Respond to light in a center-surround pattern • Contrast a central area with its neighbours • Simulated by comparing different levels in image pyramids • Contrast is important, not amplitude: the response depends on context
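A minimal sketch of these two steps in Python with SciPy, in the spirit of Itti et al.'s model; the sigma, level count and level choices are illustrative assumptions:

```python
# Build a Gaussian pyramid by blur-and-decimate, then take an across-scale
# difference so that contrast, not raw amplitude, drives the feature map.
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def gaussian_pyramid(channel, levels=5):
    """Each level is a blurred, half-resolution copy of the previous one."""
    pyramid = [channel]
    for _ in range(levels - 1):
        blurred = gaussian_filter(pyramid[-1], sigma=1.0)
        pyramid.append(blurred[::2, ::2])   # decimate by 2 in each axis
    return pyramid

def center_surround(pyramid, center=1, surround=3):
    """Contrast a fine 'center' level against a coarse 'surround' level."""
    c = pyramid[center]
    # Upsample the coarse level back to the center level's resolution.
    factor = 2 ** (surround - center)
    s = zoom(pyramid[surround], factor, order=1)[:c.shape[0], :c.shape[1]]
    return np.abs(c - s)                     # responds to contrast only

intensity = np.random.rand(128, 128)         # stand-in intensity channel
feature_map = center_surround(gaussian_pyramid(intensity))
```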
Saliency Map • Conspicuity Maps • Result of center-surround calculations for each feature type • Define the ‘pop-out’ for each feature type • Integrated into saliency map • Attention directed preferentially to lighter areas (Figure: input image, intensity, colour and orientation conspicuity maps, and the combined saliency map)
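A minimal sketch of the integration step; Itti et al. use a more elaborate normalisation operator, so simple peak normalisation stands in for it here:

```python
# Normalise each conspicuity map to a common range, then average them
# into a single saliency map so no feature dominates by raw amplitude.
import numpy as np

def normalise(m):
    """Rescale a map to [0, 1]."""
    rng = m.max() - m.min()
    return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

def saliency_map(conspicuity_maps):
    """Combine the intensity, colour and orientation conspicuity maps."""
    return np.mean([normalise(m) for m in conspicuity_maps], axis=0)
```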
Memory • Differentiate between what an agent has and hasn’t observed • Agents should only know about objects that they have witnessed • Agents won’t have exact knowledge about the world • Used to modulate output of attention module (saliency map) • Object-based, taking input from synthetic vision module
Stage Theory • The further information goes, the longer it is retained • Attention acts as a filter
Stimulus Representations • Two levels of detail representation for objects • Proximal stimuli • Early representation of the stimulus • Data discernible only from the retinal image • Observations • Later representation of stimuli after resolution with the world database
Stage Theory • Short-term Sensory Storage (STSS) • From distal to proximal stimuli • Objects have not yet been resolved with world database
Stage Theory • Short-term memory (STM) and Long-Term Memory (LTM) • Object-based • Contains resolved object information • From proximal stimuli to observations • Observations store information for attended objects • Object pointer • World-space transform • Timestamp • Virtual humans are not completely autonomous from the world database
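A minimal sketch of these representations as plain data structures; the observation fields follow the slide (object pointer, world-space transform, timestamp), but the class layout itself is an assumption:

```python
# Proximal stimuli hold only retina-derived data; observations hold object
# information resolved against the world database, as in the stage theory.
import time
from dataclasses import dataclass, field

@dataclass
class ProximalStimulus:
    """Early representation: data discernible from the retinal image alone."""
    false_colour: tuple     # (r, g, b) identifier seen on the retina
    image_position: tuple   # (x, y) pixel location in the rendering

@dataclass
class Observation:
    """Later representation, stored for attended objects in STM/LTM."""
    object_ref: object      # pointer into the scene database
    world_transform: list   # world-space position/orientation when attended
    timestamp: float = field(default_factory=time.time)

short_term_memory: list[Observation] = []   # STM; promoted to LTM over time
```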
Memory Uncertainty Map • Can now create a memory uncertainty map for any part of the scene the agent is looking at • The agent is uncertain of parts of the scene it has not looked at before • Depends on scene object ‘granularity’
Attention Map • Determines where attention will be allocated • Bottom-up components • Top-down (see 2) • Memory • Modulating the saliency map by the uncertainty map • Here, sky and road have low uncertainty levels
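A minimal sketch of the modulation step, assuming the saliency and uncertainty maps are same-sized arrays:

```python
# The attention map is the saliency map weighted by per-pixel memory
# uncertainty, so well-known regions (e.g. sky and road) attract little
# attention even if they are visually salient.
import numpy as np

def attention_map(saliency, uncertainty):
    """Element-wise product: salient AND uncertain regions win attention."""
    assert saliency.shape == uncertainty.shape
    return saliency * uncertainty
```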
Human Scanpaths Eye movements and fixations
Inhibition of Return • Focus of attention must change • Inhibit attended parts of the scene from being revisited soon • Image-based IOR • Problem: Moving viewer or dynamic scene • Solution: Object based memory • Object-based IOR • Store uncertainty level with each object • Modulate saliency map by uncertainty levels
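A minimal sketch of object-based IOR; the recovery rate is an illustrative assumption:

```python
# Each object carries an uncertainty level that drops to zero when attended
# and creeps back over time, so recently attended objects are suppressed
# when the saliency map is modulated by these levels.
class ObjectIOR:
    def __init__(self):
        self.uncertainty = {}            # object id -> level in [0, 1]

    def attend(self, obj_id):
        """Attending an object makes it certain, inhibiting a quick return."""
        self.uncertainty[obj_id] = 0.0

    def tick(self, dt, recovery_rate=0.1):
        """Uncertainty (and thus eligibility for attention) recovers."""
        for obj_id, u in self.uncertainty.items():
            self.uncertainty[obj_id] = min(1.0, u + recovery_rate * dt)
```

Because the levels are stored per object rather than per pixel, the inhibition survives viewer motion and dynamic scenes, which is exactly where image-based IOR breaks down.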
Artificial Regions of Interest • Attention map at lower resolution than visual field • Generate AROIs from highest values of current attention map to create scanpath • Assume simple one-to-one mapping from attention map to overt attention
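A minimal sketch of AROI selection by repeated maximum-picking with local suppression; the ROI count and suppression radius are illustrative assumptions:

```python
# Repeatedly take the attention map's global maximum as the next region
# of interest, zeroing a small neighbourhood around each pick so the
# resulting scanpath moves on to new locations.
import numpy as np

def artificial_rois(attention, count=5, suppress_radius=4):
    rois, a = [], attention.copy()
    for _ in range(count):
        y, x = np.unravel_index(np.argmax(a), a.shape)
        rois.append((y, x))                        # next fixation target
        y0, y1 = max(0, y - suppress_radius), y + suppress_radius + 1
        x0, x1 = max(0, x - suppress_radius), x + suppress_radius + 1
        a[y0:y1, x0:x1] = 0.0                      # local suppression
    return rois
```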
1(b) How to look • Generate gaze animation given a target location • Gaze shifts • Combined eye-head gaze shifts to visual and auditory targets in humans, Goldring et al., 1996. • Targets beyond oculomotor range
Gaze Shifts • Contribution of head movements • Head Movement Propensity, J. Fuller, 1992. • ‘Head movers’ vs. ‘eye movers’ • ±40 degree orbital threshold • Innate behavioural tendency for subthreshold head moving • Midline-attraction and Resetting
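A minimal sketch of splitting a gaze shift between eyes and head around the orbital threshold; the propensity parameter stands in for the head-mover/eye-mover tendency and is an illustrative model, not Fuller's actual formulation:

```python
# Targets within the ~40-degree orbital range can be reached by the eyes
# alone; beyond it the head must contribute. 'Head movers' recruit the
# head even for subthreshold targets.
def split_gaze_shift(target_angle, orbital_limit=40.0, propensity=0.3):
    """Return (eye_rotation, head_rotation) in degrees for one gaze shift."""
    magnitude, sign = abs(target_angle), 1 if target_angle >= 0 else -1
    if magnitude <= orbital_limit:
        # Subthreshold: head movers still rotate the head a little.
        head = propensity * magnitude
    else:
        # The eyes cover the orbital limit; the head must do the rest.
        head = magnitude - orbital_limit + propensity * orbital_limit
    head = min(head, magnitude)
    return sign * (magnitude - head), sign * head

print(split_gaze_shift(25.0))   # mostly eyes
print(split_gaze_shift(70.0))   # large head contribution
```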
Blinking • Subtle and often overlooked • Not looking while leaping: the linkage of blinking and saccadic gaze shifts, Evinger et al., 1994. • Gaze-evoked blinking • Amplitude of gaze shift influences blink probability and magnitude
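A minimal sketch of gaze-evoked blinking; the saturating probability curve is an illustrative assumption, not Evinger et al.'s fitted model:

```python
# Larger gaze shifts make an accompanying blink more likely, so blink
# probability is modelled as a simple saturating function of amplitude.
import random

def blink_probability(amplitude_deg, saturation=60.0):
    """Probability of a blink accompanying a gaze shift of this amplitude."""
    return min(1.0, amplitude_deg / saturation)

def maybe_blink(amplitude_deg):
    return random.random() < blink_probability(amplitude_deg)
```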
2. Perception of Attention • Attention behaviours may elicit attention from others • Predator-prey • Gaze-following • Goals • Intentions
Gaze in Infants • Infants • Notice gaze direction as early as 3 months • Gaze-following • Infants are faster at looking at targets that are being looked at by a central face • Respond even to circles that look like eyes (Image: www.mayoclinic.org)
Theory of Mind • Baron-Cohen (1994) • Eye Direction and Intentionality Detectors • Theory of Mind Module • Perrett and Emery (1994) • More general Direction of Attention Detector • Mutual Attention Mechanism
Our Model • ToM for conversation initiation • Based on attention behaviours • Key metrics in our system are Attention Levels and Level of Interest • Metrics represent the amount of attention perceived to have been paid by another • Based primarily on gaze • Also body direction, locomotion, directed gestures and facial expressions • Emotional significance of gaze
Implementation (in progress) • Torque game engine • http://www.garagegames.com • Proven engine used for a number of AAA titles • Useful basis providing fundamental functionality • Graphics exporters • In-simulation editing • Basic animation • Scripting • Terrain rendering • Special effects
Synthetic Vision • Approximated human vision for computer agents • Why? • Inexpensive – no special hardware required • Bypasses many computer vision complexities • Segmentation of images, recognition • Enables characters to receive visual information in a way analogous to humans • How? • Updated in a snapshot manner • Small, simplified images rendered from the agent’s perspective • Textures, lighting and sfx disabled • False-colouring
False-colours provide a look-up scheme for acquiring objects from the database • False colour defined as (r,g,b), where • Red is the object type identifier • Green is the object instance identifier • Blue is the sub-object identifier • Allows quick retrieval of objects
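A minimal sketch of the encode/decode round trip; the example database is hypothetical:

```python
# Red encodes the object type, green the instance, blue the sub-object,
# so any rendered pixel maps straight back to a database entry.
def encode_false_colour(type_id, instance_id, sub_object_id):
    assert all(0 <= v < 256 for v in (type_id, instance_id, sub_object_id))
    return (type_id, instance_id, sub_object_id)

def decode_false_colour(pixel, database):
    """Map one rendered pixel back to the object it came from."""
    type_id, instance_id, sub_object_id = pixel
    return database[(type_id, instance_id)], sub_object_id

# Hypothetical database keyed by (type, instance).
database = {(1, 1): "pedestrian#1", (1, 2): "pedestrian#2", (2, 1): "car#1"}
print(decode_false_colour((1, 2, 0), database))
```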
Intentionality Detector (ID) • Represents behaviour in terms of volitional states (goal and desire) • Based on visual, auditory and tactile cues • Our version is based on vision only • Attributes an intentionality characteristic to objects based on the presence of certain cues • Implemented as a filter on objects from the visual system • Only “agent” objects can pass the filter
Direction of Attention • Direction of Attention Detector (DAD) • More useful than the Eye Direction Detector (EDD) alone • Eye, head, body and locomotion direction read from database after false-colour lookup • Used to derive the Attention Level metric from filtered stimuli
Direction of Attention • What happens when eyes aren’t visible? • Hierarchy of other cues • Head direction > Body direction > Locomotion direction
Mutual Attention • Comparison between: • Eye direction read from other agent • Focus of attention of this agent • See 1. Generating Attention Behaviours • If agents are the focus of each other’s attention, the Mutual Attention Mechanism (MAM) is activated
Attention Levels • Perception of attention paid by another • At instant of time • Based on orientation of body parts • Eyes, head, body, locomotion direction
Attention Levels • Direction is weighted for each segment • Eyes provide the largest contribution (Figure: cues ordered from most to least attention)
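A minimal sketch of the metric, assuming per-segment unit direction vectors; the weights and the alignment measure are illustrative assumptions, with eyes weighted highest and the cue hierarchy handling segments that are not visible:

```python
# Each visible segment's direction is compared with the direction towards
# the observer; the per-segment alignments combine into one Attention Level.
import numpy as np

WEIGHTS = {"eyes": 0.5, "head": 0.25, "body": 0.15, "locomotion": 0.10}

def attention_level(directions, towards_observer):
    """directions: segment name -> unit vector; missing segments skipped."""
    to_obs = np.asarray(towards_observer, dtype=float)
    to_obs /= np.linalg.norm(to_obs)
    score, total_weight = 0.0, 0.0
    for segment, weight in WEIGHTS.items():
        if segment not in directions:       # e.g. eyes not visible
            continue
        d = np.asarray(directions[segment], dtype=float)
        alignment = max(0.0, float(np.dot(d / np.linalg.norm(d), to_obs)))
        score += weight * alignment
        total_weight += weight
    return score / total_weight if total_weight else 0.0

# Other agent faces us with eyes and head; body is angled away.
print(attention_level({"eyes": (0, 0, 1), "head": (0, 0, 1),
                       "body": (1, 0, 1)}, towards_observer=(0, 0, 1)))
```

Renormalising by the visible weights means that when the eyes cannot be seen, head, body and locomotion directions take over in exactly the fallback order given above.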