
Control of Attention and Gaze in Natural Environments



  1. Control of Attention and Gaze in Natural Environments

  2. Selecting information from visual scenes: What controls the selection process?

  3. Fundamental Constraints: Acuity is spatially restricted, attention is limited, and visual working memory is limited. Humans must therefore select a limited subset of the available information in the environment, and only a limited amount of that information can be retained. What controls these processes?

  4. Saliency (bottom-up): Image properties, e.g., contrast, edges, and chromatic saliency, can account for some fixations when viewing images of scenes.
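To make the bottom-up idea concrete, below is a minimal sketch of a center-surround saliency map in the spirit of classic saliency models. The channels, scales, and normalization are illustrative assumptions, not a model from the talk.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def saliency_map(rgb):
        """Center-surround contrast on intensity and two crude color-opponent
        channels, summed over two spatial scales. Expects RGB in [0, 1]."""
        intensity = rgb.mean(axis=2)
        red_green = rgb[..., 0] - rgb[..., 1]
        blue_yellow = rgb[..., 2] - rgb[..., :2].mean(axis=2)
        conspicuity = np.zeros_like(intensity)
        for channel in (intensity, red_green, blue_yellow):
            for center, surround in ((1, 4), (2, 8)):
                dog = gaussian_filter(channel, center) - gaussian_filter(channel, surround)
                conspicuity += np.abs(dog)
        return conspicuity / (conspicuity.max() + 1e-9)

    # A saliency model predicts fixations at peaks of the map:
    # y, x = np.unravel_index(saliency_map(img).argmax(), img.shape[:2])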

  5. Limitations of Saliency: Important information may not be salient, e.g., a stop sign in a cluttered environment. Salient information may not be important, e.g., retinal image transients from eye and body movements. Saliency also fails to account for many observed fixations, especially in natural behavior (e.g., Land and colleagues).

  6. Need to Study Natural Behavior: Natural vision is not the same as viewing pictures. Behavioral goals determine what information is needed, and task structure (often) allows interpretation of the role of fixations.

  7. Top-down factors Viewing pictures of scenes is different from acting within scenes.

  8. Top-down factors: Viewing pictures of scenes is different from acting within scenes, which requires information for heading, obstacle avoidance, and foot placement.

  9. To what extent is the selection of information from scenes determined by cognitive goals (i.e., top-down) and how much by the stimulus itself (i.e., salient regions, bottom-up effects)?

  10. Modeling Top-Down Control: Walter the Virtual Humanoid (Sprague & Ballard, 2003). The virtual humanoid has a small library of simple visual behaviors: sidewalk following, picking up blocks, and avoiding obstacles. Each behavior uses a limited, task-relevant selection of visual information from the scene, which is computationally efficient.

  11. Walter's sequence of fixations. [Figure: Walter's fixations alternate among the litter, the obstacles, and the sidewalk.] Walter learns where and when to direct gaze using a reinforcement learning algorithm, as sketched below.
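As a rough illustration of such a gaze scheduler, the sketch below assigns gaze to whichever behavior stands to lose the most reward from its growing state uncertainty. This is a hedged sketch of the arbitration idea only; the behavior names, loss rates, and noise model are assumptions, not Sprague & Ballard's implementation.

    import random

    class Behavior:
        def __init__(self, name, loss_rate):
            self.name = name
            self.uncertainty = 0.0      # grows while gaze is directed elsewhere
            self.loss_rate = loss_rate  # expected reward lost per unit uncertainty

        def expected_loss(self):
            return self.loss_rate * self.uncertainty

    behaviors = [Behavior("sidewalk_following", 0.5),
                 Behavior("obstacle_avoidance", 2.0),
                 Behavior("litter_pickup", 1.0)]

    for step in range(10):
        # Gaze goes to the behavior with the largest expected cost of uncertainty.
        target = max(behaviors, key=lambda b: b.expected_loss())
        for b in behaviors:
            if b is target:
                b.uncertainty = 0.0  # a fixation refreshes that behavior's state
            else:
                b.uncertainty += random.uniform(0.5, 1.0)  # noise accumulates
        print(step, target.name)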

  12. Walter the Virtual Humanoid (Sprague & Ballard, VSS 2004). What about unexpected events?

  13. Dynamic Environments

  14. Driving Simulator

  15. Gaze distribution depends on the task. [Figure: time spent fixating the intersection under the Follow and Obey Traffic Rules instructions.]

  16. The Problem: Any selective perceptual system must choose the right visual computations and when to carry them out. How do we deal with the unpredictability of the natural world? Answer: it is not all that unpredictable, and we are very good at learning it.

  17. Human Gaze Distribution when Walking • Experimental Question: How sensitive are subjects to unexpected salient events? • General Design: Subjects walked along a footpath in a virtual environment while avoiding pedestrians. Do subjects detect unexpected potential collisions?

  18. Virtual Walking Environment: Virtual Research V8 head-mounted display with 3rdTech HiBall wide-area motion tracker; V8 optics with ASL501 video-based eye tracker (left) and ASL 210 limbus tracker (right).

  19. Virtual Environment. [Figure: bird's-eye view of the virtual walking environment, with the monument visible.]

  20. Experimental Protocol • Condition 1, Normal Walking: Avoid the pedestrians while walking at a normal pace and staying on the sidewalk. • Condition 2, Follow Leader: Identical to condition 1, with the additional instruction of following a yellow pedestrian.

  21. What Happens to Gaze in Response to an Unexpected Salient Event? [Figure: pedestrians' paths, with the colliding pedestrian's path highlighted.] • The unexpected event: Pedestrians on a non-colliding path changed onto a collision course for 1 second (10% frequency); the change occurred during a saccade. Does a potential collision evoke a fixation?

  22. Fixation on Collider

  23. No Fixation During Collider Period

  24. Probability of Fixation During Collision Period. [Figure: fixation probability for controls vs. colliders in the Normal Walking and Follow Leader conditions.] More fixations on colliders in normal walking; no effect in the Follow Leader condition.

  25. Why are colliders fixated? There is only a small increase in the probability of fixating the collider, and the collider's failure to attract attention with an added task (following) suggests that detections result from top-down monitoring.

  26. Detecting a Collider Changes Fixation Strategy. [Figure: time fixating normal pedestrians after a detection ("hit") vs. a "miss," in the Normal Walking and Follow Leader conditions.] Fixations on pedestrians are longer following detection of a collider.

  27. Subjects rely on active search to detect potentially hazardous events such as collisions, rather than reacting to bottom-up looming signals. To make a top-down system work, subjects need to learn the statistics of environmental events and distribute gaze and attention based on these expectations; a sketch of one such scheme follows.
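One simple way to cast learning these statistics in code is a running Beta-Bernoulli estimate of each pedestrian type's collision probability, with gaze shares set in proportion to the current estimate. The pedestrian labels and the proportional-allocation rule are illustrative assumptions, not the authors' model.

    from collections import defaultdict

    alpha = defaultdict(lambda: 1.0)  # pseudo-counts: collisions observed
    beta = defaultdict(lambda: 1.0)   # pseudo-counts: safe encounters

    def observe(ped, collided):
        if collided:
            alpha[ped] += 1
        else:
            beta[ped] += 1

    def collision_belief(ped):
        # Posterior mean of the collision probability for this pedestrian type.
        return alpha[ped] / (alpha[ped] + beta[ped])

    for ped, hit in [("rogue", True), ("rogue", True), ("safe", False),
                     ("safe", False), ("unpredictable", True),
                     ("unpredictable", False)]:
        observe(ped, hit)

    peds = set(alpha) | set(beta)
    total = sum(collision_belief(p) for p in peds)
    gaze_share = {p: collision_belief(p) / total for p in peds}
    print(gaze_share)  # the rogue earns the largest share, the safe the smallest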

  28. Possible reservations: Perhaps the looming robots were not similar enough to real pedestrians to evoke a bottom-up response.

  29. Walking - Real World • Experimental question: Do subjects learn to deploy gaze in response to the probability of environmental events? • General design: Subjects walked on an oval path and avoided pedestrians.

  30. Experimental Setup. [Photo: a subject wearing the ASL Mobile Eye.] System components: head-mounted optics (76 g), color scene camera, modified DVCR recorder, Eye Vision Software, and a PC with a 2.8 GHz Pentium 4 processor.

  31. Experimental Design (ctd) • Occasionally some pedestrians veered onto a collision course with the subject (for approx. 1 second). • Three types of pedestrians in Trial 1: Rogue (always collides), Safe (never collides), and Unpredictable (collides 50% of the time). • In Trial 2, the Rogue and Safe pedestrians swap roles; the Unpredictable pedestrian remains the same.

  32. Fixation on Collider

  33. Effect of Collision Probability • The probability of fixating a pedestrian increased with higher collision probability.

  34. Detecting Collisions: Proactive or Reactive? • The probability of fixating the risky pedestrian is similar whether or not he or she actually collides on that trial.

  35. Learning to Adjust Gaze • Changes in fixation behavior are fairly fast, occurring over 4-5 encounters (fixations on the Rogue get longer, on the Safe shorter).

  36. Shorter Latencies for Rogue Fixations • Rogues are fixated earlier after they appear in the field of view; this change is also rapid.

  37. Effect of Behavioral Relevance: Fixations on all pedestrians go down when pedestrians STOP instead of COLLIDING, even though stopping and colliding should have comparable salience. Note that the Safe pedestrians behave identically in both conditions; only the Rogue changes behavior.

  38. Fixation probability increases with the probability of a collision. • Fixation probability is similar whether or not the pedestrian collides on that encounter. • Changes in fixation behavior are fairly rapid (fixations on the Rogue get longer and earlier; fixations on the Safe get shorter and later).

  39. Our Experiment: Allocation of gaze when driving. Does the task affect gaze allocation and the ability to detect unexpected events? Subjects drive along a street with other cars and pedestrians under two instructions, drive normally or follow a lead car, and fixation patterns are measured in the two conditions.

  40. [Figure: avatar path vs. human path.] Reward weights were estimated from human behavior using inverse reinforcement learning (Rothkopf, 2008); a sketch of the general technique follows.
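For readers unfamiliar with the technique, below is a minimal sketch of linear-reward inverse reinforcement learning by feature matching. It is a generic stand-in under assumed features and planner, not Rothkopf's published algorithm.

    import numpy as np

    def feature_counts(trajectory, features):
        # Sum the feature vector f(s) over all states along a path.
        return np.sum([features(s) for s in trajectory], axis=0)

    def irl_step(w, human_traj, avatar_traj, features, lr=0.1):
        # Nudge the reward weights w so that the reward r(s) = w . f(s)
        # favors the human's feature statistics over the avatar's.
        grad = feature_counts(human_traj, features) - feature_counts(avatar_traj, features)
        return w + lr * grad

    # Usage: alternate (1) planning an avatar path under r(s) = w . f(s)
    # and (2) calling irl_step, until the two paths' feature counts agree.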

  41. Conclusions: Subjects must learn the probabilistic structure of the world and allocate gaze accordingly; that is, gaze control is model-based. Subjects behave very similarly despite the unconstrained environment and absence of instructions. Control of gaze is proactive rather than reactive, and anticipatory use of gaze is probably necessary for much visually guided behavior.

  42. Behaviors Compete for Gaze/Attentional Resources: The probability of fixation is lower for both Safe and Rogue pedestrians in the Leader conditions than in the baseline condition. Note that all pedestrians are allocated fewer fixations, even the Safe ones.

  43. Conclusions: The data are consistent with task-driven sampling of visual information rather than bottom-up capture of attention: there was no effect of increased salience of the collision event, and colliders failed to attract gaze in the leader condition, suggesting that the extra task interferes with detection. Observers rapidly learn to deploy visual attention based on environmental probabilities; such learning is necessary in order to deploy gaze and attention effectively.
