Attention, Awareness, and the Computational Theory of Surprise

Attention, Awareness, and the Computational Theory of Surprise Research Qualifying Exam August 30th, 2006

Outline • Introduction • Background • Case Study • Problem Definition • Approach • Results & Work to Date • Future Work

Introduction: Intelligent Machines • Office • Defense • Exploration • Home

Introduction: Autonomy • Aspects of Autonomy • Sensing • Ability to sense the environment • Processing • Ability to make decisions about the sensed environment • Mobility • Ability to move about the sensed environment

Introduction • Where are we now? • Sensors • cheaper • more reliable • more accurate • Data Association • engineered solutions for specific problems in a given environment • few solutions for unforeseen problems in potentially changing environments

Introduction • QUESTION: What principles from biological systems can we borrow to handle unforeseen problems in dynamic settings?

Background: Attention • Definition • 1a: the act or state of attending especially through applying the mind to an object of sense or thought • 1b: a condition of readiness for such attention involving especially a selective narrowing or focusing of consciousness and receptivity –(Merriam-Webster Dictionary)

Background: Theories of Attention • Feature Integration Theory of Attention • Treisman (1980) • “Features are registered early, automatically, and in parallel across the visual field, while objects are identified separately and only at a later stage, which requires focused attention.” • tested human subjects, measuring time response of visual attention to cues on screens • Premotor Attention Theory • Rizzolatti (1987) • The idea of attention is directly linked with the same circuitry in humans used in the generation of movements or planned movements of all types.

Background: Theories of Attention More from Joel …

Background: Saliency • Definition • 3b: standing out conspicuously : PROMINENT especially: of notable significance– (Merriam-Webster Dictionary) • “[a measure of] how different a given location is from its surround in color, orientation, motion, depth, etc.” – (Koch & Ullman, 1985)

Background: Attention & Saliency • Koch & Ullman (1985) • first proposed the idea of a “saliency map” drawing from research in the neurobiology field • define the “winner take all” network approach and the “inhibition of return”
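The two mechanisms named on this slide can be sketched in a few lines. This is a minimal illustration, not Koch & Ullman's network: the grid values, the function name `scan_saliency`, and the disc-shaped inhibition region are all assumptions made for the example.

```python
def scan_saliency(saliency, n_fixations=3, inhibit_radius=1):
    """Return attended (row, col) cells in decreasing order of saliency."""
    s = [list(row) for row in saliency]  # work on a copy of the map
    cells = [(r, c) for r in range(len(s)) for c in range(len(s[0]))]
    fixations = []
    for _ in range(n_fixations):
        # Winner-take-all: the single most salient cell wins attention.
        r0, c0 = max(cells, key=lambda rc: s[rc[0]][rc[1]])
        fixations.append((r0, c0))
        # Inhibition of return: suppress the winner and its neighbourhood
        # so that attention moves on to the next most salient region.
        for r, c in cells:
            if (r - r0) ** 2 + (c - c0) ** 2 <= inhibit_radius ** 2:
                s[r][c] = float("-inf")
    return fixations


# Hypothetical 4x4 saliency map (values invented for illustration).
demo_map = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.0],
    [0.1, 0.3, 0.1, 0.7],
    [0.0, 0.0, 0.6, 0.8],
]
fixations = scan_saliency(demo_map, n_fixations=2, inhibit_radius=1)
```

With this map the first fixation lands on the 0.9 peak and the second on the 0.8 peak, because the neighbours of the first winner have been inhibited.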

Background: Attention & Saliency • Itti & Koch (2000) • Show that a “saliency map” can shift machine attention to the most salient areas of visual scenes, in comparison with human test subjects • Apply the Feature-Integration Theory of Attention in building “feature maps” in parallel • Attention is distributed in decreasing order of saliency

Background: Attention & Saliency • Frintrop (2000) • Shows that the concept of a saliency map is not limited to visual sensors but can also be applied to range sensors • Uses a 3D laser range sensor to extract a “range” image and an “intensity” image • Further extracts orientation and intensity feature maps from each image and fuses them to form a saliency map

Background: Attention & Saliency • Questions: • Can we formulate these concepts of attention and awareness into a mathematical framework? • Can we make some connection between attention and control theory?

Case Study • Question: Can these concepts of attention and awareness be incorporated into autonomous robots to sense changes in a known environment in the context of mapping?

Case Study: SLAM • Dynamic Environments • Simultaneous localization and mapping (SLAM) in dynamic environments has been the focus of recent research: • Fox et al. (1999) • Approach is to filter moving objects out and only apply SLAM to the static environment map • Entropy Filter: very closely related to Baldi’s definition of surprise but uses it only to remove data contributing to positive changes in entropy • Distance Filter: discards sensor measurements whose probability of being shorter than expected exceeds some threshold
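The distance filter described above can be sketched as follows. This is a simplified illustration, not Fox et al.'s implementation: it assumes a single known robot pose (so one expected range per beam) and Gaussian range noise, and the names `prob_shorter_than_expected`, `distance_filter`, `sigma`, and `threshold` are hypothetical.

```python
import math


def prob_shorter_than_expected(measured, expected, sigma):
    """P(reading is shorter than expected), assuming Gaussian range noise.

    Equals 1 - Phi((measured - expected) / sigma): the probability mass of
    the expected-range model that lies beyond the measured range.
    """
    z = (measured - expected) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 - math.erf(z))


def distance_filter(measured_ranges, expected_ranges, sigma=0.1, threshold=0.99):
    """Return indices of beams that are kept (not explained by an obstacle)."""
    kept = []
    for i, (d, e) in enumerate(zip(measured_ranges, expected_ranges)):
        if prob_shorter_than_expected(d, e, sigma) <= threshold:
            kept.append(i)
    return kept


# Hypothetical scan: beam 1 is cut short by an unmodeled object.
measured = [4.02, 2.00, 3.95, 1.00]
expected = [4.00, 4.00, 4.00, 1.05]
kept = distance_filter(measured, expected)
```

Beam 1 reads 2.0 m where the map predicts 4.0 m, so it is almost certainly "too short" and is discarded; the other beams are within sensor noise and are kept.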

Case Study: SLAM • Dynamic Environments • Wang et al. (2002-2003) • Separates the current map into a stationary object map (SO-map) and a moving object map (MO-map), assuming that: • Measurements can be divided into stationary and moving • Measurements of moving objects and their pose carry no information and can be filtered out • Detection of moving objects is done by observing discrepancies between scans • Derives a Bayesian framework for SLAM with detection and tracking of moving objects by building on Fox’s work

Case Study: SLAM • Static Environments • Original problem first posed by (Smith, Self, & Cheeseman, 1990), to which the solution has been shown to exist: • Particle-Filter based approach (Thrun, et al., 1998) • Presents a probabilistic framework for the SLAM problem without assumptions of probability distributions being Gaussian; uses random samples, weighted appropriately, to represent the desired posterior density functions • Kalman-Filter based approach (Dissanayake, et al., 2001) • Applies discrete Kalman Filter techniques to estimate landmark locations and robot pose; shows that all landmark locations become fully correlated and will converge to a lower bound covariance • Multi-robot Kalman-Filter based approach (Roumeliotis, 2002) • Shows that the centralized Kalman Filter estimator can be written in decentralized form, allowing processing on distributed host machines
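The particle-filter idea of representing a posterior with weighted random samples can be illustrated with a one-dimensional toy problem. This sketch covers only the weighting and resampling step, not a full SLAM system; the wall position, noise level, and function names are assumptions made for the example, and the Gaussian likelihood is just one possible sensor model.

```python
import math
import random

WALL_POSITION = 10.0  # hypothetical known landmark (metres)
RANGE_SIGMA = 0.5     # assumed range-sensor noise (metres)


def measurement_weights(particles, measured_range):
    """Weight each position hypothesis by how well it explains the range."""
    raw = [
        math.exp(-0.5 * ((measured_range - (WALL_POSITION - x)) / RANGE_SIGMA) ** 2)
        for x in particles
    ]
    total = sum(raw)
    return [w / total for w in raw]  # normalise so the weights sum to 1


def resample(particles, weights, rng):
    """Draw a new, equally weighted sample set in proportion to the weights."""
    return rng.choices(particles, weights=weights, k=len(particles))


# Prior: position hypotheses spread evenly over a 10 m corridor.
particles = [0.1 * i for i in range(101)]
weights = measurement_weights(particles, measured_range=3.0)
estimate = sum(w * x for w, x in zip(weights, particles))
resampled = resample(particles, weights, random.Random(0))
```

A range of 3.0 m to a wall at 10.0 m pulls the weighted mean toward a position of about 7.0 m. Nothing in the sample representation itself requires the posterior to be Gaussian.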

Case Study: SLAM • Problems with previous approaches • Dynamic environment SLAM • assumption that discrepancies in data are due to changes in the environment • Fox et al. filter dynamic data out and focus only on static areas of the map; inherent assumption that dynamic data is uninformative • Wang et al. assume that all measurements can be separated into either a stationary object measurement or a moving object measurement • Question • Is there a better framework for detecting dynamic changes in the environment?

Case Study: Surprise • Pierre Baldi (2002) • Definition: “... a complementary way of measuring information carried by the data is to measure the distance between the prior and the posterior. To distinguish it from Shannon's communication information, we call this notion of information the surprise information or 'surprise'”

Case Study: Surprise • Surprise • “Surprise” measures the difference between what the data are expected to say and what they actually say • An alternative to Shannon’s definition of “information”
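For reference, the "distance between the prior and the posterior" is usually taken to be the relative entropy (Kullback-Leibler divergence) over a class of models. The form below is a sketch under that assumption; the symbol \mathcal{M} for the model class is introduced here for illustration, and the direction of the divergence differs between papers.

```latex
S(D, \mathcal{M})
  = \int_{\mathcal{M}} P(M)\,\log\frac{P(M)}{P(M \mid D)}\,dM
% Bayes' rule, P(M \mid D) = P(D \mid M)\,P(M)/P(D), gives
% P(M)/P(M \mid D) = P(D)/P(D \mid M), so equivalently:
S(D, \mathcal{M})
  = \int_{\mathcal{M}} P(M)\,\log\frac{P(D)}{P(D \mid M)}\,dM
  = \log P(D) - \int_{\mathcal{M}} P(M)\,\log P(D \mid M)\,dM
```

Written this way, surprise depends on the data only through P(D) and P(D|M).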

Case Study: Surprise • Itti & Baldi (2005) • … more to add here… I haven’t read these papers yet

Surprise Example • [Figure panels: time = {0, …, tk}; time = tk+1]

Surprise Example • Obvious Question: • What are P(D) and P(D|M)? • Start by using a line-based approach to approximate the world, and assume that the associated sensor noise is Gaussian [equation not preserved]

Surprise Example • If we treat P(D|M) as the probability of the expected data given our understanding of the model of the world from t = 0, … ,tk , then P(D|M) becomes: [Figure: P(D|M)]

Surprise Example • If we treat P(D) as the probability of the most recent data measurement at time tk+1, then P(D) becomes: [Figure: P(D)]

Surprise Example • Using Baldi’s equation, surprise yields the following result with the most “surprising” part of the environment corresponding with what was expected: [Figure: S(D,M)]

Surprise • Properties • S(D,M) > 0 : new features in the environment previously not accounted for in the model • S(D,M) < 0 : modeled features of the environment changed or possibly no longer existing
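One way to read the sign convention above is as a pointwise log ratio of the two densities, log P(D) - log P(D|M), evaluated along a range beam. This is an interpretation rather than the equation used on the slides, which is not preserved here (a relative entropy integrated over models is never negative, so a signed S suggests a pointwise quantity). The Gaussian form follows the noise assumption stated earlier; the ranges, noise level, and function names are hypothetical.

```python
import math


def log_gaussian(x, mean, sigma):
    """Log of a Gaussian density, kept in the log domain to avoid underflow."""
    return -0.5 * ((x - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def pointwise_surprise(x, observed_mean, expected_mean, sigma):
    """log P(D) - log P(D|M) at range x (assumed pointwise reading of S)."""
    return log_gaussian(x, observed_mean, sigma) - log_gaussian(x, expected_mean, sigma)


EXPECTED_RANGE = 4.0  # where the model built up to tk predicts a wall
OBSERVED_RANGE = 2.5  # where the scan at tk+1 actually returns
SIGMA = 0.2           # assumed sensor noise (metres)

surprise_profile = {
    x: pointwise_surprise(x, OBSERVED_RANGE, EXPECTED_RANGE, SIGMA)
    for x in (2.5, 3.25, 4.0)
}
```

Under these assumptions the value is positive at 2.5 m (a new, unmodeled return), negative at 4.0 m (a modeled feature that is no longer observed), and zero midway between the two.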

Problem Definition • Can we formulate the concepts of attention and awareness into a mathematical framework using Baldi’s definition of surprise? • (e.g. a “surprise-saliency map” ) • Can we extend Baldi’s definition of surprise in such a way to govern the controls/actions taken by intelligent, autonomous robots? • (e.g. feedback-control using the “surprise-saliency map” )

Approach: Short Term • Dynamic Mapping • Apply Baldi’s definition of surprise to the problem of robot localization and mapping in dynamic indoor and outdoor environments • Develop a general probabilistic approach to calculating “surprise” without assuming a known form of probability density functions • Formulate results into a “surprise-saliency” map where concepts of attention and awareness taken from neurobiology can be applied (e.g. inhibition of return, winner-take-all, top-down approach, bottom-up approach, etc…)

Approach: Long Term

Work to Date: Testbed • Setup • Currently we have 4 fully functional ER1 Robots (Evolution Robotics), each equipped with laser range finders and indoor-GPS units • The interface platform used between hardware and client code is Player v1.6.5 • The robot simulator we use is a complementary interface to Player, known as Stage v2.0.0 • Peer-to-peer communication is made possible over a wireless network via the communication architecture known as “Spread”

Work to Date: Testbed • Localization Methods • Wheel Odometry • Indoor GPS • Scan matching

Work to Date: Testbed • Graphical User Interface

Future Work
