When an autonomous embodied system, with a difficult animal-like mission in a difficult environment, has a sufficiently high level of intelligence (i.e. is able to achieve that mission well), then it may exhibit consciousness, either as a necessary component for achieving the mission, or as a by-product. Holland and Goodman – Caltech – Banbury 2001
A Simple Robot
• The Khepera miniature robot (5.5 cm in diameter)
• Features:
• 8 IR sensors which allow it to detect objects
• Two independently controlled motors
Webots – Khepera Embodied Simulator
Simulators allow faster operation than real robots – particularly if learning is involved. Simulator complexity is acceptable for a simple robot like the Khepera, but for more complex robots the simulator may become too complex, or may not simulate the real world accurately.
A Generic Robot Controller Architecture
[Diagram: a recurrent neural state machine – INPUT UNITS, HIDDEN UNITS, STATE UNITS, and OUTPUT UNITS. Sensory inputs, including feedback from motors and effectors, feed the input units; the output units drive the motors and effectors.]
• The controller of the robot is an artificial neural network with recurrent feedback, capable of forming internal representations of sensory information in the form of a neural state machine.
• Sensory inputs (vision, sound, smell, etc.) from sensors are fed to this structure.
• Sensory inputs also include feedback from the motors and effectors.
• Controller outputs drive the locomotion and manipulators of the robot.
• The neural controller learns to perform a task, using neural network and genetic algorithm techniques.
• But – the internal model of the controller is implicit and therefore hidden from us.
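As a concrete illustration, here is a minimal sketch of such a recurrent controller step in Python/NumPy. The layer sizes, weight initialization, and tanh nonlinearity are our illustrative assumptions, not details from the slides; in the actual work the weights would be found by learning (e.g. a genetic algorithm), not set by hand.

```python
import numpy as np

class RecurrentController:
    """Minimal Elman-style neural state machine (illustrative sizes)."""
    def __init__(self, n_in=10, n_hidden=16, n_out=2, seed=0):
        rng = np.random.default_rng(seed)
        # Hidden layer sees the current sensory input plus the previous state units.
        self.W_in = rng.normal(0, 0.1, (n_hidden, n_in + n_hidden))
        self.W_out = rng.normal(0, 0.1, (n_out, n_hidden))
        self.state = np.zeros(n_hidden)      # recurrent STATE units

    def step(self, sensory):
        """One control tick: sensory vector in, motor command out."""
        x = np.concatenate([sensory, self.state])
        hidden = np.tanh(self.W_in @ x)      # HIDDEN units
        self.state = hidden                  # copied back as next tick's state
        return np.tanh(self.W_out @ hidden)  # OUTPUT units -> motor drives

# Example tick: 8 IR readings + 2 motor-feedback values in, 2 motor drives out.
ctrl = RecurrentController()
motors = ctrl.step(np.zeros(10))
```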
Understanding the Internal Model
[Diagram: the forward controller as before, plus a second recurrent neural machine – the INVERSE – which observes the controller's state units and motor/effector drive outputs, and whose outputs lie in the same sensory space as the forward controller's inputs.]
• Introduce a second recurrent neural network, separate from the first system, which learns the inverse relationship between the internal activity of the controller and the sensory input space.
• This mechanism will allow us to represent the hidden internal state of the controller in terms of the sensory inputs that correspond to that state.
• Thus we may claim to know something of "what the robot is thinking".
• We assume that the controller is learned first, and that, once this is learned and reasonably stable, the inverse can be learned.
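How the inverse could be fitted is sketched below, under our own simplifying assumption of a linear read-out trained by least squares on (state, sensory input) pairs logged while the forward controller runs; the architecture described on the slide uses a second recurrent network, which this stands in for.

```python
import numpy as np

# Log (controller state, sensory input) pairs while the forward controller runs.
states, sensories = [], []
ctrl = RecurrentController()            # from the previous sketch
for _ in range(500):
    sensory = np.random.rand(10)        # stand-in for real sensor readings
    ctrl.step(sensory)
    states.append(ctrl.state.copy())
    sensories.append(sensory)

S = np.array(states)                    # (T, n_hidden)
Y = np.array(sensories)                 # (T, n_in)

# Least-squares inverse: reconstruct the sensory input from the hidden state.
W_inv, *_ = np.linalg.lstsq(S, Y, rcond=None)

def read_mind(state):
    """Project a hidden state back into sensory space -- what the robot 'sees'."""
    return state @ W_inv

print(read_mind(ctrl.state))            # the robot's current "mental image"
```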
Simplified Inverse
• In this experiment we use a controller model that is much less powerful than the recurrent controllers described above, but it allows us to illustrate the principle, and in particular makes "inversion" of the forward controller extremely simple.
• The crucial simplification we make is that the controller learns its representation directly in the input space. Thus there is no inverse to learn – the internal representation learned by the robot is directly visible as an input-space vector.
• The first phase is to learn or program the forward model, or robot controller. In this simple experiment we program in a simple reactive wall-following behavior, rather than learning a complex behavior. The robot starts with no internal model, and adaptively learns its internal representation in an unsupervised manner as it performs its wall-following behavior.
The Learning Algorithm (based on the Linaker and Niklasson 2000 ARAVQ algorithm)
• A 10-dimensional feature space is formed from the 8 Khepera IR sensor signals plus the 2 motor drive signals.
• Clusters feature vectors by change detection, to form prototype feature-vector "models".
• Unsupervised.
• Adds new models based on two criteria:
• Novelty: large distance from existing models
• Stability: low variance in the buffered history of features
• Adapts existing models over time.
• We program in a simple "wall following" behavior to act as a "teacher" (a sketch of the algorithm follows below).
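The following is a minimal sketch of the model-addition and adaptation rules listed above, written in Python. The thresholds, buffer length, and adaptation rate are illustrative assumptions; consult Linaker and Niklasson (2000) for the actual ARAVQ algorithm and its parameter settings.

```python
import numpy as np
from collections import deque

class ARAVQSketch:
    """Illustrative adaptive, resource-allocating vector quantizer."""
    def __init__(self, novelty_thresh=0.3, stability_thresh=0.05,
                 buffer_len=20, learn_rate=0.05):
        self.models = []                       # prototype feature-vector "models"
        self.buffer = deque(maxlen=buffer_len) # buffered history of features
        self.novelty_thresh = novelty_thresh
        self.stability_thresh = stability_thresh
        self.learn_rate = learn_rate

    def observe(self, feature):
        """feature: 10-d vector of 8 IR readings + 2 motor drive signals."""
        self.buffer.append(np.asarray(feature, float))
        if len(self.buffer) < self.buffer.maxlen:
            return
        mean = np.mean(self.buffer, axis=0)
        # Stability: low variance in the buffered history of features.
        stable = np.mean(np.var(self.buffer, axis=0)) < self.stability_thresh
        if not self.models:
            if stable:
                self.models.append(mean)
            return
        dists = [np.linalg.norm(mean - m) for m in self.models]
        nearest = int(np.argmin(dists))
        # Novelty: large distance from all existing models.
        if stable and dists[nearest] > self.novelty_thresh:
            self.models.append(mean)           # allocate a new model
        else:
            # Adapt the winning model toward the current input.
            self.models[nearest] += self.learn_rate * (mean - self.models[nearest])
```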
Learning in Action
Colors show learned concepts:
• Black – right wall
• Blue – ahead wall
• Green – 45-degree right wall
• Red – corridor
• Light blue – outside corner
Running with the Model
• Switch off the wall follower.
• The robot "sees" features as it moves.
• Choose the closest learned model vector at each tick.
• Use the model vector's motor drive values to actually drive the motors (see the sketch below).
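A sketch of this nearest-model control loop, assuming the ARAVQSketch prototypes from above and hypothetical read_sensors/set_motors robot-interface functions:

```python
import numpy as np

def run_with_model(models, read_sensors, set_motors, ticks=1000):
    """Drive the robot from its learned prototypes instead of the wall follower.

    models: list of 10-d prototypes (8 IR values + 2 motor drives).
    read_sensors / set_motors: hypothetical robot-interface callables.
    """
    motors = np.zeros(2)
    for _ in range(ticks):
        feature = np.concatenate([read_sensors(), motors])  # current 10-d feature
        # Pick the closest learned model vector at this tick.
        best = min(models, key=lambda m: np.linalg.norm(feature - m))
        motors = best[8:]          # the model's stored motor drive values
        set_motors(motors)         # ...actually drive the motors with them
```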
Running with the Model
Color indicates which is the current "best" model feature.
Invert the motor signals back to sensory signals to infer an egocentric “map” of the environment as “seen” by the robot.
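One way to realize this, sketched entirely under our own assumptions: dead-reckon a pose trajectory from the motor half of each winning model (using a differential-drive model with an approximate Khepera axle length), and attach each model's stored IR readings to the pose at which it won.

```python
import numpy as np

def egocentric_map(winning_models, dt=0.1, wheel_base=0.053):
    """Dead-reckon poses from the motor half of each winning model.

    winning_models: sequence of 10-d prototypes (8 IR + 2 wheel speeds),
    one per tick. wheel_base: approximate Khepera axle length in meters.
    Returns (pose, expected IR readings) pairs -- the robot's egocentric
    picture of what it expects to sense along its path.
    """
    x = y = theta = 0.0
    mapped = []
    for m in winning_models:
        ir, (v_left, v_right) = m[:8], m[8:]
        v = (v_left + v_right) / 2.0          # forward speed
        w = (v_right - v_left) / wheel_base   # turn rate
        x += v * np.cos(theta) * dt
        y += v * np.sin(theta) * dt
        theta += w * dt
        mapped.append(((x, y, theta), ir))    # expected sensations at this pose
    return mapped
```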
Keeping it Real
• Mapping with the real robot
Manipulating the Model "Mentally" to Make a Decision – "Planning"
• Take the sequence of learned model feature vectors and cluster sub-sequences into higher-level concepts.
• For example:
• Blue-Green-Black = left corner
• Red = corridor
• Black = right wall
• At any instant, ask the robot to go "home".
• Run the model forwards mentally to decide whether it is shorter to go ahead or to go back.
• Take the appropriate action (see the sketch below).
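A sketch of the mental run, under our own assumptions: the learned route is a loop of higher-level concepts with "home" among them, and running the model forward is simulated here by walking the stored sequence in each direction and comparing path lengths. The concept labels come from the slide; everything else is illustrative.

```python
def choose_direction(route, position, home="left corner"):
    """Mentally run the learned route both ways; pick the shorter path home.

    route: the loop of higher-level concepts clustered from model
    sub-sequences, e.g. ["corridor", "left corner", "right wall", ...].
    position: index of the robot's current concept on the route.
    Returns "ahead" or "back". Assumes home actually appears on the route.
    """
    n = len(route)
    ahead = next(d for d in range(1, n + 1) if route[(position + d) % n] == home)
    back = next(d for d in range(1, n + 1) if route[(position - d) % n] == home)
    return "ahead" if ahead <= back else "back"

route = ["corridor", "left corner", "right wall", "corridor", "right wall"]
print(choose_direction(route, position=3))  # -> "back" (home is 2 steps behind)
```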
Decision Time
• Corridor corner is home.
• Rotate = home is behind me.
• Flash LEDs = home is ahead of me.
Inverse Predictor Architecture
[Diagram: the CONTROLLER receives sense signals through a switch that selects between real-world sense signals and model-world sense signals; its motor signals go both to the real robot and to the INVERSE, whose output – model-world sense signals – can be fed back through the switch.]
• We now allow the inverse to be fed back into the controller via the switch.
• Thus the controller has an image of its internal hidden state, or "self", in the same feature space as its real sensory inputs.
• Thus it can "see" what it "itself" is thinking.
• As before, "we" can also observe what the machine is "thinking".
Consequences of the Architecture
• In "normal" mode the controller produces motor signals based on the sensory input it "sees" (including motor/effector feedback), and normally we expect it to see what we are seeing. The inverse allows a mismatch between a predicted and an actual sensory input to be detected – thus indicating a novel experience, which in turn could focus attention and learning in the main controller. Noisy, ambiguous, and partial inputs can be "completed".
• In "thinking" or "planning" mode the real world is disconnected from the controller input, and the mental images output by the inverse are input to the controller instead. Thus sequences of planned action towards a goal can take place in mental space, then be executed as action. Note that by switching between normal mode and "thinking" mode in some way, we can emulate the robot doing both reactive control and thinking at the same (multiplexed, really) time – as humans do when driving a car on "automatic" while "thinking" of something else.
• In "sleeping" mode we shut off the sensory input and feed in noise. The inverse will then output "mental images", which can themselves be fed back into the input (because they have the same representation), producing a complex series of "imagined" mental images, or "dreams". Note that we can use this "sleeping" mode to actually learn (or at least update) the inverse: the input noise vector is a "sensory input" vector like any other (whether it is structured accordingly or not), so the inverse should be able to output this vector, like any other, from the state and motor signals, and we can use the error to update the inverse.
• If we do not disconnect the motors during "dreaming", we will get "sleepwalking" or "twitching". If we assume that the controller is continually learning, then the inverse must be continually updated; if the two get too far out of synchronization we could get irrational sequences in "thinking", or worse in execution mode – an analog of "madness". (A sketch of the three modes follows below.)
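A sketch of the three modes as a single control loop, assuming the controller and inverse objects from the earlier sketches. The mode names come from the slide; the loop structure, threshold, and interface names are our illustration.

```python
import numpy as np

def run(ctrl, read_mind, read_sensors, set_motors, mode="normal", ticks=100):
    """One loop over the inverse-predictor architecture's three modes.

    ctrl: forward controller with .step() and .state (earlier sketch).
    read_mind: the inverse -- maps ctrl.state back into sensory space.
    read_sensors / set_motors: hypothetical robot-interface callables.
    """
    mental = np.zeros(10)                     # the inverse's last "mental image"
    for _ in range(ticks):
        if mode == "normal":
            sensory = read_sensors()          # switch set to the real world
            # Prediction/reality mismatch flags a novel experience...
            if np.linalg.norm(mental - sensory) > 0.5:
                pass                          # ...could focus attention/learning
        elif mode == "thinking":
            sensory = mental                  # switch set to the model world
        else:                                 # "sleeping"
            sensory = np.random.rand(10)      # sensory input replaced by noise
        motors = ctrl.step(sensory)
        mental = read_mind(ctrl.state)        # what the robot is "thinking"
        if mode == "normal":
            set_motors(motors)                # motors disconnected while dreaming
```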
Where's the Consciousness?
• Not there yet:
• More complex robots
• More complex environments
• More complex architecture
[Image: the Sony Dream Robot – head: 2 degrees of freedom; body: 2 degrees of freedom; arms: 4 degrees of freedom (x2); legs: 6 degrees of freedom (x2); total of 24 degrees of freedom.]
Increasing Complexity
Environment:
• Fixed environment
• Moving objects
• Movable objects
• Objects with different values
• Other agents – prey
• Other agents – predators
• Other agents – competitors
• Other agents – collaborators
• Other agents – mates
• Etc.
Agent:
• Movable body
• More sensors
• Effectors
• Articulated body
• Metabolic state
• Acquired skills
• Tools
• Imitative learning
• Language
• Etc.
Multi-stage Planning
At each step:
• What actions could it take?
• What actions should it take?
• What actions would it take?
The planning system needs:
• A good and current model of the world
• A good and current model of the agent's abilities, expressible in terms of their effects on the model world
• An associated executive system to use the information generated by the planning system
(A sketch of one planning step follows below.)
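A sketch of one step of such a planner, under our own assumptions: a model world exposing simulated effects, a self model listing available actions (could), a value function scoring predicted outcomes (should), and a selection rule (would). All interface names are illustrative, not the slides' API.

```python
def plan_step(world_model, self_model, state, value):
    """One could/should/would step over a model world.

    world_model.simulate(state, action) -> predicted next state
    self_model.actions(state) -> actions the agent is able to take
    value(state) -> how desirable a predicted state is
    All three interfaces are illustrative assumptions.
    """
    could = self_model.actions(state)                  # what could it take?
    scored = [(value(world_model.simulate(state, a)), a) for a in could]
    should = max(v for v, _ in scored)                 # what should it take?
    would = next(a for v, a in scored if v == should)  # what would it take?
    return would
```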
A Framework?
[Diagram: the planning system updates a Self Model and an Environment Model; its output goes to the executive.]
Speculation…
There may be something it is like to be such a self-model, linked to such a world model, in a robot with a mission.