Towards neuro-robotic models of internal simulation of perception Tom Ziemke Department of Computer Science University of Skövde, Sweden tom@ida.his.se
Collaborators • University of Lund (neurophysiology) • Germund Hesslow • Dan-Anders Jirenhed (experiments) • University of Skövde (comp/cog sci) • Daniel Hjelm • Henrik Svensson
Inner Worlds • Introspection tells us that we can ‘have’ sensory experiences in the absence of external stimuli • Lee & Thompson (1982) investigated this experimentally with subjects who • were allowed to look at their surroundings first, • then were asked to, e.g., walk blindly to a certain location or throw objects at targets in the room • Humans seem to have an ‘inner world’ that allows them to anticipate/simulate sensory experiences and the consequences of actions
Simulation Hypothesis (Hesslow, 1994, 2002) • The organism can simulate long chains of actions and perceptions without any external input.
Experiments on simulation of perception • Aim: a minimal model • Evolution of a neural network controller for a (simulated) Khepera robot which should internally simulate its perception of the world, such that it can • ‘see’ the world without access to sensor input (analogous to sensory imagery) • use its simulated perception to act viably in the world (analogous to the behaviour of subjects in Lee and Thompson, 1982) • Two sets of experiments will be presented
The Basic Idea • [Figure: control architecture shown at time step t and time step t+1, with units labelled sensors, context, motors and predicted sensors, and feedback connections between the time steps]
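The loop suggested by the figure labels can be sketched roughly as follows. This is only an illustration under stated assumptions: the single-layer sigmoid network, the treatment of context as extra output units copied back as input, and all names (`make_controller`, `step`, `run`, `blind_from`) are hypothetical and not taken from the slides.

```python
import math
import random

def make_controller(n_sensors, n_context, n_motors, seed=0):
    """Random single-layer weights (hypothetical stand-in for an evolved controller)."""
    rng = random.Random(seed)
    n_in = n_sensors + n_context
    n_out = n_motors + n_sensors + n_context
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]

def step(weights, sensors, context, n_motors):
    """One time step: (sensors, context) -> (motors, predicted sensors, new context)."""
    inputs = list(sensors) + list(context)
    out = [1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, inputs))))
           for row in weights]
    n_s = len(sensors)
    return out[:n_motors], out[n_motors:n_motors + n_s], out[n_motors + n_s:]

def run(weights, read_sensors, n_steps, n_sensors, n_context, n_motors, blind_from):
    """Use real sensors before step `blind_from`; afterwards feed predictions back."""
    sensors = [0.0] * n_sensors
    context = [0.0] * n_context
    motor_trace = []
    for t in range(n_steps):
        if t < blind_from:
            sensors = read_sensors(t)          # real sensor input
        motors, predicted, context = step(weights, sensors, context, n_motors)
        motor_trace.append(motors)
        if t + 1 >= blind_from:
            sensors = predicted                # feedback: prediction becomes next input
    return motor_trace
```

With `blind_from` set to the total number of steps the controller always sees real sensor values; with a smaller value it runs ‘blindfolded’ on its own predictions from that step on.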
Related Experimental Work • Meeden et al. (1993), Meeden (1996) • showed that additional sensor prediction learning had a positive effect on behavior learning • but did not analyze or use those sensory predictions • Chrisley (1992, 1993, 1995) • Connectionist Navigational Map • Tani & Nolfi (1999) • multilevel architecture consisting of several networks • Gross et al. (1999) • showed that anticipation improved behavior • complex neuroscience-inspired architecture
Experimental setup • simulated robot in two environments • using Meeden’s NN architecture • task: collision-free motion • trained using a genetic algorithm • experiments • behavior only • behavior and prediction • ‘blindfolded’ behavior • [Figure: robot diagram with labels 1–8, L and R]
Experiment 1 - Results • no sensor prediction • standard GA: 8 bits per weight, 150 individuals, 30 selected, mutation only, fitness function rewards straight motion, each individual evaluated for 2 × 500 time steps • correct behavior evolves fairly quickly in both worlds
Experiment 2 • additional sensor prediction, one time step ahead • similar GA, but with two-stage fitness selection: • first selected 60 best behaving individuals (of 150) • then the 30 best predicting individuals (of above 60) • tried two prediction fitness functions, weighting sensors differently, but with similar results
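The two-stage selection described on this slide could look roughly like the sketch below. It assumes that higher fitness values are better and that each individual can be scored independently; the function and parameter names are hypothetical.

```python
def two_stage_select(population, behaviour_fitness, prediction_fitness,
                     n_behaviour=60, n_final=30):
    """Keep the n_behaviour best-behaving individuals, then the n_final best predictors among them."""
    # Stage 1: rank by behaviour fitness (assumed: higher is better).
    best_behaving = sorted(population, key=behaviour_fitness, reverse=True)[:n_behaviour]
    # Stage 2: rank the survivors by prediction fitness (assumed: higher is better).
    return sorted(best_behaving, key=prediction_fitness, reverse=True)[:n_final]
```

One consequence of this ordering is that prediction quality only matters among individuals that already behave well, so a good predictor with poor behaviour is never selected.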
Experiment 2 - Results • Prediction captured most frequently active sensors • [Figure: real sensors vs. sensor prediction; labels S and F]
Experiment 3 • robots from experiment 2 were ‘blindfolded’, and tested for their capacity for acting based on their predictions • robots often fail to predict the relevant dynamics • [Figure: labels S and 1–11]
New experimental setup • simpler environment • long-range sensor • detecting rods (red), not walls (black) • more continuously changing sensory values
Long-range rod sensor • Only detects ‘rods’ – no walls or any other obstacles. • Similar to a one-dimensional line camera, but sensitive to distance as well as angle. • activation between 0 at maximum range and 1 at zero distance • Long sensor range gives continuously changing sensor activations.
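The slide fixes only the two end points of the activation (0 at maximum range, 1 at zero distance). A minimal sketch, assuming a linear fall-off in between and using the 310 mm maximum distance listed under the experimental details, might be:

```python
def rod_activation(distance_mm, max_range_mm=310.0):
    """Activation in [0, 1]: 1 at zero distance, 0 at or beyond maximum range.

    The linear interpolation is an assumption; the slides give only the end points.
    """
    if distance_mm < 0:
        raise ValueError("distance must be non-negative")
    if distance_mm >= max_range_mm:
        return 0.0
    return 1.0 - distance_mm / max_range_mm
```

The angular sensitivity mentioned on the slide is not modelled here, since the slides do not say how angle is mapped onto the input units.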
Modular architecture • Two modules • motor control / behavior • sensor prediction • Modules were trained separately • first behaviour • then prediction • [Figure: modules connecting sensor activation, motor output and sensor prediction]
Training • Same fitness function, rewarding collision-free motion, is used for evolution of both the behaviour module and the prediction module • sensor prediction quality (fitness) is now measured only indirectly, observing its effect on overt behaviour during simulated perception • i.e. instead of training robots to predict well (as in the previous experiments), we train them to behave well when blindfolded
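The two-phase scheme (evolve behaviour first, then evolve prediction with the same task fitness while the robot is blindfolded) can be illustrated with a toy example. Everything here is a hypothetical stand-in: `task_fitness` replaces the collision-free-motion score with a one-weight toy problem, and the GA sizes are illustrative rather than the values used in the experiments.

```python
import random

def evolve(fitness, genome_len, generations, pop_size=20, n_elite=4, sigma=0.5, seed=0):
    """Tiny elitist GA with mutation only (illustrative sizes)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(genome_len)] for _ in range(pop_size)]
    for _ in range(generations):
        elite = sorted(pop, key=fitness, reverse=True)[:n_elite]
        children = [[w + rng.gauss(0.0, sigma) for w in rng.choice(elite)]
                    for _ in range(pop_size - n_elite)]
        pop = elite + children                 # elite survive unchanged
    return max(pop, key=fitness)

def task_fitness(behaviour, prediction, blind):
    """Toy stand-in for the task score: the ideal motor command is 1.0."""
    real_sensor = 1.0
    sensor = prediction[0] if blind else real_sensor
    command = behaviour[0] * sensor
    return -(command - 1.0) ** 2

# Phase 1: evolve the behaviour module with real sensor input.
behaviour = evolve(lambda b: task_fitness(b, None, blind=False),
                   genome_len=1, generations=50)

# Phase 2: freeze behaviour; evolve the prediction module on the SAME task
# fitness, measured while the sensor is replaced by the prediction.
prediction = evolve(lambda p: task_fitness(behaviour, p, blind=True),
                    genome_len=1, generations=50, seed=1)
```

Note that in this toy setting the prediction is never compared with the real sensor value; it is scored only through the motor command it produces, which mirrors the indirect measure described on the slide.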
Some experimental details • Evaluation of simulated perception (overt behaviour): • 10 time steps with sensor information • 290 time steps of internally simulated perception • Rod sensor input • angle: 30 degrees, maximum distance: 310 mm • Connection weights • real values between -10 and +10 • Gaussian distribution of weight mutation with variance=0.5 • 68% of mutation effects between -1.5 and +1.5 • Genetic Algorithm: • population size: 200 individuals • elitist selection procedure • 10,000+ generations
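A minimal sketch of Gaussian weight mutation with the stated weight range follows. Clipping to the range is an assumption (the slide only says weights lie between -10 and +10), and the standard deviation is left as a parameter: for a plain zero-mean Gaussian about 68% of samples fall within one standard deviation, so an interval of ±1.5 corresponds to a standard deviation of 1.5, whereas a variance of 0.5 read literally gives a standard deviation of about 0.71. The slides do not say how these two figures relate.

```python
import random

W_MIN, W_MAX = -10.0, 10.0   # weight range from the slide

def mutate(weights, sigma, rng):
    """Add zero-mean Gaussian noise to each weight and clip to [W_MIN, W_MAX] (clipping assumed)."""
    return [min(W_MAX, max(W_MIN, w + rng.gauss(0.0, sigma))) for w in weights]

def fraction_within(sigma, bound, n=100_000, seed=0):
    """Monte Carlo estimate of the share of mutation steps with magnitude <= bound."""
    rng = random.Random(seed)
    return sum(abs(rng.gauss(0.0, sigma)) <= bound for _ in range(n)) / n
```

Calling `fraction_within(1.5, 1.5)` gives a value close to 0.68, which is one way to check which standard deviation matches the stated interval.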
Some Observations • Different conditions and architectures were tested • Different behavioural strategies could be observed, in particular different ways of approaching the rods: • straight approach trajectories, or • approaching the rods at a greater angle, almost passing by before turning the corner • Only the second case led to successful internal ‘simulation’ and blind behavior
Behavior with real sensor input • [Figure: network units INPUT i1–i10, HIDDEN h1–h5, OUT LM and RM]
Real vs. ‘simulated’ sensors • [Figure: network units INPUT i1–i10, HIDDEN h1–h5, OUT LM and RM]
(Un)successful behavior / ‘simulation’? • [Figure: network units INPUT i1–i10, HIDDEN h1–h5, OUT LM and RM]
Results / Observations • ‘Prediction’ / ‘simulation’ is sufficiently developed to control ‘blind’ overt behaviour • in a simple environment • Simulated sensory input looks pretty different from the real sensor input • which is not what we would have expected from a ‘simulation’, • but it generates practically the same motor output and thus behaves correctly (which it was trained on), and • there are more similarities at the hidden node level
Summary • In the first set of experiments we trained robots to predict sensory values accurately • blind behavior based on internal simulation failed • In the second set of experiments we trained robots to behave correctly with and without sensory input • blind behavior succeeded • but the ‘simulation’ doesn’t look like the expected simulation of the sensory inputs • but perhaps that expectation was naïve anyway? • and, after all, in humans we would not expect a simulated visual stimulus to occur in the eye …
Open Questions & Future Work • Maybe our minimal models are too minimal … • Evaluation of alternative actions / sequences • Simulation at different levels of abstraction • e.g. using an architecture like Tani & Nolfi’s (1999) • How do you pick the right level, or adjust it? • More complete system integrating different systems of simulation / anticipation • cf. Grush’s (BBS) emulation theory of representation • Exactly what is the role of simulation / emulation? • ‘representation’ or thought?
Tani & Nolfi (1999) • A candidate architecture for simulation at multiple levels of abstraction? • [Figure: two levels, each drawn with RNN 1–RNN 5. One level takes the sensor and motor state and predicts the sensor and motor state at the next time step; the other predicts the active RNN at the lower level.]