Artificial Neural Networks for Robot Control
Neural Networks 15/16
Why use ANNs for robotics?
[Figure: Sensory data (input) → ANN → Motor outputs]
All (!) we need for robot control is some method of transforming sensory input into motor output, so why use ANNs?
1. Argument from existence proof: the most successful adaptive machines we know of have some form of neural network.
• While our nets are impoverished imitations of nature, this supports the idea that networks of relatively simple units can generate adaptive behaviour over time.
• If we want to reproduce similar types of adaptivity, it might seem sensible to start from similar types of system.
2. ANNs are extremely flexible, with many ways in which the network can be modified (from changing weights to changing the entire architecture).
3. ANNs are well suited to incorporating mechanisms such as lifetime learning, potentially enabling agents to adapt to changing environments.
4. ANNs can take input from a variety of sources, including both continuous and discrete sensor readings, and similarly produce discrete or analog motor outputs.
5. Memory can easily be incorporated into the network by retaining activity over time.
6. ANNs are quite robust to noisy input data, as would be expected of sensor data from non-trivial environments.
Training procedures
The majority of the training methods we have seen are supervised learning methods, which implies the existence of a target output for each input pattern.
This is true of some robot control tasks where the desired behaviour is specified precisely, eg the Case Western cockroach, where supervised techniques were used to set parameters to produce a particular gait.
However, consider a robot navigating to a target:
• we don’t necessarily know the ‘correct’ trajectory
• the trajectory will depend on starting conditions
• the environment could be dynamic
• we need output over TIME: it is inherently difficult to do function approximation over time, and how do we know at which point we went wrong???
Also, behaviour may be reliant on sensory data from previous time-steps: ANNs for robot control are dynamical systems changing over time.
Also, gradient descent techniques require continuously differentiable functions, so the focus is on feedforward fully-connected nets (though limited recurrency is possible) and on varying continuous variables (weights).
It is not always possible to differentiate the error term.
Also consider the change in error from eg removing a node in a network: this is an all-or-nothing procedure and could be vastly disruptive. VERY bad for gradient descent.
Can use unsupervised methods like reinforcement learning, but these generally need short trials and arenas where the majority of possible inputs are available, eg robot football: shooting.
Use Evolutionary Algorithms!
Eg GAs, simulated annealing, evolutionary strategies, genetic programming, evolutionary programming etc
Good points:
• Avoid many of the problems of gradient descent, as we only need to know the network’s overall performance
• Can incorporate lifetime learning, changes of architecture etc
• Can work with a wide range of network models (as long as genetic operators can be defined): VERY useful as we do not know what types of network are good
Here we will focus on GAs (Sussex - and my - bias)
NOT a panacea for all search problem ills: they introduce a whole host of other problems which we shall explore
Not the only approach, eg hill climbing, net crawling etc
Basic GA
In a GA we have a population of genotypes which encode potential solutions to the problem.
Every generation they are all tested on the problem and assigned a score based on their success, known as their fitness.
Offspring are created for the next generation of solutions via recombination between two parents, followed by application of a mutation operator.
The probability of a genotype being picked as a parent is proportional to its fitness (or to its fitness rank within the population).
This is artificial natural selection, where traditionally recombination has been emphasised as the driving force of evolution. However, recent work has focussed on the role of mutation.
- Initialise population of N genotypes
- Evaluate initial genotype fitnesses
- Repeat until termination criteria met:
  - Repeat until N offspring placed in new population:
    - Two parents P selected probabilistically (proportional to fitness)
    - Two offspring O created through recombination of P
    - Offspring O mutated
    - Offspring O evaluated
    - Fittest offspring placed in new population
  - Replace current population with new population
There are an enormous number of variations on the canonical genetic algorithm above, many of which blur the distinction between GAs and other evolutionary algorithms.
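The loop above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in the lectures: genotypes are bit strings, recombination is one-point crossover, mutation is per-bit flipping, the termination criterion is a fixed number of generations, and the function name `evolve` and all its parameters are hypothetical.

```python
import random

def evolve(fitness, genome_len, pop_size=20, generations=50, p_mut=None, rng=None):
    """Canonical GA sketch on bit strings (fitness must be non-negative, genome_len >= 2)."""
    rng = rng or random.Random(0)
    p_mut = 1.0 / genome_len if p_mut is None else p_mut  # assumed default: ~1 flip per genotype
    # Initialise population of N genotypes and evaluate them
    pop = [[rng.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)]
    fits = [fitness(g) for g in pop]
    for _ in range(generations):  # assumed termination criterion: fixed generation count
        new_pop, new_fits = [], []
        while len(new_pop) < pop_size:
            # Two parents selected with probability proportional to fitness
            # (tiny constant avoids an all-zero weight vector)
            weights = [f + 1e-9 for f in fits]
            p1, p2 = rng.choices(pop, weights=weights, k=2)
            # Two offspring created through one-point recombination
            cut = rng.randrange(1, genome_len)
            kids = [p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]]
            # Offspring mutated by independent bit flips
            kids = [[b ^ 1 if rng.random() < p_mut else b for b in kid] for kid in kids]
            # Offspring evaluated; the fitter one joins the new population
            scored = [(fitness(kid), kid) for kid in kids]
            best_fit, best_kid = max(scored, key=lambda s: s[0])
            new_pop.append(best_kid)
            new_fits.append(best_fit)
        # Replace current population with new population
        pop, fits = new_pop, new_fits
    i = max(range(pop_size), key=fits.__getitem__)
    return pop[i], fits[i]
```

For example, `evolve(sum, genome_len=16)` runs the sketch on the toy problem of maximising the number of 1s in the string.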
Genetic operators: mutation and crossover
The genetic operators used to create offspring must do 2 things:
• There must be a significant amount of similarity between the parents and offspring (heredity), so as to allow exploitation of the current solution
• There must be variation in order for evolution to discover new solutions: exploration of nearby areas
Operators depend crucially on the solution representation used, cf binary bit-flips vs real-valued Gaussian mutations.
We also want to generate viable solutions (cf telecoms networks must be connected), so genetic operators must be matched to the problem at hand.
[Figure: bit-string examples of mutation and crossover]
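As a rough illustration of how operators depend on the representation, the sketch below shows a bit-flip mutation for binary genotypes, a Gaussian mutation for real-valued genotypes, and one-point crossover, which works for either. The function names and parameters are hypothetical and chosen only for this example.

```python
import random

def bit_flip(genotype, rate, rng):
    """Binary representation: flip each bit independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in genotype]

def gaussian_mutation(genotype, sigma, rng):
    """Real-valued representation: add zero-mean Gaussian noise to every locus."""
    return [x + rng.gauss(0.0, sigma) for x in genotype]

def one_point_crossover(parent_a, parent_b, cut):
    """Swap the tails of two equal-length parents at position `cut`."""
    return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]
```

With a low mutation rate or small `sigma`, offspring stay close to their parents (heredity) while still exploring nearby solutions (variation).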
Solution representation: Encoding schemes
Before we design our genetic operators, we must first decide how to represent problem solutions.
GAs use a string of values (binary, real, letters etc) which must encode all the parameters of the network.
Encodings split into 2 styles (though really a continuum of types): direct and indirect.
Direct Encoding
[Figure: genotype as a string of values x1 ... xn]
Direct schemes code all parameters directly, eg take the matrix of weight values (0 for no connection) and write it out as one long string.
Can work well. Grows with network size.
However, can be bad with respect to heredity, eg what if the circled bit is a ‘good’ bit of the network: how can we retain this bit without taking the other nodes?
Also can have problems if we want networks to grow and shrink …
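A minimal sketch of a direct encoding, assuming a square weight matrix for a fully specified n-node network (the helper names `encode` and `decode` are hypothetical): the matrix is written out row by row as one string, and a weight of 0 stands for no connection.

```python
def encode(weights):
    """Write an n x n weight matrix out row by row as one long genotype."""
    return [w for row in weights for w in row]

def decode(genotype, n):
    """Rebuild the n x n weight matrix from a genotype of length n * n."""
    if len(genotype) != n * n:
        raise ValueError("genotype length must be n * n")
    return [genotype[i * n:(i + 1) * n] for i in range(n)]
```

The genotype length is n squared, which is the sense in which the encoding grows with network size.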
Problems of variable length genotypes
We often want networks to have the capacity to grow and shrink, so we will have genotypes of different lengths in our search spaces.
This can cause many problems, eg if using direct encoding and basic crossover we could get a 5x5 matrix crossed with a 2x3 matrix at position 19 …
We also get problems with connections: if the above matrices are crossed at position 5, ie taking the 1st 5 values from the 2x3 matrix, these weights are now all weights to neuron 1. This is NOT what they were in the 2x3 matrix, which has no knowledge of the extra nodes of the 5x5.
While these problems are not insurmountable, they illustrate the deeper problem that children are unlike parents.
Indirect Encoding
Can avoid some of these problems by using indirect encoding schemes.
Various forms: developmental schemes (where the genotype encodes a growth process for the phenotype), cellular encoding, tree structures and various wild and wacky ideas.
Can be useful in eg getting heredity or in getting repeated structure (cf Gruau cellular encoding: could replicate network features n-fold).
Often developed task-specifically.
One problem is that while fitness is evaluated on the phenotype, movement of the population is in genotype space: an extra layer of complexity in working out eg how crossover affects the phenotype.
Some Examples
GasNet encoding scheme: to cope with the problems of variable length genotypes, and to allow nodes to keep similar properties, we used a spatial connection scheme. In this way, node x will always connect to nodes in the same region of the plane. Also, node properties are kept together so crossover can’t mess up a node.
2 styles of encoding a telecoms network:
Indirect: the genotype encoded sets of 2x2 matrices which represented eigenvalues of dynamical systems, ie sets of attractors in 2d space. Connections were sent out from attractors and attracted to others to form connectivity. Also had a self-similarity operator. Result?
Rubbish! Didn’t work very well at all. It was a chaotic dynamical system, so the smallest change of genotype could in principle change the whole network. No heredity.
Other style, semi-direct: we want to ensure a connected network, so made the basic genotype a minimum spanning tree (ensures there’s a path from every point to every other with the least number of connections). Then added in extra connections. Result? OK.
Not sure about heredity with respect to the spanning tree, but it had a nice crossover operator in general: add a connection to the child with probability 0.9 if both parents have it, and 0.4 if only one has it. By varying the probabilities we can get smaller/bigger nets.
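The connection crossover described above can be sketched as follows. This is a hedged illustration only: it assumes each parent is represented as a set of edges `(i, j)`, it covers only the extra connections (the spanning-tree part of the genotype is not modelled), and the function name is hypothetical. The 0.9 and 0.4 defaults are the probabilities quoted on the slide.

```python
import random

def connection_crossover(parent_a, parent_b, p_both=0.9, p_one=0.4, rng=None):
    """Build a child edge set from two parent edge sets.

    An edge present in both parents is inherited with probability p_both,
    an edge present in only one parent with probability p_one.
    """
    rng = rng or random.Random()
    child = set()
    for edge in sorted(parent_a | parent_b):  # sorted for reproducibility
        in_both = edge in parent_a and edge in parent_b
        if rng.random() < (p_both if in_both else p_one):
            child.add(edge)
    return child
```

Raising `p_one` tends to give bigger child networks, and lowering either probability tends to give smaller ones.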
Fitness functions
We need a fitness function to evaluate the robot’s performance: bit of a black art.
Problematic, as it is of central importance to evolutionary computation methods; if there is little chance of differentiating between good and bad solutions, the evolutionary process cannot hope to succeed. It basically defines the search space.
Ideally, there should be smooth paths in the problem space leading to the optimal solutions, but in reality this may not be possible.
Basically one should ensure that there is a gradient for evolution to follow and avoid having large local optima (though this process is generally post-hoc, ie make fitness function, population gets stuck, design new fitness function which avoids local optimum, population gets stuck again. Repeat ad nauseam till you regret ever criticising wonderful gradient-descent techniques).
Noisy fitnesses
We also often have noisy fitness, often due to solutions not being evaluated over exactly the same conditions (eg where sample sets for training are potentially huge, so fitness is evaluated over some smaller set).
Eg robot fitnesses are often highly dependent on the initial conditions.
Alternatively, the environment could be a source of noise.
Typically this noise will obscure differences between the fitnesses of neighbouring solutions, reducing the performance of the evolutionary process (although sometimes the noise can be helpful, eg to allow populations to escape from local optima, or to smooth the search space).
Selection
Related to the evaluation of fitness is the selection applied to solutions.
If there is not sufficient selective pressure to drive the evolutionary process to better parts of the landscape, much time will be spent evaluating solutions of poor fitness.
By contrast, if the selection pressure is too strong, the evolutionary process will halt at the first local optimum reached, with little chance of escaping.
This highlights the conflict between allowing exploration of the problem space and exploitation of local regions of the space.
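One common way to make selection pressure an explicit, tunable quantity is linear rank-based selection. The sketch below is an illustrative assumption rather than the scheme used in the lectures: `pressure` ranges from 1.0 (uniform selection, no pressure) to 2.0 (the worst individual is never selected), and the function name is hypothetical.

```python
def linear_rank_probabilities(fitnesses, pressure=1.5):
    """Selection probability for each individual under linear ranking.

    With ranks 0 (worst) to n-1 (best):
        p(rank) = (2 - pressure) / n + 2 * rank * (pressure - 1) / (n * (n - 1))
    """
    n = len(fitnesses)
    if n < 2 or not 1.0 <= pressure <= 2.0:
        raise ValueError("need at least 2 individuals and 1.0 <= pressure <= 2.0")
    order = sorted(range(n), key=lambda i: fitnesses[i])  # indices, worst first
    probs = [0.0] * n
    for rank, idx in enumerate(order):
        probs[idx] = (2.0 - pressure) / n + 2.0 * rank * (pressure - 1.0) / (n * (n - 1))
    return probs
```

Because only ranks matter, the probabilities are unaffected by the absolute scale of the fitness values, which gives direct control over the exploration/exploitation balance.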
Search spaces
As stated earlier, the fitness function defines the search space of the problem we are looking at.
The search space is N-dimensional, where N is the (maximum!) length of the bit string.
How we move through the search space is defined by our recombination operators. The search space can therefore be seen as a connected graph, where connected points are those that can be reached by crossover and mutation.
Also, depending on the operators, we will be more likely to get to certain destinations.
If the operators are well designed in terms of heredity, we should be able to get to all nearby areas of the space.
Fitness landscapes
Often the search space is viewed as an N+1 dimensional landscape, where the extra dimension is the fitness, eg a bit string of length 2 gives us a nice landscape.
[Figure: fitness landscape for a bit string of length 2]
Can be a useful metaphor (despite Inman’s protestations) but ONLY if you reject all cosy notions of local maxima and minima:
• Eg GasNet search space has an average of 200 dimensions: lots of places to go
• Standard mutation operator can in principle take us to any part of the space
• Also have addition/deletion of nodes: difficult to view as movement
• Also, noisy fitness: how to define maxima if fitness is a distribution???
Epistasis, ruggedness and local optimality
If fitness is dependent on a non-linear combination of the genotype loci, the genotype is said to be epistatically linked, ie individual locus fitnesses are dependent on the context of other loci values and on inter-locus interactions.
This will generally be the case for ANN robot controllers.
Epistatically linked genotypes give rise to the two major properties of fitness landscapes thought to influence search dynamics: ruggedness and local optimality.
Ruggedness is regarded as similar to fitness noise, where the direction to good solutions may be obscured by local noise.
By contrast, local optimality is typically thought of in more global terms, with landscapes containing numbers of deceptive peaks.
However, there is no rigorous distinction between the two properties.
Search space properties
• Smooth vs epistatic
• Global vs local optima
• Neutrality
• Other?
Neutrality
Recently much work has gone into analysing the neutrality of landscapes (eg RNA, nkp, evolvable hardware and some robotics), ie landscapes where one can move to points of equal fitness: moving along a neutral network.
Evolution on fitness landscapes with high levels of neutrality is characterised by periods when fitness does not increase (fitness epochs), interspersed with short periods of rapid fitness increase (epochal evolution or punctuated equilibrium).
Work on adaptive evolution on neutral landscapes has shown that populations tend to move to areas of the space which have more neutral neighbours, ie the neutral evolution of robustness.
Neutrality may be of use in escaping from (nearly) locally-optimal solutions, but in practice, in high dimensional spaces, it is quite hard to tell whether one is moving neutrally or hovering around a local optimum.
Neuron Models
Many types of neuron model are used in robotics:
• CTRNNs: based on the leaky integrator neuron model from computational neuroscience
• Spiking models: similar to the above but with a spike generated when activation reaches a threshold
• GasNet models: incorporate an abstract notion of a diffusible neuromodulator into an ANN
• Firing rate based models
• Etc …
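To make the leaky integrator idea concrete, here is a minimal sketch of one forward-Euler step of a CTRNN, assuming the commonly used form tau_i dy_i/dt = -y_i + sum_j w_ji sigma(y_j + theta_j) + I_i. The function names, the choice of Euler integration and the step size are assumptions for illustration only.

```python
import math

def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def ctrnn_step(y, weights, biases, taus, inputs, dt=0.01):
    """One forward-Euler step of a CTRNN.

    y[i]          : state of neuron i
    weights[j][i] : connection weight from neuron j to neuron i
    biases[j]     : bias (theta) of neuron j
    taus[i]       : time constant of neuron i
    inputs[i]     : external (sensory) input to neuron i
    """
    n = len(y)
    out = [sigmoid(y[j] + biases[j]) for j in range(n)]
    new_y = []
    for i in range(n):
        synaptic = sum(weights[j][i] * out[j] for j in range(n))
        dy = (-y[i] + synaptic + inputs[i]) / taus[i]  # leak + synaptic drive + input
        new_y.append(y[i] + dt * dy)
    return new_y
```

The `-y[i]` term is the leak: with no input and no connections the state decays towards zero at a rate set by the time constant, which is how activity (and hence memory) is retained over time.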
However, how can we decide what type of neuron model to use for a particular task?
• Similarly, how do we know if we have good fitness functions/recombination operators to use in conjunction with our neuron model?
• Can use intuition, or try several combinations. But will our results tell us what we want to know about the problem we were working on?
• Why did a particular neuron/GA combination work well?
• Were our intuitions correct?
• What are the implications for generating a more successful model?
An Example: GasNet evolution
GasNet evolves faster than NoGas over a range of recombination schemes …
… and mutation rates
… and connection architectures
… and robotic problems …
??WHY??
GasNets Background
Classically, neurotransmission is viewed as:
• Occurring point-to-point at the synapse, ie locally
• Occurring over a short temporal scale
The overriding metaphor is electrical nodes connected by wires.
This is the inspiration for the standard connectionist ANN.
Neuromodulation by nitric oxide (NO)
Recently neuromodulatory gases have been discovered (NO, CO, H2S). By far the most studied is NO:
• Small and non-polar, so freely diffusing
• Acts over a large spatial scale: volume signalling
• Acts over a wide range of temporal scales (ms to years)
• Modulatory effects
• Interaction between neurons not connected synaptically
• Loose coupling between the 2 signalling systems (electrical and chemical), ie neurons that are connected electrically are not necessarily affected by the gas and vice versa
A new style of ANN?
Inspiration for new form of ANN: GasNets
1. Node emits 1 of 2 ‘gases’ due to high electrical activity or high gas concentration
2. Computationally fast, crude diffusion method, but space and time crucial, local processes
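3. Gases diffuse through the network and alter the slope of the transfer functions of other neurons in a concentration-dependent manner.
4. Gas 1 increases gain, gas 2 decreases gain.
Node output: $O_j^t = \tanh\left[k_j^t\left(\sum_i w_{ij} O_i^{t-1} + I_j\right) + b_j\right]$, where $O_j^t$ is the output of node $j$ at time $t$, $w_{ij}$ is the weight from node $i$ to node $j$, $I_j$ is the external input, $b_j$ is the bias and $k_j^t$ is the gas-modulated gain (slope) of the transfer function.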
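The node equation can be sketched directly in code. The update in `gasnet_outputs` follows the formula on the slide; the rule in `modulated_gain` is NOT from the slides and is a purely hypothetical linear mapping, included only to show gas 1 raising the gain and gas 2 lowering it. All function and parameter names are assumptions.

```python
import math

def modulated_gain(base_gain, gas1, gas2, sensitivity=1.0, floor=0.01):
    """HYPOTHETICAL gain rule: gas 1 concentration raises the gain, gas 2 lowers it."""
    return max(floor, base_gain * (1.0 + sensitivity * (gas1 - gas2)))

def gasnet_outputs(prev_out, weights, inputs, biases, gains):
    """O_j(t) = tanh[ k_j(t) * (sum_i w_ij * O_i(t-1) + I_j) + b_j ].

    weights[i][j] is the weight from node i to node j;
    gains[j] is the current (gas-modulated) gain k_j of node j.
    """
    n = len(prev_out)
    outputs = []
    for j in range(n):
        drive = sum(weights[i][j] * prev_out[i] for i in range(n)) + inputs[j]
        outputs.append(math.tanh(gains[j] * drive + biases[j]))
    return outputs
```

Because the gain multiplies the summed input before the bias is added, a change in gas concentration changes the slope of the node's transfer function rather than simply shifting it.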
Analysis of search space properties
If networks evolve faster they must, in some sense, be making the space of solutions easier to search (smoother? More densely packed with good solutions? Fewer optima? More neutral??)
Analysis will hopefully tell us what the search space properties are like and what features of our networks are ‘good’ for search.
It can also help us to understand the dynamics of an EA search through high-dimensional space, which are not well understood.
A hopeful approach, since many of the intuitions we have about how EAs search spaces have come from such analyses.
Eg work on nk, nkp and royal road landscapes etc has attempted to address the role of neutrality, crossover vs mutation and much more.
Properties to examine:
• Smooth vs rugged
• Global vs local optima
• Neutrality
• Other?
Abstract mathematical landscapes like nk and nkp are generally designed to have tunable ruggedness, neutrality, local modality etc.
Real-world problems have no direct link between solution architecture and landscape properties …
… and maybe no understandable link between landscape properties and evolutionary dynamics (how does adding virtual gas affect neutrality??? What is neutrality in a noisy space????)
We’ve found no explanation for GasNet evolvability in terms of fitness landscape properties (partly because 99.9% of real spaces evaluate to 0 fitness, meaning standard measures see them as a homogeneous flat landscape).
However…
What about functional analysis?
• Oscillator sub-networks very commonly evolve
• Node 5 provides electrical stimulation to RMnode; this causes gas1 emission from RMnode; this causes gas2 emission from node 5; this reduces the gain of RMnode, which decreases its activity, which shuts off gas emission from RMnode and hence from node 5 … and the cycle begins again
[Figure: RMnode behaviour when gain is high; RMnode behaviour when gain is low]
Interaction of chemical and electrical signalling provides a transition between 2 regimes, with diffusion-controlled transition timings.
[Figure: ‘timer’ sub-circuit inhibits ‘bright object finder’ sub-circuit]
Timer sub-circuits naturally become active before object finder circuits.
GasNet and NoGas “Timers” (2):
• GasNet timer = build-up of gas concentration. Simple architecture, mechanisms easily tuned?
• NoGas timer = 3 fully connected nodes. Convoluted architecture, difficult to tune?
Re-evolution in environment with changed time scales (func =)
Re-evolution in environment with changed time scales (sample)
Conclusions
• It seems that easy temporal adaptivity is an important feature in GasNet evolvability
• Dynamics are used in surprising ways (eg sensory noise filters), so are important even in ‘reactive’ tasks
• Evolution and re-evolution of various kinds of rhythmic networks backs this up
But…
• … not the whole story
• Have investigated several GasNet variants which improve on the performance of the original
• They have the same temporal properties, so what is going on there??
• Maybe need to look at more phenotypic properties (eg coupling of gaseous and electrical signalling mechanisms)
• To do this we must introduce GasNets and their variants
Diffusion in Original GasNets
Based on the spatial distribution of NO produced by a single spherical neuron
1. Gas cloud is centred on the emitting node and builds up linearly with time (to a maximum) at a genetically specified rate
2. Gas varies spatially as an inverse exponential: exp(-d2/r).
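The two diffusion rules can be sketched as a product of a temporal term and a spatial term. The temporal term follows the slide (linear build-up to a maximum at a given rate). The spatial term on the slide is ambiguous after extraction, so the sketch takes the spatial profile as a function argument; `example_profile` uses exp(-2d/r) inside a cloud of radius r purely as an illustrative ASSUMPTION. All names are hypothetical.

```python
import math

def temporal_buildup(steps_emitting, rate, maximum=1.0):
    """Concentration builds up linearly with emission time, capped at `maximum`."""
    return min(maximum, rate * steps_emitting)

def gas_concentration(distance, steps_emitting, rate, spatial_profile, maximum=1.0):
    """Concentration at `distance` from the emitter: temporal term times spatial term."""
    return temporal_buildup(steps_emitting, rate, maximum) * spatial_profile(distance)

def example_profile(distance, radius=1.0):
    """ASSUMED inverse-exponential falloff, zero outside the cloud radius."""
    if distance >= radius:
        return 0.0
    return math.exp(-2.0 * distance / radius)
```

In an evolved network the build-up rate (and, in this sketch, the radius) would be read from the genotype, so evolution can tune how quickly and how far each node's gas acts.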
However, the initial gas diffusion model was intentionally simplistic
• Cannot capture the rich range of spatio-temporal properties seen in real systems
• Therefore decided to develop 2 new versions incorporating aspects of NO signalling seen in nervous systems
• Will hopefully lead to more powerful/evolvable robotic systems
• Also tests the potential utility of the features in real nervous systems
Mammalian cortical plexus
• NO is involved in mediating the link between neural activity and blood flow
• Fibres are too small to generate an effective NO signal individually
• NO from many fibres summates: the NO signal is different to that from a single neuron
• Fineness of fibres leads to a uniform signal
• Combined effect maintains high concentration levels over a large volume
• Also, the delay until fibres interact serves to act as a noise filter: plexuses of fine fibres signal persistent neuronal activity to blood vessels
Plexus GasNet model
1. Gas cloud has uniform spatial distribution
2. Gas clouds are centred at a genetically specified position in the network plane
Receptor GasNet
• In nervous systems, only neurons with one of the receptors for NO will be affected
• Each node may have quantities of receptors (none, medium, maximum)
• Receptor specific modulations
[Figure: axes labelled gas concentration and receptor concentration]
Receptor GasNet modulations
• Increase gain
• Decrease gain
• Activation includes a proportion of previous activation
• Transfer function switched
One particularly successful variant used only one gas, which increased gain.