Added After Talk • Looking for a review paper on evolving plastic networks? Here is a recent one from Andrea Soltoggio, Sebastian Risi, and Kenneth Stanley: https://arxiv.org/abs/1703.10371
Evolving to Learn through Synaptic Plasticity Kenneth O. Stanley Uber AI Labs and Evolutionary Complexity Research Group, Department of Computer Science, University of Central Florida kstanley@uber.com / kstanley@cs.ucf.edu
Evolution and Plasticity • Brains in nature are evolved • But they are not static: brains learn over their lifetimes • If neuroevolution is about solving a fixed problem, a static network might be okay • If neuroevolution is about evolving a brain, plasticity may be essential • (Though recurrence alone is also a theoretical option)
Evolution Can Discover the Delta Rule (1990)… Chalmers, David J. "The evolution of learning: An experiment in genetic connectionism." In Proceedings of the 1990 connectionist models summer school, pp. 81-90. San Mateo, CA, 1990.
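For reference, the delta rule that evolution rediscovered in Chalmers' experiment is the standard Widrow-Hoff update for a single linear unit; a minimal sketch:

```python
import numpy as np

def delta_rule_update(w, x, target, lr=0.1):
    """One delta-rule (Widrow-Hoff) step for a single linear unit."""
    y = float(np.dot(w, x))           # unit's current output
    return w + lr * (target - y) * x  # move weights to reduce the error

# Usage: learn target 1.0 for input [1, 0]
w = np.zeros(2)
for _ in range(100):
    w = delta_rule_update(w, np.array([1.0, 0.0]), 1.0)
```

Evolution's discovery here was the update rule itself, not the weights: the genome encoded the coefficients of a general weight-change expression, and selection converged on this form.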
But “Local Learning Rules” Are Perhaps More Interesting • The delta rule and backprop are already known • Not entirely clear whether backprop is biologically plausible • Biggest reason: domain-specific learning rules are likely more efficient, though less generic • Hinton 2014 slide on backprop in the cortex
What Can Happen at a Synapse? (from Dr. George Johnson at http://www.txtwriter.com/Backgrounders/Drugaddiction/drugs1.html )
What Can Happen at a Synapse? • Weighted signal transmission • But also: • Strengthening • Weakening • Sensitization • Habituation • Hebbian learning • Neuromodulation (Soltoggio et al. 2008)
How Should Weights Change? (Blynel and Floreano 2002) • Plain Hebb rule: Strengthens synapse when presynaptic and postsynaptic nodes fire together • Postsynaptic rule: Weakens synapse if postsynaptic node fires alone • Presynaptic rule: Weakens synapse if presynaptic node fires alone • Covariance rule: Strengthens when correlated, weakens when not
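The four rules can be sketched in code. The exact formulations below follow the self-limiting adaptive-synapse rules of Floreano and Urzelai (2000), which this line of work builds on, so they may differ in detail from the slide's equations; weights are assumed to lie in [0, 1] and each function returns the weight change:

```python
import numpy as np

def hebb(w, pre, post):
    """Plain Hebb: strengthen when pre and post fire together.
    The (1 - w) factor makes the rule self-limiting on [0, 1]."""
    return (1.0 - w) * pre * post

def postsynaptic(w, pre, post):
    """Hebbian strengthening, plus weakening when the postsynaptic
    node fires while the presynaptic node is silent."""
    return w * (-1.0 + pre) * post + (1.0 - w) * pre * post

def presynaptic(w, pre, post):
    """Hebbian strengthening, plus weakening when the presynaptic
    node fires while the postsynaptic node is silent."""
    return w * pre * (-1.0 + post) + (1.0 - w) * pre * post

def covariance(w, pre, post):
    """Strengthen when activities are correlated, weaken when not."""
    f = np.tanh(4.0 * (1.0 - abs(pre - post)) - 2.0)
    return (1.0 - w) * f if f > 0 else w * f
```

Note that each change depends only on quantities local to the synapse, which is exactly what makes these rules candidates for evolution rather than hand design.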
Experiment: Light-switching • Task: Go to black area to turn on light, then go to area under light • Requires a policy change in mid-task: Reconfigure weights for new policy Fully Recurrent Network Blynel, J. and Floreano, D. (2002) Levels of Dynamics and Adaptive Behavior in Evolutionary Neural Controllers. In B. Hallam, D. Floreano, J. Hallam, G. Hayes, and J.-A. Meyer, editors. From Animals to Animats 7: Proceedings of the Seventh International Conference on Simulation of Adaptive Behavior, MIT Press.
Results • Adaptive synapse networks evolved straighter and faster trajectories • Rapid and appropriate weight modifications occur at the moment of change • However, other early experiments (e.g. dangerous food foraging with NEAT) showed recurrence alone doing better • Still, almost surely plasticity matters
Soltoggio et al. (2008): Neuromodulated Plasticity • Advantage: Knowing when to change • Magnitude of change modulated by an external modulatory neuron • Moving towards RL-like capabilities • (Figure labels: regular activation, modulatory activation, plasticity term — can be any learning rule — and weight change)
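A minimal sketch of the mechanism, assuming a generic Hebbian "ABCD" plasticity term of the kind used in this line of work; the parameter names, default values, and the tanh gating are illustrative rather than the paper's exact equation:

```python
import numpy as np

def neuromodulated_update(w, pre, post, m, lr=0.05,
                          A=1.0, B=0.0, C=0.0, D=0.0):
    """Weight change gated by modulatory activation m.
    The inner term is a generic Hebbian 'ABCD' rule; when m is near
    zero the synapse stays stable, and when |m| is large it becomes
    plastic -- the network 'knows when to change'."""
    plasticity = A * pre * post + B * pre + C * post + D
    return w + np.tanh(m) * lr * plasticity
```

The key design point is the gating: the same local rule is always available, but a separate modulatory signal decides when (and with what sign) it is applied, which is what moves this toward RL-like credit assignment.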
Neuromodulation Experiment: T-maze • Position of reward can change across trials • Modulatory plastic networks perform better than simple plastic networks on harder T-maze • Memory “lock-in” happens Double T-maze
Interesting New Idea (Soltoggio and Stanley 2012): Reconfigure-and-saturate Hebbian Plasticity • Is it possible to make a Hebbian network learn new behaviors based on reward or penalty signals? • Yes, given some unusual ingredients: • A modulation signal represents the reward • Weight saturation • Neural noise • Stronger weight (from random start) will win • Positive modulation yields Hebbian plasticity, but negative modulation yields anti-Hebbian (decreases strength of pathway by reducing weight difference)
Reconfigure-and-saturate Dynamics • Neural noise determines the winner during Hebbian phases • Insight: Noise is driving exploration • Saturation allows stability
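A toy sketch of one reconfigure-and-saturate step, combining the three ingredients above (modulation sign, saturation bounds, neural noise); the specific form and constants are illustrative, not the paper's exact equations:

```python
import numpy as np

rng = np.random.default_rng(0)

def rs_update(w, pre, post, modulation, lr=0.1, noise=0.02):
    """One reconfigure-and-saturate step (sketch).
    Positive modulation -> Hebbian growth toward saturation;
    negative modulation -> anti-Hebbian decay. Noise added to the
    activities drives exploration among competing pathways."""
    pre = pre + rng.normal(0.0, noise)
    post = post + rng.normal(0.0, noise)
    dw = modulation * lr * pre * post   # modulation sign flips the rule
    return np.clip(w + dw, 0.0, 1.0)    # saturation bounds give stability
```

Under reward, slightly stronger (noisier) pathways race to saturation and lock in; under penalty, the anti-Hebbian phase erodes the differences so that noise can pick a new winner on the next Hebbian phase.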
Result: R&S Intelligent Navigation Experiment • Learns over 2,000 timesteps to navigate intelligently • Relearns after reward switch • What’s next: Temporal association learning through eligibility traces
Plasticity through HyperNEAT • Adaptive HyperNEAT (Risi and Stanley 2012): Indirect encoding compactly generates pattern of rules across NN Imagine a pattern of rules spread across the brain as complex as this picture
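The indirect encoding can be sketched as follows; the cppn function here is a hypothetical stand-in for an evolved CPPN, and the output parameters (a learning rate plus Hebbian coefficients) are illustrative:

```python
import numpy as np

def cppn(x1, y1, x2, y2):
    """Hypothetical stand-in for an evolved CPPN: maps the substrate
    coordinates of a connection's endpoints to learning-rule
    parameters, here (learning rate, A, B, C) of a Hebbian rule."""
    d = np.hypot(x2 - x1, y2 - y1)
    return np.array([np.exp(-d), np.sin(x1 + y2), np.cos(y1), x2 * y2])

# Query the CPPN once per connection on a 3x3 substrate: every synapse
# gets its own rule, so the rules form a geometric pattern across the
# network rather than one global rule.
coords = [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)]
rules = {(a, b): cppn(*a, *b) for a in coords for b in coords}
```

Because one small CPPN generates all the per-connection rules as a function of geometry, the encoding stays compact no matter how large the substrate grows.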
And See Tomorrow: “Backpropagated plasticity: learning to learn with gradient descent in large plastic neural networks” by Thomas Miconi, Jeff Clune, Kenneth Stanley • At the Workshop on Meta-Learning on Saturday: Poster Spotlights start 9:40am • Idea: Plasticity parameters optimized by gradient descent between “lifetimes” • Weights then adjusted according to plasticity rules within life • Results: Plastic networks with millions of parameters become better at an image reconstruction task than a plain RNN or LSTM • (Figure: learned plasticity coefficients)
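A rough numpy sketch of a single recurrent step in this scheme; shapes and the trace update are simplified. W and alpha would be trained by gradient descent across lifetimes, while the fast Hebbian trace changes within a lifetime:

```python
import numpy as np

def plastic_step(x, h, W, alpha, hebb, eta=0.03):
    """One step of a plastic recurrent layer (sketch after Miconi et
    al.): effective weight = fixed W plus a learned per-connection
    plasticity coefficient alpha times a fast Hebbian trace."""
    h_new = np.tanh((W + alpha * hebb) @ h + x)
    # Running-average Hebbian trace, updated within a lifetime
    hebb = (1.0 - eta) * hebb + eta * np.outer(h_new, h)
    return h_new, hebb

# Usage: one step on a 4-unit layer with zeroed plasticity coefficients
W = np.eye(4) * 0.1
alpha = np.zeros((4, 4))
h1, hebb1 = plastic_step(np.zeros(4), np.full(4, 0.5), W, alpha,
                         np.zeros((4, 4)))
```

The point of the construction is that everything is differentiable, so the plasticity coefficients themselves can be meta-learned with ordinary backprop rather than evolution.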
Conclusion • Wide scope for creativity • Endless kinds of plasticity can be evolved • Domain-specific learning mechanisms are less studied than domain-general • Could be important in some domains, e.g. learning to walk quickly on new terrain or in new gravity • Cross-pollination between NE and DL
More information • My Homepage: http://www.cs.ucf.edu/~kstanley • Uber AI Labs: http://uber.ai • Evolutionary Complexity Research Group: http://eplex.cs.ucf.edu • Email: kstanley@cs.ucf.edu kstanley@uber.com