CSE 473 Introduction to Artificial Intelligence: Neural Networks. Henry Kautz, Spring 2006. Training a Single Neuron. Idea: adjust weights to reduce the sum of squared errors over the training set. Error = difference between actual and intended output. Algorithm: gradient descent.
CSE 473 Introduction to Artificial Intelligence: Neural Networks. Henry Kautz, Spring 2006
Training a Single Neuron • Idea: adjust weights to reduce the sum of squared errors over the training set • Error = difference between actual and intended output • Algorithm: gradient descent • Calculate the derivative (slope) of the error function with respect to each weight • Take a small step in the “downward” direction • Step size is the “training rate” (also called the learning rate) • Single-layer network: each unit can be trained separately
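The gradient-descent loop described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the function names (`predict`, `train_unit`), the choice of a sigmoid unit, the OR training set, and the values `rate=0.5` and `epochs=5000` are all assumptions made for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(weights, inputs):
    """Output of one sigmoid unit: squashed weighted sum of its inputs."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)))

def sum_squared_error(weights, data):
    """Half the sum of squared errors over the training set."""
    return sum(0.5 * (target - predict(weights, inputs)) ** 2
               for inputs, target in data)

def train_unit(data, rate=0.5, epochs=5000):
    """Batch gradient descent on the squared error of a single unit."""
    weights = [0.0] * len(data[0][0])
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for inputs, target in data:
            out = predict(weights, inputs)
            err = target - out            # intended minus actual output
            slope = out * (1.0 - out)     # derivative of the sigmoid
            for i, x in enumerate(inputs):
                grad[i] -= err * slope * x   # dE/dw_i for this example
        # Take a small step "downhill"; the step size is the training rate.
        weights = [w - rate * g for w, g in zip(weights, grad)]
    return weights

# Hypothetical training set: logical OR, with a constant 1.0 as a bias input.
OR_DATA = [((1.0, 0.0, 0.0), 0.0), ((1.0, 0.0, 1.0), 1.0),
           ((1.0, 1.0, 0.0), 1.0), ((1.0, 1.0, 1.0), 1.0)]

weights = train_unit(OR_DATA)
```

Calling `predict(weights, inputs)` on each row of `OR_DATA` afterwards should give outputs on the correct side of 0.5, and `sum_squared_error` should be lower than it was at the all-zero starting weights.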
Single Unit Training Rule Adjust weight i in proportion to… • Training rate • Error • Derivative of the “squashing function” • Degree to which input i was active
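The four factors listed above correspond to the standard gradient-descent update for one unit. The notation below is an assumption, since the slide's own equation did not survive: $\alpha$ is the training rate, $g$ the squashing function, $in = \sum_j w_j x_j$ the weighted input, $y$ the intended output, and $Err = y - g(in)$.

```latex
\begin{align*}
E &= \tfrac{1}{2}\,Err^2 = \tfrac{1}{2}\bigl(y - g(in)\bigr)^2 \\
\frac{\partial E}{\partial w_i} &= -\,Err \cdot g'(in) \cdot x_i \\
w_i &\leftarrow w_i + \alpha \cdot Err \cdot g'(in) \cdot x_i
\end{align*}
```

Each factor in the last line matches one bullet: training rate $\alpha$, error $Err$, derivative of the squashing function $g'(in)$, and input activity $x_i$.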
Sigmoid Unit Training Rule Adjust weight i in proportion to… • Training rate • Error • Degree to which output is ambiguous • Degree to which input i was active
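For a sigmoid unit, the derivative of the squashing function is output × (1 − output), which is largest when the output is 0.5 and shrinks toward zero as the output approaches 0 or 1. That is one way to read "degree to which output is ambiguous". The sketch below illustrates this; the function names `ambiguity` and `sigmoid_weight_step` are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def ambiguity(output):
    """Sigmoid derivative written in terms of the unit's output."""
    return output * (1.0 - output)

def sigmoid_weight_step(rate, target, output, input_i):
    """Change applied to weight i: rate * error * ambiguity * input activity."""
    return rate * (target - output) * ambiguity(output) * input_i

# The same error produces a larger step when the output is near 0.5
# than when the unit is already saturated near 0 or 1.
step_ambiguous = sigmoid_weight_step(0.1, 1.0, 0.5, 1.0)
step_saturated = sigmoid_weight_step(0.1, 1.0, 0.01, 1.0)
```

A confidently wrong unit (output near 0 when the target is 1) therefore learns slowly under this rule, even though its error is large.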
Expressivity of Neural Networks • A single threshold unit can represent any linearly separable function • A single layer of units can represent any set of linear inequalities (a convex region) • Two layers can approximate any continuous function, given enough hidden units • Three layers can also represent discontinuous functions
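XOR is a classic example of what one unit cannot do and two layers can. The sketch below assumes hard-threshold units and hand-picked weights. The grid search is only an illustration (XOR is not linearly separable for any real-valued weights), and all names are hypothetical.

```python
from itertools import product

def step(x):
    return 1 if x > 0 else 0

def unit(weights, bias, inputs):
    """A single threshold unit: fires when the weighted sum exceeds zero."""
    return step(bias + sum(w * x for w, x in zip(weights, inputs)))

AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

# Candidate weights and biases: -4.0 to 4.0 in steps of 0.5.
GRID = [i / 2 for i in range(-8, 9)]

def single_unit_fits(table, grid=GRID):
    """True if some (bias, w1, w2) on the grid reproduces the truth table."""
    for bias, w1, w2 in product(grid, repeat=3):
        if all(unit((w1, w2), bias, x) == y for x, y in table.items()):
            return True
    return False

def xor_two_layer(x1, x2):
    """Two layers: hidden OR and AND units, then an 'OR but not AND' output."""
    h_or = unit((1, 1), -0.5, (x1, x2))
    h_and = unit((1, 1), -1.5, (x1, x2))
    return unit((1, -1), -0.5, (h_or, h_and))
```

Here `single_unit_fits(AND)` finds a solution while `single_unit_fits(XOR)` does not, and `xor_two_layer` reproduces the XOR table on all four inputs.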
BackProp Demo 1 • http://www.neuro.sfc.keio.ac.jp/~masato/jv/sl/BP.html • Local version: BP.html
Backprop Demo 2 • http://www.williewheeler.com/software/bnn.html • Local version: bnn.html
Modeling the Brain • Backpropagation is the most commonly used algorithm for supervised learning with feed-forward neural networks • But most neuroscientists believe that the brain does not implement backprop • Many other learning rules have been studied
Hebbian Learning • Alternative to backprop for unsupervised learning • Increase weights on connected neurons whenever both fire simultaneously • Neurologically plausible (Hebb 1949)
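A minimal sketch of the plain Hebbian rule: each weight grows by rate × presynaptic activity × postsynaptic activity. The function name, the `weights[i][j]` layout, and the rate of 0.1 are assumptions for illustration.

```python
def hebbian_update(weights, pre, post, rate=0.1):
    """Return new weights after one Hebbian step.

    weights[i][j] connects presynaptic unit i to postsynaptic unit j.
    A weight increases only when both of its units are active together.
    """
    return [[w + rate * pre[i] * post[j] for j, w in enumerate(row)]
            for i, row in enumerate(weights)]

# Two presynaptic and two postsynaptic units; only presynaptic unit 0 fires.
w0 = [[0.0, 0.0], [0.0, 0.0]]
w1 = hebbian_update(w0, pre=[1.0, 0.0], post=[1.0, 1.0])
```

With non-negative activities this rule only ever increases weights, so practical variants usually add some form of normalization or decay.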
Self-Organizing Maps • Unsupervised method for clustering data • Learns a “winner take all” network where just one output neuron is on for each cluster
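The winner-take-all idea can be sketched with simple competitive learning: the output unit whose weight vector is closest to the input wins, and only the winner moves toward that input. This is a simplification. A full self-organizing map also updates the winner's neighbors on a grid, which is omitted here. The data points, starting prototypes, and parameter values are made up for the example.

```python
def winner(prototypes, point):
    """Index of the output unit whose weight vector is closest to the point."""
    dists = [sum((p - x) ** 2 for p, x in zip(proto, point))
             for proto in prototypes]
    return dists.index(min(dists))

def train_wta(points, prototypes, rate=0.2, epochs=20):
    """Competitive learning: only the winning unit moves toward each input."""
    protos = [list(p) for p in prototypes]
    for _ in range(epochs):
        for point in points:
            k = winner(protos, point)
            protos[k] = [p + rate * (x - p) for p, x in zip(protos[k], point)]
    return protos

# Two hypothetical clusters, one near (0, 0) and one near (1, 1).
POINTS = [(0.0, 0.1), (0.1, 0.0), (0.1, 0.1),
          (1.0, 0.9), (0.9, 1.0), (0.9, 0.9)]

protos = train_wta(POINTS, prototypes=[(0.4, 0.4), (0.6, 0.6)])
labels = [winner(protos, p) for p in POINTS]
```

After training, each cluster should have its own winning unit, so `labels` assigns one index to the first three points and the other index to the last three.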
Recurrent Neural Networks • Include time-delay feedback loops • Can handle temporal data tasks, such as sequence prediction
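The time-delay feedback loop can be illustrated with a single recurrent unit whose previous activation is fed back as an extra input at the next step. This is a forward-pass sketch only (no training), and the weight values and function name are hypothetical.

```python
import math

def rnn_forward(sequence, w_in, w_rec, w_out, h0=0.0):
    """Run one recurrent tanh unit over a sequence, returning each output."""
    h = h0
    outputs = []
    for x in sequence:
        # The previous activation h re-enters through the feedback weight w_rec.
        h = math.tanh(w_in * x + w_rec * h)
        outputs.append(w_out * h)
    return outputs

# A single pulse at the first time step, then silence.
with_feedback = rnn_forward([1.0, 0.0, 0.0], w_in=1.0, w_rec=0.5, w_out=1.0)
no_feedback = rnn_forward([1.0, 0.0, 0.0], w_in=1.0, w_rec=0.0, w_out=1.0)
```

With feedback, the output at later steps is still non-zero even though the input has returned to zero, so the unit carries information about earlier inputs forward in time. Without feedback, the output drops to zero as soon as the input does.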