CSE 473 Introduction to Artificial Intelligence Neural Networks

Henry Kautz, Spring 2006

Training a Single Neuron
• Idea: adjust weights to reduce sum of squared errors over training set
• Error = difference between actual and intended output
• Algorithm: gradient descent
• Calculate derivative (slope) of error function
• Take a small step in the “downward” direction
• Step size is the “training rate”
• Single-layer network: can train each unit separately
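The bullets above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: it assumes a single linear unit (no squashing function), batch updates, and a hypothetical toy training set generated from y = 2x + 1. The names `train_linear_unit`, `rate`, and `epochs` are made up for this example.

```python
def train_linear_unit(examples, rate=0.05, epochs=500):
    """Batch gradient descent on the sum of squared errors for one linear unit.

    examples: list of (inputs, target) pairs; a constant bias input of 1.0
    is appended to every input vector (an assumption of this sketch).
    """
    n_inputs = len(examples[0][0])
    weights = [0.0] * (n_inputs + 1)  # last weight is the bias
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for inputs, target in examples:
            x = list(inputs) + [1.0]
            output = sum(w * xi for w, xi in zip(weights, x))
            error = target - output  # intended minus actual output
            for i, xi in enumerate(x):
                # dE/dw_i for E = 1/2 * sum(error^2)
                gradient[i] -= error * xi
        # Take a small step in the "downward" direction.
        weights = [w - rate * g for w, g in zip(weights, gradient)]
    return weights


# Hypothetical toy data: targets follow y = 2x + 1.
examples = [((0.0,), 1.0), ((1.0,), 3.0), ((2.0,), 5.0), ((3.0,), 7.0)]
learned = train_linear_unit(examples)
```

With this data the learned weights should approach a slope of 2 and a bias of 1. A training rate that is too large makes the steps overshoot and diverge, which is why the step is kept small.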

Gradient Descent

Computing Partial Derivatives

Single Unit Training Rule
Adjust weight i in proportion to…
• Training rate
• Error
• Derivative of the “squashing function”
• Degree to which input i was active
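The four factors listed above fall out of the chain rule applied to the squared-error function. The derivation below is a sketch under assumed notation, since the symbols on the original slides did not survive in this transcript: t_d is the intended output for training example d, o_d the actual output, g the squashing function, x_{i,d} the value of input i, and \eta the training rate.

```latex
\begin{align*}
E &= \tfrac{1}{2} \sum_{d} (t_d - o_d)^2,
  \qquad o_d = g(\mathit{in}_d),
  \qquad \mathit{in}_d = \sum_{j} w_j \, x_{j,d} \\
\frac{\partial E}{\partial w_i}
  &= -\sum_{d} (t_d - o_d) \, g'(\mathit{in}_d) \, x_{i,d} \\
\Delta w_i &= -\eta \, \frac{\partial E}{\partial w_i}
  = \eta \sum_{d} (t_d - o_d) \, g'(\mathit{in}_d) \, x_{i,d}
\end{align*}
```

Reading the last line left to right gives the training rate, the error, the derivative of the squashing function, and the activity of input i.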

Sigmoid Units

Sigmoid Unit Training Rule
Adjust weight i in proportion to…
• Training rate
• Error
• Degree to which output is ambiguous
• Degree to which input i was active
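For a sigmoid unit the derivative of the squashing function is output × (1 − output), which is largest when the output is 0.5. That is one way to read "degree to which output is ambiguous". A minimal sketch of one update step follows; the function names and the default training rate are assumptions for this example.

```python
import math


def sigmoid(z):
    """Logistic squashing function."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_update(weights, inputs, target, rate=0.5):
    """Return new weights after one step of the sigmoid unit training rule."""
    net = sum(w * x for w, x in zip(weights, inputs))
    output = sigmoid(net)
    error = target - output
    ambiguity = output * (1.0 - output)  # sigmoid derivative, peaks at 0.5
    return [w + rate * error * ambiguity * x for w, x in zip(weights, inputs)]


# With zero weights the output is 0.5 (maximally ambiguous).
new_weights = sigmoid_update([0.0, 0.0], [1.0, 0.0], target=1.0)
```

Only the weight on the active input changes; the second input was 0, so its weight is left alone.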

Expressivity of Neural Networks
• A single unit can learn any linearly separable function
• A single layer of units can learn any set of linear inequalities (a convex region)
• Two layers can approximate any continuous function, given enough hidden units
• Three layers can approximate arbitrary functions, including discontinuous ones
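XOR is the standard example of a function that no single unit can represent but a two-layer network can. The sketch below uses hand-set weights and hard threshold units purely to show that such a network exists; nothing is learned here, and the specific weights and function names are choices made for this illustration.

```python
def threshold_unit(weights, bias, inputs):
    """Fire (1) when the weighted sum plus bias is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def xor_net(a, b):
    """Two-layer network: hidden OR and AND units feed one output unit."""
    h_or = threshold_unit([1, 1], -0.5, [a, b])
    h_and = threshold_unit([1, 1], -1.5, [a, b])
    # Output fires when OR is on but AND is off.
    return threshold_unit([1, -1], -0.5, [h_or, h_and])


truth_table = [xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

Each hidden unit draws one line in the input plane; the output unit combines the two half-planes into a region that a single line could not carve out.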

Character Recognition Demo

Backprop Demo 1
• http://www.neuro.sfc.keio.ac.jp/~masato/jv/sl/BP.html
• Local version: BP.html

Backprop Demo 2
• http://www.williewheeler.com/software/bnn.html
• Local version: bnn.html

Modeling the Brain
• Backpropagation is the most commonly used algorithm for supervised learning with feed-forward neural networks
• But most neuroscientists believe that the brain does not implement backprop
• Many other learning rules have been studied

Hebbian Learning
• Alternative to backprop for unsupervised learning
• Increase weights on connected neurons whenever both fire simultaneously
• Neurologically plausible (Hebb 1949)
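The Hebbian rule above can be written as a weight change proportional to the product of the two neurons' activities. This is a minimal sketch of that plain product form, assuming binary activities and a hypothetical `rate` parameter; practical variants usually add decay or normalization so the weights do not grow without bound.

```python
def hebbian_update(weights, pre, post, rate=0.1):
    """Strengthen weights[j][i] when input neuron i and output neuron j co-fire.

    weights[j][i] is the connection from input neuron i to output neuron j.
    No error signal or target is used, which is what makes this unsupervised.
    """
    return [
        [w + rate * post[j] * pre[i] for i, w in enumerate(row)]
        for j, row in enumerate(weights)
    ]


# Input neuron 0 fires, input neuron 1 is silent, both output neurons fire.
updated = hebbian_update([[0.0, 0.0], [0.0, 0.0]], pre=[1, 0], post=[1, 1])
```

Only the connections from the active input neuron are strengthened; connections from the silent one stay at zero.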

Self-Organizing Maps
• Unsupervised method for clustering data
• Learns a “winner take all” network where just one output neuron is on for each cluster
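The "winner take all" idea can be sketched as simple competitive learning: the output neuron whose weight vector is closest to the input wins, and only the winner moves toward that input. This is a simplification, not a full self-organizing map, because it omits the neighborhood update in which units near the winner also move. The data, initial weights, and function names are hypothetical.

```python
def winner(prototypes, x):
    """Index of the output neuron whose weight vector is closest to x."""
    distances = [
        sum((p - xi) ** 2 for p, xi in zip(proto, x)) for proto in prototypes
    ]
    return distances.index(min(distances))


def train_winner_take_all(prototypes, data, rate=0.5, epochs=20):
    """Move only the winning neuron's weights toward each input."""
    protos = [list(p) for p in prototypes]
    for _ in range(epochs):
        for x in data:
            k = winner(protos, x)
            protos[k] = [p + rate * (xi - p) for p, xi in zip(protos[k], x)]
    return protos


# Two well-separated hypothetical clusters and two output neurons.
data = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
trained = train_winner_take_all([(2.0, 2.0), (8.0, 8.0)], data)
labels = [winner(trained, x) for x in data]
```

After training, each output neuron responds to exactly one cluster, so `labels` groups the first three points together and the last three together.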

Why “Self-Organizing”

Recurrent Neural Networks
• Include time-delay feedback loops
• Can handle temporal data tasks, such as sequence prediction
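A time-delay feedback loop means the unit's previous output is fed back in as an extra input on the next time step. The sketch below uses a single tanh unit with fixed, hand-picked weights (`w_in` and `w_rec` are assumptions for this example, and no training is shown) to illustrate why such a network can handle temporal data: its state depends on the input history, not just the current input.

```python
import math


def run_recurrent_unit(sequence, w_in=1.0, w_rec=0.5):
    """Return the unit's state after each step of the input sequence."""
    state = 0.0
    states = []
    for x in sequence:
        # The previous state re-enters through the feedback weight w_rec.
        state = math.tanh(w_in * x + w_rec * state)
        states.append(state)
    return states


# Both sequences end with the same input (0) but have different histories.
after_pulse = run_recurrent_unit([1.0, 0.0])
after_silence = run_recurrent_unit([0.0, 0.0])
```

The final states differ even though the final inputs are identical, because the earlier pulse is still echoing through the feedback loop.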
