Neural Networks 10701/15781 Recitation, February 12, 2008. Parts of the slides are from previous years’ 10701 recitation and lecture notes, and from Prof. Andrew Moore’s data mining tutorials.
Recall Linear Regression • Prediction of continuous variables • Learn the mapping f: X → Y • Model is linear in the parameters w (+ some noise) • Assume Gaussian noise • Learn the MLE weights w = (X^T X)^(-1) X^T y
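A minimal sketch (not from the slides) of the closed-form MLE solution above, assuming a design matrix X and target vector y; the toy data and numpy usage are illustrative assumptions.

```python
# Sketch: closed-form MLE weights for linear regression with Gaussian noise,
# w = (X^T X)^{-1} X^T y.
import numpy as np

def fit_linear_regression(X, y):
    """X: (n_samples, n_features) design matrix; y: (n_samples,) targets."""
    # Solve the normal equations X^T X w = X^T y (more stable than an explicit inverse).
    return np.linalg.solve(X.T @ X, X.T @ y)

# Toy usage: recover weights close to [2, -3] from noisy data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = X @ np.array([2.0, -3.0]) + 0.1 * rng.normal(size=100)
print(fit_linear_regression(X, y))
```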
Neural Network • Neural nets are also models with parameters w in them. They are now called weights. • As before, we want to compute the weights that minimize the sum of squared residuals • Which, under the “Gaussian i.i.d. noise” assumption, turns out to be maximum likelihood. • Instead of explicitly solving for the max. likelihood weights, we use gradient descent
Perceptrons • Input x=(x1,…, xn) and target value t: output o = sgn(w·x) (step activation) or o = σ(w·x) = 1/(1+exp(−w·x)) (sigmoid activation) • Given training data {(x(l),t(l))}, find w which minimizes E(w) = ½ Σl (t(l) − o(l))^2
Gradient descent • General framework for finding a minimum of a continuous (differentiable) function f(w) • Start with some initial value w(1) and compute the gradient vector ∇f(w(1)) • The next value w(2) is obtained by moving some distance from w(1) in the direction of steepest descent, i.e., along the negative of the gradient: w(2) = w(1) − η ∇f(w(1))
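A minimal sketch of the generic gradient-descent loop described above; the step size eta, stopping tolerance, and the toy objective are assumptions, not values from the slides.

```python
# Sketch: repeatedly step along the negative gradient until the iterates stop moving.
import numpy as np

def gradient_descent(grad_f, w0, eta=0.1, tol=1e-6, max_iter=10000):
    w = np.asarray(w0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(w)                # gradient at the current point
        w_new = w - eta * g          # move in the direction of steepest descent
        if np.linalg.norm(w_new - w) < tol:
            return w_new
        w = w_new
    return w

# Usage: minimize f(w) = ||w - [1, 2]||^2, whose gradient is 2(w - [1, 2]).
print(gradient_descent(lambda w: 2 * (w - np.array([1.0, 2.0])), w0=[0.0, 0.0]))
```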
Gradient Descent on a Perceptron • The sigmoid perceptron update rule: wi ← wi + η Σl (t(l) − o(l)) o(l) (1 − o(l)) xi(l), where o(l) = σ(w·x(l))
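A minimal sketch of batch gradient descent with the sigmoid perceptron update rule above, minimizing the sum of squared residuals; the learning rate, iteration count, and OR example are illustrative assumptions.

```python
# Sketch: train a sigmoid perceptron by gradient descent on E(w) = 1/2 * sum_l (t_l - o_l)^2.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_sigmoid_perceptron(X, t, eta=0.5, n_iter=2000):
    """X: (n_samples, n_features); t: (n_samples,) targets in [0, 1]."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        o = sigmoid(X @ w)
        # dE/dw = -sum_l (t_l - o_l) * o_l * (1 - o_l) * x_l
        grad = -X.T @ ((t - o) * o * (1 - o))
        w -= eta * grad
    return w

# Usage: learn X1 OR X2, with a constant bias input appended to each example.
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
t = np.array([0, 1, 1, 1], dtype=float)
w = train_sigmoid_perceptron(X, t)
print(sigmoid(X @ w).round(2))
```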
Boolean Functions e.g., using a step activation function with threshold 0, can we learn the function • X1 AND X2? • X1 OR X2? • X1 AND NOT X2? • X1 XOR X2? (see the sketch below)
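A small sketch answering the question above with a threshold-0 step unit. The weights are hand-picked for illustration, not learned, and the bias b stands in for a weight on a constant-1 input.

```python
# Sketch: a threshold-0 step perceptron computing Boolean functions of two inputs.
import numpy as np

def step_perceptron(w, b, x):
    return 1 if w @ x + b > 0 else 0

x_vals = [np.array(p) for p in [(0, 0), (0, 1), (1, 0), (1, 1)]]

# X1 AND X2: w = (1, 1), b = -1.5
print([step_perceptron(np.array([1, 1]), -1.5, x) for x in x_vals])   # [0, 0, 0, 1]
# X1 OR X2:  w = (1, 1), b = -0.5
print([step_perceptron(np.array([1, 1]), -0.5, x) for x in x_vals])   # [0, 1, 1, 1]
# X1 AND NOT X2: w = (1, -1), b = -0.5
print([step_perceptron(np.array([1, -1]), -0.5, x) for x in x_vals])  # [0, 0, 1, 0]
# X1 XOR X2 is not linearly separable, so no single such perceptron computes it.
```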
Multilayer Networks • The class of functions representable by a single perceptron is limited • Think of nonlinear functions such as XOR, which is not linearly separable
A 1-Hidden-Layer Net • N_input = 2, N_hidden = 3, N_output = 1
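A minimal sketch of the forward pass for the 1-hidden-layer net above (2 inputs, 3 hidden units, 1 output), assuming sigmoid units; the random weight initialization is an illustrative assumption.

```python
# Sketch: forward pass through one hidden layer of sigmoid units.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(3, 2))   # input-to-hidden weights w_ji
W_out = rng.normal(size=(1, 3))      # hidden-to-output weights w_kj

x = np.array([0.5, -1.0])
h = sigmoid(W_hidden @ x)            # hidden-unit activations
o = sigmoid(W_out @ h)               # network output o_k
print(h, o)
```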
Backpropagation • HW2 – Problem 2 • Output of the k-th output unit from input x: ok = σ(Σj wkj σ(Σi wji xi)) • With bias: add a constant term for every non-input unit • Learn w to minimize E(w) = ½ Σl Σk (tk(l) − ok(l))^2
Backpropagation Initialize all weights. Do until convergence: 1. Input a training example to the network and compute the outputs ok 2. Update each hidden-to-output weight wkj by wkj ← wkj + η δk hj, where δk = ok (1 − ok)(tk − ok) 3. Update each input-to-hidden weight wji by wji ← wji + η δj xi, where δj = hj (1 − hj) Σk wkj δk
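A minimal sketch of the backpropagation loop above for a 1-hidden-layer sigmoid network with squared-error loss. Bias is handled as a constant-1 input to every non-input unit, as on the previous slide; the learning rate, fixed iteration count (standing in for the convergence test), and XOR example are assumptions.

```python
# Sketch: backpropagation with the hidden-to-output and input-to-hidden updates above.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_backprop(X, T, n_hidden=3, eta=0.5, n_iter=5000, seed=0):
    """X: (n_samples, n_in); T: (n_samples, n_out) with targets in [0, 1]."""
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], T.shape[1]
    W1 = rng.normal(scale=0.5, size=(n_hidden, n_in + 1))   # input-to-hidden w_ji (+ bias)
    W2 = rng.normal(scale=0.5, size=(n_out, n_hidden + 1))  # hidden-to-output w_kj (+ bias)
    for _ in range(n_iter):
        for x, t in zip(X, T):
            # 1. Forward pass: hidden activations and outputs o_k.
            x1 = np.append(x, 1.0)          # input plus constant bias term
            h = sigmoid(W1 @ x1)
            h1 = np.append(h, 1.0)          # hidden activations plus bias term
            o = sigmoid(W2 @ h1)
            # 2. Hidden-to-output: delta_k = o_k (1 - o_k)(t_k - o_k).
            delta_k = o * (1 - o) * (t - o)
            # 3. Input-to-hidden: delta_j = h_j (1 - h_j) sum_k w_kj delta_k.
            delta_j = h * (1 - h) * (W2[:, :-1].T @ delta_k)
            W2 += eta * np.outer(delta_k, h1)
            W1 += eta * np.outer(delta_j, x1)
    return W1, W2

# Usage: the XOR function, which no single perceptron can represent.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)
W1, W2 = train_backprop(X, T)
for x in X:
    h = sigmoid(W1 @ np.append(x, 1.0))
    print(x, sigmoid(W2 @ np.append(h, 1.0)))
```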