Multilayer Perceptron
One and More Layers Neural Network
Building Robots, Spring 2003

The association problem
• $\xi$ - input to the network, with length $N_I$, i.e., $\{\xi_k ;\ k = 1, 2, \dots, N_I\}$
• $O$ - output, with length $N_o$, i.e., $\{O_i ;\ i = 1, 2, \dots, N_o\}$
• $\varsigma$ - desired output, i.e., $\{\varsigma_i ;\ i = 1, 2, \dots, N_o\}$
• $w$ - weights in the network, i.e., $w_{ik}$ is the weight between $\xi_k$ and $O_i$
• $T$ - threshold value for an output unit to be activated
• $g$ - function that converts the input into output values between 0 and 1. Special case: the threshold function, $g(x) = \theta(x)$, which is 1 if $x > 0$ and 0 otherwise.
Given an input pattern $\xi$, we would like the output $O$ to be the desired one, $\varsigma$. Indeed, we would like this to hold for a set of $p$ input patterns $\xi^\mu$ and desired output patterns $\varsigma^\mu$, $\mu = 1, \dots, p$. The inputs and outputs may be continuous or boolean.

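The definitions above can be sketched as a single layer of threshold units. This is a minimal illustration, not code from the course: the function names `theta` and `layer_output`, and the example weights, input and threshold, are all assumptions chosen for the demonstration.

```python
def theta(x):
    """Threshold function: 1 if x > 0, otherwise 0."""
    return 1 if x > 0 else 0


def layer_output(w, xi, T, g=theta):
    """Return O_i = g(sum_k w[i][k] * xi[k] - T) for every output unit i."""
    return [g(sum(w_ik * xi_k for w_ik, xi_k in zip(w_i, xi)) - T) for w_i in w]


# Hypothetical example with N_I = 3 inputs and N_o = 2 outputs.
w = [[0.5, -1.0, 2.0],
     [1.0, 1.0, -1.0]]
xi = [1, 0, 1]
O = layer_output(w, xi, T=1.0)
print(O)
```

Passing a different `g` (for example a sigmoid) gives continuous outputs between 0 and 1 instead of boolean ones.
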
The geometric view of the weights
• For the boolean case, we want $\theta\left(\sum_k w_{ik}\,\xi_k^\mu - T\right) = \varsigma_i^\mu$. The boundary between positive and negative threshold is defined by $\sum_k w_{ik}\,\xi_k = T$, i.e. $\mathbf{w}_i \cdot \boldsymbol{\xi} = T$, which gives a plane (hyperplane) perpendicular to $\mathbf{w}_i$.
• The solution is to find a hyperplane that separates all the inputs according to the desired classification.
• For example: the boolean function AND.
[Figure: the four AND inputs in the plane, separated by a hyperplane (line)]

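As a concrete instance of the AND example, the sketch below checks one separating line. The weights (1, 1) and threshold 1.5 are an assumed choice; any line that leaves only the input (1, 1) on the positive side would work as well.

```python
from itertools import product


def theta(x):
    """Threshold function: 1 if x > 0, otherwise 0."""
    return 1 if x > 0 else 0


w = (1.0, 1.0)  # assumed weight vector, perpendicular to the separating line
T = 1.5         # assumed threshold: the line is xi_1 + xi_2 = 1.5


def unit(xi):
    """Output of one threshold unit for a 2-component boolean input."""
    return theta(w[0] * xi[0] + w[1] * xi[1] - T)


# Evaluate the unit on all four boolean inputs.
and_table = {xi: unit(xi) for xi in product((0, 1), repeat=2)}
print(and_table)
```
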
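Learning: Steepest descent on weights
• The optimal set of weights minimizes the following cost: $E(w) = \frac{1}{2}\sum_{\mu=1}^{p}\sum_{i=1}^{N_o}\left(\varsigma_i^\mu - O_i^\mu\right)^2$
• The steepest descent method will find a local minimum via $\Delta w_{ik} = -\eta\,\frac{\partial E}{\partial w_{ik}}$, or $\Delta w_{ik} = \eta\sum_\mu \delta_i^\mu\,\xi_k^\mu$, where the update can be done one pattern at a time, $\eta$ is the "learning rate", $\delta_i^\mu = g'(h_i^\mu)\left(\varsigma_i^\mu - O_i^\mu\right)$, and $h_i^\mu = \sum_k w_{ik}\,\xi_k^\mu - T$.
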
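The steepest descent rule can be tried on the AND problem with a single sigmoid unit. This is a sketch under stated assumptions: `g` is taken to be the logistic function (the slides only require outputs between 0 and 1), the threshold is learned alongside the weights, and the learning rate, epoch count and zero initialization are arbitrary choices.

```python
import math


def g(x):
    """Logistic sigmoid (assumed form of g), with g'(x) = g(x) * (1 - g(x))."""
    return 1.0 / (1.0 + math.exp(-x))


# (input pattern, desired output) pairs for boolean AND.
patterns = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def cost(w, T, patterns):
    """E = 1/2 * sum over patterns of (desired - output)^2."""
    return 0.5 * sum((s - g(w[0] * x[0] + w[1] * x[1] - T)) ** 2
                     for x, s in patterns)


def train(patterns, eta=1.0, epochs=5000):
    """Batch steepest descent on the weights and the threshold."""
    w, T = [0.0, 0.0], 0.0
    for _ in range(epochs):
        dw, dT = [0.0, 0.0], 0.0
        for x, s in patterns:
            O = g(w[0] * x[0] + w[1] * x[1] - T)
            delta = O * (1.0 - O) * (s - O)  # g'(h) * (desired - output)
            dw[0] += eta * delta * x[0]
            dw[1] += eta * delta * x[1]
            dT -= eta * delta                # h depends on T with slope -1
        w = [w[0] + dw[0], w[1] + dw[1]]
        T += dT
    return w, T


w, T = train(patterns)
outputs = [g(w[0] * x[0] + w[1] * x[1] - T) for x, _ in patterns]
print([round(o, 3) for o in outputs])
```

Moving the weight and threshold updates inside the inner loop turns this into the one-pattern-at-a-time variant mentioned on the slide.
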
Analysis of Learning Weights
• The steepest descent rule changes the weight vector $\mathbf{w}_i$ only in the direction of each pattern vector $\boldsymbol{\xi}^\mu$. Thus, components of the weight vector perpendicular to the input patterns are left unchanged. If a vector is perpendicular to all input patterns, then adding it to the weights does not affect the solution.
• For a sigmoidal $g$, the derivative $g'(h)$ is largest when $|h|$ is small. Since $\delta_i^\mu \propto g'(h_i^\mu)$, the largest changes occur for units in "doubt" (close to the threshold value).
[Figure: plot of $g$ ranging between 0 and 1]

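The claim about units in "doubt" can be checked numerically. The sketch assumes the logistic function for `g`, which the slides do not specify; for it, $g'(h) = g(h)\,(1 - g(h))$.

```python
import math


def g(h):
    """Logistic sigmoid (assumed form of g)."""
    return 1.0 / (1.0 + math.exp(-h))


def g_prime(h):
    """Derivative of the logistic sigmoid: g(h) * (1 - g(h))."""
    o = g(h)
    return o * (1.0 - o)


# Net inputs from far below to far above the threshold point h = 0.
hs = [-4, -2, -1, 0, 1, 2, 4]
derivs = [g_prime(h) for h in hs]
for h, d in zip(hs, derivs):
    print(h, round(d, 4))
```

The derivative peaks at h = 0 and falls off on both sides, so a unit whose net input is far from the threshold receives only a small correction even when its output is wrong.
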
Limitations of the Perceptron
• Many problems, some as simple as the XOR problem, cannot be solved by the perceptron (no hyperplane can separate the inputs).
[Figure: a candidate separating line for the XOR inputs, labeled "Not a solution"]

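The XOR limitation can be illustrated with a brute-force search over a small grid of weights and thresholds. The grid range and step are arbitrary assumptions, and a failed search is only an illustration, not a proof. The actual argument is short: with a threshold unit, XOR would require $T \ge 0$, $w_1 > T$, $w_2 > T$ and $w_1 + w_2 \le T$, which cannot all hold.

```python
from itertools import product


def theta(x):
    """Threshold function: 1 if x > 0, otherwise 0."""
    return 1 if x > 0 else 0


def solves(table, w1, w2, T):
    """True if the single unit theta(w1*x1 + w2*x2 - T) reproduces the table."""
    return all(theta(w1 * x1 + w2 * x2 - T) == out
               for (x1, x2), out in table.items())


AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

# Assumed search grid: -3.0 to 3.0 in steps of 0.25 for w1, w2 and T.
grid = [i / 4 for i in range(-12, 13)]


def count_solutions(table):
    """Number of grid points (w1, w2, T) that reproduce the truth table."""
    return sum(solves(table, w1, w2, T)
               for w1, w2, T in product(grid, repeat=3))


print("AND:", count_solutions(AND))
print("XOR:", count_solutions(XOR))
```
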
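Multilayer Neural Network
• $\xi^{(L)}$ - input from layer $L$ to layer $L+1$, with $\xi^{(0)} = \xi$
• $w^{(L)}$ - weights connecting layer $L$ to layer $L+1$
• $T^{(L)}$ - threshold values for units at layer $L$
Thus, the output of a two-layer network is written as $O_i = g\left(\sum_j w^{(1)}_{ij}\, g\left(\sum_k w^{(0)}_{jk}\,\xi_k - T^{(1)}_j\right) - T^{(2)}_i\right)$.
The cost to be optimized over all weights is given by $E(w) = \frac{1}{2}\sum_{\mu=1}^{p}\sum_{i=1}^{N_o}\left(\varsigma_i^\mu - O_i^\mu\right)^2$.
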
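The two-layer output can be sketched as two applications of the same layer computation. The names `layer` and `two_layer_output`, the logistic form of `g`, and the example weights and thresholds are assumptions for illustration only.

```python
import math


def g(x):
    """Logistic sigmoid (assumed form of g)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(w, T, inputs, act=g):
    """One layer: unit i outputs act(sum_j w[i][j] * inputs[j] - T[i])."""
    return [act(sum(w_ij * v for w_ij, v in zip(w_i, inputs)) - T_i)
            for w_i, T_i in zip(w, T)]


def two_layer_output(xi, w0, T1, w1, T2, act=g):
    """Feed the input through the hidden layer, then the output layer."""
    hidden = layer(w0, T1, xi, act)
    return layer(w1, T2, hidden, act)


# Hypothetical network: 2 inputs, 2 hidden units, 1 output unit.
w0 = [[0.0, 0.0], [0.0, 0.0]]  # input -> hidden weights
T1 = [0.0, 0.0]                # hidden thresholds
w1 = [[1.0, 1.0]]              # hidden -> output weights
T2 = [1.0]                     # output threshold
O = two_layer_output([1, 0], w0, T1, w1, T2)
print(O)
```

With zero input weights both hidden units output 0.5, so the output unit sees a net input of 0.5 + 0.5 - 1 = 0 and also returns 0.5.
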
Properties and How it Works
• With one input layer, one output layer, and one or more hidden layers, and enough units in each layer, any classification problem can be solved.
• Example: the XOR problem.
[Figure: network solving the XOR problem, with Layer L=0, Layer L=1 and Layer L=2]
• Later we address the generalization problem (for new examples).

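One way to solve the XOR example with a single hidden layer is sketched below. The hand-set weights are an assumed choice and not necessarily the ones in the original figure: the hidden layer computes OR and AND of the inputs, and the output unit fires when OR is on and AND is off.

```python
from itertools import product


def theta(x):
    """Threshold function: 1 if x > 0, otherwise 0."""
    return 1 if x > 0 else 0


def xor_net(x1, x2):
    """Layer L=0: inputs; layer L=1: two hidden units; layer L=2: output."""
    or_unit = theta(x1 + x2 - 0.5)    # hidden unit 1: fires if any input is on
    and_unit = theta(x1 + x2 - 1.5)   # hidden unit 2: fires if both are on
    return theta(or_unit - and_unit - 0.5)


xor_table = {x: xor_net(*x) for x in product((0, 1), repeat=2)}
print(xor_table)
```
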
Learning: Steepest descent on weights

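The equations of this slide did not survive in the text. The sketch below shows the standard steepest descent (backpropagation) gradients for a two-layer network, which may differ in notation from the original slide. It assumes a logistic `g`, a per-unit threshold, and hypothetical weights, and it compares the analytic gradients against finite differences.

```python
import math


def g(x):
    """Logistic sigmoid (assumed form of g), with g' = g * (1 - g)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(xi, w0, T1, w1, T2):
    """Return hidden activations V and outputs O of a two-layer network."""
    V = [g(sum(w * x for w, x in zip(row, xi)) - t) for row, t in zip(w0, T1)]
    O = [g(sum(w * v for w, v in zip(row, V)) - t) for row, t in zip(w1, T2)]
    return V, O


def cost(xi, target, w0, T1, w1, T2):
    """E = 1/2 * sum_i (target_i - O_i)^2 for one pattern."""
    _, O = forward(xi, w0, T1, w1, T2)
    return 0.5 * sum((s - o) ** 2 for s, o in zip(target, O))


def backprop(xi, target, w0, T1, w1, T2):
    """Gradients dE/dw0 and dE/dw1 for one pattern."""
    V, O = forward(xi, w0, T1, w1, T2)
    # Output deltas: delta_i = g'(h_i) * (target_i - O_i).
    d_out = [o * (1 - o) * (s - o) for s, o in zip(target, O)]
    # Hidden deltas: delta_j = g'(h_j) * sum_i w1[i][j] * delta_i.
    d_hid = [v * (1 - v) * sum(w1[i][j] * d_out[i] for i in range(len(w1)))
             for j, v in enumerate(V)]
    grad_w1 = [[-d * v for v in V] for d in d_out]
    grad_w0 = [[-d * x for x in xi] for d in d_hid]
    return grad_w0, grad_w1


def numeric_grad_w0(j, k, xi, target, w0, T1, w1, T2, eps=1e-6):
    """Central finite-difference estimate of dE/dw0[j][k]."""
    plus = [row[:] for row in w0]
    minus = [row[:] for row in w0]
    plus[j][k] += eps
    minus[j][k] -= eps
    return (cost(xi, target, plus, T1, w1, T2)
            - cost(xi, target, minus, T1, w1, T2)) / (2 * eps)


# Hypothetical network and training pattern.
w0, T1 = [[0.1, -0.2], [0.4, 0.3]], [0.05, -0.1]
w1, T2 = [[0.7, -0.5]], [0.2]
xi, target, eta = [1.0, 0.5], [1.0], 0.1

grad_w0, grad_w1 = backprop(xi, target, w0, T1, w1, T2)
# One steepest descent step: w <- w - eta * dE/dw.
new_w0 = [[w - eta * gw for w, gw in zip(r, gr)] for r, gr in zip(w0, grad_w0)]
new_w1 = [[w - eta * gw for w, gw in zip(r, gr)] for r, gr in zip(w1, grad_w1)]
print(cost(xi, target, w0, T1, w1, T2), cost(xi, target, new_w0, T1, new_w1, T2))
```

The finite-difference comparison is a common sanity check: if the analytic and numeric gradients disagree, the delta propagation is wrong somewhere.
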
Learning Threshold Values

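The equations of this slide did not survive in the text. A standard way to learn thresholds, given here as a reconstruction under assumptions rather than the slide's own derivation, is to apply steepest descent to the threshold as well. The derivation assumes a per-unit threshold $T_i$, the cost $E = \frac{1}{2}\sum_{\mu,i}(\varsigma_i^\mu - O_i^\mu)^2$, and $\delta_i^\mu = g'(h_i^\mu)(\varsigma_i^\mu - O_i^\mu)$.

```latex
\begin{align}
h_i^\mu &= \sum_k w_{ik}\,\xi_k^\mu - T_i, \qquad O_i^\mu = g(h_i^\mu), \\
\frac{\partial E}{\partial T_i}
  &= -\sum_\mu \left(\varsigma_i^\mu - O_i^\mu\right) g'(h_i^\mu)\,
     \frac{\partial h_i^\mu}{\partial T_i}
   = \sum_\mu \delta_i^\mu, \\
\Delta T_i &= -\eta\,\frac{\partial E}{\partial T_i}
            = -\eta \sum_\mu \delta_i^\mu .
\end{align}
```

Equivalently, the threshold can be treated as an ordinary weight attached to an extra input that is always $-1$, so the weight update rule covers it without a separate formula.
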