CS344: Introduction to Artificial Intelligence
Pushpak Bhattacharyya, CSE Dept., IIT Bombay
Lecture 17: Feedforward network: non-linear separability
9th Feb, 2012
# of Threshold Functions

n    #Boolean functions (2^(2^n))    #Threshold functions (~2^(n²))
1    4                               4
2    16                              14
3    256                             104
4    64K                             1882

• The functions computable by a perceptron are exactly the threshold functions.
• #TF becomes a negligibly small fraction of #BF as n grows.
• For n = 2, all functions except XOR and XNOR are computable.
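The n = 2 row can be verified by brute force. The sketch below (illustrative, not from the lecture) enumerates all 16 Boolean functions of 2 variables and tests each against the threshold rule y = 1 iff w1·x1 + w2·x2 ≥ θ; the small integer grid for (w1, w2, θ) is an assumption, but it suffices for n = 2.

```python
from itertools import product

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]

def is_threshold(truth):
    # Search a small integer grid for weights and threshold realising `truth`.
    for w1, w2, theta in product(range(-2, 3), repeat=3):
        if all((w1 * x1 + w2 * x2 >= theta) == bool(t)
               for (x1, x2), t in zip(inputs, truth)):
            return True
    return False

count = sum(is_threshold(t) for t in product([0, 1], repeat=4))
print(count)  # 14 -- every function except XOR (0,1,1,0) and XNOR (1,0,0,1)
```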
The number of regions produced by n hyperplanes in d dimensions determines the computing power of the perceptron.
Limitations of the perceptron
• Non-linear separability is all-pervading.
• A single perceptron does not have enough computing power.
• E.g., XOR cannot be computed by a perceptron.
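For completeness, here is the short standard argument (not on the slide) for why no single threshold unit with output y = 1 iff w1·x1 + w2·x2 ≥ θ can compute XOR:

```latex
% The four XOR constraints on (w_1, w_2, \theta):
\begin{aligned}
f(0,0) = 0 &\;\Rightarrow\; 0 < \theta\\
f(1,0) = 1 &\;\Rightarrow\; w_1 \ge \theta\\
f(0,1) = 1 &\;\Rightarrow\; w_2 \ge \theta\\
f(1,1) = 0 &\;\Rightarrow\; w_1 + w_2 < \theta
\end{aligned}
```

Adding the middle two inequalities gives w1 + w2 ≥ 2θ > θ (since θ > 0 by the first line), contradicting the last line; hence no (w1, w2, θ) exists.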
Solutions
• Tolerate error (e.g., the pocket algorithm used by connectionist expert systems).
• Try to get the best possible hyperplane using only perceptrons.
• Use higher-order surfaces, e.g., degree-2 surfaces like the parabola.
• Use a layered network.
Pocket Algorithm
• Algorithm evolved in 1985; essentially uses the PTA (Perceptron Training Algorithm).
• Basic idea: always preserve the best weights obtained so far in the "pocket".
• Replace the pocket weights only when the current weights are found better, i.e., the changed weights result in reduced error (a sketch follows).
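A minimal sketch of the idea (the data format, iteration budget and seed are illustrative assumptions, not from the lecture):

```python
import random

def pocket_train(examples, n_features, iters=1000, seed=0):
    """Perceptron training that keeps the best weights seen so far.
    examples: list of (x, t), x an n_features-tuple of 0/1, t a 0/1 target.
    w[0] acts as a bias: a constant input x0 = 1 is prepended to each x."""
    rng = random.Random(seed)

    def n_errors(w):
        return sum((sum(wi * xi for wi, xi in zip(w, (1,) + x)) >= 0) != bool(t)
                   for x, t in examples)

    w = [0.0] * (n_features + 1)
    pocket, pocket_err = list(w), n_errors(w)
    for _ in range(iters):
        x, t = rng.choice(examples)
        xb = (1,) + x
        if (sum(wi * xi for wi, xi in zip(w, xb)) >= 0) != bool(t):
            sign = 1 if t else -1                  # standard PTA correction
            w = [wi + sign * xi for wi, xi in zip(w, xb)]
            err = n_errors(w)
            if err < pocket_err:                   # keep better weights "in the pocket"
                pocket, pocket_err = list(w), err
    return pocket, pocket_err

# On XOR the best a single perceptron can do is 3 of 4 points:
xor = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
print(pocket_train(xor, 2))   # pocket error is typically 1
```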
Linear Separability (Single Hyper-plane)
[Figure: the four points (0,0) -ve, (0,1) +ve, (1,0) +ve, (1,1) -ve, with two candidate separating lines; Hyper-plane 1 is better than Hyper-plane 2 since it makes fewer errors.]
Parabolic Surface Separation
[Figure: the same four points, (0,0) and (1,1) -ve, (0,1) and (1,0) +ve, separated correctly by a parabolic (degree-2) surface.]
Linear Surface Separation (Multiple Hyper-planes)
[Figure: the same four points separated correctly by two hyper-planes (Hyper-plane 1 and Hyper-plane 2) used together.]
Multilayer Perceptron for XOR (1/2)
Required input-output behaviour:
x1  x2  |  y
0   0   |  0
0   1   |  1
1   0   |  1
1   1   |  0
Multilayer Perceptron for XOR (2/2)
• Hidden perceptron P1 computes y1 = x1 OR x2; hidden perceptron P2 computes y2 = x1 NAND x2; the output is y = y1 AND y2.
• At (0,0): y1 = 0, y2 = 1 → y = 0
• At (0,1): y1 = 1, y2 = 1 → y = 1
• At (1,0): y1 = 1, y2 = 1 → y = 1
• At (1,1): y1 = 1, y2 = 0 → y = 0
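A minimal runnable sketch of this construction (hard-threshold units; the helper name `unit` and the particular weight/threshold choices for OR, NAND and AND are mine, not from the slide):

```python
# Each perceptron fires (outputs 1) iff its weighted input sum reaches theta.
def unit(w1, w2, theta):
    return lambda a, b: int(w1 * a + w2 * b >= theta)

p1 = unit(1, 1, 1)      # y1 = x1 OR x2
p2 = unit(-1, -1, -1)   # y2 = x1 NAND x2
out = unit(1, 1, 2)     # y = y1 AND y2

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, out(p1(x1, x2), p2(x1, x2)))  # prints 0, 1, 1, 0 -- XOR
```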
XOR using 2 layers: Another way
• A non-linearly-separable function expressed as a linearly separable function of individually linearly separable functions.
Example: XOR
• XOR(x1, x2) = x1x̄2 OR x̄1x2: compute each AND term with one hidden perceptron and OR them at the output.
• Output unit (ORs the hidden outputs): θ = 0.5, w1 = 1, w2 = 1.
• Hidden unit computing x1x̄2: θ = 1, w1 = 1.5, w2 = -1.
• Hidden unit computing x̄1x2: θ = 1, w1 = -1, w2 = 1.5.
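The same quick check for the weights on this slide (an illustrative sketch; threshold convention assumed: output 1 iff the weighted sum ≥ θ):

```python
def unit(w1, w2, theta):
    return lambda a, b: int(w1 * a + w2 * b >= theta)

h1 = unit(1.5, -1, 1)    # computes x1 AND (NOT x2)
h2 = unit(-1, 1.5, 1)    # computes (NOT x1) AND x2
out = unit(1, 1, 0.5)    # ORs the two hidden outputs

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, out(h1(x1, x2), h2(x1, x2)))  # prints 0, 1, 1, 0 -- XOR
```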
Some Terminology
• A multilayer feedforward neural network has:
• Input layer
• Output layer
• Hidden layer(s) (which assist the computation)
• Output units and hidden units are called computation units.
Training of the MLP
• MLP: Multilayer Perceptron.
• Question: how do we find weights for the hidden layers when no target output is available for them?
• This is the credit assignment problem, to be solved by "gradient descent" (a sketch follows).
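A minimal sketch of how gradient descent assigns credit to hidden weights: backpropagation on a 2-2-1 sigmoid network trained on XOR. The architecture, learning rate, epoch count and seed are illustrative assumptions, not from the lecture.

```python
import math, random

random.seed(0)
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def sig(z):
    return 1.0 / (1.0 + math.exp(-z))

Wh = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]  # hidden weights
bh = [0.0, 0.0]                                                     # hidden biases
wo = [random.uniform(-1, 1) for _ in range(2)]                      # output weights
bo, lr = 0.0, 0.5

for _ in range(10000):
    for x, t in data:
        h = [sig(sum(Wh[j][i] * x[i] for i in range(2)) + bh[j]) for j in range(2)]
        y = sig(sum(wo[j] * h[j] for j in range(2)) + bo)
        dy = (y - t) * y * (1 - y)                               # error signal at output
        dh = [dy * wo[j] * h[j] * (1 - h[j]) for j in range(2)]  # credit pushed back
        for j in range(2):
            wo[j] -= lr * dy * h[j]
            bh[j] -= lr * dh[j]
            for i in range(2):
                Wh[j][i] -= lr * dh[j] * x[i]
        bo -= lr * dy

for x, t in data:
    h = [sig(sum(Wh[j][i] * x[i] for i in range(2)) + bh[j]) for j in range(2)]
    print(x, round(sig(sum(wo[j] * h[j] for j in range(2)) + bo)))  # should match XOR
# (If training stalls in a local minimum, a different seed usually fixes it.)
```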