CSNB234 ARTIFICIAL INTELLIGENCE Chapter 10 Artificial Neural Networks (ANN) (Chapter 11, pp. 458-471, Textbook) (Chapter 18, Ref. #1) Instructor: Alicia Tang Y. C. UNIVERSITI TENAGA NASIONAL
What is Neural Network? • Neural networks are a different paradigm for computing: • they are based on the parallel architecture of animal brains. • A neural network is a model that simulates a biological neural network. • Real brains, however, are orders of magnitude more complex than any artificial neural network so far considered.
Artificial Neural Networks • Supervised Learning • The Perceptron • Multilayer Neural Networks that use a backpropagation learning algorithm • The Hopfield network • Stochastic network • Unsupervised Learning • Hebbian Learning • Competitive Learning • Kohonen Network (SOM)
SUPERVISED LEARNING [Diagram: INPUT → ANN → OUTPUT; the output is compared with the EXPECTED OUTPUT by an ERROR HANDLER, whose error signal is fed back to the ANN in a feedback loop]
UNSUPERVISED LEARNING [Diagram: INPUT → unsupervised learning program → OUTPUT] The learning program adjusts itself to figure out what the output could be. There are no targets to match whatsoever.
A Schematic of a Neuron
Neural network at the first glimpse • Neuron • A neuron consists of a cell body (soma) with many dendrites and a single branch called an axon • It is the information processor • dendrites handle inputs - receive signals • the soma does the processing • the axon holds the output • Neurons are connected by Synapses • synapses - the points of contact between neurons - are modelled by (adjustable) weights
What is in a Neural Network? • The model consists of artificial neurons (processing elements) • they are called nodes or units • the exact term depends on the hardware or software implementation • All neurons are connected in some structure that forms a “network”, i.e. the neurons are interconnected • A neural network usually operates in parallel • parallel computation • doing multiple things at the same time.
What’s Special in a Neural Network? • Its computing architecture is based on: • a large number of relatively simple processors • operating in PARALLEL • connected to each other by a link system
How does the artificial neural network model the brain? • An artificial neural network consists of a number of interconnected processors. • These processors are made very simple; they are analogous to the biological neurons in the human brain. • The neurons are connected by weighted links that pass signals from one neuron to another. • Each neuron receives a number of signals, and it produces only one output signal through its outgoing connection. • The outgoing connection, in turn, splits into a number of branches that transmit the same signal. • The outgoing branches terminate at the incoming connections of other neurons in the network.
Why Neural Network Computing? • To model and mimic certain processing capabilities of our brain. • It imitates the way a human brain works, learns, etc.
A Neural Network Model • Consists of • Input units xi • Weight wi from unit i • An activation level a • A threshold θ • A network topology • A learning algorithm (xi, wi, a and θ are real numbers)
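To make these components concrete, here is a minimal Python sketch of a single threshold unit (the function and variable names are illustrative, not from the textbook):

```python
def neuron_output(x, w, theta):
    """One artificial neuron: fire (output 1) iff the weighted
    sum of the inputs reaches the threshold theta."""
    activation = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if activation >= theta else 0
```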
Neural Network with Hidden Layer(s)
Perceptrons Learn by Adjusting Weights
An example of the use of ANN
THE PERCEPTRON (Single Layer Neural Network)
Perceptron • Developed by Frank Rosenblatt (1958). • Its learning rule is superior to the Hebb learning rule. • Rosenblatt proved that the weights can converge for particular applications. • However, the Perceptron does not work for nonlinear applications, as proven by Minsky and Papert (1969). • The activation function used is the binary step function with an arbitrary, but fixed, threshold. • Weights are adjusted by the Perceptron learning rule.
A Perceptron • Is a simple neural network [Diagram: input units 1, 2, …, n each linked to a single output unit] Given that • Input unit xi • Weight wi from unit i • Activation level a • Threshold θ
Threshold Function used by Perceptron a = 1 if Σi=1..n wi·xi ≥ θ, a = 0 otherwise ------ (1) A unit is said to be ‘on’ or ‘active’ if its activation level is ‘1’.
Perceptron Threshold Function
A Perceptron that learns the “AND” and “OR” concepts: [Diagram: two perceptrons, each with two inputs; the weights shown next to the arcs/links are all 1; the threshold θ shown next to the output is 1.5 for the AND-function and 0.5 for the OR-function]
The AND perceptron will have its output ‘on’ iff x1·1 + x2·1 ≥ 1.5 ---- using (1) The perceptron learns by repeated adjustment of the ‘weights’ through repeated presentation of examples:
P Q P AND Q
-----------------------------------------
1 1 1
1 0 0
0 1 0
0 0 0
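A quick check of rule (1) in Python, using the weights and thresholds from the slide above (a hedged sketch; the function name is mine):

```python
def perceptron(x1, x2, w1=1, w2=1, theta=1.5):
    # Rule (1): output is 'on' iff w1*x1 + w2*x2 >= theta
    return 1 if w1 * x1 + w2 * x2 >= theta else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        # theta=1.5 reproduces the AND column; theta=0.5 would give OR
        print(x1, x2, perceptron(x1, x2))
```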
A more abstract characterisation • We view the inputs x1, x2, … xn to a perceptron as vectors in n-dimensional space • Since activation levels are restricted to 1 or 0, all input vectors lie on the corners of a hypercube in this space • We may view the weights and threshold as defining a hyperplane satisfying the equation: • w1x1 + w2x2 + … + wnxn − θ = 0
Geometric Interpretation • Input vectors are classified according to which side of the hyperplane they fall on • This is termed Linear Discrimination • e.g. the four possible inputs fall on the vertices of a square, and • w1x1 + w2x2 − θ = 0 • defines a line in the plane
Linear Discrimination • E.g. ax1 + bx2 − c = 0 (a straight line) • ax1 + bx2 − c ≥ 0 (one side of the straight line; ≤ 0 on the other side)
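The same idea in code: with a = 1, b = 1, c = 1.5 (the AND perceptron above), only the corner (1,1) of the square falls on the ≥ 0 side of the line (an illustrative sketch, not from the slides):

```python
def discriminant(x1, x2, a=1.0, b=1.0, c=1.5):
    # Which side of the line a*x1 + b*x2 - c = 0 does the point fall on?
    return a * x1 + b * x2 - c

for corner in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    side = ">= 0" if discriminant(*corner) >= 0 else "< 0"
    print(corner, side)   # only (1, 1) lands on the >= 0 side
```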
Perceptron cannot compute XOR function (I) [Graph of the XOR function: the points (1,0) and (0,1) are marked “+”, the points (0,0) and (1,1) are marked “-”] No straight line can be drawn to separate the “+” from the “-”. Try it yourself, if you don’t believe it.
P Q P XOR Q
-----------------------------------------
1 1 0
1 0 1
0 1 1
0 0 0
Hidden layers required!!
Perceptron cannot compute XOR function (II) • Consider this net: [Diagram: each input feeds the output unit with weight 1 and also feeds a hidden unit with weight 1; the hidden unit (threshold 1.5) feeds the output unit with weight -2; the output unit’s threshold is 0.5] • This suggests that neural nets of threshold units comprising more than one layer can correctly compute the XOR function, as the check below confirms
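Assuming the numbers on the slide are wired in the classic way (inputs to the hidden unit and to the output with weight 1 each, hidden threshold 1.5, hidden-to-output weight -2, output threshold 0.5), a few lines of Python confirm the net computes XOR:

```python
def step(z, theta):
    return 1 if z >= theta else 0

def xor_net(x1, x2):
    # Hidden unit: weight 1 from each input, threshold 1.5 (fires only for 1,1)
    h = step(x1 + x2, 1.5)
    # Output unit: weight 1 from each input, -2 from the hidden unit, threshold 0.5
    return step(x1 + x2 - 2 * h, 0.5)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_net(x1, x2))   # reproduces the XOR truth table
```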
Perceptron cannot compute XOR function (III) • A hidden unit is neither an input nor an output unit, thus we need not concern ourselves with its activation level • Any function a perceptron can compute, a perceptron can learn
Description of A Learning Task • Rules: • to teach a perceptron a function f which maps n binary values x1, x2, … xn to a binary output f(x1, x2, … xn). • Think of f as being the AND function • { f(1,1)=1, f(1,0)=0, f(0,1)=0, f(0,0)=0 } • We start off with random weights & threshold; for each presentation of the inputs, the output unit takes some activation level a, either 1 or 0.
We then compare the actual output a with the desired output f(x1, x2, … xn) = t • ‘t’ for teaching • If the two are the same, then leave the weights/threshold alone
Perceptron Learning Algorithm
Set wi (i = 1, 2, .., n) and θ to be real numbers • Set η to be a positive real number • UNTIL ap = tp for each input pattern p DO • FOR each input pattern p = (x1p … xnp) DO • let the new weights & threshold be: • wi ← wi + η · (tp − ap) · xip • θ ← θ − η · (tp − ap) • ENDFOR • END UNTIL
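A direct Python transcription of this pseudocode (a sketch under the stated initialisation; the names are mine, not from the textbook):

```python
import random

def train_perceptron(patterns, eta=0.1, max_epochs=1000):
    """patterns: list of (inputs, target) pairs with binary values."""
    n = len(patterns[0][0])
    w = [random.uniform(-0.5, 0.5) for _ in range(n)]
    theta = random.uniform(-0.5, 0.5)
    for _ in range(max_epochs):                    # the UNTIL loop
        all_correct = True
        for x, t in patterns:                      # the FOR loop over patterns
            a = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0
            if a != t:
                all_correct = False
                w = [wi + eta * (t - a) * xi for wi, xi in zip(w, x)]
                theta -= eta * (t - a)
        if all_correct:                            # every a_p equals t_p
            break
    return w, theta

# Teaching the AND function from the earlier slides:
w, theta = train_perceptron([((1, 1), 1), ((1, 0), 0), ((0, 1), 0), ((0, 0), 0)])
```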
A few words on η • This is the learning rate • The amount by which we adjust wi & θ for each pattern p • It affects the speed of learning • a fairly small positive number is suggested • if it is too big --> we overstep the minimum • if it is too small --> we move very, very slowly
[Figure: an error curve with x’s marking successive weight updates - a step size that is too large skips over the minimum (“Minima is here & being skipped”), while one that is too small crawls along (“Too slow!!! Crawling ….”)]
Multi-layer Neural Networks (MLP) • Hidden layers are required… • What are hidden layers? • They are layers additional to the input and output layers, • not connected externally. • They are located in between the input and output layers.
Multi-layer Perceptron (MLP) • Builds nonlinear classifiers based on Perceptrons • The structure of an MLP is usually found by experimentation • Its parameters can be found using backpropagation
Multi-layer Perceptron (MLP) • How to learn? • We cannot simply use the Perceptron learning rule, because we have hidden layer(s) • There is a function that we are trying to minimize: the error • We need a different activation function: • use the sigmoid function instead of the threshold function
Formulas needed for the backpropagation learning algorithm
Multi-layer Neural Networks • Modifications done to the “units” • We still assume input values are either 1 or 0 • Output values are either 1 or 0 • But activation levels may take on any real number between 0 and 1 • Thus, • to find the activation level of each unit xj, we first take the net input to xj to be the weighted sum given by this formula: • netj = (Σi wji · xi) − θj ------ (2)
Here, • the summation runs over all units xi in the layer previous to xj • with wji denoting the weight on the link from xi to unit xj • and θj the threshold corresponding to xj Instead of the step function, we now use the SIGMOID function
Sigmoid Function • Is a continuous function • Also called a smooth function • Why is this f(x) needed? • It is a mathematical function that produces a sigmoid curve (i.e. an S shape). It is a special case of the logistic function. It is used in neural networks to introduce non-linearity into the learning model. f(netj) = 1 / (1 + e^(−(Σi wji · xi − θj) / T)) --- Sigmoid f(x), with the sum running over all i
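In Python, the sigmoid and the net input of formula (2) look like this (T is the temperature in the slide’s formula; T = 1 gives the standard logistic function):

```python
import math

def net_input(x, w, theta):
    # Formula (2): net_j = sum_i(w_ji * x_i) - theta_j
    return sum(wi * xi for wi, xi in zip(w, x)) - theta

def sigmoid(net, T=1.0):
    # Smooth, S-shaped replacement for the step function
    return 1.0 / (1.0 + math.exp(-net / T))
```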
Learning in Multi-layer NN via the ‘Backpropagation’ learning algorithm • All input patterns p are fed one at a time into the input units • the actual responses of the output units are compared with the desired outputs • adjustments are made to the weights in response to discrepancies between the desired & actual outputs • after all input patterns have been presented, the whole process is repeated over & over until the actual response of the outputs is tolerably close to the desired response
We now examine the procedure for adjusting the weights. For an output unit j: • δjp = (tj − aj) -------- (3) • where • δjp = error at unit j in response to the presentation of input pattern p • tj = desired response • aj = actual response
The weights leading to unit j are modified in much the same way as for the single-layer perceptron • For all units k which feed into unit j, we set: wj,k ← wj,k + η · akp · δjp · f′(netjp) -------- (4) f′(netjp) = rate of change of the function at that point, i.e. the derivative of the activation function
What if unit j is a hidden unit? • The measure δjp of the error at unit j cannot this time be given by the difference (tj − aj) [recall formula (3)] • because we do not know what the response of the hidden units should be!! • Instead, it is calculated on the basis of the errors of the units in the layer immediately above unit j
Specifically, the error at unit j is the weighted sum of ALL the errors at the units k such that there is a link from unit j to unit k, with the weighting simply being given by the weights on those links: δjp = Σk wk,j · δkp ------ (5)
Equation (3) tells us how to calculate the error for output units and equation (5) tells us how to calculate the errors for hidden units in terms of the errors in the layer above. We can construct a “goodness-of-fit” measure, which is used to determine how close the network is to computing the function we are trying to teach it. A (sensible) measure is: E = Σp Ep, where Ep = Σj (tjp − ojp)²
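Putting formulas (2)-(5) together, here is a minimal backpropagation sketch for a 2-2-1 network learning XOR. It follows standard practice and folds the f′ factor of rule (4) into each unit’s δ; the threshold update mirrors the perceptron rule given earlier, which is my assumption rather than a formula from the slides:

```python
import math, random

def f(net):                # sigmoid activation
    return 1.0 / (1.0 + math.exp(-net))

def fprime(out):           # f'(net) expressed via the output: f(net) * (1 - f(net))
    return out * (1.0 - out)

random.seed(0)
w_hid = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]  # input -> hidden
th_hid = [random.uniform(-1, 1) for _ in range(2)]
w_out = [random.uniform(-1, 1) for _ in range(2)]                      # hidden -> output
th_out = random.uniform(-1, 1)
eta = 0.5
patterns = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]        # XOR

for epoch in range(20000):                 # may need several thousand passes
    for x, t in patterns:
        # Forward pass, formula (2)
        h = [f(sum(w * xi for w, xi in zip(w_hid[j], x)) - th_hid[j]) for j in range(2)]
        o = f(sum(w * hj for w, hj in zip(w_out, h)) - th_out)
        # Output error, formula (3), with the f' of rule (4) folded in
        delta_o = (t - o) * fprime(o)
        # Hidden errors, formula (5): weighted errors from the layer above
        delta_h = [w_out[j] * delta_o * fprime(h[j]) for j in range(2)]
        # Weight and threshold updates, rule (4)
        for j in range(2):
            w_out[j] += eta * h[j] * delta_o
            for i in range(2):
                w_hid[j][i] += eta * x[i] * delta_h[j]
            th_hid[j] -= eta * delta_h[j]  # assumed, by analogy with the perceptron rule
        th_out -= eta * delta_o
```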
ANN Promises • A successful application area of ANN is “vision”. • A NN can survive the failure of some nodes • It handles noise (missing data) well: once trained, a NN shows an ability to recognize patterns even though part of the data is missing • A tool for modeling and exploring brain function • Parallelism (without much effort) • A neural network can perform automatic knowledge acquisition for situations in which historical data are available.
ANN unsolved problems • It cannot (yet) model high-level cognitive mechanisms such as attention • Brains are very large, having billions of neurons and trillions of synaptic connections • There is growing evidence that (human) neurons learn not merely by adjusting weights but also by growing new connections