Machine Learning: Artificial Neural Networks
The Brain • How do brains work? • How do human brains differ from those of other animals? • Can we base models of artificial intelligence on the structure and inner workings of the brain?
The Brain • The human brain consists of: • Approximately 10 billion neurons • …and 60 trillion connections (synapses) • The brain is a highly complex, nonlinear, parallel information-processing system • By firing many neurons simultaneously, the brain performs its functions faster than the fastest computers in existence today
The Brain • An individual neuron has a very simple structure: • The cell body is called the soma • The small connective fibers are called dendrites • The single long fibers are called axons • An army of such elements constitutes tremendous processing power
Artificial Neural Networks • An artificial neural network consists of a number of very simple processors called neurons • Neurons are connected by weighted links • The links pass signals from one neuron to another based on predefined thresholds
Artificial Neural Networks • An individual neuron (McCulloch & Pitts, 1943): • Computes the weighted sum of the input signals • Compares the result with a threshold value, θ • If the net input is less than the threshold, the neuron output is –1 (or 0) • Otherwise, the neuron becomes activated and its output is +1
Artificial Neural Networks • The neuron computes its net input as the weighted sum X = x1w1 + x2w2 + ... + xnwn and compares X against the threshold θ
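As a minimal sketch of this computation (not part of the original slides; the weights, inputs, and threshold below are illustrative assumptions), the McCulloch–Pitts neuron fits in a few lines of Python:

```python
# A minimal sketch of the McCulloch-Pitts neuron described above.
# The weights, inputs, and threshold are illustrative values only.

def mcp_neuron(inputs, weights, theta):
    """Compute X = x1*w1 + ... + xn*wn and apply the threshold theta."""
    x = sum(xi * wi for xi, wi in zip(inputs, weights))
    return +1 if x >= theta else -1

# Example: two inputs with equal weights behave like logical AND
# when theta lies between each weight and their sum.
print(mcp_neuron([1, 1], [0.5, 0.5], theta=0.8))   # +1 (activated)
print(mcp_neuron([1, 0], [0.5, 0.5], theta=0.8))   # -1
```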
Activation Functions • Individual neurons adhere to an activation function, which determines whether they propagate their signal (i.e. activate) or not • [Figure: the sign activation function]
Activation Functions • The step and sign activation functions are often called hard limit functions • We use such functions in decision-making neural networks • They support classification and other pattern-recognition tasks
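For reference, here is a small Python sketch of the step, sign, and sigmoid functions discussed in these slides; the convention for the output exactly at x = 0 varies between texts, so treat these as one reasonable choice:

```python
import math

def step(x):
    """Hard limiter: output 1 once the net input reaches 0, else 0."""
    return 1 if x >= 0 else 0

def sign(x):
    """Hard limiter: output +1 once the net input reaches 0, else -1."""
    return 1 if x >= 0 else -1

def sigmoid(x):
    """Soft, S-shaped function squashing any net input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))
```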
Perceptrons • Can an individual neuron learn? • In 1958, Frank Rosenblatt introduced a training algorithm that provided the first procedure for training a single-node neural network • Rosenblatt’s perceptron model consists of a single neuron with adjustable synaptic weights, followed by a hard limiter
Perceptrons • A two-input perceptron computes the net input X = x1w1 + x2w2 and outputs Y = step(X – θ)
Perceptrons • A perceptron: • Classifies inputs x1, x2, ..., xn into one of two distinct classes A1 and A2 • Forms a linearly separable decision boundary defined by: x1w1 + x2w2 + ... + xnwn – θ = 0
Perceptrons • A perceptron with three inputs x1, x2, and x3 classifies its inputs into two distinct sets A1 and A2
Perceptrons • How does a perceptron learn? • A perceptron has initial (often random) weights, typically in the range [–0.5, 0.5] • Apply an established training dataset • Calculate the error as expected output minus actual output: error e = Yexpected – Yactual • Adjust the weights to reduce the error
Perceptrons • How do we adjust a perceptron’s weights to produce Yexpected? • If e is positive, we need to increase Yactual (and vice versa) • Use this formula: wi(p+1) = wi(p) + Δwi(p), where Δwi(p) = α × xi(p) × e(p) • α is the learning rate (between 0 and 1) • e is the calculated error
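A one-line Python sketch of this update rule (the function and variable names are mine; the rule itself is the delta rule from the slide above):

```python
def update_weights(weights, inputs, error, alpha=0.1):
    """Apply w_i(p+1) = w_i(p) + alpha * x_i(p) * e(p) to every weight."""
    return [w + alpha * x * error for w, x in zip(weights, inputs)]
```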
Perceptron Example – AND • Train a perceptron to recognize logical AND • Use threshold Θ = 0.2 and learning rate α = 0.1
Perceptron Example – AND • Using threshold Θ = 0.2 and learning rate α = 0.1, repeat until convergence • i.e. the final weights do not change and there is no error
Perceptron Example – AND • Two-dimensional plot of the logical AND operation: • A single perceptron can be trained to recognize any linearly separable function • Can we train a perceptron to recognize logical OR? • How about logical exclusive-OR (i.e. XOR)?
Perceptron – OR and XOR • Two-dimensional plots of logical OR and XOR: • OR is linearly separable; XOR is not: no single straight line can separate its two output classes, so a single perceptron cannot learn it
Perceptron Coding Exercise • Write code to: • Calculate the error at each step • Modify weights, if necessary • i.e. if the error is non-zero • Loop until all error values are zero for a full epoch • Modify your code to learn to recognize the logical OR operation • Then try to recognize the XOR operation… (see the sketch below)
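One possible solution sketch for this exercise, assuming the step activation, the threshold Θ = 0.2, and the learning rate α = 0.1 from the AND example (function and variable names are mine):

```python
import random

def train_perceptron(dataset, theta=0.2, alpha=0.1, max_epochs=1000):
    """dataset: list of ((x1, x2), expected_output) pairs."""
    weights = [random.uniform(-0.5, 0.5) for _ in range(2)]
    for _ in range(max_epochs):
        total_error = 0
        for inputs, expected in dataset:
            x = sum(xi * wi for xi, wi in zip(inputs, weights))
            actual = 1 if x >= theta else 0            # step activation
            e = expected - actual                      # error at this step
            total_error += abs(e)
            # modify weights if the error is non-zero
            weights = [w + alpha * xi * e for w, xi in zip(weights, inputs)]
        if total_error == 0:                           # full epoch, no error
            return weights
    return None                                        # failed to converge

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
OR  = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

print(train_perceptron(AND))   # converges to a valid weight pair
print(train_perceptron(OR))    # converges as well
print(train_perceptron(XOR))   # None: XOR is not linearly separable
```

The XOR call never converges, no matter how many epochs you allow: there is no straight line through the unit square that separates its outputs.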
Multilayer Neural Networks • Multilayer neural networks consist of: • An input layer of source neurons • One or more hidden layers of computational neurons • An output layer of more computational neurons • Input signals are propagated in a layer-by-layer feedforward manner
Multilayer Neural Networks • [Figure: feedforward network; input signals propagate left to right from the input layer, through the hidden layers, to the output layer, where the output signals emerge]
Multilayer Neural Networks • Input-layer neurons simply pass their signals on: XINPUT = x1 • The net input to a hidden neuron is the weighted sum of the input signals: XH = x1w11 + x2w21 + ... + xiwi1 + ... + xnwn1 • The net input to an output neuron is the weighted sum of the hidden-layer outputs: XOUTPUT = yH1w11 + yH2w21 + ... + yHjwj1 + ... + yHmwm1
Multilayer Neural Networks • Three-layer network: • [Figure: three-layer feedforward network with labeled weights such as w14]
Multilayer Neural Networks • Commercial-quality neural networks often incorporate 4 or more layers • Each layer consists of about 10-1000 individual neurons • Experimental and research-based neural networks often use 5 or 6 (or more) layers • Overall, millions of individual neurons may be used
Back-Propagation NNs • A back-propagation neural network is a multilayer neural network that propagates error backwards through the network as it learns • Weights are modified based on the calculated error • Training is complete when the error is below a specified threshold • e.g. less than 0.001
Back-Propagation NNs • Use the sigmoid activation function, and apply the threshold θ by connecting a fixed input of –1 through a weight equal to θ • Initially: w13 = 0.5, w14 = 0.9, w23 = 0.4, w24 = 1.0, w35 = –1.2, w45 = 1.1, θ3 = 0.8, θ4 = –0.1, and θ5 = 0.3 • [Figure: three-layer network with inputs x1, x2, hidden neurons 3 and 4, and output neuron 5]
Step 2: Activation Activate the back-propagation neural network by applying inputs x1(p), x2(p), …, xn(p) and desired outputs yd,1(p), yd,2(p), …, yd,n(p). (a) Calculate the actual outputs of the neurons in the hidden layer: yj(p) = sigmoid[x1(p)·w1j(p) + ... + xn(p)·wnj(p) – θj] where n is the number of inputs of neuron j in the hidden layer, and sigmoid is the sigmoid activation function.
Step 2: Activation (continued) (b) Calculate the actual outputs of the neurons in the output layer: yk(p) = sigmoid[y1(p)·w1k(p) + ... + ym(p)·wmk(p) – θk] where m is the number of inputs of neuron k in the output layer.
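As a Python sketch of Step 2 (under the slides' convention that the threshold θ is subtracted from the weighted sum; the function names are mine):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights, theta):
    """y = sigmoid( x1*w1 + ... + xn*wn - theta )"""
    return sigmoid(sum(x * w for x, w in zip(inputs, weights)) - theta)

# Reproducing the worked example that follows (x1 = x2 = 1, with the
# initial weights and thresholds given earlier):
y3 = neuron_output([1, 1], [0.5, 0.4], theta=0.8)     # ~0.5250
y4 = neuron_output([1, 1], [0.9, 1.0], theta=-0.1)    # ~0.8808
y5 = neuron_output([y3, y4], [-1.2, 1.1], theta=0.3)  # ~0.5097
```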
We consider a training set where inputs x1 and x2 are equal to 1 and the desired output yd,5 is 0. The actual outputs of neurons 3 and 4 in the hidden layer are calculated as: • y3 = sigmoid(x1·w13 + x2·w23 – θ3) = 1 / (1 + e^–(1·0.5 + 1·0.4 – 1·0.8)) = 1 / (1 + e^–0.1) = 0.5250 • y4 = sigmoid(x1·w14 + x2·w24 – θ4) = 1 / (1 + e^–(1·0.9 + 1·1.0 + 1·0.1)) = 1 / (1 + e^–2.0) = 0.8808 • Now the actual output of neuron 5 in the output layer is determined as: y5 = sigmoid(y3·w35 + y4·w45 – θ5) = 1 / (1 + e^–(–0.5250·1.2 + 0.8808·1.1 – 0.3)) = 0.5097 • Thus, the following error is obtained: e = yd,5 – y5 = 0 – 0.5097 = –0.5097
Step 3: Weight training Update the weights in the back-propagation network, propagating backward the errors associated with the output neurons. (a) Calculate the error gradient for the neurons in the output layer: δk(p) = yk(p) · [1 – yk(p)] · ek(p), where ek(p) = yd,k(p) – yk(p) Calculate the weight corrections: Δwjk(p) = α · yj(p) · δk(p) Update the weights at the output neurons: wjk(p+1) = wjk(p) + Δwjk(p)
Step 3: Weight training (continued) (b) Calculate the error gradient for the neurons in the hidden layer: δj(p) = yj(p) · [1 – yj(p)] · [δ1(p)·wj1(p) + ... + δl(p)·wjl(p)], where l is the number of neurons in the output layer Calculate the weight corrections: Δwij(p) = α · xi(p) · δj(p) Update the weights at the hidden neurons: wij(p+1) = wij(p) + Δwij(p)
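The same formulas as a short Python sketch (names such as output_gradient are mine; the math follows Step 3 above):

```python
def output_gradient(y_k, e_k):
    """delta_k = y_k * (1 - y_k) * e_k, with e_k = y_d,k - y_k."""
    return y_k * (1.0 - y_k) * e_k

def hidden_gradient(y_j, output_deltas, outgoing_weights):
    """delta_j = y_j * (1 - y_j) * sum_k delta_k * w_jk."""
    back = sum(d * w for d, w in zip(output_deltas, outgoing_weights))
    return y_j * (1.0 - y_j) * back

def updated_weight(w, alpha, upstream_output, delta):
    """w(p+1) = w(p) + alpha * y(p) * delta(p)."""
    return w + alpha * upstream_output * delta
```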
The next step is weight training. To update the weights and threshold levels in our network, we propagate the error, e, from the output layer backward to the input layer. • First, we calculate the error gradient for neuron 5 in the output layer: δ5 = y5 · (1 – y5) · e = 0.5097 · (1 – 0.5097) · (–0.5097) = –0.1274 • Then we determine the weight corrections, assuming that the learning rate parameter, α, is equal to 0.1: Δw35 = α · y3 · δ5 = 0.1 · 0.5250 · (–0.1274) = –0.0067 Δw45 = α · y4 · δ5 = 0.1 · 0.8808 · (–0.1274) = –0.0112 Δθ5 = α · (–1) · δ5 = 0.1 · (–1) · (–0.1274) = 0.0127
Next we calculate the error gradients for neurons 3 and 4 in the hidden layer: δ3 = y3 · (1 – y3) · δ5 · w35 = 0.5250 · (1 – 0.5250) · (–0.1274) · (–1.2) = 0.0381 δ4 = y4 · (1 – y4) · δ5 · w45 = 0.8808 · (1 – 0.8808) · (–0.1274) · 1.1 = –0.0147 • We then determine the weight corrections: Δw13 = α · x1 · δ3 = 0.1 · 1 · 0.0381 = 0.0038 Δw23 = α · x2 · δ3 = 0.1 · 1 · 0.0381 = 0.0038 Δθ3 = α · (–1) · δ3 = 0.1 · (–1) · 0.0381 = –0.0038 Δw14 = α · x1 · δ4 = 0.1 · 1 · (–0.0147) = –0.0015 Δw24 = α · x2 · δ4 = 0.1 · 1 · (–0.0147) = –0.0015 Δθ4 = α · (–1) · δ4 = 0.1 · (–1) · (–0.0147) = 0.0015
At last, we update all weights and thresholds: w13 = 0.5 + 0.0038 = 0.5038 w14 = 0.9 – 0.0015 = 0.8985 w23 = 0.4 + 0.0038 = 0.4038 w24 = 1.0 – 0.0015 = 0.9985 w35 = –1.2 – 0.0067 = –1.2067 w45 = 1.1 – 0.0112 = 1.0888 θ3 = 0.8 – 0.0038 = 0.7962 θ4 = –0.1 + 0.0015 = –0.0985 θ5 = 0.3 + 0.0127 = 0.3127 • The training process is repeated until the sum of squared errors is less than 0.001.
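Putting the whole procedure together, here is a self-contained sketch that trains the same 2-2-1 network on the full XOR training set until the sum of squared errors drops below 0.001. It is not the textbook's code: the initial weights are random rather than the fixed values used in the hand calculation, and an unlucky initialization can occasionally stall in a local minimum (hence the epoch cap):

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_xor(alpha=0.1, target_sse=0.001, max_epochs=100_000):
    # Random initial weights and thresholds in [-0.5, 0.5]
    w13, w14, w23, w24, w35, w45 = (random.uniform(-0.5, 0.5) for _ in range(6))
    t3, t4, t5 = (random.uniform(-0.5, 0.5) for _ in range(3))
    data = [((1, 1), 0), ((0, 1), 1), ((1, 0), 1), ((0, 0), 0)]
    for epoch in range(max_epochs):
        sse = 0.0
        for (x1, x2), yd in data:
            # Step 2: forward pass through the hidden and output layers
            y3 = sigmoid(x1 * w13 + x2 * w23 - t3)
            y4 = sigmoid(x1 * w14 + x2 * w24 - t4)
            y5 = sigmoid(y3 * w35 + y4 * w45 - t5)
            e = yd - y5
            sse += e * e
            # Step 3: error gradients (hidden deltas use the old weights)
            d5 = y5 * (1 - y5) * e
            d3 = y3 * (1 - y3) * d5 * w35
            d4 = y4 * (1 - y4) * d5 * w45
            # Corrections; each threshold is a weight on a fixed input of -1
            w35 += alpha * y3 * d5; w45 += alpha * y4 * d5; t5 += alpha * -1 * d5
            w13 += alpha * x1 * d3; w23 += alpha * x2 * d3; t3 += alpha * -1 * d3
            w14 += alpha * x1 * d4; w24 += alpha * x2 * d4; t4 += alpha * -1 * d4
        if sse < target_sse:
            return epoch + 1, sse
    return None

print(train_xor())   # e.g. (number of epochs needed, final SSE)
```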