This article discusses the concept of learning in artificial neural networks, the basis of neural networks, and the process of weight initialization and update. It also explores how biases and thresholds affect the network's performance.
Outline • Concept of Learning • Basis of Artificial Neural Network • Neural Network with Supervised Learning (Single Layer Net) • Hebb • Perceptron • Modelling a Simple Problem
CONCEPT OF LEARNING Nooraini Yusoff Computer Science Department Faculty of Information Technology Universiti Utara Malaysia
Learning and Weights..? Think… Can you identify which is “the Simpsons”?
Learning and Weights..? “the Simpsons” “the not”
Learning and Weights..? Data / Observation / Experience (past experience) new data…
Learning and Weights..? [Figure: a neural network for heart-attack prediction — input layer (AGE, GENDER, SUGAR, HYPERTENSION), a hidden layer, and an output layer (HEART ATTACK / NO HEART ATTACK), with weights w on the input-to-hidden connections and weights v on the hidden-to-output connections.]
Learning and Weights..? [Figure: a biological neuron — dendrites, nucleus, synapse, axon — mapped onto a simple computational unit with inputs x1, x2, weights w1, w2 and output y.] • A neuron receives inputs, determines the strength (weight) of each input, calculates the total weighted input yin = x1w1 + x2w2, and compares the total with a threshold value θ. • The output value is in the range 0 to 1, given by the activation function: f(yin) = 1 if yin >= θ, and f(yin) = 0 otherwise. • If the total weighted input is greater than or equal to the threshold, the neuron produces an output; if it is less than the threshold, no output is produced.
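A minimal Python sketch of the threshold neuron described above (the function name, weights and threshold are illustrative, not from the slides):

```python
# A single threshold neuron: computes y_in = x1*w1 + x2*w2 and fires (outputs 1)
# when the weighted sum reaches the threshold theta; otherwise it outputs 0.

def threshold_neuron(x, w, theta):
    y_in = sum(xi * wi for xi, wi in zip(x, w))  # total weighted input
    return 1 if y_in >= theta else 0             # step activation

# Example: two inputs with weights 0.5 and 0.8 and threshold 1.0
print(threshold_neuron([1, 1], [0.5, 0.8], 1.0))  # 1 (1.3 >= 1.0)
print(threshold_neuron([1, 0], [0.5, 0.8], 1.0))  # 0 (0.5 <  1.0)
```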
BASIS OF ARTIFICIAL NEURAL NETWORK Nooraini Yusoff Computer Science Department Faculty of Information Technology Universiti Utara Malaysia
Basis of Artificial Neural Network (Content) • Data Preparation • Activation Functions • Biases and Threshold • Weight initialization and update • Linear Separability
Basis of Artificial Neural Network Data Preparation • In NN, inputs and outputs are to be represented numerically. • Garbage-in garbage-out principle: flawed data used in developing a network would result in a flawed network. • Unsuitable representation affects learning and could eventually turn a NN project into a failure.
Basis of Artificial Neural Network • Why preprocess the data? • Main goal – to ensure that the statistical distribution of values for each net input and output is roughly uniform. • NN will not produce accurate forecasts with incomplete, noisy and inconsistent data. • Decisions made in this phase of development are critical to the performance of the network.
Basis of Artificial Neural Network • Input & Output representation • Binary vs. Bipolar — binary values lie in the range [0, 1], bipolar values in the range [-1, 1].
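A small sketch of the binary-to-bipolar mapping implied above; the helper names are illustrative and assume the usual rule bipolar = 2*binary - 1:

```python
# Convert between binary {0, 1} and bipolar {-1, 1} representations.

def to_bipolar(binary_vector):
    return [2 * b - 1 for b in binary_vector]

def to_binary(bipolar_vector):
    return [(b + 1) // 2 for b in bipolar_vector]

print(to_bipolar([1, 0, 0]))   # [1, -1, -1]
print(to_binary([1, -1, -1]))  # [1, 0, 0]
```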
Basis of Artificial Neural Network • Example: Binary Representation (each target T coded as three binary outputs)
V1    V2    V3    V4    T
0.63  0.68  0.21  0.04  1 0 0
0.56  0.68  0.16  0.04  1 0 0
0.76  0.90  0.18  0.08  1 0 0
0.75  1.00  0.28  0.16  1 0 0
0.71  0.88  0.24  0.16  1 0 0
0.67  0.79  0.21  0.12  1 0 0
0.68  0.61  0.59  0.56  0 1 0
0.65  0.45  0.53  0.40  0 1 0
0.77  0.68  0.63  0.60  0 1 0
0.78  0.50  0.60  0.40  0 1 0
0.80  0.65  0.71  0.56  0 1 0
0.73  0.65  0.54  0.52  0 1 0
1.00  0.68  1.00  0.84  0 0 1
0.64  0.48  0.68  0.68  0 0 1
0.96  0.52  0.95  0.72  0 0 1
0.88  0.56  0.87  0.72  0 0 1
0.94  0.81  0.92  1.00  0 0 1
0.85  0.72  0.77  0.80  0 0 1
Basis of Artificial Neural Network • Example: Bipolar Representation (each target T coded as three bipolar outputs)
V1    V2    V3    V4    T
0.63  0.68  0.21  0.04   1 -1 -1
0.56  0.68  0.16  0.04   1 -1 -1
0.76  0.90  0.18  0.08   1 -1 -1
0.75  1.00  0.28  0.16   1 -1 -1
0.71  0.88  0.24  0.16   1 -1 -1
0.67  0.79  0.21  0.12   1 -1 -1
0.68  0.61  0.59  0.56  -1  1 -1
0.65  0.45  0.53  0.40  -1  1 -1
0.77  0.68  0.63  0.60  -1  1 -1
0.78  0.50  0.60  0.40  -1  1 -1
0.80  0.65  0.71  0.56  -1  1 -1
0.73  0.65  0.54  0.52  -1  1 -1
1.00  0.68  1.00  0.84  -1 -1  1
0.64  0.48  0.68  0.68  -1 -1  1
0.96  0.52  0.95  0.72  -1 -1  1
0.88  0.56  0.87  0.72  -1 -1  1
0.94  0.81  0.92  1.00  -1 -1  1
0.85  0.72  0.77  0.80  -1 -1  1
Basis of Artificial Neural Network • Common Activation Functions: Identity Function, Binary Step Function, Binary Sigmoid, Bipolar Sigmoid. [Figure: plots of each function.]
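Sketches of the four activation functions listed above; the parameter names (theta for the step threshold, sigma for the sigmoid steepness) are illustrative:

```python
import math

def identity(x):
    return x

def binary_step(x, theta=0.0):
    return 1 if x >= theta else 0

def binary_sigmoid(x, sigma=1.0):
    # maps to the range (0, 1)
    return 1.0 / (1.0 + math.exp(-sigma * x))

def bipolar_sigmoid(x, sigma=1.0):
    # maps to the range (-1, 1)
    return 2.0 / (1.0 + math.exp(-sigma * x)) - 1.0
```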
Basis of Artificial Neural Network Bias and Thresholds • A bias acts exactly like a weight on a connection from a unit whose activation is always 1. • The weight of the bias is trainable just like any other weight. • Increasing the bias increases the net input to the unit. • If a bias is included, the activation function is typically taken to be: f(net) = 1 if net >= 0, and f(net) = -1 if net < 0, where net = b + Σi xi wi.
Basis of Artificial Neural Network [Figure: a feedforward network with input layer x1–x4, hidden layer z1–z2, and output layer y1–y2; weights vij connect inputs to hidden units and weights wjk connect hidden units to outputs; bias units (activation 1) feed the hidden and output layers with weights v01, v02 and w01, w02.]
Basis of Artificial Neural Network • Why use a bias? To increase the value of yin. • Without bias: net = Σi xi wi. • With bias: net = b + Σi xi wi. • What happens if all xi are 0? Without a bias, net = 0, so there is NO response and no learning. With a bias, net = b, so there IS a response and learning can take place.
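A quick numeric illustration of the point above (the weight and bias values are made up for the example):

```python
# With all inputs equal to 0, the net input without a bias is always 0, so the
# unit cannot respond; with a bias b the net input equals b and the unit can
# still produce a response and learn.

x = [0, 0]
w = [0.4, -0.2]
b = 0.7

net_without_bias = sum(xi * wi for xi, wi in zip(x, w))      # 0.0 -> no response
net_with_bias = b + sum(xi * wi for xi, wi in zip(x, w))     # 0.7 -> response
print(net_without_bias, net_with_bias)
```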
Basis of Artificial Neural Network Weight Initialization and Update • You may set the initial weights to any values. • The choice of initial weights influences whether the net reaches a global (or only a local) minimum of the error, and how quickly it converges. • It is important to avoid choices of initial weights that would make it likely that either the activations or their derivatives are zero. • The weights must not be too large: the initial input signals to each hidden or output unit would then be likely to fall in the region where the derivative of the sigmoid function has a very small value – the saturation region.
Basis of Artificial Neural Network • The weights must also not be too small: the net input to a hidden or output unit would be close to zero, causing extremely slow learning. • Methods for generating initial weights: Random Weights and Nguyen-Widrow Weights.
Basis of Artificial Neural Network • Random Initialization • A common procedure is to initialize the weights (and biases) to random values within a suitable interval, such as [–0.5, 0.5] or [–1, 1]. • The values may be positive or negative because the final weights after training may be of either sign as well.
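A minimal sketch of this random initialization; the function and parameter names are illustrative:

```python
import random

def init_weights(n_inputs, n_units, low=-0.5, high=0.5):
    # One weight vector (plus a bias weight) per unit, drawn uniformly from [low, high].
    return [[random.uniform(low, high) for _ in range(n_inputs + 1)]
            for _ in range(n_units)]

weights = init_weights(n_inputs=4, n_units=2)
```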
Basis of Artificial Neural Network • Nguyen-Widrow Initialization • n = number of input units, p = number of hidden units, scale factor β = 0.7 · p^(1/n). • For each hidden unit (j = 1, …, p): • Initialize its weight vector (from the input units): vij(old) = random number between –0.5 and 0.5 • Compute || vj(old) ||
Basis of Artificial Neural Network • Reinitialize the weights: vij = β · vij(old) / || vj(old) || • Set the bias: v0j = random number between –β and β
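A sketch of the Nguyen-Widrow steps above put together in Python (names are illustrative):

```python
import math
import random

def nguyen_widrow(n, p):
    # n = number of input units, p = number of hidden units
    beta = 0.7 * p ** (1.0 / n)                       # scale factor
    v, v0 = [], []
    for _ in range(p):
        vj = [random.uniform(-0.5, 0.5) for _ in range(n)]
        norm = math.sqrt(sum(vij * vij for vij in vj))
        v.append([beta * vij / norm for vij in vj])   # rescale the weight vector
        v0.append(random.uniform(-beta, beta))        # bias in [-beta, beta]
    return v, v0

hidden_weights, hidden_biases = nguyen_widrow(n=4, p=2)
```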
Basis of Artificial Neural Network Linear Separability • A problem is “linearly separable” if there are weights (and a bias) such that all of the training input vectors for which the correct response is +1 lie on one side of the decision boundary (the positive region) and all of the training input vectors for which the correct response is –1 lie on the other side (the negative region).
Basis of Artificial Neural Network • Problems: AND, OR, XOR. [Figure: the AND, OR and XOR functions plotted on the unit square.] AND and OR are linearly separable; XOR is not.
NEURAL NETWORK WITH SUPERVISED LEARNING(Single Layer Net) Nooraini Yusoff Computer Science Department Faculty of Information Technology Universiti Utara Malaysia
Modelling a Simple Problem • Should I attend this lecture? • x1 = weather (hot or raining) • x2 = day (weekday or weekend) [Figure: a single unit y with bias input 1 and inputs x1, x2, all with unknown weights “?”.]
Example: 2-input AND
binary: (x1, x2, t) = (1, 1, 1), (1, 0, 0), (0, 1, 0), (0, 0, 0)
bipolar: (x1, x2, t) = (1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, -1)
Hebb’s Rule • 1949. Increase the weight between two neurons that are both “on”. • 1988. Increase the weight between two neurons that are both “off”. • wi(new) = wi(old) + xi*y
Hebb’s Algorithm 1. Set initial weights: wi = 0 for 0 <= i <= n 2. For each training vector: 3. Set xi = si for all input units 4. Set y = t 5. Update wi(new) = wi(old) + xi*y
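A minimal sketch of this algorithm for the bipolar AND problem, with the bias handled as x0 = 1 (the function name and data layout are illustrative):

```python
def hebb_train(samples):
    n = len(samples[0][0])
    w = [0.0] * (n + 1)                     # w[0] is the bias weight
    for x, t in samples:
        xs = [1] + list(x)                  # prepend the bias input x0 = 1
        for i in range(n + 1):
            w[i] += xs[i] * t               # wi(new) = wi(old) + xi*y, with y = t
    return w

# bipolar AND: target is 1 only when both inputs are 1
and_bipolar = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
print(hebb_train(and_bipolar))              # [-2.0, 2.0, 2.0]
```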
Training Procedure Initial weights: w0 = 0, w1 = 0, w2 = 0
Result Interpretation • The resulting decision boundary is -2 + 2x1 + 2x2 = 0, or equivalently • x2 = -x1 + 1 • This training procedure is order dependent and is not guaranteed to converge.
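A quick check (illustrative) that the boundary above classifies the bipolar AND patterns correctly, responding positively only for (1, 1):

```python
for x1, x2 in [(1, 1), (1, -1), (-1, 1), (-1, -1)]:
    net = -2 + 2 * x1 + 2 * x2
    print((x1, x2), net, 1 if net > 0 else -1)
```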
Perceptrons (1958) • Very important early neural network • Guaranteed training procedure under certain circumstances [Figure: a single output unit y with bias input 1 (weight w0) and inputs x1, …, xn (weights w1, …, wn).]
Activation Function • f(yin) = 1 if yin > θ • f(yin) = 0 if -θ <= yin <= θ • f(yin) = -1 otherwise
Learning Rule • wi(new) = wi(old) + α*t*xi if error • α is the learning rate • Typically, 0 < α <= 1
Perceptron’s Algorithm 1. Set initial weights: wi = 0 for 0 <= i <= n (can be random) 2. For each training exemplar do 3. Set xi = si 4. Compute yin = Σ xi*wi 5. Compute y = f(yin) 6. Update wi(new) = wi(old) + α*t*xi if error 7. If stopping condition not reached, go to 2
f(yin) = 1 if yin > θ, f(yin) = 0 if -θ <= yin <= θ, f(yin) = -1 otherwise
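A sketch of this algorithm in Python for bipolar inputs and targets, again treating the bias as x0 = 1 (function and parameter names are illustrative):

```python
def perceptron_train(samples, alpha=1.0, theta=0.0, max_epochs=100):
    n = len(samples[0][0])
    w = [0.0] * (n + 1)                              # w[0] is the bias weight
    for _ in range(max_epochs):
        changed = False
        for x, t in samples:
            xs = [1] + list(x)
            y_in = sum(wi * xi for wi, xi in zip(w, xs))
            y = 1 if y_in > theta else (-1 if y_in < -theta else 0)
            if y != t:                               # update only on error
                for i in range(n + 1):
                    w[i] += alpha * t * xs[i]
                changed = True
        if not changed:                              # stopping condition: no errors
            break
    return w

and_bipolar = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
print(perceptron_train(and_bipolar))                 # [-1.0, 1.0, 1.0] for this data
```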
Example: AND concept • bipolar inputs • bipolar target • θ = 0 • α = 1
Training Procedure - Epoch 1 Initial weights: w0 = 0, w1 = 0, w2 = 0
Exercise • Continue the above example until the learning is finished.
Perceptron Learning Rule Convergence Theorem • If a weight vector exists that correctly classifies all of the training examples, then the perceptron learning rule will converge to some weight vector that gives the correct response for all training patterns. This will happen in a finite number of steps.
Comparison between Hebb and Perceptron • Both compute the output as y = f(yin). • Hebb: the weights are updated for every training pattern, wi(new) = wi(old) + xi*y, in a single pass through the data. • Perceptron: the weights are updated only when there is an error, wi(new) = wi(old) + α*t*xi, and training is repeated until the stopping condition is reached.
Exercise Prepare a Perceptron learning table for epoch 1 and epoch 2 for the problem of AND logic with bias, using the following learning requirements: - bipolar input and target - learning rate, α = 1 - threshold, θ = 0.2
Single Layer Net(Problem Analysis) Nooraini Yusoff Universiti Utara Malaysia
Perceptron [Figure: a single unit y with bias input x0 = 1 (weight w0) and inputs x1, …, xn (weights w1, …, wn).]
Problem Description: To predict whether a student’s application to stay in college is accepted, KIV (kept in view) or rejected. Problem: Classification
Representation of Data (data for learning) • Gender: Male (1), Female (-1) • CGPA: ≥ 3.00 (1), < 3.00 (-1) • Result: Accepted (1), KIV (0), Rejected (-1)
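A small sketch of this numeric coding in Python; the helper name and input format are illustrative:

```python
def encode_student(gender, cgpa):
    g = 1 if gender == "Male" else -1          # Male = 1, Female = -1
    c = 1 if cgpa >= 3.00 else -1              # CGPA >= 3.00 = 1, < 3.00 = -1
    return [g, c]

result_codes = {"Accepted": 1, "KIV": 0, "Rejected": -1}

print(encode_student("Female", 3.45))          # [-1, 1]
```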