ANNs (Artificial Neural Networks): The Perceptron
Perceptron • X is a (column) vector of inputs. • W is a (column) vector of weights. • g(X) = WᵀX + w0, where w0 is the bias or threshold weight.
Perceptron Usage: • training/learning: local vs. global minima; supervised vs. unsupervised • feedforward (also called testing, usage, or application): indicate class i if output g(X) > 0; not class i otherwise.
Perceptron The bias or threshold can be represented as simply another weight (w0) with a constant input of 1 (x0=1).
Perceptron • With the bias folded in as w0, g(X) = WᵀX = Σ wi xi (sum over i = 0..n). This is the dot product of the two vectors, W and X.
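A minimal Python sketch (not from the slides) of this feedforward step, with the bias folded into the weight vector as w0 and x0 = 1; the weight values are an arbitrary illustration:

```python
import numpy as np

def perceptron_output(W, X):
    """Feedforward pass: g(X) = W . X, with the bias folded in as w0.

    W and X are 1-D arrays whose first entries are w0 and x0 = 1.
    Returns 1 ("class i") if g(X) > 0, else 0.
    """
    g = np.dot(W, X)            # dot product of weight and input vectors
    return 1 if g > 0 else 0

# Example: weights [w0, w1, w2] and an augmented input [x0=1, x1, x2]
W = np.array([-0.5, 1.0, 1.0])
X = np.array([1.0, 0.0, 1.0])
print(perceptron_output(W, X))  # g = -0.5 + 0 + 1.0 = 0.5 > 0 -> 1
```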
Activation functions • linear • threshold • sigmoid • tanh
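These four functions might be rendered in Python as follows; this is an illustrative sketch, and the particular forms (e.g., a step at 0) are common conventions rather than definitions from the slides:

```python
import numpy as np

def linear(x):
    return x                            # identity: output equals net input

def threshold(x):
    return np.where(x > 0, 1, 0)        # hard step at 0

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))     # squashes input to (0, 1)

def tanh(x):
    return np.tanh(x)                   # squashes input to (-1, 1)
```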
Perceptron training/learning • Initialize weights (including the threshold, w0) to random numbers in [-0.5, +0.5] (uniformly distributed). • Select a training input vector, X, and its desired output, d. • Calculate the actual output, y. • Learn: W ← W + η(d − y)X. (Only perform this step when the output is incorrect.) Here η is in [0..1] and is the gain or learning rate.
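A sketch of this procedure in Python, writing the learning rate η as eta; the epoch limit, stopping rule, and the name train_perceptron are assumptions for illustration:

```python
import numpy as np

def train_perceptron(samples, eta=0.1, epochs=100, seed=None):
    """samples: list of (X, d) pairs, with X already augmented so x0 = 1."""
    rng = np.random.default_rng(seed)
    n = len(samples[0][0])
    W = rng.uniform(-0.5, 0.5, size=n)        # step 1: random weights, incl. w0
    for _ in range(epochs):
        errors = 0
        for X, d in samples:                  # step 2: pick a training pair
            y = 1 if np.dot(W, X) > 0 else 0  # step 3: actual output
            if y != d:                        # step 4: learn only on error
                W = W + eta * (d - y) * X     # W <- W + eta*(d - y)*X
                errors += 1
        if errors == 0:                       # whole set classified correctly
            break
    return W
```

For example, the AND function can be presented as samples = [(np.array([1, a, b]), a & b) for a in (0, 1) for b in (0, 1)].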
“The proof of the perceptron learning theorem (Rosenblatt 1962) demonstrated that a perceptron could learn anything it could represent.” [3] • So what can it represent? • Any problem that is linearly separable. • Are all problems linearly separable?
Consider a binary function of n binary inputs. • “A neuron with n binary inputs can have 2^n different input patterns, consisting of ones and zeros. Because each input pattern can produce two different binary outputs, 1 and 0, there are 2^(2^n) different functions of n variables.” [3] • How many of these are separable? Not many! For n=6, 2^(2^6) = 2^64 ≈ 1.8x10^19, but only 5,028,134 are linearly separable. • AND and OR are linearly separable but XOR is not!
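This is easy to check empirically. The sketch below (an illustration, reusing the learning rule above) trains a single perceptron on each truth table; it converges for AND and OR but can never classify all four XOR patterns, since no separating weight vector exists:

```python
import numpy as np

def train_and_test(truth_table, eta=0.1, epochs=1000):
    """truth_table: list of ((x1, x2), d) pairs. Returns True if the
    trained perceptron classifies every pattern correctly."""
    rng = np.random.default_rng(0)
    W = rng.uniform(-0.5, 0.5, size=3)                   # [w0, w1, w2]
    data = [(np.array([1.0, x1, x2]), d) for (x1, x2), d in truth_table]
    for _ in range(epochs):
        for X, d in data:
            y = 1 if np.dot(W, X) > 0 else 0
            W = W + eta * (d - y) * X                    # learn on error
    return all((1 if np.dot(W, X) > 0 else 0) == d for X, d in data)

AND = [((0,0),0), ((0,1),0), ((1,0),0), ((1,1),1)]
OR  = [((0,0),0), ((0,1),1), ((1,0),1), ((1,1),1)]
XOR = [((0,0),0), ((0,1),1), ((1,0),1), ((1,1),0)]
for name, tt in [("AND", AND), ("OR", OR), ("XOR", XOR)]:
    print(name, "learned:", train_and_test(tt))
# -> AND learned: True, OR learned: True, XOR learned: False
```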
How about adding additional layers? • “Multilayer networks provide no increase in computational power over a single-layer network unless there is a nonlinear function between layers.” [3] • That’s because matrix multiplication is associative. • (XW1)W2 = X(W1W2)
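A quick numeric illustration of the associativity point, with arbitrarily chosen shapes: two purely linear layers compute exactly the same mapping as one layer whose weight matrix is the product W1W2:

```python
import numpy as np

rng = np.random.default_rng(42)
X  = rng.standard_normal((4, 3))   # 4 input vectors of dimension 3
W1 = rng.standard_normal((3, 5))   # first linear layer
W2 = rng.standard_normal((5, 2))   # second linear layer

two_layers = (X @ W1) @ W2         # applied layer by layer
one_layer  = X @ (W1 @ W2)         # collapsed into a single layer
print(np.allclose(two_layers, one_layer))  # True
```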
“It is natural to ask if every decision can be implemented by such a three-layer network … The answer, due ultimately to Kolmogorov …, is “yes” – any continuous function from input to output can be implemented in a three-layer net, given sufficient number of hidden units, proper nonlinearities, and weights.” [1]
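As a concrete tie-in to the XOR discussion: once a nonlinearity (here a hard threshold) sits between layers, a small two-layer net does implement XOR. The weights below are one illustrative hand-picked choice, not taken from the slides:

```python
def step(x):                              # hard threshold nonlinearity
    return 1 if x > 0 else 0

def xor_net(x1, x2):
    h_or  = step(x1 + x2 - 0.5)           # hidden unit computes OR
    h_and = step(x1 + x2 - 1.5)           # hidden unit computes AND
    return step(h_or - h_and - 0.5)       # output: OR and not AND = XOR

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", xor_net(a, b))      # prints 0, 1, 1, 0
```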
Note the threshold nodes. Figure from “Usefulness of artificial neural networks to predict follow-up dietary protein intake in hemodialysis patients,” http://www.nature.com/ki/journal/v66/n1/fig_tab/4494599f1.html
References • [1] R.O. Duda, P.E. Hart, D.G. Stork, “Pattern Classification,” John Wiley and Sons, 2001. • [2] S.K. Rogers, M. Kabrisky, “An Introduction to Biological and Artificial Neural Networks for Pattern Recognition,” SPIE Optical Engineering Press, 1991. • [3] P.D. Wasserman, “Neural Computing,” Van Nostrand Reinhold, 1989.