COMP/EECE 7/8740 Neural Networks: Mapping Properties of Multi-Layer Perceptrons (MLP). February 20, 2001
Basic NN Architectures • Feed-forward NN • Directed graph in which a path never visits the same node twice (no cycles) • Relatively simple behavior • Example: MLP for classification, pattern recognition • Feedback or recurrent NNs • Contain loops of directed edges going forward and also backward • Complicated oscillations may occur • Examples: Hopfield NN, Elman NN for speech recognition • Random NNs • More realistic, very complex
Mapping an Arbitrary Boolean Function • Input vector: • length d, all components 0 or 1 • Output: • 1 if the given input is in class A, 0 if it is in class B • 2^d possible inputs in total; say K are in class A and 2^d - K in B • 2-layer FF NN • Input size d; hidden size K; output size 1; hard-limit threshold function • Weights • Input -> hidden: +1 if the hidden unit's class-A pattern has a 1 at that input node, -1 otherwise • Hidden -> output: all 1; bias of hidden node k: 1 - b, where its pattern has b ones • Prove: this NN gives 1 for inputs from A and 0 for inputs from B (a sketch follows below)
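A minimal NumPy sketch of this construction (function and variable names are illustrative, not from the slides). It assumes a strict hard-limit threshold, hardlim(x) = 1 if x > 0, and an output bias of 0 so the output unit fires whenever any hidden detector fires:

import numpy as np
from itertools import product

def hardlim(x):
    # strict hard-limit threshold: 1 if net input > 0, else 0
    return (np.asarray(x) > 0).astype(int)

def build_boolean_net(class_A_patterns):
    patterns = np.array(class_A_patterns)       # K patterns of length d, one hidden unit each
    W1 = np.where(patterns == 1, 1, -1)         # input -> hidden: +1 at the 1s, -1 at the 0s
    b1 = 1 - patterns.sum(axis=1)               # hidden bias: 1 - b, b = number of ones
    W2 = np.ones(len(patterns))                 # hidden -> output: all 1
    b2 = 0                                      # output fires if any detector fires (OR)
    def net(x):
        h = hardlim(W1 @ x + b1)                # each hidden unit detects exactly one A pattern
        return int(hardlim(W2 @ h + b2))
    return net

# toy check with d = 3 and class A = {(0,1,1), (1,0,1)}
net = build_boolean_net([(0, 1, 1), (1, 0, 1)])
for x in product([0, 1], repeat=3):
    print(x, net(np.array(x)))                  # prints 1 exactly for the two A patterns

With this strict-threshold convention the proof asked for on the slide goes through: the matching detector's net input is exactly 1, and any non-matching detector's net input is at most 0.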
Mapping an Arbitrary Function with a 3-layer FFNN • A single threshold neuron -> half-space • 2-layer NN -> convex region • Output bias of -M (M = number of hidden units) gives a logical AND • 3-layer NN -> any region! • Subdivide the input space into approximate hypercubes • A cluster of 1st-hidden-layer nodes (one per bounding hyperplane of the cube) maps one cube • Output bias of -1 gives a logical OR • Can produce any combination of input cubes (see the sketch below)
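A hedged NumPy sketch of the hypercube construction (names are illustrative). Each cell of the subdivision is detected by a 2nd-layer AND over 2d first-layer half-space units (two per dimension), and the 3rd-layer output ORs the cells with bias -1 as on the slide; the hard limit here fires at net input >= 0:

import numpy as np

def hardlim(x):
    # hard-limit threshold: 1 if net input >= 0, else 0
    return (np.asarray(x) >= 0).astype(int)

def cube_detector(lows, highs):
    # 2nd-layer unit ANDing 2d half-space units: x_i >= low_i and x_i <= high_i
    lows, highs = np.asarray(lows, float), np.asarray(highs, float)
    d = lows.size
    def detect(x):
        h1 = np.concatenate([hardlim(x - lows),      # d units: x_i - low_i  >= 0
                             hardlim(highs - x)])    # d units: high_i - x_i >= 0
        return int(hardlim(h1.sum() - 2 * d))        # AND: bias -M with M = 2d hidden units
    return detect

def region(cubes):
    # 3rd-layer output ORing the cube detectors: weights all 1, bias -1
    detectors = [cube_detector(lo, hi) for lo, hi in cubes]
    def inside(x):
        h2 = np.array([det(x) for det in detectors])
        return int(hardlim(h2.sum() - 1))            # OR over the cubes
    return inside

# a region built from two unit cells of a grid over the plane
inside = region([((0, 0), (1, 1)), ((1, 0), (2, 1))])
print(inside(np.array([0.5, 0.5])), inside(np.array([1.5, 1.5])))   # 1 0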
Mapping with 2-layer FFNNs • 2-layer FFNN with threshold units only: • cannot map an arbitrary function • 2-layer FFNN with sigmoid units (!): • can approximate an arbitrary continuous function • CONSEQUENCE: • a 2-layer FFNN with sigmoid units is a universal discriminant (see the sketch below)
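One way to see the sigmoid result (a hedged sketch, not from the slides): two steep, shifted sigmoids subtract to a "bump", and a linear output layer summing many narrow bumps, each weighted by the target value at its center, approximates a continuous function on [0, 1]. The hidden layer below has 2 * n_bumps sigmoid units; accuracy improves as the bumps get narrower.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def approx(f, x, n_bumps=50):
    # one-hidden-layer sigmoid net: each bump = difference of two steep, shifted sigmoids
    edges = np.linspace(0.0, 1.0, n_bumps + 1)
    steep = 10.0 * n_bumps                       # steeper sigmoids as bumps get narrower
    out = np.zeros_like(x, dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        bump = sigmoid(steep * (x - lo)) - sigmoid(steep * (x - hi))
        out += f(0.5 * (lo + hi)) * bump         # linear output layer weighted by f at the center
    return out

x = np.linspace(0, 1, 500)
target = lambda t: np.sin(2 * np.pi * t)
print(np.max(np.abs(approx(target, x) - target(x))))   # max error decreases as n_bumps grows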
Kolmogorov Approximation Theorem (1957) • Discovered independently of NNs • Related to Hilbert's 23 unsolved problems of 1900 • #13: Can a function of several variables be represented as a combination of functions of fewer variables? Arnold: yes - every continuous function of 3 variables can be built from continuous functions of 2 • Kolmogorov (A. N.): • Any continuous multivariable function can be expressed as a superposition of functions of one variable (plus addition) • Limitations: not constructive, the component functions are too complicated
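In formula form (the Kolmogorov-Arnold superposition theorem), any continuous f on [0,1]^n can be written as

f(x_1, \dots, x_n) = \sum_{q=0}^{2n} \Phi_q\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)

where the 2n+1 outer functions \Phi_q and the inner functions \phi_{q,p} are continuous functions of a single variable, and the inner functions can be chosen independently of f.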
Learning with Error Backpropagation (BP) • Learning: • determine the weights of the NN • Assume: • the structure is given • the transfer functions are given • input-output pairs are given • Supervised learning based on examples! • See: derivation of backpropagation (a minimal sketch follows below)
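As a pointer to that derivation, a minimal NumPy sketch of batch gradient descent with backpropagated errors for a 2-layer sigmoid network (toy XOR data; network size, seed, and learning rate are illustrative choices, not from the slides):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# toy supervised examples: XOR inputs and targets
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)    # input -> hidden weights and biases
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)    # hidden -> output
eta = 0.5                                        # learning rate

for epoch in range(10000):
    H = sigmoid(X @ W1 + b1)                     # forward pass: hidden activations
    Y = sigmoid(H @ W2 + b2)                     # forward pass: outputs
    d_out = (Y - T) * Y * (1 - Y)                # output error times sigmoid derivative
    d_hid = (d_out @ W2.T) * H * (1 - H)         # error propagated back to the hidden layer
    W2 -= eta * H.T @ d_out;  b2 -= eta * d_out.sum(axis=0)
    W1 -= eta * X.T @ d_hid;  b1 -= eta * d_hid.sum(axis=0)

print(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2).round(2))   # should approach the targets 0, 1, 1, 0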