CS 4700: Foundations of Artificial Intelligence
Prof. Carla P. Gomes (gomes@cs.cornell.edu)
Module: Neural Networks: Concepts (Reading: Chapter 20.5)
Basic Concepts
• A neural network maps a set of inputs to a set of outputs.
• The number of inputs/outputs is variable.
• The network itself is composed of an arbitrary number of nodes, or units, connected by links, with an arbitrary topology.
• A link from unit i to unit j serves to propagate the activation ai to j, and it has a weight Wij.
What can a neural network do?
• Compute a known function / approximate an unknown function
• Pattern recognition / signal processing
• Learn to do any of the above
Different types of nodes
An Artificial Neuron (Node or Unit): A Mathematical Abstraction
An artificial neuron (node, unit, or processing unit i) is a processing element producing an output based on a function of its inputs:
• Input edges, each with a weight (positive or negative; weights change over time through learning).
• Input function ini: the weighted sum of the unit's inputs, ini = Σj Wj,i aj, including the fixed input a0.
• Activation function g (typically non-linear), applied to the input function to produce the output: ai = g(ini).
• Output edges, each with a weight (likewise positive or negative, and adjusted by learning).
Note: the fixed input and bias weight are conventional; some authors use, e.g., a0 = 1 with bias weight -W0,i instead.
Activation Functions
(a) Threshold activation function: a step function (outputs 1 when the input is positive; 0 otherwise).
(b) Sigmoid (or logistic) activation function (key advantage: differentiable).
(c) Sign function: +1 if the input is positive, otherwise -1.
These functions have a threshold (either hard or soft) at zero. Changing the bias weight W0,i moves the threshold location.
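To make the three activations concrete, here is a minimal Python sketch (the function names and example numbers are ours, not from the slides): each activation is applied to a unit's weighted input.

```python
import math

# A unit computes in_i = sum_j W_ji * a_j, then applies an activation g.

def threshold(x):   # (a) step function: 1 if the input is positive, else 0
    return 1 if x > 0 else 0

def sigmoid(x):     # (b) logistic function: a differentiable soft threshold
    return 1.0 / (1.0 + math.exp(-x))

def sign(x):        # (c) sign function: +1 if the input is positive, else -1
    return 1 if x > 0 else -1

def unit_output(weights, inputs, g):
    """Weighted sum of the inputs, followed by the activation function g."""
    in_i = sum(w * a for w, a in zip(weights, inputs))
    return g(in_i)

# The same weighted input under the three activations:
print(unit_output([1.0, 1.0], [0.4, -0.9], threshold))  # 0
print(unit_output([1.0, 1.0], [0.4, -0.9], sigmoid))    # ~0.38
print(unit_output([1.0, 1.0], [0.4, -0.9], sign))       # -1
```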
Threshold Activation Function
• Input edges, each with a weight (positive or negative; weights change over time through learning).
• ti: the threshold value associated with unit i.
• The unit outputs 1 when its weighted input reaches the threshold ti; folding ti into the bias weight (fixed input a0 = -1 with W0,i = ti) shifts the threshold to 0.
Implementing Boolean Functions
Threshold units activate (output 1) when the weighted sum of their inputs, including the bias term, is positive: Σj Wj,i aj > 0.
Units with a threshold activation function can therefore act as logic gates; we can use these units to compute Boolean functions of their inputs.
Boolean AND
Fixed input -1 with bias weight w0 = 1.5; weights w1 = 1 and w2 = 1 on inputs x1 and x2.
The unit activates when x1 + x2 - 1.5 > 0, i.e., only when x1 = x2 = 1.
Boolean OR
Fixed input -1 with bias weight w0 = 0.5; weights w1 = 1 and w2 = 1 on inputs x1 and x2.
The unit activates when x1 + x2 - 0.5 > 0, i.e., when at least one input is 1.
Inverter (NOT)
Fixed input -1 with bias weight w0 = -0.5; weight w1 = -1 on input x1.
The unit activates when -x1 + 0.5 > 0, i.e., only when x1 = 0.
So, units with a threshold activation function can act as logic gates, given the appropriate input and bias weights; a runnable sketch of all three gates follows.
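The sketch below (an assumed Python rendering of these three slides; the helper names are ours) verifies that the weights above implement AND, OR, and NOT:

```python
# Threshold units as logic gates, using the slides' convention of a
# fixed input a0 = -1 carrying the bias weight w0.

def threshold_unit(w0, weights, inputs):
    """Output 1 iff the weighted sum, including the -1 * w0 bias term, is positive."""
    total = -w0 + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0

def AND(x1, x2): return threshold_unit( 1.5, [1, 1], [x1, x2])
def OR(x1, x2):  return threshold_unit( 0.5, [1, 1], [x1, x2])
def NOT(x1):     return threshold_unit(-0.5, [-1],   [x1])

# Verify the truth tables:
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "AND:", AND(x1, x2), "OR:", OR(x1, x2))
print("NOT:", NOT(0), NOT(1))  # 1 0
```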
Network Structures
• Acyclic or feed-forward networks (our focus)
  • Activation flows from the input layer to the output layer
  • Single-layer perceptrons
  • Multi-layer perceptrons
  • Feed-forward networks implement functions; they have no internal state (only weights)
• Recurrent networks
  • Feed their outputs back into their own inputs
  • The network is a dynamical system (stable states, oscillations, chaotic behavior)
  • The network's response depends on its initial state
  • Can support short-term memory
  • More difficult to understand
Recurrent Networks
• Can capture internal state (activation keeps circulating), enabling more complex agents.
• The brain cannot be just a feed-forward network: it has many feedback connections and cycles, so the brain is a recurrent network!
Two key examples: Hopfield networks and Boltzmann machines.
Hopfield Networks
• A Hopfield neural network is typically used for pattern recognition.
• Hopfield networks have symmetric weights (Wij = Wji).
• Outputs are 0/1 only.
• The weights are trained to obtain an associative memory: e.g., template patterns are stored as multiple stable states, and given a new input pattern, the network converges to one of the stored exemplar patterns.
• It can be proven that an N-unit Hopfield net can reliably learn up to 0.138N patterns.
• Note: there is no explicit storage; everything is in the weights!
Hopfield Networks
• The user trains the network with a set of black-and-white templates.
• Input units: 100 pixels; output units: 100 pixels.
• For each template, each neuron in the network (corresponding to one pixel) learns to turn itself on or off based on the current output of every other neuron in the network.
• After training, the network can be given an arbitrary input pattern, and it may converge to an output pattern resembling whichever template most closely matches that input.
http://www.cbu.edu/~pong/ai/hopfield/hopfieldapplet.html
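For intuition, here is a minimal Hopfield sketch in Python, under simplifying assumptions noted in the comments (tiny made-up patterns instead of 100-pixel templates, and +/-1 states instead of the slides' 0/1 units):

```python
import random

# Minimal Hopfield associative-memory sketch. Assumptions: +/-1 states,
# Hebbian one-shot training, asynchronous updates; patterns are made up.

def train(patterns, n):
    """Hebbian rule: W[i][j] = sum over patterns of p[i]*p[j]; symmetric, zero diagonal."""
    W = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += p[i] * p[j]
    return W

def recall(W, state, steps=500):
    """Repeatedly update a random unit from the other units' outputs until stable."""
    state = list(state)
    for _ in range(steps):
        i = random.randrange(len(state))
        s = sum(W[i][j] * state[j] for j in range(len(state)))
        state[i] = 1 if s >= 0 else -1
    return state

templates = [[1, 1, 1, -1, -1, -1], [1, -1, 1, -1, 1, -1]]
W = train(templates, 6)
print(recall(W, [1, 1, 1, -1, -1, 1]))  # likely converges to the first template
```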
Hopfield Networks
Example from the applet (images omitted): given a noisy input pattern, after around 500 iterations the network converges to the closest stored template.
http://www.cbu.edu/~pong/ai/hopfield/hopfieldapplet.html
Boltzmann Machines
A generalization of Hopfield networks:
• Hidden neurons: Boltzmann machines have hidden units.
• Neuron update: stochastic activation functions.
Both Hopfield networks and Boltzmann machines can solve optimization problems (similarly to Monte Carlo methods). We will not cover these networks.
Feed-forward Network: Represents a Function of Its Input
• Two input units, two hidden units, one output unit.
• Each unit receives input only from units in the immediately preceding layer. (Bias units omitted for simplicity.)
Given an input vector x = (x1, x2), the activations of the input units are set to the values of the input vector, i.e., (a1, a2) = (x1, x2), and the network computes
a5 = g(W3,5 a3 + W4,5 a4) = g(W3,5 g(W1,3 a1 + W2,3 a2) + W4,5 g(W1,4 a1 + W2,4 a2)).
The weights are the parameters of the function: a feed-forward network computes a parameterized family of functions hW(x). By adjusting the weights we get different functions; that is how learning is done in neural networks!
Note: in general, the input layer does not include computing units.
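A minimal Python sketch of this computation, with weight names following the unit numbering in the equation above and sigmoid assumed as g (the slides do not fix the weight values; those below are made up):

```python
import math

def g(x):
    return 1.0 / (1.0 + math.exp(-x))  # sigmoid activation

def h(x1, x2, W13, W14, W23, W24, W35, W45):
    """2-2-1 network: inputs 1,2; hidden units 3,4; output unit 5."""
    a3 = g(W13 * x1 + W23 * x2)
    a4 = g(W14 * x1 + W24 * x2)
    return g(W35 * a3 + W45 * a4)

# Different weight settings W give different functions h_W(x);
# learning means adjusting these parameters.
print(h(1.0, 0.0,  0.5, -0.3, 0.8, 0.1,  1.2, -0.7))
print(h(1.0, 0.0, -0.5,  0.3, 0.8, 0.1,  1.2, -0.7))  # different W, different h_W
```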
Large IBM investment in the next generation of neural nets
"IBM plans 'brain-like' computers" (BBC News, 21 November 2008; by Jason Palmer, science and technology reporter): IBM has announced it will lead a US government-funded collaboration to make electronic circuits that mimic brains.
http://news.bbc.co.uk/2/hi/science/nature/7740484.stm