Activations, attractors, and associators Jaap Murre Universiteit van Amsterdam jaap@murre.com
Overview • Interactive activation model • Hopfield networks • Constraint satisfaction • Attractors • Traveling salesman problem • Hebb rule and Hopfield networks • Bidirectional associative networks • Linear associative networks
The final interpretation must satisfy many constraints. In the recognition of letters and words:
i. Only one word can occur at a given position
ii. Only one letter can occur at a given position
iii. A letter-on-a-position activates a word
iv. A feature-on-a-position activates a letter
Letter detection • Press the left button when the letter L is present • Press as fast as you can
i. Only one word can occur at a given position [diagram: word nodes LAP, CAP, CAB and letter-on-a-position nodes L.., C.., .A., ..P, ..B]
ii. Only one letter can occur at a given position [diagram: the same network, highlighting the letter nodes within a position]
iii. A letter-on-a-position activates a word [diagram: connections from the letter nodes to LAP, CAP, CAB]
iv. A feature-on-a-position activates a letter [diagram: connections from feature nodes to the letter nodes]
Recognition of a letter is a process of constraint satisfaction [diagram, repeated over several animation slides: the network with word nodes LAP, CAP, CAB and letter nodes L.., C.., .A., ..P, ..B settling into a consistent state]
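Below is a minimal Python/NumPy sketch of this constraint-satisfaction process for the LAP/CAP/CAB example. The node names come from the slides, but the weight values, gain, and update scheme are illustrative assumptions, not the original model's parameters.

import numpy as np

letters = ["L1", "C1", "A2", "P3", "B3"]        # letter-at-position nodes
words = ["LAP", "CAP", "CAB"]
nodes = letters + words
idx = {name: i for i, name in enumerate(nodes)}

W = np.zeros((len(nodes), len(nodes)))
# A letter-on-a-position activates the words it occurs in (and vice versa)
supports = {"L1": ["LAP"], "C1": ["CAP", "CAB"], "A2": ["LAP", "CAP", "CAB"],
            "P3": ["LAP", "CAP"], "B3": ["CAB"]}
for letter, ws in supports.items():
    for word in ws:
        W[idx[word], idx[letter]] = 0.2          # bottom-up excitation
        W[idx[letter], idx[word]] = 0.2          # top-down excitation
# Mutual inhibition: only one word, and one letter per position
for group, w_inh in ((words, -0.6), (["L1", "C1"], -0.2), (["P3", "B3"], -0.2)):
    for s in group:
        for t in group:
            if s != t:
                W[idx[s], idx[t]] = w_inh

act = np.zeros(len(nodes))
ext = np.zeros(len(nodes))
ext[[idx["L1"], idx["A2"], idx["P3"]]] = 0.3     # the features of "LAP" are shown
for _ in range(40):                              # settle into a consistent state
    act = np.clip(act + 0.5 * (W @ act + ext), 0.0, 1.0)
for word in words:
    print(word, round(act[idx[word]], 2))        # LAP should win the competition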
Hopfield (1982) • Bipolar activations • -1 or 1 • Symmetric weights (no self weights) • w_ij = w_ji • Asynchronous update rule • Select one neuron randomly and update it • Simple threshold rule for updating
Energy of a Hopfield network
Energy: E = -½ Σ_{i,j} w_ji a_i a_j
The part of E that depends on node j is -½ Σ_i (w_ji a_i + w_ij a_i) a_j = -Σ_i w_ji a_i a_j (using w_ij = w_ji)
Net input to node j is Σ_i w_ji a_i = net_j
Thus, we can write this part of the energy as E_j = -net_j a_j
Given a net input, net_j, find a_j so that -net_j a_j is minimized • If net_j is positive, set a_j to 1 • If net_j is negative, set a_j to -1 • If net_j is zero, don't care (leave a_j as is) • This activation rule ensures that the energy never increases • Hence, the energy will eventually reach a minimum value (see the sketch below)
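A minimal sketch of this asynchronous update rule, assuming random symmetric weights; it shows the energy E = -½ Σ w_ji a_i a_j never increasing as single units are updated.

import numpy as np

rng = np.random.default_rng(0)
n = 16
W = rng.normal(size=(n, n))
W = (W + W.T) / 2            # symmetric weights
np.fill_diagonal(W, 0.0)     # no self weights

a = rng.choice([-1, 1], size=n)   # bipolar activations

def energy(W, a):
    return -0.5 * a @ W @ a

for step in range(200):
    j = rng.integers(n)               # select one neuron randomly
    net = W[j] @ a                    # net input to node j
    if net != 0:
        a[j] = 1 if net > 0 else -1   # threshold rule; energy cannot increase
    if step % 50 == 0:
        print(f"step {step:3d}  E = {energy(W, a):.3f}")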
Attractor • An attractor is a stationary network state (configuration of activation values) • This is a state where it is not possible to minimize the energy any further by just flipping one activation value • It may be possible to reach a deeper attractor by flipping many nodes at once • Conclusion: The Hopfield rule does not guarantee that an absolute energy minimum will be reached
Attractor [diagram: energy landscape with attractors at a local minimum and at the global minimum]
Example: 8-Queens problem • Place 8 queens on a chess board such that no queen can take another • This implies the following three constraints: • 1 queen per column • 1 queen per row • at most 1 queen on any diagonal • With this encoding of the constraints, the deepest attractors of the network correspond to valid solutions
The constraints are satisfied by inhibitory connections [diagram: each board cell inhibits the cells in its column, its row, and its diagonals]
Problem: how to ensure that exactly 8 nodes are 1? • A term may be added to the activation rule to control for this • Binary nodes with a bias may be used (as in the sketch below) • It is also possible to use continuous-valued nodes with Hopfield networks (e.g., between 0 and 1)
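A sketch of how the 8-Queens wiring could look, using binary nodes and a bias as suggested above. The weight and bias values are illustrative assumptions, and, as the attractor slide warns, a single run may settle in a local minimum with fewer than 8 queens, so several random restarts may be needed.

import numpy as np

N = 8
rng = np.random.default_rng(1)
cells = [(r, c) for r in range(N) for c in range(N)]
W = np.zeros((N * N, N * N))
for i, (r1, c1) in enumerate(cells):
    for j, (r2, c2) in enumerate(cells):
        if i != j and (r1 == r2 or c1 == c2 or abs(r1 - r2) == abs(c1 - c2)):
            W[i, j] = -2.0           # inhibition along rows, columns, diagonals
bias = 1.0                            # pushes nodes on, so ~8 queens stay active

a = (rng.random(N * N) < 0.1).astype(float)   # binary nodes, sparse random start
for _ in range(5000):                          # asynchronous updates
    j = rng.integers(N * N)
    a[j] = 1.0 if W[j] @ a + bias > 0 else 0.0

board = a.reshape(N, N).astype(int)
print(board, "queens:", board.sum())          # may be < 8 in a local minimum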
The energy minimization question can also be turned around • Given a_i and a_j, how should we set the weight w_ji = w_ij so that the energy is minimized? • The connection contributes -½ w_ji a_i a_j to E, so: • when a_i a_j = 1, w_ji must be positive • when a_i a_j = -1, w_ji must be negative • For example, w_ji = μ a_i a_j, where μ is a learning constant
Hebb and Hopfield • When used with Hopfield type activation rules, the Hebb learning rule places patterns at attractors • If a network has n nodes, 0.15n random patterns can be reliably stored by such a system • For complete retrieval it is typically necessary to present the network with over 90% of the original pattern
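A minimal sketch of Hebbian storage in a Hopfield network, staying well under the 0.15n capacity and cueing with more than 90% of a stored pattern; the sizes and random seed are illustrative.

import numpy as np

rng = np.random.default_rng(2)
n, n_patterns = 100, 10                      # 10 < 0.15 * 100
P = rng.choice([-1, 1], size=(n_patterns, n))

W = (P.T @ P).astype(float) / n              # Hebb rule, summed over patterns
np.fill_diagonal(W, 0.0)                     # no self weights

# Cue with a slightly degraded copy of pattern 0 (95% intact) and settle
a = P[0].copy()
flip = rng.choice(n, size=5, replace=False)
a[flip] *= -1
for _ in range(10 * n):                      # asynchronous threshold updates
    j = rng.integers(n)
    net = W[j] @ a
    if net != 0:
        a[j] = 1 if net > 0 else -1
print("overlap with stored pattern:", (a @ P[0]) / n)   # ~1.0 if retrieved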
Bidirectional Associative Memories (BAM, Kosko 1988) • Uses binary nodes (0 or 1) • Symmetric weights • Input and output layer • Layers are updated in order, using threshold activation rule • Nodes within a layer are updated synchronously
BAM • BAM is in fact a Hopfield network with two layers of nodes • Within a layer, weights are 0 • These neurons are not dependent on each other (no mutual inputs) • If updated synchronously, there is therefore no danger of increasing the network energy • BAM is similar to the core of Grossberg’s Adaptive Resonance Theory (Lecture 4)
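A minimal BAM sketch under the assumptions above: binary nodes, one weight matrix used in both directions, and the two layers updated in turn (synchronously within a layer). The stored pair and the bipolar form of the Hebb rule used for the weights are illustrative.

import numpy as np

x = np.array([1, 0, 1, 1, 0, 1])             # input-layer pattern (binary)
y = np.array([1, 1, 0, 0])                   # output-layer pattern (binary)
W = np.outer(2 * y - 1, 2 * x - 1)           # weights between the two layers

def threshold(net, old):
    out = old.copy()
    out[net > 0] = 1                         # threshold activation rule
    out[net < 0] = 0                         # net == 0: leave the node as is
    return out

a_x = np.array([1, 0, 1, 0, 0, 1])           # noisy cue on the input layer
a_y = np.zeros(4, dtype=int)
for _ in range(5):                           # update the layers in order
    a_y = threshold(W @ a_x, a_y)            # input layer -> output layer
    a_x = threshold(W.T @ a_y, a_x)          # output layer -> input layer
print(a_x, a_y)                              # settles on the stored pair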
Linear Associative Networks • Invented independently by Kohonen (1972), Nakano (1972), and Anderson (1972) • Two layers • Linear activation rule • Activation is equal to net input • Can store patterns • Their behavior is mathematically tractable using matrix algebra
Associating an input vector p with an output vector q
Storage: W = μ qpᵀ, with μ = (pᵀp)⁻¹
Recall: Wp = μ qpᵀp = (pᵀp)⁻¹(pᵀp) q = q
Inner product pᵀp gives a scalar
For pᵀ = (3 0 1 4 0 1): pᵀp = 9 + 0 + 1 + 16 + 0 + 1 = 27, so μ = (pᵀp)⁻¹ = 1/27
Outer product qpᵀ gives a matrix
For the input vector pᵀ = (3 0 1 4 0 1) and the output vector q = (1 2 0 2 4 1)ᵀ:
qpᵀ =
  3  0  1  4  0  1
  6  0  2  8  0  2
  0  0  0  0  0  0
  6  0  2  8  0  2
 12  0  4 16  0  4
  3  0  1  4  0  1
The weight matrix W is this matrix multiplied by the learning constant: W = μ qpᵀ
Recall: Wp = q
With W = (1/27) qpᵀ, the first two rows of W are (0.11 0 0.04 0.15 0 0.04) and (0.22 0 0.07 0.30 0 0.07), so:
0.11·3 + 0·0 + 0.04·1 + 0.15·4 + 0·0 + 0.04·1 = 1
0.22·3 + 0·0 + 0.07·1 + 0.30·4 + 0·0 + 0.07·1 = 2
The remaining rows likewise reproduce the other elements of q
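The same worked example as a quick NumPy check; nothing here goes beyond the numbers above.

import numpy as np

p = np.array([3., 0., 1., 4., 0., 1.])
q = np.array([1., 2., 0., 2., 4., 1.])

mu = 1.0 / (p @ p)             # p^T p = 27, so mu = 1/27
W = mu * np.outer(q, p)        # storage: W = mu * q p^T
print(W @ p)                   # recall: W p = q -> [1. 2. 0. 2. 4. 1.]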
Storing n patterns
Storage: W_k = μ_k q_k p_kᵀ, with μ_k = (p_kᵀ p_k)⁻¹
W = W_1 + W_2 + … + W_k + … + W_n
Recall: W p_k = μ_k q_k p_kᵀ p_k + Error = q_k + Error
Error = W_1 p_k + … + W_h p_k + … + W_n p_k (all terms with h ≠ k), which is 0 only if p_hᵀ p_k = 0 for all h ≠ k
Characteristics of LANs • LANs only work well if the input patterns are (nearly) orthogonal • If an input pattern overlaps with other input patterns, recall will be contaminated by the output patterns associated with them (see the illustration below) • It is, therefore, important that input patterns are orthogonal (i.e., have little or no overlap)
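A small illustration of this crosstalk, using made-up three-element vectors: recall is exact when the inputs are orthogonal and contaminated when they overlap.

import numpy as np

def store(pairs):
    # Sum of per-pattern matrices W_k = mu_k * q_k p_k^T
    return sum((1.0 / (p @ p)) * np.outer(q, p) for p, q in pairs)

q1, q2 = np.array([1., 0., 0.]), np.array([0., 1., 0.])

# Orthogonal inputs: p1^T p2 = 0, so recall is exact
p1, p2 = np.array([1., 0., 1.]), np.array([1., 0., -1.])
print(store([(p1, q1), (p2, q2)]) @ p1)    # -> [1. 0. 0.], no error

# Overlapping inputs: p1^T p2 != 0, so q2 leaks into the recall of q1
p2 = np.array([1., 0., 1.5])
print(store([(p1, q1), (p2, q2)]) @ p1)    # contaminated by q2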
LANs have limited representational power • For each three-layer LAN, there exists an equivalent two-layer LAN • Proof: suppose that q = Wp and r = Vq; then r = Vq = VWp = Xp with X = VW [diagram: the network p → q → r with weight layers W and V collapses into p → r with the single layer X]
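A numeric check of this argument, with illustrative shapes: chaining two linear layers gives the same result as the single map X = VW.

import numpy as np

rng = np.random.default_rng(3)
W = rng.normal(size=(4, 6))     # first layer:  p -> q
V = rng.normal(size=(3, 4))     # second layer: q -> r
p = rng.normal(size=6)

r_two_step = V @ (W @ p)        # three-layer LAN (two weight layers)
r_one_step = (V @ W) @ p        # equivalent two-layer LAN with X = V W
print(np.allclose(r_two_step, r_one_step))   # True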
Summing up • There is a wide variety of ways to store and retrieve patterns in neural networks based on the Hebb rule • Willshaw network (associator) • BAM • LAN • Hopfield network • In Hopfield networks, stored patterns can be viewed as attractors
Summing up • Finding an attractor is a process of constraint satisfaction. It can be used as: • A recognition model • A memory retrieval model • A way of solving the traveling salesman problem and other difficult problems