Introduction to Neural Networks John Paxton Montana State University Summer 2003
Chapter 5: Adaptive Resonance Theory • 1987, Carpenter and Grossberg • ART1: clusters binary vectors • ART2: clusters continuous vectors
General • The weights on a cluster unit can be interpreted as a prototype pattern for the examples assigned to that cluster. • Relative similarity is used instead of an absolute difference, so a mismatch of 1 component is more significant in a vector with only a few non-zero components than in one with many.
General • Training examples may be presented several times. • Training examples may be presented in any order. • An example might change clusters. • Nets are stable (patterns don’t oscillate). • Nets are plastic (examples can be added).
Architecture • Input layer (xi) • Output layer or cluster layer – competitive (yj) • Units in the output layer can be active, inactive, or inhibited.
Sample Network • tji (top-down weight), bij (bottom-up weight) • [Diagram: input units x1 … xn fully connected to cluster units y1 … ym; the connections carry top-down weights such as t11 and bottom-up weights such as bnm]
Nomenclature • bij: bottom-up weight from input unit i to cluster unit j • tji: top-down weight from cluster unit j to input unit i • s: input vector • x: activation vector • n: number of components in the input vector • m: maximum number of clusters • || x ||: Σi xi • p: vigilance parameter
Training Algorithm 1. L > 1, 0 < p <= 1 tji(0) = 1 0 < bij(0) < L / (L – 1 + n) 2. while stopping criterion is false do steps 3 – 12 3. for each training example do steps 4 - 12
Training Algorithm 4. yj = 0 for every cluster unit 5. compute || s || 6. xi = si 7. for each j that is not inhibited: yj = Σi bij xi 8. find the largest yj that is not inhibited 9. xi = si * tji
Training Algorithm 10. compute || x || 11. if || x || / || s || < p then yj = -1 (inhibit unit j), go to step 8 12. update the winning unit j: bij = L xi / ( L – 1 + || x || ), tji = xi
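Steps 1 – 12 map almost line-for-line onto code. Below is a minimal Python sketch of the fast-learning algorithm, assuming NumPy; the class name ART1, the method name present, and the initialization bij(0) = 1/(1 + n) are illustrative choices, not taken from the slides.

```python
import numpy as np

class ART1:
    """Fast-learning ART1 clustering, following steps 1-12 above."""

    def __init__(self, n, m, p, L=2.0):
        self.n, self.m, self.p, self.L = n, m, p, L
        # Step 1: tji(0) = 1 and 0 < bij(0) < L / (L - 1 + n);
        # bij(0) = 1 / (1 + n) satisfies the bound when L = 2.
        self.b = np.full((n, m), 1.0 / (1.0 + n))  # bottom-up weights bij
        self.t = np.ones((m, n))                   # top-down weights tji

    def present(self, s):
        """Steps 4-12 for one binary example s (assumed to contain at
        least one 1). Returns the winning cluster index, or None if
        every cluster unit ends up inhibited."""
        s = np.asarray(s, dtype=float)
        norm_s = s.sum()                        # step 5: ||s||
        inhibited = np.zeros(self.m, dtype=bool)
        while not inhibited.all():
            y = self.b.T @ s                    # steps 6-7: yj = sum_i bij xi
            y[inhibited] = -np.inf
            j = int(np.argmax(y))               # step 8: lowest index breaks ties
            x = s * self.t[j]                   # step 9: xi = si * tji
            if x.sum() / norm_s >= self.p:      # steps 10-11: vigilance test
                # step 12: fast-learning update of the winning unit only
                self.b[:, j] = self.L * x / (self.L - 1.0 + x.sum())
                self.t[j] = x
                return j
            inhibited[j] = True                 # failed vigilance: yj = -1
        return None
```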
Possible Stopping Criterion • No weight changes. • Maximum number of epochs reached.
What Happens If All Units Are Inhibited? • Lower p. • Add a cluster unit. • Throw out the current input as an outlier.
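The second option, adding a cluster unit, is easy to sketch against the ART1 class above; add_cluster is a hypothetical helper, not part of the slides' algorithm.

```python
import numpy as np

def add_cluster(net):
    """Grow an ART1 net (see the sketch above) by one uncommitted
    cluster unit, initialized exactly as in step 1."""
    net.b = np.hstack([net.b, np.full((net.n, 1), 1.0 / (1.0 + net.n))])
    net.t = np.vstack([net.t, np.ones((1, net.n))])
    net.m += 1
```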
Example • n = 4 • m = 3 • p = 0.4 (low vigilance) • L = 2 • bij(0) = 1/(1 + n) = 0.2 • tji(0) = 1 • [Diagram: input units x1 … x4 fully connected to cluster units y1 … y3]
Example 3. input vector s = (1 1 0 0) 4. yj = 0 5. || s || = 2 6. x = (1 1 0 0) 7. y1 = .2(1) + .2(1) + .2(0) + .2(0) = 0.4 y2 = y3 = 0.4 (there are only m = 3 cluster units)
Example 8. j = 1 (use lowest index to break ties) 9. x1 = s1 * t11 = 1 * 1 = 1 x2 = s2 * t12 = 1 * 1 = 1 x3 = s3 * t13 = 0 * 1 = 0 x4 = s4 * t14 = 0 * 1 = 0 10. || x || = 2 11. || x || / || s || = 1 >= 0.4
Example 12. b11 = 2 * x1 / (2 – 1 + || x ||) = 2 * 1 / (1 + 2) = .667 b21 = .667 b31 = b41 = 0 t11 = x1 = 1 t12 = 1 t13 = t14 = 0
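Running the ART1 sketch from earlier on this example reproduces these numbers (the code uses 0-based indices, so cluster unit y1 is index 0):

```python
net = ART1(n=4, m=3, p=0.4, L=2.0)   # bij(0) = 0.2, tji(0) = 1
j = net.present([1, 1, 0, 0])
print(j)              # 0, i.e. cluster unit y1
print(net.b[:, 0])    # [0.6667 0.6667 0.     0.    ]
print(net.t[0])       # [1. 1. 0. 0.]
```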
Exercise • Show the network after the training example (0 0 0 1) is processed.
Observations • Typically, stable weight matrices are obtained quickly. • The cluster units are all independent of one another; no topological ordering relates them. • We have just looked at the fast-learning version of ART1. There is also a slow-learning version that performs only a single weight-update iteration per presentation of a training example.