
Introduction to Neural Networks


Presentation Transcript


  1. Introduction to Neural Networks John Paxton Montana State University Summer 2003

  2. Chapter 5: Adaptive Resonance Theory • 1987, Carpenter and Grossberg • ART1: clusters binary vectors • ART2: clusters continuous vectors

  3. General • Weights on a cluster unit can be considered a prototype pattern • Relative similarity is used instead of an absolute difference, so a mismatch of one component is more significant in a vector with only a few non-zero components than in one with many (see the short illustration below).
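A minimal illustration of this point in Python, using the ratio || x || / || s || that the training algorithm below tests against the vigilance parameter. The match helper and the example vectors are hypothetical, chosen only to show the effect:

```python
# Hypothetical helper (not part of the algorithm) showing why a relative
# measure matters: this is the ratio || x || / || s || that the vigilance
# test in step 11 of the training algorithm compares against p.
def match(x, s):
    return sum(x) / sum(s)

print(match([1, 0], [1, 1]))            # 0.5 -- losing 1 of 2 active bits is severe
print(match([1] * 9 + [0], [1] * 10))   # 0.9 -- losing 1 of 10 barely matters
```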

  4. General • Training examples may be presented several times. • Training examples may be presented in any order. • An example might change clusters. • Nets are stable (patterns don’t oscillate). • Nets are plastic (examples can be added).

  5. Architecture • Input layer (xi) • Output layer or cluster layer – competitive (yi) • Units in the output layer can be active, inactive, or inhibited.

  6. Sample Network • t (top down weight), b (bottom up weight) • [Diagram: input units x1 … xn fully connected to cluster units y1 … ym; top-down weights tji (e.g., t11), bottom-up weights bij (e.g., bnm)]

  7. Nomenclature • bij: bottom up weight • tji: top down weight • s: input vector • x: activation vector • n: number of components in input vector • m: maximum number of clusters • || x ||: Σi xi • p: vigilance parameter

  8. Training Algorithm 1. initialize: L > 1, 0 < p <= 1, tji(0) = 1, 0 < bij(0) < L / (L – 1 + n) 2. while stopping criterion is false do steps 3 – 12 3. for each training example do steps 4 – 12

  9. Training Algorithm 4. yj = 0 for all j 5. compute || s || 6. xi = si 7. for each uninhibited unit yj: yj = Σi bij xi 8. find the largest uninhibited yj 9. xi = si * tji

  10. Training Algorithm 10. compute || x || 11. if || x || / || s || < p then yj = -1 (inhibit unit j), go to step 8 12. bij = L xi / (L – 1 + || x ||); tji = xi (a Python sketch of these steps follows)
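A minimal Python sketch of the fast-learning loop above (steps 1 and 4 – 12). The function names, the list-of-lists weight layout, and the choice to return None when every cluster unit is inhibited (one of the options on slide 12) are illustrative assumptions, not part of the slides:

```python
def init_weights(n, m):
    """Step 1: tji(0) = 1; bij(0) = 1/(1 + n), which satisfies
    0 < bij(0) < L / (L - 1 + n) for L = 2 (and matches slide 13)."""
    b = [[1.0 / (1 + n)] * m for _ in range(n)]   # bottom-up, b[i][j]
    t = [[1.0] * n for _ in range(m)]             # top-down, t[j][i]
    return b, t

def present(s, b, t, p, L=2):
    """Steps 4-12 for one binary training example s. Returns the index of
    the cluster that resonates, or None if all units become inhibited."""
    n, m = len(s), len(t)
    norm_s = sum(s)                               # step 5 (assumes s is not all zero)
    inhibited = set()
    while len(inhibited) < m:
        # steps 6-7: x = s, then activate every uninhibited cluster unit
        y = {j: sum(b[i][j] * s[i] for i in range(n))
             for j in range(m) if j not in inhibited}
        # step 8: largest yj; iterating in ascending j makes ties go to
        # the lowest index, as on slide 15
        J = max(sorted(y), key=lambda j: y[j])
        x = [s[i] * t[J][i] for i in range(n)]    # step 9
        norm_x = sum(x)                           # step 10
        if norm_x / norm_s < p:                   # step 11: vigilance test fails
            inhibited.add(J)                      # the slides' yj = -1
            continue
        for i in range(n):                        # step 12: fast-learning update
            b[i][J] = L * x[i] / (L - 1 + norm_x)
            t[J][i] = x[i]
        return J
    return None                                   # all units inhibited: see slide 12
```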

  11. Possible Stopping Criterion • No weight changes. • Maximum number of epochs reached.

  12. What Happens If All Units Are Inhibited? • Lower p. • Add a cluster unit. • Throw out the current input as an outlier.

  13. Example • n = 4 • m = 3 • p = 0.4 (low vigilance) • L = 2 • bij(0) = 1/(1 + n) = 0.2 • tji(0) = 1 • [Diagram: input units x1 … x4 connected to cluster units y1 … y3]

  14. Example 3. input vector (1 1 0 0) 4. yj = 0 for all j 5. || s || = 2 6. x = (1 1 0 0) 7. y1 = .2(1) + .2(1) + .2(0) + .2(0) = 0.4 y2 = y3 = 0.4

  15. Example 8. j = 1 (use lowest index to break ties) 9. x1 = s1 * t11 = 1 * 1 = 1 x2 = s2 * t12 = 1 * 1 = 1 x3 = s3 * t13 = 0 * 1 = 0 x4 = s4 * t14 = 0 * 1 = 0 10. || x || = 2 11. || x || / || s || = 1 >= 0.4

  16. Example 12. b11 = 2 * x1 / (2 – 1 + || x ||) = 2 * 1 / (1 + 2) = .667 b21 = .667 b31 = b41 = 0 t11 = x1 = 1 t12 = 1 t13 = t14 = 0
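Running the sketch from slide 10 reproduces these numbers. Note the code's indices are 0-based while the slides' are 1-based, so cluster y1 is index 0 here:

```python
# Reproduces slides 13-16 with the sketch above.
b, t = init_weights(n=4, m=3)              # bij(0) = 0.2, tji(0) = 1
J = present([1, 1, 0, 0], b, t, p=0.4)     # steps 3-12 for the first example
print(J)                                   # 0, i.e. the slides' y1
print([b[i][0] for i in range(4)])         # [0.666..., 0.666..., 0.0, 0.0]
print(t[0])                                # [1.0, 1.0, 0.0, 0.0]
# Exercise (slide 17): continue with present([0, 0, 0, 1], b, t, p=0.4)
# and inspect b and t to check your answer.
```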

  17. Exercise • Show the network after the training example (0 0 0 1) is processed.

  18. Observations • Typically, stable weight matrices are obtained quickly. • The cluster units are all topologically independent of one another. • We have just looked at the fast learning version of ART1. There is also a slow learning version that updates just one weight per training example.
