
IV. Neural Network Learning


Presentation Transcript


  1. IV. Neural Network Learning

  2. A. Neural Network Learning

  3. Supervised Learning • Produce desired outputs for training inputs • Generalize reasonably & appropriately to other inputs • Good example: pattern recognition • Feedforward multilayer networks

  4. Feedforward Network [diagram: input layer, hidden layers, output layer]

  5. Typical Artificial Neuron [diagram: inputs, connection weights, threshold, output]

  6. Typical Artificial Neuron [diagram: linear combination giving the net input (local field), followed by the activation function]

  7. Equations [net input and neuron output formulas; not captured in the transcript]
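A standard form consistent with the neuron pictured on slides 5–6 and the perceptron equation of slide 13 would be

$$ h = \sum_{j=1}^{n} w_j x_j - \theta, \qquad y = \sigma(h), $$

where h is the net input (local field), θ the threshold, and σ the activation function (for a perceptron, the step function Θ).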

  8. Single-Layer Perceptron [diagram]

  9. Variables [diagram of a single neuron: inputs x_1 … x_j … x_n, weights w_1 … w_j … w_n, summation Σ giving net input h, threshold θ, step function Θ, output y]

  10. Single Layer Perceptron Equations
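The equations themselves were not captured; for a single-layer perceptron with outputs indexed by i, a standard statement consistent with slides 9 and 13 is

$$ h_i = \sum_{j=1}^{n} w_{ij} x_j - \theta_i, \qquad y_i = \Theta(h_i), \qquad \Theta(h) = \begin{cases} 1, & h > 0,\\ 0, & h \le 0. \end{cases} $$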

  11. 2D Weight Vector [diagram: weight vector w in the (w1, w2) plane, with the separating line and the + and – regions]

  12. N-Dimensional Weight Vector [diagram: w as the normal vector to the separating hyperplane, with + and – half-spaces]

  13. Goal of Perceptron Learning • Suppose we have training patterns x^1, x^2, …, x^P with corresponding desired outputs y^1, y^2, …, y^P • where x^p ∈ {0, 1}^n, y^p ∈ {0, 1} • We want to find w, θ such that y^p = Θ(w·x^p – θ) for p = 1, …, P

  14. Treating Threshold as Weight [diagram: the neuron of slide 9, with its threshold θ shown explicitly]

  15. Treating Threshold as Weight [diagram: the same neuron with an extra input x_0 = –1 whose weight is w_0 = θ, so the threshold becomes an ordinary weight]

  16. Augmented Vectors
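The definitions were not captured; the usual construction, matching slide 15's x_0 = –1 and w_0 = θ, augments each vector with one extra component so that the threshold becomes a weight:

$$ \tilde{\mathbf{x}} = (-1, x_1, \ldots, x_n)^{\mathsf T}, \qquad \tilde{\mathbf{w}} = (\theta, w_1, \ldots, w_n)^{\mathsf T}, \qquad \Theta(\mathbf{w}\cdot\mathbf{x} - \theta) = \Theta(\tilde{\mathbf{w}}\cdot\tilde{\mathbf{x}}). $$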

  17. Reformulation as Positive Examples
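The formulas were not captured; a common way to carry out this reformulation, consistent with the z vectors on slide 18, is to fold the desired output into the pattern:

$$ \mathbf{z}^p = \begin{cases} \tilde{\mathbf{x}}^p, & y^p = 1,\\ -\tilde{\mathbf{x}}^p, & y^p = 0, \end{cases} $$

so that the learning goal becomes the single condition $\tilde{\mathbf{w}} \cdot \mathbf{z}^p > 0$ for all p = 1, …, P.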

  18. Adjustment of Weight Vector [diagram: weight vector being adjusted relative to the training vectors z_1 … z_11]

  19. Outline of Perceptron Learning Algorithm • initialize weight vector randomly • until all patterns classified correctly, do: • for p = 1, …, P do: • if z^p classified correctly, do nothing • else adjust weight vector to be closer to correct classification
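A minimal sketch of the algorithm outlined above, assuming the augmented, positive-example formulation of slides 16–17 (every pattern z^p should end up with w·z^p > 0) and a fixed-increment update; the function name, parameters, and stopping rule are illustrative rather than taken from the slides.

import numpy as np

def perceptron_learning(Z, eta=0.1, max_epochs=1000, seed=0):
    """Fixed-increment perceptron learning on augmented, sign-normalized
    training vectors Z (one row per pattern z^p).  A pattern counts as
    correctly classified when w . z^p > 0."""
    rng = np.random.default_rng(seed)
    P, n = Z.shape
    w = rng.uniform(-1.0, 1.0, size=n)      # initialize weight vector randomly
    for _ in range(max_epochs):
        errors = 0
        for z in Z:                         # for p = 1, ..., P
            if np.dot(w, z) <= 0:           # z^p misclassified
                w = w + eta * z             # move w closer to correct classification
                errors += 1
        if errors == 0:                     # all patterns classified correctly
            break
    return w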

  20. Weight Adjustment
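The update formulas did not survive the transcript; the standard fixed-increment rule, in the augmented notation of slides 16–18, leaves the weight vector unchanged when $\tilde{\mathbf{w}} \cdot \mathbf{z}^p > 0$ and otherwise moves it toward the misclassified pattern,

$$ \tilde{\mathbf{w}}' = \tilde{\mathbf{w}} + \eta\, \mathbf{z}^p, $$

with learning rate η > 0.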

  21. Improvement in Performance
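With the update of slide 20, performance on the misclassified pattern improves, since

$$ \tilde{\mathbf{w}}' \cdot \mathbf{z}^p = (\tilde{\mathbf{w}} + \eta\,\mathbf{z}^p) \cdot \mathbf{z}^p = \tilde{\mathbf{w}} \cdot \mathbf{z}^p + \eta\, \|\mathbf{z}^p\|^2 > \tilde{\mathbf{w}} \cdot \mathbf{z}^p, $$

so each adjustment pushes that pattern's projection toward the positive side.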

  22. Perceptron Learning Theorem • If there is a set of weights that will solve the problem, • then the PLA will eventually find it • (for a sufficiently small learning rate) • Note: only applies if positive & negative examples are linearly separable

  23. NetLogo Simulation of Perceptron Learning • Run Perceptron-Geometry.nlogo

  24. Classification Power of Multilayer Perceptrons • Perceptrons can function as logic gates • Therefore an MLP can form intersections, unions, and differences of linearly separable regions • Classes can be arbitrary hyperpolyhedra • Minsky & Papert criticism of perceptrons: at the time, no one had succeeded in developing an MLP learning algorithm

  25. Hyperpolyhedral Classes

  26. Credit Assignment Problem • How do we adjust the weights of the hidden layers? [diagram: feedforward network (input layer, hidden layers, output layer) with its output compared to the desired output]

  27. NetLogo Demonstration of Back-Propagation Learning • Run Artificial Neural Net.nlogo

  28. Adaptive System [diagram: system S with control parameters P_1 … P_k … P_m, a control algorithm C that adjusts them, and an evaluation function F (fitness, figure of merit)]

  29. Gradient
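The slide's content was not captured; in terms of the control parameters P_1, …, P_m of slide 28, the gradient of the fitness F is

$$ \nabla F = \left( \frac{\partial F}{\partial P_1}, \ldots, \frac{\partial F}{\partial P_m} \right)^{\mathsf T}, $$

the direction in parameter space in which F increases most steeply.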

  30. Gradient Ascent on Fitness Surface [diagram: fitness surface F with a gradient-ascent trajectory between the – and + regions]

  31. Gradient Ascent by Discrete Steps [diagram: fitness surface F with a stepwise ascent path]

  32. Gradient Ascent is Local But Not Shortest [diagram: fitness surface showing that the gradient path is not the shortest route to the peak]

  33. Gradient Ascent Process • Therefore gradient ascent increases fitness (until it reaches a zero gradient)
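The step leading to the "therefore" is the standard continuous-time argument: if the parameters follow the gradient,

$$ \dot{\mathbf{P}} = \eta\, \nabla F, $$

then

$$ \dot{F} = \nabla F \cdot \dot{\mathbf{P}} = \eta\, \|\nabla F\|^2 \ge 0, $$

so fitness can only increase, and it stops increasing exactly when ∇F = 0.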

  34. General Ascent in Fitness

  35. General Ascent on Fitness Surface [diagram: fitness surface F with an ascent path]

  36. Fitness as Minimum Error
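The definitions were not captured; a form consistent with slides 39–40 and 47 takes fitness to be the negative of the total squared error over the training set,

$$ F = -E, \qquad E = \sum_{q=1}^{P} E^q, \qquad E^q = \|\mathbf{t}^q - \mathbf{y}^q\|^2, $$

where t^q is the desired output and y^q the actual output for input x^q, so maximizing fitness means minimizing error.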

  37. Gradient of Fitness
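With that choice of fitness,

$$ \nabla F = -\nabla E = -\sum_{q=1}^{P} \nabla E^q, $$

so gradient ascent on F is gradient descent on the error, and the gradient can be accumulated pattern by pattern.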

  38. Jacobian Matrix
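The formulas were not captured; the Jacobian for pattern q collects the sensitivities of each output to each parameter (here, each weight),

$$ J^q_{ik} = \frac{\partial y_i^q}{\partial P_k}, $$

and by the chain rule $\nabla E^q = (J^q)^{\mathsf T}\, \nabla_{\mathbf{y}^q} E^q$.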

  39. Derivative of Squared Euclidean Distance
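Written out, the derivative used on this slide is

$$ \frac{\partial}{\partial y_i}\, \|\mathbf{t} - \mathbf{y}\|^2 = \frac{\partial}{\partial y_i} \sum_j (t_j - y_j)^2 = -2\,(t_i - y_i), \qquad \text{i.e.}\quad \nabla_{\mathbf{y}} \|\mathbf{t} - \mathbf{y}\|^2 = -2\,(\mathbf{t} - \mathbf{y}). $$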

  40. Gradient of Error on qth Input
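Combining the Jacobian (slide 38) with the derivative of the squared distance (slide 39), a standard statement is

$$ \frac{\partial E^q}{\partial P_k} = \sum_i \frac{\partial E^q}{\partial y_i^q}\, \frac{\partial y_i^q}{\partial P_k} = -2 \sum_i (t_i^q - y_i^q)\, J^q_{ik}, \qquad \nabla E^q = -2\,(J^q)^{\mathsf T} (\mathbf{t}^q - \mathbf{y}^q). $$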

  41. Recap • The Jacobian depends on the specific form of the system; in this case, a feedforward neural network

  42. Multilayer Notation [diagram: layer outputs s^1, s^2, …, s^(L–1), s^L connected by weight matrices W^1, W^2, …, W^(L–2), W^(L–1), with input x^q feeding s^1 and output y^q read from s^L]

  43. Notation • L layers of neurons, labeled 1, …, L • N_l neurons in layer l • s^l = vector of outputs from neurons in layer l • input layer: s^1 = x^q (the input pattern) • output layer: s^L = y^q (the actual output) • W^l = weights between layers l and l+1 • Problem: find how the outputs y_i^q vary with the weights W_jk^l (l = 1, …, L–1)

  44. Typical Neuron [diagram: neuron i in layer l receives inputs s_1^(l–1) … s_j^(l–1) … s_N^(l–1) through weights W_i1^(l–1) … W_ij^(l–1) … W_iN^(l–1); the summation gives net input h_i^l, and the activation σ gives output s_i^l]

  45. Error Back-Propagation
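As a concrete illustration of the procedure derived on the following slides, here is a minimal numerical sketch for a three-layer network (input, one hidden layer, output) with a logistic activation and squared error E^q = ||t^q – y^q||^2; the layer sizes, learning rate, activation choice, and function names are assumptions for the example, not details from the slides.

import numpy as np

def sigma(h):
    # logistic activation; note sigma'(h) = sigma(h) * (1 - sigma(h))
    return 1.0 / (1.0 + np.exp(-h))

def train_backprop(X, T, n_hidden=5, eta=0.5, epochs=1000, seed=0):
    # X: P x N1 input patterns, T: P x NL targets in (0, 1); returns trained weights.
    rng = np.random.default_rng(seed)
    N1, NL = X.shape[1], T.shape[1]
    W1 = rng.uniform(-1, 1, (n_hidden, N1 + 1))   # input -> hidden (last column holds the bias weight)
    W2 = rng.uniform(-1, 1, (NL, n_hidden + 1))   # hidden -> output (last column holds the bias weight)
    for _ in range(epochs):
        for x, t in zip(X, T):
            # forward pass; the bias acts as an extra input clamped to -1 (cf. slide 15)
            s1 = np.append(x, -1.0)
            s2 = np.append(sigma(W1 @ s1), -1.0)
            y = sigma(W2 @ s2)
            # backward pass: output-layer deltas, then hidden-layer deltas
            delta_out = 2.0 * (t - y) * y * (1.0 - y)
            delta_hid = (W2[:, :-1].T @ delta_out) * s2[:-1] * (1.0 - s2[:-1])
            # descend the error gradient (equivalently, ascend the fitness F = -E)
            W2 += eta * np.outer(delta_out, s2)
            W1 += eta * np.outer(delta_hid, s1)
    return W1, W2

Per-pattern (online) updates are used here to mirror the slide-by-slide derivation; accumulating the gradients over all patterns, as in the sum over q on slide 37, works equally well.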

  46. Delta Values
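The definition was not captured; a common convention (signs differ between presentations) measures how the error on pattern q responds to each neuron's net input,

$$ \delta_i^l = -\frac{\partial E^q}{\partial h_i^l}, $$

so that every weight derivative factors as $\partial E^q / \partial W_{ij}^{l-1} = -\,\delta_i^l\, s_j^{l-1}$.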

  47. Output-Layer Neuron [diagram: output neuron i receives s_1^(L–1) … s_j^(L–1) … s_N^(L–1) through weights W_i1^(L–1) … W_ij^(L–1) … W_iN^(L–1); net input h_i^L, activation σ, output s_i^L = y_i^q, compared with target t_i^q to give error E^q]

  48. Output-Layer Derivatives (1)

  49. Output-Layer Derivatives (2)
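The derivative slides were not captured; with E^q = ||t^q – y^q||^2, s_i^L = σ(h_i^L), and h_i^L = Σ_j W_ij^(L–1) s_j^(L–1), the chain rule gives

$$ \frac{\partial E^q}{\partial W_{ij}^{L-1}} = \frac{\partial E^q}{\partial h_i^L}\, \frac{\partial h_i^L}{\partial W_{ij}^{L-1}} = -2\,(t_i^q - s_i^L)\, \sigma'(h_i^L)\, s_j^{L-1} = -\,\delta_i^L\, s_j^{L-1}, \qquad \delta_i^L = 2\,(t_i^q - s_i^L)\, \sigma'(h_i^L). $$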

  50. Hidden-Layer Neuron [diagram: hidden neuron i in layer l receives s_1^(l–1) … s_N^(l–1) through weights W_i1^(l–1) … W_iN^(l–1); its output s_i^l = σ(h_i^l) feeds neurons 1 … k … N of layer l+1 through weights W_1i^l … W_ki^l … W_Ni^l, and so influences the error E^q]
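For a hidden neuron, the error is reached only through the neurons of layer l+1 that it feeds, which yields the back-propagation recursion (in the δ convention of slide 46 above):

$$ \delta_i^l = \sigma'(h_i^l) \sum_{k=1}^{N_{l+1}} W_{ki}^{l}\, \delta_k^{l+1}, \qquad \frac{\partial E^q}{\partial W_{ij}^{l-1}} = -\,\delta_i^l\, s_j^{l-1}, $$

so the deltas are computed backward from the output layer, and each weight is then adjusted by $\Delta W_{ij}^{l-1} = \eta\, \delta_i^l\, s_j^{l-1}$ to descend the error (equivalently, ascend the fitness).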
