Multi-Valued Neurons and Multilayer Neural Network based on Multi-Valued Neurons

Presentation Transcript


  1. Multi-Valued Neurons and Multilayer Neural Network based on Multi-Valued Neurons: MVN and MLMVN

  2. A threshold function is a linearly separable function. Linear separability means that it is possible to separate the “1”s and “-1”s by a hyperplane. Example: f(x1, x2) is the OR function, with f(1,1) = 1, f(1,-1) = -1, f(-1,1) = -1, f(-1,-1) = -1.

  3. Threshold Boolean Functions • A threshold (linearly separable) function can be learned by a single neuron • The number of threshold functions is very small in comparison to the number of all functions (104 of 256 for n=3, about 2000 of 65536 for n=4, etc.) • Non-threshold (nonlinearly separable) functions cannot be learned by a single neuron (Minsky-Papert, 1969); they can be learned only by a neural network

  4. XOR is a classical non-threshold (non-linearly separable) function: f(1,1) = 1, f(1,-1) = -1, f(-1,1) = -1, f(-1,-1) = 1. Non-linear separability means that it is impossible to separate the “1”s and “-1”s by a hyperplane.
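
The contrast between the OR and XOR slides can be checked by brute force. The sketch below is only an illustration: the function name `find_threshold_weights` and the small integer weight grid are assumptions, not part of the presentation. It searches for real weights (w0, w1, w2) such that the sign of w0 + w1*x1 + w2*x2 reproduces the function in the {1, -1} alphabet.

```python
from itertools import product

# Truth tables in the {1, -1} alphabet used on the slides.
OR_TABLE = {(1, 1): 1, (1, -1): -1, (-1, 1): -1, (-1, -1): -1}
XOR_TABLE = {(1, 1): 1, (1, -1): -1, (-1, 1): -1, (-1, -1): 1}

def find_threshold_weights(table, grid=range(-3, 4)):
    """Search a small integer grid (an assumption of this sketch) for weights
    (w0, w1, w2) with sign(w0 + w1*x1 + w2*x2) == f(x1, x2) on every input.
    Returns the first match, or None if the grid contains no such weights."""
    for w0, w1, w2 in product(grid, repeat=3):
        separates = True
        for (x1, x2), f in table.items():
            z = w0 + w1 * x1 + w2 * x2
            # z == 0 lies on the hyperplane itself, so it does not separate.
            if z == 0 or (1 if z > 0 else -1) != f:
                separates = False
                break
        if separates:
            return (w0, w1, w2)
    return None

or_weights = find_threshold_weights(OR_TABLE)
xor_weights = find_threshold_weights(XOR_TABLE)
```

For OR the search finds a separating hyperplane; for XOR it returns `None`. The grid is finite, but the XOR constraints contradict each other for any real weights (two of them require w0 > 0 and the other two require w0 < 0), so enlarging the grid would not help.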

  5. Multi-valued mappings • The first artificial neurons could learn only Boolean functions. • However, Boolean functions can describe only a very limited class of problems. • Thus, the ability to learn and implement not only Boolean, but also multiple-valued and continuous functions is very important for solving pattern recognition, classification and approximation problems. • This determines the importance of those neurons that can learn and implement multiple-valued and continuous mappings

  6. Traditional approach to learning multiple-valued mappings by a neuron: • Sigmoid activation function (the most popular): $\varphi(z) = \frac{1}{1 + e^{-z}}$

  7. Sigmoidal neurons: limitations • The sigmoid activation function has limited plasticity and limited flexibility. • Thus, to learn functions whose behavior is quite different from that of the sigmoid function, it is necessary to create a network, because a single sigmoidal neuron is not able to learn such functions.

  8. Is it possible to overcome the Minsky-Papert limitation for the classical perceptron? Yes!

  9. We can overcome the Minsky-Papert limitation using complex-valued weights and a complex activation function

  10. Is it possible to learn XOR and Parity n functions using a single neuron? • Any classical monograph or textbook on neural networks claims that a network of at least three neurons is needed to learn the XOR function. • This is true for real-valued neurons and real-valued neural networks. • However, this is not true for complex-valued neurons! • A jump to the complex domain is the right way to overcome the Minsky-Papert limitation and to learn multiple-valued and Boolean nonlinearly separable functions using a single neuron.

  11. NEURAL NETWORKS (classification diagram): Traditional Neurons; Complex-Valued Neurons; Multi-Valued and Universal Binary Neurons; Neuro-Fuzzy Networks; Generalizations of Sigmoidal Neurons

  12. Complex numbers • Unlike a real number, which is geometrically a point on a line, a complex number is a point on a plane. • Its coordinates are called the real (Re, horizontal) and imaginary (Im, vertical) parts of the number • i is the imaginary unit • r is the modulus (absolute value) of the number • Algebraic form of a complex number: $z = x + iy$, with $r = |z| = \sqrt{x^2 + y^2}$

  13. Complex numbers • φ is the argument (phase in terms of physics) of a complex number • Trigonometric and exponential (Euler’s) forms of a complex number: $z = r(\cos\varphi + i\sin\varphi) = re^{i\varphi}$ • On the unit circle, r = 1

  14. Complex numbers • Complex-conjugated numbers: $\bar{z} = x - iy = re^{-i\varphi}$
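
The three forms of a complex number and the conjugate can be tried out with Python's standard `cmath` module. This is a small illustration only; the sample value 3 + 4i is an arbitrary choice, not taken from the presentation.

```python
import cmath
import math

z = 3 + 4j                          # algebraic form x + iy
r, phi = cmath.polar(z)             # modulus r = |z| and argument phi = arg z

# Trigonometric form r(cos(phi) + i*sin(phi)) and exponential (Euler) form.
z_trig = r * (math.cos(phi) + 1j * math.sin(phi))
z_exp = r * cmath.exp(1j * phi)

# Complex conjugate: same real part, imaginary part with the opposite sign.
z_conj = z.conjugate()
```

All three forms describe the same point on the plane, and the product of a number with its conjugate equals the squared modulus.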

  15. XOR problem: n=2, m=4 (four sectors); W = (0, 1, i) is the weighting vector
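
The XOR slide can be verified numerically. The sketch below assumes the usual universal binary neuron activation, which is not spelled out in this transcript: the complex plane is divided into m equal sectors and the output alternates between 1 and -1 from sector to sector, starting with 1 in the sector that begins at angle 0. The names `sector_sign` and `neuron` are illustrative.

```python
import cmath
import math

def sector_sign(z, m):
    """Assumed activation: +1 if arg z falls into an even-numbered sector
    of the m equal sectors, -1 if it falls into an odd-numbered one."""
    angle = cmath.phase(z) % (2 * math.pi)        # argument in [0, 2*pi)
    j = int(angle // (2 * math.pi / m)) % m       # sector index
    return 1 if j % 2 == 0 else -1

W = (0, 1, 1j)   # weighting vector from the slide: w0 = 0, w1 = 1, w2 = i

def neuron(x1, x2, m=4):
    z = W[0] + W[1] * x1 + W[2] * x2              # weighted sum
    return sector_sign(z, m)

XOR = {(1, 1): 1, (1, -1): -1, (-1, 1): -1, (-1, -1): 1}
results = {inputs: neuron(*inputs) for inputs in XOR}
```

Under that assumption the four weighted sums 1+i, 1-i, -1+i and -1-i fall into sectors 0, 3, 1 and 2, so a single neuron with these weights reproduces the XOR table.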

  16. Parity 3 problem: n=3, m=6 (six sectors); W = (0, ε, 1, 1) is the weighting vector

  17. Multi-Valued Neuron (MVN) • A Multi-Valued Neuron is a neural element with n inputs and one output lying on the unit circle, and with complex-valued weights. • The theoretical background behind the MVN is Multiple-Valued (k-valued) Threshold Logic over the field of complex numbers

  18. Multi-valued mappings and multiple-valued logic • We traditionally use Boolean functions and Boolean (two-valued) logic to represent two-valued mappings. • To represent multi-valued mappings, we should use multiple-valued logic

  19. Multiple-Valued Logic: classical view • The values of multiple-valued (k-valued) logic are traditionally encoded by the integers {0, 1, …, k-1} • On the one hand, this approach looks natural. • On the other hand, it represents only the quantitative properties, while it cannot represent the qualitative properties.

  20. Multiple-Valued Logic: classical view • For example, suppose we need to represent different colors in terms of multiple-valued logic. Let Red=0, Orange=1, Yellow=2, Green=3, etc. • What does it mean? • Is it true that Red < Orange < Yellow < Green?

  21. Multiple-Valued (k-valued) logic over the field of complex numbers • To represent and handle both the quantitative properties and the qualitative properties, it is possible to move to the field of complex numbers. • In this case, the argument (phase) may be used to represent the quality and the amplitude may be used to represent the quantity

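  22. Multiple-Valued (k-valued) logic over the field of complex numbers • Regular values of k-valued logic: $j \in \{0, 1, \dots, k-1\}$ • One-to-one correspondence: $j \leftrightarrow \varepsilon^j = e^{i 2\pi j/k}$ • $\varepsilon = e^{i 2\pi/k}$ is a primitive kth root of unity • The kth roots of unity $\{\varepsilon^0, \varepsilon, \varepsilon^2, \dots, \varepsilon^{k-1}\}$ are the values of k-valued logic over the field of complex numbers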

  23. Important advantage • In multiple-valued logic over the field of complex numbers all values of this logic are algebraically (arithmetically) equitable: they are normalized and their absolute values are equal to 1 • In the example with the colors, in terms of multiple-valued logic over the field of complex numbers they are coded by different phases. Hence, their quality is represented by the phase. • Since the phase determines the corresponding frequency, this representation meets the physical nature of the colors.
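
The color example can be made concrete with a short sketch. It assumes the standard correspondence between the integer value j and the kth root of unity exp(i*2*pi*j/k); the helper names `encode` and `decode` are illustrative and not part of the presentation.

```python
import cmath
import math

def encode(value, k):
    """Map an integer value in {0, ..., k-1} to the k-th root of unity
    exp(i*2*pi*value/k), a point on the unit circle."""
    return cmath.exp(1j * 2 * math.pi * value / k)

def decode(z, k):
    """Recover the integer value from the phase of a k-th root of unity."""
    angle = cmath.phase(z) % (2 * math.pi)
    return round(angle * k / (2 * math.pi)) % k

colors = ["Red", "Orange", "Yellow", "Green"]
k = len(colors)
codes = {name: encode(index, k) for index, name in enumerate(colors)}
decoded = {name: decode(code, k) for name, code in codes.items()}
```

Every code has absolute value 1, so no color is "larger" than another; the colors differ only in phase, and the integer labels can still be recovered from that phase.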

  24. Discrete-Valued (k-valued) Activation Function • $P(z) = e^{i 2\pi j/k}$ if $2\pi j/k \le \arg z < 2\pi (j+1)/k$ • Function P maps the complex plane into the set of the kth roots of unity

  25. Discrete-Valued (k-valued) Activation Function, k=16
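
A minimal sketch of the discrete activation function follows. It assumes the usual sector definition, in which sector j covers the arguments from 2*pi*j/k up to, but not including, 2*pi*(j+1)/k; the function name `discrete_activation` is illustrative.

```python
import cmath
import math

def discrete_activation(z, k):
    """Return (j, P(z)), where j is the index of the sector that contains
    arg z and P(z) = exp(i*2*pi*j/k) is the corresponding k-th root of unity."""
    angle = cmath.phase(z) % (2 * math.pi)          # argument in [0, 2*pi)
    j = int(angle * k / (2 * math.pi)) % k          # sector index
    return j, cmath.exp(1j * 2 * math.pi * j / k)

j4_a, out4_a = discrete_activation(1 + 1j, 4)       # arg = pi/4
j4_b, out4_b = discrete_activation(-1 + 1j, 4)      # arg = 3*pi/4
j16, out16 = discrete_activation(0.5 - 2j, 16)      # fourth quadrant, k = 16
```

The output depends only on the argument of z, not on its magnitude, and always lies on the unit circle.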

  26. Multi-Valued Neuron (MVN) • $f(x_1, \dots, x_n) = P(w_0 + w_1 x_1 + \dots + w_n x_n)$ • f is a function of k-valued logic (k-valued threshold function)

  27. MVN: main properties • The key properties of MVN: • Complex-valued weights • The activation function is a function of the argument of the weighted sum • Complex-valued inputs and output that lie on the unit circle (kth roots of unity) • Higher functionality than that of traditional neurons (e.g., sigmoidal) • Simplicity of learning

  28. MVN Learning • Learning is reduced to movement along the unit circle • No derivative is needed; learning is based on the error-correction rule • $\varepsilon^q$ is the desired output • $\varepsilon^s$ is the actual output • $\delta = \varepsilon^q - \varepsilon^s$ is the error, which completely determines the weights adjustment

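  29. Learning Algorithm for the Discrete MVN with the Error-Correction Learning Rule • $W_{r+1} = W_r + \frac{\alpha_r}{n+1} (\varepsilon^q - \varepsilon^s) \bar{X}$ • W is the weighting vector; X is the input vector • $\bar{X}$ is the complex conjugate of X • $\alpha_r$ is the learning rate (should always be equal to 1) • r is the current iteration; r+1 is the next iteration • $\varepsilon^q$ is the desired output (sector) • $\varepsilon^s$ is the actual output (sector)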
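
The error-correction rule for the discrete MVN can be sketched as a short training loop. Several things here are assumptions rather than details from the presentation: the update is taken to be W_{r+1} = W_r + alpha_r/(n+1) * (eps^q - eps^s) * conj(X), with eps^q the desired and eps^s the actual output, the weights start from zero, the epoch limit is arbitrary, and the toy task (map each 4th root of unity to its own sector index) is purely illustrative. The helper names are hypothetical.

```python
import cmath
import math

def sector(z, k):
    """Index j of the sector [2*pi*j/k, 2*pi*(j+1)/k) that contains arg z."""
    angle = cmath.phase(z) % (2 * math.pi)
    return int(angle * k / (2 * math.pi)) % k

def root(j, k):
    """The k-th root of unity eps**j = exp(i*2*pi*j/k)."""
    return cmath.exp(1j * 2 * math.pi * j / k)

def weighted_sum(weights, x):
    """z = w0 + w1*x1 + ... + wn*xn."""
    return weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))

def train_discrete_mvn(samples, k, n, max_epochs=1000, alpha=1.0):
    """samples: list of (inputs, desired sector q), inputs on the unit circle.
    Returns (weights, converged). Zero initial weights are an assumption."""
    weights = [0j] * (n + 1)
    for _ in range(max_epochs):
        errors = 0
        for x, q in samples:
            s = sector(weighted_sum(weights, x), k)   # actual sector
            if s == q:
                continue
            errors += 1
            delta = root(q, k) - root(s, k)           # error eps^q - eps^s
            step = alpha / (n + 1) * delta
            weights[0] += step                        # bias input x0 = 1
            for i, xi in enumerate(x, start=1):
                weights[i] += step * xi.conjugate()   # conjugated input
        if errors == 0:
            return weights, True
    return weights, False

k = 4
samples = [((root(j, k),), j) for j in range(k)]      # toy task
weights, converged = train_discrete_mvn(samples, k=k, n=1)
```

The loop stops as soon as one full pass over the samples produces no errors. Whether and how quickly that happens depends on the task and on the starting weights, so the `converged` flag should be checked rather than assumed.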

  30. Continuous-Valued Activation Function • Continuous-valued case ($k \to \infty$): $P(z) = e^{i \arg z} = z/|z|$ • Function P maps the complex plane into the unit circle

  31. Continuous-Valued Activation Function

  32. Continuous-Valued Activation Function

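  33. Learning Algorithm for the Continuous MVN with the Error-Correction Learning Rule • $W_{r+1} = W_r + \frac{\alpha_r}{n+1} \left(D - \frac{z}{|z|}\right) \bar{X}$ • D is the desired output • $z/|z|$ is the actual output • $\delta = D - z/|z|$ is the neuron’s error • W is the weighting vector; X is the input vector • $\bar{X}$ is the complex conjugate of X • $\alpha_r$ is the learning rate (should always be equal to 1) • r is the current iteration; r+1 is the next iteration • z is the weighted sum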

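  35. The role of the factor 1/(n+1) in the Learning Rule • $\delta$ is the neuron’s error • The weights after the correction (with the learning rate equal to 1): $\tilde{w}_0 = w_0 + \frac{\delta}{n+1}$; $\tilde{w}_i = w_i + \frac{\delta}{n+1} \bar{x}_i$, $i = 1, \dots, n$ • The weighted sum after the correction: $\tilde{z} = \tilde{w}_0 + \tilde{w}_1 x_1 + \dots + \tilde{w}_n x_n = z + \frac{\delta}{n+1} (1 + \bar{x}_1 x_1 + \dots + \bar{x}_n x_n) = z + \delta$, since $\bar{x}_i x_i = |x_i|^2 = 1$ for inputs on the unit circle. This is exactly what we are looking for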
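
The effect of the factor 1/(n+1) can be checked numerically. The sketch below assumes the continuous error-correction rule in the form W_{r+1} = W_r + alpha_r/(n+1) * (D - z/|z|) * conj(X), where the symbol D for the desired output is a notational assumption; the starting weights, inputs and desired output are arbitrary illustrative values.

```python
import cmath

def weighted_sum(weights, x):
    """z = w0 + w1*x1 + ... + wn*xn."""
    return weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))

def continuous_step(weights, x, desired, alpha=1.0):
    """One correction of the assumed continuous MVN rule.
    Returns the new weights, the old weighted sum z and the error delta."""
    n = len(x)
    z = weighted_sum(weights, x)
    delta = desired - z / abs(z)          # error: desired minus actual output
    step = alpha / (n + 1) * delta        # the error is shared by n+1 weights
    new_weights = [weights[0] + step]     # bias weight, constant input x0 = 1
    new_weights += [w + step * xi.conjugate() for w, xi in zip(weights[1:], x)]
    return new_weights, z, delta

weights = [0.3 + 0.1j, -0.2 + 0.5j, 0.7 - 0.4j]      # arbitrary start
x = (cmath.exp(0.4j), cmath.exp(2.1j))               # inputs on the unit circle
desired = cmath.exp(1.0j)                            # desired output

new_weights, z, delta = continuous_step(weights, x, desired)
z_new = weighted_sum(new_weights, x)
```

Because each of the n+1 weights receives an equal share delta/(n+1) and every input has absolute value 1, the weighted sum moves by exactly delta in a single step.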

  36. Self-Adaptation of the Learning Rate • $|z_r|$ is the absolute value of the weighted sum on the previous (rth) iteration • $1/|z_r|$ is a self-adaptive part of the learning rate (the figure shows the unit circle with the regions |z| > 1 and |z| < 1)

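  37. Modified Learning Rules with the Self-Adaptive Learning Rate • Discrete MVN: $W_{r+1} = W_r + \frac{\alpha_r}{(n+1)|z_r|} (\varepsilon^q - \varepsilon^s) \bar{X}$, where $\varepsilon^q$ is the desired output and $\varepsilon^s$ is the actual output • Continuous MVN: $W_{r+1} = W_r + \frac{\alpha_r}{(n+1)|z_r|} \left(D - \frac{z_r}{|z_r|}\right) \bar{X}$, where D is the desired output and $z_r$ is the weighted sum • $1/|z_r|$ is a self-adaptive part of the learning rate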
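
A sketch of the self-adaptive variant for the continuous case follows. It assumes the same form of the rule as the plain error-correction update, with the step additionally divided by |z_r|; the symbol names, the sample weights and the two scale factors are illustrative assumptions.

```python
import cmath

def weighted_sum(weights, x):
    """z = w0 + w1*x1 + ... + wn*xn."""
    return weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))

def adaptive_step(weights, x, desired, alpha=1.0):
    """One correction with the extra factor 1/|z| in the learning rate."""
    n = len(x)
    z = weighted_sum(weights, x)
    delta = desired - z / abs(z)
    step = alpha / ((n + 1) * abs(z)) * delta        # self-adaptive part 1/|z|
    new_weights = [weights[0] + step]
    new_weights += [w + step * xi.conjugate() for w, xi in zip(weights[1:], x)]
    return new_weights, z, delta

x = (cmath.exp(0.4j), cmath.exp(2.1j))               # inputs on the unit circle
desired = cmath.exp(1.0j)
base = [0.3 + 0.1j, -0.2 + 0.5j, 0.7 - 0.4j]

# Compare a weighted sum inside the unit circle with one far outside it.
shifts = {}
for scale in (0.1, 10.0):
    weights = [scale * w for w in base]
    new_weights, z, delta = adaptive_step(weights, x, desired)
    moved = abs(weighted_sum(new_weights, x) - z)
    shifts[scale] = (abs(z), moved, abs(delta))
```

In this sketch the weighted sum moves by |delta|/|z|: the correction is amplified when |z| < 1 and damped when |z| > 1, which is one way to read the role of the self-adaptive factor.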

  38. Convergence of the learning algorithm • It is proven that the MVN learning algorithm converges after not more than k! iterations for the k-valued activation function • For the continuous MVN the learning algorithm converges with the precision λ after not more than (π/λ)! iterations, because in this case it is reduced to learning in π/λ-valued logic.

  39. MVN as a model of a biological neuron • Excitation → High frequency • Intermediate State → Medium frequency • Inhibition → Zero frequency (no impulses) • The state of a biological neuron is determined by the frequency of the generated impulses • The amplitude of impulses is always constant

  40. MVN as a model of a biological neuron

  41. MVN as a model of a biological neuron (figure: phase axis from 0 through π to 2π, with maximal inhibition, intermediate states and maximal excitation marked)

  42. MVN as a model of a biological neuron (figure: phase axis from 0 through π to 2π, with maximal inhibition and maximal excitation marked)

  43. MVN: • Learns faster • Adapts better • Learns even highly nonlinear functions • Opens new, very promising opportunities for network design • Is much closer to the biological neuron • Allows the Fourier Phase Spectrum to be used as a source of features for solving different recognition/classification problems • Allows the use of hybrid (discrete/continuous) inputs/output
