Multi-Valued Neurons and Multilayer Neural Network Based on Multi-Valued Neurons (MVN and MLMVN)
[Figure: the OR function f(x1, x2) on the plane, with f(1,1) = 1 and f(-1,1) = f(1,-1) = f(-1,-1) = -1; throughout these slides, logical 0 and 1 are encoded as +1 and -1.]
A threshold function is a linearly separable function. Linear separability means that it is possible to separate the "1"s and "-1"s by a hyperplane. Here f(x1, x2) is the OR function.
Threshold Boolean Functions
• A threshold (linearly separable) function can be learned by a single neuron
• The number of threshold functions is very small in comparison to the number of all Boolean functions (104 of 256 for n = 3, about 2,000 of 65,536 for n = 4, etc.)
• Non-threshold (nonlinearly separable) functions cannot be learned by a single neuron (Minsky and Papert, 1969); they can be learned only by a neural network
[Figure: the XOR function f(x1, x2) on the plane, with f(1,1) = f(-1,-1) = 1 and f(-1,1) = f(1,-1) = -1.]
XOR is a classical non-threshold (nonlinearly separable) function. Nonlinear separability means that it is impossible to separate the "1"s and "-1"s by a hyperplane, as the sketch below illustrates.
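The following sketch (not part of the original slides) illustrates both points with the classical real-valued perceptron and its error-correction rule: it finds separating weights for the OR function above but reports failure for XOR. The function name and the epoch cap are illustrative assumptions.

```python
import numpy as np

def train_perceptron(samples, epochs=100):
    w = np.zeros(3)                      # w[0] is the bias weight
    for _ in range(epochs):
        errors = 0
        for x1, x2, target in samples:
            x = np.array([1.0, x1, x2])  # constant input 1 for the bias
            y = 1 if w @ x >= 0 else -1
            if y != target:
                w += target * x          # error-correction step
                errors += 1
        if errors == 0:
            return w                     # converged: a separating hyperplane
    return None                          # no separating hyperplane found

# 0 -> 1, 1 -> -1 encoding, as in the slides
OR  = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, -1)]
XOR = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]

print(train_perceptron(OR))   # a weight vector: OR is a threshold function
print(train_perceptron(XOR))  # None: XOR is not linearly separable
```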
Multi-valued mappings
• The first artificial neurons could learn only Boolean functions.
• However, Boolean functions can describe only a very limited class of problems.
• Thus, the ability to learn and implement not only Boolean, but also multiple-valued and continuous functions is very important for solving pattern recognition, classification, and approximation problems.
• This determines the importance of neurons that can learn and implement multiple-valued and continuous mappings.
Traditional approach to learning multiple-valued mappings with a neuron:
• Sigmoid activation function (the most popular): σ(z) = 1/(1 + e^(-z))
[Figure: the sigmoid curve, rising from 0 to 1 and crossing 0.5 at z = 0.]
Sigmoidal neurons: limitations
• The sigmoid activation function has limited plasticity and limited flexibility.
• Thus, to learn functions whose behavior differs significantly from that of the sigmoid, a network must be created, because a single sigmoidal neuron cannot learn such functions.
Is it possible to overcome the Minsky-Papert limitation for the classical perceptron? Yes!!!
We can overcome the Minsky-Papert limitation using complex-valued weights and a complex activation function.
Is it possible to learn the XOR and Parity n functions using a single neuron?
• Any classical monograph/textbook on neural networks claims that a network of at least three neurons is needed to learn the XOR function.
• This is true for real-valued neurons and real-valued neural networks.
• However, it is not true for complex-valued neurons!
• A jump to the complex domain is the right way to overcome the Minsky-Papert limitation and to learn multiple-valued and Boolean nonlinearly separable functions using a single neuron.
[Diagram: a taxonomy of neural networks. Traditional Neurons (generalizations of sigmoidal neurons, neuro-fuzzy networks) vs. Complex-Valued Neurons (multi-valued and universal binary neurons).]
Complex numbers
• Unlike a real number, which is geometrically a point on a line, a complex number is a point on a plane.
• Its coordinates are called the real (Re, horizontal) and imaginary (Im, vertical) parts of the number: z = x + iy is the algebraic form of a complex number.
• i is the imaginary unit (i² = -1).
• r = |z| = √(x² + y²) is the modulus (absolute value) of the number.
Complex numbers
[Figure: a unit circle in the complex plane.]
• φ is the argument (phase, in physical terms) of a complex number.
• Trigonometric form: z = r(cos φ + i sin φ); exponential (Euler's) form: z = re^(iφ).
Complex numbers
• Complex-conjugate numbers: z = x + iy and z̄ = x − iy = re^(−iφ).
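Since everything that follows lives in the complex plane, here is a quick reference in Python (a sketch using only the standard library):

```python
import cmath
import math

z = 3 + 4j                          # algebraic form: z = x + iy
r, phi = abs(z), cmath.phase(z)     # modulus r and argument phi
print(r, phi)                       # 5.0 0.927...
print(r * cmath.exp(1j * phi))      # exponential form: r * e^{i*phi}
print(r * (math.cos(phi) + 1j * math.sin(phi)))   # trigonometric form
print(z.conjugate())                # complex conjugate: x - iy
print(z * z.conjugate())            # z * conj(z) = r**2 = 25
```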
XOR problem
n = 2, m = 4 (four sectors); W = (0, 1, i) is the weighting vector.
[Figure: the complex plane divided into four sectors with boundaries at 1, i, -1, -i; the sectors' outputs alternate between 1 and -1.]
Parity 3 problem
n = 3, m = 6 (six sectors); W = (0, ε, 1, 1) is the weighting vector.
[Figure: the complex plane divided into six sectors with alternating outputs 1 and -1; ε marks a sector boundary on the unit circle.]
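To make the two slides above concrete, here is a small check (not from the original slides) that the stated weighting vectors really implement XOR and Parity 3 under the alternating-sector binary activation. The interpretation of ε as e^(iπ/3), the primitive 6th root of unity, and the exact sector convention are assumptions:

```python
import cmath
import math
from itertools import product

def P_B(z, m):
    """Binary activation: output (-1)**j, where j indexes the sector of the
    complex plane (out of m equal sectors) containing z. The tiny phase
    offset guards against floating-point rounding for weighted sums that
    fall exactly on a sector border."""
    phase = (cmath.phase(z) + 1e-9) % (2 * math.pi)
    return (-1) ** int(phase / (2 * math.pi / m))

def neuron(weights, inputs, m):
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
    return P_B(z, m)

# XOR (n=2, m=4): in the +/-1 encoding, XOR(x1, x2) = x1*x2
for x1, x2 in product([1, -1], repeat=2):
    assert neuron((0, 1, 1j), (x1, x2), m=4) == x1 * x2

# Parity 3 (n=3, m=6): parity(x1, x2, x3) = x1*x2*x3
eps = cmath.exp(1j * math.pi / 3)   # assumed: primitive 6th root of unity
for x1, x2, x3 in product([1, -1], repeat=3):
    assert neuron((0, eps, 1, 1), (x1, x2, x3), m=6) == x1 * x2 * x3

print("a single complex-valued neuron implements XOR and Parity 3")
```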
Multi-Valued Neuron (MVN)
• A multi-valued neuron is a neural element with n inputs and one output lying on the unit circle, and with complex-valued weights.
• The theoretical background behind the MVN is multiple-valued (k-valued) threshold logic over the field of complex numbers.
Multi-valued mappings and multiple-valued logic
• We traditionally use Boolean functions and Boolean (two-valued) logic to represent two-valued mappings: f: {0,1}ⁿ → {0,1}.
• To represent multi-valued mappings, we should use multiple-valued logic: f: {0,1,…,k−1}ⁿ → {0,1,…,k−1}.
Multiple-Valued Logic: classical view
• The values of multiple-valued (k-valued) logic are traditionally encoded by the integers {0, 1, …, k−1}.
• On the one hand, this approach looks natural.
• On the other hand, it captures only quantitative properties; it cannot represent qualitative properties.
Multiple-Valued Logic: classical view
• For example, suppose we need to represent different colors in terms of multiple-valued logic. Let Red = 0, Orange = 1, Yellow = 2, Green = 3, etc.
• What does this mean?
• Is it true that Red < Orange < Yellow < Green?!
Multiple-Valued (k-valued) logic over the field of complex numbers
• To represent and handle both quantitative and qualitative properties, it is possible to move to the field of complex numbers.
• In this case, the argument (phase) may be used to represent quality, and the amplitude may be used to represent quantity.
Multiple-Valued (k-valued) logic over the field of complex numbers
[Figure: the regular values 0, 1, 2, …, k−1 of k-valued logic placed in one-to-one correspondence with points on the unit circle.]
The kth roots of unity are the values of k-valued logic over the field of complex numbers: the value j corresponds to ε^j, where ε = e^(2πi/k) is a primitive kth root of unity.
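As a tiny illustration (a sketch, not from the slides), the correspondence j ↔ ε^j can be generated directly:

```python
import cmath
import math

k = 6
eps = cmath.exp(2j * math.pi / k)   # primitive kth root of unity
for j in range(k):                  # the value j of k-valued logic ...
    root = eps ** j                 # ... corresponds to eps**j
    print(j, complex(round(root.real, 3), round(root.imag, 3)),
          round(abs(root), 12))
# every encoded value lies on the unit circle: abs(root) == 1 for all j
```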
Important advantage
• In multiple-valued logic over the field of complex numbers, all values of this logic are algebraically (arithmetically) equal in status: they are normalized, and their absolute values are all equal to 1.
• In the example with the colors, in terms of multiple-valued logic over the field of complex numbers, the colors are encoded by different phases. Hence, their quality is represented by the phase.
• Since the phase determines the corresponding frequency, this representation matches the physical nature of the colors.
Discrete-Valued (k-valued) Activation Function
P(z) = ε^j = e^(2πij/k), if 2πj/k ≤ arg z < 2π(j+1)/k
The function P maps the complex plane onto the set of the kth roots of unity.
[Figure: the complex plane divided into k sectors 0, 1, …, k−1; a weighted sum z falling into sector j is mapped to ε^j.]
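A direct transcription of this activation function into Python (a sketch under the sector convention just described):

```python
import cmath
import math

def P_discrete(z, k):
    # sector j contains arg(z): 2*pi*j/k <= arg(z) < 2*pi*(j+1)/k
    j = int((cmath.phase(z) % (2 * math.pi)) / (2 * math.pi / k))
    return cmath.exp(2j * math.pi * j / k)  # the kth root of unity eps**j

print(P_discrete(2 + 1j, 4))   # arg(z) in [0, pi/2)   ->  1
print(P_discrete(-1 + 2j, 4))  # arg(z) in [pi/2, pi)  ->  i (up to rounding)
```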
Multi-Valued Neuron (MVN)
f(x1, …, xn) = P(w0 + w1·x1 + … + wn·xn), where f is a function of k-valued logic (a k-valued threshold function).
MVN: main properties
• Complex-valued weights
• The activation function is a function of the argument of the weighted sum
• Complex-valued inputs and output lying on the unit circle (kth roots of unity)
• Higher functionality than that of traditional neurons (e.g., sigmoidal)
• Simplicity of learning
MVN Learning
[Figure: the unit circle with the actual output's sector s and the desired output's sector q.]
• Learning is reduced to movement along the unit circle
• No derivative is needed; learning is based on the error-correction rule
• ε^q is the desired output, ε^s is the actual output
• δ = ε^q − ε^s is the error, which completely determines the weight adjustment
Learning Algorithm for the Discrete MVN with the Error-Correction Learning Rule
W(r+1) = W(r) + (α(r)/(n+1))·(ε^q − ε^s)·X̄
where W is the weighting vector; X is the input vector and X̄ its complex conjugate; α(r) is the learning rate (should always be equal to 1); r is the current iteration and r+1 the next iteration; ε^q is the desired output (sector q) and ε^s is the actual output (sector s).
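A runnable sketch of this rule (not from the original slides): the target function is generated from a hidden "teacher" weight vector, so it is a k-valued threshold function by construction and the algorithm must converge. All names and the epoch cap are illustrative.

```python
import cmath
import math
import random
from itertools import product

k, n = 8, 2
eps = cmath.exp(2j * math.pi / k)

def sector(z):
    return int((cmath.phase(z) % (2 * math.pi)) / (2 * math.pi / k))

# Build a k-valued threshold function from a hidden "teacher" weight vector.
random.seed(1)
teacher = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n + 1)]
inputs = [(1,) + x for x in product([eps ** j for j in range(k)], repeat=n)]
samples = [(X, sector(sum(w * x for w, x in zip(teacher, X)))) for X in inputs]

W = [0j] * (n + 1)                     # start from zero weights
for epoch in range(1, 1001):
    errors = 0
    for X, q in samples:               # q = desired sector
        s = sector(sum(w * x for w, x in zip(W, X)))   # s = actual sector
        if s != q:
            errors += 1
            delta = eps ** q - eps ** s                # the error
            W = [w + delta * x.conjugate() / (n + 1) for w, x in zip(W, X)]
    if errors == 0:
        print(f"converged after {epoch} epochs")
        break
```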
Continuous-Valued Activation Function
Continuous-valued case (k → ∞): P(z) = e^(i·arg z) = z/|z|
The function P maps the complex plane onto the unit circle.
[Figure: the weighted sum z projected onto the unit circle.]
Learning Algorithm for the Continuous MVN with the Error-Correction Learning Rule
W(r+1) = W(r) + (α(r)/(n+1))·δ·X̄, where δ = D − z/|z| is the neuron's error
Here D is the desired output and z/|z| is the actual output; W is the weighting vector; X is the input vector and X̄ its complex conjugate; α(r) is the learning rate (should always be equal to 1); r is the current iteration and r+1 the next iteration; z is the weighted sum.
- neuron’s error A role of the factor 1/(n+1)in the Learning Rule The weights after the correction: The weighted sum after the correction: - exactly what we are looking for
i • is the absolute value of the weighted sum on the • previous (rth) iteration. 1 |z|>1 |z|<1 is a self-adaptive part of the learning rate Self-Adaptation of the Learning Rate 1/|zr| is a self-adaptive part of the learning rate
Modified Learning Rules with the Self-Adaptive Learning Rate
Discrete MVN: W(r+1) = W(r) + (α(r)/((n+1)·|z(r)|))·(ε^q − ε^s)·X̄
Continuous MVN: W(r+1) = W(r) + (α(r)/((n+1)·|z(r)|))·(D − z(r)/|z(r)|)·X̄
1/|z(r)| is a self-adaptive part of the learning rate.
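In code, the only change from the earlier update is the extra factor 1/|z(r)|. A sketch covering both variants (the function and parameter names are illustrative, not from the slides):

```python
import cmath
import math

def update(W, X, D, k=None):
    """One self-adaptive error-correction step. D is the desired output:
    a kth root of unity for the discrete MVN, any point on the unit
    circle for the continuous MVN (k=None)."""
    z = sum(w * x for w, x in zip(W, X))
    if k is not None:   # discrete: actual output is the root of z's sector
        j = int((cmath.phase(z) % (2 * math.pi)) / (2 * math.pi / k))
        y = cmath.exp(2j * math.pi * j / k)
    else:               # continuous: actual output is z/|z|
        y = z / abs(z)
    n = len(W) - 1
    return [w + (D - y) * x.conjugate() / ((n + 1) * abs(z))
            for w, x in zip(W, X)]

# usage: one continuous-MVN step toward the desired output e^{0.7i}
W = update([0.1 + 0.2j, 0.5j, -0.3 + 0j], [1, 1j, -1], D=cmath.exp(0.7j))
```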
Convergence of the learning algorithm
• It is proven that the MVN learning algorithm converges after not more than k! iterations for the k-valued activation function.
• For the continuous MVN, the learning algorithm converges with precision λ after not more than (π/λ)! iterations, because in this case it is reduced to learning in π/λ-valued logic.
MVN as a model of a biological neuron
[Figure: excitation corresponds to a high impulse frequency, intermediate states to a medium frequency, and inhibition to zero frequency (no impulses).]
• The state of a biological neuron is determined by the frequency of the generated impulses
• The amplitude of the impulses is always constant
MVN as a model of a biological neuron
[Figure: the neuron's state as a phase on the interval [0, 2π], ranging between maximal excitation and maximal inhibition, with intermediate states in between.]
MVN:
• Learns faster and adapts better than traditional neurons
• Learns even highly nonlinear functions
• Opens new, very promising opportunities for network design
• Is much closer to the biological neuron
• Makes it possible to use the Fourier phase spectrum as a source of features for solving various recognition/classification problems
• Allows hybrid (discrete/continuous) inputs/outputs