Wavelet & Neurowavelet Provided by: Seyed Ehsan Safavieh
What is a wavelet? • A wavelet transform decomposes a signal into components at different frequencies, allowing each component to be studied separately. • The basic idea of the wavelet transform, like other transforms, is to map the signal from one basis to another. • These basis functions, called baby wavelets, are dilated and translated versions of a single function called the mother wavelet.
Introduction to wavelets: DWT • At each step the wavelet transform separates the signal into two components: a trend and a fluctuation. • Example (Haar wavelet), for a signal f = (f1, …, fN) with N even: • Trend: a_m = (f_{2m−1} + f_{2m}) / √2 • Fluctuation: d_m = (f_{2m−1} − f_{2m}) / √2
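For concreteness, a minimal numpy sketch of one Haar analysis step on a toy signal (the sample values are made up purely for illustration):

    import numpy as np

    # Toy signal with an even number of samples (values chosen arbitrarily).
    f = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])

    # One Haar analysis step: pairwise sums (trend) and differences (fluctuation),
    # scaled by 1/sqrt(2) so that energy is preserved.
    trend       = (f[0::2] + f[1::2]) / np.sqrt(2)
    fluctuation = (f[0::2] - f[1::2]) / np.sqrt(2)

    print("trend      :", trend)
    print("fluctuation:", fluctuation)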
Introduction to wavelets: DWT • So we have certain "scaling" and "wavelet" signals; the trends and fluctuations are the scalar products of the original signal with shifted versions of them: a_m = f · V_m and d_m = f · W_m.
Introduction to wavelets: DWT • We can write this step as a mapping f ↦ (a¹ | d¹), where a¹ is the trend subsignal and d¹ the fluctuation subsignal. • Applying the same step again to the trend gives further levels: f ↦ (a² | d² | d¹), and so on.
Introduction to wavelets: DWT • Energy distribution: the transform conserves the energy of the signal, E_f = E_{a¹} + E_{d¹}, and most of the energy is typically concentrated in the trend subsignal.
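A small numerical check of this energy distribution, continuing the Haar sketch above: the analysis step conserves total energy, and most of it ends up in the trend.

    import numpy as np

    f = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
    a = (f[0::2] + f[1::2]) / np.sqrt(2)   # trend
    d = (f[0::2] - f[1::2]) / np.sqrt(2)   # fluctuation

    energy = lambda x: np.sum(x ** 2)
    print(energy(f), energy(a) + energy(d))              # equal up to rounding
    print("share of energy in trend:", energy(a) / energy(f))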
Introduction to wavelets: CWT • Fourier transform: • Some information about a signal is not available in the time domain, so we need the frequency-domain representation of the signal. • The Fourier transform provides a frequency-amplitude plot of the signal that says what portion of each frequency is present. • The Fourier transform maps the signal onto a basis of complex trigonometric functions.
Introduction to wavelets: CWT • Example: x(t) = cos(2π·10t) + cos(2π·25t) + cos(2π·50t) + cos(2π·100t)
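A numpy sketch of the Fourier view of this example signal: the magnitude spectrum has peaks at 10, 25, 50 and 100 Hz (the sampling rate below is an arbitrary choice for illustration).

    import numpy as np

    fs = 1000                                  # assumed sampling rate (Hz)
    t = np.arange(0, 1.0, 1.0 / fs)            # 1 second of samples
    x = (np.cos(2*np.pi*10*t) + np.cos(2*np.pi*25*t)
         + np.cos(2*np.pi*50*t) + np.cos(2*np.pi*100*t))

    X = np.abs(np.fft.rfft(x)) / len(x)        # one-sided magnitude spectrum
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)

    # Frequencies of the four largest peaks (10, 25, 50, 100 Hz).
    print(sorted(freqs[np.argsort(X)[-4:]]))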
Introduction to wavelets: CWT • The Fourier transform carries no information about where in time a given frequency occurs. • It is well suited to stationary signals and to cases where we only need to know which frequencies exist. • Sometimes we need more locality in our analysis.
Introduction to wavelets: CWT • Short-Time Fourier Transform (STFT): • Unlike the FT, it shows which bands of frequencies exist in which time intervals. • It looks at the signal through windows narrow enough that the portion of the signal seen through each window is approximately stationary.
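A hand-rolled STFT sketch with numpy (window length, hop size and the test signal are arbitrary choices): each column of the result is the spectrum of one windowed segment, so it shows which frequencies occur in which time interval.

    import numpy as np

    def stft(x, fs, win_len=128, hop=64):
        # Magnitude STFT: FFT of successive Hann-windowed segments.
        window = np.hanning(win_len)
        starts = range(0, len(x) - win_len + 1, hop)
        frames = np.array([x[s:s + win_len] * window for s in starts])
        spec = np.abs(np.fft.rfft(frames, axis=1)).T           # freq x time
        freqs = np.fft.rfftfreq(win_len, 1.0 / fs)
        times = np.array([s + win_len / 2 for s in starts]) / fs
        return freqs, times, spec

    fs = 1000
    t = np.arange(0, 1.0, 1.0 / fs)
    # Non-stationary signal: 25 Hz in the first half, 100 Hz in the second.
    x = np.where(t < 0.5, np.cos(2*np.pi*25*t), np.cos(2*np.pi*100*t))

    freqs, times, spec = stft(x, fs)
    # Dominant frequency in each time frame: roughly 25 Hz early, roughly 100 Hz late.
    print(np.round(freqs[np.argmax(spec, axis=0)]))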
Introduction to wavelets: CWT • Heisenberg uncertainty principle • We cannot know exactly which frequency exists at which time instant. • In the FT there was no resolution problem: • In the time domain we know the exact time • In the frequency domain we know the exact frequency • Dilemma of resolution: • Narrow window -> poor frequency resolution • Wide window -> poor time resolution
Introduction to wavelets: CWT • The solution is an approach named MRA (multiresolution analysis). • It analyzes the signal at different frequencies with different resolutions: • High frequencies: high time resolution & low frequency resolution • Low frequencies: low time resolution & high frequency resolution • This approach makes sense when the signal has high-frequency components for short durations and low-frequency components for long durations.
Introduction to wavelets: CWT • The transformed signal is a function of "translation" (the location of the window) and "scale". • Wavelet • "Small wave": the window function is of finite length • Mother wavelet ψ(t) • A prototype for generating the other window functions • All the windows used are its dilated (or compressed) and shifted versions
Introduction to wavelets: CWT • Scale • s > 1: dilates the signal • s < 1: compresses the signal • Low frequency -> high scale -> non-detailed, global view of the signal • High frequency -> low scale -> detailed view that lasts only a short time • Only a limited interval of scales is necessary
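A minimal CWT-style sketch (Mexican hat wavelet, a hand-picked set of scales, toy signal): small scales respond to the short high-frequency burst, large scales to the slow component that lasts the whole duration.

    import numpy as np

    def mexican_hat(t):
        # Mexican hat (Ricker) mother wavelet, up to a normalization constant.
        return (1 - t**2) * np.exp(-t**2 / 2)

    def cwt(x, dt, scales):
        # CWT by direct correlation of the signal with scaled, energy-normalized wavelets.
        out = np.empty((len(scales), len(x)))
        for i, s in enumerate(scales):
            tau = np.arange(-4 * s, 4 * s + dt, dt)      # support of the daughter wavelet
            psi = mexican_hat(tau / s) / np.sqrt(s)      # daughter wavelet at scale s
            out[i] = np.convolve(x, psi[::-1], mode="same") * dt
        return out

    fs = 1000
    t = np.arange(0, 4.0, 1.0 / fs)
    # Low frequency for a long duration plus a short high-frequency burst.
    x = np.sin(2*np.pi*2*t) + np.where((t > 2.0) & (t < 2.2), np.sin(2*np.pi*40*t), 0.0)

    scales = [0.005, 0.02, 0.05, 0.12]       # small scale = detailed view, large = global view
    coeffs = cwt(x, 1.0 / fs, scales)
    # Time of the largest response at each scale: the smallest scale peaks near the
    # short burst (around t = 2.0-2.2 s); the largest scale follows the slow 2 Hz component.
    print(np.round(t[np.argmax(np.abs(coeffs), axis=1)], 2))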
[Figure: time-frequency tiling; all boxes have equal surface, and the Heisenberg uncertainty principle explains that this area cannot be less than π/4.]
Wavelet theory • A refinable function is a function φ: R → C which satisfies a recursion relation of the form φ(x) = √2 Σ_k h_k φ(2x − k), where the h_k are the recursion coefficients. • The refinable function φ is called orthogonal if its integer translates are orthonormal: ∫ φ(x) φ(x − k)* dx = δ_{0,k}. • This function is also called the "scaling function" or "father wavelet".
Wavelet theory • Haar recursion relation: φ(x) = φ(2x) + φ(2x − 1) • It satisfies the recursion relation with h0 = h1 = 1/√2 • [Figure: plots of φ(x), φ(2x) and φ(2x − 1) on [0, 1].]
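A quick numerical check of the Haar refinement relation φ(x) = φ(2x) + φ(2x − 1), i.e. √2 · Σ_k h_k φ(2x − k) with h0 = h1 = 1/√2:

    import numpy as np

    def phi(x):
        # Haar scaling (father) function: 1 on [0, 1), 0 elsewhere.
        return np.where((x >= 0) & (x < 1), 1.0, 0.0)

    x = np.linspace(-0.5, 1.5, 9)
    h = np.array([1, 1]) / np.sqrt(2)                        # Haar recursion coefficients
    rhs = np.sqrt(2) * (h[0] * phi(2*x) + h[1] * phi(2*x - 1))
    print(np.allclose(phi(x), rhs))                          # True: the refinement relation holds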
Wavelet theory • Let φ_{j,k}(x) be the translated and dilated functions defined by φ_{j,k}(x) = 2^{j/2} φ(2^j x − k). • j corresponds to a dilation along the x axis by the factor 2^{-j} and k to a translation along the x axis by k·2^{-j}. • The factor 2^{j/2} preserves the energy, i.e. ||φ_{j,k}|| = ||φ||.
Wavelet theory • Let V0 be the subspace of L2(R) spanned by the orthonormal set {φ_{0,k}(x)}. • All piecewise constant functions in L2(R) with jump discontinuities at the integers Z belong to V0. • Let V1 be the subspace of piecewise constant functions in L2(R) with jump discontinuities at (1/2)Z. • The set {φ_{1,k}(x)} is an orthonormal basis of V1. • Similarly, the set {φ_{j,k}(x)} is an orthonormal basis of Vj, the subspace of piecewise constant functions with jump discontinuities at 2^{-j}Z.
Wavelet theory • MRA: a nested sequence of subspaces V0 ⊂ V1 ⊂ V2 ⊂ … such that: • ∪_n Vn is dense in L2(R) • ∩_n Vn is {0} • There exists a function φ in L2(R) so that {φ(x − k)} forms a stable basis of V0 • The MRA is called orthogonal if φ is orthogonal.
Wavelet theory • The orthogonal projection of a function f in L2 onto Vn is P_n f = Σ_k ⟨f, φ_{n,k}⟩ φ_{n,k}. • We say that the functions in Vn have resolution or scale 2^{-n}. • P_n f is called an approximation to f at resolution 2^{-n}. • Lemma: for any function f in L2, P_n f → f as n → infinity. • The difference between the approximations at levels 2^{-n} and 2^{-n-1} is called the fine detail at resolution 2^{-n}.
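A numpy sketch of the Haar approximation at resolution 2^{-n}: since the Haar φ_{n,k} are normalized indicator functions of dyadic intervals, P_n f simply replaces f by its average on each interval of length 2^{-n}, and the approximation error shrinks as n grows.

    import numpy as np

    def haar_projection(f, n, x):
        # Project f onto V_n: average f over each dyadic interval [k/2^n, (k+1)/2^n).
        k = np.floor(x * 2**n)                       # index of the dyadic interval containing x
        out = np.empty_like(x)
        for idx in np.unique(k):
            mask = (k == idx)
            out[mask] = f(x[mask]).mean()            # sample mean approximates the exact interval average
        return out

    f = lambda x: np.sin(2 * np.pi * x)
    x = np.linspace(0, 1, 1024, endpoint=False)
    for n in (1, 3, 5, 7):
        err = np.max(np.abs(f(x) - haar_projection(f, n, x)))
        print(n, round(err, 4))                      # error decreases as the resolution 2^-n gets finer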
Wavelet theory • Let V be an inner product space and W a subspace of V. Then the orthogonal complement of W is W^⊥ = {v ∈ V : ⟨v, w⟩ = 0 for all w ∈ W}. • Let Wj be the orthogonal complement of Vj in Vj+1, so Vj+1 = Vj ⊕ Wj. • ⊕ is the direct sum of vector spaces. • Wk is the range of Qk = P_{k+1} − P_k, the fine-detail operator.
Wavelet theory • ⊕_n Wn is dense in L2 • Wi ⊥ Wj if i ≠ j • There exists a function ψ in L2 such that {ψ(x − k)} forms an orthogonal, stable basis of W0 • Since ψ ∈ V1, it can be represented as ψ(x) = √2 Σ_k g_k φ(2x − k)
Wavelet theory • g_k = (−1)^k h_{N−k} • N is some odd number • The function ψ is called the wavelet function or mother wavelet. • Haar: • ψ(x) = φ(2x) − φ(2x − 1) • {g0, g1} = {h1, −h0} = {1/√2, −1/√2}
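A small sketch of the relation g_k = (−1)^k h_{N−k}, checked on the Haar coefficients and on the Daubechies D2 coefficients (written here under the Σ h_k = √2 normalization used above):

    import numpy as np

    def wavelet_filter(h):
        # Wavelet (high-pass) coefficients from the scaling (low-pass) ones:
        # g_k = (-1)^k * h_{N-k}, with N = len(h) - 1 (odd for the usual even-length filters).
        N = len(h) - 1
        return np.array([(-1)**k * h[N - k] for k in range(len(h))])

    h_haar = np.array([1, 1]) / np.sqrt(2)
    print(wavelet_filter(h_haar))                    # [ 1/sqrt(2), -1/sqrt(2) ]  -> Haar wavelet

    s3 = np.sqrt(3)
    h_d2 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2))
    g_d2 = wavelet_filter(h_d2)
    print(g_d2, np.dot(h_d2, g_d2))                  # orthogonality of the two filters: <h, g> = 0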
Wavelet theory • Daubechies wavelet D2
Wavelet theory • Given f in L2, we can represent it as f = Σ_{j∈Z} Σ_k ⟨f, ψ_{j,k}⟩ ψ_{j,k}. • Assuming f ∈ Vn, the expansion reduces to a coarse approximation in V0 plus the details up to level n: f = Σ_k ⟨f, φ_{0,k}⟩ φ_{0,k} + Σ_{j=0}^{n−1} Σ_k ⟨f, ψ_{j,k}⟩ ψ_{j,k}.
Wavelet theory • Suppose z is a given signal in V2. • Let us write z in terms of the Haar basis of V2. • So we are looking for c1, …, c4 ∈ C such that z = c1 φ_{2,0} + c2 φ_{2,1} + c3 φ_{2,2} + c4 φ_{2,3}.
Wavelet theory • Or, equivalently, since the basis is orthonormal, ci = ⟨z, φ_{2,i−1}⟩. • And based on the first chapter, the same coefficients can be obtained from the trend and fluctuation steps of the discrete Haar transform.
Daubechies D3 • Daubechies Symlet
Coiflet K1 • Coiflet K2
Wavelet networks • Some definitions: • Metric space: a 2-tuple (X, d), where X is a set and d is a metric on it. • Inner product space: a vector space with a specified inner product ⟨·,·⟩ that satisfies: • 1. ⟨x, y⟩ = ⟨y, x⟩* (conjugate symmetry) • 2. ⟨ax + by, z⟩ = a⟨x, z⟩ + b⟨y, z⟩ (linearity in the first argument) • 3. ⟨x, x⟩ ≥ 0 • 4. ⟨x, x⟩ = 0 iff x = 0 • An inner product space naturally has a norm: ||x|| = √⟨x, x⟩
Wavelet networks • Seminorm: on a vector space V, a function p: V → R such that: • 1. p(v) ≥ 0 • 2. p(αv) = |α| p(v) • 3. p(u + v) ≤ p(u) + p(v) • Norm: a seminorm with p(v) = 0 iff v = 0. • Cauchy sequence: a sequence {xn} in a seminormed space (V, p) is called Cauchy if p(xm − xn) goes to 0 as m, n go to infinity. • Complete space: a metric space M is said to be complete if every Cauchy sequence of points in M has a limit in M. • Hilbert space: a complete inner product vector space (complete with respect to the norm induced by the inner product). • Completeness ensures that limits exist when expected, which facilitates various definitions from calculus.
Wavelet networks • Among Hilbert spaces we are interested in the space L2. • For every function f in L2 we have ∫ |f(x)|² dx < ∞.
Wavelet networks • Wavelet networks are 1½-layer networks with wavelets as their activation functions. • Most wavelet-network properties are due to the localization properties of wavelets. • A wavelet network can first be trained to learn the lowest resolution level and then trained further to include elements at higher resolutions. • Wavelet theory provides useful guidelines for the construction and initialization of the networks, and consequently the training times are significantly reduced.
Wavelet networks • WNN structures: • Multilayer WNN • Using a more complex network with one layer of wavelons (ψ units) alongside ordinary σ units (a minimal forward-pass sketch follows below).
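A minimal sketch of a one-hidden-layer wavelet network with Mexican hat wavelons and toy parameters; the specific form y(x) = Σ_i w_i ψ((x − t_i)/s_i) + bias is one common choice, not necessarily the exact architecture drawn on the original slide.

    import numpy as np

    def mexican_hat(t):
        return (1 - t**2) * np.exp(-t**2 / 2)

    def wnn_forward(x, w, t, s, bias):
        # Wavelet network output y(x) = sum_i w_i * psi((x - t_i) / s_i) + bias.
        z = (x[:, None] - t[None, :]) / s[None, :]           # one column per wavelon
        return mexican_hat(z) @ w + bias

    rng = np.random.default_rng(0)
    n_wavelons = 5
    t = np.linspace(0, 1, n_wavelons)                        # translations spread over the input range
    s = np.full(n_wavelons, 0.2)                             # dilations
    w = rng.normal(size=n_wavelons)                          # output weights (random toy values)
    x = np.linspace(0, 1, 11)
    print(np.round(wnn_forward(x, w, t, s, bias=0.0), 3))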
Wavelet networks • WNN with scaling functions • It uses both the father (scaling) and mother wavelet functions as its activation functions. • In effect it implements an expansion in terms of both scaling and wavelet functions.
Wavelet networks • WRNN • It uses wavelet functions as the activation function for some of the internal neurons that are not connected to the output. • All neurons are fully connected and receive inputs, but only some are connected to the desired output.
Wavelet networks • WSOM • It uses a wavelet layer after the competitive layer. • Every node in the wavelet layer has a discrete wavelet function associated with it. • An approximation of the L2 functions of the input space is given jointly by the SOM layer and the wavelet layer.
Wavelet networks • WRBF • It uses wavelet functions as its activation function instead of the Gaussian function. • Gaussian-like wavelet functions are usually used in this type of network (e.g. the Mexican hat wavelet).
Wavelet networks • Initialization methods • Vachtsevanos uses trigonometric wavelets of the form • costrap(x) = cos(3π/2 · x) · min{ max{ 3/2 (1 − |x|), 0 }, 1 } • The trigonometric wavelet can be approximated by polynomials, which is a linear problem (see the sketch below). • The fitted polynomial parameters can be used as initial wavelet parameters. • Echuaz uses a clustering method to position the wavelets, where the distribution of points around a cluster approximates the necessary dilation of the wavelet. • Boubez & Pestkin first initialize the network by positioning low-resolution wavelets and then introduce higher-resolution wavelets.
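A small sketch of the costrap wavelet and of the polynomial-approximation idea: fitting a polynomial to sampled values is a linear least-squares problem (the polynomial degree and the sampling grid below are arbitrary choices made for illustration).

    import numpy as np

    def costrap(x):
        # Trigonometric "cosine-trapezoid" wavelet: cos(3*pi/2 * x) windowed by a trapezoid.
        trapezoid = np.clip(1.5 * (1 - np.abs(x)), 0.0, 1.0)
        return np.cos(1.5 * np.pi * x) * trapezoid

    x = np.linspace(-1, 1, 401)
    y = costrap(x)

    # Approximating the wavelet by a polynomial is a linear least-squares problem.
    coeffs = np.polyfit(x, y, deg=8)
    approx = np.polyval(coeffs, x)
    print("max approximation error:", round(np.max(np.abs(y - approx)), 3))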
Wavelet networks • Rao and Kumthekar use an architecture trained by adding new wavelets at each step until convergence is reached. • Yi Yu, Tan, Vanderwalle and Deprettere first use a large number of functions and then compact the network using a shrinking technique that keeps only the important nodes. • Zhang and Benveniste use an orthogonal least-squares procedure, explained below. • Suppose we are approximating a function f on the domain D = (a, b) by a network of the form y(x) = Σ_i w_i ψ((x − t_i)/s_i) + g. • Set g to the mean of f(x). • Set the w_i simply to zero.
Wavelet networks • For initializing the t_i, s_i: • First select a point p in the interval [a, b] as below: • The t1, s1 are then set accordingly: • ξ is usually set to 1/2 • The point p breaks the interval into two parts, so the same algorithm can be applied recursively to determine t2, s2, then t3, s3, and so on.
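A hedged sketch of this recursive initialization. The slide's exact rule for choosing p and the exact formulas for t1, s1 are not reproduced above, so the choices below (p = interval midpoint, t = p, s = ξ·interval length) are assumptions made purely for illustration of the recursive splitting idea.

    def init_wavelons(a, b, n_wavelons, xi=0.5):
        # Recursively place wavelon translations/dilations on [a, b].
        # ASSUMPTIONS: p is taken as the interval midpoint, t = p, s = xi * (interval length).
        params = []
        intervals = [(a, b)]
        while len(params) < n_wavelons and intervals:
            lo, hi = intervals.pop(0)              # breadth-first: coarse wavelets first
            p = 0.5 * (lo + hi)                    # assumed choice of the split point p
            params.append((p, xi * (hi - lo)))     # (t_i, s_i)
            intervals += [(lo, p), (p, hi)]        # p splits the interval into two parts
        return params

    for t_i, s_i in init_wavelons(0.0, 1.0, 7):
        print(round(t_i, 3), round(s_i, 3))        # dyadic grid: coarse first, then finer levels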
Wavelet networks • WNN learning: • All algorithms that guarantee convergence and avoid local minima for neural networks can also be used for WNNs, because the dilation and translation parameters are now ordinary network parameters. • Zhang and Benveniste use a stochastic gradient algorithm. • Szu, Telfer and Kadambe use the conjugate gradient method. • The standard gradient-descent algorithm may be used, since the wavelet function is differentiable with respect to all parameters (see the sketch below). • Prochazka and Sys use a genetic algorithm.
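A minimal gradient-descent sketch for the toy network y(x) = Σ_i w_i ψ((x − t_i)/s_i) + g with Mexican hat wavelons and analytic gradients; this is plain batch gradient descent shown only as an illustration, not a reproduction of any of the cited algorithms.

    import numpy as np

    psi  = lambda z: (1 - z**2) * np.exp(-z**2 / 2)          # Mexican hat wavelet
    dpsi = lambda z: (z**3 - 3*z) * np.exp(-z**2 / 2)        # its derivative

    def fit_wnn(x, y, n_wavelons=6, lr=0.02, epochs=2000, seed=0):
        rng = np.random.default_rng(seed)
        t = np.linspace(x.min(), x.max(), n_wavelons)        # spread translations over the input range
        s = np.full(n_wavelons, (x.max() - x.min()) / n_wavelons)
        w = rng.normal(scale=0.1, size=n_wavelons)
        g = y.mean()                                         # bias initialized to the target mean
        for _ in range(epochs):
            z = (x[:, None] - t) / s                         # (n_samples, n_wavelons)
            err = psi(z) @ w + g - y                         # prediction error
            grad_w = psi(z).T @ err / len(x)
            grad_t = (dpsi(z) * (-w / s)).T @ err / len(x)
            grad_s = (dpsi(z) * (-w * z / s)).T @ err / len(x)
            w -= lr * grad_w
            t -= lr * grad_t
            s = np.maximum(s - lr * grad_s, 1e-2)            # keep dilations positive
            g -= lr * err.mean()
        mse = np.mean((psi((x[:, None] - t) / s) @ w + g - y) ** 2)
        return w, t, s, g, mse

    x = np.linspace(0, 1, 200)
    y = np.sin(2 * np.pi * x) * np.exp(-x)                   # toy target function
    print("final MSE:", round(fit_wnn(x, y)[-1], 4))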
Wavelet networks • WNN properties: • Because of the wavelet decomposition, wavelet networks have universal approximation properties. • Kreinovich has shown that WNNs are asymptotically optimal approximators for functions of one variable. • For certain classes of problems, wavelet networks achieve the same quality of approximation as neural networks while using fewer parameters.
Wavelet networks • Suppose u(x) is the step function that can be used in neural networks as an activation function. • It is very similar to the Haar scaling function. • This shows that the step function, through dilation and translation, is equivalent to the Haar wavelet. • Two neurons of a feedforward neural network with step activation functions can make up a Haar wavelet.
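A small sketch of this idea, assuming u is the Heaviside unit step (1 for x ≥ 0, else 0), since the slide's exact definition of u is not reproduced above: the Haar scaling function is the difference of two shifted step units, and the Haar wavelet then follows from ψ(x) = φ(2x) − φ(2x − 1).

    import numpy as np

    u = lambda x: (x >= 0).astype(float)        # ASSUMED step activation: Heaviside unit step

    def phi(x):
        # Haar scaling function built from two step "neurons": u(x) - u(x - 1).
        return u(x) - u(x - 1)

    def psi(x):
        # Haar wavelet from dilated/translated copies of phi: phi(2x) - phi(2x - 1).
        return phi(2 * x) - phi(2 * x - 1)

    x = np.linspace(-0.5, 1.5, 9)
    print(phi(x))   # 1 on [0, 1), 0 elsewhere
    print(psi(x))   # +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere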