Unsupervised recurrent networks Barbara Hammer, Institute of Informatics, Clausthal University of Technology
Clausthal-Zellerfeld and the Brocken (location photos)
Prototype-based clustering • data contained in a real-vector space • prototypes characterized by their locations in the data space • clustering induced by the receptive fields, based on the Euclidean metric
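A minimal sketch of the receptive-field assignment described above (not from the slides; array names and the toy data are illustrative): each data point is assigned to the prototype that is closest under the Euclidean metric.

```python
import numpy as np

def receptive_fields(X, W):
    """Assign each data point in X to its closest prototype in W
    (Euclidean metric); returns the winner index per point."""
    # squared Euclidean distances between every point and every prototype
    d2 = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

# toy usage: 100 points in R^2, 4 prototypes
X = np.random.randn(100, 2)
W = np.random.randn(4, 2)
clusters = receptive_fields(X, W)
```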
Vector quantization • initialize the prototypes • repeat: • present a data point • adapt the winner towards the data point
Cost function • vector quantization minimizes the quantization error E = ½ ∑j ∫ χj(x)·|x − wj|² p(x) dx, where χj is the indicator function of the receptive field of wj • online training: stochastic gradient descent on E (only the winner is adapted for each presented data point)
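A minimal sketch of online vector quantization as stochastic gradient descent on this cost (not from the slides; the learning rate, epoch count and initialization are illustrative assumptions):

```python
import numpy as np

def online_vq(X, n_prototypes, eta=0.1, n_epochs=10, seed=0):
    """Online vector quantization: for each presented data point,
    move the closest prototype (the winner) towards the point."""
    rng = np.random.default_rng(seed)
    # initialize prototypes on randomly chosen data points
    W = X[rng.choice(len(X), n_prototypes, replace=False)].copy()
    for _ in range(n_epochs):
        for x in X[rng.permutation(len(X))]:
            winner = np.argmin(((W - x) ** 2).sum(axis=1))
            W[winner] += eta * (x - W[winner])   # stochastic gradient step
    return W
```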
Neighborhood cooperation • Self-Organizing Map: prototypes wj arranged on a regular lattice of indices j = (j1, j2); lattice neighbors of the winner are adapted as well • Neural gas: no fixed lattice, the neighborhood follows the data-optimal topology (prototypes ranked by their distance to the data point)
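An illustrative sketch of the two neighborhood schemes (assumptions: a Gaussian neighborhood function, a square SOM lattice and rank-based neural-gas cooperation as in the standard formulations; parameter values are placeholders):

```python
import numpy as np

def som_step(W, grid, x, eta=0.1, sigma=1.0):
    """One SOM update: all prototypes move towards x, weighted by the
    lattice distance of their index j=(j1,j2) to the winner's index."""
    winner = np.argmin(((W - x) ** 2).sum(axis=1))
    lattice_d2 = ((grid - grid[winner]) ** 2).sum(axis=1)
    h = np.exp(-lattice_d2 / (2 * sigma ** 2))   # neighborhood on the lattice
    return W + eta * h[:, None] * (x - W)

def ng_step(W, x, eta=0.1, lam=1.0):
    """One neural gas update: neighborhood given by the rank of each
    prototype's distance to x (data-optimal topology, no fixed lattice)."""
    d2 = ((W - x) ** 2).sum(axis=1)
    ranks = np.argsort(np.argsort(d2))           # rank 0 for the winner
    h = np.exp(-ranks / lam)
    return W + eta * h[:, None] * (x - W)

# usage: 5x5 SOM lattice for data in R^3
grid = np.array([(j1, j2) for j1 in range(5) for j2 in range(5)], float)
W = np.random.randn(25, 3)
W = som_step(W, grid, np.random.randn(3))
```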
Old models • Temporal Kohonen Map: leaky integration of the distances over the sequence x1, x2, x3, x4, …, xt, …: d(xt, wi) = |xt − wi| + α·d(xt−1, wi); training adapts wi towards xt • Recurrent SOM: leaky integration of the directions: d(xt, wi) = |yt| where yt = (xt − wi) + α·yt−1; training adapts wi along yt
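A minimal sketch of both leaky-integrated distances (the zero initialization of the recursions and the toy data are assumptions, not stated on the slide):

```python
import numpy as np

def tkm_distances(seq, W, alpha=0.5):
    """Temporal Kohonen Map: d_i(t) = |x_t - w_i| + alpha * d_i(t-1)."""
    d = np.zeros(len(W))
    for x in seq:
        d = np.linalg.norm(x - W, axis=1) + alpha * d
    return d   # leaky-integrated distance of every neuron after the sequence

def rsom_distances(seq, W, alpha=0.5):
    """Recurrent SOM: y_i(t) = (x_t - w_i) + alpha * y_i(t-1), d_i = |y_i|."""
    y = np.zeros_like(W)
    for x in seq:
        y = (x - W) + alpha * y
    return np.linalg.norm(y, axis=1), y   # distances and integrated directions

# usage: a short sequence in R^2 with 3 neurons
seq = np.random.randn(10, 2)
W = np.random.randn(3, 2)
print(tkm_distances(seq, W))
```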
Merge neural gas/SOM • explicit temporal context: each neuron stores a pair (w, c) • the current entry xt is compared via |xt − w|², the context Ct via |Ct − c|² • merge context: Ct is the merged content of the previous winner • training: w towards xt, c towards Ct
Merge neural gas/SOM • explicit context, global recurrence: (wj, cj) ∈ ℝn × ℝn • wj represents the current entry xt • cj represents the context, which equals the merged winner content of the last time step • distance: d(xt, wj) = α·|xt − wj| + (1−α)·|Ct − cj|, where Ct = γ·wI(t−1) + (1−γ)·cI(t−1) and I(t−1) is the winner in step t−1 (merge) • training: wj towards xt, cj towards Ct
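A minimal sketch of one merge-context training step under these definitions (assumptions: winner-only Hebbian updates without neighborhood cooperation, placeholder learning rates, and the merge computed from the post-update winner; these choices are illustrative, not from the slides):

```python
import numpy as np

def msom_step(W, C, x, Ct, alpha=0.5, gamma=0.5, eta=0.1):
    """One merge-SOM/NG step: the distance mixes entry and context,
    the winner moves towards (x, Ct), and the next context merges the
    winner's weight and context."""
    d = alpha * np.linalg.norm(x - W, axis=1) \
        + (1 - alpha) * np.linalg.norm(Ct - C, axis=1)
    i = np.argmin(d)                              # winner I(t)
    W[i] += eta * (x - W[i])                      # w_I towards x_t
    C[i] += eta * (Ct - C[i])                     # c_I towards C_t
    C_next = gamma * W[i] + (1 - gamma) * C[i]    # context for step t+1
    return W, C, C_next

# usage: process a sequence, carrying the merge context along
W, C = np.random.randn(20, 2), np.zeros((20, 2))
Ct = np.zeros(2)                                  # assumed initial context
for x in np.random.randn(50, 2):
    W, C, Ct = msom_step(W, C, x, Ct)
```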
Merge neural gas/SOM Example (γ = 0.5), sequence 42, 33, 33, 34: the new context averages the previous winner's weight and context • C1 = (42 + 50)/2 = 46 • C2 = (33 + 45)/2 = 39 • C3 = (33 + 38)/2 = 35.5
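A tiny check of the merge arithmetic with γ = 0.5, using the weight/context pairs from the example (the helper name is illustrative):

```python
def merge_context(w_prev, c_prev, gamma=0.5):
    """C_t = gamma * w_{I(t-1)} + (1 - gamma) * c_{I(t-1)}."""
    return gamma * w_prev + (1 - gamma) * c_prev

print(merge_context(42, 50))   # 46.0
print(merge_context(33, 45))   # 39.0
print(merge_context(33, 38))   # 35.5
```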
Merge neural gas/SOM Experiment: • speaker identification, Japanese vowel 'ae' [UCI KDD archive] • 9 speakers, 30 articulations each, time series of 12-dim. cepstrum vectors • MNG, 150 neurons: 2.7% test error • MNG, 1000 neurons: 1.6% test error • rule-based: 5.9%, HMM: 3.8%
Merge neural gas/SOM Experiment: • classification of donor sites for C. elegans • 5 settings with 10,000 training data and 10,000 test data; 50 nucleotides (TCGA) embedded in 3 dimensions, 38% donor sites [Sonnenburg, Rätsch et al.] • MNG with posterior labeling • 512 neurons, γ = 0.25, η = 0.075, α annealed from 0.999 to [0.4, 0.7] • 14.06% ± 0.66% training error, 14.26% ± 0.39% test error • sparse representation: 512 · 6 dimensions
Merge neural gas/SOM Theorem (context representation): Assume • a map with merge context is given (no neighborhood) • a sequence x0, x1, x2, x3, … is given • enough neurons are available Then: • the optimum weight/context pair for xt is w = xt, c = ∑i=0..t−1 γ·(1−γ)^(t−i−1)·xi • Hebbian training converges to this setting as a stable fixed point Compare to TKM: • the optimum weights are w = ∑i=0..t (1−α)^i·xt−i / ∑i=0..t (1−α)^i • but: this is no fixed point of TKM training • MSOM is the correct implementation of TKM
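A small numerical check of the closed form (assumptions: the merge recursion starts at c = 0 and the map already represents every prefix optimally, i.e. the previous winner carries w = x_{t−1} and the optimal context; function names are illustrative):

```python
import numpy as np

def optimal_context(seq, gamma=0.5):
    """Closed form from the theorem: c = sum_{i=0}^{t-1} gamma*(1-gamma)^(t-i-1) * x_i."""
    t = len(seq)
    return sum(gamma * (1 - gamma) ** (t - i - 1) * seq[i] for i in range(t))

def recursive_context(seq, gamma=0.5):
    """Merge recursion C_t = gamma * w_{I(t-1)} + (1-gamma) * c_{I(t-1)},
    assuming optimal winners: w_{I(t-1)} = x_{t-1}, c_{I(t-1)} = C_{t-1}."""
    c = 0.0
    for x in seq:
        c = gamma * x + (1 - gamma) * c
    return c

seq = np.random.randn(8)
assert np.isclose(optimal_context(seq), recursive_context(seq))
```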
More models What is the correct temporal context? Each neuron stores a pair (w, c); the entry is compared via |xt − w|², the context via |Ct − c|², and training adapts w towards xt and c towards Ct. The models differ in what Ct refers to: • RSOM/TKM: the neuron itself (leaky integration inside each neuron) • MSOM: the winner content (merge) • SOMSD: the winner index • RecSOM: all activations of the map
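An illustrative sketch of how Ct could be computed for each choice, given the previous step's distances and winner (the concrete formulas, e.g. the exponential activation for RecSOM, follow the standard formulations and are assumptions here, not taken from the slide):

```python
import numpy as np

def context(model, W, C, d_prev, winner_prev, grid, gamma=0.5):
    """Temporal context C_t under the different models, given the previous
    step's distances d_prev, winner index winner_prev and lattice positions."""
    if model == "MSOM":     # merged content of the previous winner
        return gamma * W[winner_prev] + (1 - gamma) * C[winner_prev]
    if model == "SOMSD":    # index / lattice position of the previous winner
        return grid[winner_prev]
    if model == "RecSOM":   # all activations of the map (here: exp(-d))
        return np.exp(-d_prev)
    raise ValueError("RSOM/TKM keep the context inside each neuron "
                     "(leaky integration), there is no shared C_t")
```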
More models (comparison table of the context models; * for normalised WTA context)
More models Experiment: • Mackey-Glass time series • 100 neurons • different lattices • different contexts • evaluation by the temporal quantization error: the average squared difference between the mean entry a neuron represents k steps into the past and the entry actually observed k steps into the past
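A minimal sketch of this evaluation measure, assuming the winner sequence of the trained map has already been computed (the per-neuron averaging follows the verbal definition above; boundary handling and the placeholder data are assumptions):

```python
import numpy as np

def temporal_quantization_error(x, winners, k_max):
    """For each lag k, compare the observed value k steps in the past with
    the mean past value of the neuron winning at time t, averaged over t."""
    errors = []
    for k in range(k_max + 1):
        ts = np.arange(k, len(x))                    # times with a valid past
        means = {}                                   # mean past value per neuron
        for i in np.unique(winners[ts]):
            sel = ts[winners[ts] == i]
            means[i] = x[sel - k].mean()
        diffs = np.array([means[winners[t]] - x[t - k] for t in ts])
        errors.append(np.mean(diffs ** 2))
    return np.array(errors)                          # one value per lag k

# usage with a scalar time series and precomputed winner indices
x = np.sin(np.linspace(0, 20, 500))
winners = np.random.randint(0, 100, size=500)        # placeholder winners
print(temporal_quantization_error(x, winners, k_max=5))
```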
More models [figure: temporal quantization error, plotted from now into the past, for SOM, NG, RSOM, RecSOM, SOMSD, HSOMSD and MNG]
So what? • inspection / clustering of high-dimensional events within their temporal context could be possible • strong regularization as for standard SOM / NG • possible training methods for reservoirs • some theory • some examples • no supervision • the representation of context is critical and not clear at all • training is critical and not clear at all