Realizing an autonomous recognizer using data compression. ESA-EUSC-JRC 2011, ISPRA, Varese, Italy, 2011.03.31. 渡辺 俊典 Toshinori Watanabe, Professor, Dr. Eng., Grad. School of Inf. Sys., UEC, Tokyo, Japan. watanabe@is.uec.ac.jp / komkumei@gmail.com. (Photo: T. Watanabe with a long komuso shakuhachi.)
ESA-EUSC-JRC 2011, Keynote Speech, T. Watanabe, U.E.C, Tokyo, Japan
Story
• Revisit the recognition problem
  • What is recognition?
  • Low level recognition as FSQ (Feature Space Quantization)
  • Clarify open problems in FSQ design
• Propose an autonomous FSQ
  • Compressibility-based general feature space using PRDC
  • Case-based nonlinear feature space quantizer TAMPOPO
  • CSOR: Compression-based Self-Organizing Recognizer
What is recognition?
• Approximately, it is a mapping cascade: signals → low level labels → high level labels
• Low level: from input signals to low level labels
• High level: from low level labels to high level ones
Low level recognition example
• Input: a set of signals (an image)
• Output: a set of labels (road, naked land, forest, houses)
Low level recognition as the problem of FSQ
• Pipeline: feature space building → partitioning and labeling → quantized and labeled feature space (e.g. sea, grass, square)
• Representatives: ADALINE (Adaptive Linear Element), SVM (Support Vector Machine), SOM (Self-Organizing Map), etc.
• They are all feature space quantizers (FSQs)
Open problems in FSQ design
• Two basic elements of FSQ:
  • Preparation of a set of bases to span the feature space
  • Preparation of a method to partition the space using observed vectors, i.e., cases
Open problems in FSQ design (cont.)
• Feature space design: color histograms, Fourier coefficients, shape moments; problem-specific, not general
• Quantizer design: linear/nonlinear, offline/online; model-respecting and memory-saving, but not individual (case)-respecting
My proposals: how to realize a highly autonomous FSQ
• Compression-based general feature space by PRDC
  • Compressibility feature space
  • Autonomous feature space generation process
  • Source signal textization
• Case-based feature space quantization by TAMPOPO
• CSOR: a possible autonomous FSQ
Text (sequence) featuring paradigm 1: statistical information theory of Shannon
• Tries to characterize a statistical source X
• Self entropy H(X) = −∑ p(x) log p(x)
• Joint entropy H(XY) of X and Y
• Conditional entropy H(X|Y)
• Mutual information I(X;Y) = H(X) + H(Y) − H(XY), equivalently H(XY) = H(X) + H(Y) − I(X;Y)
• Requires knowing the occurrence probabilities of the target texts
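The plug-in (empirical) versions of these quantities can be computed directly from symbol counts. A minimal sketch in Python, using base-2 logs so results are in bits; the toy sequences X and Y are illustrative assumptions, not from the talk:

```python
import math
from collections import Counter

def entropy(symbols):
    """Empirical Shannon entropy H = -sum p(x) log2 p(x), in bits."""
    counts = Counter(symbols)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

X = "aabbaabb"
Y = "abababab"
H_X = entropy(X)
H_Y = entropy(Y)
H_XY = entropy(list(zip(X, Y)))   # joint entropy H(XY) of the pair sequence
I_XY = H_X + H_Y - H_XY           # mutual information I(X;Y)
```

Here each sequence is a fair coin (1 bit), the pair distribution factorizes, so H(XY) = 2 bits and I(X;Y) = 0, consistent with the identity above.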
Text (sequence) featuring paradigm 2: algorithmic information theory (AIT) of Kolmogorov
• Tries to give the complexity of an individual text x
• K(x) = min{ |P| : A(P) = x }, the length of the shortest program P that makes a universal algorithm A output x
• K(x) is not statistical but defined on an individual x
• K(x) has properties similar to H(X)
• K(x) cannot be calculated
• AIT is the "heavenly fire of Zeus"
LZ coding by Ziv and Lempel
• An approximation device for calculating K(x)
• The rate R of new-phrase appearance when x is compressed by a self-delimiting encoder
• Proved that R = H(x) for long texts
• They were the Prometheus who brought K(x) down to our earth
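Practical compressors give a computable stand-in for K(x). A small illustration using Python's zlib (DEFLATE, an LZ77-family codec, not the self-delimiting encoder analyzed by Ziv and Lempel, so this is only a rough proxy): a structured text compresses far better than a coin-flip text over the same alphabet.

```python
import random
import zlib

def compression_ratio(text: str) -> float:
    """Approximate complexity as |compressed| / |original|; smaller = simpler."""
    data = text.encode("ascii")
    return len(zlib.compress(data, 9)) / len(data)

random.seed(0)
regular = "ab" * 5000                                        # highly structured
noisy = "".join(random.choice("ab") for _ in range(10000))   # coin-flip text
low, high = compression_ratio(regular), compression_ratio(noisy)
```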
PRDC
• An early attempt to exploit an individual object's complexity in real-world problem solving
• A general media-data featuring scheme
• Compressibility vector (CV) space, spanned by compression dictionaries D1, D2, …, Dn: CV = (ρ(X|D1), ρ(X|D2), …, ρ(X|Dn))
• For generality enhancement: pre-textization and LZ-type text compression
Where is PRDC?
• Conceptual, parametric/statistical: statistical information theory H(X) (Shannon)
• Conceptual, non-parametric/algorithmic: algorithmic IT K(x) (Kolmogorov et al.)
• Real, non-parametric/algorithmic: the LZ compressor (Ziv & Lempel); AIT-based similarity (Li et al.; Datcu, Cerra, Gueguen, Mallet); PRDC (Watanabe)
Overview of PRDC
• Media-specific textization methods turn images, sounds, texts, and other media into texts (the pivotal representation)
• Dictionary-based text compression maps each text to a compressibility vector
• The resulting feature space serves the applications
Dictionary-based compression: LZW (initial state)
• Input to compress: aabababaaa, with the current-place cursor at the start
• Initial dictionary: a = 0, b = 1 (the root holds the single-character phrases)
First cycle
• Longest match at the cursor: a (code 0); emit 0 and register the new phrase aa as code 2
• Remaining input: abababaaa; the next start point is the second character
Second cycle
• Longest match: a again; emit 0 and register ab as code 3
• Output so far: 0 0; remaining input: babaaa
Final state
• Output: 001352
• Final dictionary: a = 0, b = 1, aa = 2, ab = 3, ba = 4, aba = 5, abaa = 6
Behavior summary
• Input: aabababaaa
• Output: 001352
• CR (compression ratio) = |output| / |input| = 6/10 = 0.6
• Dictionary: a = 0, b = 1, aa = 2, ab = 3, ba = 4, aba = 5, abaa = 6
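The whole trace can be reproduced with a few lines of LZW in Python (single-character codes a = 0, b = 1, as in the slides). The output codes and ratio match the slide's values 001352 and 0.6:

```python
def lzw_compress(text, alphabet="ab"):
    """LZW: emit one code per longest dictionary match, growing the dictionary."""
    dic = {ch: i for i, ch in enumerate(alphabet)}  # initial state: a=0, b=1
    out, phrase = [], ""
    for ch in text:
        if phrase + ch in dic:
            phrase += ch                  # extend the current match
        else:
            out.append(dic[phrase])       # emit code of longest match
            dic[phrase + ch] = len(dic)   # register the new phrase
            phrase = ch
    out.append(dic[phrase])               # flush the final phrase
    return out, dic

codes, dic = lzw_compress("aabababaaa")
cr = len(codes) / len("aabababaaa")       # 6/10 = 0.6
```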
Another example
• Input: bbbbbbbbbb
• Output: 1234
• Dictionary: a = 0, b = 1, bb = 2, bbb = 3, bbbb = 4
• CR (compression ratio) = 4/10 = 0.4 (versus 0.6 for aabababaaa)
Compressibility vector space
• What happens if we compress a text Ty by the dictionary D(Tx) of another text Tx, using the LZW* method? (LZW* uses D(Tx) in freeze mode: the dictionary is not extended.)
• Experiment: T1 = aabababaaa, T2 = bbbbbbbbbb, T3 = aaaabaaaab, T4 = bbbbabbbba; dictionaries = (D(T1), D(T2))
Compressibility vector space (result)
• Plotting each text at (CR by D(T1), CR by D(T2)) separates the a-rich texts T1, T3 from the b-rich texts T2, T4
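A freeze-mode variant can be sketched directly: look up longest matches in a fixed dictionary, never add phrases, and count one output code per match. The exact ratios depend on details of the LZW* variant, so the values asserted below come from this sketch, not from the slides:

```python
def lzw_dict(text, alphabet="ab"):
    """Build the LZW phrase dictionary D(text), discarding the output codes."""
    dic = {ch: i for i, ch in enumerate(alphabet)}
    phrase = ""
    for ch in text:
        if phrase + ch in dic:
            phrase += ch
        else:
            dic[phrase + ch] = len(dic)
            phrase = ch
    return dic

def cr_frozen(text, dic):
    """LZW*-style compression ratio: longest matches against a frozen dictionary."""
    codes, phrase = 0, ""
    for ch in text:
        if phrase + ch in dic:
            phrase += ch
        else:
            codes += 1          # emit one code for the longest match
            phrase = ch
    codes += 1                  # flush the final phrase
    return codes / len(text)

T1, T2 = "aabababaaa", "bbbbbbbbbb"
D1, D2 = lzw_dict(T1), lzw_dict(T2)
cv_T1 = (cr_frozen(T1, D1), cr_frozen(T1, D2))   # CV of T1 in the (D1, D2) space
```

As expected, T1 compresses well under its own dictionary and not at all under D(T2), and symmetrically for T2.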
Fact 1: local properties of CV space
• The plane (CR by D(T1), CR by D(T2)) divides into regions: texts known to T1 only, known to T2 only, known to both, and unknown to both
Fact 2: similar bases cause low resolution
• With two nearly identical dictionaries, e.g. D(bbbbbbbbbb) on both axes, T1–T4 collapse onto the diagonal and resolution is lost
Fact 3: a concatenated text causes low resolution
• Plotting CR by D(T1 = aabababaaa) against CR by D(T12), where T12 = T1T2 = aabababaaabbbbbbbbbb, crowds T1–T4 together: the concatenated dictionary compresses everything moderately well, so it discriminates poorly
Fact 4: splitting can enhance resolution
• Splitting the concatenated dictionary D(T12 = aabababaaabbbbbbbbbb) back into D(T1 = aabababaaa) and D(T2 = bbbbbbbbbb) restores the resolution of the CV space
Autonomous CV space generation process
• Define the CV space at step k as CVS(k) = [D(k), F(k)]
  • D(k): the list of current base dictionaries at step k
  • F(k): the list of current foreign segments at step k
• Rewrite CVS(k) forever as follows:
  • Get an input text segment x (of reasonable length)
  • Case 1) some d* in D(k) nicely compresses x:
    • If d* is full: D(k+1) = D(k), F(k+1) = F(k)
    • If d* is not full: D(k+1) = D(k) − d* + ed*, F(k+1) = F(k), where ed* is d* enlarged by x
  • Case 2) x is foreign to D(k) and F(k) is not full: add x to F(k), i.e. D(k+1) = D(k), F(k+1) = F(k) + x
  • Case 3) x is foreign to D(k) and F(k) is full: extend D(k) using F(k); let dd* be the dictionaries generated by LZW from ff*, the set of large similar groups in F(k); then D(k+1) = D(k) + dd*, F(k+1) = F(k) − ff*
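The loop above can be sketched in Python. This sketch makes several simplifying assumptions not in the slides: zlib's preset-dictionary DEFLATE stands in for the LZW-based compressibility ρ, dictionaries are never "full", and Case 3 promotes all pending foreign segments into a single new base instead of clustering them into similar groups ff*; the threshold and capacity constants are arbitrary.

```python
import random
import zlib

def rho(x: bytes, d: bytes) -> float:
    """Compressibility rho(x|D): ratio after deflating x with preset dictionary d."""
    c = zlib.compressobj(level=9, zdict=d)
    return len(c.compress(x) + c.flush()) / len(x)

class CVSpace:
    """CVS(k) = [D(k), F(k)]: base dictionaries plus pending foreign segments."""
    def __init__(self, thresh=0.5, max_foreign=3):
        self.D, self.F = [], []
        self.thresh, self.max_foreign = thresh, max_foreign

    def step(self, x: bytes):
        if self.D:
            i = min(range(len(self.D)), key=lambda j: rho(x, self.D[j]))
            if rho(x, self.D[i]) < self.thresh:   # Case 1: a dictionary knows x
                self.D[i] += x                    # enlarge d* by x
                return
        self.F.append(x)                          # Case 2: x is foreign
        if len(self.F) >= self.max_foreign:       # Case 3: foreigners become a base
            self.D.append(b"".join(self.F))
            self.F = []

    def cv(self, x: bytes):
        """Compressibility vector of x against the current bases."""
        return tuple(rho(x, d) for d in self.D)

random.seed(1)
cvs = CVSpace()
for _ in range(3):
    cvs.step(b"ab" * 50)                          # three similar segments -> one base
noise = bytes(random.getrandbits(8) for _ in range(100))
cvs.step(noise)                                   # incompressible -> stays foreign
```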
Autonomous CV space generator: diagram
• Texts → length splitter → segments → rewriter, which maintains the current basis dictionaries and the current foreign segments, and feeds the feature-space-based application
Source signal textization: image (1)
• The pixel array becomes a graph whose edge weights are color differences; its MST (minimum spanning tree) is the image-MST
Source signal textization: image (2)
• Textization by MST traversal: walking the image-MST and emitting one symbol per pixel via an encoding table (a–f) yields the output text, e.g. T = abbbabbbcdccffee
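A toy version of this pipeline, under assumptions of this sketch: a grayscale pixel array, 4-neighbour edges weighted by absolute intensity difference, Prim's algorithm for the MST, a depth-first traversal, and an encoding table that simply quantizes intensity into the letters a–f (the bucket width `step` is an arbitrary choice):

```python
import heapq
from collections import defaultdict

def image_to_text(img, labels="abcdef", step=43):
    """Textize a grayscale pixel array via an MST over its 4-neighbour graph."""
    h, w = len(img), len(img[0])
    letter = lambda v: labels[min(v // step, len(labels) - 1)]

    tree = defaultdict(list)             # MST adjacency (parent -> children)
    seen = {(0, 0)}
    heap = []

    def offer(r, c):                     # push edges leaving pixel (r, c)
        for nr, nc in ((r-1, c), (r+1, c), (r, c-1), (r, c+1)):
            if 0 <= nr < h and 0 <= nc < w and (nr, nc) not in seen:
                weight = abs(img[r][c] - img[nr][nc])
                heapq.heappush(heap, (weight, (r, c), (nr, nc)))

    offer(0, 0)
    while heap:                          # Prim's algorithm from pixel (0, 0)
        _, u, v = heapq.heappop(heap)
        if v in seen:
            continue
        seen.add(v)
        tree[u].append(v)
        offer(*v)

    out, stack, done = [], [(0, 0)], set()
    while stack:                         # DFS traversal emits one letter per pixel
        p = stack.pop()
        if p in done:
            continue
        done.add(p)
        out.append(letter(img[p[0]][p[1]]))
        stack.extend(reversed(tree[p]))
    return "".join(out)

text = image_to_text([[0, 0, 255], [0, 0, 255]])   # two flat regions
```

Because the MST crosses the dark/bright boundary only once, pixels of the same region tend to appear contiguously in the output text, which is what makes the text compressible.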
Case-based feature space quantization by TAMPOPO
• Goal: incremental nonlinear quantization under successive case-data arrival
• From a feature space with case data to a quantized feature space with local labels (L1–L4)
Possible scheme: the TAMPOPO learning machine
• TAMPOPO is the Japanese word for DANDELION, and also stands for the Duplication AND DELEtiON scheme for learning
• Basic ideas: individual case-data representation mimicking the snow-cap shape; evolutional rewriting of the case database; nonlinear mapping formation by the territories of cases
The life of TAMPOPO
• Seeds that land on hospitable ground (meadow, highland) live; seeds that land on water or sand die
The shape of a TAMPOPO
• Root: my key (a feature vector, e.g. (F1, F2), in the feature space)
• Seed: my data (label) and my fitness score (smaller is better)
• Upper fur: my possible worst-score function; lower fur: my possible best-score function
Superior / inferior / incomparable relations
• T1 is superior to T2 when the possible score range of T1 is always better than that of T2 (even T1's worst possible score beats T2's best possible score)
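With "smaller is better" scores, the relation can be written as a dominance test between score intervals. The linear fur slopes below are an assumption of this sketch (the slides only require that the fur widens with distance from the root):

```python
from dataclasses import dataclass

@dataclass
class Tampopo:
    root: float     # key (feature) where the case was observed
    score: float    # fitness score at the root; smaller is better
    spread: float   # fur slope: uncertainty grows with distance from the root

    def best(self, f):    # lower fur: most optimistic score at feature f
        return self.score - self.spread * abs(f - self.root)

    def worst(self, f):   # upper fur: most pessimistic score at feature f
        return self.score + self.spread * abs(f - self.root)

def superior(t1, t2, f):
    """t1 is superior to t2 at f when even t1's worst score beats t2's best."""
    return t1.worst(f) < t2.best(f)
```

When neither case dominates the other, their score ranges overlap and they are incomparable.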
Acquisition of the mapping F → C
• The territories of the stored cases T1–T4 carve the feature space into regions
• Nonlinear mapping acquired: F1 → C1, F2 → C2, F3 → C2, F4 → C1
Rewrite step 1: recall the best, with duplication
• (1) Input a query vector f*
• (2) Recall T3 = arg.best(worst(f*)), the case whose worst possible score at f* is best, and make a copy T3*
Rewrite step 2: modify the seed, output, and get the score
• Modify the seed of T3* (C1 → C1*), apply C1* to the environment, and receive the fitness score j*
Rewrite step 3: implant it, with inferior deletion
• Implant T3* (with score j*) into the database; any case now inferior to T3*, here T2, is deleted (old DB → new DB)
Evolutional rewriting of individuals
• Recall the best element by duplication
• Modify its seed vector, output it, and get its score from the environment
• Implant it with inferior deletion (old DB → new DB)
How to get the mapping Feature → Label for FSQ
• Introduce a recall threshold score Jc
• If arg.best(worst(f*)) < Jc, recall it; otherwise implant a new child with a new label and a default score Jd at f*
• (Possibly) add an aging score to all cases
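Putting the recall threshold together with implantation gives a minimal Feature → Label quantizer. The constants Jc and Jd and the linear worst-score ("upper fur") function below are illustrative assumptions of this sketch, and the duplication/modification/aging steps are omitted:

```python
Jc, Jd = 2.0, 1.0   # hypothetical recall threshold and default score

class TampopoFSQ:
    """Recall the best case when it is confident enough at f*; otherwise
    implant a new case carrying a fresh label."""
    def __init__(self, spread=1.0):
        self.db, self.spread, self.next_label = [], spread, 0

    def worst(self, case, f):
        """Upper fur: pessimistic score of a case at feature f (smaller = better)."""
        return case["score"] + self.spread * abs(f - case["root"])

    def label(self, f):
        if self.db:
            best = min(self.db, key=lambda c: self.worst(c, f))
            if self.worst(best, f) < Jc:          # confident: recall its label
                return best["label"]
        new = {"root": f, "score": Jd, "label": self.next_label}
        self.db.append(new)                        # implant a new child at f*
        self.next_label += 1
        return new["label"]
```

Each stored case thus owns a territory of the feature axis around its root; new territories appear only where no existing case is confident, which is what makes the quantization incremental and nonlinear.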
CSOR: a possible autonomous FSQ
• Signal source → textizer → texts → autonomous feature (CV) space generator (compression against the current basis dictionaries, maintaining the current foreign segments) → feature vectors (CVs) → autonomous feature space quantizer (TAMPOPO DB) → recognized labels
Application: land cover analysis 1
Application: land cover analysis 2
Summary
• In this presentation I have taken up the problem of designing a low-level recognizer that operates in a highly autonomous mode
• It is a feature space quantizer (FSQ) construction problem
• A possible solution, CSOR (Compression-based Self-Organizing Recognizer), is proposed
• Main components: a general compression-based feature space using PRDC, and an online feature space quantizer based on TAMPOPO
• CSOR is highly data-respecting and suits modern computers with plentiful memory