Sparse and Overcomplete Data Representation
Michael Elad, The CS Department, The Technion – Israel Institute of Technology, Haifa 32000, Israel
Israel Statistical Association 2005 Annual Meeting, Tel-Aviv University (Dan David bldg.), May 17th, 2005
Agenda
1. A Visit to Sparseland – Motivating Sparsity & Overcompleteness
2. Answering the 4 Questions – How & why should this work?
Data Synthesis in Sparseland
• Every column of the N×K dictionary D is a prototype data vector (an atom).
• A sparse vector α of length K is generated randomly, with few non-zeros in random locations and with random values.
• Multiplying by the fixed dictionary D yields the data vector x = Dα of length N. This generation process is our model M.
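The synthesis process above can be sketched in a few lines of NumPy. This is an illustrative toy (function and variable names are my own, not from the talk): a random normalized dictionary, and signals built from k randomly placed, randomly valued non-zeros.

```python
import numpy as np

def synthesize_sparseland(N=20, K=30, k=3, num_signals=5, seed=0):
    """Generate Sparseland data: each x = D @ alpha with a k-sparse random alpha."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((N, K))
    D /= np.linalg.norm(D, axis=0)          # normalize each atom (column)
    A = np.zeros((K, num_signals))
    for j in range(num_signals):
        support = rng.choice(K, size=k, replace=False)   # random locations
        A[support, j] = rng.standard_normal(k)           # random values
    X = D @ A                                # the generated data vectors
    return D, A, X

D, A, X = synthesize_sparseland()
```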
Sparseland Data is Special
• Simple: every generated vector x = Dα is built as a linear combination of few atoms from our dictionary D.
• Rich: a general model – the obtained vectors are a special type of mixture of Gaussians (or Laplacians).
Transforms in Sparseland?
• Assume that x is known to emerge from the model M.
• We desire simplicity, independence, and expressiveness.
• How about the transform T: "Given x, find the α that generated it in M"?
Difficulties with the Transform – 4 Major Questions
• Is the recovered α equal to the generating one? Under which conditions?
• Are there practical ways to get it?
• How effective are those ways?
• How would we get D?
Why Is It Interesting?
Several recent trends from signal/image processing are worth looking at:
• From JPEG to JPEG2000 – from the (L2-norm) KLT to wavelets and non-linear approximation → Sparsity.
• From Wiener to robust restoration – from the L2-norm (Fourier) to L1 (e.g., TV, Beltrami, wavelet shrinkage, …) → Sparsity.
• From unitary to richer representations – frames, shift-invariance, bilateral, steerable, curvelets → Overcompleteness.
• Approximation theory – non-linear approximation → Sparsity & Overcompleteness.
• ICA and related models → Independence and Sparsity.
Sparseland is HERE.
Agenda
1. A Visit to Sparseland – Motivating Sparsity & Overcompleteness
2. Answering the 4 Questions – How & why should this work?
Question 1 – Uniqueness?
Suppose we can solve exactly (P0): min ||α||₀ subject to Dα = x.
• Why should we necessarily get the generating α?
• It might happen that eventually another representation is found with fewer non-zeros.
Matrix "Spark"
Definition: Given a matrix D, σ = Spark{D} is the smallest number of columns that are linearly dependent. [Donoho & Elad ('02)]
Example: a matrix of Rank = 4 whose Spark = 3 (some triple of its columns is linearly dependent, while every pair is independent).
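The spark can be computed by brute force for tiny examples like the one above; a minimal sketch (my own illustrative function, not from the talk; in general, computing the spark is combinatorially hard):

```python
import itertools
import numpy as np

def spark(D, tol=1e-10):
    """Smallest number of linearly dependent columns of D (brute force)."""
    K = D.shape[1]
    for s in range(1, K + 1):
        for cols in itertools.combinations(range(K), s):
            if np.linalg.matrix_rank(D[:, cols], tol=tol) < s:  # dependent subset
                return s
    return np.inf  # all columns are linearly independent

# A Rank = 4 matrix whose Spark = 3: atoms e1, e2, e3, e4 and e1 + e2.
I = np.eye(4)
D = np.column_stack([I, I[:, 0] + I[:, 1]])
```

Here the triple {e1, e2, e1 + e2} is dependent while every pair is independent, matching the slide's Rank = 4, Spark = 3 example.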
Uniqueness Rule
Suppose the problem (P0): min ||α||₀ s.t. Dα = x has been solved somehow.
Uniqueness: if we found a representation that satisfies ||α||₀ < Spark{D}/2, then it is necessarily unique (the sparsest). [Donoho & Elad ('02)]
This result implies that if M generates vectors using "sparse enough" α, the solution of the above problem will find it exactly.
Question 2 – Practical P0 Solver?
Are there reasonable ways to find the sparse representation α?
Matching Pursuit (MP) [Mallat & Zhang (1993)]
• MP is a greedy algorithm that finds one atom at a time.
• Step 1: find the one atom that best matches the signal.
• Next steps: given the previously found atoms, find the next one that best fits the residual.
• The Orthogonal MP (OMP) is an improved version that re-evaluates the coefficients after each round.
Basis Pursuit (BP) [Chen, Donoho & Saunders (1995)]
Instead of solving (P0): min ||α||₀ s.t. Dα = x, solve (P1): min ||α||₁ s.t. Dα = x.
• The newly defined problem is convex.
• It has a Linear Programming structure.
• Very efficient solvers can be deployed:
  • Interior-point methods [Chen, Donoho & Saunders ('95)],
  • Sequential shrinkage for unions of ortho-bases [Bruce et al. ('98)],
  • If multiplications by D and Dᵀ are fast, methods based on shrinkage [Elad ('05)].
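The Linear Programming structure can be made explicit with the standard split α = u − v, u, v ≥ 0, which turns the L1 objective into a linear one. A minimal sketch, assuming SciPy is available (function and names are my own, not from the talk):

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(D, x):
    """Solve (P1): min ||alpha||_1  s.t.  D @ alpha = x, as a linear program."""
    N, K = D.shape
    c = np.ones(2 * K)                # objective: sum(u) + sum(v) = ||u - v||_1
    A_eq = np.hstack([D, -D])         # constraint: D @ (u - v) = x
    res = linprog(c, A_eq=A_eq, b_eq=x, bounds=(0, None))
    u, v = res.x[:K], res.x[K:]
    return u - v

D = np.column_stack([np.eye(4), np.ones(4) / 2.0])
x = 3.0 * D[:, 2]                     # a 1-sparse representation exists
alpha = basis_pursuit(D, x)           # BP recovers it
```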
Question 3 – Approximation Quality?
How effective are MP/BP in finding the sparse α?
Mutual Coherence
• Compute the Gram matrix G = DᵀD (assuming normalized columns).
• The Mutual Coherence µ is the largest entry in absolute value outside the main diagonal of DᵀD.
• The Mutual Coherence is a property of the dictionary (just like the "Spark"). The smaller it is, the better the dictionary.
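Unlike the spark, the mutual coherence is trivial to compute; a short illustrative sketch (my own example dictionary):

```python
import numpy as np

def mutual_coherence(D):
    """Largest absolute off-diagonal entry of the Gram matrix D^T D.

    Assumes the columns of D are normalized to unit L2 norm.
    """
    G = np.abs(D.T @ D)
    np.fill_diagonal(G, 0.0)          # ignore the unit diagonal
    return G.max()

# Two identity atoms plus one atom at 45 degrees between them.
D = np.array([[1.0, 0.0, 1.0 / np.sqrt(2)],
              [0.0, 1.0, 1.0 / np.sqrt(2)]])
mu = mutual_coherence(D)              # 1/sqrt(2) for this dictionary
```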
BP and MP Equivalence
Equivalence: given a vector x with a representation Dα = x, and assuming that ||α||₀ < (1 + 1/µ)/2, BP and MP are guaranteed to find the sparsest solution. [Donoho & Elad ('02); Gribonval & Nielsen ('03); Tropp ('03); Temlyakov ('03)]
• MP is typically inferior to BP!
• The above result corresponds to the worst case.
• Average-performance results are available too, showing much better bounds [Donoho ('04), Candès et al. ('04), Elad & Zibulevsky ('04)].
Question 4 – Finding D?
Given P examples {xⱼ} and a fixed-size N×K dictionary D:
• Is D unique? (Yes.)
• How to find D? Train!! The K-SVD algorithm.
Training: The Objective
min over D, A of ||DA − X||²_F, subject to every column of A having no more than L non-zeros.
• Each example (a column of X) is a linear combination of atoms from D.
• Each example has a sparse representation with no more than L atoms.
The K-SVD Algorithm [Aharon, Elad & Bruckstein ('04)]
• Initialize D.
• Sparse coding: use MP or BP to compute the representations A of the examples X.
• Dictionary update: update D column-by-column by an SVD computation.
• Iterate between the two stages.
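The two-stage loop above can be sketched as follows. This is an illustrative toy implementation under my own naming, not the authors' code: sparse coding by OMP, then a per-atom rank-1 SVD update of the error matrix restricted to the examples using that atom.

```python
import numpy as np

def omp(D, x, L, tol=1e-12):
    """L-sparse coding of x over D by Orthogonal Matching Pursuit."""
    support, alpha = [], np.zeros(D.shape[1])
    residual = x.astype(float).copy()
    for _ in range(L):
        if np.linalg.norm(residual) < tol:
            break
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        coeffs, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        alpha[:] = 0.0
        alpha[support] = coeffs
        residual = x - D @ alpha
    return alpha

def ksvd(X, K, L, iters=3, seed=0):
    """K-SVD sketch: alternate sparse coding and per-atom SVD updates."""
    rng = np.random.default_rng(seed)
    N, P = X.shape
    D = rng.standard_normal((N, K))
    D /= np.linalg.norm(D, axis=0)
    for _ in range(iters):
        # Stage 1: sparse coding, one example at a time.
        A = np.column_stack([omp(D, X[:, j], L) for j in range(P)])
        # Stage 2: dictionary update, one atom at a time.
        for k in range(K):
            users = np.nonzero(A[k, :])[0]       # examples that use atom k
            if users.size == 0:
                continue
            # Error without atom k's contribution, on its users only.
            E = X[:, users] - D @ A[:, users] + np.outer(D[:, k], A[k, users])
            U, S, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, k] = U[:, 0]                    # rank-1 fit: new atom ...
            A[k, users] = S[0] * Vt[0, :]        # ... and its coefficients
    return D, A

X = np.random.default_rng(1).standard_normal((10, 40))
D, A = ksvd(X, K=8, L=2)
```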
Today We Discussed
1. A Visit to Sparseland – Motivating Sparsity & Overcompleteness
2. Answering the 4 Questions – How & why should this work?
Summary
• Sparsity and overcompleteness are important ideas that can be used in designing better tools in data/signal/image processing.
• There are difficulties in using them! We are working on resolving those difficulties:
  • performance of pursuit algorithms,
  • speedup of those methods,
  • training the dictionary,
  • demonstrating applications,
  • …
• The dream? Future transforms and regularizations will be data-driven, non-linear, overcomplete, and promoting sparsity.