Duarte, Wakin, Sarvotham, Baraniuk, Guo, Shamai
Compressed Sensing meets Information Theory
Dror Baron
drorb@ee.technion.ac.il
www.ee.technion.ac.il/people/drorb
Technology Breakthroughs • Sensing, Computation, Communication • fast, readily available, cheap • Progress in individual disciplines (computing, networks, comm, DSP, …)
The Data Deluge • Challenges: • Exponentially increasing amounts of data • myriad different modalities (audio, image, video, financial, seismic, weather …) • global scale acquisition • Analysis/processing hampered by slowing Moore’s law • finding “needle in haystack” • Energy consumption • Opportunities (today)
Sensing by Sampling • Sample data at the Nyquist rate (2x highest frequency in signal) • Compress data using a model (e.g., sparsity) • Lots of work to throw away >90% of the coefficients • Most computation at the sensor (asymmetrical) • Brick wall to performance of modern acquisition systems • [pipeline: sample → compress (sparse wavelet transform) → transmit/store → receive → decompress]
Sparsity / Compressibility • Many signals are sparse in some representation/basis (Fourier, wavelets, …) • [examples: pixels → large wavelet coefficients; wideband signal samples → large Gabor coefficients]
Compressed Sensing • Shannon/Nyquist sampling theorem • must sample at 2x highest frequency in signal • worst case bound for any bandlimited signal • too pessimistic for some classes of signals • does not exploit signal sparsity/compressibility • Seek direct sensing of compressible information • Compressed Sensing (CS) • sparse signals can be recovered from a small number of nonadaptive (fixed) linear measurements • [Candes et al.; Donoho; Rice,…]
Compressed Sensing via Random Projections • Measure linear projections onto a random basis where the data is not sparse • mild “over-sampling” in analog • Decode (reconstruct) via optimization • Highly asymmetrical (most computation at receiver) • [pipeline: project → transmit/store → receive → decode]
CS Encoding • Replace samples by a more general encoder based on a few linear projections (inner products) • [diagram: measurements y = Φx, with x the sparse signal (K non-zeros) and y the measurement vector]
Universality via Random Projections • Random projections • Universal for any compressible/sparse signal class • [diagram: measurements y = Φx, with x the sparse signal (K non-zeros)]
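A minimal NumPy sketch of the encoding step described above: a K-sparse signal is measured with a small number of nonadaptive random projections. The sizes N, K, M and the Gaussian choice of random matrix are illustrative, not taken from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, M = 1024, 20, 200               # signal length, sparsity, number of measurements

# K-sparse signal: K nonzero entries in random positions
x = np.zeros(N)
x[rng.choice(N, K, replace=False)] = rng.standard_normal(K)

# Random measurement matrix (i.i.d. Gaussian here, scaled by 1/sqrt(M))
Phi = rng.standard_normal((M, N)) / np.sqrt(M)

# Encoding: M nonadaptive linear measurements (inner products)
y = Phi @ x
print(y.shape)                        # (200,)
```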
Optical Computation of Random Projections [Rice DSP 2006] • CS measurements directly in analog • Single photodiode
First Image Acquisition • [figures: ideal 64x64 image (4096 pixels); 400 wavelets; image on DMD array; 1600 random measurements]
CS Signal Decoding • Goal: find x given y • Ill-posed inverse problem • Decoding approach • search over subspace of explanations of the measurements • find “most likely” explanation • universality accounted for during optimization • Linear program decoding [Candes et al., Donoho] • small number of samples • computationally tractable • Variations • greedy (matching pursuit) [Tropp et al., Needell et al., …] • optimization [Hale et al., Figueiredo et al.]
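A compact sketch of linear-program decoding (basis pursuit: minimize ||x||₁ subject to Φx = y), using SciPy's generic LP solver. The function name and the split x = x⁺ − x⁻ are standard devices for turning the ℓ1 objective into an LP, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(Phi, y):
    """l1-minimization decoding: min ||x||_1 subject to Phi @ x = y,
    recast as a linear program over x = xp - xn with xp, xn >= 0."""
    M, N = Phi.shape
    c = np.ones(2 * N)                # objective: sum(xp) + sum(xn) = ||x||_1
    A_eq = np.hstack([Phi, -Phi])     # Phi @ (xp - xn) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    u = res.x
    return u[:N] - u[N:]

# Usage with x, Phi, y from the encoding sketch above:
# x_hat = basis_pursuit(Phi, y)
# print(np.max(np.abs(x_hat - x)))   # near zero when M is large enough
```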
CS Hallmarks • CS changes the rules of the data acquisition game • exploits a priori sparsity information to reduce # measurements • Hardware/software: Universality • same random projections for any compressible signal class • simplifies hardware and algorithm design • Processing: Information scalability • random projections ~ sufficient statistics • same random projections for a range of tasks • decoding > estimation > recognition > detection • far fewer measurements required to detect/recognize • Next generation data acquisition • new imaging devices • new distributed source coding algorithms [Baron et al.]
CS meets Information Theoretic Bounds [Sarvotham, Baron, & Baraniuk 2006] [Guo, Baron, & Shamai 2009]
Fundamental Goal: Minimize Measurements • Compressed sensing aims to minimize resource consumption due to measurements • Donoho: “Why go to so much effort to acquire all the data when most of what we get will be thrown away?”
Signal Model • Signal entry X_n = B_n U_n • B_n ~ Bernoulli(ε) i.i.d. (sparse) • U_n ~ P_U i.i.d. • [diagram: Bernoulli(ε) source and P_U source feed a multiplier to produce P_X]
Non-Sparse Input • Can use ε = 1, so that X_n = U_n ~ P_U
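A small NumPy sketch of this signal model, writing the sparsity rate as eps and taking P_U to be standard Gaussian purely as an example choice.

```python
import numpy as np

def bernoulli_sparse_signal(N, eps, rng):
    """Draw X_n = B_n * U_n with B_n ~ Bernoulli(eps) i.i.d. (sparsity)
    and U_n ~ N(0, 1) i.i.d. as an example choice of P_U."""
    B = rng.random(N) < eps
    U = rng.standard_normal(N)
    return B * U

rng = np.random.default_rng(1)
x = bernoulli_sparse_signal(N=1000, eps=0.1, rng=rng)
print(np.count_nonzero(x))   # roughly eps * N = 100 nonzeros
# eps = 1 recovers the non-sparse case X_n = U_n
```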
Measurement Noise • Measurement process is typically analog • Analog systems add noise, non-linearities, etc. • Assume Gaussian noise for ease of analysis • Can be generalized to non-Gaussian noise
Noise Model • Noiseless measurements denoted y0 • Noise: i.i.d. Gaussian, added to each measurement • Noisy measurements: y = y0 + noise • Unit-norm columns of the measurement matrix • SNR = noiseless measurement energy / noise energy
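A sketch of this measurement model in NumPy: a matrix with unit-norm columns, additive Gaussian noise, and an empirical SNR computed as the energy ratio above. The noise level sigma and the dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, eps = 1000, 400, 0.1

# Sparse signal as in the signal-model slide
x = (rng.random(N) < eps) * rng.standard_normal(N)

# Measurement matrix with unit-norm columns
Phi = rng.standard_normal((M, N))
Phi /= np.linalg.norm(Phi, axis=0)

y0 = Phi @ x                           # noiseless measurements y0
sigma = 0.05                           # illustrative noise level
z = sigma * rng.standard_normal(M)     # i.i.d. Gaussian measurement noise
y = y0 + z                             # noisy measurements

snr = np.sum(y0 ** 2) / np.sum(z ** 2)  # noiseless energy / noise energy
print(snr)
```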
CS Analog to Communication System [Sarvotham, Baron, & Baraniuk 2006] • [diagram: source encoder → channel encoder → channel → channel decoder → source decoder, with CS measurement in the role of channel encoding and CS decoding in the role of channel decoding] • Model the measurement process as a channel • Measurements provide information!
Single-Letter Bounds • Theorem [Sarvotham, Baron, & Baraniuk 2006]: for a sparse signal with rate-distortion function R(D), there is a lower bound on the measurement rate as a function of the SNR and the distortion D • Numerous single-letter bounds • [Aeron, Zhao, & Saligrama] • [Akcakaya & Tarokh] • [Rangan, Fletcher, & Goyal] • [Gastpar & Reeves] • [Wang, Wainwright, & Ramchandran] • [Tune, Bhaskaran, & Hanly] • …
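A sketch of the shape such a bound takes, following the measurement-channel analogy of the previous slide (each noisy measurement conveys at most ½ log₂(1 + SNR) bits, while describing the source to distortion D costs R(D) bits per entry); this illustrates the argument and is not a verbatim restatement of the cited theorem.

```latex
\[
\frac{M}{N} \;\ge\; \frac{R(D)}{\tfrac{1}{2}\log_2\!\left(1 + \mathrm{SNR}\right)}
\]
```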
Goal: Precise Single-letter Characterization of Optimal CS [Guo, Baron, & Shamai 2009]
What Single-letter Characterization? • Ultimately, what can one say about X_n given Y? • The channel posterior is a sufficient statistic • Very complicated • Want a simple characterization of its quality • Large-system limit: N → ∞ with the measurement rate held constant
Main Result: Single-letter Characterization [Guo, Baron, & Shamai 2009] • Result 1: conditioned on X_n = x_n, the observations (Y, Φ) are statistically equivalent to a degraded scalar Gaussian observation of x_n, which is easy to compute… • Estimation quality from (Y, Φ) is just as good as from this noisier (degraded) scalar observation
Details • The degradation parameter, a number in (0,1), is the fixed point of a scalar equation • Take-home point: degraded scalar channel • Non-rigorous, owing to the replica method with a symmetry assumption • used in CDMA detection [Tanaka 2002, Guo & Verdu 2005] • Related analysis [Rangan, Fletcher, & Goyal 2009] • MMSE estimate (not posterior) using [Guo & Verdu 2005] • extended to several CS algorithms, particularly LASSO
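One common way to write such a degraded scalar channel; the symbols η and σ are assumed notation chosen here for illustration. Conditioned on X_n = x_n, the measurements behave like a single Gaussian observation of x_n whose noise is inflated by the degradation parameter η ∈ (0,1):

```latex
\[
Y_{\mathrm{scalar}} \;=\; x_n + \frac{\sigma}{\sqrt{\eta}}\, W,
\qquad W \sim \mathcal{N}(0,1), \quad \eta \in (0,1)
\]
```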
Decoupling Result [Guo, Baron, & Shamai 2009] • Result 2: in the large-system limit, any fixed (constant) number L of input elements decouple: their joint posterior factors into a product of individual posteriors • Take-home point: individual posteriors are statistically independent
Why is Decoding Expensive? • Culprit: dense, unstructured measurement matrix Φ • [diagram: sparse signal, measurements, nonzero entries]
Sparse Measurement Matrices [Baron, Sarvotham, & Baraniuk 2009] • LDPC measurement matrix (sparse) • Mostly zeros in Φ; nonzeros drawn from a distribution P • Each row contains ≈ Nq randomly placed nonzeros • Fast matrix-vector multiplication • fast encoding / decoding • [diagram: sparse matrix]
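A SciPy sketch of such an LDPC-like sparse measurement matrix with roughly Nq nonzeros per row; the ±1 nonzero values and the dimensions are example choices, not the distribution used in the cited work.

```python
import numpy as np
from scipy.sparse import csr_matrix

def sparse_measurement_matrix(M, N, q, rng):
    """LDPC-like sparse measurement matrix: each row gets about N*q
    randomly placed nonzeros; here the nonzero values are +/-1."""
    rows, cols, vals = [], [], []
    per_row = max(1, int(round(N * q)))
    for m in range(M):
        idx = rng.choice(N, per_row, replace=False)
        rows.extend([m] * per_row)
        cols.extend(idx)
        vals.extend(rng.choice([-1.0, 1.0], per_row))
    return csr_matrix((vals, (rows, cols)), shape=(M, N))

rng = np.random.default_rng(3)
Phi = sparse_measurement_matrix(M=400, N=1000, q=0.02, rng=rng)
x = (rng.random(1000) < 0.1) * rng.standard_normal(1000)
y = Phi @ x          # sparse matrix-vector product: fast encoding
```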
CS Decoding Using BP [Baron, Sarvotham, & Baraniuk 2009] • Measurement matrix represented by a graph • Estimate real-valued input iteratively • Implemented via nonparametric BP [Bickson, Sommer, …] • [bipartite graph: signal nodes x, measurement nodes y]
Identical Single-letter Characterization w/BP [Guo, Baron, & Shamai 2009] • Result 3: conditioned on X_n = x_n, the observations (Y, Φ) are statistically equivalent to the same degraded scalar observation as before (identical degradation) • Sparse matrices are just as good • Result 4: BP is asymptotically optimal!
Decoupling Between Two Input Entries • [plot: joint posterior density of two input entries; N=500, M=250, sparsity rate 0.1, SNR 10]
CS-BP vs Other CS Methods • [plot: MMSE versus number of measurements M for CS-BP and other methods; N=1000, sparsity rate 0.1, q=0.02]
CS-BP is O(N log²(N)) • [plot: runtime in seconds versus N; M=0.4N, sparsity rate 0.1, SNR 100, q=0.04]
Setting • LDPC measurement matrix (sparse) • Fast matrix-vector multiplication • Assumptions: • noiseless measurements • strictly sparse signal • [diagram: sparse signal, measurements, nonzero entries]
Example • Measurements y = [0, 1, 1, 4]ᵀ; unknown signal x = [?, ?, ?, ?, ?, ?]ᵀ • Sparse measurement matrix (rows): [0 1 1 0 0 0], [0 0 0 1 1 0], [1 1 0 0 1 0], [0 0 0 0 1 1]
Example • What does the zero measurement imply? • Hint: x is strictly sparse • (same y, x, and matrix as above)
Example • Graph reduction! • The zero measurement forces the entries it touches to zero: x = [?, 0, 0, ?, ?, ?]ᵀ
Example • What do matching measurements imply? • Hint: the non-zeros in x are real numbers • (the second and third measurements both equal 1)
Example • What is the last entry of x? • Now x = [0, 0, 0, 0, 1, ?]ᵀ, and the last measurement gives x₅ + x₆ = 4, so x₆ = 3
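A minimal sketch of the graph-reduction idea illustrated by this example, assuming noiseless measurements and a strictly sparse signal. It implements two of the rules above: a zero residual measurement zeroes the entries it touches, and a measurement with a single unresolved entry reveals that entry. The matching-measurements rule is omitted for brevity, so the usage snippet uses a fresh toy matrix where these two rules suffice; all names are illustrative.

```python
import numpy as np

def peel_decode(Phi, y, max_sweeps=50):
    """Graph-reduction decoder sketch for noiseless measurements of a
    strictly sparse x. Rules: (1) a residual measurement of zero forces the
    remaining entries it touches to zero (w.p. 1 for generic real nonzeros);
    (2) a measurement with a single unresolved entry reveals that entry."""
    M, N = Phi.shape
    x_hat = np.full(N, np.nan)                        # nan marks "unknown"
    for _ in range(max_sweeps):
        progress = False
        for m in range(M):
            filled = np.where(np.isnan(x_hat), 0.0, x_hat)
            support = np.flatnonzero(Phi[m])
            open_idx = support[np.isnan(x_hat[support])]
            if len(open_idx) == 0:
                continue
            resid = y[m] - Phi[m] @ filled            # subtract resolved entries
            if np.isclose(resid, 0.0):                # rule 1: zero measurement
                x_hat[open_idx] = 0.0
                progress = True
            elif len(open_idx) == 1:                  # rule 2: one unknown left
                n = open_idx[0]
                x_hat[n] = resid / Phi[m, n]
                progress = True
        if not progress or not np.isnan(x_hat).any():
            break
    return x_hat

# Toy usage (not the slides' matrix): two zero measurements plus peeling
Phi = np.array([[1, 1, 0, 0, 0, 0],
                [0, 0, 1, 1, 0, 0],
                [0, 0, 0, 1, 1, 0],
                [0, 0, 0, 0, 1, 1]], dtype=float)
x = np.array([0, 0, 5, 0, 0, 2], dtype=float)
print(peel_decode(Phi, Phi @ x))                      # recovers [0 0 5 0 0 2]
```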
Main Results [Sarvotham, Baron, & Baraniuk 2006] • # nonzeros per row • # measurements • Fast encoder and decoder • sub-linear decoding complexity • Can be used for distributed content distribution • measurements stored on different servers • any M measurements suffice • Strictly sparse signals, noiseless measurements
Related Direction: Linear Measurements • unified theory for linear measurement systems
Linear Measurements in Finance • Fama and French three-factor model (1993) • stock returns explained by linear exposure to factors • e.g., “market” (change in stock market index) • numerous factors can be used (e.g., earnings to price) • Noisy linear measurements: stock returns = factor returns × exposures + unexplained (typically big)
Financial Prediction • Explanatory power ≈ prediction (can invest on this) • Goal: estimate x to explain y well • Financial prediction vs CS: longer y, shorter x • Sounds easy, nonetheless challenging • NOISY data ⇒ need lots of measurements • nonlinear, nonstationary • [diagram: dimensions in financial prediction vs compressed sensing]
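A hedged sketch of this linear-measurement view of factor models: synthetic daily factor returns, a stock whose returns are a noisy linear function of them, and ordinary least squares to estimate the short exposure vector x from the long return vector y. All numbers and names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

T, K = 250, 3                               # ~1 year of daily returns, 3 factors
F = 0.01 * rng.standard_normal((T, K))      # synthetic factor returns (e.g., market, size, value)
beta_true = np.array([1.1, 0.4, -0.2])      # the stock's factor exposures (unknown x)
noise = 0.02 * rng.standard_normal(T)       # "unexplained" part, typically big
r = F @ beta_true + noise                   # stock returns as noisy linear measurements (y)

# Estimate exposures by ordinary least squares: long y (T returns), short x (K exposures)
beta_hat, *_ = np.linalg.lstsq(F, r, rcond=None)
print(beta_hat)
```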
Application Areas for Linear Measurements • DSP (CS) • Finance • Medical imaging (tomography) • Information retrieval • Seismic imaging (oil industry)
Unified Theory of Linear Measurement • Common goals • minimal resources • robustness • computationally tractable • Inverse problems • Striving toward theory and efficient processing in linear measurement systems