Duarte, Wakin, Sarvotham, Baraniuk, Guo, Shamai
Compressed Sensing meets Information Theory
Dror Baron
drorb@ee.technion.ac.il
www.ee.technion.ac.il/people/drorb
Technology Breakthroughs • Sensing, Computation, Communication • fast, readily available, cheap • Progress in individual disciplines (computing, networks, comm, DSP, …)
The Data Deluge • Challenges: • Exponentially increasing amounts of data • myriad different modalities (audio, image, video, financial, seismic, weather …) • global scale acquisition • Analysis/processing hampered by slowing Moore’s law • finding “needle in haystack” • Energy consumption • Opportunities (today)
Sensing by Sampling • Sample data at the Nyquist rate (2x highest frequency in signal) • Compress data using a model (e.g., sparsity) • Lots of work to throw away >90% of the coefficients • Most computation at the sensor (asymmetrical) • Brick wall to performance of modern acquisition systems • [pipeline: sample → compress (sparse wavelet transform) → transmit/store → receive → decompress]
Sparsity / Compressibility • Many signals are sparse in some representation/basis (Fourier, wavelets, …) • [examples: pixels → large wavelet coefficients; wideband signal samples → large Gabor coefficients]
Compressed Sensing • Shannon/Nyquist sampling theorem • must sample at 2x highest frequency in signal • worst case bound for any bandlimited signal • too pessimistic for some classes of signals • does not exploit signal sparsity/compressibility • Seek direct sensing of compressible information • Compressed Sensing (CS) • sparse signals can be recovered from a small number of nonadaptive (fixed) linear measurements • [Candes et al.; Donoho; Rice,…]
Compressed Sensing via Random Projections • Measure linear projections onto a random basis where the data is not sparse • mild “over-sampling” in analog • Decode (reconstruct) via optimization • Highly asymmetrical (most computation at receiver) • [pipeline: project → transmit/store → receive → decode]
CS Encoding • Replace samples by a more general encoder based on a few linear projections (inner products) • [diagram: measurements y = Φx, with x the sparse signal (K non-zeros) and y the measurement vector]
Universality via Random Projections • Random projections • Universal for any compressible/sparse signal class • [diagram: measurements y = Φx, with x the sparse signal (K non-zeros)]
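A minimal NumPy sketch of the encoding step described above: a K-sparse signal is measured with a small number of nonadaptive random projections. The sizes N, K, M and the Gaussian choice of random matrix are illustrative, not taken from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, M = 1024, 20, 200               # signal length, sparsity, number of measurements

# K-sparse signal: K nonzero entries in random positions
x = np.zeros(N)
x[rng.choice(N, K, replace=False)] = rng.standard_normal(K)

# Random measurement matrix (i.i.d. Gaussian here, scaled by 1/sqrt(M))
Phi = rng.standard_normal((M, N)) / np.sqrt(M)

# Encoding: M nonadaptive linear measurements (inner products)
y = Phi @ x
print(y.shape)                        # (200,)
```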
Optical Computation of Random Projections [Rice DSP 2006] • CS measurements directly in analog • Single photodiode
First Image Acquisition • [figures: ideal 64x64 image (4096 pixels); 400 wavelets; image on DMD array; 1600 random measurements]
CS Signal Decoding • Goal: find x given y • Ill-posed inverse problem • Decoding approach • search over subspace of explanations of the measurements • find “most likely” explanation • universality accounted for during optimization • Linear program decoding [Candes et al., Donoho] • small number of samples • computationally tractable • Variations • greedy (matching pursuit) [Tropp et al., Needell et al., …] • optimization [Hale et al., Figueiredo et al.]
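A compact sketch of linear-program decoding (basis pursuit: minimize ||x||₁ subject to Φx = y), using SciPy's generic LP solver. The function name and the split x = x⁺ − x⁻ are standard devices for turning the ℓ1 objective into an LP, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(Phi, y):
    """l1-minimization decoding: min ||x||_1 subject to Phi @ x = y,
    recast as a linear program over x = xp - xn with xp, xn >= 0."""
    M, N = Phi.shape
    c = np.ones(2 * N)                # objective: sum(xp) + sum(xn) = ||x||_1
    A_eq = np.hstack([Phi, -Phi])     # Phi @ (xp - xn) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    u = res.x
    return u[:N] - u[N:]

# Usage with x, Phi, y from the encoding sketch above:
# x_hat = basis_pursuit(Phi, y)
# print(np.max(np.abs(x_hat - x)))   # near zero when M is large enough
```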
CS Hallmarks • CS changes the rules of the data acquisition game • exploits a priori sparsity information to reduce # measurements • Hardware/software: Universality • same random projections for any compressible signal class • simplifies hardware and algorithm design • Processing: Information scalability • random projections ~ sufficient statistics • same random projections for a range of tasks • decoding > estimation > recognition > detection • far fewer measurements required to detect/recognize • Next generation data acquisition • new imaging devices • new distributed source coding algorithms [Baron et al.]
CS meets Information Theoretic Bounds [Sarvotham, Baron, & Baraniuk 2006] [Guo, Baron, & Shamai 2009]
Fundamental Goal: Minimize Measurements • Compressed sensing aims to minimize resource consumption due to measurements • Donoho: “Why go to so much effort to acquire all the data when most of what we get will be thrown away?”
Signal Model • Signal entry X_n = B_n U_n • B_n ~ Bernoulli(ε) i.i.d. (sparse) • U_n ~ P_U i.i.d. • [diagram: Bernoulli(ε) source and P_U source feed a multiplier to produce P_X]
Non-Sparse Input • Can use ε = 1, so that X_n = U_n ~ P_U
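A small NumPy sketch of this signal model, writing the sparsity rate as eps and taking P_U to be standard Gaussian purely as an example choice.

```python
import numpy as np

def bernoulli_sparse_signal(N, eps, rng):
    """Draw X_n = B_n * U_n with B_n ~ Bernoulli(eps) i.i.d. (sparsity)
    and U_n ~ N(0, 1) i.i.d. as an example choice of P_U."""
    B = rng.random(N) < eps
    U = rng.standard_normal(N)
    return B * U

rng = np.random.default_rng(1)
x = bernoulli_sparse_signal(N=1000, eps=0.1, rng=rng)
print(np.count_nonzero(x))   # roughly eps * N = 100 nonzeros
# eps = 1 recovers the non-sparse case X_n = U_n
```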
Measurement Noise • Measurement process is typically analog • Analog systems add noise, non-linearities, etc. • Assume Gaussian noise for ease of analysis • Can be generalized to non-Gaussian noise
Noise Model • Noiseless measurements denoted y0 • Noise: i.i.d. Gaussian, added to each measurement • Noisy measurements: y = y0 + noise • Unit-norm columns of the measurement matrix • SNR = noiseless measurement energy / noise energy
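A sketch of this measurement model in NumPy: a matrix with unit-norm columns, additive Gaussian noise, and an empirical SNR computed as the energy ratio above. The noise level sigma and the dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, eps = 1000, 400, 0.1

# Sparse signal as in the signal-model slide
x = (rng.random(N) < eps) * rng.standard_normal(N)

# Measurement matrix with unit-norm columns
Phi = rng.standard_normal((M, N))
Phi /= np.linalg.norm(Phi, axis=0)

y0 = Phi @ x                           # noiseless measurements y0
sigma = 0.05                           # illustrative noise level
z = sigma * rng.standard_normal(M)     # i.i.d. Gaussian measurement noise
y = y0 + z                             # noisy measurements

snr = np.sum(y0 ** 2) / np.sum(z ** 2)  # noiseless energy / noise energy
print(snr)
```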
CS Analog to Communication System [Sarvotham, Baron, & Baraniuk 2006] • [diagram: source encoder → channel encoder → channel → channel decoder → source decoder, with CS measurement in the role of channel encoding and CS decoding in the role of channel decoding] • Model the measurement process as a channel • Measurements provide information!
Single-Letter Bounds • Theorem [Sarvotham, Baron, & Baraniuk 2006]: for a sparse signal with rate-distortion function R(D), there is a lower bound on the measurement rate as a function of the SNR and the distortion D • Numerous single-letter bounds • [Aeron, Zhao, & Saligrama] • [Akcakaya & Tarokh] • [Rangan, Fletcher, & Goyal] • [Gastpar & Reeves] • [Wang, Wainwright, & Ramchandran] • [Tune, Bhaskaran, & Hanly] • …
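A sketch of the shape such a bound takes, following the measurement-channel analogy of the previous slide (each noisy measurement conveys at most ½ log₂(1 + SNR) bits, while describing the source to distortion D costs R(D) bits per entry); this illustrates the argument and is not a verbatim restatement of the cited theorem.

```latex
\[
\frac{M}{N} \;\ge\; \frac{R(D)}{\tfrac{1}{2}\log_2\!\left(1 + \mathrm{SNR}\right)}
\]
```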
Goal: Precise Single-letter Characterization of Optimal CS [Guo, Baron, & Shamai 2009]
What Single-letter Characterization? • Ultimately, what can one say about X_n given Y? • The channel posterior is a sufficient statistic • Very complicated • Want a simple characterization of its quality • Large-system limit: N → ∞ with the measurement rate held constant
Main Result: Single-letter Characterization [Guo, Baron, & Shamai 2009] • Result 1: conditioned on X_n = x_n, the observations (Y, Φ) are statistically equivalent to a degraded scalar Gaussian observation of x_n, which is easy to compute… • Estimation quality from (Y, Φ) is just as good as from this noisier (degraded) scalar observation
Details • The degradation parameter, a number in (0,1), is the fixed point of a scalar equation • Take-home point: degraded scalar channel • Non-rigorous, owing to the replica method with a symmetry assumption • used in CDMA detection [Tanaka 2002, Guo & Verdu 2005] • Related analysis [Rangan, Fletcher, & Goyal 2009] • MMSE estimate (not posterior) using [Guo & Verdu 2005] • extended to several CS algorithms, particularly LASSO
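One common way to write such a degraded scalar channel; the symbols η and σ are assumed notation chosen here for illustration. Conditioned on X_n = x_n, the measurements behave like a single Gaussian observation of x_n whose noise is inflated by the degradation parameter η ∈ (0,1):

```latex
\[
Y_{\mathrm{scalar}} \;=\; x_n + \frac{\sigma}{\sqrt{\eta}}\, W,
\qquad W \sim \mathcal{N}(0,1), \quad \eta \in (0,1)
\]
```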
Decoupling Result [Guo, Baron, & Shamai 2009] • Result 2: in the large-system limit, any fixed (constant) number L of input elements decouple: their joint posterior factors into a product of individual posteriors • Take-home point: individual posteriors are statistically independent
Why is Decoding Expensive? • Culprit: dense, unstructured measurement matrix Φ • [diagram: sparse signal, measurements, nonzero entries]
Sparse Measurement Matrices [Baron, Sarvotham, & Baraniuk 2009] • LDPC measurement matrix (sparse) • Mostly zeros in Φ; nonzeros drawn from a distribution P • Each row contains ≈ Nq randomly placed nonzeros • Fast matrix-vector multiplication • fast encoding / decoding • [diagram: sparse matrix]
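A SciPy sketch of such an LDPC-like sparse measurement matrix with roughly Nq nonzeros per row; the ±1 nonzero values and the dimensions are example choices, not the distribution used in the cited work.

```python
import numpy as np
from scipy.sparse import csr_matrix

def sparse_measurement_matrix(M, N, q, rng):
    """LDPC-like sparse measurement matrix: each row gets about N*q
    randomly placed nonzeros; here the nonzero values are +/-1."""
    rows, cols, vals = [], [], []
    per_row = max(1, int(round(N * q)))
    for m in range(M):
        idx = rng.choice(N, per_row, replace=False)
        rows.extend([m] * per_row)
        cols.extend(idx)
        vals.extend(rng.choice([-1.0, 1.0], per_row))
    return csr_matrix((vals, (rows, cols)), shape=(M, N))

rng = np.random.default_rng(3)
Phi = sparse_measurement_matrix(M=400, N=1000, q=0.02, rng=rng)
x = (rng.random(1000) < 0.1) * rng.standard_normal(1000)
y = Phi @ x          # sparse matrix-vector product: fast encoding
```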
CS Decoding Using BP [Baron, Sarvotham, & Baraniuk 2009] • Measurement matrix represented by a graph • Estimate real-valued input iteratively • Implemented via nonparametric BP [Bickson, Sommer, …] • [bipartite graph: signal nodes x, measurement nodes y]
Identical Single-letter Characterization w/BP [Guo, Baron, & Shamai 2009] • Result 3: conditioned on X_n = x_n, the observations (Y, Φ) are statistically equivalent to the same degraded scalar observation as before (identical degradation) • Sparse matrices are just as good • Result 4: BP is asymptotically optimal!
Decoupling Between Two Input Entries • [plot: joint posterior density of two input entries; N=500, M=250, sparsity rate 0.1, SNR 10]
CS-BP vs Other CS Methods • [plot: MMSE versus number of measurements M for CS-BP and other methods; N=1000, sparsity rate 0.1, q=0.02]
CS-BP is O(N log²(N)) • [plot: runtime in seconds versus N; M=0.4N, sparsity rate 0.1, SNR 100, q=0.04]
Setting • LDPC measurement matrix (sparse) • Fast matrix-vector multiplication • Assumptions: • noiseless measurements • strictly sparse signal • [diagram: sparse signal, measurements, nonzero entries]
Example • Measurements y = [0, 1, 1, 4]ᵀ; unknown signal x = [?, ?, ?, ?, ?, ?]ᵀ • Sparse measurement matrix (rows): [0 1 1 0 0 0], [0 0 0 1 1 0], [1 1 0 0 1 0], [0 0 0 0 1 1]
Example • What does the zero measurement imply? • Hint: x is strictly sparse • (same y, x, and matrix as above)
Example • Graph reduction! • The zero measurement forces the entries it touches to zero: x = [?, 0, 0, ?, ?, ?]ᵀ
Example • What do matching measurements imply? • Hint: the non-zeros in x are real numbers • (the second and third measurements both equal 1)
Example • What is the last entry of x? • Now x = [0, 0, 0, 0, 1, ?]ᵀ, and the last measurement gives x₅ + x₆ = 4, so x₆ = 3
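A minimal sketch of the graph-reduction idea illustrated by this example, assuming noiseless measurements and a strictly sparse signal. It implements two of the rules above: a zero residual measurement zeroes the entries it touches, and a measurement with a single unresolved entry reveals that entry. The matching-measurements rule is omitted for brevity, so the usage snippet uses a fresh toy matrix where these two rules suffice; all names are illustrative.

```python
import numpy as np

def peel_decode(Phi, y, max_sweeps=50):
    """Graph-reduction decoder sketch for noiseless measurements of a
    strictly sparse x. Rules: (1) a residual measurement of zero forces the
    remaining entries it touches to zero (w.p. 1 for generic real nonzeros);
    (2) a measurement with a single unresolved entry reveals that entry."""
    M, N = Phi.shape
    x_hat = np.full(N, np.nan)                        # nan marks "unknown"
    for _ in range(max_sweeps):
        progress = False
        for m in range(M):
            filled = np.where(np.isnan(x_hat), 0.0, x_hat)
            support = np.flatnonzero(Phi[m])
            open_idx = support[np.isnan(x_hat[support])]
            if len(open_idx) == 0:
                continue
            resid = y[m] - Phi[m] @ filled            # subtract resolved entries
            if np.isclose(resid, 0.0):                # rule 1: zero measurement
                x_hat[open_idx] = 0.0
                progress = True
            elif len(open_idx) == 1:                  # rule 2: one unknown left
                n = open_idx[0]
                x_hat[n] = resid / Phi[m, n]
                progress = True
        if not progress or not np.isnan(x_hat).any():
            break
    return x_hat

# Toy usage (not the slides' matrix): two zero measurements plus peeling
Phi = np.array([[1, 1, 0, 0, 0, 0],
                [0, 0, 1, 1, 0, 0],
                [0, 0, 0, 1, 1, 0],
                [0, 0, 0, 0, 1, 1]], dtype=float)
x = np.array([0, 0, 5, 0, 0, 2], dtype=float)
print(peel_decode(Phi, Phi @ x))                      # recovers [0 0 5 0 0 2]
```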
Main Results [Sarvotham, Baron, & Baraniuk 2006] • # nonzeros per row • # measurements • Fast encoder and decoder • sub-linear decoding complexity • Can be used for distributed content distribution • measurements stored on different servers • any M measurements suffice • Strictly sparse signals, noiseless measurements
Related Direction: Linear Measurements • unified theory for linear measurement systems
Linear Measurements in Finance • Fama and French three-factor model (1993) • stock returns explained by linear exposure to factors • e.g., “market” (change in stock market index) • numerous factors can be used (e.g., earnings to price) • Noisy linear measurements: stock returns = factor returns × exposures + unexplained (typically big)
Financial Prediction • Explanatory power ≈ prediction (can invest on this) • Goal: estimate x to explain y well • Financial prediction vs CS: longer y, shorter x • Sounds easy, nonetheless challenging • NOISY data ⇒ need lots of measurements • nonlinear, nonstationary • [diagram: dimensions in financial prediction vs compressed sensing]
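A hedged sketch of this linear-measurement view of factor models: synthetic daily factor returns, a stock whose returns are a noisy linear function of them, and ordinary least squares to estimate the short exposure vector x from the long return vector y. All numbers and names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

T, K = 250, 3                               # ~1 year of daily returns, 3 factors
F = 0.01 * rng.standard_normal((T, K))      # synthetic factor returns (e.g., market, size, value)
beta_true = np.array([1.1, 0.4, -0.2])      # the stock's factor exposures (unknown x)
noise = 0.02 * rng.standard_normal(T)       # "unexplained" part, typically big
r = F @ beta_true + noise                   # stock returns as noisy linear measurements (y)

# Estimate exposures by ordinary least squares: long y (T returns), short x (K exposures)
beta_hat, *_ = np.linalg.lstsq(F, r, rcond=None)
print(beta_hat)
```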
Application Areas for Linear Measurements • DSP (CS) • Finance • Medical imaging (tomography) • Information retrieval • Seismic imaging (oil industry)
Unified Theory of Linear Measurement • Common goals • minimal resources • robustness • computationally tractable • Inverse problems • Striving toward theory and efficient processing in linear measurement systems