Variational Bayesian Image Processing on Stochastic Factor Graphs. Xin Li Lane Dept. of CSEE West Virginia University. Outline. Statistical modeling of natural images From old-fashioned local models to newly-proposed nonlocal models Factor graph based image modeling
Variational Bayesian Image Processing on Stochastic Factor Graphs Xin Li Lane Dept. of CSEE West Virginia University
Outline
• Statistical modeling of natural images
• From old-fashioned local models to newly proposed nonlocal models
• Factor graph based image modeling
• A powerful framework unifying local and nonlocal approaches
• EM-based inference on stochastic factor graphs
• Applications and experimental results
• Denoising, inpainting, interpolation, post-processing, inverse halftoning, deblurring, ...
Cast Signal/Image Processing Under a Bayesian Framework
• Image restoration (Besag et al.’1991)
• Image denoising (Simoncelli & Adelson’1996)
• Interpolation (MacKay’1992) and super-resolution (Schultz & Stevenson’1996)
• Inverse halftoning (Wong’1995)
• Image segmentation (Bouman & Shapiro’1994)
x: unobservable data
y: observation data
Likelihood P(y|x): varies from application to application
Image prior P(x): the focus of this talk
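In standard form, the likelihood and the image prior combine through Bayes' rule as sketched below. The symbol $\hat{x}_{\mathrm{MAP}}$ is notation introduced here for illustration and does not appear in the slides.

```latex
P(x \mid y) \;=\; \frac{P(y \mid x)\, P(x)}{P(y)} \;\propto\; P(y \mid x)\, P(x),
\qquad
\hat{x}_{\mathrm{MAP}} \;=\; \arg\max_{x}\; P(y \mid x)\, P(x)
```

Changing the application (denoising, interpolation, inverse halftoning, ...) changes only $P(y \mid x)$, while the prior $P(x)$ is shared.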
Statistical Modeling of Natural Images: the Pursuit of a Good Prior
• Local models
• Markov Random Field (MRF) and its extensions (e.g., 2D Kalman filtering, Fields of Experts)
• Sparsity-based: DCT, wavelets, steerable pyramids, geometric wavelets (edgelets, curvelets, ridgelets, bandelets)
• Nonlocal models
• Bilateral filtering (Tomasi et al. ICCV’1998)
• Texture synthesis (Efros & Leung ICCV’1999)
• Exemplar-based inpainting (Criminisi et al. TIP’2004)
• Nonlocal means denoising (Buades et al. CVPR’2005)
• Total least-squares denoising (Hirakawa & Parks TIP’2006)
• Block-matching 3D denoising (Dabov et al. TIP’2007)
Introducing a New Language of Factor Graphs
• Why factor graphs?
• The most general form of graphical probability models (both MRFs and Bayesian networks can be converted to FGs)
• Widely used in computer science and engineering (forward-backward algorithm, Viterbi algorithm, turbo decoding algorithm, Pearl’s belief propagation algorithm, Kalman filter¹)
• What is a factor graph?
• A bipartite graph that expresses which variables are arguments of which local functions
• Factor/function nodes (solid squares) vs. variable nodes (empty circles)
Example: factor nodes f1 to f4, variable nodes B1 to B8, with the mapping L: F → V
f1: 1, 2, 4
f2: 3, 6
f3: 5, 7
f4: 7, 8
¹ Kschischang, F.R.; Frey, B.J.; Loeliger, H.-A., "Factor graphs and the sum-product algorithm," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498-519, Feb 2001
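The bipartite structure in the example can be held in a plain dictionary. The sketch below is only an illustration: the names `FACTOR_TO_VARS`, `variable_to_factors`, and `edges` are hypothetical, while the mapping itself is copied from the slide.

```python
from collections import defaultdict

# Factor-to-variable mapping L: F -> V, copied from the slide's example.
# Variable index v stands for variable node Bv.
FACTOR_TO_VARS = {
    "f1": [1, 2, 4],
    "f2": [3, 6],
    "f3": [5, 7],
    "f4": [7, 8],
}


def variable_to_factors(factor_to_vars):
    """Invert the mapping: for each variable node, list its factor nodes."""
    inverse = defaultdict(list)
    for factor, variables in factor_to_vars.items():
        for v in variables:
            inverse[v].append(factor)
    return dict(inverse)


def edges(factor_to_vars):
    """List the edges of the bipartite graph as (factor, variable) pairs."""
    return [(f, f"B{v}") for f, vs in factor_to_vars.items() for v in vs]


VAR_TO_FACTORS = variable_to_factors(FACTOR_TO_VARS)
```

In this example only B7 is shared by two factor nodes (f3 and f4), which is where the factors couple.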
Variable Nodes = Image Patches
• Neuroscience: the receptive fields of neighboring cells in the human visual system overlap heavily
• Engineering: the patch has appeared under many different names, such as windows in digital filters, blocks in JPEG, and the support of wavelet bases
Cited from D. Hubel, “Eye, Brain and Vision”, 1988
Factorization: the Art of Statistical Image Modeling
• Range-Markovian (ML): locally linear embedding¹ (perceptual similarity defines the neighborhood)
• Domain-Markovian (SP): wavelet-based statistical models (geometric proximity defines the neighborhood)
¹ S.T. Roweis and L.K. Saul, “Nonlinear Dimensionality Reduction by Locally Linear Embedding,” Science 290 (5500), 2323 (22 December 2000).
Unification Using Factor Graphs
[Figure: three factor-graph structures over patches B0 to B4 with factor nodes f1 to f4]
• Naive Bayesian (DCT/wavelet-based models)
• MRF-based
• kNN/kmeans clustering (nonlocal image models)
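For the kNN case, a factor node can be formed by grouping a reference patch with its most similar patches. The sketch below is a minimal illustration, assuming squared Euclidean distance as the similarity measure and an exhaustive search; the function names are hypothetical and the slides do not specify either choice.

```python
import numpy as np


def extract_patches(image, size, stride=1):
    """Collect all size-by-size patches of a 2D image, with their top-left coordinates."""
    h, w = image.shape
    coords = [(i, j)
              for i in range(0, h - size + 1, stride)
              for j in range(0, w - size + 1, stride)]
    patches = np.stack([image[i:i + size, j:j + size] for i, j in coords])
    return patches, coords


def knn_patches(patches, ref_index, k):
    """Return indices of the k patches closest to the reference patch.

    Distance is squared Euclidean (an assumption); the reference itself
    is included, since its distance is zero.
    """
    flat = patches.reshape(len(patches), -1).astype(np.float64)
    dists = np.sum((flat - flat[ref_index]) ** 2, axis=1)
    return np.argsort(dists, kind="stable")[:k]
```

Because the neighbors are chosen by similarity rather than by position, patches from distant parts of the image can end up in the same group.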
A Manifold Interpretation of Nonlocal Image Prior
[Figure: patches B0, B1, ..., Bk; label: MRN]
How to maximize the sparsity of a representation?
Conventional wisdom: adapt the basis to the signal (e.g., basis pursuit, matching pursuit)
New proposal: adapt the signal to the basis (by probing its underlying organizing principle)
Organizing Principle: Latent Variable L
[Figure: latent variable L, factor nodes fA, fB, fC, patches B11 to B44, image x, observation y, likelihood P(y|x)]
Labels: image denoising, image inpainting, image coding, image halftoning, image deblurring, sparsifying transform
“Nature is not economical of structures but organizing principles.” - Stanislaw M. Ulam
Maximum-Likelihood Estimation of Graph Structure L
Loop over every factor node fj:
• Pack the patches B0, B1, ..., Bk into a 3D array D
• Forward transform
• Coring
• Inverse transform
• Unpack into 2D patches (the estimates of B0, B1, ..., Bk)
Update the estimate of x (using P(y|x))
Update the estimate of L
For a variational interpretation of this EM-based inference on FGs, see the paper.
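The per-factor-node steps (pack, forward transform, coring, inverse transform, unpack) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: an orthonormal 3D FFT stands in for the unspecified transform, coring is taken to be hard thresholding with a hypothetical threshold `tau`, and the updates of L and x are not shown.

```python
import numpy as np


def hard_threshold(coeffs, tau):
    """Coring step (assumed to be hard thresholding): zero coefficients with magnitude <= tau."""
    return np.where(np.abs(coeffs) > tau, coeffs, 0)


def denoise_group(patches_2d, tau):
    """Process the patches attached to one factor node.

    patches_2d: list of k equally sized 2D arrays (B0, B1, ..., Bk).
    Returns a list of k 2D arrays with the processed patches.
    """
    # Pack the 2D patches into a 3D array D of shape (k, p, p).
    D = np.stack(patches_2d, axis=0).astype(np.float64)
    # Forward transform; the orthonormal FFT is a stand-in for the real choice.
    coeffs = np.fft.fftn(D, norm="ortho")
    # Coring in the transform domain.
    coeffs = hard_threshold(coeffs, tau)
    # Inverse transform; thresholding on magnitude keeps conjugate symmetry,
    # so the result is real up to rounding error.
    D_hat = np.fft.ifftn(coeffs, norm="ortho").real
    # Unpack into 2D patches.
    return [D_hat[i] for i in range(D_hat.shape[0])]
```

When the grouped patches are similar, most of the energy of D concentrates in a few coefficients, so thresholding removes mostly noise. That is the sense in which similarity leads to sparsity.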
Problem 1: Image Denoising
[Table: PSNR (dB) performance comparison among different schemes for 12 test images at σw = 100]
[Table: SSIM performance comparison among different schemes for 12 test images at σw = 100]
[Figure: BM3D (kNN, iter=2) and SFG (kmeans, iter=20); labels σw: org., 200, 400, 600, 800, 1000]
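PSNR, the figure of merit reported throughout the experiments, can be computed as in the sketch below. It assumes 8-bit images with a peak value of 255; the slides do not state the peak value used.

```python
import numpy as np


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference image and an estimate.

    peak=255.0 assumes 8-bit intensities (an assumption, not from the slides).
    """
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((ref - est) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

A gain of 1 dB corresponds to roughly a 21% reduction in mean squared error.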
Problem 2: Image Recovery
[Figure: x, y, and the results of DCT, FoE, EXP, BM3D, LSP, SFG; top-down: test1, test3, test5]
[Figure, continued: top-down: test2, test4, test6]
[Tables: PSNR (dB) performance comparison; SSIM performance comparison]
Local models: DCT, FoE and LSP
Nonlocal models: EXP, BM3D¹ and SFG
¹ Our own extension into image recovery
Problem 3: Resolution Enhancement
[Figure: x, y, and the results of bicubic, NEDI¹ and FG, with PSNR values in slide order]
31.76dB 32.36dB 32.63dB
34.71dB 34.45dB 37.35dB
28.70dB 27.34dB 28.19dB
18.81dB 15.37dB 16.45dB
¹ X. Li and M. Orchard, “New edge directed interpolation”, IEEE TIP, 2001
Problem 4: Irregular Interpolation
[Figure: x, y, and the results of DT, KR and FG¹, with PSNR values in slide order; label: 25% kept]
29.06dB 31.56dB 34.96dB
28.46dB 31.16dB 36.51dB
26.04dB 24.63dB 29.91dB
17.90dB 18.49dB 29.25dB
DT: Delaunay triangulation-based (griddata under MATLAB)
KR: kernel regression-based (Takeda et al. IEEE TIP 2007, w/o parameter optimization)
¹ X. Li, “Patch-based image interpolation: algorithms and applications,” Inter. Workshop on Local and Non-Local Approximation (LNLA)’2008
Problem 5: Post-processing
• JPEG-decoded at a rate of 0.32bpp (PSNR=32.07dB)
• SFG-enhanced at a rate of 0.32bpp (PSNR=33.22dB)
• SPIHT-decoded at a rate of 0.20bpp (PSNR=26.18dB)
• SFG-enhanced at a rate of 0.20bpp (PSNR=27.33dB)
Maximum-Likelihood (ML) Decoding vs. Maximum a Posteriori (MAP) Decoding
Problem 6: Inverse Halftoning
• Without nonlocal prior¹ (PSNR=31.84dB, SSIM=0.8390)
• With nonlocal prior (PSNR=32.82dB, SSIM=0.8515)
¹ Available from the Image Halftoning Toolbox released by UT-Austin researchers
Conclusions and Perspectives
• Despite the rich structures in natural images, the underlying organizing principle is simple (self-similarity)
• We have shown how similarity can lead to sparsity in a nonlinear representation of images
• FG represents only one mathematical language for interpreting this principle (multifractal formalism is another)
• Image processing (low-level vision) could benefit from data clustering (higher-level vision): how does the human visual cortex learn to decode the latent variable L through unsupervised learning?
Reproducible Research: MATLAB codes accompanying this work are available at http://www.csee.wvu.edu/~xinl/sfg.html (more will be added)