ECEC 453 Image Processing Architecture • Lecture 5, 1/22/2004 • Rate-Distortion Theory, Quantizers and DCT • Oleh Tretiak, Drexel University
Quality - Rate Tradeoff • Given: 512x512 picture, 8 bits per pixel • Bit reduction • Fewer bits per pixel • Fewer pixels • Both • Issues: • How do we measure compression? • Bits/pixel — does not work when we change the number of pixels • Total bits — valid, but hard to interpret • How do we measure quality? • RMS noise • Peak signal-to-noise ratio (PSNR) in dB • Subjective quality
Quantizer Performance • Questions: • How much error does the quantizer introduce (distortion)? • How many bits are required for the quantized values (rate)? • Rate: • 1. No compression. If there are N possible quantizer output values, then it takes ceiling(log2 N) bits per sample. • 2(a). Compression. Compute the histogram of the quantizer output. Design a Huffman code for the histogram. Find the average length. • 2(b). Find the entropy of the quantizer distribution • 2(c). Preprocess quantizer output, .... • Distortion: Let x be the input to the quantizer, x* the de-quantized value. Quantization noise n = x* - x. Quantization noise power is D = Average(n²).
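As a concrete illustration of option 2(b), here is a minimal Python sketch (numpy and a Laplacian test signal are assumptions for illustration; the function name is not from the slides) that histograms the quantizer output and computes its entropy, a lower bound on the Huffman rate of option 2(a):

```python
import numpy as np

def quantizer_rate(x, S):
    """Empirical entropy (bits/sample) of the quantizer output q = round(x/S).

    This is option 2(b) above, and a lower bound on the average
    Huffman code length of option 2(a)."""
    q = np.round(x / S)                       # quantizer output symbols
    _, counts = np.unique(q, return_counts=True)
    p = counts / counts.sum()                 # histogram -> probabilities
    return -np.sum(p * np.log2(p))            # entropy in bits/sample

rng = np.random.default_rng(0)
x = rng.laplace(scale=10.0, size=100_000)     # illustrative test signal
print(f"rate ~ {quantizer_rate(x, S=8.0):.2f} bits/sample")
```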
Quantizer: practical lossy encoding • Symbols: x — input to quantizer, q — output of quantizer, S — quantizer step • Quantizer: q = round(x/S) • Dequantizer characteristic: x* = Sq • Typical noise power added by the quantizer-dequantizer combination: D = S²/12; noise standard deviation σ = sqrt(D) ≈ 0.289S • Example: S = 8, D = 8²/12 ≈ 5.3, rms quantization noise = sqrt(D) ≈ 2.3. If the input is 8 bits, the maximum input is 255, so there are 255/8 ≈ 32 quantizer output values. PSNR = 20 log10(255/2.3) ≈ 40.8 dB • (Figures: quantizer characteristic, dequantizer characteristic)
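The numbers in the example above can be checked numerically; the following is a sketch assuming numpy and a uniformly distributed test input spanning the 8-bit range (the input distribution is an assumption for illustration):

```python
import numpy as np

S = 8.0
rng = np.random.default_rng(1)
x = rng.uniform(0, 255, size=1_000_000)  # test input spanning the 8-bit range

q = np.round(x / S)                      # quantizer: q = round(x/S)
x_star = S * q                           # dequantizer: x* = Sq
n = x_star - x                           # quantization noise

D = np.mean(n ** 2)                      # should be close to S^2/12 = 5.33
rms = np.sqrt(D)                         # ~ 2.3
psnr = 20 * np.log10(255 / rms)          # ~ 40.8 dB
print(f"D = {D:.2f}, rms = {rms:.2f}, PSNR = {psnr:.1f} dB")
```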
Rate-Distortion Theorem • When long sequences (blocks) are encoded, it is possible to construct a coder-decoder pair that achieves the specified distortion whenever the bits per sample are R(D) + ε. • Formula: X ~ Gaussian random variable, Q = E[X²] ~ signal power, D = E[(X − Y)²] ~ noise power. Then R(D) = max[0.5 log2(Q/D), 0] bits per sample. • Consequence: each additional bit per sample cuts the achievable distortion by a factor of 4 (6 dB).
This Lecture • Decorrelation and Bit Allocation • Discrete Cosine Transform • Video Coding
Coding Correlated Samples • How to code correlated samples • Decorrelate • Code • Methods for decorrelation • Prediction • Transformation • Block transform • Wavelet transform
Prediction Rules • Simplest: previous value
General Predictive Coding • General system: transmit the quantized prediction error, with the predictor driven by the reconstructed samples • Example of linear predictive image coder (a minimal code sketch follows below)
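A minimal sketch of such a system, assuming the simplest predictor from the previous slide (the previous reconstructed value) and a uniform step-S quantizer; all names are illustrative, not from the slides. The key point is that the encoder predicts from the reconstructed samples, so encoder and decoder stay in sync:

```python
import numpy as np

def dpcm_encode(x, S):
    """Lossy predictive encoder: quantize the prediction error,
    predicting each sample from the previous *reconstructed* value."""
    q = np.empty(len(x), dtype=int)
    pred = 0.0                        # predictor state = previous x*
    for i, xi in enumerate(x):
        e = xi - pred                 # prediction error
        q[i] = round(e / S)           # quantize the error, not the sample
        pred += S * q[i]              # track the decoder's reconstruction
    return q

def dpcm_decode(q, S):
    """Decoder: run the same prediction recursion as the encoder."""
    x_star = np.empty(len(q))
    pred = 0.0
    for i, qi in enumerate(q):
        pred += S * qi
        x_star[i] = pred
    return x_star
```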
Rate-distortion theory — correlated samples • Given: x = (x1, x2, ... xn), a sequence of correlated Gaussian samples • Preprocess: convert to y = (y1, y2, ... yn), y = Ax, A ~ an orthogonal matrix (A⁻¹ = Aᵀ) that de-correlates the samples. This is called a Karhunen-Loeve transformation • Perform lossy encoding of (y1, y2, ... yn) — get y* = (y1*, y2*, ... yn*) after decoding • Reconstruct: x* = A⁻¹y* = Aᵀy*
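A numerical sketch of this preprocessing step, assuming numpy; the covariance values are arbitrary illustrations. The K-L transform comes from the eigendecomposition of the covariance matrix, so the transformed components are uncorrelated:

```python
import numpy as np

def kl_transform(X):
    """Rows of A are eigenvectors of the sample covariance of x,
    so y = Ax has (empirically) decorrelated components."""
    C = np.cov(X, rowvar=False)        # sample covariance
    _, V = np.linalg.eigh(C)           # orthonormal eigenvector columns
    A = V.T                            # orthogonal: A^-1 = A^T
    return X @ A.T, A                  # y = Ax applied to every row of X

rng = np.random.default_rng(2)
X = rng.multivariate_normal([0, 0], [[4.0, 1.9], [1.9, 1.0]], size=10_000)
Y, A = kl_transform(X)
print(np.cov(Y, rowvar=False).round(3))   # off-diagonal terms ~ 0
```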
Block-Based Coding • The Discrete Cosine Transform (DCT) is used instead of the K-L transform • Full-image DCT — one set of decorrelated coefficients for the whole image • Block-based coding: • The image is divided into ‘small’ blocks • Each block is decorrelated separately • Block decorrelation performs almost as well as (and sometimes better than) full-image decorrelation • Current standards (JPEG, MPEG) use 8x8 DCT blocks
Rate-distortion theory: non-uniform random variables • Given (x1, x2, ... xn), use an orthogonal transform to obtain (y1, y2, ... yn) — a sequence of independent Gaussian variables with Var[yi] = Qi • Distortion allocation: allocate distortion Di to the variable with power Qi • Rate (bits) for the i-th variable: Ri = max[0.5 log2(Qi/Di), 0] • Total distortion: D = D1 + D2 + ... + Dn • Total rate (bits): R = R1 + R2 + ... + Rn • We specify R. What values of Di give the minimum total distortion D?
Bit allocation solution • Implicit solution (water-filling construction) • Choose Q (parameter) • Di = min(Qi, Q): if Qi > Q then Di = Q, else Di = Qi • Ri = max[0.5 log2(Qi/Di), 0]: if Qi > Q then Ri = 0.5 log2(Qi/Q), else Ri = 0 • Find the value of Q that gives the specified total rate R
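The last step can be carried out by a simple search for the water level; here is a sketch (names and the bisection tolerance are illustrative) that finds Q by bisection, using the fact that the total rate decreases monotonically as Q grows:

```python
import numpy as np

def waterfill(Q_vars, R_target, tol=1e-9):
    """Water-filling: find the level Q so that the total rate
    sum_i max(0.5*log2(Q_i/Q), 0) equals R_target."""
    Q_vars = np.asarray(Q_vars, dtype=float)

    def total_rate(Q):
        return np.sum(np.maximum(0.5 * np.log2(Q_vars / Q), 0.0))

    lo, hi = 1e-12, Q_vars.max()       # total rate is decreasing in Q
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total_rate(mid) > R_target:
            lo = mid                   # level too low -> raise it
        else:
            hi = mid
    Q = 0.5 * (lo + hi)
    D = np.minimum(Q_vars, Q)          # D_i = min(Q_i, Q)
    R = np.maximum(0.5 * np.log2(Q_vars / D), 0.0)
    return Q, D, R

Q, D, R = waterfill([16.0, 4.0, 1.0, 0.25], R_target=3.0)
print(f"level Q = {Q:.3f}, total rate = {R.sum():.3f} bits")
```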
Wavelet Transform • Filterbank and wavelets • 2D wavelets • Wavelet Pyramid
Filterbank and Wavelets • Put the signal (sequence) through two filters • Low frequencies • High frequencies • Downsample both by a factor of 2 (in the original figure, 100 input samples become two subbands of 50 each) • Do it in such a way that the original signal can be reconstructed!
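As a sketch of the "reconstructable" requirement, here is the simplest such filterbank, the Haar pair (sum and difference filters followed by downsampling); numpy is assumed and the signal length must be even:

```python
import numpy as np

def haar_analysis(x):
    """Two-channel filterbank: lowpass and highpass (Haar),
    each branch downsampled by a factor of 2."""
    x = np.asarray(x, dtype=float)
    lo = (x[0::2] + x[1::2]) / np.sqrt(2)   # averages: low frequencies
    hi = (x[0::2] - x[1::2]) / np.sqrt(2)   # differences: high frequencies
    return lo, hi

def haar_synthesis(lo, hi):
    """Exactly inverts the analysis step (perfect reconstruction)."""
    x = np.empty(2 * len(lo))
    x[0::2] = (lo + hi) / np.sqrt(2)
    x[1::2] = (lo - hi) / np.sqrt(2)
    return x

x = np.arange(8, dtype=float)
lo, hi = haar_analysis(x)
print(np.allclose(haar_synthesis(lo, hi), x))   # True
```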
Filterbank Pyramid • (Figure: repeated two-band splits of a 1000-sample signal into subbands of 500, 250, 125, and 125 samples)
2D Wavelets • Apply wavelet processing along the rows of the picture • Apply wavelet processing along the columns of the picture • Pyramid processing
Lena: Top Level, next level • (Figure: the two top levels of a wavelet pyramid of the Lena image; the overlaid numbers 48.81, 15.45, 9.23, 6.48, 2.52, 1.01, 0.37 are per-subband values from the original slide)
DPCM • Simple to implement (low complexity) • Prediction: 3 multiplications and 2 additions • Estimation: 1 addition • Encoding: 1 addition + quantization • Performance for 2-D coding not as good as block quantization • In theory, for large past history the performance (rate-distortion) should be as good as other linear methods, but in that case there is no computational advantage • Bottom line: useful when complexity is limited • Important idea: Lossy predictive encoding.
Review: Image Decorrelation • x = (x1, x2, ... xn), a sequence of image gray values • Preprocess: convert to y = (y1, y2, ... yn), y = Ax, A ~ an orthogonal matrix (A⁻¹ = Aᵀ) • Theoretical best (for a Gaussian process): A is the Karhunen-Loeve transformation matrix • Images are not Gaussian processes • The Karhunen-Loeve matrix is image-dependent and computationally expensive to find • Evaluating y = Ax with the K-L transformation is computationally expensive • In practice, we use the DCT (discrete cosine transform) for decorrelation • Computationally efficient • Almost as good as the K-L transformation
Rate-Distortion: 1D vs. 2D coding • Theory of the tradeoff between distortion and the minimum number of bits • The tradeoff is interesting only when the samples are correlated • “Water-filling” construction to compute R(D)
Review: Block-Based Coding • Full-image DCT — one set of decorrelated coefficients for the whole image • Block-based coding: • The image is divided into ‘small’ blocks • Each block is decorrelated separately • Block decorrelation performs almost as well as (and sometimes better than) full-image decorrelation • Current standards (JPEG, MPEG) use 8x8 DCT blocks
What is the DCT? • One-dimensional 8-point DCT. Input x0, ... x7; output y0, ... y7: yk = (ck/2) Σ n=0..7 xn cos[(2n+1)kπ/16], where c0 = 1/√2 and ck = 1 for k > 0 • One-dimensional inverse DCT. Input y0, ... y7; output x0, ... x7: xn = Σ k=0..7 (ck/2) yk cos[(2n+1)kπ/16] • Matrix form (x, y are column vectors): y = Cx, x = Cᵀy, where C has entries Ckn = (ck/2) cos[(2n+1)kπ/16] and is orthogonal
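These equations can be checked by building the 8x8 DCT matrix directly; a sketch assuming numpy (the function name is illustrative):

```python
import numpy as np

def dct_matrix(N=8):
    """DCT matrix with entries C[k, n] = (c_k/2) cos((2n+1)k*pi/16)
    when N = 8; written for general N with the orthonormal scaling."""
    C = np.zeros((N, N))
    for k in range(N):
        ck = np.sqrt(1.0 / N) if k == 0 else np.sqrt(2.0 / N)
        for n in range(N):
            C[k, n] = ck * np.cos((2 * n + 1) * k * np.pi / (2 * N))
    return C

C = dct_matrix(8)
print(np.allclose(C @ C.T, np.eye(8)))   # True: C is orthogonal
x = np.arange(8, dtype=float)
y = C @ x                                 # forward DCT: y = Cx
print(np.allclose(C.T @ y, x))            # inverse x = C^T y recovers x
```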
Two-Dimensional DCT • Forward 2D DCT. Input xij, i = 0, ... 7, j = 0, ... 7; output ykl, k = 0, ... 7, l = 0, ... 7: ykl = (ck cl/4) Σi Σj xij cos[(2i+1)kπ/16] cos[(2j+1)lπ/16] • Matrix form, X, Y ~ 8x8 matrices with coefficients xij, ykl: Y = CXCᵀ, X = CᵀYC • The 2D DCT is separable: apply the 1D DCT to every row, then to every column (or vice versa)
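Separability means the matrix form above needs only two matrix products per block; a sketch reusing dct_matrix from the previous example (the block contents are arbitrary):

```python
import numpy as np

def dct2(X, C):
    """Separable forward 2D DCT of an 8x8 block: Y = C X C^T."""
    return C @ X @ C.T

def idct2(Y, C):
    """Inverse 2D DCT: X = C^T Y C, using the orthogonality of C."""
    return C.T @ Y @ C

C = dct_matrix(8)   # from the 1D sketch above
X = np.random.default_rng(3).integers(0, 256, size=(8, 8)).astype(float)
Y = dct2(X, C)
print(np.allclose(idct2(Y, C), X))   # True: exact reconstruction
```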
General DCT • One dimension (N points): yk = ck √(2/N) Σ n=0..N−1 xn cos[(2n+1)kπ/(2N)], c0 = 1/√2, ck = 1 otherwise • Two dimensions (N×N block): ykl = (2/N) ck cl Σi Σj xij cos[(2i+1)kπ/(2N)] cos[(2j+1)lπ/(2N)]