Image Compression, Transform Coding & the Haar Transform 4c8 – Dr. David Corrigan
Entropy • It all starts with entropy: for a source with symbol probabilities pk, H = −Σk pk log2(pk) bits/symbol
Calculating the Entropy of an Image The entropy of Lenna is approximately 7.57 bits/pixel. The maximum the entropy could be is 8 bits/pixel, so it doesn't look like much compression would be possible.
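A minimal sketch of the calculation (assuming the image is already loaded as an 8-bit greyscale numpy array; the function name is illustrative):

```python
import numpy as np

def image_entropy(img):
    """First-order entropy of an image in bits/pixel.

    img: 2-D numpy array of integer grey levels.
    """
    _, counts = np.unique(img, return_counts=True)
    p = counts / counts.sum()          # probability of each grey level
    return -np.sum(p * np.log2(p))     # H = -sum p_k log2(p_k)

# For an 8-bit image this is at most 8 bits/pixel;
# for Lenna it comes out at about 7.57 bits/pixel.
```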
Huffman Coding • Huffman is the simplest entropy coding scheme • It achieves an average code length within 1 bit/symbol of the entropy • A binary tree is built by repeatedly combining the two symbols with the lowest probability under a dummy node • The code length for each symbol is the number of branches between the root and its leaf
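A minimal sketch of the tree-building step, using a heap to repeatedly combine the two least probable nodes (the symbol probabilities in the example are illustrative):

```python
import heapq

def huffman_code_lengths(probs):
    """Build a Huffman tree bottom-up and return each symbol's
    code length (its depth in the tree). probs: {symbol: probability}."""
    # Each heap entry: (probability, tie-breaker, [symbols in subtree])
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    depth = {s: 0 for s in probs}
    tick = len(heap)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)   # two least probable nodes...
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:                 # ...gain one branch on the path to the root
            depth[s] += 1
        heapq.heappush(heap, (p1 + p2, tick, s1 + s2))
        tick += 1
    return depth

lengths = huffman_code_lengths({'a': 0.5, 'b': 0.25, 'c': 0.15, 'd': 0.1})
# average length = sum(p_k * length_k), within 1 bit/symbol of the entropy
```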
Huffman Coding of Lenna The resulting average code word length is only slightly greater than the entropy.
But this is not very good • Why? Entropy is not the minimum average codeword length for a source with memory • If the other pixel values are known, we can predict the unknown pixel with much greater certainty, and hence the effective (i.e. conditional) entropy is much less • Entropy Rate – the minimum average codeword length for any source – is defined as H_rate = lim (n→∞) (1/n) H(X1, X2, …, Xn)
Coding Sources with Memory • It is very difficult to achieve codeword lengths close to the entropy rate • In fact it is difficult to calculate the entropy rate itself • You looked at LZW in 3C5 as a practical coding algorithm (see the sketch below) • The average codeword length tends to the entropy rate if the file is large enough • Efficiency is improved if we use Huffman to encode the output of LZW • LZ algorithms are used in lossless compression formats (e.g. .tiff, .png, .gif, .zip, .gz, .rar… )
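As a reminder of how LZ-style coding exploits memory, a minimal LZW encoder sketch (byte-oriented dictionary; output is a list of dictionary indices rather than a packed bitstream):

```python
def lzw_encode(data: bytes):
    """Minimal LZW: emit a dictionary index each time the current
    phrase extended by the next byte is not yet in the dictionary."""
    dictionary = {bytes([i]): i for i in range(256)}
    phrase, out = b"", []
    for byte in data:
        candidate = phrase + bytes([byte])
        if candidate in dictionary:
            phrase = candidate                        # keep extending the match
        else:
            out.append(dictionary[phrase])
            dictionary[candidate] = len(dictionary)   # learn the new phrase
            phrase = bytes([byte])
    if phrase:
        out.append(dictionary[phrase])
    return out

# The longer the input, the longer the learned phrases become, so the
# average codeword length per input symbol tends towards the entropy rate.
```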
Efficiency of Lossless Compression • Lenna (256x256) file sizes • Uncompressed tiff - 64.2 kB • LZW tiff – 69.0 kB • Deflate (LZ77 + Huff) – 58 kB • Green Screen (1920 x 1080) file sizes • Uncompressed – 5.93 MB • LZW – 4.85 MB • Deflate – 3.7 MB
Differential Coding • Key idea – code the differences in intensity. G(x,y) = I(x,y) – I(x-1,y)
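A sketch of the forward and inverse differencing (assuming pixels are differenced along each row, with the first column stored unchanged so decoding has a known starting point):

```python
import numpy as np

def difference_image(img):
    """G(x, y) = I(x, y) - I(x-1, y): difference each pixel with its
    left neighbour; the first column is kept as-is."""
    I = img.astype(np.int16)              # differences can be negative
    G = I.copy()
    G[:, 1:] = I[:, 1:] - I[:, :-1]
    return G

def reconstruct_image(G):
    """Invert the differencing with a cumulative sum along each row."""
    return np.cumsum(G, axis=1).astype(np.uint8)
```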
Differential Coding Pipeline: Calculate Difference Image → Huffman Encoding → Channel → Huffman Decoding → Image Reconstruction. The entropy is now 5.60 bits/pixel, which is much less than the 7.57 bits/pixel we had before (despite having twice as many symbols, since differences range from −255 to +255).
So why does this work? Plot a graph of H(p) = −p log2(p) − (1−p) log2(1−p) against p: entropy peaks when the symbols are equally likely and falls as the distribution becomes skewed.
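A quick way to produce that plot (matplotlib assumed):

```python
import numpy as np
import matplotlib.pyplot as plt

p = np.linspace(0.001, 0.999, 500)
H = -p * np.log2(p) - (1 - p) * np.log2(1 - p)   # binary entropy function
plt.plot(p, H)
plt.xlabel("p")
plt.ylabel("H(p) / bits")
plt.show()   # peaks at 1 bit when p = 0.5, falls to 0 as p -> 0 or 1
```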
In general • Entropy of a source is maximised when all symbols are equiprobable, and is less when a few symbols are much more probable than the others. Histogram of the original image: entropy = 7.57 bits/pixel. Histogram of the difference image: entropy = 5.6 bits/pixel.
Lossy Compression • But this is still not enough compression • The trick is to throw away data that has the least perceptual significance. (Comparison images: effective bit rate = 8 bits/pixel vs. effective bit rate ≈ 1 bit/pixel.)
Lossy Transform Coding Pipeline: Transform (lossless) → Quantisation (lossy) → Run-Length Coding (lossless) → Entropy Coding (lossless)
The Haar Xform The transform splits the image into four subbands: Lo-Lo, Hi-Lo, Lo-Hi and Hi-Hi.
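A minimal single-level 2-D Haar sketch (assuming even image dimensions and the orthonormal 1/2 normalisation; other conventions scale the bands differently):

```python
import numpy as np

def haar_level1(x):
    """One level of the 2-D Haar transform, computed on 2x2 blocks."""
    x = x.astype(float)
    a, b = x[0::2, 0::2], x[0::2, 1::2]   # top-left, top-right of each block
    c, d = x[1::2, 0::2], x[1::2, 1::2]   # bottom-left, bottom-right
    lolo = (a + b + c + d) / 2            # local average (Lo-Lo)
    hilo = (a - b + c - d) / 2            # horizontal detail
    lohi = (a + b - c - d) / 2            # vertical detail
    hihi = (a - b - c + d) / 2            # diagonal detail (Hi-Hi)
    return lolo, hilo, lohi, hihi

# Further levels are obtained by applying haar_level1 to the Lo-Lo band again.
```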
Implementation Details • When displaying the Haar transform of the image, the mid-gray value represents 0 (except for the Lo-Lo band). • Colour images are processed by treating each colour channel as a separate gray-scale image. • If a YUV colourspace is used, subsampling of the U and V channels is typically applied. Subsampling is done before the Haar transform is taken.
Quantisation • After we transform the image we quantise the transform coefficients. • Step size is normally decided by perceptual evaluation. • We can assign different step sizes to the different bands. • We can use different step sizes for the different colour channels. • We will consider a uniform step size, Qstep, for each band for now.
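A sketch of a uniform quantiser and its inverse (assuming rounding to the nearest bin, with reconstruction at the bin centre):

```python
import numpy as np

def quantise(coeffs, qstep):
    """Uniform quantiser: map each coefficient to its bin index."""
    return np.round(coeffs / qstep).astype(int)

def dequantise(indices, qstep):
    """Reconstruct each coefficient at the centre of its bin."""
    return indices * qstep
```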
Entropy (Qstep = 15) • Calculating the overall entropy is trickier • Each coefficient in a band represents 4 pixel locations in the original image • So bits/pixel = (bits/coefficient)/4 • So the entropy of the transformed and quantised Lenna is the sum of the four band entropies, each weighted by 1/4
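The same weighting in code (a sketch assuming the four quantised level-1 bands are available as numpy arrays):

```python
import numpy as np

def band_entropy(q_band):
    """First-order entropy of a quantised band, in bits/coefficient."""
    _, counts = np.unique(q_band, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def level1_bits_per_pixel(bands):
    """Each level-1 band holds a quarter of the pixels, so each
    band's bits/coefficient contributes with weight 1/4."""
    return sum(band_entropy(b) for b in bands) / 4

# e.g. level1_bits_per_pixel([q_lolo, q_hilo, q_lohi, q_hihi])
```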
Mistake in Fig. 5 of handout The red dashed line is the histogram; the blue bars represent the "entropies" (i.e. −p · log2(p)), and not vice versa.
Calculating the Entropy for Level 2 of the transform • One Level 1 coefficient represents 4 pixels • One Level 2 coefficient represents 16 pixels Total Entropy = 1.70 bits/pixel (Qstep = 15)
Calculating the Entropy for Level 3 of the transform • One Level 1 coefficient represents 4 pixels • One Level 2 coefficient represents 16 pixels • One Level 3 coefficient represents 64 pixels Total Entropy = 1.62 bits/pixel (Qstep = 15)
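Written out, with H_b the measured entropy of band b in bits/coefficient, and the first three sums running over the three high-pass bands at each level:

```latex
\text{bits/pixel} = \frac{1}{4}\sum_{b\,\in\,\text{level 1}} H_b
 + \frac{1}{16}\sum_{b\,\in\,\text{level 2}} H_b
 + \frac{1}{64}\sum_{b\,\in\,\text{level 3}} H_b
 + \frac{1}{64}\, H_{\text{Lo-Lo}}
```

The weights sum to 1, since the three detail bands at levels 1, 2 and 3 plus the final Lo-Lo band account for every pixel exactly once.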
Multilevel Haar Xform Qstep = 15
Measuring Performance • Compression Efficiency – Entropy • Reconstruction Quality – Subjective Analysis (Diagram: quantisation applied directly vs. Haar transform followed by quantisation.)
Reconstruction (Qstep = 30) Panels: Original; Quantised; Haar Transform + Quantisation.
Laplacian Pdfs We assume that the histograms are derived from a continuous Laplacian pdf, p(x) = (1/(2x0)) e^(−|x|/x0), quantised along the intensity (x) axis. This gives a theoretical expression for the entropy with respect to the step size and the standard deviation of the image.
GOAL – estimate a theoretical value for the entropy of one of the subbands For this pdf the standard deviation is √2 · x0, so we can estimate x0 for the band by finding the standard deviation of the coefficient values.
The integration limits for the quantiser bins are: level 0: x1 = 0, x2 = Q/2 (counted twice by symmetry); level k ≥ 1: x1 = (k − 1/2)Q, x2 = (k + 1/2)Q
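Carrying out the integrals of p(x) between these limits gives the bin probabilities (by symmetry the probability for level −k equals that for level k):

```latex
p_0 = \int_{-Q/2}^{Q/2} \frac{1}{2x_0}\, e^{-|x|/x_0}\,dx = 1 - e^{-Q/(2x_0)}
\qquad
p_k = \int_{(k-\frac{1}{2})Q}^{(k+\frac{1}{2})Q} \frac{1}{2x_0}\, e^{-x/x_0}\,dx
    = \frac{1}{2}\, e^{-(k-\frac{1}{2})Q/x_0}\bigl(1 - e^{-Q/x_0}\bigr),
\quad k \ge 1
```

The theoretical band entropy is then H = −Σk pk log2(pk), taken over all bins.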
The measured entropy is less than what we would expect for a Laplacian pdf. This is because the actual histogram decays faster than an exponential.
The code is inefficient because level 0 has a probability ≫ 0.5 (approximately 0.8). Remember the ideal code length is −log2(pk). So if pk = 0.8, the ideal length is −log2(0.8) ≈ 0.32 bits. However, the minimum code length we can use for a symbol is 1 bit. Therefore we need to find a new way of coding level 0 – use run-length coding.
RLC coding to create "events" Example sequence: 13 −5 1 0 0 0 0 0 0 −1 0 0 0 0 0 0 0 0 0 0 0. Define the max run of zeros as 8, and code runs of 1, 2, 4 and 8 zeros. Here we have 4 non-zero "value" events, plus: 1 × run-of-4-zeros event, 2 × run-of-2-zeros events, 1 × run-of-8-zeros event, 1 × run-of-1-zero event.
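A sketch of the greedy decomposition into the allowed run lengths (longest runs first):

```python
def rlc_events(coeffs, runs=(8, 4, 2, 1)):
    """Turn a coefficient sequence into non-zero 'value' events and
    zero-run events, splitting each run of zeros greedily into the
    allowed run lengths."""
    events, zeros = [], 0

    def flush():
        nonlocal zeros
        for r in runs:                    # greedy: try the longest run first
            while zeros >= r:
                events.append(("run", r))
                zeros -= r

    for c in coeffs:
        if c == 0:
            zeros += 1                    # accumulate the current run of zeros
        else:
            flush()                       # emit any pending zero-run events
            events.append(("value", c))
    flush()
    return events

seq = [13, -5, 1, 0, 0, 0, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(rlc_events(seq))
# 4 "value" events; the run of 6 zeros splits into 4 + 2,
# and the run of 11 zeros splits into 8 + 2 + 1
```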
Synchronisation Say we have a source with symbols A, B and C, and we wish to encode the message ABBCCBCABAA using the code table A → 0, B → 10, C → 11. The coded message is therefore 010101111101101000. Q. What is the decoded message if the 6th bit in the stream is corrupted, i.e. we receive 010100111101101000?
Synchronisation • 010100111101101000 • The decoded stream is ABBACCACABAA • The problem is that one bit error causes subsequent symbols to be decoded incorrectly as well. • The stream is said to have lost synchronisation. • A solution is to periodically insert synchronisation symbols into the stream (e.g. one at the start of each row). This limits how far errors can propagate.
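The effect is easy to reproduce with a greedy prefix decoder and the code table above:

```python
def prefix_decode(bits, code={'0': 'A', '10': 'B', '11': 'C'}):
    """Decode a bit-string with a prefix code by matching codewords
    greedily from the current position."""
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in code:                  # a complete codeword: emit it
            out.append(code[buf])
            buf = ""
    return "".join(out)

print(prefix_decode("010101111101101000"))  # ABBCCBCABAA  (as sent)
print(prefix_decode("010100111101101000"))  # ABBACCACABAA (bit 6 flipped)
```

The single flipped bit shifts the codeword boundaries, so every symbol from that point on is mis-parsed until the decoder happens to fall back into step.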