Image Compression, Transform Coding & the Haar Transform 4c8 – Dr. David Corrigan
Entropy • It all starts with entropy
Calculating the Entropy of an Image • The entropy of Lenna is approximately 7.57 bits/pixel. • The maximum the entropy could be is 8 bits/pixel, so it doesn’t look like much compression would be possible.
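As a minimal sketch of how such a figure could be measured, the snippet below computes the first-order entropy H = -Σ p_i log2(p_i) from the relative frequencies of the pixel values. The function name `entropy_bits` and the toy pixel lists are illustrative assumptions; they are not the lecture's code or the Lenna data.

```python
import math
from collections import Counter

def entropy_bits(symbols):
    """First-order entropy in bits/symbol: H = -sum(p_i * log2(p_i))."""
    symbols = list(symbols)
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical toy "images", flattened to lists of 8-bit pixel values.
flat_image = [7] * 16              # a single grey level: nothing to code
uniform_image = list(range(256))   # every grey level equally likely

h_flat = entropy_bits(flat_image)
h_uniform = entropy_bits(uniform_image)
print(f"flat: {h_flat:.2f} bits/pixel, uniform: {h_uniform:.2f} bits/pixel")
```

The uniform case reaches the 8 bits/pixel ceiling for 8-bit data, which is why a measured value of 7.57 bits/pixel leaves so little room.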
Huffman Coding • Huffman is the simplest entropy coding scheme • It achieves an average code length within 1 bit/symbol of the entropy • A binary tree is built by repeatedly combining the two symbols (or dummy nodes) with the lowest probability into a new dummy node • The code length for each symbol is the number of branches between the root and the respective leaf
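The tree-building rule above can be sketched as follows. This is an illustrative implementation that only tracks code lengths (the depth of each leaf), using a heap to find the two least probable nodes; the function name and the example probabilities are assumptions.

```python
import heapq
from itertools import count

def huffman_code_lengths(probs):
    """Return {symbol: code length in bits} for a dict of probabilities."""
    if len(probs) == 1:
        return {s: 1 for s in probs}
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(p, next(tie), {s: 0}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Combine the two least probable nodes into a dummy node.
        p1, _, d1 = heapq.heappop(heap)
        p2, _, d2 = heapq.heappop(heap)
        # Every leaf below the new node is one branch further from the root.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]

# Hypothetical source with dyadic probabilities.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = huffman_code_lengths(probs)
average_length = sum(probs[s] * lengths[s] for s in probs)
print(lengths, average_length)
```

For this particular source the average length equals the entropy exactly, because every probability is a power of 1/2; in general it can exceed the entropy by up to 1 bit/symbol.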
Huffman Coding of Lenna Average Code Word Length = So the code length is not much greater than the entropy
But this is not very good • Why? • Entropy is not the minimum average codeword length for a source with memory • If the other pixel values are known we can predict the unknown pixel with much greater certainty, and hence the effective (i.e. conditional) entropy is much less. • Entropy Rate • The minimum average codeword length for any source. • It is defined as $H(\mathcal{X}) = \lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, \ldots, X_n)$
Coding Sources with Memory • It is very difficult to achieve codeword lengths close to the entropy rate • In fact it is difficult to calculate the entropy rate itself • You looked at LZW in 3C5 as a practical coding algorithm • Average codeword length tends to the entropy rate if the file is large enough • Efficiency is improved if we use Huffman to encode the output of LZW • LZ algorithms used in lossless compression formats (eg. .tiff, .png, .gif, .zip, .gz, .rar… )
Efficiency of Lossless Compression • Lenna (256x256) file sizes • Uncompressed tiff - 64.2 kB • LZW tiff – 69.0 kB • Deflate (LZ77 + Huff) – 58 kB • Green Screen (1920 x 1080) file sizes • Uncompressed – 5.93 MB • LZW – 4.85 MB • Deflate – 3.7 MB
Differential Coding • Key idea – code the differences in intensity. G(x,y) = I(x,y) – I(x-1,y)
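A minimal sketch of the difference image and its exact inverse, assuming the image is stored as a list of rows indexed `image[y][x]` and that the missing left neighbour of the first column is taken as 0 (the slides do not say how the border is handled).

```python
def difference_image(image):
    """G(x, y) = I(x, y) - I(x-1, y), with I(-1, y) assumed to be 0."""
    return [[row[x] - (row[x - 1] if x > 0 else 0) for x in range(len(row))]
            for row in image]

def reconstruct_image(diff):
    """Invert the differencing with a running sum along each row."""
    image = []
    for row in diff:
        total, rebuilt = 0, []
        for g in row:
            total += g
            rebuilt.append(total)
        image.append(rebuilt)
    return image

# Hypothetical 2x4 image with smooth rows.
image = [[100, 101, 103, 103],
         [ 50,  52,  51,  49]]
diff = difference_image(image)
print(diff)
```

Differences of 8-bit pixels lie in the range -255 to 255, so the alphabet roughly doubles, but for smooth images most values cluster near 0.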
Differential Coding • Pipeline: Calculate Difference Image → Huffman Encoding → Channel → Huffman Decoding → Image Reconstruction • The entropy is now 5.60 bits/pixel, which is much less than the 7.57 bits/pixel we had before (despite having twice as many symbols)
So why does this work? Plot a graph of H(p) against p.
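The transcript does not say which H(p) is meant, so the sketch below tabulates two likely candidates: the contribution of a single symbol, -p log2(p), and the entropy of a two-symbol source with probabilities p and 1-p. Both function names are illustrative assumptions.

```python
import math

def symbol_term(p):
    """Contribution of one symbol of probability p: -p * log2(p)."""
    return 0.0 if p == 0.0 else -p * math.log2(p)

def binary_entropy(p):
    """Entropy of a two-symbol source with probabilities p and 1 - p."""
    return symbol_term(p) + symbol_term(1.0 - p)

for p in (0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0):
    print(f"p = {p:4.2f}   -p*log2(p) = {symbol_term(p):.3f}   "
          f"binary H(p) = {binary_entropy(p):.3f}")
```

Either way the curve falls to zero as p approaches 1, which is the point of the slide: a histogram dominated by a few very probable values has low entropy.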
In general • Entropy of a source is maximised when all symbols are equiprobable and is less when a few symbols are much more probable than the others. • Histogram of the original image: entropy = 7.57 bits/pixel • Histogram of the difference image: entropy = 5.6 bits/pixel
Lossy Compression • But this is still not enough compression • The trick is to throw away the data that has the least perceptual significance • Figure: two versions of the image, at effective bit rates of 8 bits/pixel and approximately 1 bit/pixel
Lossy Transform Coding • Block diagram; stages labelled in order: Lossless, Lossy, Lossless, Lossless
The Haar Xform • Bands: Lo-Lo, Hi-Lo, Lo-Hi, Hi-Hi
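The transform itself is not written out in this transcript, so the following is a sketch under stated assumptions: the common orthonormal 2x2 Haar transform, applied to each non-overlapping 2x2 block [[a, b], [c, d]] of an image with even dimensions. Which difference band is called Hi-Lo and which Lo-Hi varies between texts; the labelling here is an assumption.

```python
def haar_level1(image):
    """One level of a 2-D Haar transform on 2x2 blocks.

    Returns (lolo, hilo, lohi, hihi); each band is half the size of the
    input in both directions. Image dimensions are assumed to be even.
    """
    h, w = len(image), len(image[0])
    lolo, hilo, lohi, hihi = (
        [[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for y in range(0, h, 2):
        for x in range(0, w, 2):
            a, b = image[y][x], image[y][x + 1]
            c, d = image[y + 1][x], image[y + 1][x + 1]
            i, j = y // 2, x // 2
            lolo[i][j] = (a + b + c + d) / 2   # local average (scaled by 2)
            hilo[i][j] = (a - b + c - d) / 2   # horizontal difference
            lohi[i][j] = (a + b - c - d) / 2   # vertical difference
            hihi[i][j] = (a - b - c + d) / 2   # diagonal difference
    return lolo, hilo, lohi, hihi

# Hypothetical single 2x2 block.
lolo, hilo, lohi, hihi = haar_level1([[1, 2], [3, 4]])
print(lolo, hilo, lohi, hihi)
```

With this scaling the transform preserves energy (the sum of squares is the same before and after), and in smooth regions almost all of that energy ends up in the Lo-Lo band.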
Implementation Details • When displaying the Haar transform of the image, the mid-gray value represents 0 (except for the Lo-Lo band). • Colour images are processed by treating each colour channel as a separate grayscale image. • If the YUV colourspace is used, the U and V channels will probably be subsampled. Subsampling is done before the Haar transform is taken.
Quantisation • After we transform the image we quantise the transform coefficients. • Step size is chosen by perceptual evaluation • We can assign different step sizes to the different bands. • We can use different step sizes for the different colour channels. • We will consider a uniform step size, Qstep, for each band for now.
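A uniform quantiser with step size Qstep can be sketched as below. The rounding convention (decision thresholds halfway between reconstruction levels, so that level k covers (k-1/2)Q to (k+1/2)Q) is an assumption, as are the function names.

```python
import math

def quantise(coeff, qstep):
    """Map a coefficient to the index of the nearest multiple of qstep."""
    return math.floor(coeff / qstep + 0.5)

def dequantise(index, qstep):
    """Reconstruct the coefficient value represented by a quantiser index."""
    return index * qstep

# Hypothetical coefficients quantised with Qstep = 15.
coeffs = [13, -5, 1, 7, 8, -8, 40]
indices = [quantise(c, 15) for c in coeffs]
rebuilt = [dequantise(k, 15) for k in indices]
print(indices, rebuilt)
```

The reconstruction error is at most Qstep/2 per coefficient, and every coefficient with magnitude below Qstep/2 collapses to level 0.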
Entropy Qstep = 15
Entropy Qstep = 15 • Calculating the overall entropy is trickier • Each coefficient in a band represents 4 pixel locations in the original image. • So bits/pixel = (bits/coefficient)/4 • So the entropy of the transformed and quantised lenna is
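The bookkeeping above can be sketched as follows: quantise each band, measure its entropy in bits/coefficient, and divide the sum over the four bands by 4. The helper names and the toy band data are assumptions, and the quantiser rounding convention is the usual nearest-level rule rather than anything stated on the slide.

```python
import math
from collections import Counter

def entropy_bits(symbols):
    """First-order entropy in bits/symbol."""
    symbols = list(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())

def level1_bits_per_pixel(bands, qstep):
    """Overall entropy in bits/pixel for the four bands of a 1-level transform."""
    total = 0.0
    for band in bands:
        indices = [math.floor(c / qstep + 0.5) for row in band for c in row]
        total += entropy_bits(indices)   # bits per coefficient in this band
    return total / 4                     # each coefficient covers 4 pixels

# Hypothetical tiny bands: only the Lo-Lo band carries any information.
bands = ([[0, 15]], [[0, 0]], [[0, 0]], [[0, 0]])
bpp = level1_bits_per_pixel(bands, qstep=15)
print(f"{bpp:.2f} bits/pixel")
```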
Mistake in Fig. 5 of handout Red Dashed Line is the Histogram. Blue bars represent the “entropies” (ie. - p * log2(p) ) and not vice versa
Calculating the Entropy for Level 2 of the transform • One Level 1 coefficient represents 4 pixels • One level 2 coefficient represents 16 pixels Total Entropy = 1.70 bits/pixel Qstep = 15
Calculating the Entropy for Level 3 of the transform • One Level 1 coefficient represents 4 pixels • One level 2 coefficient represents 16 pixels • One level 3 coefficient represents 64 pixels Total Entropy = 1.62 bits/pixel Qstep = 15
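The weighting above generalises to an n-level transform. As a sketch, assuming that only the Lo-Lo band of the final level is coded (each earlier Lo-Lo band being replaced by the next level of the transform), and writing H^(k) for the entropy in bits/coefficient of a band at level k (notation introduced here, not taken from the slides):

```latex
H_{\text{total}} = \sum_{k=1}^{n} \frac{1}{4^{k}}
  \left( H^{(k)}_{\text{HiLo}} + H^{(k)}_{\text{LoHi}} + H^{(k)}_{\text{HiHi}} \right)
  + \frac{1}{4^{n}} H^{(n)}_{\text{LoLo}}
  \quad \text{bits/pixel}
```

The factor 1/4^k reflects that one level-k coefficient represents 4^k pixels: 4, 16 and 64 for levels 1, 2 and 3.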
Multilevel Haar Xform Qstep = 15
Measuring Performance • Compression Efficiency – Entropy • Reconstruction Quality – Subjective Analysis • Diagram labels: Haar Transform, Quantisation, Quantisation
Reconstruction (Qstep = 30) • Figure panels: Original; Quantised; Haar Transform + Quantisation
Laplacian PDFs • We assume that the histograms are derived from a continuous Laplacian PDF quantised along the intensity (x) axis. • This will give us a theoretical expression for the entropy with respect to the step size and the standard deviation of the image.
GOAL – estimate a theoretical value for the entropy of one of the subbands. • So we can estimate x0 for the band by finding the standard deviation of the coefficient values.
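The equations linking x0 to the standard deviation did not survive in this transcript. Assuming x0 is the decay (scale) parameter of a zero-mean Laplacian PDF, the standard relationship is:

```latex
p(x) = \frac{1}{2x_0}\, e^{-|x|/x_0},
\qquad
\sigma^2 = \int_{-\infty}^{\infty} x^2\, p(x)\, dx = 2x_0^2
\quad\Longrightarrow\quad
x_0 = \frac{\sigma}{\sqrt{2}}
```

So measuring the standard deviation σ of a band's coefficients gives an estimate of x0 directly.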
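Integration limits for the probability of each quantiser level • Level 0 (one side, doubled by symmetry): x1 = 0, x2 = Q/2 • Level k: x1 = (k-1/2)Q, x2 = (k+1/2)Q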
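Integrating a Laplacian PDF p(x) = exp(-|x|/x0) / (2 x0) between these limits gives p_0 = 1 - exp(-Q/(2 x0)) and p_k = sinh(Q/(2 x0)) exp(-kQ/x0) for each of the levels +k and -k. The sketch below sums -p log2(p) over those levels; the function names, the truncation threshold and the example values x0 = 10, Q = 15 and Q = 30 are assumptions for illustration.

```python
import math

def laplacian_level_probs(x0, qstep, tol=1e-15):
    """Probabilities of quantiser levels 0, +1, -1, +2, -2, ... for a
    Laplacian PDF with scale x0 and uniform step size qstep."""
    probs = [1.0 - math.exp(-qstep / (2 * x0))]          # level 0
    k = 1
    while True:
        pk = math.sinh(qstep / (2 * x0)) * math.exp(-k * qstep / x0)
        if pk < tol:                                     # tail is negligible
            break
        probs += [pk, pk]                                # levels +k and -k
        k += 1
    return probs

def laplacian_entropy(x0, qstep):
    """Theoretical entropy in bits/coefficient of the quantised Laplacian."""
    return -sum(p * math.log2(p) for p in laplacian_level_probs(x0, qstep) if p > 0)

h15 = laplacian_entropy(x0=10.0, qstep=15.0)
h30 = laplacian_entropy(x0=10.0, qstep=30.0)
print(f"Q = 15: {h15:.2f} bits/coeff, Q = 30: {h30:.2f} bits/coeff")
```

Doubling the step size pushes more of the probability mass into level 0, so the theoretical entropy drops.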
Measured entropy is less than what we would expect for a Laplacian PDF. This is because the actual decay of the histogram is faster than an exponential decay.
The code is inefficient because level 0 has a probability >> 0.5 (0.8 approx). • Remember the ideal code length: $l_k = -\log_2 p_k$ • So if $p_k = 0.8$, then $l_k = -\log_2 0.8 \approx 0.32$ bits • However, the minimum code length we can use for a symbol is 1 bit. • Therefore, we need to find a new way of coding level 0: use run length coding
RLC coding to create “events” • Coefficient sequence: 13 -5 1 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 • Define the max run of zeros as 8, so that we are coding runs of 1, 2, 4 and 8 zeros • Here we have: 4 x non-zero events • 1 x run-of-4-zeros event • 2 x run-of-2-zeros events • 1 x run-of-8-zeros event • 1 x run-of-1-zero event
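One way to produce these events is to split each run of zeros greedily into the largest allowed run lengths first. The greedy rule is an assumption (the slide does not state how runs are split), but it reproduces the event counts listed above; the function name and the event tuples are illustrative.

```python
from collections import Counter

def rlc_events(coeffs, run_lengths=(8, 4, 2, 1)):
    """Convert a coefficient sequence into non-zero and run-of-zeros events."""
    events = []

    def flush(run):
        # Split a run of zeros into allowed lengths, largest first.
        for length in run_lengths:
            while run >= length:
                events.append(("zeros", length))
                run -= length

    run = 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            flush(run)
            run = 0
            events.append(("value", c))
    flush(run)  # zeros at the end of the sequence
    return events

coeffs = [13, -5, 1, 0, 0, 0, 0, 0, 0, -1] + [0] * 11
events = rlc_events(coeffs)
kinds = Counter(e if e[0] == "zeros" else ("value",) for e in events)
print(events)
```

The run of 6 zeros becomes a run-of-4 plus a run-of-2, and the run of 11 becomes 8 + 2 + 1, so 21 coefficients are represented by 9 events.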
Synchronisation • Say we have a source with symbols A, B and C. • Say we wish to encode the message ABBCCBCABAA using the following code table. • The coded message is therefore 010101111101101000 • Q. What is the decoded message if the 6th bit in the stream is corrupted? i.e. we receive 010100111101101000
Synchronisation • 010100111101101000 • The decoded stream is ABBACCACABAA • The problem is that a 1-bit error causes subsequent symbols to be decoded incorrectly as well. • The stream is said to have lost synchronisation. • A solution is to periodically insert synchronisation symbols into the stream (e.g. one at the start of each row). This limits how far errors can propagate.
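The code table did not survive in this transcript. The sketch below assumes the table A = 0, B = 10, C = 11, chosen because it reproduces the coded message quoted above; the function names are illustrative. It encodes the message, flips the 6th bit and decodes the result.

```python
# Assumed code table (not shown in the transcript); it is consistent
# with the coded message 010101111101101000 given for ABBCCBCABAA.
CODE = {"A": "0", "B": "10", "C": "11"}

def encode(message):
    return "".join(CODE[symbol] for symbol in message)

def decode(bits):
    """Decode a prefix code by matching codewords bit by bit."""
    inverse = {word: symbol for symbol, word in CODE.items()}
    decoded, word = [], ""
    for bit in bits:
        word += bit
        if word in inverse:
            decoded.append(inverse[word])
            word = ""
    return "".join(decoded)

def flip_bit(bits, position):
    """Flip the bit at a 1-based position."""
    i = position - 1
    return bits[:i] + ("1" if bits[i] == "0" else "0") + bits[i + 1:]

sent = encode("ABBCCBCABAA")
received = flip_bit(sent, 6)
print(sent, received, decode(received))
```

Comparing the decoded string with the original shows that the symbols after the corrupted bit are shifted and misread, not just the one symbol containing the error.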