
15-211 Fundamental Structures of Computer Science


Presentation Transcript


  1. 15-211 Fundamental Structures of Computer Science: Lossy Compression. February 18, 2003. Ananda Guna

  2. Announcements • Homework #4 is available • Due on Monday, March 17, 11:59pm • Get started now! • Quiz #2 • Available on Tuesday, Feb. 25 • Some questions will be easier if you have some parts of HW4 working • Read Chapter 8

  3. HW4 is out!

  4. Lossy Compression • Lecture outline • The concept • The formats • Techniques • Discrete Cosine Transform • Wavelets, etc. • An example • Lossy compression using Singular Value Decomposition (SVD) • Why SVD works, and its drawbacks

  5. Data Compression • We have studied two important data compression algorithms: • Huffman coding • The Lempel-Ziv dictionary method • They provide a good introduction to lossless compression. • What if we can compress the image by degrading it a bit? • What techniques are used in the JPEG and GIF compression algorithms? • Lossy compression methods use mathematical theories

  6. The Concept • A data compression technique where some amount of information is lost (mainly redundant or unnecessary information) • MPEG (Moving Picture Experts Group) • Stores only the changes from one frame to another (rather than each frame itself) • Video information is encoded using DCT • MPEG formats • MPEG-1 – 30 fps – VHS quality • MPEG-2 – 60 fps – higher resolution and CD-quality audio • MPEG-4 – based on wavelet technology – smaller files

  7. Techniques • Discrete Cosine Transform (DCT) • Used in the JPEG algorithm • Wavelet-based image compression • Used in MPEG-4

  8. JPEG

  9. JPEG • Joint Photographic Experts Group • Voted an international standard in 1992 • Works well for both color and grayscale images • Many steps in the algorithm • Some requiring sophistication in mathematics • We’ll skip many parts and focus on just the main elements of JPEG

  10. JPEG in a nutshell [pipeline diagram]: the B, G, R planes are converted to Y, I, Q (optional RGB-to-YIQ step); then, for each plane (scan) and each 8x8 block: DCT → Quantize → Zig-zag → DPCM / RLE → Huffman → 11010001…

  11. JPEG in a nutshell [same pipeline diagram; the next slides cover the DCT step]

  12. Linear transform coding • For video, audio, or images, one key first step of the compression will be to encode values over regions of time or space • The basic strategy is to select a set of linear basis functions φi that span the space • sin, cos, wavelets, … • defined at discrete points

  13. Linear transform coding • Coefficients: Θi = Σj aij xj • In matrix notation: Θ = A x • where A is an n×n matrix, and each row defines a basis function
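As a small illustration of this step (our own sketch, not from the slides; the names T, x, and theta are made up for the example), the coefficients are just a matrix-vector product against sampled basis functions:

    % Minimal Matlab sketch: encode a length-n signal x as coefficients
    % theta = T*x, where row i of T samples a cosine basis function.
    n = 8;
    x = rand(n,1);                                 % some signal values
    T = zeros(n,n);
    for i = 0:n-1
        for j = 0:n-1
            T(i+1,j+1) = cos(pi*i*(2*j+1)/(2*n));  % DCT-style basis row
        end
    end
    theta = T * x;                                 % Theta = A x in the slide's notation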

  14. Cosine transform

  15. Discrete Cosine Transform • DCT separates the image into spectral sub-bands of differing importance • With input image A, the output coefficients B are given by: B(k1,k2) = Σi=0..N1-1 Σj=0..N2-1 4 A(i,j) cos[π k1 (2i+1) / (2 N1)] cos[π k2 (2j+1) / (2 N2)] • N1 and N2 give the image’s height and width
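To make the formula concrete, here is a direct (and deliberately slow) Matlab sketch that evaluates it with nested loops, assuming the 4·A(i,j) scaling written above; in practice a fast transform would be used instead:

    % Compute DCT coefficients B(k1,k2) for an N1-by-N2 block A,
    % straight from the formula above.
    A = rand(8)*255;                       % a sample 8x8 block
    [N1,N2] = size(A);
    B = zeros(N1,N2);
    for k1 = 0:N1-1
        for k2 = 0:N2-1
            s = 0;
            for i = 0:N1-1
                for j = 0:N2-1
                    s = s + 4*A(i+1,j+1) ...
                          * cos(pi*k1*(2*i+1)/(2*N1)) ...
                          * cos(pi*k2*(2*j+1)/(2*N2));
                end
            end
            B(k1+1,k2+1) = s;
        end
    end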

  16. Basis functions

  17. JPEG in a nutshell [same pipeline diagram; next: quantization]

  18. Quantization • The purpose of quantization is to encode an entire region of values into a single value • For example, we can simply delete low-order bits: • 101101 could be encoded as 1011 or 101 • When dividing by a power of two, this amounts to deleting whole bits • Other division constants give finer control over bit loss • JPEG uses a standard quantization table

  19. JPEG quantization table (the standard luminance table)
    q = 16  11  10  16  24  40  51  61
        12  12  14  19  26  58  60  55
        14  13  16  24  40  57  69  56
        14  17  22  29  51  87  80  62
        18  22  37  56  68 109 103  77
        24  35  55  64  81 104 113  92
        49  64  78  87 103 121 120 101
        72  92  95  98 112 100 103  99
  Each B(k1,k2) is divided by q(k1,k2). The eye is most sensitive to low frequencies (upper-left), which get the smallest divisors.
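A minimal sketch of the quantize/dequantize round trip, assuming B is an 8x8 block of DCT coefficients and q is the table above (the sample B here is a random stand-in):

    % Quantize: divide each coefficient by its table entry and round.
    q = [16 11 10 16 24 40 51 61; 12 12 14 19 26 58 60 55;
         14 13 16 24 40 57 69 56; 14 17 22 29 51 87 80 62;
         18 22 37 56 68 109 103 77; 24 35 55 64 81 104 113 92;
         49 64 78 87 103 121 120 101; 72 92 95 98 112 100 103 99];
    B  = randn(8)*100;      % stand-in for a block of DCT coefficients
    Bq = round(B ./ q);     % elementwise divide; many entries become 0
    % Dequantize (at decode time): multiply back; the rounded bits are lost.
    Bhat = Bq .* q;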

  20. JPEG in a nutshell [same pipeline diagram; next: the zig-zag scan]

  21. Zig-zag scan • The purpose is to convert the 8x8 block into a 1x64 vector, with the low-frequency coefficients at the front
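One way to generate that ordering (our own construction, not from the slides): number each anti-diagonal, and alternate the traversal direction from one diagonal to the next:

    % Zig-zag scan: produce the 64 linear indices of an 8x8 block in
    % zig-zag order, then read the block out as a 1x64 vector.
    n = 8;
    [c,r] = meshgrid(1:n, 1:n);            % column and row of each entry
    d = r + c;                              % anti-diagonal number
    key = d*n^2 + (mod(d,2)==0).*c + (mod(d,2)==1).*(n+1-c);
    [~, idx] = sort(key(:));                % zig-zag order of indices
    % Given a quantized block Bq: v = Bq(idx).' is the 1x64 vector.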

  22. JPEG in a nutshell [same pipeline diagram; next: DPCM and RLE]

  23. Final stages • The DPCM (differential pulse code modulation) and RLE (run-length encoding) steps take advantage of a common characteristic of many images: • An 8x8 block is often not too different from the previous one • Within a block, there are often long sequences of zeros
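For the zero runs, a bare-bones run-length encoder might look like this (a sketch of the idea only; JPEG's actual entropy coding is more involved). Save it as rle.m:

    function pairs = rle(v)
    % RLE: encode vector v as rows of [value, run length].
    pairs = [];
    i = 1;
    while i <= numel(v)
        j = i;
        while j < numel(v) && v(j+1) == v(i)
            j = j + 1;                     % extend the current run
        end
        pairs = [pairs; v(i), j-i+1];      % record (value, length)
        i = j + 1;
    end
    end

For example, rle([5 0 0 0 0 0 3]) returns [5 1; 0 5; 3 1].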

  24. Data Compression with SVD

  25. Singular Value Decomposition (SVD) • Suppose A is an m×n matrix • We can find a decomposition of the matrix A such that A = U S VT, where • U and V are orthonormal matrices (i.e., U UT = I and V VT = I, where I is the identity matrix) • S is a diagonal matrix, S = diag(s1, s2, s3, …, sk, 0, 0, …, 0), where the si are called the singular values of A and k is the rank of A. It is possible to choose U and V such that s1 ≥ s2 ≥ … ≥ sk • More on SVD in your linear algebra course…
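In Matlab, the built-in svd() computes this decomposition directly; a quick numerical check of the properties listed above:

    A = magic(4);                 % any m-by-n matrix will do
    [U,S,V] = svd(A);
    norm(A - U*S*V')              % ~0: A = U S V'
    norm(U*U' - eye(4))           % ~0: U is orthonormal
    norm(V*V' - eye(4))           % ~0: V is orthonormal
    diag(S)'                      % singular values, s1 >= s2 >= ...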

  26. Here is another way of expressing A • A = s1 U1V1T + s2 U2V2T + … + sk UkVkT, where Ui and Vi are the ith columns of U and V respectively • A bit of knowledge about block matrix multiplication will convince you that this sum is indeed equal to A • So how does this apply to image compression? It is very interesting • Any image is really an m×n matrix of pixels, and in a bitmap color image each pixel is represented by 3 bytes (R, G, B)
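The rank-k sum can be built term by term exactly as written; a short sketch:

    % Rank-k approximation as a sum of rank-1 terms s_i * U_i * V_i'.
    A = magic(4);
    [U,S,V] = svd(A);
    k = 2;
    Ak = zeros(size(A));
    for i = 1:k
        Ak = Ak + S(i,i) * U(:,i) * V(:,i)';
    end
    % Equivalently, in one line: Ak = U(:,1:k) * S(1:k,1:k) * V(:,1:k)';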

  27. Here is a look at it

  28. Consider this color image • This is part of a famous image (Do you know who? Hint: Splay) • The image is a 16x16 bitmap image, enlarged

  29. Here is the Red part of the image

  30. Green Part

  31. Blue Part

  32. The Red matrix representation of the image (16x16 matrix) 173 165 165 165 148 132 123 132 140 156 173 181 181 181 189 173 198 189 189 189 181 165 148 165 165 173 181 198 206 198 181 165 206 206 206 206 198 189 181 181 198 206 206 222 231 214 181 165 231 222 206 198 189 181 181 181 206 222 222 222 231 222 198 181 231 214 189 173 165 165 173 181 181 189 198 222 239 231 206 214 206 189 173 148 148 148 148 165 156 148 165 198 222 231 214 239 181 165 140 123 123 115 115 123 140 148 140 148 165 206 239 247 165 82 66 82 90 82 90 107 123 123 115 132 140 165 198 231 123 198 74 49 57 82 82 99 107 115 115 123 132 132 148 214 239 239 107 82 82 74 90 107 123 115 115 123 115 115 123 198 255 90 74 74 99 74 115 123 132 123 123 115 115 140 165 189 247 99 99 82 90 107 123 123 123 123 123 132 140 156 181 198 247 239 165 132 107 148 140 132 132 123 132 148 140 140 156 214 198 231 165 156 132 156 156 140 140 140 148 148 132 140 156 222 247 239 222 181 181 140 156 140 148 148 148 140 132 156 206 222 214 198 181 181 181 181 173 148 156 148 140 140 165 198 222 239 • So the idea is to apply SVD to this matrix and get a close enough approximation using as fewer columns of U and V as possible. • So for example, if the above matrix has one large dominant singular value, we might be able to get a pretty good approximation using just a single vector each from U and V

  33. The red image, again (the same 16x16 matrix as above). Byte values (0…255) indicate the intensity of the color at each pixel.

  34. Matrix decomposition • Suppose A is an m×n matrix, e.g.:
    A = 120 100 120 100
         10  10  10  10
         60  60  70  80
        150 120 150 150
  • We can decompose A into three matrices U, S, and V, such that A = U S VT

  35. Decomposition example
    A = 120 100 120 100
         10  10  10  10
         60  60  70  80
        150 120 150 150
    U = 0.5709 -0.6772 -0.4532  0.1009     (orthonormal: U UT = I)
        0.0516 -0.0005 -0.1539 -0.9867
        0.3500  0.7121 -0.5984  0.1113
        0.7409  0.1854  0.6425 -0.0615
    S = 386.154    0       0      0        (diagonal, with decreasing
          0      20.6541   0      0         singular values)
          0       0       7.5842  0
          0       0       0      0.9919
    V = 0.5209 -0.5194  0.6004 -0.3137     (orthonormal: V VT = I)
        0.4338 -0.1330 -0.7461 -0.4873
        0.5300 -0.1746 -0.1886  0.8081
        0.5095  0.8259  0.2176 -0.1049

  36. Singular value decomposition • Such a factoring of a matrix, or decomposition, is called an SVD • Exactly how to find U, V, and S is beyond the scope of this course • But you’ll find out in your matrix/linear algebra course… • Note: also very important for graphics/animation algorithms

  37. So what about compression? • Let: • si be the ith singular value in S • Ui be the ith column of U • Vi be the ith column of V • Then another formula for matrix A is: A = s1 U1V1T + s2 U2V2T + … + sk UkVkT

  38. SVD example: A1 = s1 U1 V1T
  Using A, U, S, and V from the decomposition example above (first singular value s1 = 386.154, first columns U1 and V1):
    A1 = 115  96 117 112
          10   9  11  10
          70  59  72  69
         149 124 152 146
  This is called the “rank-1 approximation”.

  39. Apply SVD to a simple matrix • Let’s take a look at applying the SVD to a smaller matrix • This example will allow us to understand what is going on here • A, U, S, and V are as shown in the decomposition example above • Note that the first singular value (386.154) is substantially larger than the others.

  40. Form a rank-1 sum: A1 = s1 U1 V1T
    A1 = 115  96 117 112
          10   9  11  10
          70  59  72  69
         149 124 152 146
  • The error matrix |A - A1| is
          5   4   3  12
          0   1   1   0
         10   1   2  11
          1   4   2   4
  • The error is relatively small with a rank-1 approximation.
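These numbers can be checked in a few lines of Matlab (rounding may differ slightly in the last digit):

    A = [120 100 120 100; 10 10 10 10; 60 60 70 80; 150 120 150 150];
    [U,S,V] = svd(A);
    A1 = S(1,1) * U(:,1) * V(:,1)';   % the rank-1 sum s1*U1*V1'
    E  = abs(A - A1);                  % error matrix |A - A1|
    round(A1)                          % ~ the A1 shown above
    round(E)                           % ~ the error matrix above
    mean(E(:))                         % ~3.8, cf. the Analysis slide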

  41. What have we learnt here? • Perhaps in this case we only need to know one column vector, one row vector, and one singular value to get a pretty good approximation to the original image • So instead of 4x4 = 16 bytes, we can store 4 + 4 + 1 = 9 bytes to get an image fairly “close” to the original (almost 50% savings)

  42. What if we do a rank-2 approximation? A2 = s1 U1 V1T + s2 U2 V2T
    A2 = 122  98 119 100
          10   9  11  10
          62  57  69  81
         147 123 151 149
  • The error matrix |A - A2| is
          2   2   1   0
          0   1   1   0
          2   3   1   1
          3   3   1   1
  • An even smaller error matrix.

  43. Analysis • To get an idea of how close the approximation is to the original matrix, we can calculate: • Mean of the rank-1 error matrix = 3.8125 • Mean of the rank-2 error matrix = 1.3750 • where the mean is the average of all the entries • We really don’t gain much by calculating the rank-2 approximation (why?)

  44. Why is that? • If you look at the first singular value, it is fairly large compared to the others • Therefore, the contribution from the rank-1 sum is very significant compared to the sum of all the other rank-1 terms • So even if you leave out all the other terms, you still get a pretty good approximation with just two vectors. So we are on to something here • It is time to look at some samples

  45. Some Samples (128x128) • Original image: 49K • Rank-1 approx: 825 bytes

  46. Samples ctd… • Rank-8 approx: 7K • Rank-16 approx: 13K

  47. Some size observations • Note that theoretically the sizes of the compressed images should be: • Rank 1 = 54 + (128 + 128 + 1)*3 = 825 bytes • Rank 8 = 54 + (128 + 128 + 1)*3*8 ≈ 6K • Rank 16 = 54 + (128 + 128 + 1)*3*16 ≈ 12K • Rank 32 = 54 + (128 + 128 + 1)*3*32 ≈ 24K • Rank 64 ≈ 48K (pretty close to the original) • Here 54 is the BMP header size, (128 + 128 + 1) counts U1, V1, and s1, and the factor 3 is one byte per color (R, G, B)

  48. Implementation (compression) • [Diagram] The R, G, and B planes of the bitmap are each run through SVD • The compressed file stores U, V, and S for the rank selected, for each of the colors R, G, and B, plus the header bytes • Output: a bit stream (e.g., 001100100 000110000 1100000100 …)

  49. Implementation (decompression) • [Diagram] The compressed file stores U, V, and S for the rank selected for each of the colors R, G, and B, and the BMP header • For each color plane, form the rank sum to reconstruct the plane • Reassemble the planes and the header into a bitmap

  50. Matlab Code for SVD • Matlab is a numerical computing environment (www.mathworks.com) • Here is Matlab code that can perform SVD on an image:

    A = imread('c:\temp\rhino64', 'bmp');
    N = size(A, 1);          % image height
    R = A(:,:,1);            % extract red matrix
    G = A(:,:,2);            % extract green matrix
    B = A(:,:,3);            % extract blue matrix
    % Apply SVD to each of the matrices
    [ur,sr,vr] = svd(double(R));
    [ug,sg,vg] = svd(double(G));
    [ub,sb,vb] = svd(double(B));
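Continuing that sketch with the decompression side (our own code, following the outline on slide 49; the rank k and the output filename are made up for the example):

    k  = 16;                                 % rank to keep
    Rk = ur(:,1:k) * sr(1:k,1:k) * vr(:,1:k)';
    Gk = ug(:,1:k) * sg(1:k,1:k) * vg(:,1:k)';
    Bk = ub(:,1:k) * sb(1:k,1:k) * vb(:,1:k)';
    Ak = uint8(cat(3, Rk, Gk, Bk));          % clamp back to 0..255 bytes
    imwrite(Ak, 'c:\temp\rhino64_rank16.bmp', 'bmp');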
