
Data Compression (1)



  1. Data Compression (1) Hai Tao

  2. Data Compression – Why? • Storing or transmitting multimedia data requires large space or bandwidth • One hour of 44K sample/sec, 16-bit stereo (two-channel) audio takes 3600 x 44000 x 2 x 2 = 633.6 MB, which fills one CD (650 MB). MP3 compression can reduce this by a factor of about 10 • A 500x500 color image is 750 KB uncompressed (JPEG can reduce this by a factor of 10 to 20) • One minute of real-time, full-size color video is 60 x 30 x 640 x 480 x 3 = 1.659 GB, so a two-hour movie requires about 200 GB. MPEG-2 compression can bring this down to 4.7 GB (one DVD)

  3. Compression methods

  4. Run-length coding • Example: a scanline of a binary image is 00000 00000 00000 00000 00010 00000 00000 01000 00000 00000 — a total of 50 bits • However, runs of consecutive 0’s and 1’s can be represented more efficiently: 0(23) 1(1) 0(12) 1(1) 0(13) If each count is represented using 5 bits, the five counts take 5 x 5 = 25 bits; with one extra 5-bit field (e.g., for the starting symbol or the number of runs), the total is 5 + 5 x 5 = 30 bits — a saving of 40% over the original 50 bits
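The run-length idea on this slide can be sketched in a few lines of Python (the function name `run_length_encode` is illustrative, not from the slides):

```python
def run_length_encode(bits):
    """Collapse a string of symbols into (symbol, count) runs."""
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([b, 1])       # start a new run
    return [(sym, n) for sym, n in runs]

# The 50-bit scanline from the slide:
scanline = "0" * 23 + "1" + "0" * 12 + "1" + "0" * 13
runs = run_length_encode(scanline)
# runs == [('0', 23), ('1', 1), ('0', 12), ('1', 1), ('0', 13)]
```

With 5-bit counts, those five runs encode in 25 bits plus any fixed header, matching the slide's 30-bit figure.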

  5. Huffman coding • Example: a language with 4 letters “A”, “B”, “S”, “Z” • To uniquely encode each letter with a fixed-length code, we need two bits: A-00 B-01 S-10 Z-11, so the message “AAABSAAAAZ” is encoded with 20 bits • Now consider assigning A-0 B-100 S-101 Z-11 instead: the same message can be encoded using only 15 bits • The basic idea behind the Huffman coding algorithm is to assign shorter codewords to more frequently used symbols
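The bit counts on this slide can be checked directly; a minimal sketch in Python (variable names are illustrative):

```python
# Fixed-length and variable-length codes from the slide
fixed = {"A": "00", "B": "01", "S": "10", "Z": "11"}
variable = {"A": "0", "B": "100", "S": "101", "Z": "11"}
message = "AAABSAAAAZ"

fixed_bits = "".join(fixed[c] for c in message)        # 2 bits x 10 symbols = 20 bits
variable_bits = "".join(variable[c] for c in message)  # 7x1 + 3 + 3 + 2 = 15 bits
```

Note that the variable-length code is prefix-free (no codeword is a prefix of another), so the 15-bit string still decodes unambiguously.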

  6. Huffman coding – Problem statement • Given a set of N symbols S = {si, i=1,…,N} with probabilities of occurrence Pi, i=1,…,N, find the encoding of the symbols that achieves the minimum transmission rate (bits/symbol) • Example: five symbols A, B, C, D, E with probabilities P(A)=0.16, P(B)=0.51, P(C)=0.09, P(D)=0.13, P(E)=0.11 Without Huffman coding, 3 bits are needed for each symbol

  7. Huffman Coding - Algorithm • Each symbol starts as a leaf node of a tree • Combine the two symbols or composite symbols with the smallest probabilities into a new parent (composite) node whose probability is their sum; assign bits 0 and 1 to the two links • Repeat until all symbols are merged into one root node. The codeword for each symbol is the sequence of 0s and 1s on the path from the root to that symbol • Example
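The merging procedure above can be sketched with a priority queue. This is one plausible implementation, not the slides' own code (`huffman_codes` is an illustrative name):

```python
import heapq
from itertools import count

def huffman_codes(probs):
    """Build a Huffman code for a {symbol: probability} dict by repeated merging."""
    tiebreak = count()  # keeps heap comparisons away from unorderable list payloads
    heap = [(p, next(tiebreak), [s]) for s, p in probs.items()]
    heapq.heapify(heap)
    codes = {s: "" for s in probs}
    while len(heap) > 1:
        p0, _, group0 = heapq.heappop(heap)  # least probable node
        p1, _, group1 = heapq.heappop(heap)  # second least probable node
        for s in group0:                     # prepend 0 on one link...
            codes[s] = "0" + codes[s]
        for s in group1:                     # ...and 1 on the other
            codes[s] = "1" + codes[s]
        heapq.heappush(heap, (p0 + p1, next(tiebreak), group0 + group1))
    return codes

probs = {"A": 0.16, "B": 0.51, "C": 0.09, "D": 0.13, "E": 0.11}
codes = huffman_codes(probs)
```

The exact 0/1 labels depend on how ties and link labels are assigned, so the codewords may differ from the slide's, but the codeword lengths (1 bit for B, 3 bits for the rest) match.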

  8. Huffman Coding - Example • Step 1: merge the two least probable symbols, C (P(C)=0.09) and E (P(E)=0.11), into a composite node CE with P(CE)=0.20, labeling one link 0 and the other 1 • Step 2: merge D (P(D)=0.13) and A (P(A)=0.16) into AD with P(AD)=0.29 • Step 3: merge CE (0.20) and AD (0.29) into ACDE with P(ACDE)=0.49

  9. Huffman Coding - Example • Step 4: merge ACDE (0.49) with B (P(B)=0.51) into the root node, P(ABCDE)=1 • Step 5: read off the codewords from the root: A=000, B=1, C=011, D=001, E=010 • Expected code length is 3 x (0.16+0.09+0.13+0.11) + 1 x 0.51 = 3 x 0.49 + 1 x 0.51 = 1.98 bits/symbol • Compared with the 3-bit fixed-length code, the saving is (3 − 1.98)/3 = 1.02/3 = 34%
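The expected-length arithmetic on this slide can be verified directly (the codeword lengths are taken from the tree built in the previous slides):

```python
probs = {"A": 0.16, "B": 0.51, "C": 0.09, "D": 0.13, "E": 0.11}
lengths = {"A": 3, "B": 1, "C": 3, "D": 3, "E": 3}  # codeword lengths from the tree

# Expected bits per symbol: sum of probability x codeword length
expected = sum(probs[s] * lengths[s] for s in probs)  # 1.98 bits/symbol
saving = (3 - expected) / 3                           # 0.34, i.e., 34% fewer bits
```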
