
Huffman Encoding



Presentation Transcript


  1. Huffman Encoding Veronica Morales

  2. Background • Introduced by David Huffman in 1952 • A method for compressing data by encoding it efficiently • Compression savings are typically between 20% and 90% • Variable-length encoding scheme • Used in digital imaging and video

  3. Fixed-length vs. Variable-length Encoding • Fixed-length • Every character code is composed of the same (“fixed”) number of bits, e.g., ASCII is fixed-length: the original ASCII standard uses 7 bits per character • Variable-length • Character code lengths vary • Huffman encoding uses shorter bit patterns for more common characters and longer bit patterns for less common characters

  4. How does it work? The “greedy” approach • Relies on the frequency of occurrence (probability) of each character to build up an optimal encoding • Each character and its frequency are placed on a leaf node. The two nodes with the smallest frequencies are merged, and their sum becomes the frequency of the new parent node. This process repeats until a single root node holds the sum of all leaf frequencies.
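The greedy merge described above can be sketched with a min-heap in a few lines. This is a minimal illustration, not the slides' own code: a leaf is the tuple `(char, None, None)`, an internal node is `(None, left, right)`, and each heap entry carries a tie-breaking counter so nodes are never compared directly.

```python
import heapq
from collections import Counter

def build_huffman_tree(text):
    """Greedy Huffman construction: repeatedly merge the two
    lowest-frequency nodes until a single root remains."""
    freq = Counter(text)
    # Each entry: (frequency, tie-breaker, node)
    heap = [(f, i, (ch, None, None)) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # smallest frequency
        f2, _, right = heapq.heappop(heap)  # second smallest
        heapq.heappush(heap, (f1 + f2, tiebreak, (None, left, right)))
        tiebreak += 1
    return heap[0][2]  # the root node
```

The heap gives O(log n) extraction of the two minima at every step, which is exactly what the greedy strategy needs.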

  5. Encode: Ileana Streinu Create a table with all characters and their probabilities
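The frequency table for “Ileana Streinu” (the string used on the slides) is a one-liner with `collections.Counter`; this sketch just shows the counting step:

```python
from collections import Counter

text = "Ileana Streinu"
freq = Counter(text)  # per-character counts, e.g. 'e' and 'n' occur twice each
probs = {ch: f / len(text) for ch, f in freq.items()}  # 14 characters total
```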

  6. Make characters leaf-nodes of a tree.

  7. Combine two smallest weights continuously…

  8. …until all nodes are accounted for and we have a single root. The tree is full because every parent has two children. To encode, start from the root and, as you head down to the target letter, record 0 for a left turn and 1 for a right turn.
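The 0-left / 1-right rule translates directly into a recursive walk. The tree below is a small hypothetical example (not the “Ileana Streinu” tree), using the same tuple shape as before: leaves are `(char, None, None)`, internal nodes are `(None, left, right)`.

```python
def assign_codes(node, prefix=""):
    """Walk the tree: append '0' for a left turn, '1' for a right turn."""
    ch, left, right = node
    if ch is not None:                 # leaf: the path so far is its code
        return {ch: prefix or "0"}     # single-symbol edge case
    codes = {}
    codes.update(assign_codes(left, prefix + "0"))
    codes.update(assign_codes(right, prefix + "1"))
    return codes

# Hypothetical tree: 'a' is most frequent, so it sits one level up.
tree = (None, ("a", None, None), (None, ("b", None, None), ("c", None, None)))
codes = assign_codes(tree)  # {'a': '0', 'b': '10', 'c': '11'}
```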

  9. Final tree representation of coding map for “Ileana Streinu”

  10. EXAMPLE 1101110110111001000111 ?

  11. 1101110110111001000111 • 110 – E • 111 – A • 011 – T • 011 – T • 1001 – U • 000 – N • 111 – A
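Because Huffman codes are prefix-free, the bit stream can be decoded greedily left to right: accumulate bits until they match a code, emit that symbol, and start over. This sketch uses the code table from the slide above:

```python
# Code table from the slide (only the symbols the example needs):
codes = {"E": "110", "A": "111", "T": "011", "U": "1001", "N": "000"}
decode_table = {bits: ch for ch, bits in codes.items()}

def decode(bits, table):
    """Greedy prefix-free decoding of a Huffman bit string."""
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in table:       # prefix-free: the first match is the symbol
            out.append(table[buf])
            buf = ""
    return "".join(out)

decode("1101110110111001000111", decode_table)  # 'EATTUNA'
```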

  12. What’s the benefit? Huffman encoding uses 22 bits 1101110110111001000111 • ASCII coding of the same seven characters uses 49 bits 1000101 1000001 1010100 1010100 1010101 1001110 1000001 • Roughly 55% savings in space
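The savings figure is simple arithmetic: seven characters at 7 bits each is 49 bits of ASCII, against 22 Huffman bits, which works out to roughly 55% saved:

```python
huffman_bits = 22
ascii_bits = 7 * 7                 # seven characters at 7 bits each = 49
savings = 1 - huffman_bits / ascii_bits
percent_saved = round(savings * 100)   # ≈ 55
```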

  13. Complexity • Assume n items • Build a priority queue (using the Build-Heap procedure) to identify the two least-frequent objects • O(n)

  14. Build the Huffman Tree • Since we have n leaves, we perform a merge of two nodes n − 1 times, and since each heap operation (extracting the two minimum nodes, then inserting a node) is O(log n), Huffman’s algorithm runs in O(n log n)

  15. Encoding using Huffman Tree • Traversing the tree from root to leaf is • O(log n) on a reasonably balanced tree (in the worst case, a code can be up to n − 1 bits long)

  16. Real Life Application of Huffman Codes • GNU gzip Data Compression • Internet standard for data compression • Consists of • a short header • a number of compressed “blocks” • an 8-byte trailer
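Python’s standard-library `gzip` module produces this same gzip format, so a quick round trip demonstrates the compression in practice. The repeated sample string is illustrative only; repetitive data compresses especially well:

```python
import gzip

data = b"Ileana Streinu " * 100      # deliberately repetitive input
compressed = gzip.compress(data)      # header + compressed blocks + trailer
restored = gzip.decompress(compressed)

assert restored == data               # lossless round trip
assert len(compressed) < len(data)    # repetitive data shrinks substantially
```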

  17. Compressed “Blocks” • Three compressed “block” types: stored, static, dynamic • Static and dynamic blocks use an alphabet that is encoded using Huffman Encoding • http://www.daylight.com/meetings/mug2000/Sayle/gzip.html
