1. Lossless Data Compression and Huffman Codes Sophia Soohoo
CS 157B
2. Lossless Data Compression
Any compression algorithm can be viewed as a function that maps sequences of units into other sequences of units.
Lossless: the original data can be reconstructed exactly from the compressed data.
Lossless is in contrast to lossy data compression, which only allows an approximation of the original data to be reconstructed in exchange for better compression rates.
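The lossless property can be demonstrated with a short round trip through Python's `zlib` (whose DEFLATE format happens to use Huffman coding internally); this is an illustrative sketch, not part of the original slides:

```python
import zlib

# Lossless round trip: decompression recovers the original bytes exactly.
original = b"AAAABBBCCD" * 100
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

assert restored == original   # exact reconstruction -- lossless
print(len(original), len(compressed))   # repetitive input compresses well
```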
3. David A. Huffman BS Electrical Engineering at Ohio State University
Worked as a radar maintenance officer for the US Navy
PhD student, Electrical Engineering at MIT 1952
Was given the choice of writing a term paper or taking a final exam
Paper topic: most efficient method for representing numbers, letters or other symbols as binary code
4. Huffman Coding Uses the minimum number of bits
Variable-length coding – good for data transfer
Different symbols have different lengths
Symbols with the most frequency will result in shorter codewords
Symbols with lower frequency will have longer codewords
“Z” will have a longer code representation than “E” if looking at the frequency of character occurrences in the alphabet
No codeword is a prefix for another codeword!
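The prefix property in the last bullet is what makes decoding unambiguous, and it is easy to check mechanically. A minimal sketch, using a hypothetical codeword table (not taken from the slides):

```python
# Illustrative codeword table: frequent "E" gets a short code, rare "Z" a long one.
codes = {"E": "0", "T": "10", "Z": "110", "Q": "111"}

def is_prefix_free(codewords):
    """Return True if no codeword is a prefix of another codeword."""
    words = sorted(codewords)   # a prefix sorts immediately before its extensions
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))

assert is_prefix_free(codes.values())
assert not is_prefix_free(["0", "01"])   # "0" is a prefix of "01"
```

Sorting makes the check linear after the sort: if any codeword is a prefix of another, it is a prefix of its immediate lexicographic successor.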
5. Decoding
6. Decoding 11
7. Representing a Huffman Table as a Binary Tree Codewords are represented by a binary tree
Each leaf stores a character
Each internal node has two children
Left = 0
Right = 1
The codeword is the path from the root to the leaf storing a given character
The codewords at the leaves of the tree form the prefix code
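The root-to-leaf rule above also drives decoding: follow 0 to the left child and 1 to the right child, emit the character when a leaf is reached, then restart at the root. A minimal sketch with an illustrative three-symbol tree (not the slides' example):

```python
# Leaf = character; internal node = (left, right) tuple.
# This tree encodes A=0, B=10, C=11.
tree = ("A", ("B", "C"))

def decode(bits, root):
    out, node = [], root
    for bit in bits:
        node = node[int(bit)]        # 0 -> left child, 1 -> right child
        if isinstance(node, str):    # reached a leaf: emit its character
            out.append(node)
            node = root              # restart at the root for the next codeword
    return "".join(out)

assert decode("010011", tree) == "ABAC"   # 0 | 10 | 0 | 11
```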
8. Constructing Huffman Codes Goal: construct a prefix code for S: associate each letter i with a codeword wi to minimize the average codeword length:
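The formula on this slide did not survive extraction; the standard objective that Huffman coding minimizes, matching the slide's wording, is presumably:

```latex
% Average codeword length, where p_i is the probability of letter i in S
% and |w_i| is the length (in bits) of its codeword w_i:
L = \sum_{i \in S} p_i \, |w_i|
```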
9. Example
10. Algorithm Make a leaf node for each symbol
Add the occurrence probability of each symbol to its leaf node
Take the two leaf nodes with the smallest probability (pi) and connect them into a new node (which becomes the parent of those nodes)
Add 1 for the right edge
Add 0 for the left edge
The probability of the new node is the sum of the probabilities of the two connecting nodes
If there is only one node left, the code construction is completed. If not, go back to step (2)
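The steps above can be sketched with a min-heap, which always yields the two smallest-probability nodes; the probability table is illustrative, not the slides' example:

```python
import heapq
from itertools import count

def huffman_codes(probs):
    """Build a prefix code from a {symbol: probability} mapping."""
    tick = count()   # tie-breaker so heap tuples never compare dicts
    # Steps 1-2: one leaf node per symbol, keyed by its probability.
    heap = [(p, next(tick), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Step 3: connect the two nodes with the smallest probabilities.
        p0, _, left = heapq.heappop(heap)
        p1, _, right = heapq.heappop(heap)
        # Steps 4-5: prepend 0 on the left edge, 1 on the right edge.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        # Step 6: the new node's probability is the sum of its children's.
        heapq.heappush(heap, (p0 + p1, next(tick), merged))
    return heap[0][2]   # one node left: construction is completed

codes = huffman_codes({"A": 0.5, "B": 0.25, "C": 0.15, "D": 0.10})
print(codes)   # code lengths: A -> 1 bit, B -> 2 bits, C and D -> 3 bits
```

As the slides state, the most frequent symbol ("A" here) ends up with the shortest codeword.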
11. Example
12. Example – Creating the tree
13. Example – Iterate Step 2 Take the two leaf nodes with the smallest probability (pi) and connect them into a new node (which becomes the parent of those nodes)
Green nodes – nodes to be evaluated
White nodes – nodes which have already been evaluated
Blue nodes – nodes which are added in this iteration
14. Example – Iterate Step 2 Note: when two nodes are connected by a parent, the parent should be evaluated in the next iteration
15. Example – Iterate Step 2
16. Example: Completed Tree
17. Example: Table for Huffman Code
18. Practice
19. Practice Solution
20. Questions?
21. References http://www.cstutoringcenter.com/tutorials/algorithms/huffman.php
http://en.wikipedia.org/wiki/Huffman_coding
http://michael.dipperstein.com/huffman/index.html
http://en.wikipedia.org/wiki/David_A._Huffman
http://www.binaryessence.com/dct/en000080.htm