1. Lossless Data Compression and Huffman Codes Sophia Soohoo
CS 157B
2. Lossless Data Compression
Any compression algorithm can be viewed as a function that maps sequences of units into other sequences of units.
Lossless: the original data can be reconstructed exactly from the compressed data.
Lossless is in contrast to lossy data compression, which only allows an approximation of the original data to be reconstructed in exchange for better compression rates.
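The lossless property can be demonstrated with a short round trip through Python's `zlib` (whose DEFLATE format happens to use Huffman coding internally); this is an illustrative sketch, not part of the original slides:

```python
import zlib

# Lossless round trip: decompression recovers the original bytes exactly.
original = b"AAAABBBCCD" * 100
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

assert restored == original   # exact reconstruction -- lossless
print(len(original), len(compressed))   # repetitive input compresses well
```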
3. David A. Huffman BS Electrical Engineering at Ohio State University
Worked as a radar maintenance officer for the US Navy
PhD student, Electrical Engineering at MIT 1952
Was given the choice of writing a term paper or taking a final exam
Paper topic: most efficient method for representing numbers, letters or other symbols as binary code
4. Huffman Coding Uses the minimum number of bits
Variable-length coding – good for data transfer
Different symbols have different lengths
Symbols with the most frequency will result in shorter codewords
Symbols with lower frequency will have longer codewords
“Z” will have a longer code representation than “E” if looking at the frequency of character occurrences in the alphabet
No codeword is a prefix for another codeword!
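The prefix property in the last bullet is what makes decoding unambiguous, and it is easy to check mechanically. A minimal sketch, using a hypothetical codeword table (not taken from the slides):

```python
# Illustrative codeword table: frequent "E" gets a short code, rare "Z" a long one.
codes = {"E": "0", "T": "10", "Z": "110", "Q": "111"}

def is_prefix_free(codewords):
    """Return True if no codeword is a prefix of another codeword."""
    words = sorted(codewords)   # a prefix sorts immediately before its extensions
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))

assert is_prefix_free(codes.values())
assert not is_prefix_free(["0", "01"])   # "0" is a prefix of "01"
```

Sorting makes the check linear after the sort: if any codeword is a prefix of another, it is a prefix of its immediate lexicographic successor.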
5. Decoding
6. Decoding 11
7. Representing a Huffman Table as a Binary Tree Codewords are represented by a binary tree
Each leaf stores a character
Each internal node has two children
Left = 0
Right = 1
The codeword is the path from the root to the leaf storing a given character
The codewords at the leaves of the tree form the prefix code
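The root-to-leaf rule above also drives decoding: follow 0 to the left child and 1 to the right child, emit the character when a leaf is reached, then restart at the root. A minimal sketch with an illustrative three-symbol tree (not the slides' example):

```python
# Leaf = character; internal node = (left, right) tuple.
# This tree encodes A=0, B=10, C=11.
tree = ("A", ("B", "C"))

def decode(bits, root):
    out, node = [], root
    for bit in bits:
        node = node[int(bit)]        # 0 -> left child, 1 -> right child
        if isinstance(node, str):    # reached a leaf: emit its character
            out.append(node)
            node = root              # restart at the root for the next codeword
    return "".join(out)

assert decode("010011", tree) == "ABAC"   # 0 | 10 | 0 | 11
```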
8. Constructing Huffman Codes Goal: construct a prefix code for S: associate each letter i with a codeword wi to minimize the average codeword length:
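The formula on this slide did not survive extraction; the standard objective that Huffman coding minimizes, matching the slide's wording, is presumably:

```latex
% Average codeword length, where p_i is the probability of letter i in S
% and |w_i| is the length (in bits) of its codeword w_i:
L = \sum_{i \in S} p_i \, |w_i|
```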
9. Example
10. Algorithm Make a leaf node for each symbol
Add the occurrence probability of each symbol to its leaf node
Take the two leaf nodes with the smallest probability (pi) and connect them into a new node (which becomes the parent of those nodes)
Add 1 for the right edge
Add 0 for the left edge
The probability of the new node is the sum of the probabilities of the two connecting nodes
If there is only one node left, the code construction is completed. If not, go back to step (2)
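The steps above can be sketched with a min-heap, which always yields the two smallest-probability nodes; the probability table is illustrative, not the slides' example:

```python
import heapq
from itertools import count

def huffman_codes(probs):
    """Build a prefix code from a {symbol: probability} mapping."""
    tick = count()   # tie-breaker so heap tuples never compare dicts
    # Steps 1-2: one leaf node per symbol, keyed by its probability.
    heap = [(p, next(tick), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Step 3: connect the two nodes with the smallest probabilities.
        p0, _, left = heapq.heappop(heap)
        p1, _, right = heapq.heappop(heap)
        # Steps 4-5: prepend 0 on the left edge, 1 on the right edge.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        # Step 6: the new node's probability is the sum of its children's.
        heapq.heappush(heap, (p0 + p1, next(tick), merged))
    return heap[0][2]   # one node left: construction is completed

codes = huffman_codes({"A": 0.5, "B": 0.25, "C": 0.15, "D": 0.10})
print(codes)   # code lengths: A -> 1 bit, B -> 2 bits, C and D -> 3 bits
```

As the slides state, the most frequent symbol ("A" here) ends up with the shortest codeword.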
11. Example
12. Example – Creating the tree
13. Example – Iterate Step 2 Take the two leaf nodes with the smallest probability (pi) and connect them into a new node (which becomes the parent of those nodes)
Green nodes – nodes to be evaluated
White nodes – nodes which have already been evaluated
Blue nodes – nodes which are added in this iteration
14. Example – Iterate Step 2 Note: when two nodes are connected by a parent, the parent should be evaluated in the next iteration
15. Example – Iterate Step 2
16. Example: Completed Tree
17. Example: Table for Huffman Code
18. Practice
19. Practice Solution
20. Questions?
21. References http://www.cstutoringcenter.com/tutorials/algorithms/huffman.php
http://en.wikipedia.org/wiki/Huffman_coding
http://michael.dipperstein.com/huffman/index.html
http://en.wikipedia.org/wiki/David_A._Huffman
http://www.binaryessence.com/dct/en000080.htm