United International University Department of Computer Science and Engineering
Presentation On: Lossless Data Compression – Huffman and Shannon-Fano
Data compression In computer science, data compression is the technique of reducing the number of binary digits required to represent data.
Why Data Compression? • To save the memory space occupied by files • To reduce cost by using minimal memory storage devices, especially for databases • To handle data more effectively • For faster data transfer • Amazingly effective for web services
Data compression Types of data compression: • Lossless • Lossy
Lossy Data Compression "Lossy" compression is a data encoding method that compresses data by discarding some of it.
Lossless Data Compression Lossless data compression is a class of data compression algorithms that allows the exact original data to be reconstructed from the compressed data.
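As a quick aside (not from the slides), the round-trip property of lossless compression can be demonstrated with Python's built-in zlib module; the sample string here is illustrative:

```python
import zlib

original = b"ENGINEERING ENGINEERING ENGINEERING"
compressed = zlib.compress(original)    # lossless DEFLATE compression
restored = zlib.decompress(compressed)  # exact reconstruction

assert restored == original             # lossless: nothing was lost
print(len(original), "->", len(compressed), "bytes")
```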
Lossless: • Shannon-Fano • Huffman
In today's presentation we are going to focus on lossless compression methods.
Claude Elwood Shannon (1916-2001) • Born: April 30, 1916, Petoskey, Michigan, United States • Died: February 24, 2001 (aged 84), Medford, Massachusetts, United States • Residence: United States • Nationality: American • Fields: Mathematics and electronic engineering • Institutions: Bell Laboratories, Massachusetts Institute of Technology, Institute for Advanced Study • Known for: Information theory, Shannon–Fano coding, Shannon–Hartley law, Nyquist–Shannon sampling theorem, noisy-channel coding theorem, Shannon switching game, Shannon number, Shannon index, Shannon's source coding theorem, Shannon's expansion, Shannon–Weaver model of communication, Whittaker–Shannon interpolation formula • Notable awards: IEEE Medal of Honor, Kyoto Prize, Harvey Prize (1972)
Robert Mario Fano • Born: 1917 (age 94–95), Turin, Italy • Citizenship: United States • Fields: Computer science, information theory • Institutions: Massachusetts Institute of Technology • Known for: Shannon–Fano coding, founder of Project MAC • Thesis: Theoretical Limitations on the Broadband Matching of Arbitrary Impedances (1947) • Notable awards: Shannon Award, 1976; IEEE Fellow, 1954
In the field of data compression, Shannon–Fano coding is a technique for constructing a prefix code based on a set of symbols and their probabilities (estimated or measured). It is suboptimal in the sense that, unlike Huffman coding, it does not always achieve the lowest possible expected codeword length.
Main Features • Statistical compression method • Two-pass compression method • Semi-adaptive compression method • Asymmetric compression method • Compression results are generally worse than Huffman coding's • Optimality of output is not guaranteed
We are going to discuss the process of Shannon-Fano coding in four steps: Frequency Count, Sorting, Code Generation, and Code Conversion.
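A minimal sketch of the first two steps in Python (the sample word ENGINEERING anticipates the example used later in the deck):

```python
from collections import Counter

text = "ENGINEERING"   # illustrative input
freq = Counter(text)   # step 1: frequency count
# step 2: sort symbols by frequency, most frequent first
symbols = sorted(freq.items(), key=lambda kv: kv[1], reverse=True)
print(symbols)         # [('E', 3), ('N', 3), ('G', 2), ('I', 2), ('R', 1)]
```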
[Slide figures: the sorted symbol list is split in half repeatedly, with 0 assigned to every symbol in the left part and 1 to every symbol in the right part; each split appends one more bit to the growing codes.]
Code Replacement (Source.txt → Code.txt)
Code Conversion (Code.txt → Compress.txt)
Algorithm of Shannon-Fano Coding • For a given list of symbols, develop a corresponding list of probabilities or frequency counts so that each symbol's relative frequency of occurrence is known. • Sort the list of symbols according to frequency, with the most frequently occurring symbols at the left and the least common at the right. • Divide the list into two parts, with the total frequency counts of the left part being as close to the total of the right as possible. • The left part of the list is assigned the binary digit 0, and the right part is assigned the digit 1. This means that the codes for the symbols in the first part will all start with 0, and the codes in the second part will all start with 1. • Recursively apply steps 3 and 4 to each of the two halves, subdividing groups and adding bits to the codes until each symbol corresponds to a code leaf on the tree. (A code sketch follows.)
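Here is a compact sketch of this algorithm in Python; the function name, the tie-breaking at the split point, and the sample frequencies are my own choices, not from the slides:

```python
def shannon_fano(symbols, prefix="", codes=None):
    """Assign codes to (symbol, frequency) pairs, already sorted by
    descending frequency, by recursively splitting the list in two."""
    if codes is None:
        codes = {}
    if len(symbols) == 1:
        codes[symbols[0][0]] = prefix or "0"  # a lone symbol still needs one bit
        return codes
    total = sum(f for _, f in symbols)
    running, split, best_diff = 0, 1, total
    # find the split where the two halves' frequency totals are closest
    for i in range(1, len(symbols)):
        running += symbols[i - 1][1]
        diff = abs(total - 2 * running)
        if diff < best_diff:
            best_diff, split = diff, i
    shannon_fano(symbols[:split], prefix + "0", codes)  # left half gets 0
    shannon_fano(symbols[split:], prefix + "1", codes)  # right half gets 1
    return codes

codes = shannon_fano([("E", 3), ("N", 3), ("G", 2), ("I", 2), ("R", 1)])
print(codes)  # {'E': '00', 'N': '01', 'G': '10', 'I': '110', 'R': '111'}
```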
David Albert Huffman (August 9, 1925 – October 7, 1999)
Awards and Honors 1955 The Louis E. Levy Medal 1973 The W. Wallace McDowell Award 1981 Charter recipient of the Computer Pioneer Award 1998 A Golden Jubilee Award 1999 The IEEE Richard W. Hamming Medal
Mechanism of Huffman Coding Example word: ENGINEERING Frequency count: E:3, N:3, G:2, I:2, R:1
HUFFMAN TREE CONSTRUCTION
Initial leaves, sorted: R:1, G:2, I:2, E:3, N:3
Step 1: merge R:1 + G:2 → :3 (remaining: I:2, E:3, N:3, :3)
Step 2: merge I:2 + :3 → :5 (remaining: E:3, N:3, :5)
Step 3: merge E:3 + N:3 → :6 (remaining: :5, :6)
Step 4: merge :5 + :6 → root :11
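The merge sequence can be reproduced with Python's heapq module. This is a sketch; ties between equal frequencies may be broken differently than on the slides, yielding a different but equally optimal tree:

```python
import heapq
from itertools import count

freqs = {"E": 3, "N": 3, "G": 2, "I": 2, "R": 1}
tiebreak = count()  # keeps heap entries comparable when frequencies tie
heap = [(f, next(tiebreak), s) for s, f in freqs.items()]
heapq.heapify(heap)

while len(heap) > 1:
    f1, _, left = heapq.heappop(heap)   # lowest-frequency node
    f2, _, right = heapq.heappop(heap)  # next-lowest node
    print(f"merge {left}:{f1} + {right}:{f2} -> {f1 + f2}")
    heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
```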
CODE GENERATION (label each left branch 0 and each right branch 1, then read codes from root to leaf)
Resulting codes: I = 00, R = 010, G = 011, E = 10, N = 11
CODE REPLACEMENT
E = 10, N = 11, I = 00, R = 010, G = 011
ENGINEERING → 10 11 011 00 11 10 10 010 00 11 011
CODE CONVERSION
Bit stream: 1011011001110100100011011 (25 bits)
Grouped into bytes: 10110110 01110100 10001101, plus one leftover bit (1)
Decimal values: 182, 116, 141 → written out as ASCII characters, so the 11-character word is stored in about 4 bytes
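The byte grouping can be checked with a short sketch; padding the final partial byte with zeros on the right is an assumption here, not something the slides specify:

```python
bits = "1011011001110100100011011"      # ENGINEERING encoded, 25 bits
padded = bits + "0" * (-len(bits) % 8)  # assumed zero-padding to a byte boundary
byte_values = [int(padded[i:i + 8], 2) for i in range(0, len(padded), 8)]
print(byte_values)                      # [182, 116, 141, 128]
```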
Algorithm of Huffman Coding • Create a leaf node for each symbol, weighted by its frequency of occurrence, and add it to a priority queue. • While there is more than one node in the queue: • Remove the two nodes of lowest probability or frequency from the queue. • Prepend 0 and 1 respectively to any code already assigned to these nodes. • Create a new internal node with these two nodes as children and with probability equal to the sum of the two nodes' probabilities. • Add the new node to the queue. • The remaining node is the root node and the tree is complete. (A code sketch follows.)
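A minimal runnable sketch of this algorithm, carrying partial codes in dictionaries so that the "prepend 0 and 1" step is explicit; the names are mine, and frequency ties may be broken differently than in the slides (the total code length stays optimal):

```python
import heapq
from collections import Counter
from itertools import count

def huffman_codes(text):
    tiebreak = count()  # keeps heap entries comparable on frequency ties
    heap = [(f, next(tiebreak), {s: ""}) for s, f in Counter(text).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # two lowest-frequency nodes
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}         # prepend 0 to left codes
        merged.update({s: "1" + c for s, c in right.items()})  # prepend 1 to right codes
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

codes = huffman_codes("ENGINEERING")
encoded = "".join(codes[ch] for ch in "ENGINEERING")
print(codes, len(encoded), "bits")  # 25 bits in total, matching the slides
```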
Our First Test Case "I failed in some subjects in exam, but my friend passed in all. Now he is an engineer in Microsoft and I am the owner of Microsoft." Most inspiring words from Bill Gates!!! ;) Let's see what happens if we apply Huffman and Shannon-Fano.
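For readers following along, the two sketch functions defined earlier (shannon_fano and huffman_codes, both hypothetical implementations rather than the presenters' code) can be applied to this test case:

```python
from collections import Counter

sentence = ("I failed in some subjects in exam, but my friend passed in all. "
            "Now he is an engineer in Microsoft and I am the owner of Microsoft.")

freq = sorted(Counter(sentence).items(), key=lambda kv: kv[1], reverse=True)
sf_codes = shannon_fano(freq)        # Shannon-Fano sketch from earlier
hf_codes = huffman_codes(sentence)   # Huffman sketch from earlier

sf_bits = sum(len(sf_codes[ch]) for ch in sentence)
hf_bits = sum(len(hf_codes[ch]) for ch in sentence)
print(f"original: {len(sentence) * 8} bits, "
      f"Shannon-Fano: {sf_bits} bits, Huffman: {hf_bits} bits")
```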