Abdullah Aldahami (11074595), April 6, 2010. EE800 Term Project: Huffman Coding
Huffman Coding: Introduction • Huffman Coding is a simple algorithm that generates a set of variable-length codes with the minimum average code length. • Huffman codes are part of several data formats, such as ZIP, MPEG and JPEG. • The code is generated from the estimated probability of occurrence of each symbol. • Huffman coding works by creating an optimal binary tree of nodes, which can be stored in a regular array.
Huffman Coding: Technique • The method starts by building a list of all the alphabet symbols in descending order of their probabilities (frequencies of appearance). • It then constructs a tree from the bottom up. • Step by step, the two symbols with the smallest probabilities are selected and merged into a new parent node whose probability is their sum; the new node re-enters the list. • When the tree is complete, the code of each symbol is assigned by reading the edge labels (0 or 1) on the path from the root to its leaf.
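The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own implementation; the function name `huffman_codes` and the exact example text are assumptions:

```python
import heapq
from collections import Counter

def huffman_codes(freqs):
    """Build a Huffman code from a {symbol: frequency} map."""
    # Heap entries: (frequency, tie_breaker, tree); a tree is either a
    # symbol (leaf) or a (left, right) tuple (internal node).
    heap = [(f, i, sym) for i, (sym, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)   # the two smallest probabilities...
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, tie, (t1, t2)))  # ...merged into a parent
        tie += 1
    # Walk the finished tree, labelling left edges 0 and right edges 1.
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"   # single-symbol edge case
    walk(heap[0][2], "")
    return codes

text = "circuit elements in digital computations"   # assumed example text
codes = huffman_codes(Counter(text))
```

Because every symbol sits at a leaf, no code is a prefix of another, which is what makes the bit stream decodable without separators.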
Huffman Coding: Technique • Example: the 40-character string “circuit elements in digital computations” • The sum of the symbol frequencies (the total number of characters) is 40
Huffman Coding: Technique • Example (continued): the character frequencies are i: 6, t: 5, ‘ ‘ (space): 4, c: 3, e: 3, n: 3, a: 2, l: 2, m: 2, o: 2, s: 2, u: 2, d: 1, g: 1, p: 1, r: 1 • [Figure: the Huffman tree built bottom-up from these frequencies; the root has weight 40, and each edge is labelled 0 or 1]
Huffman Coding: Technique • The code of each symbol is then read off the tree as the sequence of edge labels from the root to its leaf. • The total is 154 bits with Huffman coding, compared to 240 bits (40 symbols × 6 bits each) with no compression.
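As a rough cross-check of the quoted totals, the sketch below computes the optimal coded length for the (assumed) example string using the identity that the total equals the sum of the internal node weights created during merging. Note that the exact figure read off the particular tree drawn in the slides (154 bits) can differ by a bit or two from the strictly optimal total this produces:

```python
import heapq
from collections import Counter

text = "circuit elements in digital computations"   # assumed example text
heap = list(Counter(text).values())
heapq.heapify(heap)

# Total encoded length = sum of the merged (internal) node weights,
# so no explicit tree needs to be built for this check.
total_bits = 0
while len(heap) > 1:
    a, b = heapq.heappop(heap), heapq.heappop(heap)
    total_bits += a + b
    heapq.heappush(heap, a + b)

fixed_bits = len(text) * 6   # the slide's baseline: a fixed 6-bit code per symbol
```

Either way, the Huffman total comes in well under the 240-bit fixed-length baseline.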
Huffman Coding: Technique • Entropy is a measure defined in information theory that quantifies the average information content of an information source. • Entropy provides a yardstick for judging the success of a data compression process: no lossless code can average fewer bits per symbol than the source entropy.
Huffman Coding: Technique • The sum of the probability budgets 2^(−l) across all symbols (the Kraft sum) is always less than or equal to one. In this example the sum equals one; as a result, the code is termed a complete code. • Huffman coding here reaches 98.36% of the optimum: (3.787 / 3.85) × 100, where 3.787 bits is the entropy and 3.85 = 154 / 40 bits is the average code length.
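The entropy and efficiency figures quoted above can be reproduced directly from the symbol probabilities. A short sketch, assuming the example string from the earlier slides:

```python
import math
from collections import Counter

text = "circuit elements in digital computations"   # assumed example text
n = len(text)

# Shannon entropy: the theoretical minimum average bits per symbol.
entropy = -sum((f / n) * math.log2(f / n) for f in Counter(text).values())

# Efficiency as quoted on the slide: entropy over the average code length.
avg_code_len = 154 / 40                # slide figure: 154 bits over 40 symbols
efficiency = entropy / avg_code_len    # fraction of the theoretical optimum
```

With these frequencies the entropy works out to about 3.787 bits/symbol, matching the slide's figure.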
Huffman Coding: Variants • Static probability distribution (static Huffman coding) • Coding procedures with static Huffman codes operate with a predefined code tree, defined in advance for a class of data and independent of the particular contents. • The primary problem of a static, predefined code tree arises when the real probability distribution differs strongly from the assumptions: in that case the compression rate decreases drastically.
Huffman Coding: Variants • Adaptive probability distribution (adaptive Huffman coding) • The adaptive coding procedure uses a code tree that is continuously adapted to the previously encoded or decoded data, starting from an empty tree or a standard distribution. • This variant is characterized by its minimal requirements for header data, but the attainable compression rate is unfavourable at the beginning of the coding and for small files.
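The adaptive idea can be illustrated with a deliberately naive sketch that rebuilds the code after every symbol from the counts seen so far; since a decoder tracking the same counts derives the same codes, no code table needs to be transmitted. Real adaptive Huffman coders (the FGK and Vitter algorithms) update the tree incrementally instead of rebuilding it; the function names here are my own:

```python
import heapq
from collections import Counter

def build_codes(counts):
    """Standard Huffman construction from a {symbol: count} map."""
    heap = [(f, i, s) for i, (s, f) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, tie, (t1, t2)))
        tie += 1
    codes = {}
    def walk(t, p):
        if isinstance(t, tuple):
            walk(t[0], p + "0")
            walk(t[1], p + "1")
        else:
            codes[t] = p or "0"
    walk(heap[0][2], "")
    return codes

def adaptive_encode(text, alphabet):
    counts = Counter({s: 1 for s in alphabet})  # a "standard distribution" start
    out = []
    for ch in text:
        out.append(build_codes(counts)[ch])  # code reflects the data seen so far
        counts[ch] += 1                      # then adapt the model
    return "".join(out)
```

The early codes are poor because the counts are still near-uniform, which is exactly the startup weakness the slide describes.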