1. Dictionary Techniques
Mei-Chen Yeh
2. Introduction
Huffman codes, arithmetic codes
assume a sequence of independent symbols
Dictionary methods
identify frequently (and infrequently) occurring patterns
encode the two kinds of patterns with different methods
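3. Example
A32 = {26 lowercase letters , . ! ? ; :}
Uncompressed case
4-symbol block => 5*4 = 20 bits per block
Dictionary method: select the 256 most frequent patterns
A one-bit flag marks whether a block is in the dictionary: 1 + 8 = 9 bits if it is, 1 + 20 = 21 bits if it is not
Average bits per block: 9p + 21(1-p) = 21 - 12p
p: P(encounter a pattern from the dictionary)
21 - 12p < 20 => p > 1/12 ≈ 0.083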
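The break-even calculation above can be checked with a small sketch. The function and parameter names are illustrative, and the one-bit flag per block is the assumption that yields the 9-bit and 21-bit costs.

```python
import math

def average_bits_per_block(p, dict_entries=256, block_symbols=4, alphabet_size=32):
    """Expected bits per block when a 1-bit flag says 'dictionary hit' or 'raw block'."""
    raw_bits = block_symbols * math.ceil(math.log2(alphabet_size))  # 4 * 5 = 20
    index_bits = math.ceil(math.log2(dict_entries))                 # 8 bits for 256 entries
    hit_cost = 1 + index_bits    # flag + dictionary index = 9
    miss_cost = 1 + raw_bits     # flag + uncompressed block = 21
    return p * hit_cost + (1 - p) * miss_cost

def break_even_probability(dict_entries=256, block_symbols=4, alphabet_size=32):
    """Smallest p at which the dictionary scheme matches the uncompressed cost."""
    raw_bits = block_symbols * math.ceil(math.log2(alphabet_size))
    hit_cost = 1 + math.ceil(math.log2(dict_entries))
    miss_cost = 1 + raw_bits
    return (miss_cost - raw_bits) / (miss_cost - hit_cost)  # 1 / 12

print(average_bits_per_block(0.5))   # halfway between 9 and 21
print(break_even_probability())      # about 0.0833
```

Any p above the printed threshold makes the dictionary scheme cheaper on average than sending every block uncompressed.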
4. Static Dictionary
Application-specific
Digram Coding
stores letter pairs
Example: A = {a, b, c, d, r}
Encode abracadabra
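A minimal sketch of digram coding for this alphabet follows. The slide does not list the dictionary contents, so the three digrams (ab, ac, ad) and the 3-bit codes are assumptions chosen only for illustration: five single letters plus three pairs fill an 8-entry dictionary.

```python
def digram_encode(text, dictionary):
    """Greedy digram coding: try the next two symbols first, else fall back to one."""
    codes = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in dictionary:
            codes.append(dictionary[pair])      # the pair is a dictionary entry
            i += 2
        else:
            codes.append(dictionary[text[i]])   # single-letter entry
            i += 1
    return codes

# Hypothetical 8-entry dictionary: all single letters plus three assumed digrams.
ENTRIES = ["a", "b", "c", "d", "r", "ab", "ac", "ad"]
DICTIONARY = {entry: format(index, "03b") for index, entry in enumerate(ENTRIES)}

codes = digram_encode("abracadabra", DICTIONARY)
print(codes)            # ['101', '100', '110', '111', '101', '100', '000']
print(3 * len(codes))   # 21 bits in total
```

With this assumed dictionary the 11 letters are sent as 7 codewords of 3 bits each, compared with 33 bits for 11 single-letter codewords.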
6. Adaptive Dictionary
LZ77 (1977)
LZ78 (1978)
LZW: UNIX compress, GIF
7. LZ77
<o, l, c>
o: offset, the distance of the pointer from the look-ahead buffer
l: length of match
c: codeword for the symbol in the look-ahead buffer that follows the match
The number of bits required to code the triplet is ?
Why encode c? In case there is no match in the search buffer!
8. LZ77: Encode
9. LZ77: Decode
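The encode and decode steps can be sketched as below. This is an illustrative implementation under stated assumptions, not the version shown on the slides: the buffer sizes (`search_size`, `lookahead_size`) are arbitrary defaults, triples are kept as Python tuples instead of being packed into bits, and among equally long matches the one farthest back is kept.

```python
def lz77_encode(data, search_size=7, lookahead_size=6):
    """Return a list of (offset, length, next_symbol) triples."""
    triples = []
    pos = 0
    while pos < len(data):
        best_offset, best_length = 0, 0
        start = max(0, pos - search_size)
        # Leave at least one symbol after the match to send as c.
        max_length = min(lookahead_size - 1, len(data) - pos - 1)
        for candidate in range(start, pos):
            length = 0
            # The match may run past pos into the look-ahead buffer.
            while length < max_length and data[candidate + length] == data[pos + length]:
                length += 1
            if length > best_length:
                best_offset, best_length = pos - candidate, length
        triples.append((best_offset, best_length, data[pos + best_length]))
        pos += best_length + 1
    return triples

def lz77_decode(triples):
    """Rebuild the text by copying earlier output, then appending c."""
    out = []
    for offset, length, symbol in triples:
        start = len(out) - offset
        for k in range(length):
            out.append(out[start + k])  # one symbol at a time so overlapping copies work
        out.append(symbol)
    return "".join(out)

message = "cabracadabrarrarrad"
triples = lz77_encode(message)
print(triples)
print(lz77_decode(triples) == message)  # True
```

When nothing in the search buffer matches, the triple is (0, 0, c), which is why the symbol c is always transmitted.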
10. LZ78
Addresses the worst-case situation for LZ77 (a pattern that recurs farther back than the search buffer reaches)
No search buffer; instead, build a dictionary: <o, l> -> i
<i, c>
i: index in the dictionary
c: codeword for the symbol that follows the matched portion
11. LZ78: Example
Monster song from Sesame Street
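The <i, c> scheme can be sketched as follows. The slide's own worked example is not reproduced in the text, so the input string here is only an illustrative stand-in. Using index 0 for "no match" and numbering entries from 1 are assumptions of this sketch.

```python
def lz78_encode(text):
    """Return a list of (index, symbol) pairs; index 0 means no matched prefix."""
    dictionary = {}   # phrase -> index, starting at 1
    pairs = []
    phrase = ""
    for symbol in text:
        if phrase + symbol in dictionary:
            phrase += symbol                      # keep extending the match
        else:
            pairs.append((dictionary.get(phrase, 0), symbol))
            dictionary[phrase + symbol] = len(dictionary) + 1
            phrase = ""
    if phrase:
        # Input ended inside a known phrase: send its prefix index and last symbol.
        prefix_index = dictionary[phrase[:-1]] if len(phrase) > 1 else 0
        pairs.append((prefix_index, phrase[-1]))
    return pairs

def lz78_decode(pairs):
    """Rebuild the same dictionary on the fly while decoding."""
    phrases = [""]    # index 0 is the empty phrase
    out = []
    for index, symbol in pairs:
        phrase = phrases[index] + symbol
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)

sample = "wabba wabba wabba wabba woo woo woo"   # illustrative input only
pairs = lz78_encode(sample)
print(pairs)
print(lz78_decode(pairs) == sample)  # True
```

The decoder never receives the dictionary; it adds the same entry the encoder added after every pair.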
12. In case the dictionary is full
Freeze
Delete the least used items
Progressively doubled
Erase the dictionary (reset)
13. Variation on LZ78: LZW (Encode)
14. LZW (Decode)
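LZW drops the symbol c and sends only dictionary indices, which requires the dictionary to start with every single symbol of the alphabet. A minimal sketch follows, assuming the alphabet is passed in explicitly and indices start at 0 (both are choices of this sketch, not taken from the slides).

```python
def lzw_encode(text, alphabet):
    """Return a list of dictionary indices."""
    dictionary = {symbol: index for index, symbol in enumerate(alphabet)}
    phrase = ""
    output = []
    for symbol in text:
        if phrase + symbol in dictionary:
            phrase += symbol
        else:
            output.append(dictionary[phrase])
            dictionary[phrase + symbol] = len(dictionary)  # new entry: match + next symbol
            phrase = symbol
    if phrase:
        output.append(dictionary[phrase])
    return output

def lzw_decode(codes, alphabet):
    """Rebuild the text, growing the dictionary one step behind the encoder."""
    if not codes:
        return ""
    phrases = list(alphabet)
    previous = phrases[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code < len(phrases):
            entry = phrases[code]
        else:
            # Special case: the index refers to the entry still being built.
            entry = previous + previous[0]
        phrases.append(previous + entry[0])
        out.append(entry)
        previous = entry
    return "".join(out)

codes = lzw_encode("ababab", ["a", "b"])
print(codes)                          # [0, 1, 2, 2]
print(lzw_decode(codes, ["a", "b"]))  # ababab
```

The `else` branch in the decoder handles the case where the encoder uses an entry immediately after creating it, as happens for input such as "aaaa".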
15. Applications (1)
UNIX compress command
Based on LZW
Adaptive dictionary size
512 in the beginning (9 bits for transmitting an index)
Double the size if filled up (512 -> 1024 -> 2048 -> ...)
If the maximal size is reached, flush the dictionary or do nothing (a static dictionary), depending on the compression ratio
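The link between dictionary size and index width can be sketched as follows. The `maximum` parameter is a placeholder for illustration; the slide does not state the actual upper limit.

```python
def capacity_schedule(initial=512, maximum=2048):
    """List (dictionary capacity, bits per index) pairs as the dictionary doubles."""
    schedule = []
    capacity = initial
    while capacity <= maximum:
        bits = (capacity - 1).bit_length()   # 512 entries need 9-bit indices
        schedule.append((capacity, bits))
        capacity *= 2
    return schedule

print(capacity_schedule())   # [(512, 9), (1024, 10), (2048, 11)]
```

Each doubling costs one extra bit per transmitted index, so small inputs benefit from starting with short indices.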
16. Applications (2)
The Graphics Interchange Format (GIF)
Graphical images
First byte: the number of bits b per pixel in the image
Example: 8 for grayscale images
Clear code: the binary number 2^b
Resets the compression/decompression parameters
Initial dictionary size: 2^(b+1)
Doubled when filled up until it reaches 4096 entries, after which it becomes a static dictionary
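The quantities on this slide follow directly from b. A small sketch is shown below; the function name and the returned dictionary keys are illustrative, not part of any GIF library.

```python
def gif_lzw_parameters(bits_per_pixel):
    """Derive the LZW parameters described above from b, the bits per pixel."""
    clear_code = 1 << bits_per_pixel            # 2^b
    initial_size = 1 << (bits_per_pixel + 1)    # 2^(b+1)
    sizes = []
    size = initial_size
    while size <= 4096:                         # the dictionary stops growing at 4096
        sizes.append(size)
        size *= 2
    return {"clear_code": clear_code, "initial_size": initial_size, "sizes": sizes}

print(gif_lzw_parameters(8))
# {'clear_code': 256, 'initial_size': 512, 'sizes': [512, 1024, 2048, 4096]}
```

For an 8-bit grayscale image the clear code is 256 and indices start at 9 bits, growing to 12 bits at 4096 entries.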
17. Applications (3)
18. Dennis Ritchie (Sep. 9, 1941 - Oct. 12, 2011)
The creator of C and co-creator of Unix
Received the Turing Award in 1983
Co-wrote the book The C Programming Language
Steve Jobs, who died Oct. 5, 2011
He named his creation C because the programming language that came before it was called B.
Ph.D. from Harvard; worked at Bell Labs for over four decades
19. Dennis Ritchie (Sep. 9, 1941 - Oct. 12, 2011)
Quotes
C is quirky, flawed, and an enormous success.
UNIX is very simple, it just needs a genius to understand its simplicity.