010.141 Engineering Mathematics II Lecture 16: Compression • Bob McKay • School of Computer Science and Engineering, College of Engineering, Seoul National University
Outline • Lossless Compression • Huffman & Shannon-Fano • Arithmetic Compression • The LZ Family of Algorithms • Lossy Compression • Fourier compression • Wavelet Compression • Fractal Compression
Lossless Compression • Lossless encoding methods guarantee to reproduce exactly the same data as was input to them
Relative Encoding • Useful when the data consist of successive runs that vary only slightly from one run to the next: • e.g. the lines of a fax • The position of each change is denoted relative to the start of the line • A position indicator can be followed by a numeric count giving the number of successive changes • For further compression, the position of each change can be denoted relative to the previous change
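For illustration only (not part of the original slides), a minimal Python sketch of this idea: each line of pixels is reduced to its starting value plus the positions of changes, each given relative to the previous change.

    def relative_encode(line):
        # Encode a line of pixel values as (starting value, change offsets),
        # each offset measured from the previous change
        changes = []
        prev_pos = 0
        for pos in range(1, len(line)):
            if line[pos] != line[pos - 1]:        # a change occurs here
                changes.append(pos - prev_pos)    # offset from the previous change
                prev_pos = pos
        return line[0], changes

    print(relative_encode([0, 0, 0, 1, 1, 0, 0, 0, 0, 1]))
    # (0, [3, 2, 4]): the line starts at 0 and changes after 3, then 2, then 4 pixels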
Statistical Compression • For the examples below, we will use a simple alphabet with the following frequencies of occurrence (after Held)
Huffman Encoding • Arrange the character set in order of decreasing probability • While there is more than one probability class: • Merge the two lowest probability classes and add their probabilities to obtain a composite probability • At each branch of the binary tree, allocate a '0' to one branch and a '1' to the other • The code for each character is found by traversing the tree from the root node to that character
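As a concrete illustration (not from the original slides), here is a minimal Python sketch of Huffman code construction; the alphabet and probabilities are placeholders rather than the Held frequency table used in the lecture.

    import heapq

    def huffman_codes(probs):
        # Heap entries: (probability, tie-breaker, tree); a tree is either a
        # symbol (leaf) or a (left, right) pair (internal node)
        heap = [(p, i, sym) for i, (sym, p) in enumerate(probs.items())]
        heapq.heapify(heap)
        count = len(heap)
        while len(heap) > 1:
            p1, _, t1 = heapq.heappop(heap)   # the two lowest-probability classes
            p2, _, t2 = heapq.heappop(heap)
            heapq.heappush(heap, (p1 + p2, count, (t1, t2)))  # merged composite class
            count += 1
        codes = {}
        def walk(tree, prefix):
            if isinstance(tree, tuple):
                walk(tree[0], prefix + '0')   # '0' on one branch
                walk(tree[1], prefix + '1')   # '1' on the other
            else:
                codes[tree] = prefix or '0'
        walk(heap[0][2], '')
        return codes

    print(huffman_codes({'X1': 0.4, 'X2': 0.3, 'X3': 0.2, 'X4': 0.1}))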
Shannon-Fano Algorithm • Arrange the character set in order of decreasing probability • While a probability class contains more than one symbol: • Divide the probability class in two • so that the probabilities in the two halves are as nearly as possible equal • Assign a '1' to the first probability class, and a '0' to the second
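For comparison, a minimal Shannon-Fano sketch (again with placeholder probabilities, not the lecture's table): the sorted symbol list is split recursively so that the two halves have as nearly equal total probability as possible.

    def shannon_fano(symbols):
        # symbols: list of (symbol, probability) in decreasing probability order
        if len(symbols) == 1:
            return {symbols[0][0]: ''}
        total = sum(p for _, p in symbols)
        # choose the split that makes the two halves' probabilities most nearly equal
        split = min(range(1, len(symbols)),
                    key=lambda i: abs(total - 2 * sum(p for _, p in symbols[:i])))
        codes = {}
        for sym, code in shannon_fano(symbols[:split]).items():
            codes[sym] = '1' + code          # '1' for the first class
        for sym, code in shannon_fano(symbols[split:]).items():
            codes[sym] = '0' + code          # '0' for the second class
        return codes

    print(shannon_fano([('X1', 0.4), ('X2', 0.3), ('X3', 0.2), ('X4', 0.1)]))
    # {'X1': '1', 'X2': '01', 'X3': '001', 'X4': '000'}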
Arithmetic Coding • Arithmetic coding assumes there is a model for statistically predicting the next character of the string to be encoded • An order-0 model predicts the next symbol based on its overall probability, independent of previous characters • For example, an order-0 model of English predicts the highest probability for ‘e’ • An order-1 model predicts the next symbol based on the preceding character • For example, if the preceding character is ‘q’, then ‘u’ is a likely next character • And so on for higher-order models • for example, an order-3 model might predict ‘y’ as a likely continuation of ‘ert’, giving ‘erty’
Arithmetic Coding • Arithmetic coding assumes the coder and decoder share the probability table • The main data structure of arithmetic coding is an interval, representing the string constructed so far • Its initial value is [0,1] • At each stage, the current interval [min,max] is subdivided into sub-intervals corresponding to the probability model for the next character • The interval chosen will be the one representing the actual next character • The more probable the character, the larger the interval • The coder output is a number in the final interval
Arithmetic Coding • Suppose we want to encode the string X1X3X7 • After X1, our interval is [0,0.1] • After X3, it is [0.015,0.035] • After X7, it is [0.033,0.035] • The natural output to choose is the shortest binary fraction in [0.033,0.035] • Obviously, the algorithm as stated requires infinite precision • Slight variants re-normalise at each stage to remain within computer precision
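A minimal order-0 encoder sketch in Python (illustrative only; the probabilities below are placeholders, not the slide's table): at each step the current interval is subdivided in proportion to the symbol probabilities and narrowed to the sub-interval of the actual next symbol.

    def arithmetic_encode(message, probs):
        lo, hi = 0.0, 1.0                        # initial interval [0, 1]
        for ch in message:
            width = hi - lo
            cum = 0.0
            for sym, p in probs.items():         # subdivide [lo, hi] by probability
                if sym == ch:
                    hi = lo + width * (cum + p)  # narrow to this symbol's sub-interval
                    lo = lo + width * cum
                    break
                cum += p
        return lo, hi                            # any number in [lo, hi) encodes the message

    probs = {'a': 0.5, 'b': 0.3, 'c': 0.2}       # illustrative probabilities
    print(arithmetic_encode('bac', probs))       # approximately (0.62, 0.65)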
Substitutional Compression • The basic idea behind a substitutional compressor is to replace an occurrence of a particular phrase with a reference to a previous occurrence • There are two main classes of schemes, LZ77 and LZ78 • Named after Jacob Ziv and Abraham Lempel, who first proposed them in 1977 and 1978
LZW • LZW is an LZ78-based scheme designed by Terry Welch in 1984 • LZ78 schemes work by putting phrases into a dictionary • when a repeat occurrence of a particular phrase is found, outputting the dictionary index instead of the phrase • LZW starts with a 4K dictionary • entries 0-255 refer to individual bytes • entries 256-4095 refer to substrings • Each time a new code is generated it means a new string has been parsed • New strings are generated by adding the current character K to the end of an existing string w (until the dictionary is full)
LZW Algorithm
  set w = NIL
  loop
    read a character K
    if wK exists in the dictionary
      w = wK
    else
      output the code for w
      add wK to the string table
      w = K
  endloop
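For illustration (not part of the slides), a runnable Python sketch of this encoder; the 4K limit and byte-level initial entries follow the description above.

    def lzw_encode(data):
        # entries 0-255: individual bytes; 256 onward: parsed substrings
        dictionary = {bytes([i]): i for i in range(256)}
        w = b''
        output = []
        for byte in data:
            wk = w + bytes([byte])
            if wk in dictionary:
                w = wk                        # extend the current phrase
            else:
                output.append(dictionary[w])  # emit the code for the known phrase
                if len(dictionary) < 4096:    # stop adding once the 4K table fills
                    dictionary[wk] = len(dictionary)
                w = bytes([byte])
        if w:
            output.append(dictionary[w])      # flush the final phrase
        return output

    print(lzw_encode(b'TOBEORNOTTOBEORTOBEORNOT'))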
LZW • The most remarkable feature of this type of compression is that the entire dictionary reaches the decoder without ever being explicitly transmitted • At the end of the run, the decoder will have a dictionary identical to the encoder's, built up entirely as part of the decoding process • Codings in this family are behind such representations as .gif • They were previously under patent, but many of the relevant patents are now expiring
Lossy Compression • Lossy compression algorithms do not guarantee to reproduce the original input • They achieve much higher compression by discarding only what is ‘near enough’ that the loss is barely detectable • Usually, this means detectable by a human sense - sight (jpeg), hearing (mp3), motion understanding (mp4) • This requires a model of what is acceptable • The model may only be accurate in some circumstances • Which is why compressing a text or line drawing with jpeg is a bad idea
Fourier Compression (jpeg) • The Fourier transform of a dataset is a frequency representation of that dataset • You have probably already seen graphs of Fourier transforms • the frequency diagram of a sound sample is a graph of the Fourier transform of the original data, just as the time/amplitude diagram is a graph of the data itself
Fourier Compression • From our point of view, the important features of the Fourier transform are: • it is invertible • the original dataset can be rebuilt from the Fourier transform • graphic images of the world usually contain spatially repetitive information patterns • Human senses are (usually) poor at detecting low-amplitude visual frequencies • The Fourier transform usually has information concentrated at particular frequencies, and depleted at others • The depleted frequencies can be transmitted at low precision without serious loss of overall information
Discrete Cosine Transform • A discretised version of the Fourier transform • Suited to representing spatially quantised (i.e. raster) images in a frequency quantised (i.e. tabular) format • Mathematically, the DCT of a function f ranging over a discrete variable x (omitting various important constants) is given by • F(n) = Σ_x f(x) cos(nπx) • Of course, we’re usually interested in two-dimensional images, and hence need the two-dimensional DCT, given (omitting even more important constants) by • F(m,n) = Σ_x Σ_y f(x,y) cos(mπx) cos(nπy)
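A direct (unoptimised) NumPy sketch of the 2-D transform, for illustration only: it uses the standard DCT-II argument cos(πm(x + 1/2)/N), i.e. it restores some of the constants the simplified formulas above omit, while still leaving out the usual normalisation factors.

    import numpy as np

    def dct2(f):
        N, M = f.shape
        x = np.arange(N).reshape(N, 1)
        y = np.arange(M).reshape(M, 1)
        F = np.empty((N, M))
        for m in range(N):
            for n in range(M):
                # F(m,n) = sum_x sum_y f(x,y) cos(.) cos(.)
                cx = np.cos(np.pi * m * (x + 0.5) / N)   # cosine over x
                cy = np.cos(np.pi * n * (y + 0.5) / M)   # cosine over y
                F[m, n] = np.sum(f * (cx @ cy.T))
        return F

    block = np.random.rand(8, 8)        # a typical 8x8 JPEG-style block
    print(dct2(block).round(2))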
Fourier Compression Revisited • Fourier-related transforms are based on sine (or cosine) functions of various frequencies • The transform is a record of how to add together the periodic functions to obtain the original function • Really, all we need is a basis set of functions • A set of functions that can generate all others
The Haar Transform • Instead of periodic functions, we could instead add together discrete step functions (the original slide pictures a family of square pulses of varying width and position, each jumping up and then back down) • This would give us the Haar transform • It can also be used to compress image data, though not as efficiently as the DCT • images compressed at the same rate as the DCT tend to look ‘blocky’, so less compression must be used to give the same visual impression
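A minimal sketch of one level of the discrete Haar transform on a 1-D signal (illustrative, not from the slides): pairwise averages give a half-resolution approximation and pairwise differences give the detail; repeating the step on the averages yields the full multi-level transform.

    def haar_step(signal):
        # signal length assumed even for this sketch
        averages = [(a + b) / 2 for a, b in zip(signal[0::2], signal[1::2])]
        details  = [(a - b) / 2 for a, b in zip(signal[0::2], signal[1::2])]
        return averages, details

    avg, det = haar_step([4, 6, 10, 12, 8, 6, 5, 5])
    print(avg, det)   # [5.0, 11.0, 7.0, 5.0] [-1.0, -1.0, 1.0, 0.0]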
Wavelet Compression • Wavelet compression uses a basis set intermediate between the Fourier and Haar transforms • The functions are ‘smoothed’ versions of the Haar functions • They have a sinusoidal rather than square shape • They don’t die out abruptly at the edges • instead they decay gradually to lower amplitude • Wavelet compression can give very high ratios • attributed to similarities between wavelet functions and the edge detection present in the human retina • wavelet functions encode just the detail that we see best
Vector Quantisation • Relies on building a codebook of similar image portions • Only one copy of the similar portions is transmitted • Just as LZ compression relies on building a dictionary of strings seen so far • just transmitting references to the dictionary
Fractal Compression • Relies on self-similarity of (parts of) the image to reduce transmission • It stands in a similar relation to vector quantisation methods as LZW does to LZ • LZW can be thought of as LZ in which the dictionary is derived from the part of the text seen so far • fractal compression can be viewed as deriving its dictionary from the portion of the image seen so far
Compression Times • For transform encodings such as DCT or wavelet • compression and decompression times are roughly comparable • For fractal compression • Compression takes orders of magnitude longer than decompression • Difficult to find the right codebook • Fractal compression is well suited where pre-canned images will be accessed many times over
Summary • Lossless Compression • Huffman & Shannon-Fano • Arithmetic Compression • The LZ Family of Algorithms • Lossy Compression • Fourier compression • Wavelet Compression • Fractal Compression