360 likes | 597 Views
SWE 423: Multimedia Systems. Chapter 7: Data Compression (1). References. Chapter 7 from our Textbook: “Multimedia Fundamentals: Media Coding and Content Processing” Slides from the reference book: Fundamentals of Multimedia . Outline. Introduction Motivation for compression
E N D
SWE 423: Multimedia Systems Chapter 7: Data Compression (1)
References • Chapter 7 from our Textbook: “Multimedia Fundamentals: Media Coding and Content Processing” • Slides from the reference book: Fundamentals of Multimedia
Outline • Introduction • Motivation for compression • Coding requirements • Compression types • General Data Compression Scheme • Compression Techniques • Entropy Encoding • Run Length Encoding • Huffman Coding
Introduction • Video and audio have much higher storage requirements than text • Data transmission rates (in terms of bandwidth requirements) for sending continuous media are considerably higher than text • Efficient compression of audio and video data, including some compression standards, will be considered in this chapter
Motivation for Compression • Terminology • 1 kbit = 1000 bit • 1 Kbit = 1024 bit (= 210) • 1 Mbit = 1024 x 1024 bit (= 210 * 210 = 220) • Discrete Data: Considering a small window of 640 x 480 pixels on a display • Text • Vector Image • Bitmap Image • Continuous Data: Required storage space per second • Uncompressed speech of telephone quality • Uncompressed stereo audio signal of CD quality • Video sequence
Motivation for Compression: Discrete Data • Text • Assuming 2 bytes are used for every 8 x 8 pixel character, • Character per screen page = ... • Storage required per screen page = ... • Vector Image • Assuming that a typical image consists of 500 lines, each of which is defined by its coordinates in the x direction and the y direction, and an 8-bit attribute field • Coordinates in the x direction require ... • Coordinates in the y direction require ... • Bits per line = ... • Storage required per screen page • Bitmap Image • Assuming using 256 colors requiring a single byte per pixel • Storage required per screen page = ...
Motivation for Compression: Continuous Data • Uncompressed speech of telephone quality • Assuming being sampled at 8 kHz and quantized using 8 bit per sample yielding a data stream of 64 Kbit/second • Storage space required per second = ... • Uncompressed stereo audio signal of CD quality • Assuming being sampled at 44.1 kHz and quantized using 16 bits • Data rate = ... • Storage space required per second = ...
Motivation for Compression: Continuous Data • Video sequence • Assuming 25 full frames per second, luminance and chrominance of each pixel are coded using 3 bytes, luminance sampled at 13.5 MHz while chrominance (R-Y and B-Y) is sampled at 6.75 MHz, each, and samples are uniformly coded using 8 bits. • Bandwidth = ... • Data Rate = ... • Storage space required per second = ...
Motivation for Compression: Continuous Data • Processing uncompressed video data streams requires • Storage space in the gigabyte • Buffer space in the megabyte • Data transfer rates of 140 Mbit/s [per unidirectional connection] • These requirements can be considerably lowered by employing compression
Can Multimedia Data be Significantly Compressed? • Redundancy can be exploited to do compression • Spatial redundancy • correlation between neighboring pixels in image/video • Spectral redundancy • correlation among colors • Psycho-visual redundancy • Perceptual properties of human visual system
What Makes “Good” Compression • Quality of compressed and decompressed data should be as good as possible • Compression/decompression process should be as simple as possible • Decompression time must not exceed certain thresholds • [De]/Compression requirements can be divided into • Dialogue mode (video conferencing) • Retrieval mode (digital libraries) • Both
Coding Requirements: Dialogue Mode • End-to-end delay does not exceed 150 ms for compression and decompression alone. • Ideally, compression and decompression should not exceed 50ms in order to ensure natural dialogue. • In addition • delay in the network, • communications protocol processing in the end system, • data transfer to and from the respective input and output devices.
Coding Requirements: Retrieval Mode • Fast forward and fast rewind with simultaneous display (or playback) of the data should be possible • Random access to single images or audio passages in a data stream should be possible in less than 0.5 s. • Maintains interaction aspects in retrieval systems • Decompression of images, video or audio passages should be possible without interpreting all preceding data. • Allows random access and editing
Coding Requirements: Both Modes • Support display of the same data in different systems • Formats have to be independent of frame size and video frame rate • Audio and video compression should support different data rates at different qualities • Precisely synchronize audio and video • Support for economical solution • Software • Few VLSI chips • Enable cooperation of different systems • Data generated on a multimedia system can be reproduced on another system (e.g. course materials).
Compression Types • Physical versus logical Compression • Physical • Performed on data regardless of what information it contains • Translates a series of bits to another series of bits • Logical • Knowledge-based • e.g. United Kingdom to UK • Spatial Compression – 2D or single image • Temporal Compression – 3D or video • Codec – Compression / Decompression • Color / intensity … same thing
Compression Types • Symmetric • Compression and decompression roughly use the same techniques and take just as long • Data transmission which requires compression and decompression on-the-fly will require these types of algorithms • Asymmetric • Most common is where compression takes a lot more time than decompression • In an image database, each image will be compressed once and decompressed many times • Less common is where decompression takes a lot more time than compression • Creating many backup files which will hardly ever be read
Compression Types • Non-adaptive • Contain a static dictionary of predefined substrings to encode which are known to occur with high frequency • Adaptive • Dictionary is built from scratch
Compression Types • Lossless • decompress(compress(data)) = data • Used for computer data, medical images, etc. • Lossy • decompress(compress(data)) data • Some distortion • A small change in pixel values may be invisible • Suited for audio and video
General Data Compression Scheme Encoder (compression) Input Data Codes / Codewords Storage or Networks Codes / Codewords Decoder (decompression) B0 = # bits required before compression B1 = # bits required after compression Compression Ratio = B0 / B1. Output Data
Compression Techniques • Entropy Coding • Semantics of the information to be encoded are ignored • Lossless compression technique • Can be used for different media regardless of their characteristics • Source Coding • Takes into account the semantics of the information to be encoded. • Often lossy compression technique • Characteristics of medium are exploited • Hybrid Coding • Most multimedia compression algorithms are hybrid techniques
Entropy Encoding • Information theory is a discipline in applied mathematics involving the quantification of data with the goal of enabling as much data as possible to be reliably stored on a medium and/or communicated over a channel. • According to Claude E. Shannon, the entropy (eta) of an information source with alphabet S = {s1, s2, ..., sn} is defined as where pi is the probability that symbol si in S will occur.
Entropy Encoding • In science, entropy is a measure of the disorder of a system. • More entropy means more disorder • Negative entropy is added to a system when more order is given to the system. • The measure of data, known as information entropy, is usually expressed by the average number of bits needed for storage or communication. • The Shannon Coding Theorem states that the entropy is the best we can do (under certain conditions). i.e., for the average length of the codewords produced by the encoder, l’, l’
Entropy Encoding • Example 1: What is the entropy of an image with uniform distributions of gray-level intensities (i.e. pi = 1/256 for all i)? • Example 2: What is the entropy of an image whose histogram shows that one third of the pixels are dark and two thirds are bright?
Entropy Encoding: Run-Length • Data often contains sequences of identical bytes. Replacing these repeated byte sequences with the number of occurrences reduces considerably the overall data size. • Many variations of RLE • One form of RLE is to use a special marker M-byte that will indicate the number of occurrences of a character • “c”!# • How many bytes are used above? When do you think the M-byte should be used? • ABCCCCCCCCDEFGGG is encoded as ABC!8DEFGGG • What if the string contains the “!” character? • How much is the compression ratio for this example Note: This encoding is DIFFERENT from what is mentioned in your book
Entropy Encoding: Run-Length • Many variations of RLE : • Zero-suppression: In this case, one character that is repeated very often is the only character used in the RLE. In this case, the M-byte and the number of additional occurrences are stored. • When do you think the M-byte should be used, as opposed to using the regular representation without any encoding?
Entropy Encoding: Run-Length • Many variations of RLE : • If we are encoding black and white images (e.g. Faxes), one such version is as follows: (row#, col# run1 begin, col# run1 end, col# run2 begin, col# run2 end, ... , col# runk begin, col# runk end) (row#, col# run1 begin, col# run1 end, col# run2 begin, col# run2 end, ... , col# runr begin, col# runr end) ... (row#, col# run1 begin, col# run1 end, col# run2 begin, col# run2 end, ... , col# runs begin, col# runs end)
Entropy Encoding: Huffman Coding • One form of variable length coding • Greedy algorithm • Has been used in fax machines, JPEG and MPEG
Entropy Encoding: Huffman Coding Algorithm huffman Input: A set C = {c1 , c2 , ... , cn}of n characters and their frequencies {f(c1), f(c2 ) , ... , f(cn )} Output: A Huffman tree (V, T) for C. 1. Insert all characters into a min-heap H according to their frequencies. 2. V = C; T = {} 3. for j = 1 to n – 1 4. c = deletemin(H) 5. c’ = deletemin(H) • f(v) = f(c) + f(c’) // v is a new node • Insert v into the minheap H • Add (v,c) and (v,c’) to tree T making c and c’ children of v in T 9. end for
Entropy Encoding: Huffman Coding • Example
Entropy Encoding: Huffman Coding • Most important properties of Huffman Coding • Unique Prefix Property: No Huffman code is a prefix of any other Huffman code • For example, 101 and 1010 cannot be Huffman codes. Why? • Optimality: The Huffman code is a minimum-redundancy code (given an accurate data model) • The two least frequent symbols will have the same length for their Huffman code, whereas symbols occurring more frequently will have shorter Huffman codes • It has been shown that the average code length of an information source S is strictly less than + 1, i.e. l’ < + 1
Entropy Encoding: Adaptive Huffman Coding • The Huffman method assumes that the frequencies of occurrence of all the symbols of the alphabet are known apriori. • This is rarely the case in practice • Semi-adaptive Huffman coding has been employed where data is read twice, the first pass being to determine the frequencies • Disadvantage: Too slow for real-time applications • Another solution is Adaptive Huffman Coding • Employed by Unix’s “compact” program.
Entropy Encoding: Adaptive Huffman Coding • Decoder “mirrors” the operations of the encoder, as they both may occur at different times • Main idea of the algorithm is as follows • Encoder and Decoder both start with an empty Huffman Coding Tree • No symbol is assigned codes yet. • First symbol read is written on the output stream in its uncompressed form • In fact, each uncompressed character being read for the first time is read this way • That is why we need an escape character to determine when we read an uncompressed character for the first time. • This escape character is denoted by NEW and given frequency 0 all the time • This symbol is then added to the tree and a code is assigned to it.
Entropy Encoding: Adaptive Huffman Coding • Next time this symbol is encountered, its code will be written in the output screen and its frequency is increased by 1. • Since this modifies the tree, it is checked whether it is a Huffman tree or not • If not, it will be rearranged, through swaps, and new codes will be assigned • Sibling Property • Must be preserved during swaps • All nodes are arranged in the order of increasing counts, left to right and bottom to top. • During a swap, the farthest node with count N is swapped with the node whose count has just been increased to N + 1.