Multimedia Communications EG 371 and EG 348
Dr Matthew Roach, dr.matt.roach@iee.org
Lecture 3: Fundamentals of compression
Data vs. information
• Different quantities of data can be used to represent the same information (compare a person who babbles with one who is succinct)
• Redundancy: a representation is redundant if it contains data that is not necessary
• Compression ratio: CR = n1 / n2, where n1 and n2 are the amounts of data in the original and the compressed representation
• Relative data redundancy: RD = 1 - 1/CR
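A minimal sketch of these two definitions in Python (the byte counts are made-up, illustrative values):

```python
def compression_ratio(n1, n2):
    """CR = n1 / n2: original data size over compressed data size."""
    return n1 / n2

def relative_redundancy(cr):
    """RD = 1 - 1/CR: the fraction of the original data that is redundant."""
    return 1 - 1 / cr

# Hypothetical example: a 10 MB raw frame compressed to 2 MB
cr = compression_ratio(10_000_000, 2_000_000)   # 5.0
rd = relative_redundancy(cr)                    # 0.8, i.e. 80% of the data was redundant
print(cr, rd)
```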
Types of redundancy
• Coding redundancy: the grey levels of a frame are coded in a way that uses more symbols than necessary
• Inter-pixel redundancy: the value of any pixel can be guessed from its neighbours
• Psycho-visual redundancy: some information is less important than other information in normal visual processing
• Data compression is achieved when one or all of these forms of redundancy are reduced or removed
• Data is the means by which information is conveyed
Inter-pixel redundancy, IPR
• Correlation between pixels is not exploited by straightforward coding
• The correlation is due to the geometry and structure of the scene
• The value of any pixel can largely be predicted from the values of its neighbours, so the information carried by an individual pixel is small
• Take the 2D visual information and transform it into a NON-VISUAL format; this is called a MAPPING
• A REVERSIBLE MAPPING allows the original to be reconstructed after the mapping (a toy example follows below)
• Run-length coding is one such mapping
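As a toy illustration of a reversible mapping that exploits inter-pixel correlation, here is a previous-pixel differencing sketch (a differential scheme chosen purely for illustration, not the run-length mapping itself):

```python
# Reversible mapping exploiting inter-pixel correlation:
# keep the first pixel, then store only the difference from the previous pixel.
def to_differences(row):
    return [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]

def from_differences(diffs):
    row = [diffs[0]]
    for d in diffs[1:]:
        row.append(row[-1] + d)
    return row

row = [100, 101, 101, 102, 150]          # neighbouring pixels are highly correlated
mapped = to_differences(row)             # [100, 1, 0, 1, 48] -> mostly small values
assert from_differences(mapped) == row   # the mapping is reversible
```

Because most of the mapped values are small, they can be coded with fewer bits; this is also the idea behind the differential encoding mentioned under entropy encoding later.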
Psycho-visual redundancy, PVR
• Due to the properties of the human eye: the eye does not respond with equal sensitivity to all visual information (e.g. the R, G and B channels)
• Certain information has less relative importance; if it is eliminated, the quality of the image is relatively unaffected
• This is because the human visual system (HVS) is only sensitive to around 64 grey levels
• Use fidelity criteria to assess the loss of information
Coding redundancy
• Histograms can be used to construct codes
• Variable-length coding reduces the number of bits and removes coding redundancy: fewer bits for grey levels with high probability, more bits for levels with low probability
• This takes advantage of the probabilities of events
• Video frames are made of regular, predictably shaped objects that are larger than the pixel elements, so certain grey levels are more probable than others, i.e. histograms are NON-UNIFORM
• Natural binary coding assigns the same number of bits to every grey level, so coding redundancy is not minimised (see the sketch below)
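A small sketch of that point, comparing a fixed-length code with a variable-length code over a made-up, non-uniform set of grey-level probabilities (the probabilities and codewords are illustrative, not from the slides):

```python
# Hypothetical non-uniform probabilities over 4 grey levels, and an
# illustrative variable-length code giving shorter codewords to likelier levels.
probs = {0: 0.60, 1: 0.25, 2: 0.10, 3: 0.05}
code  = {0: "0",  1: "10", 2: "110", 3: "111"}

fixed_bits   = 2                                              # 4 levels -> 2 bits each
average_bits = sum(p * len(code[l]) for l, p in probs.items())

print(fixed_bits, average_bits)   # 2 vs 1.55 bits per pixel on average
```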
Entropy encoding
• Run-length encoding (RLE): lossless
• Huffman encoding: lossless
• Differential encoding: used in JPEG encoding
Run length coding (RLC)
• Represents strings of symbols in an image matrix (used in FAX machines)
• Records only the areas that belong to the object in the image; the areas are represented as a list of lists
• Each video frame row is described by a sublist: the first element is the row number and the subsequent terms are coordinate pairs, where the first element of a pair is the beginning of a run and the second is its end
• There can be several such runs in each row
• Also used in multiple-brightness images, where the brightness of each run is recorded in the sublist as well (a sketch of the binary case follows below)
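A sketch of that list-of-lists representation for a binary image (the function name row_runs and the [row, (start, end), ...] layout are illustrative, following the description above rather than any FAX standard):

```python
def row_runs(image):
    """For each row of a binary image, return a sublist
    [row_number, (run_start, run_end), ...] for every run of 1s."""
    coded = []
    for r, row in enumerate(image):
        sublist, start = [r], None
        for c, v in enumerate(row):
            if v and start is None:            # a run begins
                start = c
            elif not v and start is not None:  # a run ends
                sublist.append((start, c - 1))
                start = None
        if start is not None:                  # a run reaches the end of the row
            sublist.append((start, len(row) - 1))
        if len(sublist) > 1:                   # keep only rows containing object pixels
            coded.append(sublist)
    return coded

print(row_runs([[0, 1, 1, 0, 1],
                [0, 0, 0, 0, 0]]))             # [[0, (1, 2), (4, 4)]]
```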
Lossless: Run Length Encoding (RLE)
• RLE reduces the size of a repeating string of characters by turning each run of symbols into two bytes: a count and a symbol
• The content of the data affects the compression ratio
• Not as high a compression ratio as other methods, but easy to implement and quick to execute
• Used in TIFF, BMP and PCX
RLE Example
• Consider an example with a 16-character string: 000ppppppXXXXaaa
• This string of characters can be compressed to 306p4X3a
• The 16-byte string becomes 8 bytes: a compression ratio of 2:1
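A minimal encoder along these lines, just enough to reproduce the example (it assumes every run is shorter than 10 characters so the count fits in a single character; a real RLE would handle longer runs and a defined count/symbol byte layout):

```python
from itertools import groupby

def rle_encode(s):
    """Replace each run of identical characters with <count><symbol>.
    Assumes every run is shorter than 10 characters."""
    return "".join(f"{len(list(group))}{symbol}" for symbol, group in groupby(s))

encoded = rle_encode("000ppppppXXXXaaa")
print(encoded, len(encoded))   # 306p4X3a 8 -> 16 bytes down to 8, i.e. 2:1
```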
Run Length Encoding (RLE)
• Long runs are rare in certain types of data (e.g. ASCII text)
• To benefit from RLE there need to be runs of 2 characters or more; if not, we end up expanding rather than compressing!
Histograms, h(l)
• h(l) counts the number of occurrences of each grey level l in an image
• l = 0, 1, 2, …, L-1, where l is the grey (intensity) level and L is the number of grey levels, typically 256
• The area under the histogram equals the total number of pixels, N*M
• Histograms can be unimodal, bimodal or multi-modal, and characterise dark, light, low-contrast and high-contrast images
Probability Density Functions, p(l)
• p(l) = h(l) / n, where n = N*M (the total number of pixels)
• Limits: 0 ≤ p(l) ≤ 1, and the p(l) values sum to 1
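A sketch of h(l) and p(l) for a small, made-up grey-level array (NumPy assumed):

```python
import numpy as np

image = np.array([[0, 1, 1],
                  [2, 1, 0]])                  # hypothetical 2x3 image, N*M = 6 pixels
L = 4                                          # assume 4 grey levels for this toy example

h = np.bincount(image.ravel(), minlength=L)    # h(l): occurrences of each grey level
p = h / image.size                             # p(l) = h(l) / (N*M)

print(h)            # [2 3 1 0]
print(p, p.sum())   # ~[0.33 0.5 0.17 0.], sums to 1 (up to rounding)
```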
Lossless: Huffman compression
• Huffman compression reduces the average code length used to represent the symbols of an alphabet
• Symbols that occur frequently get short codes
• The code is built by constructing a binary tree: arrange the symbols by probability and repeatedly add the two lowest probabilities; the sum at the final step is 1
• Codewords are formed by tracing the tree path and assigning 0s and 1s to the branches (see the sketch below)
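A compact sketch of that construction in Python, using a min-heap of probabilities (the function name huffman_code and the tie-breaking counter are my own; when probabilities tie, the exact 0/1 patterns, and which tied symbol gets the longer codeword, can differ from a hand-drawn tree, but the total code length comes out the same):

```python
import heapq
from itertools import count

def huffman_code(probabilities):
    """Build a Huffman code from a {symbol: probability} mapping.
    Repeatedly merge the two lowest-probability nodes into one,
    then read the codewords off the resulting binary tree."""
    tick = count()                                  # tie-breaker so heap tuples always compare
    heap = [(p, next(tick), sym) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)           # lowest probability
        p2, _, right = heapq.heappop(heap)          # second lowest
        heapq.heappush(heap, (p1 + p2, next(tick), (left, right)))
    _, _, tree = heap[0]

    codes = {}
    def walk(node, prefix=""):
        if isinstance(node, tuple):                 # internal node: branch on 0 / 1
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                       # leaf: record the codeword
            codes[node] = prefix or "0"
        return codes

    return walk(tree)
```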
Huffman coding
• Determine the Huffman code for the following set of symbols: m0 (0.10), m1 (0.36), m2 (0.15), m3 (0.20), m4 (0.19)
• Step 1 – List the symbols in order of decreasing probability: m1 (0.36), m3 (0.20), m4 (0.19), m2 (0.15), m0 (0.10)
• Step 2 – Take the two symbols with the lowest probability and give the combined symbol a new name: m2 (0.15) + m0 (0.10) → A (0.25)
• Step 3 – Create a new list and repeat the process:
• Repeatedly combining the two lowest probabilities gives:
  • Original list: m1 (0.36), m3 (0.20), m4 (0.19), m2 (0.15), m0 (0.10)
  • After m2 + m0 → A: m1 (0.36), A (0.25), m3 (0.20), m4 (0.19)
  • After m4 + m3 → B: B (0.39), m1 (0.36), A (0.25)
  • After m1 + A → C: C (0.61), B (0.39)
  • After C + B → D: D (1.0)
• An alternative approach is to construct this tree:
    Root (1.0)
    ├─ C (0.61)
    │  ├─ m1 (0.36)
    │  └─ A (0.25)
    │     ├─ m2 (0.15)
    │     └─ m0 (0.10)
    └─ B (0.39)
       ├─ m4 (0.19)
       └─ m3 (0.20)
• Assign bits (0, 1) to the tree branches
• Codewords are determined by tracing the path from the root node to each symbol leaf
• Compression is achieved by allocating frequently occurring symbols shorter codewords
How much compression?
• There are 5 symbols, so a fixed-length code needs 3 bits for each symbol; the message [m0 m1 m2 m3 m4] requires 5 × 3 = 15 bits
• With the Huffman code, m1, m3 and m4 take 2 bits each and m0 and m2 take 3 bits each, so the same message requires 12 bits: a compression of 15:12
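Running the huffman_code sketch from a few slides back on these probabilities reproduces the codeword lengths used in this count (the exact bit patterns may differ from the hand-drawn tree):

```python
probs = {"m0": 0.10, "m1": 0.36, "m2": 0.15, "m3": 0.20, "m4": 0.19}
codes = huffman_code(probs)                     # sketch defined earlier

lengths = {sym: len(cw) for sym, cw in codes.items()}
print(lengths)                # m1, m3, m4 -> 2 bits; m0, m2 -> 3 bits
print(sum(lengths.values()))  # 12 bits for one of each symbol, versus 15 fixed-length
```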
Example
• Consider the message: babbage babble baggage label bagel
• Construct a Huffman code and determine the compression ratio.
Solution
• Construct a probability table by counting the occurrences of each letter (30 letters in total):
  • b: 9 occurrences, p = 0.3
  • a: 7 occurrences, p = 0.233
  • g: 5 occurrences, p = 0.166
  • e: 5 occurrences, p = 0.166
  • l: 4 occurrences, p = 0.133
Solution
• Repeatedly combining the two lowest probabilities gives:
  • Original list: b (0.3), a (0.233), g (0.166), e (0.166), l (0.133)
  • After e + l → A: b (0.3), A (0.3), a (0.233), g (0.166)
  • After a + g → B: B (0.4), b (0.3), A (0.3)
  • After b + A → C: C (0.6), B (0.4)
  • After C + B → D: D (1.0)
Solution
    Root (1.0)
    ├─ C (0.6)
    │  ├─ b (0.3)
    │  └─ A (0.3)
    │     ├─ e (0.166)
    │     └─ l (0.133)
    └─ B (0.4)
       ├─ g (0.166)
       └─ a (0.233)
• Bits (0, 1) are assigned to the branches and each codeword is read from the root to the leaf
Solution
• The message babbage babble baggage label bagel uses 5 symbols (b, a, g, e, l, spelling "bagel") and contains 30 characters
• Uncompressed, at 3 bits per symbol, this is 90 bits
• With the Huffman code, b, a and g take 2 bits each and e and l take 3 bits each: 9×2 + 7×2 + 5×2 + 5×3 + 4×3 = 69 bits
• Huffman therefore gives a compression of approximately 9:7 (90:69)
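The same huffman_code sketch applied to the message confirms the figures (because e and g have equal probability, they may swap codeword lengths, but the total stays at 69 bits):

```python
from collections import Counter

message = "babbagebabblebaggagelabelbagel"       # the 30 letters, spaces ignored
counts  = Counter(message)                       # b: 9, a: 7, g: 5, e: 5, l: 4
probs   = {sym: n / len(message) for sym, n in counts.items()}

codes = huffman_code(probs)                      # sketch defined earlier
compressed_bits   = sum(counts[s] * len(codes[s]) for s in counts)
uncompressed_bits = 3 * len(message)             # 5 symbols -> 3 bits each

print(compressed_bits, uncompressed_bits)        # 69 90
print(uncompressed_bits / compressed_bits)       # ~1.3, i.e. roughly 9:7
```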