Image Synthesis Image Compression
Motivation 1920 x 1440 x 32 bit = 11,059,200 bytes ≈ 10.55 MiB uncompressed. High-quality JPEG: 0.9 MB
Classification • Lossless • TGA • GIF • PNG • … • Lossy • JPEG • JPEG 2000 • MPEG
Lossless Compression schemes • RLE • LZ • LZ77 • LZ78 / LZW • Huffman • Deflate / “ZIP” • GIF • PNG
Run length Encoding Example: WWWWWWWWWWBWWWWWWWWWWBBB → 10WB10W3B. What to do if numbers appear in the data? Use a control character (X in this case) followed by the run length and the repeated character, and only encode runs that actually get shorter this way (more than three characters, since an encoded run costs three). 121211111111122222212111 → 1212X91X6212111. What if X appears in the data? Use XX to encode a regular X (and hope that you will not see too many Xs in your data).
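The control-character scheme can be sketched as follows. This is a minimal illustration, not a reference implementation: the names `rle_encode` and `rle_decode`, the single-digit run limit (`MAX_RUN = 9`, so that digits in the data stay unambiguous), and the choice never to run-encode X itself are assumptions made for this example.

```python
ESC = "X"      # control character, as on the slide
MIN_RUN = 4    # assumption: an encoded run costs 3 characters, shorter runs gain nothing
MAX_RUN = 9    # assumption: single-digit counts, so digits in the data stay unambiguous


def rle_encode(data: str) -> str:
    out = []
    i = 0
    while i < len(data):
        ch = data[i]
        j = i
        # Measure the run starting at i, capped at MAX_RUN characters.
        while j < len(data) and data[j] == ch and j - i < MAX_RUN:
            j += 1
        run = j - i
        if ch == ESC:
            out.append(ESC * 2 * run)          # every literal X is written as XX
        elif run >= MIN_RUN:
            out.append(f"{ESC}{run}{ch}")      # X, count, repeated character
        else:
            out.append(ch * run)               # short runs are copied unchanged
        i = j
    return "".join(out)


def rle_decode(code: str) -> str:
    out = []
    i = 0
    while i < len(code):
        if code[i] != ESC:
            out.append(code[i])
            i += 1
        elif code[i + 1] == ESC:               # XX stands for one literal X
            out.append(ESC)
            i += 2
        else:                                  # X, count digit, character
            out.append(code[i + 2] * int(code[i + 1]))
            i += 3
    return "".join(out)
```

With these assumptions, `rle_encode("121211111111122222212111")` reproduces the slide's result `1212X91X6212111`; the trailing run of three 1s is left alone because encoding it would not save anything.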
LZ - Lempel-Ziv LZ77 (1977) Idea: convert the data into (hopefully only a few) 3-tuples. Implementation: sliding window + look-ahead buffer. Example: ANANAS
Decoding LZ77 (1977) Example: ANANAS
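A minimal sketch of LZ77 encoding and decoding for the ANANAS example. The tuple layout (offset, length, next character), the window and look-ahead sizes, and the function names are assumptions for illustration; the slides do not fix them. The search is a plain brute-force scan of the window.

```python
def lz77_encode(data: str, window: int = 255, lookahead: int = 15):
    """Return a list of (offset, length, next_char) tuples."""
    out = []
    i = 0
    while i < len(data):
        best_off, best_len = 0, 0
        # Keep one character in reserve for the literal that ends each tuple.
        max_len = min(lookahead, len(data) - i - 1)
        for j in range(max(0, i - window), i):
            k = 0
            # The match may run past position i (overlapping copy).
            while k < max_len and data[j + k] == data[i + k]:
                k += 1
            if k > best_len:
                best_off, best_len = i - j, k
        out.append((best_off, best_len, data[i + best_len]))
        i += best_len + 1
    return out


def lz77_decode(tuples) -> str:
    out = []
    for off, length, ch in tuples:
        for _ in range(length):
            out.append(out[-off])   # copy one character from 'off' positions back
        out.append(ch)
    return "".join(out)
```

Under these assumptions `lz77_encode("ANANAS")` yields `[(0, 0, 'A'), (0, 0, 'N'), (2, 3, 'S')]`. The last tuple copies three characters from only two positions back, which is the overlapping copy that makes periodic input cheap to encode, while non-periodic input such as `0123456789` produces one tuple per character.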
Implementation Details Encoding of periodic entries Example: 0101010101010101010101… (length n)
Implementation Challenges Finding the right position Example: 0120101601015
Problems Encoding of non-periodic data Example: 0123456789ABCDE… “Compression” actually increases data size by a factor of three (every character becomes its own 3-tuple). Remedy: combination with Huffman encoding → Deflate (later)
LZW - Lempel-Ziv-Welch Dictionary-based compression with an adaptive dictionary. Usually a 12 bit codebook; the first 256 entries (all 8 bit values) are reserved for the single characters (ASCII). Encoding: LZWLZ7… → L Z W <256> 7…
Decoding: L Z W <256> 7… → LZWLZ7…
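The LZWLZ7 example can be reproduced with a short sketch. It assumes byte input and an unbounded dictionary, so the 12 bit limit and the codebook reset mentioned in the slides are deliberately left out; `lzw_encode` and `lzw_decode` are illustrative names.

```python
def lzw_encode(data: bytes):
    table = {bytes([i]): i for i in range(256)}   # codes 0..255: single bytes
    out = []
    w = b""
    for c in data:
        wc = w + bytes([c])
        if wc in table:
            w = wc                       # keep extending the current phrase
        else:
            out.append(table[w])         # emit the longest known phrase
            table[wc] = len(table)       # learn the new phrase (256, 257, ...)
            w = bytes([c])
    if w:
        out.append(table[w])
    return out


def lzw_decode(codes) -> bytes:
    table = {i: bytes([i]) for i in range(256)}
    it = iter(codes)
    try:
        prev = table[next(it)]
    except StopIteration:
        return b""
    out = [prev]
    for code in it:
        if code in table:
            entry = table[code]
        elif code == len(table):
            entry = prev + prev[:1]      # phrase used right after it was defined
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = prev + entry[:1]   # rebuild the encoder's dictionary
        prev = entry
    return b"".join(out)
```

`lzw_encode(b"LZWLZ7")` gives `[76, 90, 87, 256, 55]`, i.e. L Z W <256> 7. The decoder never receives the dictionary; it rebuilds the same entries one step behind the encoder, which is why the special case `code == len(table)` is needed.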
Features and Issues • “Codebook overflow/reset” • Simple deterministic encoding / decoding • Worst case: 8 bit → 12 bit (or 8 bit → dictionary code size) • Was covered by software patents until 2003 (US) and 2004 (elsewhere); a related IBM patent ran until 2006
Huffman Encoding (1951) Given: a set of symbols and their weights (usually probabilities). Find: a prefix-free binary code (a set of codewords) with minimum expected codeword length (equivalently, a tree with minimum weighted path length).
Variable Length Problem Example (A = 10, B = 01, C = 0): What is the meaning of 10010? ACA (10 0 10) or ABC (10 01 0)? Solution: prefix-free codes
Huffman Encoding Start with as many leaves as there are symbols. Queue all leaf nodes into the first queue, sorted by increasing weight. While there is more than one node in the queues: • Remove the two nodes with the lowest weight (frequency of appearance) from the queues; only the fronts of the two queues need to be examined. • Create a new internal node, with the two just-removed nodes as children (either node can be either child) and the sum of their weights as the new weight. • Update the parent links in the two just-removed nodes to point to the just-created parent node. • Queue the new node at the back of the second queue. The remaining node is the root node; the tree has now been generated.
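The two-queue construction can be sketched like this. The tree representation (nested tuples instead of explicit parent links), the tie-breaking rule, and the name `huffman_codes` are assumptions for this example; symbols are assumed not to be tuples themselves.

```python
from collections import deque


def _pop_lightest(q1, q2):
    # Both queues are ordered by weight, so only their fronts matter.
    if not q2 or (q1 and q1[0][0] <= q2[0][0]):
        return q1.popleft()
    return q2.popleft()


def huffman_codes(weights: dict) -> dict:
    """Map each symbol to a bit string; weights maps symbol -> weight."""
    # First queue: leaves sorted by increasing weight.
    q1 = deque(sorted(((w, s) for s, w in weights.items()), key=lambda t: t[0]))
    q2 = deque()                     # second queue: internal nodes
    while len(q1) + len(q2) > 1:
        w1, t1 = _pop_lightest(q1, q2)
        w2, t2 = _pop_lightest(q1, q2)
        q2.append((w1 + w2, (t1, t2)))   # new weights never decrease, q2 stays sorted
    root = (q1 or q2)[0][1]

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):      # internal node: (left, right)
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:                            # leaf: a symbol
            codes[tree] = prefix or "0"  # a single-symbol alphabet still needs one bit

    walk(root, "")
    return codes
```

For the hypothetical weights `{"a": 5, "b": 2, "c": 1, "d": 1}` this produces code lengths 1, 2, 3 and 3, and no codeword is a prefix of another.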
Example Frequencies: Codes:
Properties • Huffman coding is optimal, i.e. it generates an optimal tree for a given input • Requires knowledge of the frequencies • Can be extended to do adaptive tree updates • Can be combined with LZ77 as a post-processing step, thus compressing the lengthy tuples → Deflate, used primarily in ZIP, gzip, etc.
Now back to images BMP, TGA: store images as an (optionally RLE-compressed) array of n-bit values (n usually 24/32). GIF: uses 256 colors (8 bit) and LZW compression. A few GIF versions exist; usually GIF89a is used today, allowing for a single transparent color and stacking multiple images together to generate an animation
PNG – Portable Network Graphics Initially developed to circumvent GIF's patent-encumbered LZW compression. • Uses prefilter + LZ77 + Huffman for compression • Supports not only 8 bit palettes but also 1, 2, 4, 8 or 16 bit per channel color information • Supports 8 or 16 bit alpha channel • Now finally supported by all major browsers • No support for animation
PNG Prefilter Used per line
Paeth Predictor
function PaethPredictor (a, b, c) {
    // a = left, b = above, c = upper left
    p := a + b - c        // initial estimate
    pa := abs(p - a)      // distances to a, b, c
    pb := abs(p - b)
    pc := abs(p - c)
    // return nearest of a, b, c, breaking ties in order a, b, c
    if pa <= pb AND pa <= pc then return a
    else if pb <= pc then return b
    else return c
}
Paeth Predictor: Idea Neighborhood: c and b in the row above, a and x in the current row (a = left, b = above, c = upper left, x = current pixel). Goal: minimize Filter(x). “Optimal way” to do this: compute abs(a-x), abs(b-x), abs(c-x) and store the minimal value. But this would require two bits to remember which difference was used. Instead: compute a “good” estimate of x using only a, b, c (called p) and find the minimum of abs(a-p), abs(b-p), abs(c-p) to select which of a, b, c is used as the prediction for x
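How the predictor is used for filtering and for undoing the filter can be sketched as below. The sketch assumes one byte per pixel and treats neighbors outside the image as 0; the row helper names are made up for this example. Since the decoder already knows a, b and c when it reaches x, no extra selector bits have to be stored.

```python
def paeth_predictor(a: int, b: int, c: int) -> int:
    p = a + b - c                      # initial estimate
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:          # ties are broken in the order a, b, c
        return a
    if pb <= pc:
        return b
    return c


def paeth_filter_row(row, prev_row):
    """Replace each byte by its difference to the prediction (mod 256)."""
    out = []
    for i, x in enumerate(row):
        a = row[i - 1] if i > 0 else 0
        b = prev_row[i]
        c = prev_row[i - 1] if i > 0 else 0
        out.append((x - paeth_predictor(a, b, c)) % 256)
    return out


def paeth_unfilter_row(filtered, prev_row):
    """Invert paeth_filter_row using already reconstructed neighbors."""
    row = []
    for i, d in enumerate(filtered):
        a = row[i - 1] if i > 0 else 0
        b = prev_row[i]
        c = prev_row[i - 1] if i > 0 else 0
        row.append((d + paeth_predictor(a, b, c)) % 256)
    return row
```

For a smooth gradient such as the previous row `[10, 20, 30, 40]` and the current row `[12, 22, 32, 42]`, the filtered row is `[2, 2, 2, 2]`, which the later LZ77 and Huffman stages compress much better than the raw values.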
Lossy Compression JPEG (Joint Photographic Experts Group) • Lossless compression does not work well for photos • Uses lossy compression with an adjustable compression ratio • Can do lossless • Fast decoding • Works for all types of static images, no restrictions on color depth. Figure: compression ratio
Compression Overview • Color space conversion to YUV • Block-based color sub sampling • Discrete cosine transformation (DCT) • Quantization of the DCT coefficients • Coefficient serialization (dovetailing, i.e. zigzag scan) • Coefficient encoding
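The serialization step can be illustrated with a small sketch that lists the positions of an 8x8 block in zigzag order, starting at the upper left (DC) coefficient. The function name and the sort-based formulation are choices made for this example only.

```python
def zigzag_order(n: int = 8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    def key(rc):
        r, c = rc
        d = r + c                       # index of the anti-diagonal
        # Odd diagonals run top-right to bottom-left, even ones the other way.
        return (d, r if d % 2 else c)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def serialize(block):
    """Flatten a square block of coefficients along the zigzag path."""
    return [block[r][c] for r, c in zigzag_order(len(block))]
```

The order starts `(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)`, so low frequencies come first and the mostly zero high-frequency coefficients end up in one long run at the end.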
YUV Conversion
Y = 0.30 R + 0.59 G + 0.11 B
U = -0.17 R - 0.33 G + 0.50 B
V = 0.50 R - 0.42 G - 0.08 B
RGB Conversion
R = Y + 1.40 V
G = Y - 0.34 U - 0.71 V
B = Y + 1.78 U
Example: Red: Y = 77, U = -43, V = 127; Green: Y = 150, U = -84, V = -107; Blue: Y = 28, U = 127, V = -20
YUV • Y is luminance • U chrominance: color change towards blue • V chrominance: color change towards red
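A direct transcription of the two conversions with the rounded coefficients from the slide. The function names are illustrative, and the results are left as floats, without the rounding or clamping a real codec would apply.

```python
def rgb_to_yuv(r: float, g: float, b: float):
    y = 0.30 * r + 0.59 * g + 0.11 * b
    u = -0.17 * r - 0.33 * g + 0.50 * b
    v = 0.50 * r - 0.42 * g - 0.08 * b
    return y, u, v


def yuv_to_rgb(y: float, u: float, v: float):
    r = y + 1.40 * v
    g = y - 0.34 * u - 0.71 * v
    b = y + 1.78 * u
    return r, g, b
```

Because the coefficients are rounded to two decimals, a round trip such as `yuv_to_rgb(*rgb_to_yuv(255, 0, 0))` returns the original color only approximately (within about one intensity level per channel).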
YUV Conversion Example Y U V
YUV Sub sampling Split the YUV channels. The eye is more sensitive to brightness than to color changes: color sub sampling • First digit: luma horizontal sampling reference • Second and third digit: U and V (chroma) horizontal factor (relative to the first digit) • Exception: when the third digit is zero, the V horizontal factor is equal to the second digit and, in addition, both U and V are subsampled 2:1 vertically. Default: mode 4:2:2. Full quality: mode 4:4:4
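A sketch of chroma sub sampling for one U or V plane. The `MODES` table and the use of block averaging are assumptions for illustration; an encoder could also simply drop samples. The Y plane is always kept at full resolution.

```python
# Assumed mapping from mode name to (horizontal, vertical) chroma step.
MODES = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}


def subsample(plane, mode: str):
    """Average each fx x fy block of a chroma plane (list of rows)."""
    fx, fy = MODES[mode]
    h, w = len(plane), len(plane[0])
    out = []
    for y in range(0, h, fy):
        row = []
        for x in range(0, w, fx):
            block = [plane[yy][xx]
                     for yy in range(y, min(y + fy, h))
                     for xx in range(x, min(x + fx, w))]
            row.append(sum(block) / len(block))
        out.append(row)
    return out
```

With 4:2:2 each chroma plane shrinks to half its size, with 4:2:0 to a quarter, so the three planes together need 2/3 or 1/2 of the original data before any further compression.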
Sub sampling Figure panels: without sub sampling (4:4:4) and with sub sampling (4:2:2), each at best quality and at worst quality
Preparation for DCT • From now on consider the YUV planes separately • Index shift • Transform [0, 255] to [-128, 127] • Blocking • Always consider an 8x8 data block for the DCT
DCT • DCT (1D) • Related to the FT in that it transforms a spatial signal to the frequency domain; however, the DCT avoids the imaginary sine part by representing the signal as a sum of cosine waves. DCT(u): cosine amplitude for frequency u. N: number of pixels (8 per row or column, 8x8 = 64 per block for JPEG). f(x): pixel value at position x. C: correction factor
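The formula itself did not survive on this slide, so the following sketch assumes the common orthonormal DCT-II, DCT(u) = C(u) · Σ f(x) · cos((2x+1)uπ / 2N), with C(0) = sqrt(1/N) and C(u) = sqrt(2/N) otherwise. The 2D version applies the 1D transform to the rows and then to the columns. Function names are illustrative.

```python
import math


def _c(u: int, n: int) -> float:
    # Correction factor that makes the transform orthonormal (assumed normalization).
    return math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)


def dct_1d(f):
    n = len(f)
    return [_c(u, n) * sum(f[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                           for x in range(n))
            for u in range(n)]


def idct_1d(coeffs):
    n = len(coeffs)
    return [sum(_c(u, n) * coeffs[u] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for u in range(n))
            for x in range(n)]


def dct_2d(block):
    rows = [dct_1d(r) for r in block]                # transform every row
    cols = [dct_1d(list(c)) for c in zip(*rows)]     # then every column
    return [list(r) for r in zip(*cols)]             # back to row-major layout
```

A constant block is the simplest check: an 8x8 block filled with the value 10 transforms to a DC coefficient of 80 in the upper left corner and (numerically) zero everywhere else.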
DCT • DCT (2D) Large, regular areas (low frequencies) are stored in the upper left while higher frequencies are stored towards the lower right
DCT Figures: pixel block and its DCT; the IDCT maps the coefficients back to pixels
DCT Figures: DCT coefficients and the IDCT result
DCT Properties • A real-life photo has the following properties after the DCT • the DC coefficient is by far the largest • the AC coefficients become smaller with increasing frequency • most of the AC coefficients are close to zero • Properties of the DCT • The DCT in general is a reversible transform, thus applying the DCT and transforming back is lossless (in theory)
Quantization For JPEG the DCT gives us 64 values; each of them is divided by the corresponding entry of a quantization matrix Q and rounded afterwards • Data is lost here! • Quality settings adjust the matrix Q • Back transformation: multiply each quantized DCT value by the corresponding entry of the quantization matrix
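Quantization and its back transformation can be sketched as follows. The matrix built by `make_q` is a hypothetical ramp chosen for illustration and is not one of the tables from the JPEG standard; `step` merely stands in for a quality setting.

```python
def make_q(step: int = 2, n: int = 8):
    # Hypothetical matrix: coarser steps towards high frequencies (lower right).
    return [[1 + step * (u + v) for v in range(n)] for u in range(n)]


def quantize(coeffs, q):
    """Divide each DCT coefficient by its matrix entry and round."""
    return [[round(c / qv) for c, qv in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, q)]


def dequantize(levels, q):
    """Back transformation: multiply by the same matrix entries."""
    return [[lv * qv for lv, qv in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, q)]
```

After a round trip every coefficient is off by at most half of its matrix entry, and small high-frequency coefficients become exactly zero, which is what the later run-length step exploits.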
Quantization DCT Matrix Q After quantization
DC Encoding Store the first DC value (upper left coefficient) as is; encode the other DC values as differences from their predecessor
DC Encoding Categorize the values, Huffman encode the category sequence, and append each value's index within its category; this results in a 2-tuple (category, index) per value
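The (category, index) tuples can be sketched like this, assuming the usual JPEG convention: the category is the number of bits needed for the absolute value, and negative values are mapped to the lower half of the category. The Huffman coding of the category is omitted and the function names are illustrative.

```python
def category(v: int) -> int:
    return abs(v).bit_length()            # 0 -> 0, ±1 -> 1, ±2..±3 -> 2, ...


def index_in_category(v: int) -> int:
    size = category(v)
    return v if v >= 0 else v + (1 << size) - 1


def value_from(size: int, index: int) -> int:
    if size == 0:
        return 0
    # Indices in the upper half of the category are the positive values.
    return index if index >= 1 << (size - 1) else index - (1 << size) + 1


def encode_dc(dc_values):
    """Difference-code the DC values of consecutive blocks."""
    out, prev = [], 0                     # predecessor of the first block is 0
    for dc in dc_values:
        diff = dc - prev
        out.append((category(diff), index_in_category(diff)))
        prev = dc
    return out
```

For the hypothetical DC values `[3, 5, 4, 4]` the differences are 3, 2, -1, 0 and the result is `[(2, 3), (2, 2), (1, 0), (0, 0)]`.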
AC Encoding Symbol 1: (Runlength, Size). Symbol 2: (Amplitude). Runlength = number of zeros preceding the value, Size = category, Amplitude = index within the category. Symbol 1 is Huffman encoded (or a table is used) and symbol 2 is appended. Runlength is 4 bit [0-15]. Example: DC: (2)(3); AC: (1,2)(1), (0,1)(0), (0,1)(0), (0,1)(0), (2,1)(0), (0,0), where (0,0) marks the end of the block → ~51 bit
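A sketch of how the (Runlength, Size)(Amplitude) symbols arise from the serialized AC coefficients. The coefficient values used in the usage note are a reconstruction that matches the slide's symbols under the index convention described there; they are not taken from the slide. The Huffman stage is omitted, `(15, 0)` is used for a run of 16 zeros, and the names are illustrative.

```python
def size_and_index(v: int):
    size = abs(v).bit_length()                    # category of the value
    index = v if v >= 0 else v + (1 << size) - 1  # position within the category
    return size, index


def encode_ac(ac):
    """Turn a list of AC coefficients into (run, size, amplitude) symbols."""
    symbols = []
    # Position of the last non-zero coefficient; everything after it is EOB.
    last = max((i for i, v in enumerate(ac) if v != 0), default=-1)
    run = 0
    for v in ac[:last + 1]:
        if v == 0:
            run += 1
            if run == 16:                         # run length only has 4 bits
                symbols.append((15, 0, None))     # stands for 16 zeros
                run = 0
            continue
        size, index = size_and_index(v)
        symbols.append((run, size, index))
        run = 0
    if last < len(ac) - 1:
        symbols.append((0, 0, None))              # end of block
    return symbols
```

Feeding in the 63 coefficients `[0, -2, -1, -1, -1, 0, 0, -1]` followed by zeros gives `(1, 2, 1), (0, 1, 0), (0, 1, 0), (0, 1, 0), (2, 1, 0), (0, 0, None)`, the same sequence as in the example.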
JPEG 2000 • does not necessarily split the image into blocks • uses wavelets instead of DCT • uses zero-tree encoder (EBCOT) instead of Huffman • embedded code allows for better preview • better compression • better quality JPG JPEG 2000
H.261 • Divide image into 8x8 blocks • Motion compensation via cross correlation • Subtract from previous frame • Compress JPEG-like with 4:2:0 sub sampling • Uses one of 31 fixed quantizer matrices + one uniform matrix for DC