Multimedia Compression B90901134 陳威尹
Why Compress • Raw data are huge. • Audio: CD-quality music, 44.1 kHz * 16 bit * 2 channels ≈ 1.4 Mbps • Video: near-DVD-quality true-color animation, 640 px * 480 px * 30 fps * 24 bit ≈ 220 Mbps • Impractical for storage and bandwidth
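The raw bit rates on this slide are plain products of the stream parameters. A minimal sketch of that arithmetic follows; the helper name `raw_bitrate` is made up for illustration.

```python
def raw_bitrate(*factors):
    """Multiply the given factors to get a raw bit rate in bits per second."""
    rate = 1
    for factor in factors:
        rate *= factor
    return rate

# sample rate (Hz) * bits per sample * channels
audio_bps = raw_bitrate(44_100, 16, 2)
# width (px) * height (px) * frames per second * bits per pixel
video_bps = raw_bitrate(640, 480, 30, 24)

print(f"audio: {audio_bps / 1e6:.2f} Mbps")
print(f"video: {video_bps / 1e6:.1f} Mbps")
```

The video figure comes out slightly above 221 Mbps, which the slide rounds to 220 Mbps.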
Outline • Generic Compression Overview • Content specific Compression • Lossy Compression
Introduction to Generic Compression Algorithms • Lossless Compression
Generic Compression • Also called Entropy Encoding • Lossless Compression Algorithms • Entropy can be defined as: H = -Σ p_i * log2(p_i), where p_i is the probability of symbol i • Needs statistical knowledge of the data • Well-known Algorithms: • Rice coding • Huffman coding • Arithmetic coding
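To make the entropy bound concrete, here is a small sketch that assumes the slide means order-0 Shannon entropy (symbols treated as independent). It measures the 39-symbol string used in the Huffman example of this deck; the function name is illustrative.

```python
from collections import Counter
from math import log2

def entropy_bits_per_symbol(data):
    """Order-0 Shannon entropy: -sum(p * log2(p)) over the symbol frequencies."""
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * log2(c / total) for c in counts.values())

text = "ABACDEAACCAABEAABACBDDABCADDBCEAEAAADBE"
h = entropy_bits_per_symbol(text)
print(f"{h:.3f} bits/symbol, {h * len(text):.1f} bits for the whole string")
```

Under this model no lossless coder can average fewer bits per symbol than `h`, so the product `h * len(text)` is the target that entropy coders try to approach.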
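Huffman encoding • Input: ABACDEAACCAABEAABACBDDABCADDBCEAEAAADBE • Order-0 model:
Symbol   A    B    C    D    E
Count    15   7    6    6    5
• Fixed-length input (3 bits per symbol): 39*3 = 117 bits • Huffman output (1 bit for A, 3 bits for each other symbol): 15*1 + (7+6+6+5)*3 = 87 bits • Compression ratio: 117/87 = 1.34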
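The 87-bit figure can be checked by building the Huffman tree from the symbol counts. The sketch below tracks only code lengths, not the actual bit patterns, and uses a heap with an integer tie-breaker; it is one straightforward way to do it, not necessarily how the slide's numbers were produced.

```python
import heapq
from collections import Counter

def huffman_code_lengths(data):
    """Return {symbol: code length in bits} for an order-0 Huffman code."""
    counts = Counter(data)
    # Heap entries: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {s: 1 for s in heap[0][2]}
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)   # two lightest subtrees
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

text = "ABACDEAACCAABEAABACBDDABCADDBCEAEAAADBE"
counts = Counter(text)
lengths = huffman_code_lengths(text)
total_bits = sum(counts[s] * lengths[s] for s in counts)
print(lengths, total_bits)
```

For this input the most frequent symbol A gets a 1-bit code and the other four symbols get 3-bit codes, which reproduces the 87-bit total.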
Properties of Huffman encoding • Easy to implement, high encoding speed • Unique Prefix Property: no code is a prefix of any other code • Adaptive Huffman encoding: • used when statistical knowledge is not available in advance • updates the Huffman tree when needed
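The unique prefix property is what lets a decoder split the bit stream without separators. The sketch below uses a hypothetical code table (`CODE`) with the same code lengths as the Huffman example; the exact bit patterns are an assumption, since any prefix-free assignment with those lengths would work.

```python
from itertools import permutations

# Hypothetical code table: 1 bit for A, 3 bits for the others.
CODE = {"A": "0", "B": "100", "C": "101", "D": "110", "E": "111"}

def is_prefix_free(code):
    """True if no codeword is a prefix of another codeword."""
    words = list(code.values())
    return not any(b.startswith(a) for a, b in permutations(words, 2))

def encode(text, code):
    return "".join(code[symbol] for symbol in text)

def decode(bits, code):
    """Read bits left to right; emit a symbol as soon as a codeword matches."""
    inverse = {word: symbol for symbol, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("bit stream ends in the middle of a codeword")
    return "".join(out)

bits = encode("ABACDE", CODE)
print(bits, decode(bits, CODE))
```

Because the table is prefix-free, the greedy match in `decode` can never pick the wrong codeword.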
Arithmetic Encoding • Symbols X, Y • prob(X) = 2/3 • prob(Y) = 1/3
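With the two-symbol model on this slide, arithmetic coding can be pictured as repeatedly narrowing an interval inside [0, 1). The sketch below uses exact fractions and assumes X owns the lower part of each interval and Y the upper part; that ordering, and the function names, are illustrative choices. A practical coder would use fixed-precision integers instead.

```python
from fractions import Fraction

PROBS = {"X": Fraction(2, 3), "Y": Fraction(1, 3)}

def cumulative_ranges(probs):
    """Assign each symbol a sub-range of [0, 1) proportional to its probability."""
    ranges, start = {}, Fraction(0)
    for symbol, p in probs.items():
        ranges[symbol] = (start, start + p)
        start += p
    return ranges

def encode_interval(message, probs):
    """Narrow [0, 1) once per symbol; any number in the result identifies the message."""
    ranges = cumulative_ranges(probs)
    low, width = Fraction(0), Fraction(1)
    for symbol in message:
        lo, hi = ranges[symbol]
        low, width = low + width * lo, width * (hi - lo)
    return low, low + width

def decode(value, length, probs):
    """Invert the narrowing: find the sub-range holding value, then rescale."""
    ranges = cumulative_ranges(probs)
    out = []
    for _ in range(length):
        for symbol, (lo, hi) in ranges.items():
            if lo <= value < hi:
                out.append(symbol)
                value = (value - lo) / (hi - lo)
                break
    return "".join(out)

interval = encode_interval("XXY", PROBS)
print(interval, decode(interval[0], 3, PROBS))
```

The final interval for "XXY" has width 4/27, which corresponds to roughly 2.75 bits for three symbols. This is the sense in which the cost per symbol need not be a whole number of bits.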
Properties of Arithmetic Encoding • Avoids the entropy wasted by Huffman coding, because the number of bits used to represent a symbol can be non-integer • About 5~10% smaller than Huffman coding • Computationally intensive • Patented in the US • Both Huffman and Arithmetic coding are used in the entropy encoding stage of JPEG
Application of General Compression • Generic file compression like Zip, Rar, gzip, bzip, etc. • Final stage of content specific compression • JPEG uses Huffman or Arithmetic • Monkey’s Audio (ape) uses Rice • Lossless Audio (La) uses Arithmetic
Content-specific Compression • Further De-correlation
De-correlation • Correlation means redundancy • However, a general algorithm may not find content-specific correlation • A higher-order general algorithm may not be efficient enough • Whether lossy or lossless, multimedia file formats use a content-specific pre-filter as the first step to reduce data redundancy.
Correlation in Multimedia • Audio: • Temporal, Channel • Still Image: • Color space, Spatial, Stereo • Video: • Temporal
Audio Channel Correlation • Correlation between L/R channels • L/R to mid/side conversion • More complex decorrelation with more channels
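One common way to exploit L/R correlation losslessly is an integer mid/side transform, sketched below. This assumes the conversion on the slide is of this kind; the function names and the bit trick for recovering the dropped low bit are illustrative, not taken from any particular codec.

```python
def to_mid_side(left, right):
    """Integer mid/side transform; mid loses one bit that side's parity preserves."""
    mid = (left + right) >> 1   # floor of the average
    side = left - right         # difference, small when channels are similar
    return mid, side

def from_mid_side(mid, side):
    """Exact inverse: left + right has the same parity as left - right."""
    total = (mid << 1) | (side & 1)   # restore left + right
    left = (total + side) >> 1
    right = (total - side) >> 1
    return left, right

mid, side = to_mid_side(1000, 998)
print(mid, side, from_mid_side(mid, side))
```

When the two channels are similar, `side` stays close to zero and is cheap to entropy-code, while `mid` carries most of the signal.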
Color Space Correlation • Correlation between color channels • Map RGB to YUV color space:
Y =  0.299*R + 0.587*G + 0.114*B
U = -0.169*R - 0.331*G + 0.500*B + 128.0
V =  0.500*R - 0.419*G - 0.081*B + 128.0
• Example in PNG
Color Space Correlation: RGB to YUV Conversion • Size of each channel:
Channel   R       G       B       Y       U       V
Size      95 KB   96 KB   98 KB   97 KB   32 KB   37 KB
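The RGB-to-YUV mapping from the slides can be written directly as a function. This sketch assumes 8-bit inputs in the range 0..255 and returns unrounded floats; clamping and rounding to 8 bits are left out.

```python
def rgb_to_yuv(r, g, b):
    """Apply the slide's RGB -> YUV matrix (chroma offset by 128)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.500 * b + 128.0
    v = 0.500 * r - 0.419 * g - 0.081 * b + 128.0
    return y, u, v

# A neutral gray pixel carries no chroma: U and V sit at the 128 midpoint.
print(rgb_to_yuv(128, 128, 128))
```

The U and V coefficients each sum to zero, so any gray pixel maps to the chroma midpoint. That is consistent with the U and V channels compressing much better than R, G, or B in the sizes above.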
Video Channel Correlation • Multi-view channel in 3D video • convert to Image and Depth channel • Disparity Estimation (like Motion Estimation)
Video Temporal Correlation • Similarity between adjacent frames • Motion estimation and motion compensation (mostly lossy) • [Figure: search range and motion vector between the reference frame and the current frame]
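Motion estimation is often done by block matching: for a block in the current frame, search a window of the reference frame for the best-matching block. The sketch below is a brute-force full search using the sum of absolute differences (SAD) as the cost. The frame layout, the SAD metric, and all names are assumptions for illustration; real encoders use faster search strategies.

```python
def sad(ref, cur, rx, ry, cx, cy, n):
    """Sum of absolute differences between two n x n blocks."""
    return sum(abs(ref[ry + j][rx + i] - cur[cy + j][cx + i])
               for j in range(n) for i in range(n))

def motion_search(ref, cur, cx, cy, n, search):
    """Full search within +/- search pixels; returns (dx, dy, cost)."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if 0 <= rx <= w - n and 0 <= ry <= h - n:   # stay inside the frame
                cost = sad(ref, cur, rx, ry, cx, cy, n)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best[1], best[2], best[0]

def frame_with_patch(x, y, size=8):
    """Toy frame: black background with a 2x2 patch whose top-left is (x, y)."""
    frame = [[0] * size for _ in range(size)]
    frame[y][x], frame[y][x + 1] = 10, 20
    frame[y + 1][x], frame[y + 1][x + 1] = 30, 40
    return frame

reference = frame_with_patch(2, 3)   # patch position in the reference frame
current = frame_with_patch(4, 4)     # the patch has moved by (+2, +1)
dx, dy, cost = motion_search(reference, current, cx=4, cy=4, n=2, search=3)
print(dx, dy, cost)
```

The returned vector points from the block in the current frame back to its match in the reference frame. The encoder then only has to store that vector plus the (here zero) residual.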
Lossless is not enough! • The best lossless audio and image compression normally reduces data to about half its size • Lossy audio compression such as mp3 or ogg achieves a 1/20 ratio while retaining acceptable quality, and a 1/5 ratio for impeccable quality • Lossy video compression reduces a film to 1/300 of its size
Lossy Compression • Loss of data leads to a higher compression ratio
Lossy Compression • Massively reduce information we don’t notice • Highly content specific • Psychology
Lossy Audio Compression • Frequency domain • Quantization • Importance varies across frequency bands • Higher frequency, larger quantization step • Psychoacoustics • Pitch resolution of the ear is only 2 Hz without beating • Threshold of hearing varies across bands • Simultaneous and temporal masking effects
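Band-dependent quantization can be sketched as dividing each frequency coefficient by a step size that grows with frequency. The coefficient values and step sizes below are made-up illustration values, not taken from any real codec.

```python
def quantize(coeffs, steps):
    """Map each coefficient to the nearest multiple of its band's step size."""
    return [round(c / s) for c, s in zip(coeffs, steps)]

def dequantize(levels, steps):
    """Reconstruct approximate coefficients from the quantized levels."""
    return [q * s for q, s in zip(levels, steps)]

# Hypothetical coefficients, ordered from low to high frequency.
coeffs = [100.0, 41.0, -23.0, 7.0]
# Coarser steps for higher bands, where errors are assumed less audible.
steps = [1, 4, 8, 16]

levels = quantize(coeffs, steps)
restored = dequantize(levels, steps)
print(levels, restored)
```

The high-frequency coefficient collapses to zero and the others move to the nearest step. This is where information is lost, and also where the later entropy coding stage gains most of its savings.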
Lossy Image Compression • Pipeline: Image data → Transform → Quantization → Entropy Coding → Output data • Frequency domain • Discrete Cosine Transform (in JPEG) • Discrete Wavelet Transform (in JPEG 2000) • Quantization • Reduce less important data
JPEG 2000 vs. JPEG
Stage            JPEG                                JPEG 2000 (J2K)
Transform        DCT (Discrete Cosine Transform)     DWT (Discrete Wavelet Transform)
Quantization     8x8 Quantization Table              Quantization for each sub-band
Entropy Coding   Huffman Coding                      Arithmetic Coding
Lossy Image Compression in Practice (1) • Original
Lossy Image Compression in Practice (2) • Transform domain coefficients. • Only a few components are visible for each 8x8 block. • The DC component is in the upper left of each block
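To see why only a few coefficients stand out per block, the following sketch applies a direct 8x8 two-dimensional DCT-II to a perfectly flat block. It uses orthonormal scaling, which is an assumption (JPEG implementations may scale differently), and the naive four-level loop is chosen for clarity rather than speed.

```python
from math import cos, pi, sqrt

N = 8

def dct_2d(block):
    """Naive orthonormal 2-D DCT-II of an N x N block (list of rows)."""
    def alpha(k):
        return sqrt(1 / N) if k == 0 else sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):            # vertical frequency index
        for v in range(N):        # horizontal frequency index
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * cos((2 * x + 1) * u * pi / (2 * N))
                              * cos((2 * y + 1) * v * pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out

flat = [[100] * N for _ in range(N)]   # a block with no detail at all
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))          # DC term, upper-left corner
```

For a flat block all the energy lands in the DC term at `coeffs[0][0]` and every other coefficient is zero up to floating-point error. Smooth natural image blocks behave similarly, with energy concentrated near the upper-left corner.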
Lossy Image Compression in Practice (3) • After quantization and IDCT. • Note the clearly visible blocking artifacts. • Compression ratio = 17.8:1 with an SNR of 20.1 dB, not including entropy encoding
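An SNR figure like the one quoted here is typically the ratio of signal power to error power on a decibel scale. The sketch below assumes that plain definition; the slide does not say whether SNR or a variant such as PSNR was used, and the sample values are made up.

```python
from math import log10

def snr_db(original, reconstructed):
    """10 * log10(signal power / noise power) over two equal-length sequences."""
    signal = sum(o * o for o in original)
    noise = sum((o - r) ** 2 for o, r in zip(original, reconstructed))
    if noise == 0:
        return float("inf")   # perfect reconstruction
    return 10 * log10(signal / noise)

# Toy samples: each reconstructed value is off by 1 from an original of 10.
print(snr_db([10, 10], [9, 11]))
```

For an image, the same function can be applied to the flattened pixel values before and after compression.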
Motion Compensation • [Figure panels: without motion compensation; with motion compensation]
Frame Type • Intra Frame (I) • Predictive Frame (P) • Bidirectional predictive Frame (B)
Video Compression Demo • Motion vectors and bandwidth overlaid on MPEG-4 video using ffdshow-20041012
Reference • Lossless Compression Algorithms: http://www.cs.cf.ac.uk/Dave/Multimedia/node207.html • Monkey’s Audio: http://www.monkeysaudio.com/theory.html • Lossless Audio (La): http://www.lossless-audio.com/theory.htm • Compression and speed of lossless audio formats: http://web.inter.nl.net/users/hvdh/lossless/main.htm and http://members.home.nl/w.speek/comparison.htm • http://www.wordiq.com/definition/Wavelet_compression • http://www.wordiq.com/definition/Psychoacoustics • http://www.wordiq.com/definition/MP3 • H.264: http://www.komatsu-trilink.jp/device/pdf11/UBV2003.pdf