Audio & Video Compression and its Application in Consumer Products. Agenda. Introduction - The evolution of Audio/Video consumer products and the role of compression techniques. Audio & Video compression principles Audio compression Video compression Audio/Video synchronisation
Audio & Video Compression and its Application in Consumer Products
Agenda • Introduction - The evolution of Audio/Video consumer products and the role of compression techniques. • Audio & Video compression principles • Audio compression • Video compression • Audio/Video synchronisation • The MPEG model and its situation in a communication context • Application to DVD (Digital Versatile Disc) • Application to DVB (Digital Video Broadcasting) • Conclusion
Moore’s law • Number of transistors per square inch doubles every 18 months
Moore’s law today • Cost of a transistor divided by one million in 30 years
Moore’s law today (2) • “Self-fulfilling prophecy” (not automatic) = roadmap for the semiconductor industry
Moore’s law today (3) • Roadmap for the semiconductor industry = the only certainty in an otherwise undefined future • Moore’s law will continue to apply: 20 years • Economic limitation? • Power consumption (Moore’s law in the reverse direction) • Architectural gap between IP blocks & application (middleware ever more complex…) • Progress in semiconductors = fuels innovation = fuels the software revolution = fuels the wireless revolution (WLAN, WPAN, WBAN, …) • … • Examples: WBAN & sensors, RFID applications, swallowable camera, flexible display…
The evolution of some CE products (1) • [Figure labels: DCC, CD, CD-i, DVD and STB placed among the Consumer, Computer and Communication domains]
The evolution of our CE products (2) • The Residential Gateway (Set-Top Box) as the link between the home and the world-wide information infrastructure. • [Figure labels: world-wide communication infrastructure, STB, Home Network]
The evolution of our CE products (3) • The STB (in the home) as the gateway to various services. The local server provides 2 kinds of services: • Broadcast: analogue & digital TV, NVOD, PPV • Point-to-point (home to local server): home shopping, VOD, e-mail, Web browsing, PC connection... • [Figure labels: up to 800 homes, local server, network, Internet, local server]
The evolution of our CE products (4) • The STB as a key element of the home network • [Figure labels: Residential Gateway linked to the telephone, satellite and cable networks; Home Network with computer, television, disk recorder and DVD jukebox]
The evolution of our CE products (5) • 3C Convergence - Progressive • New products combine all 3 functions • Products always more and more complex • Products have always new features • Lifetime of products is always shorter
Factors enabling such evolution • Compression is one of several factors (all powered by semiconductor progress) that enable multimedia.
BUT !! • Convergence of technologies (consumer, communication, computer): all products combine all three technologies • BUT ! • Divergence of applications • Home consumer, Multimedia phone, Camera, PDA, Office computer, Automotive… • High number of potential products • Technology push • Market pull (user-centric approach)
Compression in first A/V Products (1) • The first Audio/Video products performed compression without knowing it was compression. • How? By removal of irrelevancies • Audio and Video characteristics
Compression in first A/V Products (2) • Audio products: from 2 to 7.1 channels are enough to provide the spatial resolution. • Video products: three colours (RGB) are enough to provide the spectral resolution.
The need for more compression (1/5) • Audio: compression needed in spectral domain • Bitrate of a stereo audio source (CD-DA encoding) • Sampling frequency: 44.1 kHz • Stereo • 16 bits per sample • Bitrate = 44100 * 2 * 16 = 1.41 Mbit/s
The need for more compression (2/5) • Video: compression needed in spatial domain • Bitrate of a video source (CCIR 601 - 50 Hz countries) • 25 images per second • YUV coding (Y: luminance - U, V: chrominance) • Y: 8 bits per pixel - U, V: 1 pixel in 2 coded, 8 bits per pixel • Bitrate = (576*720)*25*16 = 166 Mbit/s
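The two raw bitrates quoted for CD audio and CCIR 601 video can be re-derived with a few lines of Python. This is only a sketch of the arithmetic; the function names are illustrative and the 16 bit/pixel figure is the average implied by the slide (8 bits of Y per pixel, plus U and V each coded on one pixel in two).

```python
def audio_bitrate(sample_rate_hz: int, channels: int, bits_per_sample: int) -> int:
    """Uncompressed PCM audio bitrate in bit/s."""
    return sample_rate_hz * channels * bits_per_sample


def video_bitrate(lines: int, pixels_per_line: int,
                  frames_per_s: int, bits_per_pixel: int) -> int:
    """Uncompressed video bitrate in bit/s."""
    return lines * pixels_per_line * frames_per_s * bits_per_pixel


# CD-DA: 44.1 kHz, stereo, 16 bits per sample
cd_audio = audio_bitrate(44_100, 2, 16)

# CCIR 601 (50 Hz): 576 lines x 720 pixels, 25 images/s,
# 8 bits of Y + 4 bits of U + 4 bits of V per pixel on average
ccir601 = video_bitrate(576, 720, 25, 8 + 4 + 4)

print(f"CD audio: {cd_audio / 1e6:.2f} Mbit/s")   # 1.41 Mbit/s
print(f"CCIR 601: {ccir601 / 1e6:.0f} Mbit/s")    # 166 Mbit/s
```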
The need for more compression (3/5) • Channels available for A/V transmission • Analog television channel (compatibility): cable (bandwidth = 8 MHz), satellite (bandwidth = 30-40 MHz); capacity around 40 Mbit/s • Compact disc (CD): for 74 min. play time: 1.41 Mbit/s
The need for more compression (4/5) • MPEG-1 target (Moving Picture Experts Group) (Video-CD: 74 min. constraint) • But quality was judged too poor (about VHS quality)
The need for more compression (5/5) • MPEG-2 target • Program stream (DVD) • Transport stream (DVB)
Principles of compression (1/2) • Compression (or source coding) is achieved by suppressing information: • redundant information • irrelevant information • Suppression of redundant information → lossless compression • Example: PCM to DPCM, DCT • The original signal and the one obtained after encoding and decoding are identical
Principles of compression (2/2) • Suppression of irrelevant information → lossy compression • Example: bandwidth limitation, masking in audio • The original signal and the one obtained after encoding and decoding are different but are perceived as identical
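The PCM to DPCM example can be illustrated with a minimal sketch. It assumes the simplest possible predictor (the previous sample) and no quantisation of the differences, which is what keeps the scheme lossless; the sample values are made up.

```python
def dpcm_encode(samples):
    """Replace each PCM sample by its difference from the previous sample."""
    previous = 0
    residuals = []
    for sample in samples:
        residuals.append(sample - previous)
        previous = sample
    return residuals


def dpcm_decode(residuals):
    """Rebuild the PCM samples by accumulating the differences."""
    previous = 0
    samples = []
    for residual in residuals:
        previous += residual
        samples.append(previous)
    return samples


pcm = [1000, 1004, 1009, 1011, 1010, 1006]   # hypothetical PCM samples
residuals = dpcm_encode(pcm)

print(residuals)                       # [1000, 4, 5, 2, -1, -4]
assert dpcm_decode(residuals) == pcm   # decoding restores the signal exactly
```

After the first value the residuals are much smaller than the samples, so they can be stored with fewer bits: redundancy is removed while the decoded signal stays identical.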
Audio Demonstration • From “Borderline” (Madonna) - Stereo - 16 bit/channel • Compression used: AAC • Original: 705 kbps • Compressed, then decompressed: 128 kbps, 64 kbps, 32 kbps, 16 kbps
MOS scale (1/2) • Signal distortion is not a good measure of the performance of a lossy compression method → another method is necessary: the MOS scale (Mean Opinion Score) • The five-grade CCIR impairment scale (Rec. 562): 1 (Very annoying), 2 (Annoying), 3 (Slightly annoying), 4 (Perceptible but not annoying), 5 (Imperceptible) • Example: double-blind test
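A Mean Opinion Score is simply the average of the grades given by the listeners. The sketch below uses the five-grade scale from the slide; the listener grades are invented for illustration and the function name is hypothetical.

```python
from statistics import mean

# Five-grade impairment scale as listed on the slide
GRADES = {
    1: "Very annoying",
    2: "Annoying",
    3: "Slightly annoying",
    4: "Perceptible but not annoying",
    5: "Imperceptible",
}


def mean_opinion_score(grades):
    """Average the listeners' grades after checking they are on the scale."""
    for grade in grades:
        if grade not in GRADES:
            raise ValueError(f"grade {grade!r} is not on the 1-5 scale")
    return mean(grades)


# Hypothetical grades from eight listeners for one coded excerpt
listener_grades = [5, 4, 4, 5, 3, 4, 5, 4]
mos = mean_opinion_score(listener_grades)
print(mos)   # 4.25
```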
Compression to VBR or CBR • CBR (Constant Bit Rate) vs VBR (Variable Bit Rate) • More complex scene → higher bit rate for the same quality • CBR → variable quality (example: Video CD artefacts) • Constant quality → VBR necessary (e.g. DVD-Video)
The compression trade-off • Compression techniques are still making progress • Trade-off Complexity/Quality/Bit Rate • A new technique may result in a new trade-off • [Figure: Complexity/Quality/Bitrate diagram positioning MPEG Layer 1, MPEG Layer 2, MPEG Layer 3, MPEG AAC, speech coding and other techniques]
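The CBR/VBR distinction can be made concrete with a toy model. Everything here is an assumption for illustration only: quality is modelled as bits divided by scene complexity, and the complexity values are invented. Real encoders use far more elaborate rate control.

```python
def cbr_allocation(complexity, total_bits):
    """Constant bit rate: every frame receives the same number of bits."""
    per_frame = total_bits / len(complexity)
    return [per_frame for _ in complexity]


def vbr_allocation(complexity, total_bits):
    """Variable bit rate: bits are shared in proportion to scene complexity."""
    total_complexity = sum(complexity)
    return [total_bits * c / total_complexity for c in complexity]


def quality(bits, complexity):
    """Toy quality measure (assumption): bits available per unit of complexity."""
    return [b / c for b, c in zip(bits, complexity)]


complexity = [1.0, 1.0, 4.0, 2.0]   # hypothetical per-frame complexity
total_bits = 8000

cbr_quality = quality(cbr_allocation(complexity, total_bits), complexity)
vbr_quality = quality(vbr_allocation(complexity, total_bits), complexity)

print(cbr_quality)   # [2000.0, 2000.0, 500.0, 1000.0] -> quality drops on complex frames
print(vbr_quality)   # [1000.0, 1000.0, 1000.0, 1000.0] -> constant quality
```

With the same total budget, the constant-rate allocation lets quality fall on the complex frame, while the variable-rate allocation keeps it flat.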
Audio compression in MPEG (1/5) • Based on psycho-acoustics • Compress the bit rate without affecting the quality perceived by the human ears (based on the imperfection of human ears) • Removal of irrelevancies • 4 main principles : • Threshold of audibility • Frequency masking • Critical bands • Temporal masking
Audio compression in MPEG (2/5) • Principle 1: Threshold of audibility • Not all frequency components need to be encoded with the same resolution. • Nr_bit(f) = (signal/threshold, in dB) / 6, i.e. about 6 dB per bit
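The 6 dB per bit rule can be turned into a small helper. This is a sketch under stated assumptions: levels are given in dB so the signal/threshold ratio is a simple difference, the result is rounded up to a whole number of bits, and components below the threshold get no bits. The function name and the example levels are illustrative.

```python
import math

DB_PER_BIT = 6.0   # rule of thumb used on the slide


def bits_needed(signal_db: float, threshold_db: float) -> int:
    """Bits required so that quantisation noise stays below the threshold."""
    margin_db = signal_db - threshold_db   # signal/threshold ratio expressed in dB
    if margin_db <= 0:
        return 0                           # component is inaudible: nothing to code
    return math.ceil(margin_db / DB_PER_BIT)


print(bits_needed(60.0, 0.0))    # 10
print(bits_needed(60.0, 30.0))   # 5
print(bits_needed(20.0, 25.0))   # 0
```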
Audio compression in MPEG (3/5) • Principle 2: Frequency masking Analysis of the incoming signal
Audio compression in MPEG (4/5) • Principle 3: Critical bands • The human ear may be modelled as a collection of narrow-band filters • Bandwidth of these filters = critical band • Critical band: < 100 Hz for the lowest audible frequencies, about 4 kHz for the highest audible frequencies • The human ear cannot distinguish between two sounds at different frequencies within the same critical band. Example: when we hear 50 & 60 Hz at the same time, we cannot distinguish them. • Consequence: the noise masking threshold depends solely on the signal energy within a limited bandwidth. The loudest sound is taken as the representative of the critical band. • Necessity to analyse the signal at 100 Hz resolution at low frequencies
Audio compression in MPEG (5/5) • Principle 4: Temporal masking → selection of the frame duration for frequency analysis and encoding.
An enabling tool: the filter bank (2/2) • After decimation, same bit rate as the original signal, but the signal is decomposed into various frequency ranges → possibility of frequency-based compression • Filter bank: aliasing occurs due to decimation • There exists a class of filter banks in which aliasing is compensated in the synthesis filter: QMF (Quadrature Mirror Filter), but with high complexity • Pseudo-QMF (polyphase filter bank) is used: a good compromise between computation cost and performance • Remark: aliasing may occur if the signal in an adjacent band is not reconstructed with adequate resolution.
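The idea of splitting into bands, decimating, and still recovering the signal can be shown with the simplest two-band case. The sketch below uses the two-tap Haar pair, which is a swapped-in toy example and not the 32-band pseudo-QMF used by MPEG: the low band is the half-sum and the high band the half-difference of each pair of samples.

```python
def analysis(x):
    """Split an even-length signal into low and high bands, each decimated by 2."""
    low = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return low, high


def synthesis(low, high):
    """Recombine the two bands; the decimation aliasing cancels out."""
    x = []
    for l, h in zip(low, high):
        x.extend([l + h, l - h])
    return x


signal = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]   # made-up samples
low, high = analysis(signal)

# Same number of samples as the original: no rate increase after decimation
assert len(low) + len(high) == len(signal)

# With both bands intact, the reconstruction is exact
assert synthesis(low, high) == signal

# If one band is discarded, the reconstruction is no longer exact
assert synthesis(low, [0.0] * len(high)) != signal
```

The last check relates to the remark on the slide: reconstruction only works cleanly when the neighbouring band is kept with enough resolution.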
The MPEG filter bank • In MPEG, 32 equal-width subbands are used • For each subband, it is necessary to define the maximum signal level and the minimum mask level. • BUT, at low frequencies: bandwidth of subbands > critical bands • Necessity to rely on an FFT to compensate for the lack of frequency selectivity of the filter bank at low frequencies
Psychoacoustic model & Bit allocation (1/2) • An FFT compensates for the lack of frequency selectivity of the filter bank at low frequencies • FFT: 512 samples (layer 1) & 1024 samples (layer 2); resolution for layer 1: Fs/512 < 100 Hz • A psychoacoustic model based on the FFT computes the signal-to-mask ratio for each subband (1 bit = 6 dB) • Ideally, after allocation, quantisation noise < masking level • The scale factors are computed for each subband from the filter bank output (floating-point representation of samples) • The bit allocator adjusts the bit allocation in order to meet the bitrate requirement. • The bitstream syntax depends on the MPEG layer (see later)
The MPEG decoder • The decoder is simple (complexity is on the encoder side) • Remark 1: DCC is MPEG-1, but the DCC encoder has no FFT and relies only on the power in the 32 subbands → higher bit rate (320 kbps) to reach transparent quality • Remark 2: MPEG specifies the bitstream syntax only. Encoders are described for information only, which leaves room for improvement.
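The mismatch between subband width and critical band at low frequencies follows directly from the numbers. A short sketch, assuming the 32 subbands evenly cover the range from 0 to Fs/2 and using the MPEG-1 sampling frequencies:

```python
N_SUBBANDS = 32
CRITICAL_BAND_LOW_HZ = 100.0   # critical band at the lowest audible frequencies


def subband_width_hz(sampling_rate_hz: float) -> float:
    """Width of one of the 32 equal subbands covering 0 .. Fs/2."""
    return (sampling_rate_hz / 2) / N_SUBBANDS


for fs in (32_000, 44_100, 48_000):
    width = subband_width_hz(fs)
    print(f"Fs = {fs} Hz: subband width = {width} Hz "
          f"(wider than a low-frequency critical band: {width > CRITICAL_BAND_LOW_HZ})")
```

At 48 kHz each subband is 750 Hz wide, several times the critical band at low frequencies, which is why a finer spectral analysis is needed there.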
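The FFT resolution and the role of the bit allocator can be sketched as follows. The allocator shown is a simplified greedy loop, an assumption for illustration and not the normative MPEG procedure: it repeatedly gives one more bit to the subband whose quantisation noise is furthest above its mask, using the 6 dB per bit rule, until the budget runs out or all noise is masked. The signal-to-mask ratios are invented.

```python
def fft_resolution_hz(sampling_rate_hz: float, fft_size: int) -> float:
    """Spacing between FFT bins."""
    return sampling_rate_hz / fft_size


def allocate_bits(smr_db, bit_budget, max_bits=15):
    """Greedy allocation: one bit at a time to the worst noise-to-mask ratio."""
    bits = [0] * len(smr_db)
    for _ in range(bit_budget):
        # Noise-to-mask ratio: each allocated bit lowers the noise by about 6 dB
        nmr = [smr - 6.0 * b for smr, b in zip(smr_db, bits)]
        candidates = [i for i in range(len(bits)) if bits[i] < max_bits]
        if not candidates:
            break
        worst = max(candidates, key=lambda i: nmr[i])
        if nmr[worst] <= 0:
            break   # quantisation noise is below the mask in every subband
        bits[worst] += 1
    return bits


print(fft_resolution_hz(48_000, 512))    # 93.75 Hz, below 100 Hz
print(fft_resolution_hz(48_000, 1024))   # 46.875 Hz

smr_db = [30.0, 12.0, -5.0, 20.0]         # hypothetical signal-to-mask ratios
print(allocate_bits(smr_db, 8))           # [4, 1, 0, 3]   budget exhausted
print(allocate_bits(smr_db, 100))         # [5, 2, 0, 4]   all noise masked
```

The subband with a negative signal-to-mask ratio never receives bits, since its content is already masked.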
Audio features in MPEG • MPEG-1: mono/stereo/dual/joint stereo (Dolby Surround possible); sampling frequencies: 32, 44.1 & 48 kHz; 3 layers: trade-off between complexity/delay and coding efficiency; various bit rates: trade-off between quality and bitrate • MPEG-2: 5.1 channels; sampling frequencies extended to 16, 22.05 & 24 kHz
Dolby surround principles (1/5) • 4 channels carried by a stereo pair → same tools as for stereo • Compatible with stereo installations
Dolby surround principles (3/5) • Simple decoder provides only 3 dB channel separation (see previous equations) → need for improvement → Dolby Surround Pro Logic decoder (next slide)
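The 3 dB figure can be checked numerically. The equations referred to on the slide are not reproduced in this text, so the sketch below assumes the commonly published form of the matrix: centre and surround are mixed into the stereo pair at -3 dB, with the surround in opposite polarity on the two sides. The 90 degree phase shift and band-limiting applied to the surround channel in a real encoder are deliberately omitted, and all function names are illustrative.

```python
import math

G = math.sqrt(0.5)   # -3 dB gain


def encode(left, right, centre, surround):
    """Matrix 4 channels into a stereo-compatible pair (Lt, Rt). Assumed form."""
    lt = left + G * centre + G * surround
    rt = right + G * centre - G * surround
    return lt, rt


def passive_decode(lt, rt):
    """Simple (passive) decoder: sum gives the centre, difference the surround."""
    return {"L": lt, "R": rt, "C": G * (lt + rt), "S": G * (lt - rt)}


def separation_db(wanted, leaked):
    """Level difference between the intended channel and an adjacent one."""
    return 20 * math.log10(abs(wanted) / abs(leaked))


# A source present only in the centre channel
out = passive_decode(*encode(0.0, 0.0, 1.0, 0.0))
print(out)
print(f"centre/left separation: {separation_db(out['C'], out['L']):.2f} dB")   # 3.01 dB
```

A centre-only source still appears in the left and right outputs only 3 dB below the centre output, which is the limited separation that motivates an active decoder.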
Dolby surround principles (4/5) • [Figure: Dolby Surround Pro Logic decoder]
Dolby surround principles (5/5) • Performance of the Dolby Pro Logic decoder: channel separation larger than 35 dB
MPEG-2 surround configurations • Channel configurations (front/back): 3/2, 3/0 + 2/0, 3/1, 2/2, 2/0 + 2/0, 3/0, 2/1, 2/0, 1/0 • Optional LFE channel (Fs/96), 15-120 Hz • 5.1 surround sound