MPEG-3 For Audio Presented by: Chun Lui Sunjeev Sikand
History of MP3 • In 1987, the Fraunhofer IIS started work on perceptual audio coding within the EUREKA project EU147, Digital Audio Broadcasting (DAB). • In cooperation with the University of Erlangen (Prof. Dieter Seitzer), the Fraunhofer IIS devised a very powerful algorithm that was standardized as ISO-MPEG Audio Layer-3 (IS 11172-3 and IS 13818-3). • MP3 = Moving Picture Experts Group (MPEG) Audio Layer-3
Background: How a CD stores music • Music is sampled 44,100 times/second • Each sample is 2 bytes (16 bits) • Samples are taken separately for the left and right channels • 44,100 samples/sec * 16 bits/sample * 2 channels = 1,411,200 bits/sec • 3-minute song: 1,411,200 bits/sec * 180 sec = 254,016,000 bits = 31,752,000 bytes (about 32 MB) • Yikes!
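The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch of the calculation above; the function and constant names are made up for illustration.

```python
# CD audio parameters taken from the slide.
SAMPLE_RATE_HZ = 44_100   # samples per second, per channel
BITS_PER_SAMPLE = 16      # 2 bytes per sample
CHANNELS = 2              # left and right


def cd_bit_rate():
    """Uncompressed CD bit rate in bits per second."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS


def cd_size_bytes(seconds):
    """Uncompressed size in bytes (8 bits per byte) for a given duration."""
    return cd_bit_rate() * seconds // 8


print(cd_bit_rate())        # bits per second
print(cd_size_bytes(180))   # bytes for a 3-minute song
```

Note the division by 8: the bit rate is in bits, while the song size is quoted in bytes.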
Three Layers, Three Applications
Name     Compression Factor  Bit rate             Application
Layer 1  1:4                 384 Kbit/sec         Digital Compact Cassettes
Layer 2  1:6 to 1:8          256 to 192 Kbit/sec  Digital Radio
Layer 3  1:10 to 1:12        128 to 112 Kbit/sec  Digital Internet Music
The 3 MPEG layers
An Overview of MP3 Quality Levels
Sound Quality         Mode    Bit rate           Compression Ratio
Telephone             mono    8 Kbit/s           96:1
Better than SW Radio  mono    16 Kbit/s          48:1
Better than MW Radio  mono    32 Kbit/s          24:1
Similar to VHF Radio  stereo  56 to 64 Kbit/s    26 to 24:1
Similar to CD         stereo  96 Kbit/s          16:1
CD quality            stereo  112 to 128 Kbit/s  14 to 12:1
MP3 Performance
How Does It Work? • Every MP3 encoder works in two stages • First, it compresses the audio stream with the help of perceptual noise shaping. • Second, the compressed and frequency-cleansed data is shrunk again using Huffman encoding.
Perceptual Noise Shaping • Much as YUV encoding exploits the limits of human vision, MP3 exploits the limits of human hearing (psychoacoustics) • The human ear cannot distinguish between two very similar frequencies • There are sounds that the human ear cannot hear • There are sounds that the human ear hears much better than others • If two sounds play simultaneously, the louder one masks the quieter one
Perceptual Noise Shaping Example • If one band contains a loud sound, quieter sounds in the other bands are masked and do not all need to be encoded
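A toy sketch of the masking idea follows. It is not the psychoacoustic model of the standard. It assumes a made-up rule in which a band's masking effect falls off by a fixed number of dB per band of distance; `audible_bands` and `slope_db_per_band` are hypothetical names chosen for this illustration.

```python
def audible_bands(levels_db, slope_db_per_band=12.0):
    """Return the indices of bands that are NOT masked by a louder band.

    Toy rule (an assumption, not the MP3 model): a masker at level L
    hides any band d bands away whose level is at or below
    L - slope_db_per_band * d.
    """
    audible = []
    for i, level in enumerate(levels_db):
        # Highest masking threshold cast onto band i by all other bands.
        threshold = max(
            (other - slope_db_per_band * abs(i - j)
             for j, other in enumerate(levels_db) if j != i),
            default=float("-inf"),
        )
        if level > threshold:
            audible.append(i)
    return audible


# Band levels in dB: band 1 is loud and hides its quieter neighbours.
levels = [20, 60, 45, 30, 50]
print(audible_bands(levels))
```

In this toy input only bands 1 and 4 survive; the other three fall under the threshold cast by a louder band and could be dropped or coded coarsely.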
Analysis Filter Bank • The audio signal passes through a filter bank that divides it into 576 frequency lines (32 sub-bands, each split into 18 lines). This requires very complex filters. Here, MP3 encoders use the Modified Discrete Cosine Transform (MDCT), a variant of the well-known Discrete Cosine Transform. • The encoder then works out which frequency components are unnecessary and eliminates them iteratively (repeatedly) until the best possible result is achieved.
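As a rough illustration of the transform step, here is a direct (slow) MDCT in plain Python. It is a sketch only: a real MP3 encoder first runs a 32-band polyphase filter bank and applies a window before the MDCT, and both are omitted here. The block size of 36 inputs giving 18 outputs is the long-block case; the function name `mdct` is our own.

```python
import math


def mdct(block):
    """Direct O(N^2) MDCT: 2N input samples -> N frequency coefficients."""
    two_n = len(block)
    if two_n == 0 or two_n % 2:
        raise ValueError("block length must be even and non-zero")
    n = two_n // 2
    return [
        sum(
            block[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(two_n)
        )
        for k in range(n)
    ]


N = 18    # frequency lines per sub-band; 32 sub-bands * 18 lines = 576
K0 = 5    # index of the basis function used as the test tone

# A tone that matches basis function K0 exactly.
tone = [math.cos(math.pi / N * (t + 0.5 + N / 2) * (K0 + 0.5))
        for t in range(2 * N)]

coeffs = mdct(tone)
peak = max(range(N), key=lambda k: abs(coeffs[k]))
print(peak, round(coeffs[peak], 6))
```

All the energy of the test tone lands in a single coefficient, which is what lets the later stages treat each frequency line separately.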
Masking Threshold Evaluation • At the same time, the audio signal passes through the psychoacoustic model. For every sub-band of the entire signal spectrum, the masking threshold is determined using the Discrete Fourier Transform. • Joint stereo coding can then exploit the fact that both channels of a stereo pair contain largely the same information. These stereophonic irrelevancies and redundancies are exploited to reduce the total bit rate.
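Mid/side (M/S) coding is one common form of joint stereo. The sketch below assumes the usual sum/difference definition scaled by the square root of 2; the sample values are invented for illustration.

```python
import math

SQRT2 = math.sqrt(2.0)


def ms_encode(left, right):
    """Convert left/right samples to mid (sum) and side (difference)."""
    mid = [(l + r) / SQRT2 for l, r in zip(left, right)]
    side = [(l - r) / SQRT2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Invert ms_encode and recover left/right samples."""
    left = [(m + s) / SQRT2 for m, s in zip(mid, side)]
    right = [(m - s) / SQRT2 for m, s in zip(mid, side)]
    return left, right


# Two nearly identical channels (made-up sample values).
left = [0.50, 0.40, -0.30, 0.10]
right = [0.48, 0.41, -0.29, 0.10]

mid, side = ms_encode(left, right)
mid_energy = sum(m * m for m in mid)
side_energy = sum(s * s for s in side)
print(mid_energy, side_energy)
```

When the channels are similar, almost all of the energy ends up in the mid signal, so the side signal can be coded with very few bits.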
Quantization and Encoding • Quantizing the samples offers another opportunity for data reduction. Every sample is made up of 16 bits, but not all 16 are necessarily needed to represent the sound. As such, the leading zeros of a 16-bit sample may be left out. • At the same time, individual samples are analyzed and compressed again using Huffman encoding. This reduces the data by a further 20 percent or so.
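The Huffman step can be sketched as follows. This builds a code from the symbol counts of the input itself, which is the textbook construction; MP3 instead selects from predefined Huffman tables, so treat this as an illustration of the principle only. The quantized values are invented.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix-free code (symbol -> bit string) from symbol counts."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie_breaker, {symbol: code_so_far}).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, codes1 = heapq.heappop(heap)   # two least frequent subtrees
        c2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]


# Made-up quantized values: small magnitudes dominate.
quantized = [0, 0, 0, 0, 0, 0, 1, 1, -1, -1, 2, 0, 0, 1, 0, -1]
code = huffman_code(quantized)
encoded_bits = sum(len(code[v]) for v in quantized)
fixed_bits = 2 * len(quantized)   # 4 distinct symbols -> 2 bits each
print(code, encoded_bits, fixed_bits)
```

Frequent values get short codes and rare values get long ones, so the total is smaller than a fixed-width encoding of the same values.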
Bitstream creation • Now all the data has been gathered. • Finally, the encoder forms the bit stream, which ultimately represents the MP3 file: the compressed data is packed into so-called frames. • For MP3, there are 1152 samples per frame (32 sub-bands * 36 samples). Every frame consists of a header, an optional checksum (CRC), the audio data and sometimes a bit reservoir.
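From the 1152 samples per frame, the frame duration and frame size follow directly. The size formula below (144 * bit rate / sample rate, plus an optional padding byte) is the usual one for MPEG-1 Layer 3 and is not stated on the slide; the function names are our own.

```python
SAMPLES_PER_FRAME = 1152   # 32 sub-bands * 36 samples


def frame_duration_s(sample_rate_hz):
    """Playback time covered by one frame, in seconds."""
    return SAMPLES_PER_FRAME / sample_rate_hz


def frame_size_bytes(bit_rate_bps, sample_rate_hz, padding=0):
    """Frame length in bytes, header included.

    144 = 1152 samples / 8 bits per byte. The padding byte lets the
    encoder hit the exact average bit rate when the division is inexact.
    """
    return 144 * bit_rate_bps // sample_rate_hz + padding


print(frame_duration_s(44_100))            # about 26 ms per frame
print(frame_size_bytes(128_000, 44_100))   # bytes per frame at 128 Kbit/s
```

At 44.1 kHz a player therefore has to decode roughly 38 frames per second.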
MP3 Bit Rate • Measured in bits per second • Generally speaking, the higher the bit rate of an MP3 file, the higher its quality • There are 3 types of bit rate format: • Constant Bit Rate (CBR) • Variable Bit Rate (VBR) • Average Bit Rate (ABR)
Constant Bit Rate • Same bit rate for the entire file • Audio can be streamed more efficiently because the required bandwidth is fixed • Quality of the encoded content is not constant • This is because some content is harder to compress than other content • Different songs encoded at the same bit rate may end up with different quality
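With a constant bit rate the file size depends only on the duration, which a short calculation shows. This sketch ignores tags and any other metadata in the file; the helper name is hypothetical.

```python
def cbr_size_bytes(bit_rate_bps, seconds):
    """Audio payload size in bytes for a constant-bit-rate stream."""
    return bit_rate_bps * seconds // 8


CD_BIT_RATE_BPS = 44_100 * 16 * 2   # uncompressed CD audio, for comparison

mp3_bytes = cbr_size_bytes(128_000, 180)
cd_bytes = cbr_size_bytes(CD_BIT_RATE_BPS, 180)
print(mp3_bytes, cd_bytes, round(cd_bytes / mp3_bytes, 1))
```

For a 3-minute song at 128 Kbit/s this comes to under 3 MB, roughly an 11:1 reduction compared with the CD figure.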
Variable Bit Rate • Tries to achieve the best quality • Uses a different bit rate for each frame of the MP3 file • Uses a higher bit rate where the content needs it • Better quality • File size is not known in advance and can be larger • Some (mostly older) MP3 players do not support VBR
Average Bit Rate • A little like VBR • Targets the average of all the bit rates VBR would use • Dynamically changes the bit rate for better quality • The final file size is known in advance • Overall quality is slightly lower than VBR
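The difference between the three modes can be sketched with a toy allocator. Everything here is an assumption made for illustration: the per-frame "complexity" scores, the linear mapping from complexity to bit rate, and the scaling rule used for ABR are not how any real encoder decides. Only the list of allowed bit rates matches the MPEG-1 Layer 3 values.

```python
# Bit rates (Kbit/s) that an MPEG-1 Layer 3 frame may use.
ALLOWED_KBPS = [32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]


def nearest_allowed(kbps):
    """Snap a requested bit rate to the closest allowed value."""
    return min(ALLOWED_KBPS, key=lambda r: abs(r - kbps))


def cbr(complexity, target_kbps):
    """Every frame gets the same bit rate."""
    return [target_kbps] * len(complexity)


def vbr(complexity, low_kbps=64, high_kbps=256):
    """Harder frames (complexity near 1.0) get more bits (toy mapping)."""
    return [nearest_allowed(low_kbps + c * (high_kbps - low_kbps))
            for c in complexity]


def abr(complexity, target_kbps):
    """Scale the VBR choices so that their mean lands near the target."""
    raw = vbr(complexity)
    scale = target_kbps / (sum(raw) / len(raw))
    return [nearest_allowed(r * scale) for r in raw]


# Hypothetical per-frame difficulty scores between 0 and 1.
complexity = [0.1, 0.2, 0.9, 1.0, 0.3, 0.1]

for name, rates in [("CBR", cbr(complexity, 128)),
                    ("VBR", vbr(complexity)),
                    ("ABR", abr(complexity, 128))]:
    print(name, rates, round(sum(rates) / len(rates), 1))
```

CBR spends the same on every frame, VBR follows the content wherever it leads, and ABR follows the content while keeping the mean close to the requested rate.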
MP3 Competition • Windows Media Format - claims CD quality at half the bit rate • Real Audio - designed for lower bit rates • Liquid Audio - implements more security, so it is less appealing to users
Acknowledgements • Fraunhofer IIS: http://www.iis.fraunhofer.de/amm/techinf/layer3/index.html • Intel: http://www.intel.com/english/home/maximize/article/mp3/how/index.htm • Koning, Verhelst. ON PSYCHOACOUSTIC NOISE SHAPING FOR AUDIO REQUANTIZATION. Vrije Universiteit Brussel. • msdn.microsoft.com/library