Digital Audio Compression
Formats • There are many different formats for storing and communicating digital audio: • CD audio • WAV • AIFF • AU • MP3
The Storage Problem • CD-quality recording • 44100 Hz sampling rate • 16-bit quantization • 2 channels (stereo) • 176.4 Kbytes per second • 1 minute is ~10.6 MBytes • 74 minutes is ~780 MB
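The figures on this slide follow directly from the three CD parameters. A minimal Python sketch of the arithmetic, assuming decimal units (1 Kbyte = 1000 bytes, 1 MB = 10^6 bytes); the function and variable names are illustrative only:

```python
def pcm_bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM data rate in bytes per second."""
    return sample_rate_hz * (bits_per_sample // 8) * channels

# CD audio: 44100 Hz, 16 bits, stereo
cd_rate = pcm_bytes_per_second(44100, 16, 2)   # bytes per second
per_minute_mb = cd_rate * 60 / 1e6             # MB for one minute
full_disc_mb = cd_rate * 60 * 74 / 1e6         # MB for 74 minutes

print(cd_rate, per_minute_mb, full_disc_mb)
```

With these assumptions the data rate comes out at 176400 bytes per second, roughly 10.6 MB per minute and about 783 MB for a 74-minute disc.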
Psychoacoustics • The study of the psychological and physiological principles of sound perception • CDs try to accurately reproduce the original audio signal • But we do not hear all of this signal • The parts that we don’t hear are redundant • If we remove these parts we can store the signal using less data but without affecting the perceived sound
Threshold of Hearing & Masking • The threshold of hearing curve describes the minimum level at which the ear can detect a tone at a given frequency • (Figure: Fletcher-Munson curves)
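The slide gives no formula for the threshold curve. One widely used closed-form approximation of the threshold in quiet, usually attributed to Terhardt, is sketched below. It is an assumption brought in for illustration and is not taken from the slides; the function name is hypothetical.

```python
import math

def threshold_in_quiet_db(f_hz):
    """Approximate threshold of hearing in dB SPL at frequency f_hz (Hz).

    Terhardt-style approximation; an assumption for illustration only.
    """
    k = f_hz / 1000.0  # frequency in kHz
    return (3.64 * k ** -0.8
            - 6.5 * math.exp(-0.6 * (k - 3.3) ** 2)
            + 1e-3 * k ** 4)

for f in (100, 1000, 3300, 15000):
    print(f, round(threshold_in_quiet_db(f), 1))
```

Under this approximation the ear is most sensitive around 3 to 4 kHz, and the threshold rises steeply towards both very low and very high frequencies.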
Amplitude Masking • Amplitude masking occurs when a tone shifts the threshold curve upwards in the frequency region that surrounds it
Critical Band • Hair cells on the basilar membrane respond to the strongest stimulation in their local region • This local region is called the critical band • Critical bands are narrower for low-frequency signals than they are for high-frequency signals
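To put rough numbers on how critical bandwidth grows with frequency, the sketch below uses the Zwicker and Terhardt approximation. This formula is an assumption added for illustration and does not come from the slides; the function name is hypothetical.

```python
def critical_bandwidth_hz(f_hz):
    """Approximate critical bandwidth (Hz) centred on f_hz (Hz).

    Zwicker/Terhardt-style approximation; an assumption for illustration.
    """
    k = f_hz / 1000.0  # frequency in kHz
    return 25.0 + 75.0 * (1.0 + 1.4 * k ** 2) ** 0.69

for f in (100, 1000, 10000):
    print(f, round(critical_bandwidth_hz(f)))
```

Under this approximation the band is about 100 Hz wide at low frequencies and more than 2 kHz wide near 10 kHz.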
Temporal Masking • Masking can also occur when tones are sounded at slightly different times • Premasking – signal A is masked by signal B which occurs later • Postmasking – signal A is masked by signal B which ends before signal A has started • Temporal masking becomes stronger as the time difference between the signals shrinks
Masking • Amplitude and temporal masking form a masking area in the time-frequency domain
Perceptual Coding • Perceptual coders analyse the frequency and amplitude content of the input signal and compare it to a model of human auditory perception • Parts of the input signal which are inaudible are removed
Perceptual Coding • A perceptual coder uses a digital filter bank to split a short duration of audio signal into multiple frequency bands
Perceptual Coding • The coder analyses the energy in each of these subbands to determine which subbands contain audible information • Subbands which are not audible are not coded
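The band-splitting and energy-analysis steps can be sketched as follows. Real coders use a dedicated filter bank; this example substitutes a naive DFT over equal-width bands as a simplified stand-in. The function name, frame length and band count are illustrative assumptions.

```python
import cmath
import math

def band_energies(frame, num_bands):
    """Sum spectral energy in num_bands equal-width bands below Nyquist."""
    n = len(frame)
    half = n // 2
    # Naive DFT for the bins from 0 up to (not including) Nyquist
    spectrum = [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(half)
    ]
    width = half // num_bands
    return [
        sum(abs(c) ** 2 for c in spectrum[b * width:(b + 1) * width])
        for b in range(num_bands)
    ]

# Hypothetical 64-sample frame holding a single tone at DFT bin 5
frame = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
energies = band_energies(frame, 4)
print([round(e, 3) for e in energies])
```

Here essentially all of the energy lands in the lowest band, so a coder following the slide's rule would spend bits only on that band if the other three fall below the audibility threshold.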
Perceptual Coding • Quantization bits are assigned according to signal strength above the audibility curve
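A minimal sketch of this allocation rule, assuming that each quantization bit buys roughly 6 dB of signal-to-noise ratio (a standard rule of thumb for uniform quantization). The subband levels and masking thresholds are made-up example values, and real encoders typically iterate the allocation against a bit budget rather than applying it in one pass.

```python
import math

DB_PER_BIT = 6.02  # approximate SNR gained per quantization bit (assumption)

def allocate_bits(signal_db, mask_db, max_bits=16):
    """Return a bit count per subband from signal and masking levels in dB."""
    bits = []
    for s, m in zip(signal_db, mask_db):
        smr = s - m  # signal-to-mask ratio: level above the audibility curve
        if smr <= 0:
            bits.append(0)  # at or below the threshold: subband is not coded
        else:
            bits.append(min(max_bits, math.ceil(smr / DB_PER_BIT)))
    return bits

# Hypothetical levels for four subbands
signal_db = [60.0, 42.0, 30.0, 12.0]
mask_db = [20.0, 35.0, 33.0, 25.0]
print(allocate_bits(signal_db, mask_db))  # [7, 2, 0, 0]
```

The two subbands whose signal sits below the masking threshold receive no bits at all, which is where most of the data reduction comes from.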
Perceptual Coding • The purpose of perceptual coding is to reduce the data rate • Perceptual coders maintain the sampling frequency but selectively decrease the word length • A coder's reduction ratio is the ratio of input bit rate to output bit rate • Ratios of up to 6:1 are often transparent
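As a worked example of the reduction ratio, the sketch below compares the CD bit rate with a 128 kbit/s output. The 128 kbit/s figure is used only as an example target, and the names are illustrative.

```python
def reduction_ratio(input_bps, output_bps):
    """Ratio of input bit rate to output bit rate."""
    return input_bps / output_bps

cd_bps = 44100 * 16 * 2                         # CD audio bit rate in bit/s
ratio_128k = reduction_ratio(cd_bps, 128_000)   # ratio for a 128 kbit/s output
rate_at_6_to_1 = cd_bps / 6                     # output rate at a 6:1 ratio

print(cd_bps, ratio_128k, rate_at_6_to_1)
```

The CD bit rate is 1411200 bit/s, so a 6:1 ratio corresponds to about 235 kbit/s, while a 128 kbit/s output is a ratio of roughly 11:1.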
Perceptual Coding • Because the inaudible content of the signal is removed, the playback system’s ability to convey audible music should improve • In theory it is possible to get better reproduction after perceptual coding than from the original! (In theory…) • Perceptual coders more properly code an audio signal for passage through an audio system
MP3 • MPEG-1 Audio Layer 3 • Developed to support audio coding for playback with video • Uses: • A filterbank producing 32 subbands from 24 ms of audio data • Perceptual coder originally produced by the Fraunhofer-Institut für Integrierte Schaltungen (Fraunhofer IIS) • Lossless Huffman coding
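The Huffman stage is lossless because it only re-labels the quantized values with shorter codes for the more frequent ones. The sketch below is a generic textbook Huffman coder that builds its table from the data. It is not the MP3 procedure, which selects from predefined tables; the sample values are hypothetical.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(symbols)
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far})
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate case: only one distinct symbol
        return {s: "0" for s in heap[0][2]}
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)  # two least frequent subtrees
        w2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

# Hypothetical quantized subband values, dominated by zeros
quantized = [0, 0, 0, 0, 0, 1, 1, -1, 0, 2, 0, 0]
code = huffman_code(quantized)
bits = "".join(code[s] for s in quantized)
print(code, len(bits))
```

For these twelve values a fixed-length code would need 2 bits per symbol (24 bits in total), whereas the Huffman code gives the frequent zero a 1-bit codeword and needs fewer bits overall.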
MP3 • Sound quality is highly dependent on the performance of the encoder • Most encoders use constant-bitrate (CBR) encoding. In this mode you choose a target bitrate (e.g. 128 kbit/s) • Codecs • Fraunhofer • Xing MP3 encoder • Etc…
Joint Stereo Coding • Takes advantage of interchannel redundancy between stereo channels • Some sounds and some components are equal in both channels • Low frequencies: Bass instruments, strings, low components of drums • Centrally placed signals: typically vocals • Removing duplication reduces data without affecting perceived sound
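One common way to exploit this redundancy is mid/side coding, which stores the sum and difference of the two channels instead of left and right. The slides do not name a specific technique, so treat the sketch below as an assumed illustration with hypothetical sample values.

```python
def to_mid_side(left, right):
    """Convert left/right samples to mid (average) and side (half difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def from_mid_side(mid, side):
    """Reconstruct left/right samples from mid and side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

# Hypothetical samples: the channels are identical except for the last sample
left = [0.5, 0.25, -0.5, 0.75]
right = [0.5, 0.25, -0.5, 0.25]
mid, side = to_mid_side(left, right)
print(mid, side)
```

Wherever the two channels carry the same content the side signal is zero, so it can be coded with very few bits, and the original channels are recovered with `from_mid_side`.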
Fin