PAC/AAC audio coding standard. A. Moreno, antonio@ece.gatech.edu, Georgia Institute of Technology, ECE8873, Spring 2004.
Overview • Audio Recording • Coding: the ultimate goal • AAC Encoder Block Diagram • Principles of Psychoacoustics • Perceptual Entropy • Quantization and Coding • Samples
Introduction "If a tree falls in the forest with no one around to hear it, does it make a sound?"
Audio Recording • Edison, 1877
Audio Recording • Philips, 1978 A/D Converter PCM
Coding • Ultimate goal: reduce the number of bits needed to represent the data. • PCM bitrate = Fs × word length, where Fs is the sampling frequency and the word length is the number of bits per sample.
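As a quick worked example of the bitrate formula, the sketch below computes the uncompressed PCM rate and the compression ratio a coder has to reach. The `channels` factor and the 16-bit word length are assumptions added for illustration; the slide formula covers a single channel and does not state a word length.

```python
def pcm_bitrate(sample_rate_hz, word_length_bits, channels=1):
    """Uncompressed PCM bitrate in bits per second."""
    return sample_rate_hz * word_length_bits * channels


# CD audio: 44.1 kHz, 16 bits per sample, 2 channels.
cd_bps = pcm_bitrate(44_100, 16, channels=2)
print(cd_bps / 1000)  # 1411.2 (kbps)

# Assumed 16-bit, 48 kHz stereo source coded at 128 kbps.
source_bps = pcm_bitrate(48_000, 16, channels=2)
print(source_bps / 128_000)  # 12.0 (compression ratio)
```

Under these assumptions a 128 kbps stereo stream keeps roughly one bit in twelve of the PCM original.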
AAC Encoder Block Diagram [Figure: AAC encoder. Signal-path blocks: s(n), Gain Control, MDCT, TNS, Multi-Channel (M/S, Intensity), Prediction (z^-1), Scale Factor Extract, Quant, Entropy Coding, Side information coding and Bitstream, channel. Control blocks: Perceptual Model, Iterative Rate Control Loop.]
Principles of Psychoacoustics • Source localization: two ears are necessary. The brain uses the intensity differences and the time delays between the signals perceived at the two ears.
Principles of Psychoacoustics • Absolute Hearing Threshold [Figure: absolute threshold curve; the region above the curve is audible, the region below it is inaudible]
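The threshold curve is often approximated in the perceptual-coding literature (for example in Painter and Spanias, 2000) by a closed-form fit attributed to Terhardt. The sketch below evaluates that fit; the function names are illustrative, and the fit describes an average young listener in quiet, not any individual ear.

```python
import math


def absolute_threshold_db_spl(f_hz):
    """Approximate threshold in quiet (dB SPL) at frequency f_hz > 0."""
    khz = f_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


def is_audible(level_db_spl, f_hz):
    """True if a tone at this level lies above the threshold in quiet."""
    return level_db_spl > absolute_threshold_db_spl(f_hz)


for f in (100, 1000, 3300, 16000):
    print(f, round(absolute_threshold_db_spl(f), 1))
```

The curve dips to its minimum near 3 to 4 kHz, where the ear is most sensitive, and rises steeply toward both ends of the audio band.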
Principles of Psychoacoustics • Loudness characteristic of the human ear: Robinson and Dadson equal-loudness contours.
Principles of Psychoacoustics • Critical Bands: concept introduced by Harvey Fletcher in 1940. • Frequency-to-place transform: the critical bandwidth is a function of frequency that quantifies the cochlear filter passbands. • Example: the critical band centered at 1 kHz is about 160 Hz wide. A narrowband noise centered at 1 kHz, at a fixed sound pressure level, is perceived with the same loudness as long as its bandwidth stays below 160 Hz.
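The 160 Hz figure can be checked against the analytic approximations usually attributed to Zwicker: one maps frequency to the Bark scale, the other gives the critical bandwidth at a center frequency. This is a minimal sketch; the function names are illustrative and the formulas are fits, not exact physiology.

```python
import math


def hz_to_bark(f_hz):
    """Approximate critical-band rate (Bark) for a frequency in Hz."""
    return (13.0 * math.atan(0.00076 * f_hz)
            + 3.5 * math.atan((f_hz / 7500.0) ** 2))


def critical_bandwidth_hz(f_hz):
    """Approximate critical bandwidth (Hz) at center frequency f_hz."""
    return 25.0 + 75.0 * (1.0 + 1.4 * (f_hz / 1000.0) ** 2) ** 0.69


print(round(critical_bandwidth_hz(1000.0)))  # 162
print(round(hz_to_bark(1000.0), 1))          # 8.5
```

At 1 kHz the fit gives a bandwidth of about 162 Hz, in line with the example on the slide.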
Principles of Psychoacoustics • Simultaneous Masking: Frequency [Figure: masking threshold around a masker, with the audible and inaudible regions marked]
Principles of Psychoacoustics • Simplified paradigms: Noise Masking Tone and Tone Masking Noise [Figure: thresholds TH_N and TH_T, each shown over a 1-Bark band; K = 3 dB to 5 dB (constant)]
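The two simplified paradigms are commonly written as fixed offsets below the masker level (this is the form given in Painter and Spanias, 2000): TH_N = E_T - 14.5 - B for a tone masking noise, and TH_T = E_N - K for noise masking a tone. The sketch below assumes levels in dB, B as the Bark number of the masker's critical band, and a default K of 4 dB picked from the 3 dB to 5 dB range on the slide.

```python
def tone_masking_noise_threshold_db(tone_energy_db, bark_band):
    """TH_N: noise level masked by a tonal masker of energy E_T in band B."""
    return tone_energy_db - 14.5 - bark_band


def noise_masking_tone_threshold_db(noise_energy_db, k_db=4.0):
    """TH_T: tone level masked by a noise masker of energy E_N (K assumed 4 dB)."""
    return noise_energy_db - k_db


print(tone_masking_noise_threshold_db(80.0, 9))  # 56.5
print(noise_masking_tone_threshold_db(80.0))     # 76.0
```

The asymmetry is the point: at the same level, noise hides a tone far more effectively than a tone hides noise.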
Principles of Psychoacoustics • Spread of Masking [Figure: masking threshold (th) spreading beyond the masker's 1-Bark band into neighbouring bands]
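A common analytic model for the spread of masking is Schroeder's spreading function, expressed in dB as a function of the Bark distance between maskee and masker. The sketch below is one such model, not necessarily the curve drawn on the slide; the function name and the sign convention (positive distance means the maskee lies above the masker) are assumptions.

```python
import math


def spreading_db(dz_bark):
    """Schroeder spreading function in dB at Bark distance dz (maskee minus masker)."""
    x = dz_bark + 0.474
    return 15.81 + 7.5 * x - 17.5 * math.sqrt(1.0 + x * x)


for dz in (-2, -1, 0, 1, 2):
    print(dz, round(spreading_db(dz), 1))
```

The function is close to 0 dB at the masker and falls off more slowly above it than below it, so a masker hides more of the higher bands than the lower ones.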
Principles of Psychoacoustics • Masking: Temporal
Perceptual Entropy • Perceptual entropy (PE) is an objective metric of perceptually relevant information, introduced by J. Johnston. • The information perceived from an audio signal is only a fraction of the total information emitted by the source.
Perceptual Entropy • Procedure: • Window the signal and transform it to the frequency domain. • Compute the masking threshold using perceptual rules. • Determine the number of bits required to quantize the spectrum without injecting perceptible noise.
Perceptual Entropy [Figure: PE estimation. Blocks: s(n), Hann window, MDCT, Spectral Flatness Measure, coefficient of 'tonality', determine nature (noise-like or tone-like), apply thresholding rules (offset), JND estimates]
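The tonality decision in this chain can be sketched as follows, using the form commonly given for Johnston's method (as summarized in Painter and Spanias, 2000): the spectral flatness measure (SFM) is the ratio of the geometric to the arithmetic mean of the power spectrum in dB, a tonality coefficient is obtained by scaling it against -60 dB, and the coefficient blends the tone-like and noise-like threshold offsets. Function names, the small floor on the power values, and the clamp to [0, 1] are assumptions of this sketch.

```python
import math


def spectral_flatness_db(power):
    """SFM in dB: 10*log10(geometric mean / arithmetic mean) of a power spectrum."""
    floored = [max(p, 1e-12) for p in power]  # avoid log10(0)
    n = len(floored)
    log_geometric_mean = sum(math.log10(p) for p in floored) / n
    arithmetic_mean = sum(floored) / n
    return 10.0 * (log_geometric_mean - math.log10(arithmetic_mean))


def tonality(sfm_db, sfm_db_max=-60.0):
    """Coefficient of tonality: 0 for noise-like, 1 for tone-like spectra."""
    return max(0.0, min(sfm_db / sfm_db_max, 1.0))


def masking_offset_db(alpha, bark_band):
    """Offset below the spread masker energy, blended by tonality alpha."""
    return alpha * (14.5 + bark_band) + (1.0 - alpha) * 5.5


flat = [2.0] * 8                 # noise-like: flat power spectrum
peaky = [1e-9] * 63 + [1.0]      # tone-like: one dominant line
for spectrum in (flat, peaky):
    a = tonality(spectral_flatness_db(spectrum))
    print(a, masking_offset_db(a, bark_band=5))
```

A flat spectrum yields the 5.5 dB noise offset, while a spectrum dominated by one line yields the larger tone offset of 14.5 dB plus the band number.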
Perceptual Entropy • Symbols used in the PE formula: i: index of the critical band; bl_i, bh_i: lower and upper bounds of band i; k_i: number of transform components in band i; T_i: masking threshold in band i; nint: rounding to the nearest integer.
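These symbols match Johnston's PE expression as it is usually quoted (for example in Painter and Spanias, 2000): for every spectral line w in band i, the real and imaginary parts are divided by sqrt(6 T_i / k_i), rounded with nint, and each contributes log2(2|n| + 1) bits; the sum runs over all lines and all bands. The sketch below implements that sum under stated assumptions: a complex spectrum, inclusive band edges, and a total in bits with no per-sample normalization applied.

```python
import math


def nint(x):
    """Round to the nearest integer, halves away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def perceptual_entropy_bits(spectrum, band_edges, thresholds):
    """Total PE in bits for a complex spectrum.

    band_edges: list of (bl_i, bh_i) inclusive index pairs.
    thresholds: masking threshold T_i > 0 for each band.
    """
    total = 0.0
    for (lo, hi), t in zip(band_edges, thresholds):
        k = hi - lo + 1                   # transform components in the band
        step = math.sqrt(6.0 * t / k)     # quantizer step implied by T_i
        for w in range(lo, hi + 1):
            re = abs(nint(spectrum[w].real / step))
            im = abs(nint(spectrum[w].imag / step))
            total += math.log2(2 * re + 1) + math.log2(2 * im + 1)
    return total


quiet = [0.1 + 0.1j] * 4
loud = [10.0 + 0.0j] * 4
print(perceptual_entropy_bits(quiet, [(0, 3)], [4.0]))  # 0.0
print(perceptual_entropy_bits(loud, [(0, 3)], [4.0]))
```

Lines that fall below the step implied by the masking threshold round to zero and cost no bits, which is how the metric discards inaudible detail.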
Returning • "If a tree falls in the forest with no one around to hear it, does it make a sound?" From a Perceptual Coding standpoint, if no one can hear it, THERE IS NO TREE.
AAC Encoder Block Diagram [Figure: AAC encoder. Signal-path blocks: s(n), Gain Control, MDCT, TNS, Multi-Channel (M/S, Intensity), Prediction (z^-1), Scale Factor Extract, Quant, Entropy Coding, Side information coding and Bitstream, channel. Control blocks: Perceptual Model, Iterative Rate Control Loop.]
Quantization and Coding • Power-law quantizer • Huffman Coding (table can be chosen) • Global Gain -> Quantization step size • Scale Factors -> noise shaping factor
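A power-law quantizer compresses the magnitude with an exponent below one before rounding, so large coefficients get coarser steps than small ones. The sketch below uses the 3/4 exponent and the 0.4054 rounding offset associated with AAC encoders; the mapping from global gain and scale factor to a step size (a factor of 2^(1/4) per unit) is a simplified assumption, and all function names are illustrative.

```python
ROUNDING_OFFSET = 0.4054  # rounding constant associated with AAC encoders


def step_from_gain(global_gain, scale_factor=0):
    """Assumed mapping: each gain unit changes the step by a factor of 2**0.25."""
    return 2.0 ** ((global_gain - scale_factor) / 4.0)


def quantize(x, step):
    """Power-law quantizer: compress |x|/step with exponent 3/4, then round."""
    q = int((abs(x) / step) ** 0.75 + ROUNDING_OFFSET)
    return q if x >= 0 else -q


def dequantize(q, step):
    """Inverse: expand |q| with exponent 4/3 and rescale by the step."""
    magnitude = abs(q) ** (4.0 / 3.0) * step
    return magnitude if q >= 0 else -magnitude


step = step_from_gain(global_gain=0)
for x in (3.0, 30.0, 1000.0):
    q = quantize(x, step)
    print(x, q, round(dequantize(q, step), 2))
```

Raising the global gain coarsens every band at once, while raising one band's scale factor refines only that band, which is what lets the encoder shape the noise per band.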
Quantization and Coding
while NOISE_CTL                        (outer loop: noise control)
    FINDING_RATE = 1;
    while FINDING_RATE                 (inner loop: rate control)
        Nr_bits = get_bits_needed();
        if (Nr_bits > max_bits)
            adjust_global_gain();
        else
            FINDING_RATE = 0;
        end
    end
    q_noise = get_quant_noise_level();
    if (q_noise > Th(band))
        adjust_band_scale_factor();
    else
        NOISE_CTL = 0;
    end
end
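The nested loops can be made concrete with a small runnable sketch. Everything here is a simplified assumption: the bit count is a log2 proxy instead of real Huffman tables, the step-size mapping is 2^((gain - scale factor)/4), the outer loop is capped at `max_outer` iterations, and a final rate pass guarantees the budget is met even if some band stays noisy.

```python
import math


def quantize_all(bands, gain, scale):
    """Quantize every band with its own step (power-law, exponent 3/4)."""
    steps = [2.0 ** ((gain - s) / 4.0) for s in scale]
    quants = [[int((abs(c) / st) ** 0.75 + 0.4054) * (1 if c >= 0 else -1)
               for c in band] for band, st in zip(bands, steps)]
    return steps, quants


def bits_needed(quants):
    """Hypothetical stand-in for the Huffman bit count."""
    return sum(math.log2(2 * abs(q) + 1) for band in quants for q in band)


def band_noise(band, quant, step):
    """Mean squared error between a band and its dequantized values."""
    return sum((abs(c) - abs(q) ** (4.0 / 3.0) * step) ** 2
               for c, q in zip(band, quant)) / len(band)


def fit_rate(bands, gain, scale, max_bits):
    """Inner loop: raise the global gain until the frame fits the budget."""
    while True:
        steps, quants = quantize_all(bands, gain, scale)
        if bits_needed(quants) <= max_bits:
            return gain, steps, quants
        gain += 1


def encode_frame(bands, thresholds, max_bits, max_outer=32):
    """Outer loop: refine bands whose noise exceeds the masking threshold."""
    gain, scale = 0, [0] * len(bands)
    for _ in range(max_outer):
        gain, steps, quants = fit_rate(bands, gain, scale, max_bits)
        noisy = [i for i, band in enumerate(bands)
                 if band_noise(band, quants[i], steps[i]) > thresholds[i]]
        if not noisy:
            break
        for i in noisy:
            scale[i] += 1  # finer step for this band only
    gain, steps, quants = fit_rate(bands, gain, scale, max_bits)
    return gain, scale, quants


bands = [[100.0, -80.0, 60.0], [5.0, -3.0, 1.0]]
gain, scale, quants = encode_frame(bands, thresholds=[1.0, 0.5], max_bits=20)
print(gain, scale, quants, round(bits_needed(quants), 1))
```

The two loops pull in opposite directions: refining a band costs bits, which pushes the global gain up again, so practical encoders also need a stopping rule such as the iteration cap used here.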
Samples • Original: 48 kHz stereo • Coded: 128 kbps AAC stereo (48 kHz) • Test items: Castanets, Piano, Timpani