Introduction to MPEG Surround 韓志岡 2/9/2005
Outline • Background • Motivation • Perception of sound in space • Principle of MPEG Surround • Downmixing to one channel • Estimation of spatial cues • Synthesis of spatial cues • Conclusions & Reference
Motivation • The vast majority of audio playback equipment uses traditional two-channel (stereo) presentation • Playback over more reproduction channels (“multi-channel audio” or “surround sound”) is increasingly visible in the marketplace • A non-disruptive transition from stereo to multi-channel audio requires media formats that can serve both those using conventional stereo equipment and those using next-generation multi-channel equipment.
Perception of sound in space • The head-related transfer function (HRTF) models the path of sound from a source to the left and right ear entrances.
Perception of sound in space (cont.) • Three parameters (cues) describe how humans localize sound in the horizontal plane: • Interaural level difference (ILD) • Interaural time difference (ITD) • Interaural coherence (IC)
ITD (Interaural time difference) & ILD (Interaural level difference)
ITD (Interaural time difference) & ILD (Interaural level difference) (cont.) • ITD and ILD between a pair of headphone signals determine the location of the auditory event which appears in the frontal section of the upper head.
IC (Interaural coherence) • The spatial impression of the auditory event is related to IC
Two sound sources: Summing localization • Inter-channel time difference (ICTD) • Inter-channel level difference (ICLD) • Inter-channel coherence (ICC)
MPEG Surround • MPEG Surround exploits inter-channel differences in level, phase and coherence, equivalent to the ILD, ITD and IC cues, to capture the spatial image of a multi-channel audio signal • It downmixes the signal and encodes these cues in a very compact form, so that the cues and the transmitted signal can be decoded to synthesize a high-quality multi-channel representation. • It provides backward compatibility with stereo/mono audio systems.
Downmixing to one channel (1/2) • The sum signal is generated by adding the input channels in a subband domain • The sum is then multiplied by a factor chosen to preserve signal power
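The two downmix steps can be sketched as follows. This is a minimal illustration, not the standardized procedure: the function name `downmix_to_mono` is hypothetical, the power is measured over one whole block of subband samples rather than with a running short-time estimate, and the factor is assumed to be the one that makes the power of the scaled sum equal to the total power of the input channels.

```python
import math

def downmix_to_mono(channels):
    """Sum equal-length subband channels, then rescale to preserve total power."""
    n = len(channels[0])
    total = [sum(ch[k] for ch in channels) for k in range(n)]
    # Total power of all input channels over this block (assumed target).
    power_in = sum(x * x for ch in channels for x in ch)
    # Power of the plain sum, which differs when channels add or cancel.
    power_sum = sum(x * x for x in total)
    # If the channels cancel completely there is nothing to rescale.
    factor = math.sqrt(power_in / power_sum) if power_sum > 0.0 else 0.0
    return [factor * x for x in total], factor

left = [1.0, 0.5, -0.25, 0.0]
right = [0.5, 0.5, 0.25, -1.0]
mono, factor = downmix_to_mono([left, right])
```

Here the plain sum has more power than the two inputs together because the channels are partly in phase, so the factor comes out below 1.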
Estimation of spatial cues (1/4) • The spatial cues (ICTD, ICLD, and ICC) are estimated in a subband domain, and the estimation is applied independently to each subband
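Estimation of spatial cues (2/4) • ICTD (samples): $\tau_{12}(k) = \arg\max_d \{\Phi_{12}(d,k)\}$, with a short-time estimate of the normalized cross-correlation function $\Phi_{12}(d,k) = \frac{p_{\tilde{x}_1 \tilde{x}_2}(d,k)}{\sqrt{p_{\tilde{x}_1}(k-d_1)\, p_{\tilde{x}_2}(k-d_2)}}$, where $d_1 = \max\{-d, 0\}$, $d_2 = \max\{d, 0\}$, and $p_{\tilde{x}_1 \tilde{x}_2}(d,k)$ is a short-time estimate of the mean of $\tilde{x}_1(k-d_1)\,\tilde{x}_2(k-d_2)$
Estimation of spatial cues (3/4) • ICLD (dB): $\Delta L_{12}(k) = 10 \log_{10}\left(\frac{p_{\tilde{x}_2}(k)}{p_{\tilde{x}_1}(k)}\right)$, where $p_{\tilde{x}_c}(k)$ is a short-time estimate of the power of subband signal $\tilde{x}_c(k)$ • ICC: $c_{12}(k) = \max_d |\Phi_{12}(d,k)|$, the largest magnitude of the normalized cross-correlation function over all lags $d$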
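A rough sketch of the three estimates for one pair of subband signals is given below. It assumes the usual definitions: ICTD is the lag that maximizes a normalized cross-correlation, ICLD is the power ratio of channel 2 to channel 1 in dB, and ICC is the largest magnitude of that correlation. The function name `estimate_cues` and the parameter `max_lag` are hypothetical, and block sums stand in for the running short-time estimates.

```python
import math

def estimate_cues(x1, x2, max_lag):
    """Return (ICTD in samples, ICLD in dB, ICC) for one block of two subband signals."""
    n = len(x1)
    p1 = sum(v * v for v in x1)  # block power of channel 1
    p2 = sum(v * v for v in x2)  # block power of channel 2
    if p1 == 0.0 or p2 == 0.0:
        raise ValueError("cues are undefined for a silent channel")
    phi = {}
    for d in range(-max_lag, max_lag + 1):
        d1, d2 = max(-d, 0), max(d, 0)
        # Cross term x1(k - d1) * x2(k - d2), summed over the valid overlap.
        cross = sum(x1[k - d1] * x2[k - d2] for k in range(max(d1, d2), n))
        # Simplification: normalize by whole-block powers, not lagged ones.
        phi[d] = cross / math.sqrt(p1 * p2)
    ictd = max(phi, key=phi.get)             # lag of the correlation maximum
    icc = max(abs(v) for v in phi.values())  # largest correlation magnitude
    icld_db = 10.0 * math.log10(p2 / p1)
    return ictd, icld_db, icc

# Channel 2 is channel 1 delayed by two samples and halved in amplitude.
x1 = [0.0, 0.0, 0.0, 1.0, -2.0, 3.0, -1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
x2 = [0.0, 0.0] + [0.5 * v for v in x1[:-2]]
ictd, icld_db, icc = estimate_cues(x1, x2, max_lag=4)
```

With the sign convention used in this sketch, a channel 2 that lags channel 1 by two samples gives an ICTD of -2; the halved amplitude gives an ICLD of about -6 dB, and the ICC is 1 because the two signals are scaled, shifted copies of each other.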
Estimation of spatial cues (4/4) • For multi-channel audio signals, ICTD and ICLD are defined between a reference channel and each of the other C-1 channels
Synthesis of spatial cues (1/3) • ICTDs are synthesized by imposing delays, ICLDs by scaling, and ICC by applying de-correlation filters.
Synthesis of spatial cues(2/3) • The delays are determined by the ICTDs
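The exact mapping from ICTDs to delays is not preserved on this slide. One plausible choice, assumed here purely for illustration, is to keep every channel's delay relative to the reference channel equal to its ICTD and to pick the reference delay so that the largest delay magnitude is as small as possible. The function name `synthesis_delays` is hypothetical.

```python
def synthesis_delays(ictds):
    """Map ICTDs (channel 1 vs. channels 2..C, in samples) to per-channel delays."""
    # The reference channel has an ICTD of zero relative to itself.
    taus = [0.0] + [float(t) for t in ictds]
    # Center the delays so that the largest magnitude is minimized (assumption).
    d_ref = -0.5 * (max(taus) + min(taus))
    return [t + d_ref for t in taus]

delays = synthesis_delays([4, -2, 0, 6])
```

Some of the resulting delays are negative; a real-time implementation would add a common offset to all channels to keep the system causal, which leaves the inter-channel differences unchanged.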
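Synthesis of spatial cues (3/3) • The scale factors $a_c$ are determined by the ICLDs, satisfying: $\frac{a_c}{a_1} = 10^{\Delta L_{1c}(k)/20}$, where $\Delta L_{1c}(k)$ is the ICLD in dB between reference channel 1 and channel $c$ • After delays and scaling, we need to reduce the correlation between the subbands. This is achieved by designing the filters $h_c$ so that they are controlled as a function of ICC.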
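The scaling step can be sketched as below. Converting an ICLD in dB to an amplitude ratio follows directly from the dB definition; the additional normalization, which makes the squared gains sum to one so that the output channels together carry the power of the transmitted sum signal, is an assumption of this sketch. The function name `synthesis_gains` is hypothetical, and the de-correlation filters are not modeled.

```python
import math

def synthesis_gains(iclds_db):
    """Map ICLDs (channel 1 vs. channels 2..C, in dB) to per-channel scale factors."""
    # Amplitude ratio of each channel to the reference channel.
    ratios = [1.0] + [10.0 ** (dl / 20.0) for dl in iclds_db]
    # Normalize so the squared gains sum to one (assumed power constraint).
    norm = math.sqrt(sum(r * r for r in ratios))
    return [r / norm for r in ratios]

gains = synthesis_gains([0.0, -6.0, 3.0])
```

Each gain ratio `gains[c] / gains[0]` reproduces the requested level difference, so the second channel here ends up as loud as the reference and the third one 6 dB quieter.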
Conclusions (1/2) • Well-known perceptual audio coders, such as MP3, primarily exploit a single channel’s ability to mask its own quantization noise. • In contrast, spatial perception is primarily attributed to three parameters: ILD, ITD, and IC.
Conclusions (2/2) • MPEG Surround provides an extremely efficient method for coding multi-channel sound by transmitting a compressed stereo (or even mono) audio program plus a low-rate side-information channel. • MPEG Surround is the latest technology for bitrate-efficient and backward-compatible presentation of multi-channel audio.
Reference • ISO/IEC JTC1/SC29/WG11 (MPEG), Document N7390, “Tutorial on MPEG Surround Audio Coding”, July 2005, Poznan, Poland • C. Faller, “Parametric coding of spatial audio,” in Proc. DAFx (Digital Audio Effects), October 2004.