3D SOUND EFFECT ANALYSIS, SYNTHESIS AND APPLICATION DESIGN: A PRIMARY-AMBIENT EXTRACTION (PAE) APPROACH. Qualification Examination for PhD. HE Jianjun, 2nd July 2013. Email: JHE007@e.ntu.edu.sg
Outline • 1. Introduction • 2. Stereo Signal Model • 3. Linear Estimation Model • 4. PAE Based on Linear Estimation • 5. Conclusions and Future Work
Introduction Primary component: where does the sound come from? Ambient component: where are you?
Introduction [Figure: PAE of stereo signals; labeled applications: post-production, spatial audio coding]
Introduction – PAE based Spatial Audio System
Introduction – Input and Output of PAE
Stereo Signal Model Signal = Primary + Ambient Assumptions
Stereo Signal Model The model parameters affect the extraction results. [Figure: primary source position (left, center, right) as k varies from 1/10 through 1 to 10]
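As a hedged illustration of the model on this slide, the sketch below synthesizes a stereo signal as primary plus ambient. It assumes conventions that are common in the PAE literature but are not spelled out in the slide text: the channel-1 primary component is the channel-0 primary scaled by the panning factor k, the two ambient components are uncorrelated with equal power, and the primary power ratio (PPR) is the primary power divided by the total power. The names `synthesize_stereo` and `power` are hypothetical.

```python
import math
import random


def synthesize_stereo(n, k, ppr, seed=0):
    """Synthesize n samples of a stereo signal x = primary + ambient.

    Assumed model (not stated in the slides): p1 = k * p0, ambient
    components are independent white Gaussian noise of equal power, and
    ppr = primary power / (primary power + ambient power), 0 < ppr < 1.
    """
    if not 0.0 < ppr < 1.0:
        raise ValueError("ppr must lie strictly between 0 and 1")
    rng = random.Random(seed)
    # Unit-variance noise stands in for the primary source (e.g. speech).
    p0 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    p1 = [k * v for v in p0]
    # Expected primary power summed over both channels is 1 + k^2.
    primary_power = 1.0 + k * k
    # Total ambient power (both channels) that yields the requested PPR.
    ambient_power = primary_power * (1.0 - ppr) / ppr
    sigma_a = math.sqrt(ambient_power / 2.0)  # per-channel ambient std
    a0 = [rng.gauss(0.0, sigma_a) for _ in range(n)]
    a1 = [rng.gauss(0.0, sigma_a) for _ in range(n)]
    x0 = [p + a for p, a in zip(p0, a0)]
    x1 = [p + a for p, a in zip(p1, a1)]
    return {"x": (x0, x1), "p": (p0, p1), "a": (a0, a1)}


def power(sig):
    """Total energy of a signal (sum of squares)."""
    return sum(v * v for v in sig)
```

Calling `synthesize_stereo(20000, 3.0, 0.5)` and comparing `power` of the primary and ambient parts should give a measured PPR close to the requested 0.5; the match is only statistical because the components are random.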
Linear estimation framework in PAE Signal = Primary + Ambient
Linear estimation framework in PAE – Performance Measures for Primary Components i = Left or Right channel
Linear estimation framework in PAE – Performance Measures for Ambient Components i ≠ j = Left or Right channel
Linear estimation framework in PAE – Performance measures
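The formulas on the performance-measure slides did not survive extraction. As a rough sketch only, the functions below use definitions that are common in the literature and are assumptions here: ESR is taken as the error-to-signal ratio, ICC as the zero-lag normalized cross-correlation between two channels, and ICLD as the channel power ratio in dB. All function names are hypothetical.

```python
import math


def energy(sig):
    """Sum of squared samples."""
    return sum(v * v for v in sig)


def esr(estimate, reference):
    """Error-to-signal ratio (assumed definition): ||est - ref||^2 / ||ref||^2."""
    err = sum((e - r) ** 2 for e, r in zip(estimate, reference))
    return err / energy(reference)


def icc(ch0, ch1):
    """Zero-lag inter-channel cross-correlation, normalized to [-1, 1]."""
    num = sum(a * b for a, b in zip(ch0, ch1))
    return num / math.sqrt(energy(ch0) * energy(ch1))


def icld_db(ch0, ch1):
    """Inter-channel level difference in dB (channel 1 relative to channel 0)."""
    return 10.0 * math.log10(energy(ch1) / energy(ch0))
```

Under these definitions, a perfectly extracted primary component that is amplitude panned by k would give an ESR of 0, an ICC of 1, and an ICLD of 20·log10(k) dB.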
Outline • 1. Introduction • 2. Stereo Signal Model • 3. Linear Estimation Model • 4. PAE Based on Linear Estimation • PCA : Principal Component Analysis • LS : Least Squares • MLLS: Minimum Leakage LS • 5. Conclusions and Future Work
PAE using Principal component analysis (PCA) M. Goodwin and J. M. Jot, “Primary-ambient signal decomposition and vector-based localization for spatial audio coding and enhancement,” IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, April 2007.
Results of PCA Primary component extraction: primary component completely extracted. Ambient component extraction: no primary component.
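To make the PCA step concrete, here is a minimal sketch of two-channel PCA-based extraction: the primary estimate is the projection of each sample pair onto the principal eigenvector of the 2x2 correlation matrix, and the ambient estimate is the residual. This is a generic illustration of the technique, not the exact formulation of the cited paper; the function name `pca_pae` is hypothetical, and zero-mean input signals are assumed.

```python
import math


def pca_pae(x0, x1):
    """Split a stereo frame into primary and ambient parts with 2-channel PCA."""
    r00 = sum(a * a for a in x0)
    r11 = sum(b * b for b in x1)
    r01 = sum(a * b for a, b in zip(x0, x1))
    # Largest eigenvalue of [[r00, r01], [r01, r11]] in closed form.
    lam = 0.5 * (r00 + r11 + math.sqrt((r00 - r11) ** 2 + 4.0 * r01 * r01))
    if r01 != 0.0:
        v0, v1 = r01, lam - r00
    else:
        # Uncorrelated channels: the stronger channel is the principal axis.
        v0, v1 = (1.0, 0.0) if r00 >= r11 else (0.0, 1.0)
    norm = math.hypot(v0, v1)
    v0, v1 = v0 / norm, v1 / norm
    p0, p1, a0, a1 = [], [], [], []
    for s0, s1 in zip(x0, x1):
        proj = v0 * s0 + v1 * s1  # coordinate along the principal axis
        p0.append(proj * v0)
        p1.append(proj * v1)
        a0.append(s0 - proj * v0)  # residual is taken as ambient
        a1.append(s1 - proj * v1)
    return (p0, p1), (a0, a1)
```

Because the ambient estimate is orthogonal to the principal axis, it carries no component along the estimated primary direction, which is consistent with the "no primary component" result noted on the slide.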
PAE using Least squares (LS) Christof Faller, “Multiple-Loudspeaker Playback of Stereo Signals”, J. Audio Eng. Soc. Vol. 54, No. 11, pp. 1051-1064, Nov. 2006.
Results of LS Primary component extraction: primary component NOT completely extracted. Ambient component extraction: primary leakage found.
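The least-squares idea can be sketched as follows. This is an illustration under an assumed stereo model (x0 = p0 + a0, x1 = k·p0 + a1 with k > 0, mutually uncorrelated components, equal ambient power per channel), not a transcription of the cited paper. The panning factor and the powers are estimated from the channel correlations, and the function name `ls_pae` is hypothetical.

```python
import math


def ls_pae(x0, x1):
    """Least-squares PAE sketch under the assumed stereo model."""
    r00 = sum(a * a for a in x0)
    r11 = sum(b * b for b in x1)
    r01 = sum(a * b for a, b in zip(x0, x1))
    if r01 <= 0.0:
        raise ValueError("model requires positively correlated channels")
    # Under the model: r00 = Pp0 + Pa, r11 = k^2 * Pp0 + Pa, r01 = k * Pp0.
    d = (r11 - r00) / (2.0 * r01)
    k = d + math.sqrt(d * d + 1.0)  # estimated panning factor
    pp0 = r01 / k                    # primary power in channel 0
    pa = r00 - pp0                   # ambient power per channel
    # LS weights for p0: w = R^-1 E[x p0] = Pp0 * [1, k] / ((1 + k^2) Pp0 + Pa)
    g = pp0 / ((1.0 + k * k) * pp0 + pa)
    p0 = [g * (s0 + k * s1) for s0, s1 in zip(x0, x1)]
    p1 = [k * v for v in p0]
    # The LS ambient estimate is the residual of the LS primary estimate.
    a0 = [s - p for s, p in zip(x0, p0)]
    a1 = [s - p for s, p in zip(x1, p1)]
    return (p0, p1), (a0, a1), k
```

Whenever the ambient power is nonzero, the gain `g` is smaller than 1/(1 + k²), so the primary estimate is scaled down and the residual ambient estimate retains part of the primary component, in line with the "primary leakage" noted on the slide.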
Results of MLLS Primary component extraction: primary component NOT completely extracted. Ambient component extraction: no primary leakage.
PAE based on Linear estimation – different objectives
PAE based on Linear estimation – the weighting matrix W Note: all entries in the matrices above need to multiply
PAE based on Linear estimation – the Weighting Matrix W for Primary Components Primary component extraction using PCA and LS (MLLS) is equivalent up to a scaling factor.
PAE based on Linear estimation – Scaling factor in primary component extraction between PCA and LS [Figure: scaling factor between the primary components extracted by PCA and by LS]
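The slide's formula was lost in extraction. As a worked sketch, assume the stereo model that is common in the PAE literature: $x_0 = p_0 + a_0$, $x_1 = k p_0 + a_1$, mutually uncorrelated components, equal ambient power $P_a$ per channel, $P_{p_0}$ the power of $p_0$, and the PPR $\gamma$ defined as primary power over total power. Under these assumptions the scaling factor between the two primary estimates can be worked out as below. The symbols are chosen for illustration and may differ from those on the original slide.

```latex
\begin{align}
\hat{p}_{0,\mathrm{PCA}} &= \frac{x_0 + k x_1}{1 + k^2}, \\
\hat{p}_{0,\mathrm{LS}} &= \frac{P_{p_0}\,(x_0 + k x_1)}{(1 + k^2) P_{p_0} + P_a}, \\
\frac{\hat{p}_{0,\mathrm{LS}}}{\hat{p}_{0,\mathrm{PCA}}}
  &= \frac{(1 + k^2) P_{p_0}}{(1 + k^2) P_{p_0} + P_a}
   = \frac{2\gamma}{1 + \gamma},
\qquad
\gamma = \frac{(1 + k^2) P_{p_0}}{(1 + k^2) P_{p_0} + 2 P_a}.
\end{align}
```

The factor is below 1 for any $\gamma < 1$ and approaches 1 as $\gamma \to 1$, so under this model the two estimates coincide only when there is no ambience.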
PAE based on Linear estimation – Performance of the three approaches [Figure: performance of the three approaches; recoverable labels: ETSC 1, ICC (ICTD) 1 (0), ICLD]
PAE based on Linear estimation – Comparison of the three approaches
Conclusions
• Formulated the linear estimation framework for PAE.
• Introduced an objective evaluation system with three groups of performance measures in PAE.
  • Extraction error: ESR, DSR, ISR, LSR
  • Extraction similarity: ETSC
  • Spatial accuracy: ICC, ICTD, ICLD
• Proposed MLLS, and compared it with PCA and LS in PAE.
  • Primary component extraction
    • PCA: minimum distortion
    • LS = MLLS: minimum leakage and MSE
    • PCA and LS differ by a scaling factor
  • Ambient component extraction
    • PCA (= MLLS) and LS minimize the primary leakage and the MSE, respectively
• Different approaches are preferred in different applications.
Future Work [Diagram labels: signal model, input signal, mismatch; generalizing, or detection & classification; better performance] When the PPR is small, primary component extraction using PCA does not perform well; other approaches such as LS are preferred.
References
• D. S. Brungart, 3D sound for virtual reality and multimedia. Cambridge, MA, USA: Academic Press Professional, 2000.
• J. Blauert, Spatial hearing: the psychophysics of human sound localization. Cambridge, MA: MIT Press, 1997.
• J. Breebaart and E. Schuijers, “Phantom materialization: a novel method to enhance stereo audio reproduction on headphones,” IEEE Trans. Audio, Speech, Lang. Process., vol. 16, no. 8, Nov. 2008.
• M. M. Goodwin and J. M. Jot, “Primary-ambient signal decomposition and vector-based localization for spatial audio coding and enhancement,” in IEEE Int. Conf. on Acoust., Speech, and Signal Process., Hawaii, Apr. 2007.
• F. Menzer and C. Faller, “Stereo-to-binaural conversion using interaural coherence matching,” in 128th Audio Eng. Soc. Conv., London, UK, May 2010.
• J. Breebaart and C. Faller, Spatial audio processing: MPEG surround and other applications. Chichester, UK: John Wiley & Sons, 2007.
• V. Pulkki, “Spatial sound reproduction with directional audio coding,” J. Audio Eng. Soc., vol. 55, no. 6, pp. 503-516, Jun. 2007.
• M. M. Goodwin and J. M. Jot, “Binaural 3-D audio rendering based on spatial audio scene coding,” in 123rd Audio Eng. Soc. Conv., New York, Oct. 2007.
• W. S. Gan, E. L. Tan, and S. M. Kuo, “Audio projection: directional sound and its application in immersive communication,” IEEE Signal Process. Mag., vol. 28, no. 1, pp. 43-57, Jan. 2011.
• J. He, E. L. Tan, and W. S. Gan, “Time-shifted principal component analysis based cue extraction for stereo audio signals,” in IEEE Int. Conf. on Acoust., Speech, and Signal Process., Vancouver, Canada, May 2013.
• C. Faller, “Multiple-loudspeaker playback of stereo signals,” J. Audio Eng. Soc., vol. 54, no. 11, pp. 1051-1064, Nov. 2006.
• A. Jeffress, “A place theory of sound localization,” Journal of Comparative and Physiological Psychology, vol. 41, no. 1, pp. 35-39, Feb. 1948.
• E. Vincent, R. Gribonval, and C. Févotte, “Performance measurement in blind audio source separation,” IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 4, pp. 1462-1469, Jul. 2006.
• L. Lu, H. Zhang, and H. Jiang, “Content analysis for audio classification and segmentation,” IEEE Trans. Audio, Speech, Lang. Process., vol. 10, no. 7, pp. 504-516, Oct. 2002.
• http://www.illusonic.com/immersive-audio-processor/setups/
Introduction – PAE based Spatial Audio System
• Binaural rendering using head-related transfer functions (HRTF)
  • Localization inaccuracy
  • Limited externalization
• Binaural rendering using binaural room impulse responses (BRIR)
  • Improved externalization
  • Over-coloration
The problem is that primary and ambient components are rendered in the same way.
Introduction – PAE based Spatial Audio System PAE
Introduction – PAE based Spatial Audio System
Results of MDLS Primary component extraction Ambient component extraction
PCA based PAE: problems remain
• Localization parameters:
  • Inter-channel time difference (ICTD)
  • Inter-channel level difference (ICLD)
[Figure labels: Practically; Error; ICTD ≡ 0]
[Figure: Performance of PCA based PAE with varying (k=3). (a) ESR; (b)-(c) extraction similarity; (d) ICLD error]
Outline • 1. Introduction • 2. Stereo Signal Model • 3. Linear Estimation Model • 4. PAE Based on Linear Estimation • 5. PAE in Primary-complex Cases • 6. Conclusions and Future Work
Problems remain with PCA based PAE. So what can we do?
Shifted PCA based PAE [Block diagram: stereo input signal → ICTD estimation → time shifting → PCA decomposition → output mapping → primary and ambient components]
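The first two stages of this pipeline can be sketched as follows. The example assumes that the ICTD is estimated as the lag that maximizes the inter-channel cross-correlation, which is a common choice but is not confirmed by the slide text, and that both channels have the same length. The function names `cross_corr`, `estimate_ictd` and `shift` are hypothetical.

```python
def cross_corr(x0, x1, lag):
    """Sum over n of x0[n] * x1[n + lag], using only overlapping samples."""
    n = len(x0)
    if lag >= 0:
        return sum(x0[i] * x1[i + lag] for i in range(n - lag))
    return sum(x0[i - lag] * x1[i] for i in range(n + lag))


def estimate_ictd(x0, x1, max_lag):
    """Lag (in samples) at which the cross-correlation peaks.

    A positive result means channel 1 lags channel 0.
    """
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda lag: cross_corr(x0, x1, lag))


def shift(sig, lag):
    """Delay sig by lag samples (advance if lag < 0), zero-padding the gap.

    Assumes abs(lag) <= len(sig).
    """
    n = len(sig)
    if lag >= 0:
        return [0.0] * lag + list(sig[: n - lag])
    return list(sig[-lag:]) + [0.0] * (-lag)
```

With `d = estimate_ictd(x0, x1, max_lag)`, the aligned pair would be `x0` and `shift(x1, -d)`. PCA decomposition would then be applied to the aligned pair, and the channel-1 outputs delayed by `d` again, which corresponds to the output mapping stage.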
Performance Comparison between PCA and SPCA
• Synthesized signals:
  • Primary components: speech, amplitude panned by 3 and shifted by 40 time units
  • Ambient components: uncorrelated white Gaussian noise
  • PPF = 3
  • PPR: (0, 1)