Issac Garcia-Munoz Senior Thesis Electrical Engineering Advisor: Pietro Perona

Outline
• Background
  • What is 3D audio?
  • Acoustic Environment Modeling
  • Limitations
• Reconstructing Audio
  • Audio for Dummies = KEMAR
  • Crosstalk Cancellation
• ChucK
  • On-the-fly Audio Programming Language
  • Examples and Proposed 3D Audio Integration
• Head-Tracking
  • Results from “Head-Tracked 3D Audio”
  • Planned methods of tracking

What is 3D Audio?
• 3D audio systems try to replicate natural hearing by reproducing sound localization cues at the ears of the listener.
• Spatial cues
  • Time delay
  • Amplitude difference
  • Tonal information

Binaural Synthesis
• Head-Related Transfer Functions (HRTFs)
  • Describe the transformation of sound from a source to the ear canal.
  • Contain the spatial localization cues.
• Binaural synthesis
  • HRTFs are implemented as digital audio filters.
  • The filters are applied to an audio signal, and the result is listened to with headphones.
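The filtering step can be sketched in a few lines of Python. This is only an illustration: the two impulse responses below are made-up toy values, not measured HRTFs, and the function names are hypothetical. The example assumes a source on the listener's left, so the right ear receives the sound three samples later and at half the amplitude.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def binaural_synthesis(mono, hrir_left, hrir_right):
    """Filter one mono signal with a left/right impulse response pair.

    Returns (left, right), the two headphone channels.
    """
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Toy impulse responses (assumed values, NOT measured HRTFs).
hrir_left = [1.0, 0.0, 0.0, 0.0]    # near ear: no delay, full level
hrir_right = [0.0, 0.0, 0.0, 0.5]   # far ear: 3 samples late, half level

mono = [1.0, 2.0, 3.0]
left, right = binaural_synthesis(mono, hrir_left, hrir_right)
print(left)    # [1.0, 2.0, 3.0, 0.0, 0.0, 0.0]
print(right)   # [0.0, 0.0, 0.0, 0.5, 1.0, 1.5]
```

With measured HRTFs the structure stays the same; only the impulse responses become longer and also carry the tonal cues.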

Sound Replication through Loudspeakers
• Right and left channels should remain separate.
• Crosstalk: sound from the left speaker reaches the right ear, and vice versa.

Crosstalk Cancellation
• A digital filter is used to eliminate crosstalk.
• Requirements for loudspeaker listening: the listener must be
  • Centered
  • Facing forward
• Models:
  • Delay and attenuation
  • Spherical head model
  • HRTF measurements
• Improves channel separation by 20 dB over the range 100 Hz to 6 kHz
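As a rough sketch of the simplest model in the list (delay and attenuation), the Python below builds the 2x2 speaker-to-ear matrix at one frequency and inverts it. The gain of 0.7 and delay of 0.2 ms are assumed illustrative values, and all function names are hypothetical. Because the listener is taken to be centered and facing forward, the matrix is symmetric.

```python
import cmath

def contralateral_path(freq_hz, gain, delay_s):
    """Crosstalk path relative to the direct path: attenuated and delayed."""
    return gain * cmath.exp(-2j * cmath.pi * freq_hz * delay_s)

def canceller(freq_hz, gain=0.7, delay_s=0.0002):
    """Return (H, C) at one frequency, where C is the inverse of H.

    H is symmetric: both direct paths are 1 and both crosstalk paths
    are equal. gain and delay_s are assumed, illustrative values.
    """
    c = contralateral_path(freq_hz, gain, delay_s)
    det = 1 - c * c                      # nonzero while gain < 1
    H = [[1, c], [c, 1]]
    C = [[1 / det, -c / det], [-c / det, 1 / det]]
    return H, C

def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

H, C = canceller(1000.0)
ears = matmul2(H, C)   # transfer from the binaural input to the two ears

# Off-diagonal terms are the residual crosstalk; in this idealized model
# they vanish up to rounding error.
print(abs(ears[0][1]), abs(ears[1][0]))
```

In this idealized model the cancellation is exact. A real system falls short of that because the model only approximates the listener's actual head response.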

Limitations
• Uniqueness of the pinna (outer ear)
  • Notable differences in tonal transformation at higher frequencies.
• Localization errors
  • Front/back confusions
  • Elevation errors

Acoustic Environment Modeling
• Reverb
  • Reflected energy
  • Reverberation time: time for the reverb to decay by 60 dB
  • Spatial information about the space
• Distance
  • Ratio of the loudness of the direct path to that of the reverberation
  • Doubling the distance to the sound source decreases the level by
    • 6 dB for the direct sound
    • Less for the reverberation
• Doppler
  • Change in pitch relative to motion
  • Motion effects
• Air absorption
  • Higher frequencies are absorbed and thus attenuated
  • <1 dB for <10 m
• Diffraction
  • For sources hidden behind obstacles
  • Frequencies with wavelengths shorter than the obstacle are greatly attenuated
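Two of these effects reduce to short formulas, sketched below in Python. The 6 dB figure follows from the inverse-distance law for the direct sound. The Doppler expression is the standard one for a source moving toward a stationary listener. The speed of sound (343 m/s) and the function names are assumptions introduced for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 °C

def direct_level_change_db(d_from, d_to):
    """Level change of the direct sound when the source moves from
    distance d_from to d_to (inverse-distance law)."""
    return -20.0 * math.log10(d_to / d_from)

def doppler_frequency(f_source, v_toward_listener):
    """Perceived frequency for a source approaching a stationary listener
    at v_toward_listener m/s (use a negative value if it is receding)."""
    return f_source * SPEED_OF_SOUND / (SPEED_OF_SOUND - v_toward_listener)

# Doubling the distance lowers the direct sound by about 6 dB.
print(round(direct_level_change_db(1.0, 2.0), 2))   # -6.02

# A 500 Hz source approaching at 10 m/s is heard slightly sharp.
print(doppler_frequency(500.0, 10.0))
```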

Sound Reconstruction Tool: KEMAR
• Knowles Electronic Manikin for Acoustic Research
• Anthropomorphic manikin
• Human-molded pinnae
• 710 measurements taken with the microphone inside the ear canal yield the HRTFs

Transaural Audio
• Method of sending binaural signals to the listener using stereo loudspeakers.
• System:
• To get binaural signal x, simply invert H
• Where the inverse is:
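For orientation, a standard 2x2 formulation of such a system is written out below. The symbol names are assumptions, not taken from the slide: $s$ denotes the loudspeaker signals, $e$ the signals arriving at the ears, $x$ the binaural signal to be delivered, and $H_{XY}$ the transfer function from loudspeaker $X$ to ear $Y$.

```latex
% Speaker-to-ear transmission (assumed notation)
\begin{bmatrix} e_L \\ e_R \end{bmatrix}
=
\underbrace{\begin{bmatrix} H_{LL} & H_{RL} \\ H_{LR} & H_{RR} \end{bmatrix}}_{H}
\begin{bmatrix} s_L \\ s_R \end{bmatrix}

% Driving the loudspeakers with s = H^{-1} x gives e = H H^{-1} x = x
H^{-1} = \frac{1}{H_{LL} H_{RR} - H_{RL} H_{LR}}
\begin{bmatrix} H_{RR} & -H_{RL} \\ -H_{LR} & H_{LL} \end{bmatrix}
```

The off-diagonal terms $H_{RL}$ and $H_{LR}$ are the crosstalk paths. With no crosstalk, $H$ is diagonal and the inverse reduces to per-channel equalization.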

ChucK
• An on-the-fly audio programming language
  • Compiled into virtual instructions that are immediately run in the ChucK Virtual Machine.
• Stereo unit generators
  • Parameters .left and .right give access to the individual channels
• Pan2 stereo object
  • Takes a mono signal and splits it into a stereo signal
  • Control over the panning through the .pan parameter (-1 (hard left) <= p <= 1 (hard right))

ChucK Stereo Example

    // sine oscillator to pan to dac
    SinOsc n => Pan2 p => dac;
    // set the oscillator frequency to 500 Hz
    500 => n.freq;
    // infinite time loop
    while( true )
    {
        // modulate the pan
        Math.sin( now / 1::second * 2 * pi ) => p.pan;
        // advance time
        30::ms => now;
    }

A 3D DirectX plug-in (Panorama by Wave Arts, Inc.) was then applied, which changes the panning from straight ahead to the side. This had to be done after the recording was made; it is not part of the ChucK language.

ChucK 3D Effects Implementation
• Delay, reverb and various filters are available.
  • Will be used to create spatial cues in conjunction with HRTFs
• Develop any other code necessary to build the parameters
  • .distance
  • .azimuth
  • .elevation
• Goal: change or set the spatial position in real time instead of using post-processing plug-ins
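One possible mapping from .azimuth and .distance to delay and gain settings is sketched below, in Python rather than ChucK so that the arithmetic is easy to check. Everything here is an assumption made for illustration: the spherical-head (Woodworth) approximation for the interaural time delay, the head radius, the speed of sound, the inverse-distance gain, and the function name. Elevation is left out because it depends on the tonal cues carried by HRTFs.

```python
import math

HEAD_RADIUS_M = 0.0875   # assumed average head radius
SPEED_OF_SOUND = 343.0   # m/s, assumed

def spatial_cues(azimuth_deg, distance_m):
    """Return (left_delay_s, right_delay_s, gain) for a source.

    azimuth_deg: 0 is straight ahead, positive is to the listener's
    right; this sketch only covers -90..90 degrees.
    """
    theta = abs(math.radians(azimuth_deg))
    # Spherical-head approximation of the interaural time delay.
    itd = HEAD_RADIUS_M / SPEED_OF_SOUND * (theta + math.sin(theta))
    # The ear farther from the source hears the sound later.
    left_delay = itd if azimuth_deg > 0 else 0.0
    right_delay = itd if azimuth_deg < 0 else 0.0
    # Inverse-distance gain, clamped so sources nearer than 1 m do not boost.
    gain = 1.0 / max(distance_m, 1.0)
    return left_delay, right_delay, gain

left_delay, right_delay, gain = spatial_cues(90.0, 2.0)
print(left_delay, right_delay, gain)   # left ear delayed by roughly 0.66 ms
```

In ChucK, these three numbers would feed a delay line and a gain on each channel.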

Gardner, “Head Tracked 3-D Audio Using Loudspeakers”
• System to steer the equalization zone
• Improves horizontal localization when the head is laterally displaced or rotated.
• The steerable crosstalk canceller uses different strategies for low and high frequencies.
• Difficult to synthesize images on one side of the head when both loudspeakers are on the opposite side
  • Problem of inverting the high-frequency transmission paths.
  • Solutions: use more widely spaced loudspeakers or additional rear loudspeakers.

Proposed Method of Head Tracking
• Real-time feedback of spatial cues
  • Through microphones placed near the ears
  • Looking into Bluetooth headsets
    • Need to set up the sound card to accept the two-channel Bluetooth audio headset service
• Facial recognition to determine distance from the equalization ‘sweet spot’
• Allow more freedom of movement without disturbing the sound image.

References
• Blauert, Jens. Spatial Hearing: The Psychophysics of Human Sound Localization. MIT Press, Cambridge, Mass. 1983.
• Fay, Richard. Popper, Arthur. Sound Source Localization. Springer Science, NY. 2005.
• Edwards, Brent. “Application of Psychoacoustics to Audio Signal Processing.” IEEE. 2001.
• Institute of Sound Vibration Research (ISVR). Virtual Acoustics Project. http://www.isvr.soton.ac.uk/FDAG/VAP/index.htm
• Gardner, W. G. “Transaural 3D Audio.” MIT Media Lab Report No. 342. July 6, 1995.
• Gardner, W. G. “3D Audio and Acoustic Environment Modeling.” Wave Arts, Inc. March 15, 1999.
• Gardner, W. G. “Head Tracked 3D-Audio Using Loudspeakers.” Machine Listening Group.
