Virtualized Audio as a Distributed Interactive Application

Peter A. Dinda, Northwestern University
Access Grid Retreat, 1/30/01

Overview
• Audio systems are pathetic and stagnant
• We can do better: Virtualized Audio (VA)
• VA can exploit distributed environments
• VA demands interactive response
[Slide annotations: "What I believe", "Why I care"]

Traditional Audio (TA) System
[Figure: a traditional audio system. Labels: Performance Room (Performer, Microphones, Sound Field 1), Mixer, Amp, Listening Room (Loudspeakers, Sound Field 2), Headphones, Listener.]

TA Mixing and Filtering
[Figure: filtering stages between the performer and what the listener perceives. Labels: Performer, Performance Room Filter, Microphone Sampling, Mixing (reduction), Amp Filter, Loudspeaker Filter, Listening Room Filter, Headphones, Listener's Location and HRTF. Three outcomes are shown: Perception of Real Sound, Perception of Loudspeaker-Reproduced Sound, and Perception of Headphone-Reproduced Sound.]

Virtualized Audio (VA) System

VA: Filtering, Separation, and Auralization
[Figure: the VA Forward Problem and the VA Reverse Problem.]

The Reverse Problem - Source Separation
[Figure: "Reverse Problem". Microphones in a human space feed the Recovery Algorithms. Labels: microphone signals, other inputs, microphone positions, sound source positions, sound source signals, room geometry and properties.]
• Microphone signals are a result of sound source signals, sound source positions, microphone positions, and the geometry and material properties of the room.
• We seek to recover these underlying producers of the microphone signals.

The Reverse Problem
• Blind source separation and deconvolution
• Statistical estimation problem
• Can "unblind" the problem in various ways
  • Large number of microphones
  • Tracking of performers
  • Separate out room deconvolution from source location
  • Directional microphones
  • Phased arrays
Potential to trade off computational requirements and specialized equipment
Much existing research to be exploited

Transducer Beaming
[Figure: radiation patterns of a transducer of size L emitting a wave of wavelength λ, shown for λ >> L, λ > L, λ = L, λ < L, and λ << L.]

Phased Arrays of Transducers
[Figure: a phased array and its physical equivalent.]

The Forward Problem - Auralization
[Figure: the Auralization Algorithms take sound source positions, sound source signals, room geometry/properties, and listener positions, and produce listener signals for a listener wearing headphones (or an HSS scheme).]
• In general, all inputs are a function of time
• Auralization must proceed in real time

Ray-based Approaches to Auralization
• For each sound source, cast some number of rays, then collect the rays that intersect listener positions
• Geometrical simplification for rectangular spaces and specular reflections
• Problems
  • Non-specular reflections require exponential growth in the number of rays to simulate
  • Most interesting spaces are not rectangular
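
One common way to realize a geometrical simplification for rectangular rooms with specular walls is the image-source method, in which each wall reflection is replaced by a mirror image of the source. The talk does not name the technique, so treating it as image sources is an assumption, as are the function names and the speed-of-sound value. A minimal sketch for first-order reflections only:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 C (assumed value)

def first_order_images(source, room):
    """Mirror the source across each of the six walls of a shoebox room.

    The room spans [0, room[k]] on each axis k; source is an (x, y, z) tuple.
    """
    images = []
    for axis in range(3):
        for wall in (0.0, room[axis]):
            image = list(source)
            image[axis] = 2.0 * wall - source[axis]  # reflect across the wall plane
            images.append(tuple(image))
    return images

def arrival_delays(source, listener, room):
    """Sorted delays (seconds) of the direct path and six first-order reflections."""
    paths = [source] + first_order_images(source, room)
    return sorted(math.dist(p, listener) / SPEED_OF_SOUND for p in paths)

delays = arrival_delays((1.0, 2.0, 2.0), (3.0, 2.0, 2.0), (4.0, 4.0, 4.0))
```

Higher reflection orders mirror the images again, which is where the growth in path count comes from; non-rectangular rooms and diffuse reflections fall outside this shortcut.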

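Wave Propagation Approach
$\frac{\partial^2 p}{\partial t^2} = \frac{\partial^2 p}{\partial x^2} + \frac{\partial^2 p}{\partial y^2} + \frac{\partial^2 p}{\partial z^2}$
• Captures all properties except absorption
• Absorption adds first partial derivative terms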

Method of Finite Differences
• Replace differentials with differences
• Solve on a regular grid
• Simple stencil computation (2D example in Fx)
• Do it really fast

      pdo i=2,Y-1
        pdo j=2,X-1
          workarray(m0,j,i) = (.99) * (
     $       R*temparray(j+1,i)
     $       + 2.0*(1-2.0*R)*temparray(j,i)
     $       + R*temparray(j-1,i)
     $       + R*temparray(j,i+1)
     $       + R*temparray(j,i-1)
     $       - workarray(m1,j,i) )
        endpdo
      endpdo
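
For readers without an Fx compiler, the same 2D stencil can be sketched in plain Python. This is a transliteration under assumptions, not the author's code: `prev` plays the role of workarray(m1,...), `curr` of temparray, and the returned grid of workarray(m0,...). R is read here as (c Δt / Δx)², and the 0.99 factor as a crude damping term; both readings are interpretations of the slide.

```python
def step(prev, curr, R, damping=0.99):
    """Advance the 2D pressure grid by one time step.

    prev and curr are lists of rows holding the field at the two most
    recent time steps; boundary cells are left at zero.
    """
    rows, cols = len(curr), len(curr[0])
    nxt = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):          # interior rows only
        for j in range(1, cols - 1):      # interior columns only
            nxt[i][j] = damping * (
                R * (curr[i][j + 1] + curr[i][j - 1]
                     + curr[i + 1][j] + curr[i - 1][j])
                + 2.0 * (1.0 - 2.0 * R) * curr[i][j]
                - prev[i][j]
            )
    return nxt

# Usage: a unit pulse in the middle of a 5x5 grid, one step forward.
prev = [[0.0] * 5 for _ in range(5)]
curr = [[0.0] * 5 for _ in range(5)]
curr[2][2] = 1.0
nxt = step(prev, curr, R=0.25)
```

Each interior point depends only on its four neighbours and its own two previous values, which is what makes the loop nest parallel in the Fx version.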

How Fast is Really Fast?
• $O(xyz(kf)^4 / c^3)$ stencil operations per second are necessary
  • f = maximum frequency to be resolved
  • x, y, z = dimensions of simulated space
  • k = grid points per wavelength (2..10 typical)
  • c = speed of sound in medium
• For air, k = 2, f = 20 kHz, x = y = z = 4 m, we need to perform $4.1 \times 10^{12}$ stencil operations per second (~30 FP operations each)
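
The estimate can be checked with a few lines of arithmetic. The sketch below assumes a grid spacing of c/(kf) (k points per shortest wavelength), a time step of 1/(kf), and c = 343 m/s for air; constant factors are ignored, as the O() notation implies. The function name is illustrative.

```python
def stencil_ops_per_second(x, y, z, k, f, c=343.0):
    """Stencil updates per second of simulated audio, up to constant factors."""
    grid_points = x * y * z * (k * f / c) ** 3   # grid spacing is c / (k f)
    steps_per_second = k * f                     # time step assumed to be 1 / (k f)
    return grid_points * steps_per_second

ops = stencil_ops_per_second(4.0, 4.0, 4.0, k=2, f=20_000.0)
flops = 30 * ops  # ~30 floating-point operations per stencil update
print(f"{ops:.2e} stencil ops/s, {flops:.2e} FLOP/s")
```

With these inputs the result is roughly 4.1e12 stencil operations per second, matching the slide, or on the order of 1e14 floating-point operations per second.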

LTI Simplification
• Consider the system as LTI: Linear and Time-Invariant
• We can characterize an LTI system by its impulse response h(t)
• In particular, for this system there is an impulse response from each sound source i to each listener j: h(i,j,t)
• Then for sound sources $s_i(t)$, the output $m_j(t)$ that listener j hears is $m_j(t) = \sum_i h(i,j,t) * s_i(t)$, where $*$ is the convolution operator
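
In discrete time the sum of convolutions above can be sketched directly. The direct-form convolution below is for illustration only; a real-time system would more likely use FFT-based block convolution. Function names are assumptions.

```python
def convolve(h, s):
    """Direct-form discrete convolution of impulse response h with signal s."""
    out = [0.0] * (len(h) + len(s) - 1)
    for n, hn in enumerate(h):
        for m, sm in enumerate(s):
            out[n + m] += hn * sm
    return out

def listener_signal(impulse_responses, sources):
    """m_j = sum over i of h(i, j) * s_i, for one listener j.

    impulse_responses[i] is the sampled h(i, j, t); sources[i] is the sampled s_i(t).
    """
    length = max(len(h) + len(s) - 1 for h, s in zip(impulse_responses, sources))
    mixed = [0.0] * length
    for h, s in zip(impulse_responses, sources):
        for n, value in enumerate(convolve(h, s)):
            mixed[n] += value
    return mixed

# Two sources heard by one listener.
m = listener_signal([[1.0, 0.5], [0.0, 1.0]], [[1.0, 2.0, 3.0], [1.0, 1.0]])
# m == [1.0, 3.5, 5.0, 1.5]
```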

LTI Complications
• Note that h(i,j,t) must be recomputed whenever space properties or sound source positions change
• The system is not really LTI
  • Moving sound source: no Doppler effect
• Provided that sound source and listener movements and space property changes are slow, the approximation should still be close
• Possible "virtual source" extension

Where do the h(i,j,t)'s come from?
• Instead of using input signals as boundary conditions to the wave propagation simulation, use impulses (Dirac deltas)
• Only run the simulation when an h(i,j,t) needs to be recomputed due to movement or a change in space properties
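
The idea can be illustrated with a one-dimensional toy version of the finite-difference scheme: place a unit impulse at the source cell, step the simulation, and record the pressure at the listener cell. This is a sketch under assumptions (1D instead of 3D, a unit sample standing in for a Dirac delta, end cells held at zero, no damping), not the simulation described in the talk.

```python
def impulse_response(n_cells, source, listener, n_steps, R=1.0):
    """Record the pressure at `listener` after an impulse at `source`.

    R = (c*dt/dx)**2 and must not exceed 1 for stability in 1D.
    Returns n_steps + 1 samples, starting with the initial state.
    """
    prev = [0.0] * n_cells
    curr = [0.0] * n_cells
    curr[source] = 1.0                      # discrete stand-in for a Dirac delta
    h = [curr[listener]]
    for _ in range(n_steps):
        nxt = [0.0] * n_cells
        for i in range(1, n_cells - 1):     # end cells stay at zero
            nxt[i] = (R * (curr[i + 1] + curr[i - 1])
                      + 2.0 * (1.0 - R) * curr[i] - prev[i])
        prev, curr = curr, nxt
        h.append(curr[listener])
    return h

# Listener three cells from the source: nothing arrives before step 3.
h = impulse_response(n_cells=41, source=20, listener=23, n_steps=6)
```

The recorded sequence would then be cached and reused for convolution until a source, listener, or room property changes.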

Exploiting a Remote Supercomputer or the Grid

Interactivity in the Forward Problem
[Figure: the Auralization Algorithms take sound source positions, sound source signals, room geometry/properties, and listener positions, and produce listener signals for a listener wearing headphones.]

Full Example of Virtualized Audio
[Figure: three "Reverse Problem" stages, each with microphones in a human space feeding Recovery Algorithms (labels: microphone signals, other inputs, microphone positions, sound source positions, sound source signals, room geometry and properties). Their outputs meet at a Combine step that feeds the Auralization Algorithms (labels: sound source positions, sound source signals, room geometry/properties).]

VA as a Distributed Interactive Application
• Disparate resource requirements
  • Low latency audio input/output
  • Massive computation requirements
• Low latency control loop with human in the loop
• Response time must be bounded
• Adaptation mechanisms
  • Choice between full simulation and LTI simplification
  • Number of listeners
  • Frequency limiting versus delay
  • Truncation of impulse responses
  • Spatial resolution of impulse response functions
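
As one illustration of an adaptation mechanism, truncating an impulse response shortens every later convolution at the cost of dropping the quietest part of the tail. The energy-fraction criterion below is an assumption chosen for the sketch; the talk lists truncation as an option without saying how the cut-off would be chosen.

```python
def truncate_by_energy(h, keep=0.99):
    """Return the shortest prefix of h holding at least `keep` of its energy."""
    total = sum(v * v for v in h)
    if total == 0.0:
        return []
    running = 0.0
    for n, v in enumerate(h):
        running += v * v
        if running >= keep * total:
            return h[: n + 1]
    return list(h)

short = truncate_by_energy([1.0, 0.5, 0.1, 0.01], keep=0.99)
# short == [1.0, 0.5]
```

Lowering `keep` trades audio fidelity for less computation per output sample, which is the kind of knob a bounded-response-time system could turn under load.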

Conclusion
• We can and should do better than the current state of audio
• Lots of existing research to exploit
• The basis of virtualized audio
• Trade off computation and specialized hardware
• VA is a distributed interactive application
VA forward problem currently being implemented at Northwestern
