Sound spatialization: innovation, adoption and evolution. Dr Ed Wright

Basic Principles • Precedence effect - the ears register time (and therefore phase) differences between incoming signals: a sound from hard left reaches the left ear before the right, by a fraction of a millisecond. • Intensity and spectral localisation - your head gets in the way, so off-axis sounds are quieter and have a different spectral content at the far ear. • Depth - more distant sounds are quieter, lose high frequencies (HF) to air absorption, are more reverberant, and show less difference between the direct sound and the first reflection. • Interaction between hearing and perception - many sounds are localised psychologically, e.g. planes are up; a very crisp sound at the same time as a very reverberant one sounds odd. Where cues conflict, e.g. precedence vs. intensity, the brain weighs the evidence for and against and comes to a conclusion.
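The timing cue in the first bullet can be put into rough numbers. The following is a minimal sketch, assuming a rigid spherical head and the Woodworth approximation ITD = (r/c)(θ + sin θ) for the interaural time difference; the formula, the head radius and the speed of sound are assumptions brought in for illustration, not values from these slides.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, assumed (air at roughly 20 degrees C)
HEAD_RADIUS = 0.0875     # m, assumed average head radius

def interaural_time_difference(azimuth_deg):
    """Woodworth estimate of the interaural time difference, in seconds.

    azimuth_deg: 0 = straight ahead, 90 = hard to one side.
    """
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))

if __name__ == "__main__":
    for az in (0, 30, 60, 90):
        itd_us = interaural_time_difference(az) * 1e6
        print(f"{az:3d} deg -> {itd_us:6.0f} microseconds")
```

Under these assumptions a hard-left or hard-right source gives a difference of roughly 0.65 ms, which is the scale of delay the ear is working with.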

Basic technology • We need some form of transducer to convert the recording into sound waves. The drive could come from analogue motion, e.g. a gramophone disc, or from the voltage change at the output of a DAC. Either way, we need to realize the stored data. • Data has been stored in a number of ways over time, in both analogue and digital formats. • In combination with the technology this gives a simple chain of: Data → Interpretation → Realization

Blumlein's 1931 patent • In 1931 Blumlein's patent showed how a sound could be reproduced in stereo and phantom images produced by simply varying the amplitude of the signal. • This basic idea paved the way for most mainstream multi-speaker arrays that are in use today.
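The idea of placing a phantom image purely by amplitude can be sketched in a few lines. This example uses the sine/cosine constant-power pan law that is common in modern mixers; it is an assumption chosen for illustration, not the circuitry described in the patent, and the function names are hypothetical.

```python
import math

def constant_power_pan(position):
    """Return (left_gain, right_gain) for a pan position in [-1, 1].

    -1 = hard left, 0 = centre, +1 = hard right. The gains satisfy
    left**2 + right**2 == 1, so perceived loudness stays roughly constant.
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must lie in [-1, 1]")
    angle = (position + 1.0) * math.pi / 4.0   # map [-1, 1] onto [0, pi/2]
    return math.cos(angle), math.sin(angle)

def pan_mono(samples, position):
    """Turn a mono sample list into a list of (left, right) pairs."""
    left_gain, right_gain = constant_power_pan(position)
    return [(s * left_gain, s * right_gain) for s in samples]

if __name__ == "__main__":
    for pos in (-1.0, -0.5, 0.0, 0.5, 1.0):
        left, right = constant_power_pan(pos)
        print(f"pan {pos:+.1f}: L={left:.3f} R={right:.3f}")
```

Feeding the same signal to both speakers at different levels is all that is needed for the image to appear somewhere between them.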

In 1934, Blumlein recorded Mozart's Jupiter Symphony conducted by Sir Thomas Beecham at the Abbey Road Studio using his vertical-lateral technique. In the United States, Harvey Fletcher of Bell Laboratories was also investigating techniques for stereophonic recording and reproduction. One of the techniques investigated was the "wall of sound", which used an enormous array of microphones hung in a line across the front of an orchestra. Up to 80 microphones were used, and each fed a corresponding loudspeaker, placed in an identical position, in a separate listening room. Several stereophonic test recordings, using two microphones connected to two styli cutting two separate grooves on the same wax disc, were made with Leopold Stokowski and the Philadelphia Orchestra at Philadelphia's Academy of Music in March 1932. The first (made on March 12, 1932), of Scriabin's Prometheus: Poem of Fire, is the earliest known surviving intentional stereo recording.

1940 Fantasia • In many ways, Walt Disney can be considered the inventor of surround sound. When the groundbreaking movie Fantasia was still in development, Disney met with conductor Leopold Stokowski to discuss the film's classical music score. Stokowski suggested that Disney contact the engineers at Bell Labs, who were working on a nascent multiple-microphone stereo recording technology. Intrigued by the technology, Disney thought it would be wonderful if, during the movie's "Flight of the Bumblebee" segment, the musical sound of the bumblebee could be heard flying all around the audience, not just in front of them. • The resulting Fantasound theatre installations cost $85,000 apiece and included 54 speakers placed throughout the auditorium. Disney also produced two scaled-back road show versions of the Fantasound system, at $45,000 each, although these travelling versions didn't include the surround speakers.

The 1950s: Multiple-Channel Stereo • The Fantasound system proved too complex and too costly for other studios to adopt. It took another decade for Hollywood to embrace a more affordable multiple-channel sound technology, which it did in conjunction with the birth of the new widescreen film formats, in particular Cinerama and CinemaScope. • The new wider-screen formats needed multiple channels to fill the entire front field of the movie screen. The solution was a wall of sound, employing four to six audio channels for a widescreen stereo effect. Among the earliest multiple-channel movies were 1952's This is Cinerama (with seven-channel surround sound) and 1953's The Robe (in CinemaScope, with four-channel sound). • Unfortunately, the cost of these technologies proved prohibitive, and theatrical surround sound died out by the end of the 1950s.

Quadraphonic Sound • The desire to reproduce a 360-degree sound field led to the development of four-channel, or quadraphonic, recording. Quadraphonic sound officially debuted in 1969 with the release of the first consumer-level four-channel reel-to-reel tape deck. Soon the quadraphonic process was being applied to both eight-track tapes and vinyl records. • By the early 1970s multiple quadraphonic technologies were competing in the marketplace. JVC's CD-4 system, introduced in 1971 for vinyl records, employed four discrete channels of audio information: front left, front right, rear left, and rear right. The SQ and QS systems, introduced in 1972 by CBS and Sansui, respectively, were both matrix technologies for vinyl records, in which the rear channel information was matrixed into the two front channels and then separated out by a surround decoder. RCA's Quad-8 format, introduced in 1970, was a discrete format designed specifically for eight-track tape players. • Unfortunately, the confusion generated by these competing technologies, along with the high cost of four-channel amplifiers and additional rear speakers, led to the abandonment of quadraphonic sound by the end of the decade. • PS: Quadrophenia never originally appeared in quad because the early mixes were unsatisfactory; it was only re-released recently!
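The matrix idea, four channels folded into two and then unfolded again, can be illustrated with a toy example. The encoding angles below are hypothetical and amplitude-only; they are not the SQ or QS coefficients, which also relied on phase shifts. Decoding uses the transpose of the encode matrix.

```python
import math

# Hypothetical encoding angles in degrees; NOT the real SQ or QS values.
ENCODE_ANGLES = {"LF": 22.5, "RF": 67.5, "LB": -22.5, "RB": 112.5}

def encode(channels):
    """Fold four channel samples (dict keyed LF/RF/LB/RB) into (Lt, Rt)."""
    lt = sum(v * math.cos(math.radians(ENCODE_ANGLES[k])) for k, v in channels.items())
    rt = sum(v * math.sin(math.radians(ENCODE_ANGLES[k])) for k, v in channels.items())
    return lt, rt

def decode(lt, rt):
    """Recover four speaker feeds from the two transmission samples."""
    return {
        name: lt * math.cos(math.radians(angle)) + rt * math.sin(math.radians(angle))
        for name, angle in ENCODE_ANGLES.items()
    }

if __name__ == "__main__":
    # A signal present only in the front-left channel.
    feeds = decode(*encode({"LF": 1.0, "RF": 0.0, "LB": 0.0, "RB": 0.0}))
    for name, level in feeds.items():
        print(f"{name}: {abs(level):.3f}")
```

In this sketch a front-left-only signal comes back at full level in the front-left feed, but it also leaks into the two adjacent feeds only about 3 dB down. That limited separation is typical of simple two-channel matrices and shows why discrete formats such as CD-4 were attractive despite their cost.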

5.1 • Cinema surround sound became digital in 1992 with the introduction of Dolby Digital Surround (originally known as Dolby AC-3, short for audio coding 3). This system still encodes the audio information on a film's optical tracks, but in digital format instead of in analog fashion. Dolby Digital also introduced more channels: a 5.1 configuration with left front, center front, right front, left surround, right surround, and a separate low frequency effects (LFE) channel for deep bass. • In 1993, the Digital Theater Systems company introduced the competing DTS surround sound technology. Like Dolby Digital, DTS is a digital 5.1-channel system. Unlike Dolby Digital, however, DTS records the audio channels on CD, which on playback is synchronized to the film's time code. The first DTS movie was 1993's Jurassic Park.

Larger Arrays and Diffusion

Binaural recording • The history of binaural recording goes back to 1881. The first binaural unit, the Théâtrophone, was an array of carbon telephone microphones installed along the front edge of the Opera Garnier. The signal was sent to subscribers through the telephone system, and required that they wear a special headset, which had a tiny speaker for each ear. • The novelty wore off, and there wasn't significant interest in the technology until around forty years later, when a Connecticut radio station began to broadcast binaural shows. Stereo radio had not yet been implemented, so the station actually broadcast the left channel on one frequency and the right channel on a second. Listeners would then have to own two radios, and plug the right and left earpieces of their headsets into each radio. Naturally, the expense of owning two radios was, at the time, too much for a broad audience, and again binaural faded into obscurity.

The modern era has seen a resurgence of interest in binaural, specifically within the audiophile community, partially due to the widespread availability of headphones and cheaper methods of recording. A small grassroots movement of people building their own recording sets and swapping them on the Internet has joined the very few CDs available for purchase. • Some of my work, mixed binaurally, is available at https://soundcloud.com/virtual440/jackdaws-revised-binaural-mix

Wave Field Synthesis • Wave Field Synthesis (WFS) is based on a series of simplifications of Huygens' principle, under which every point on a wave front can be treated as the source of a new elementary wave. The first work to have been published on the subject dates back to 1988 and is attributed to Professor A.J. Berkhout of the acoustics and seismology team of the Technological University of Delft (T.U.D.) in Holland. This research was continued throughout the 1990s by the T.U.D. as well as by the Research and Development department of France Telecom Lannion.

Content-Coding • WFS relies on an object-based description of a sound scene. To obtain an object-based description, one must decompose the sound scene into a finite number of sources interacting with an acoustical environment. The coding of such a sound scene includes the description of the acoustic properties of the room and the sound sources (including their positions and radiation characteristics). Separate from this spatial sound scene description is the coding of the sound stream itself (encoding of the sound produced by each source). The MPEG-4 format provides an object-based sound scene description that is compatible with WFS reproduction.

WFS reproduction • Work conducted on the subject of Wave Field Synthesis has allowed for a very simple formulation of the reproduction of omni-directional virtual sources using a linear loudspeaker array. The driving signals for the loudspeakers composing the array appear as delayed and attenuated versions of a unique filtered signal. The maximum spacing between two adjacent loudspeakers is approximately 15 to 20 cm. This allows for optimal localization over the entire span of the listening area.
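The "delayed and attenuated versions of a unique signal" formulation can be sketched as follows. This is a simplified illustration only: it assumes a straight array on the line y = 0, a virtual source behind it (y < 0), delays equal to distance divided by the speed of sound, and a plain 1/distance attenuation. The pre-filter and edge tapering used by real WFS renderers are left out, and all function names are hypothetical. The aliasing estimate f ≈ c / (2Δx) is a common rule of thumb linking loudspeaker spacing to the upper frequency of correct reproduction, not a figure from the slides.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def speaker_positions(count, spacing):
    """x-coordinates (m) of a linear array centred on x = 0, lying on y = 0."""
    offset = (count - 1) * spacing / 2.0
    return [i * spacing - offset for i in range(count)]

def point_source_drive(source_xy, xs):
    """Per-speaker (delay_s, gain) for a virtual point source behind the array.

    Delays are relative to the first speaker to fire; gains are normalised
    so that the speaker nearest the virtual source has gain 1.0.
    """
    sx, sy = source_xy
    dists = [math.hypot(x - sx, sy) for x in xs]
    nearest = min(dists)
    return [((d - nearest) / SPEED_OF_SOUND, nearest / d) for d in dists]

def aliasing_frequency(spacing):
    """Rule-of-thumb spatial aliasing frequency (Hz) for a given spacing (m)."""
    return SPEED_OF_SOUND / (2.0 * spacing)

if __name__ == "__main__":
    xs = speaker_positions(8, 0.17)
    for x, (delay, gain) in zip(xs, point_source_drive((0.0, -2.0), xs)):
        print(f"x={x:+.3f} m  delay={delay * 1e3:.3f} ms  gain={gain:.3f}")
    print(f"aliasing above about {aliasing_frequency(0.17):.0f} Hz")
```

Each loudspeaker then plays the same source signal, shifted by its delay and scaled by its gain.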

Elementary sources in Wave Field Synthesis • One can distinguish three separate types of virtual sources that are synthesizable using WFS systems: • Virtual point sources situated behind the loudspeaker array. This type of source is perceived by any listener situated inside of the sound installation as emitting sound from a fixed position. The position remains stable for a single listener moving around inside of the installation. • Plane waves. These sound sources are produced by placing a virtual point source at a seemingly "infinite" distance behind the loudspeakers (i.e. at a very large distance in comparison to the size of the listening room). Such sources have no acoustical equivalent in the "real world". However, the sun is a good illustration of the plane wave phenomenon in the visual domain. When travelling inside a car or train, one can have the impression that the sun is "following" the train while the landscape streams past at high speed. The sensation of being "followed" by an object that retains the same angular direction while one moves around inside of the listening area accurately describes the effect of a plane wave. • Virtual point sources situated in front of the loudspeaker array. An extension of the WFS principle allows the synthesis of sources within the listening area at positions where no physical sources are actually present. These "sound holograms" are created when a wave front created by the loudspeaker array converges onto a fixed position inside of the listening room. The wave front is then naturally re-emitted from the target position to the rest of the listening area. The sound field is therefore inaccurate between the loudspeaker array and the target position but perfectly valid beyond it. • Note that a linear loudspeaker array can synthesize a sound field associated with multiple sound sources simultaneously.
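The plane-wave and in-front ("focused") cases differ from the behind-the-array case mainly in how the per-speaker delays are laid out. The following is a rough sketch under the same simplifying assumptions as before (straight array on y = 0, delays only, no filtering or amplitude weighting); the function names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def plane_wave_delays(xs, angle_deg):
    """Delays (s) for a plane wave leaving the array at angle_deg from its normal.

    The delay grows linearly along the array, so every listener hears the
    same direction of arrival regardless of position.
    """
    s = math.sin(math.radians(angle_deg))
    raw = [x * s / SPEED_OF_SOUND for x in xs]
    earliest = min(raw)
    return [d - earliest for d in raw]

def focused_source_delays(xs, focus_xy):
    """Delays (s) for a virtual source in front of the array (focus y > 0).

    Speakers farthest from the focus fire first, so that all wave fronts
    arrive at the focus point at the same instant and converge there.
    """
    fx, fy = focus_xy
    dists = [math.hypot(x - fx, fy) for x in xs]
    farthest = max(dists)
    return [(farthest - d) / SPEED_OF_SOUND for d in dists]

if __name__ == "__main__":
    xs = [-0.30, -0.15, 0.0, 0.15, 0.30]
    print("plane wave, 30 deg:", [f"{d * 1e3:.3f}" for d in plane_wave_delays(xs, 30)])
    print("focused at (0, 1): ", [f"{d * 1e3:.3f}" for d in focused_source_delays(xs, (0.0, 1.0))])
```

The focused case is effectively the behind-the-array delay pattern reversed in time, which is why the field is only valid beyond the focus point.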

Ambisonics • Ambisonics is a full-sphere surround sound technique: in addition to the horizontal plane, it covers sound sources above and below the listener. • Unlike other multichannel surround formats, its transmission channels do not carry speaker signals. Instead, they contain a speaker-independent representation of a sound field called B-format, which is then decoded to the listener's speaker setup. This extra step allows the producer to think in terms of source directions rather than loudspeaker positions, and offers the listener a considerable degree of flexibility as to the layout and number of speakers used for playback.

Ambisonics can be understood as a three-dimensional extension of M/S (mid/side) stereo, adding additional difference channels for height and depth. The resulting signal set is called B-format. Its component channels are labelled W for the sound pressure (the M in M/S), X for the front-minus-back sound pressure gradient, Y for left-minus-right (the S in M/S) and Z for up-minus-down. The W signal corresponds to an omnidirectional microphone, whereas X, Y and Z are the components that would be picked up by figure-of-eight capsules oriented along the three spatial axes.
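The relationship between a source direction and the four B-format channels can be written down directly. The sketch below assumes the traditional first-order convention in which W is scaled by 1/√2, azimuth is measured anticlockwise from straight ahead and elevation upwards from the horizontal; other normalisations exist, and the function names are hypothetical. The decode shown is a basic virtual-cardioid pickup, not a production decoder.

```python
import math

def encode_b_format(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample into first-order B-format (W, X, Y, Z)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)                 # omnidirectional pressure
    x = sample * math.cos(az) * math.cos(el)    # front minus back
    y = sample * math.sin(az) * math.cos(el)    # left minus right
    z = sample * math.sin(el)                   # up minus down
    return w, x, y, z

def virtual_cardioid(b_format, azimuth_deg, elevation_deg=0.0):
    """Signal of a virtual cardioid microphone aimed at the given direction."""
    w, x, y, z = b_format
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    figure_of_eight = (x * math.cos(az) * math.cos(el)
                       + y * math.sin(az) * math.cos(el)
                       + z * math.sin(el))
    return 0.5 * (math.sqrt(2.0) * w + figure_of_eight)

if __name__ == "__main__":
    b = encode_b_format(1.0, azimuth_deg=90.0)   # source hard left
    for speaker_az in (45.0, 135.0, -135.0, -45.0):
        print(f"speaker at {speaker_az:+6.1f} deg: {virtual_cardioid(b, speaker_az):.3f}")
```

Because the four channels describe the field rather than any speaker feed, the same encoded signal can be "re-aimed" at whatever loudspeaker directions the listener happens to have.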

Short task Please read the following paper: • http://orbit.dtu.dk/fedora/objects/orbit:11808/datastreams/file_e935c831-3f20-4766-a547-d7e5ca067c54/content This will be discussed afterwards; topics for the discussion will include: • What do you understand by the term SST? • Why does an SST approach sometimes aim to disregard ‘social and environmental impacts’? In which ways can this be viewed as a positive step? • How can ANT be seen as a way of driving the commercialisation of innovation? • Is the concept of Micro/Meso/Macro level development consistent with an SST approach? • How can this text be applied to our understanding of surround sound systems?
