The Physics and Psycho-Acoustics of Surround Recording Part 2 David Griesinger Lexicon dgriesinger@lexicon.com www.world.std.com/~griesngr
Introduction • We all know how to make a good recording • We need good music • A very good performance • And satisfactory balance between the solos and the instruments. • But we want to make a great recording • How do we do it? • How do we know when a recording is great? • We must learn how to hear the technical quality of a great recording, • And learn how to achieve the best result. • The talk is based on classical music – but the techniques and perceptions apply to all recordings.
The recording space is very important! • It is much easier to achieve a great result in a large hall. • But large halls with great acoustics are rare. • Our job is to make a great result in the hall we have available (usually small). • This talk will tell you how to do it. • And help you hear the difference. • We will not talk about issues such as instrumental balance • or the differences between microphones or sample rates. • We will talk about basic sound properties: • The clarity and localization of the direct sound • The perceived distance between the sound source and the listener (depth) • The recording and reproduction of the sound of the hall.
Major Goals • To review the physical and psychoacoustic properties that make a great recording (or a great performance space). • The clarity of the direct sound (the absence of muddiness) • The creation of a large listening area and a stable front image – using three front speakers in a 5.1 recording. • The blending together of the different instruments into a whole acoustic scene through early reflections. • The re-creation of the acoustic space of the performance, through late reflections and envelopment. • To show how muddiness occurs when there are too many early reflections • To show how we perceive muddiness through our perception of pitch. • To show how the loudspeaker positions in the playback room influence the envelopment at low frequencies. • To play as many musical examples as possible!
Localization – a stable front image over a large listening area • In a high-quality recording the front image does not greatly change when a listener moves away from the sweet spot. • Image stability requires using the center channel speaker in a 5.1 recording. • Even without the center speaker some two-channel recordings are more stable than others. • Popular music recordings are often better than classical recordings in image stability. • The secret is Amplitude Panning • Which is almost universally used in popular music recording.
Time delay panning • Many engineers attempt to record a broad sound source with closely spaced microphones • Omni microphones are often used in a so-called “Decca Tree”. • Cardioid microphones are often used in the “ORTF” configuration • Both these techniques rely on time delay differences to spread the front image • Time delay spreading only works when the listener is in the sweet spot. • The front image is not stable over a large area.
Training to hear localization • The importance of ignoring the sweet spot • Most research tests of localization use a single listener, who is strictly restricted to the sweet spot. • Your customers will not listen this way! • How do you know if the recording has a stable front image? • Move laterally in front of the loudspeakers. Does the sound image stay wide and fixed to the loudspeakers, or does it follow you? • Do the soloists in the center follow you left or right? If they do they are recorded with too much phantom center. • Since most 5 channel recording methods are derived from stereo techniques almost all have too much phantom center. • A center image that follows a listener who moves laterally out of the sweet spot is the most common failing of even the best five channel recordings. • Play examples
Example: Time delay panning outside the sweet spot. Record the orchestra with a “Decca Tree” - three omni microphones separated by one meter. A source on the left will be picked up with equal level in all three microphones. The time delays will be different by ±3ms. On playback, a listener on the far right will hear this instrument coming from the right loudspeaker. This listener will hear every instrument coming from the right.
Amplitude panning outside the sweet spot. If you record with three widely spaced microphones, an instrument on the left will have high amplitude in the left microphone. The time delay will also be much shorter. A listener on the far right will hear the instrument on the left. Now the orchestra spreads out across the entire loudspeaker basis, even when the listener is not in the sweet spot.
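The contrast between the two slides above can be checked with simple geometry. The following is a minimal sketch, not the author's calculation: it assumes free-field propagation, omni microphones, the 1/r level law, a speed of sound of 343 m/s, and hypothetical positions (three microphones in a line, a source 6 m to the left and 3 m in front of the array).

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def pickup(source, mic):
    """Return (arrival time in ms, level in dB re 1 m) for an omni mic in free field."""
    distance = math.dist(source, mic)
    delay_ms = 1000.0 * distance / SPEED_OF_SOUND
    level_db = -20.0 * math.log10(distance)  # inverse-distance (1/r) law
    return delay_ms, level_db

def relative_to_center(source, mics):
    """Delay and level of each microphone relative to the center microphone."""
    ref_delay, ref_level = pickup(source, mics["C"])
    result = {}
    for name, pos in mics.items():
        delay_ms, level_db = pickup(source, pos)
        result[name] = (delay_ms - ref_delay, level_db - ref_level)
    return result

# Hypothetical layouts, coordinates in meters (x = left/right, y = toward the stage)
source = (-6.0, 3.0)
close_spaced = {"L": (-1.0, 0.0), "C": (0.0, 0.0), "R": (1.0, 0.0)}
wide_spaced = {"L": (-6.0, 0.0), "C": (0.0, 0.0), "R": (6.0, 0.0)}

for label, mics in (("close spacing", close_spaced), ("wide spacing", wide_spaced)):
    for name, (d_ms, d_db) in relative_to_center(source, mics).items():
        print(f"{label:13s} {name}: {d_ms:+6.2f} ms  {d_db:+5.1f} dB")
```

With the close spacing the three signals differ by a few milliseconds and very little in level, so the image depends on timing alone. With the wide spacing the left microphone is clearly louder as well as earlier, which is the amplitude cue that survives outside the sweet spot.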
WARNING!!! • In the author’s experience a front image that is not stable when you walk in front of the speakers will never make a great recording. • regardless of how beautiful it is in the sweet spot. • This is my FIRST test of a recording, either two channel or surround.
Summary of acoustic perceptions in a recording • 1. Clarity – the lack of muddiness • Clarity is perceived through the direct sound – sound that travels directly from the instrument to the microphone. • A clear direct sound requires that the microphone be relatively close to the instrument! • 2. Blend and depth • Blend and depth are perceived through early reflections that arrive from all around the listener. • The total energy in these early reflections must be less than the energy in the direct sound! • In a surround recording these reflections should come equally from all the loudspeakers (except the center), and they must be decorrelated (different in each loudspeaker). • 3. Envelopment (reverberation) • Envelopment is perceived through late reflected energy that arrives from all around the listener. (Not just from the rear!) • The energy must be decorrelated in each loudspeaker
Clarity • Clarity to an acoustician is determined through intelligibility – the ability to understand speech or a musical line. • For this talk I will use a different meaning: • For me clarity is the perception that the sound source is acoustically close to the listener. • While this definition may seem vague, almost everyone agrees on the optimal acoustic distance for a recorded sound source. • We can demonstrate this perception:
Muddiness: Dry Speech + 40ms reflections Mono speech: The sound is clear, but much too close to the loudspeaker. Speech with ~40ms allpass reflections and no direct sound. Mono: Stereo: Note that both the mono and the stereo versions sound muddy and distant. There is no phantom image in the stereo version.
Reflections used in these experiments The reflections used in these experiments form a decaying burst which peaks about 25ms after the direct sound, and has largely decayed away by 50ms. The reflections are different in the two channels, and have a flat frequency response.
Optimum level for Early Reflections • Recorded sound consists of a mix of direct sound and reflections • Too many reflections and muddiness results. • But reflections add a sense of blend and depth. • An optimum mix must be found. • The optimum level for early reflections is -4 to -6dB relative to the direct sound. • This level is preferred by almost every listener. • In a surround recording the reflections should come equally from all directions (except the center), and be decorrelated. • The perceived result is independent of the precise delay time and the pattern of the reflections. • It is the total energy which determines the perception.
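Since the slide above says that the total reflected energy is what matters, the level of a reflection pattern can be computed by summing tap energies. This is a minimal sketch under stated assumptions: the reflections are treated as decorrelated, so their energies add, and the tap gains below are hypothetical values chosen only for illustration.

```python
import math

def reflections_level_db(direct_gain, reflection_gains):
    """Total reflected energy relative to the direct sound, in dB."""
    reflected_energy = sum(g * g for g in reflection_gains)
    return 10.0 * math.log10(reflected_energy / (direct_gain * direct_gain))

def scale_to_level(direct_gain, reflection_gains, target_db):
    """Scale all reflection gains by one factor so their total energy hits target_db."""
    current_db = reflections_level_db(direct_gain, reflection_gains)
    factor = 10.0 ** ((target_db - current_db) / 20.0)
    return [g * factor for g in reflection_gains]

# Hypothetical pattern: linear gains of six reflections (their delays do not enter here)
taps = [0.30, 0.45, 0.40, 0.25, 0.15, 0.10]
print(f"before scaling: {reflections_level_db(1.0, taps):+.1f} dB")

scaled = scale_to_level(1.0, taps, -5.0)
print(f"after scaling:  {reflections_level_db(1.0, scaled):+.1f} dB")
```

Scaling every tap by the same factor keeps the shape of the pattern and only moves its total level, which fits the statement that the precise pattern is not what listeners respond to.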
Depth without Muddiness • Dry speech • Note the sound is uncomfortably close • Mix of dry with early reflections at -5dB. • The mix has distance (depth), and is not muddy! • Note there is no apparent reverberation, just depth. • Same but with the reflections delayed 20ms at -5dB. • Note also that with the additional delay the reflections begin to be heard as discrete echoes. • But the apparent distance remains the same. • Same but with the reflections delayed 50ms at -3dB • Now the sound is becoming garbled. These reflections are undesirable! • If the speech were faster it would be difficult to understand. • Same but with reflections delayed 150ms at -12dB • I also added a few reflections between 20 and 80ms at a level of -8dB to smooth the decay. • Note the strong hall sense, and the lack of muddiness.
The ideal mix • We see from the previous slide that the ideal acoustic mix has three independent perceptual requirements: • 1. The direct sound dominates the total energy by at least 4dB. • 2. There are early reflections that add blend, distance, and depth to the sound. • These should come equally from all directions in a surround recording • And they should avoid adding energy in the 50ms to 100ms time region. • 3. There should be reflections (reverberation) with time delays greater than 150ms to provide the impression of the hall. • To make a great recording we must separately capture all three!
Direction of early reflections • It is not possible to detect whether the reflections come from the front or the rear when they arrive between 20ms and 50ms after the end of a sound. • But it is more natural if they come from both front and rear. • Using all four speakers also results in the largest sweet spot - demo
Muddiness is hard to avoid in small spaces! • We are attempting to show that the optimum total energy for all reflections is at least 4dB less than the direct sound. • The total reflected energy sum does not include the floor reflection. • I will explain why later if there is time. • The direct sound must dominate the total sound picture • The reverberation radius of a small hall or church is usually below 2m, and may be as low as 1m. • Every microphone used in the recording picks up both direct sound and reverberation. • But only the microphone closest to the sound source picks up true direct sound. • Direct sound into all the other microphones is perceived as a reflection, and adds to the potential distance and muddiness.
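The reverberation radius used above can be estimated from room volume and reverberation time. The sketch below uses the common textbook diffuse-field approximation r ≈ 0.057·sqrt(V/RT60) for an omnidirectional source and microphone; this formula and the example room are assumptions for illustration and are not taken from the talk.

```python
import math

def reverberation_radius_m(volume_m3, rt60_s):
    """Textbook diffuse-field estimate for an omnidirectional source and microphone."""
    return 0.057 * math.sqrt(volume_m3 / rt60_s)

def direct_to_reverb_db(distance_m, radius_m):
    """Direct/reverberant energy ratio for an omni mic; 0 dB at the reverberation radius."""
    return 20.0 * math.log10(radius_m / distance_m)

# Hypothetical small church: 1500 m^3 with RT60 = 1.8 s
radius = reverberation_radius_m(1500.0, 1.8)
print(f"reverberation radius ~ {radius:.2f} m")
for d in (0.5, 1.0, 2.0, 4.0):
    print(f"omni at {d:3.1f} m: D/R = {direct_to_reverb_db(d, radius):+5.1f} dB")
```

Under this model, keeping the direct sound 4 dB above the reverberation with an omni microphone requires a distance of at most about 0.63 times the reverberation radius, since 10^(-4/20) ≈ 0.63.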
Muddiness also comes from the playback room! In this room there is no absorption in the front, and thus the reverberation radius is small, perhaps as low as 2.5m. The distance from the front loudspeakers to the listeners is greater than the reverberation radius. So the reverberation will be stronger than the direct sound. We are trying to keep the direct sound stronger than the reflections by 4dB. This goal is probably not possible to achieve in this room! (Except at frequencies above 1000Hz, where the side curtains begin to be absorptive.) Always mix your recordings in an absorbent space!
Boston Cantata Singers, Cantata #76, “Die Himmel erzählen die Ehre Gottes”. Performance in Jordan Hall, January 23, 2004. Reverberation time in Jordan ~1.4 seconds at 1000Hz. This is similar to the Semperoper Dresden. The typical audience member is ~ 3 reverb radii from this singer. The dramatic consequences are highly audible. Although Jordan is beloved as a chamber music hall, the stage house is deep and reverberant. When the hall is full, the sound in the audience can be dry and muddy. The recording engineer must overcome these obstacles.
Cantata Singers Bach BWV 76 Multimiked recording. Note the clarity of vocal timbre (low sonic distance). Recording simulating the sound in the hall. Note the timbre coloration and the sense of distance to the performers. With the picture and after adaptation the performance is quite enjoyable.
The Ideal Reverberation • has 20ms to 50ms reflections with a total energy -4dB to -6dB • has relatively little energy from 50 to 150ms.
Most small rooms – (including playback rooms) • Have exponential decay • If we pick up enough late reflections to hear the hall, we will get too many early reflections. • We will get coloration and poor intelligibility.
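The trade-off described above follows from the shape of an exponential decay. This is a minimal sketch assuming an ideal exponential decay that starts at the direct sound (60 dB of decay over RT60); real rooms deviate from this, and the RT60 values are only examples.

```python
import math

def energy_after(t_s, rt60_s):
    """Fraction of the reverberant energy arriving later than t_s (ideal exponential decay)."""
    return math.exp(-6.0 * math.log(10.0) * t_s / rt60_s)

def window_energy_db(t_start_s, t_end_s, rt60_s):
    """Energy arriving between t_start and t_end, relative to the total reverberant energy."""
    fraction = energy_after(t_start_s, rt60_s) - energy_after(t_end_s, rt60_s)
    return 10.0 * math.log10(fraction)

for rt60 in (0.5, 1.0, 1.8):
    early = window_energy_db(0.0, 0.050, rt60)
    middle = window_energy_db(0.050, 0.150, rt60)
    late = 10.0 * math.log10(energy_after(0.150, rt60))
    print(f"RT60 {rt60:.1f} s: 0-50 ms {early:+5.1f} dB, "
          f"50-150 ms {middle:+5.1f} dB, >150 ms {late:+5.1f} dB")
```

For each of these example values the 50 to 150 ms window carries at least as much energy as everything after 150 ms, so raising the level of the hall sound raises the unwanted middle region along with it.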
Example of a small recording space: Swedenborg Chapel, Cambridge
Recording in Swedenborg Chapel, Cambridge • The chapel holds perhaps 200 people, but when it is empty the RT is ~ 1.8 seconds. • And the reverberation radius is ~ 1.5m • The picture shows four supercardioid microphones about 1m from the chorus. These provide the direct sound. • With the supercardioid pattern we have a 6dB direct/reverberant ratio, so the reverberation is less than the direct sound by about 6dB. • Note that in this space we must add hall sound and early reflections very carefully, or the sound will become muddy! • In addition the early reflections and reverberation arrive soon after the direct sound. The sound seems small and cramped. There is no sense of space around the direct sound. • The chorus microphones are as close as they can be to the chorus without creating balance problems. • We cannot exclude the early reverberation by moving the mikes closer.
Main microphones in Swedenborg Chapel • The picture also shows two variable pattern microphones about 2m from the chorus. • I put these there for an experiment. The sound is not very good… • The problem with a “main microphone” pair in this space is that it must be placed too far from the singers! • A main pair must be at least 2.5m away or there will be balance problems. • This distance is beyond the reverberation radius, and the sound will be muddy.
Hall Sound in Swedenborg • The chapel is reverberant – with a high reverberation level • But the reverberation is too strong in the 10-150ms time range. • Using cardioid microphones pointing away from the sound source reduces the early reverberation energy and maximizes the late energy. • The hall sounds larger and better.
Distance Perception and MUD • Reflections during the sound event and up to 150ms after it ends create the perception of distance • But there is a price to pay: • Reflections from 10-50ms do not impair intelligibility. • The fluctuations they produce are perceived as an acoustic “halo” or “air” around the original sound stream. (ESI) • Reflections from 50-150ms contribute to the perception of distance – but they degrade both timbre and intelligibility, producing the perception of sonic MUD. • We will have many examples of mud in this talk!
Training to hear MUD • Mud occurs when the reverberant decay of the recording venue has too much reflected energy in the 10-150ms region of the decay curve. • This is true of nearly all sound stages, small auditoria, and churches. • If you are recording in such a space with a relatively large ensemble, you are in trouble. • The perception of mud can be tricky, because our hearing mechanism adapts to a muddy environment, and the sonic degradation becomes inaudible after about 10 minutes. • It is easy to convince yourself the recording is excellent when you have been listening to it all day. • This is why we can enjoy a concert even when we are sitting far from the instruments. • You MUST compare your recording to a reference recording in a short time A/B test.
Example: John Eargle at Skywalker Ranch • John Eargle has made wonderful recordings, particularly those with the Dallas Symphony on Delos Records • But even he can be fooled by a small space • As I said, you adapt quickly to such a space, and no longer hear the mud that it produces. • John Eargle recently made a 5.1 channel DVD audio recording at the Skywalker Ranch in Marin County, California. • He was very excited by it – but listen and compare to Dallas. • Skywalker is a large sound stage with controllable acoustics. It is not a concert hall. • As a consequence the reverberation radius is relatively short. By my estimate (without having seen it) the radius is less than 3.5 meters. • It is very easy to record mud in such a space. • Many instruments are beyond the reverb radius. • Adding more microphones only increases the reverberant pickup.
Recording in a large space is much easier! Covenant church is a very large space, holding more than 1000 people. It is damped by pew cushions and acoustic treatment on the walls, yielding a RT of 2.5 seconds and a large reverberation radius – probably above 3m. The microphones can be quite distant without picking up early reflections or reverberation. It is a very good place to record! (And it is exceptionally beautiful visually…)
Example – depth perspective through mike technique: • When the reverberation radius is large enough we can use an extra pair of microphones to create a single early reflection. • This can provide the needed perspective and depth Direct sound: Early reflection: Late reverberation: Direct + Early -5dB: Direct + Early + Late -8dB: Mike 480L Mike 480L
The depth impression is greatly improved in surround • I will run the same experiment, but use all five speakers. • The early reflections will come from both the front and rear equally, but different delay patterns will be used for each speaker. • This means the reflections are decorrelated. • The late (hall) reflections will also come equally but decorrelated in the front and rear speakers. • This will create a large and uniform sweet spot for the acoustics.
The Polyhymnia Pentangle • The Polyhymnia engineers employ a surround array of spaced omni microphones, at a spacing similar to the ITU playback array. • The technique works well in spaces where the reverberation radius is equal to or greater than the microphone spacing! • In this case the direct sound picked up by the rear microphones is perceived as an early lateral reflection and adds distance to the front image. • Caution!! In a small hall this array will be TOO MUDDY!!! In practice the Polyhymnia engineers often pick up the direct sound with accent microphones. In this case the front microphones provide a first reflection to the front speakers. The center microphone is also often moved closer to the sound sources, so it picks up mostly direct sound.
Boston Symphony Hall • 2631 seats, 662,000ft^3, 18700m^3, RT 1.9s • It’s enormous! • One of the greatest concert halls in the world – maybe the best. • Recording here is almost too easy! • Working here is a rare privilege • Sufficiently rare I do not do it. (It’s a union shop.) • The recording in this talk is courtesy of Alan McClellan of WGBH Boston. (Mixed from 16 tracks by the presenter) • Reverb Radius is >20’ (>6m) even on stage. • The stage house is enormous and NOT reverberant. With the orchestra in place, stage house RT = ~1 sec
Boston Symphony Hall, occupied, stage to front of balcony, 1000Hz This picture compares favorably to our picture of the ideal reverberation on a recording. But this is what an audience member hears 100 feet from the stage!
Beware the “main microphone” array • Nearly all engineers will provide a “main microphone” usually a “Decca Tree”, or a pair of omni or cardioid microphones. • Almost always the sound from this array is only acceptable for instruments close to the microphones. • Most of the instruments are far beyond the reverberation radius. • The more distant instruments must be spot-miked. • A cardioid pair (ORTF) has too much phantom center for an acceptable surround recording. (this is a two-channel technique only.) • Very frequently time delay panning (for a Decca Tree or spaced omnis) makes the sound unusable in a high-quality mix. • Time delay panning makes the front image unstable • Closely spaced microphones yield high correlation at low frequencies, which degrades the sense of space. • It is better to simply turn off the main microphone (even if your instructor insists you install one.) • In our Boston Symphony Hall recording a pair of B&K omnis spaced ~25cm was hung behind the conductor by the WGBH engineer. Front pair Front pair LF
Correlation in the “main microphone:” two omnis spaced by ~25cm, just behind the conductor. ___ = measured correlation; - - - = calculated, assuming d=25cm The high correlation in this pair makes the sound unusable in a stereo or surround mix. It sounds unpleasant even in this lecture room, as the audio demo makes clear.
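The high low-frequency correlation of a closely spaced omni pair can be illustrated with the standard ideal diffuse-field result sin(kd)/(kd). This is a sketch only: the slide does not state which formula produced its calculated curve, and the speed of sound and the 3 m comparison spacing are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def diffuse_field_correlation(freq_hz, spacing_m):
    """Theoretical correlation of two omni mics in an ideal diffuse field: sin(kd)/(kd)."""
    kd = 2.0 * math.pi * freq_hz * spacing_m / SPEED_OF_SOUND
    return 1.0 if kd == 0.0 else math.sin(kd) / kd

for f in (50, 100, 200, 400, 686, 1000):
    near = diffuse_field_correlation(f, 0.25)  # spacing from the slide
    far = diffuse_field_correlation(f, 3.0)    # hypothetical wide pair for comparison
    print(f"{f:5d} Hz: d=0.25 m -> {near:+.2f}   d=3 m -> {far:+.2f}")
```

In this model the 25cm pair stays close to fully correlated through the bass range and first reaches zero near c/(2d), about 686Hz, while a pair spaced by several meters is already largely decorrelated at 100Hz.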
Beware the exclusive use of spaced front microphones • In our recording the wide front orchestra pick-up is fine for the first row of the strings. • But nearly all the orchestra is beyond the reverberation radius for these microphones. • If we want good balance and clarity, we must use additional microphones over the orchestra • And treat these microphones as part of our “main” array. • Using cardioid microphones in front will help a lot. • The cardioid is 4.7 dB less sensitive to reverberation, which will pick out more distant instruments with clarity. • Using super cardioid microphones will help a little bit more. • But if the stage house is reverberant the improvement is minimal. • The author greatly prefers to use (equalized) directional microphones for orchestra and chorus pick-up. • After equalization the bass performance is adequate. • There is better control of leakage, and less MUD.
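The reduced sensitivity of directional microphones to reverberation can be estimated from the first-order pattern a + (1 - a)·cos(θ). The sketch below uses the standard random-energy-efficiency expression a² + (1 - a)²/3 for an ideal diffuse field; the pattern coefficients are nominal textbook values, not measurements of any particular microphone.

```python
import math

def random_energy_efficiency(a):
    """Diffuse-field energy pickup, relative to an omni, of the pattern a + (1 - a) * cos(theta)."""
    b = 1.0 - a
    return a * a + b * b / 3.0

def directivity_index_db(a):
    """How many dB less reverberation the pattern picks up than an omni."""
    return -10.0 * math.log10(random_energy_efficiency(a))

def distance_factor(a):
    """How much farther than an omni the mic can sit for the same direct/reverberant ratio."""
    return 1.0 / math.sqrt(random_energy_efficiency(a))

# Nominal first-order coefficients (a = omni component)
patterns = {"omni": 1.0, "cardioid": 0.5, "supercardioid": 0.366, "figure-8": 0.0}
for name, a in patterns.items():
    print(f"{name:14s} DI = {directivity_index_db(a):4.1f} dB, "
          f"distance factor = {distance_factor(a):.2f}")
```

For the ideal cardioid this gives about 4.8 dB, close to the figure quoted above, and the supercardioid gains roughly one more dB, consistent with the remark that it helps only a little more. The advantage applies to on-axis sources in a diffuse field, so it shrinks when strong reflections arrive from the front, as in a reverberant stage house.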
Balance and distance come first • In any recording the balance between the musical forces should reflect the needs of the music. • In this recording, even with 120 singers the chorus is nearly inaudible in the hall. • So we must heavily use the chorus accent microphones. • In the final mix MOST of the energy in the recording will come from these. In practice, these are our MAIN microphones! • However, if we heavily use the chorus microphones, the chorus will sound too close to the loudspeakers • And in front of the orchestra. • To correct this distance problem we MUST use electronic early reflections. • There is no other possible solution. • Play example
Let’s build the hall sound • We need decorrelated reverberation in both the front and the rear with equal level • Test just the hall microphones to see if the reverberation is enveloping and uniform. • Then add the front microphones for the direct sound. • Where the hall balance is not correct you MUST augment the natural reverberation with electronics. • In this recording the orchestra is much stronger than the chorus – even with 120 singers – and there is too little chorus in the natural reverberation!! • When we add the accent microphones the chorus will sound as if they are in a smaller space. • So we add electronic reverberation from the chorus (equally in all four outer speakers) from the surround reverberator.
Final Mix • The final mix uses the three omni microphones over the chorus as the main microphones. They are simply patched to left, center, and right. • The spot microphones for the soloists are mostly mixed to the center, with some panning to the left or right. (No divergence was used.) • The orchestra is a combination of two wide spaced omnis patched to left front and right front. • Augmented by spot microphones over the woodwinds and the more distant strings. • The center channel was provided automatically through leakage from the soloists’ microphones. • The rear channels come from a widely spaced pair of omnis about 20 feet behind the conductor, • Extensively augmented by electronic early reflections and late reverberation.
Hall sound: decorrelation at low frequencies. • It is widely believed that localization is impossible below 100Hz. • So a single subwoofer has become the standard for reproducing low frequencies. • Although localization below 100Hz is difficult in a small room, there is a large difference between a single subwoofer and an independently driven pair. • We have turned off the subwoofer in this room and we are running the other speakers full-range. • A great recording will easily demonstrate the difference between a single subwoofer and full-range discrete speakers. • As a consequence you must be sure the hall sound in your recordings is decorrelated at low frequencies! • Both in the front and in the rear of a surround recording. • Most single microphone array surround techniques fail for this reason.