SOUND Laura Hyland cs.aue.auc.dk/~laura
What is sound?
Sound is a form of energy. When we give energy to a body (by hitting or exciting it) we set that body in motion: it vibrates. The body in turn sets the air around it in motion, causing it to vibrate too. The vibrations in the air mirror the vibrations of the body. These vibrations travel through the air as waves until they reach our ears, where we perceive them as sound.
For example, if we excite a tuning fork by striking it, the energy given causes the tines to vibrate very quickly. The movement of the tines is too small for the eye to perceive but not for the ear. The vibrations can also be felt when the tuning fork is brought into contact with the skin.
Sound II
As the tines move back and forth they exert pressure on the air around them.
(a) The first displacement of the tine compresses the air molecules, causing high pressure.
(b) An equal displacement of the tine in the opposite direction lets the molecules spread widely apart, causing low pressure.
(c) These rapid variations in pressure over time form a pattern which propagates through the air as a wave.
Points of high and low pressure are sometimes referred to as 'compression' and 'rarefaction' respectively.
[Figure: (a) compression, (b) rarefaction and (c) wave propagation of a tuning fork, as seen from above]
Simple harmonic motion
If we look at the way one tine moves, we see that it moves back and forth in one plane only. In addition, each back-and-forth swing takes the same amount of time. We could say that it is trying to reach equilibrium: it is trying to get back to its original stable position. This kind of response to excitation is characteristic of many objects.
A pendulum is another example of this movement, and it is easier to visualize than the tines of a tuning fork! Think of the movement of a pendulum when put into motion; it swings back and forth but will always come to rest at its original position. This is called Simple Harmonic Motion.
Simple harmonic motion
Note that as the pendulum approaches equilibrium its swings do not take any longer: it simply travels a smaller distance from the point of rest in the same time. This is also the case for the tine of the tuning fork. Thus, we can say that any body undergoing simple harmonic motion moves periodically, with a constant period. We can also say that if the tine is moving periodically then the pressure variations it creates will also be periodic.
[Figure: a pendulum swinging between positions a and b, shown at maximum displacement at 0 seconds, after (say) 3 seconds and after (say) 6 seconds. The time taken to get from position a to b is the same in all three cases.]
The unit circle
These pressure patterns can be represented using a circle. Imagine the journey of the pendulum or the tine in four stages:
1) from its point of rest to its first point of maximum displacement...
2) ... from its first point of maximum displacement back to the point of rest...
3) ... on to its second point of maximum displacement...
4) ... and from there back to its point of rest again.
We can map that journey onto a circle. This is called the Unit Circle. The sine wave represents this journey around and around the unit circle over time.
[Figure: the unit circle divided into stages 1 to 4, unrolled into a sine wave along the time axis]
Sine wave
The sine wave (also called a sinusoid or sinusoidal signal) is probably the most commonly used graphic representation of sound waves. The diagram below shows one cycle or 'period' of a wave, i.e., the build-up from equilibrium to maximum high pressure, to maximum low pressure, to equilibrium again.
A sine wave sounds like this... [audio example]
[Figure: one period of a sine wave. Vertical axis: pressure or density of air molecules ('amplitude'), from +1 (high pressure or 'compression') through 0 to -1 (low pressure or 'rarefaction'). Horizontal axis: time in seconds, from 0 to 1.]
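Sine waves
The specific properties of a sine wave are described as follows.
• Amplitude = the size of the variations in air pressure (its level is usually expressed in decibels)
• Phase = the starting point of the wave along the y-axis, i.e., where in its cycle the wave begins (measured in degrees)
• Period = the time needed for the wave to return to exactly the same point in its cycle
[Figure: a sine wave plotted over 1 second]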
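The three properties above are enough to compute the value of a sine wave at any moment. The following is a minimal sketch, not part of the original slides; the function name `sine_value` and its parameter names are assumptions chosen for illustration.

```python
import math


def sine_value(t, amplitude=1.0, period_s=1.0, phase_deg=0.0):
    """Value at time t (in seconds) of a sine wave.

    amplitude : peak size of the variation
    period_s  : time taken for one full cycle, in seconds
    phase_deg : starting point within the cycle, in degrees
    """
    # One period corresponds to one full turn (2*pi radians) of the unit circle.
    angle = 2.0 * math.pi * t / period_s + math.radians(phase_deg)
    return amplitude * math.sin(angle)


# A quarter of the way through its period the wave is at its positive peak.
print(sine_value(0.25))
# A 90 degree phase shift means the wave already starts at its peak.
print(sine_value(0.0, phase_deg=90.0))
```

With the default period of 1 second the wave returns to 0 at t = 1, matching the one-period diagram.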
Frequency
Frequency refers to the number of cycles of a wave per second. It is measured in Hertz (Hz). So if a sinusoid has a frequency of 100 Hz, one period of that wave lasts 1/100th of a second. The diagram below shows an 800 Hz sine (a) and a 100 Hz sine (b): for every 8 periods of (a) there is one period of (b). Humans can hear frequencies between 20 Hz and 20,000 Hz (20 kHz).
There are three important things to remember about frequency:
1) Frequency is closely related to, but not the same as, pitch!
2) Frequency does not determine the speed a wave travels at. Sound waves travel at approximately 340 metres/second regardless of frequency. Frequency f, speed c and wavelength λ are related by f = c/λ.
3) Frequency is inherent to, and determined by, the vibrating body, not the amount of energy used to set that body vibrating. For example, the tuning fork emits the same frequency regardless of how hard we strike it.
[Figure: (a) 800 Hz sine, (b) 100 Hz sine]
Wavelength
Wavelength describes the length of one period of a wave, or twice the distance between one zero crossing point and the next. (A zero crossing point is a point at which the wave crosses the x-axis. It represents a point at which there is no pressure variation, i.e., where air molecules return to their original position.)
It is important to have a sense of the actual physical size of a wave. The speed of sound in air is approximately 340 metres per second*. Consider a wave of frequency 20 Hz, i.e., a pressure pattern repeating itself 20 times a second. 20 periods back to back have a length of 340 metres, so 1 period = 340/20 = 17 metres. Similarly, a wave at a frequency of 20 kHz will be 340/20,000 = 0.017 metres, or 1.7 cm, in length. This property is important for formants!
Wavelength = Speed of sound in air / Frequency
[Figure: one period of a sine wave between +1 and -1, with the zero crossing points marked]
*This is dependent on air temperature: the higher the temperature, the more freely air molecules will move, and therefore the faster the wave will travel.
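The wavelength formula can be checked with a few lines of code. This is a small sketch using the slide's approximate speed of sound of 340 m/s; the function and constant names are assumptions made for illustration.

```python
SPEED_OF_SOUND_M_PER_S = 340.0  # approximate value used on the slide


def wavelength_m(frequency_hz, speed_m_per_s=SPEED_OF_SOUND_M_PER_S):
    """Wavelength in metres: speed of sound divided by frequency."""
    return speed_m_per_s / frequency_hz


# The two ends of the audible range quoted in the slides.
print(wavelength_m(20.0))      # lowest audible frequency
print(wavelength_m(20000.0))   # highest audible frequency
```

The two calls reproduce the worked figures: 17 metres at 20 Hz and 0.017 metres at 20 kHz.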
Amplitude
Amplitude describes the size of the pressure variations. It is measured along the vertical y-axis. Think of the pendulum or the tuning fork: the wider the displacement of the pendulum or tine, the larger the amplitude. Amplitude is closely related to, but not the same as, loudness! This is why the tuning fork sounds louder when we strike it hard. We will examine the relationship between amplitude and loudness later...
[Figure: (a) two signals of equal frequency and differing amplitude; (b) two signals of differing frequency and equal amplitude]
Amplitude Envelope
The amplitude of a wave changes over time, eventually 'decaying' as it loses energy. These changes are normally broken down into four stages: Attack, Decay, Sustain and Release. Each stage is measured in milliseconds. For example, the signal below has an attack of 100 ms. That means its amplitude goes from 0 to 0.8 in 100 ms. Similarly, it has a decay of 90 ms, so its amplitude goes from 0.8 to ~0.38 in 90 ms, etc. Collectively, the four stages are described as the amplitude envelope.
[Figure: amplitude envelope showing the Attack, Decay, Sustain and Release stages, with slow-attack and fast-attack examples. Vertical axis: amplitude, 0 to 0.8. Horizontal axis: time, 0 to 700 ms.]
Phase
Consider the Unit Circle again. So far we have mapped the journey around the circle starting from the point corresponding to an amplitude of 0 on the y-axis, but we can offset the starting point so that we begin mapping from another point on the circle. The offset from the point 0 on the circle determines the initial phase of the sine wave, i.e., the starting point of the wave along the y-axis. This offset is the phase of the sine wave and it is measured in degrees. Figure (b) below has a phase shift of 170 degrees.
[Figure: (a) sine wave starting at 0 degrees, with the unit circle marked in degrees; (b) the same wave with a phase shift of 170 degrees]
Phase
Why offset the start time? As we will see later, it's often necessary to look at several signals together, each one having a different start time. The easiest way to represent this time difference is using phase. For example, take two 100 Hz signals. Say (a) starts at 0 seconds and (b) starts at 1.375 secs. At 100 cycles per second, (a) has completed 137.5 cycles by then, so it will be halfway through its 138th cycle when (b) begins. We can represent this time difference by giving (a) a phase shift of 180 degrees, or ½ of a cycle. That is, when (b) is starting out at amplitude 0 and rising, (a) is passing through amplitude 0 and falling. So, phase can be defined as the representation of the time delay between two signals.
[Figure: signals (a) and (b) plotted from 0 secs to 1.375 secs]
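The conversion from a time delay to a phase offset can be sketched as follows. The function name and the choice to report the result in the range 0 to 360 degrees are assumptions made for illustration: the delay is multiplied by the frequency to count elapsed cycles, and the fractional part of that count is scaled to degrees. For a 1.375 s delay at 100 Hz this gives 137.5 cycles, i.e. a half-cycle offset of 180 degrees.

```python
def phase_shift_degrees(delay_s, frequency_hz):
    """Phase offset (0 to 360 degrees) equivalent to a time delay.

    delay_s      : time difference between the two signals, in seconds
    frequency_hz : frequency shared by both signals
    """
    cycles = delay_s * frequency_hz   # whole plus fractional cycles elapsed
    fraction = cycles % 1.0           # only the fractional part affects phase
    return fraction * 360.0


print(phase_shift_degrees(1.375, 100.0))   # the example from the slide
print(phase_shift_degrees(0.0025, 100.0))  # a quarter of a 10 ms period
```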
Wave Superposition
If we add these two 100 Hz signals together, we see that points of high pressure in (a) correspond with points of low pressure in (b). This happens whenever two otherwise identical signals are half a cycle (180 degrees) out of phase. Thus, they cancel each other out and the result we get is no pressure variations at all! The picture of this? (c)! When two waves interact with each other like this it is called interference.
[Figure: (a) + (b) = (c)]
Wave Superposition
In the previous example the two waves cancelled each other out, resulting in a decrease of pressure variations. This is called Destructive Interference. The following example shows a case where two interacting waves result in an increase in pressure variations. This is called Constructive Interference. We have the same two signals but this time they start 'in phase', i.e., at the same time.
[Figure: (a) + (b) = (c)]
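Both kinds of interference can be reproduced numerically by adding two sampled sine waves. The sketch below is illustrative only: the function name, the 100 Hz default and the sample count are assumptions. It returns the largest absolute value of the sum over one period, which is about 2 when the waves are in phase and about 0 when they are 180 degrees apart.

```python
import math


def peak_of_sum(phase_b_deg, frequency_hz=100.0, samples=1000):
    """Peak absolute value of (a) + (b) over one period.

    (a) is a unit sine wave; (b) is the same wave offset by phase_b_deg.
    """
    period = 1.0 / frequency_hz
    offset = math.radians(phase_b_deg)
    peak = 0.0
    for i in range(samples):
        t = i * period / samples
        a = math.sin(2.0 * math.pi * frequency_hz * t)
        b = math.sin(2.0 * math.pi * frequency_hz * t + offset)
        peak = max(peak, abs(a + b))
    return peak


print(peak_of_sum(0.0))    # constructive interference
print(peak_of_sum(180.0))  # destructive interference
```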
Wave Superposition
[Figure: constructive and destructive interference due to phase cancellation of two waves of equal amplitude, frequency and direction]
[Figure: standing wave due to two waves of equal amplitude, frequency and phase travelling in opposite directions]
Wave Superposition
If we take two sine waves which are very close in frequency we experience a phenomenon called beating. Beating can be described as a periodic variation in amplitude. These variations occur at a rate of f1 - f2 (where f1 is the higher frequency and f2 the lower). The frequency of the new signal will be the average of f1 and f2. For example, when 440 Hz and 442 Hz sinusoids are combined we hear 2 beats per second and a tone whose frequency is 441 Hz.
[Figure: 440 Hz and 442 Hz sinusoids and the resulting 441 Hz signal]
Beating sounds like this... [audio example: 170 Hz + 174 Hz]
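The beat rate and the averaged frequency both follow from a standard trigonometric identity. As a sketch, assuming two sinusoids of equal amplitude and zero phase (an assumption not stated on the slide):

```latex
\sin(2\pi f_1 t) + \sin(2\pi f_2 t)
  = 2 \cos\!\left( 2\pi \, \frac{f_1 - f_2}{2} \, t \right)
      \sin\!\left( 2\pi \, \frac{f_1 + f_2}{2} \, t \right)
```

The sine factor is a tone at the average frequency (f1 + f2)/2. The cosine factor is a slowly varying envelope at (f1 - f2)/2 Hz; since loudness depends on the size of the envelope and not its sign, it peaks twice per envelope cycle, giving f1 - f2 beats per second. With f1 = 442 Hz and f2 = 440 Hz this is a 441 Hz tone with 2 beats per second.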
Fundamentals, Harmonics & Partials
So far we have investigated interference between two sinusoids of equal frequency and between sinusoids of very close frequencies. Now we need to consider some other relationships. What happens when one sine wave is exactly half the frequency of the other?
In the diagram below we see two sinusoids with frequencies of 220 Hz and 440 Hz, both with a phase of 0 degrees, so for every one period of (a) we get two periods of (b). We can hear this relationship as well as see it. Listen to both frequencies: we hear the same note, but one is much 'higher' than the other. If both are played together we hear one tone, not two! We will investigate the reason for this later when we look at pitch.
In the following diagram this pattern is extended to five sinusoids. In this case the 2nd sinusoid is twice the frequency of the 1st, the 3rd is three times the 1st, the 4th is four times the 1st, and so on...
Fundamentals, Harmonics & Partials
[Figure: five sinusoids, labelled from bottom to top]
a) 100 Hz: 'fundamental frequency' or '1st partial'
b) 200 Hz: 1st harmonic or 2nd partial
c) 300 Hz: 2nd harmonic or 3rd partial
d) 400 Hz: 3rd harmonic or 4th partial
e) 500 Hz: 4th harmonic or 5th partial
Fundamentals, Harmonics & Partials
Visually, it is clear that there is a relationship between all these sine waves. Aurally, it is also clear: when all five sines are played together we perceive them as one tone. Numerically there is also a relationship: the frequencies are all multiples of the first frequency. There is an integer relationship between the frequencies of all these sinusoids (integers are whole numbers like 3, 5, -8, 120, etc.).
In a set of sinusoids like this the first frequency is referred to as the fundamental frequency. Subsequent sinusoids are called harmonics. The whole set together is called the Harmonic Series.
Another term often used in this context is a 'partial'. NOTE: 'partial' is a generic term for any component of a sound. For example, a sinusoid of 318 Hz in this set is a partial but NOT a harmonic, because 318 is not a whole-number multiple of 100. The fundamental frequency is also a partial. Thus, all harmonics are partials but not all partials are harmonics!
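The distinction between a partial that belongs to the harmonic series and one that does not comes down to a whole-number test. A minimal sketch follows; the function name and the small tolerance for floating-point rounding are assumptions made for illustration.

```python
def in_harmonic_series(frequency_hz, fundamental_hz, tolerance=1e-9):
    """True if frequency_hz is a whole-number multiple of fundamental_hz."""
    ratio = frequency_hz / fundamental_hz
    nearest_whole = round(ratio)
    return nearest_whole >= 1 and abs(ratio - nearest_whole) <= tolerance


# Every component is a partial; only some belong to the harmonic series.
for partial_hz in (100.0, 200.0, 318.0, 500.0):
    print(partial_hz, in_harmonic_series(partial_hz, 100.0))
```

Here 318 Hz fails the test against a 100 Hz fundamental, matching the example in the slide.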
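The Harmonic Series
Notice that there is also a relationship between the amplitudes of the partials comprising the harmonic series: taking the amplitude of the fundamental as 1, the amplitude of the nth partial is 1/n.
Amp f1 = 1 (the fundamental)
Amp f2 = Amp f1 / 2
Amp f3 = Amp f1 / 3
Amp f4 = Amp f1 / 4
Amp f5 = Amp f1 / 5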
Wave Superposition of Harmonically Related Signals
Notice that the signals being added are the 1st, 3rd and 5th harmonics of a series where f0 = 100 Hz, i.e., the odd harmonics. Now look at the resultant diagram to the right. Notice that the emergent 'shape' approaches a square. Adding more odd harmonics, each with an amplitude of 1/n as on the previous slide, brings the result ever closer to a square wave.
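The way odd harmonics build towards a square wave can be sketched numerically. In the code below the function name, the 100 Hz default and the choice of zero phase for every harmonic are assumptions made for illustration; each odd harmonic n is given an amplitude of 1/n. As more harmonics are added, the value in the middle of the positive half-cycle settles towards π/4, the flat top of the (scaled) square wave.

```python
import math


def odd_harmonic_sum(t, fundamental_hz=100.0, highest_harmonic=5):
    """Sum of the odd harmonics 1, 3, 5, ... each with amplitude 1/n."""
    total = 0.0
    for n in range(1, highest_harmonic + 1, 2):
        total += math.sin(2.0 * math.pi * n * fundamental_hz * t) / n
    return total


quarter_period = 0.0025  # middle of the positive half-cycle at 100 Hz
print(odd_harmonic_sum(quarter_period, highest_harmonic=5))
print(odd_harmonic_sum(quarter_period, highest_harmonic=199))
print(math.pi / 4.0)
```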
Time Domain vs Frequency Domain
[Figure: a sine wave drawn in three dimensions, with time, amplitude and frequency axes]
Time Domain vs Frequency Domain
Even though we cannot see sound, we have to remember that it has physical dimensions and exists in space. Thus, like any other physical body, it is 3-dimensional. The previous figure shows a sine wave in three dimensions. Along the horizontal x-axis we have time; on the vertical y-axis we have amplitude; on the diagonal z-axis we have frequency. Up until now we have viewed sine waves in the TIME DOMAIN only. So why use both views? What can we see in one view that we can't see in the other?
First let's look at the Time Domain again. Here we can see time, of course, and phase and amplitude. We can also see low frequencies if it's a single sine wave and the diagram is big enough, but only so far as the eye can count the repetitions of the pressure pattern. Also, if there is more than one sinusoid, identifying frequency is practically impossible. Obviously this is not a very sophisticated or accurate way of identifying frequency! This is why we need to be able to switch views between one axis and the other. (If you think of architectural drawings of a building in plan and elevation, it might make it easier to conceptualize this switching between views.)
Time Domain vs Frequency Domain
Now, if we look at the sinusoid in the FREQUENCY DOMAIN, what can we see? Obviously we can see frequency. We can also see amplitude. However, we cannot see time or phase at all. Look at the following diagrams to see the same signal represented in the time domain and frequency domain.
[Figure: the same signal in the time domain (amplitude against time) and in the frequency domain (amplitude against frequency, 20 Hz to 20 kHz). The signal consists of a 440 Hz tone at an amplitude of 0.87 with a phase shift of 260 degrees, and a 220 Hz tone at an amplitude of 0.4 with a phase shift of 90 degrees.]
Fourier Transform, I
• A physical process can be described either in the time domain by a function of time t, h(t), or in the frequency domain as a function of frequency f, H(f).
• h(t) and H(f) are two different representations of the same process.
• One goes back and forth between these two representations by means of the Fourier transform: [equations on slide]
• Dirac delta function: [equation on slide]
• Using angular frequency ω = 2πf: [equations on slide]
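The transform pair itself appears only as an image on the original slide. As a reference sketch, one common convention is written out below. The sign of the exponent and the placement of the 1/2π factor are assumptions, since textbooks differ on both and the slide's own choice is not recoverable from the text.

```latex
% Assumed convention: negative exponent in the forward transform.
H(f) = \int_{-\infty}^{\infty} h(t)\, e^{-2\pi i f t}\, dt ,
\qquad
h(t) = \int_{-\infty}^{\infty} H(f)\, e^{2\pi i f t}\, df

% Dirac delta function as an integral over all frequencies
\delta(t) = \int_{-\infty}^{\infty} e^{2\pi i f t}\, df

% The same pair written with angular frequency \omega = 2\pi f
H(\omega) = \int_{-\infty}^{\infty} h(t)\, e^{-i \omega t}\, dt ,
\qquad
h(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} H(\omega)\, e^{i \omega t}\, d\omega
```

Under the opposite sign convention the roles of the two exponents are swapped; the content of the three bullet points is unaffected.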
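Examples, I
• FT of the Dirac delta function δ(t): [equation on slide]
• FT of cos(ω0 t): [equation on slide]
[Figure: δ(t) against t and its transform, a constant 1 for all ω; cos(ω0 t) against t and its transform, spikes at +ω0 and -ω0]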
Examples, II
• FT of exp(2πi f0 t) = exp(iω0 t): [equation on slide]
[Figure: real and imaginary parts of exp(iω0 t) against t, and its transform F{exp(iω0 t)}, a single spike at ω = ω0]
Properties
• Correspondence between symmetries in the two domains: [table on slide]
• Scaling and shifting: [equations on slide]
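Scaling
[Figure: a short pulse, a medium-length pulse and a long pulse in time t, each shown with its transform in ω; the shorter the pulse, the broader its transform]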
Properties, II
• With two functions h(t) and g(t), and their FTs H(f) and G(f), the convolution, g*h, is defined by [equation on slide]
• Convolution theorem: the FT of the convolution is the product of the individual FTs.
• The correlation, Corr(g,h): [equation on slide]
• Correlation theorem (for two real functions, g and h): [equation on slide]
• Autocorrelation, Wiener-Khinchin theorem: [equation on slide]
• Parseval's theorem: [equation on slide]
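The formulas named in these bullets appear only as images on the original slide. As a reference sketch, their standard textbook forms are given below. The integration variable τ, the superscript * for the complex conjugate and the double arrow for "is a Fourier transform pair with" are notational assumptions.

```latex
% Convolution and the convolution theorem
(g * h)(t) = \int_{-\infty}^{\infty} g(\tau)\, h(t - \tau)\, d\tau ,
\qquad
g * h \;\Longleftrightarrow\; G(f)\, H(f)

% Correlation and the correlation theorem (g and h real)
\mathrm{Corr}(g, h)(t) = \int_{-\infty}^{\infty} g(\tau + t)\, h(\tau)\, d\tau ,
\qquad
\mathrm{Corr}(g, h) \;\Longleftrightarrow\; G(f)\, H^{*}(f)

% Autocorrelation (Wiener-Khinchin theorem)
\mathrm{Corr}(g, g) \;\Longleftrightarrow\; |G(f)|^{2}

% Parseval's theorem: total power is the same in both domains
\int_{-\infty}^{\infty} |h(t)|^{2}\, dt = \int_{-\infty}^{\infty} |H(f)|^{2}\, df
```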
Sampling theorem, I
• Suppose function h(t) is sampled at evenly spaced intervals Δ in time: h_n = h(nΔ), n = ..., -2, -1, 0, 1, 2, ...
• 1/Δ: sampling rate
• For any sampling interval Δ, there is a special frequency fc, called the Nyquist frequency, given by fc = 1/(2Δ)
• e.g.: critical sampling of a sine wave at the Nyquist frequency is two sample points per cycle.
• A function is "bandwidth limited" if its Fourier transform is 0 outside of a finite interval [-L, L].
• Sampling Theorem: if a continuous function h(t), sampled at an interval Δ, is bandwidth limited to frequencies smaller than fc, i.e., H(f) = 0 for all |f| > fc, then the function h(t) is completely determined by its samples h_n.
Sampling theorem, II
• For bandwidth limited signals, such as music in a concert hall, the sampling theorem tells us that the entire information content of the signal can be recorded by sampling at a rate 1/Δ equal to twice the maximum frequency passed by the amplifier.
• For a function that is not bandwidth limited to less than the Nyquist critical frequency, any frequency component that lies outside the frequency range -fc < f < fc is spuriously moved into that range (aliasing).
• Demo applet
• http://www.cs.brown.edu/exploratories/freeSoftware/repository/edu/brown/cs/exploratories/applets/nyquist/nyquist_limit_java_plugin.html
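Aliasing can be illustrated with a short calculation. The sketch below is not from the slides: the function names are assumptions, and the 44,100 samples-per-second rate in the usage lines is only an illustrative value. It computes the Nyquist frequency for a sampling interval and the apparent frequency at which a real sinusoid shows up after sampling.

```python
def nyquist_frequency(sampling_interval_s):
    """Nyquist critical frequency fc = 1 / (2 * sampling interval)."""
    return 1.0 / (2.0 * sampling_interval_s)


def aliased_frequency(frequency_hz, sampling_interval_s):
    """Apparent frequency (between 0 and fc) of a sampled real sinusoid."""
    rate = 1.0 / sampling_interval_s
    folded = frequency_hz % rate          # sampling cannot tell f from f + rate
    return min(folded, rate - folded)     # fold the upper half back below fc


interval = 1.0 / 44100.0  # illustrative sampling interval
print(nyquist_frequency(interval))
print(aliased_frequency(1000.0, interval))   # below fc: unchanged
print(aliased_frequency(30000.0, interval))  # above fc: moved into range
```

A 30 kHz component sampled at this rate reappears at about 14.1 kHz, which is why signals are filtered to below fc before sampling.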
Discrete Fourier transform, I
• Suppose we have N consecutive sampled values h_k = h(t_k), t_k = kΔ, k = 0, 1, ..., N-1, where Δ is the sampling interval, and assume N is even.
• With N numbers of input, we can produce no more than N independent numbers of output. Therefore, we seek estimates only at the discrete frequencies f_n = n/(NΔ), n = -N/2, ..., N/2.
• Then the discrete Fourier transform (DFT) is [equation on slide].
• The DFT maps N complex numbers (the h_k's) into N complex numbers (the H_n's).
Discrete Fourier transform, II
• It's periodic in n, with period N; H_(-n) = H_(N-n), n = 1, 2, ...
• With this convention, one lets the n in H_n vary from 0 to N-1. Then n and k vary over exactly the same range.
• With this convention:
• zero frequency: n = 0
• positive frequencies, 0 < f < fc: 1 ≤ n ≤ N/2-1
• negative frequencies, -fc < f < 0: N/2+1 ≤ n ≤ N-1
• the value n = N/2 corresponds to both f = fc and f = -fc
• The DFT has symmetry properties almost exactly the same as the continuous Fourier transform.
Discrete Fourier transform, III
• The discrete inverse Fourier transform is [equation on slide]
• Proof: [on slide]
• Parseval's theorem: [equation on slide]
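The forward and inverse discrete transforms can be sketched directly from their definitions as sums over N points. In the code below the function names and the `sign` parameter are assumptions made for illustration; the sign of the exponent in particular is a convention that varies between textbooks, and the slide's own choice is not recoverable. The inverse uses the opposite sign and divides by N, so the round trip returns the original samples whichever sign is chosen.

```python
import cmath


def dft(samples, sign=-1):
    """Naive O(N^2) discrete Fourier transform of a list of numbers."""
    n_points = len(samples)
    return [
        sum(samples[k] * cmath.exp(sign * 2j * cmath.pi * k * n / n_points)
            for k in range(n_points))
        for n in range(n_points)
    ]


def inverse_dft(spectrum, sign=-1):
    """Inverse of dft(): opposite exponent sign, divided by N."""
    n_points = len(spectrum)
    return [
        sum(spectrum[n] * cmath.exp(-sign * 2j * cmath.pi * k * n / n_points)
            for n in range(n_points)) / n_points
        for k in range(n_points)
    ]


h = [1.0, 2.0, 0.0, -1.0]
H = dft(h)
recovered = inverse_dft(H)

# Discrete Parseval check: sum |h_k|^2 equals (1/N) * sum |H_n|^2.
time_power = sum(abs(x) ** 2 for x in h)
freq_power = sum(abs(X) ** 2 for X in H) / len(h)
print(recovered)
print(time_power, freq_power)
```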
Loudness
What is loudness? How can it be measured? There are two ways of doing so:
• Measuring the sound pressure level (SPL) of a wave. That is, measuring how much pressure the vibrations of an object exert on the air around it.
• Measuring the sound intensity level (SIL). That is, measuring how much energy a wave carries through the air.
So, loudness is the perceptual quality corresponding to changes in SPL or SIL.
However, loudness is relative: to describe the loudness of one sound we need to reference some other sound. For example, late at night you may turn your music down. After some minutes it seems perfectly loud, but if you listen to it the following morning at the same 'volume' it probably seems very quiet. So, the loudness of one sound is relative to some other sound. This is summarised in the Weber-Fechner Law, which states that
"the increase in intensity needed to produce a given increase in perceived loudness is proportional to the pre-existing intensity."
This is similar to the judgement of weight or of the brightness of light.
Sound Intensity
Sound intensity = rate of energy radiation per unit area, in watts per metre squared (a watt is one joule per second).
Imagine a spherical sound source radiating the same amount of energy at the same rate in all directions at the same time. That energy is received or 'picked up' by some surface, say our eardrum or a microphone.
[Figure: a source radiating through surfaces A and B, each of 1 metre squared, at different distances; intensity falls off with distance squared]
• The reception of energy is proportional to the receiving surface area, i.e., the bigger the surface area, the more energy will be picked up.
• Intensity falls off with distance squared, i.e., the greater the distance between source and receptor, the lower the intensity will be.
• The smallest sound intensity detectable by the ear is 10^-12 W/m^2; the largest is 10^0 W/m^2 (i.e. 1 W/m^2).
• These two extremes are called the threshold of hearing and the threshold of pain, respectively.
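The fall-off with distance squared can be made concrete for the idealised source described above. This is a minimal sketch under stated assumptions: a source radiating a total power equally in all directions with no reflections or absorption, so that the power is spread over a sphere of area 4πr². The function name and the 100 W figure are illustrative, not taken from the slides.

```python
import math


def intensity_w_per_m2(source_power_w, distance_m):
    """Intensity at a distance from an idealised spherical source.

    The radiated power is spread evenly over a sphere of radius distance_m.
    """
    sphere_area = 4.0 * math.pi * distance_m ** 2
    return source_power_w / sphere_area


near = intensity_w_per_m2(100.0, 1.0)
far = intensity_w_per_m2(100.0, 2.0)
print(near, far, far / near)  # doubling the distance quarters the intensity
```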
Sound Pressure Level
When a body vibrates with more energy, its displacement from equilibrium is greater. So too, then, is the displacement of air molecules. The result is cycles of densely and sparsely packed molecules. The figure below shows a representation of two sines at the same frequency with different amplitudes. We can see that when the amplitude is higher the molecules are more densely packed together and thus create higher pressure. This is what 'sound pressure level' refers to. So, higher pressure means more densely packed molecules, which means more 'impact' on the ear. (Think of the difference between being hit by a sponge and being hit by a stone: stone = denser material = more pressure = more impact.)
Loudness - Not a Linear Scale!
[Figure: loudness scale running from the threshold of hearing to the threshold of pain]
Threshold of hearing: sound pressure 2*10^-5 N/m^2, sound intensity 10^-12 W/m^2
Threshold of pain: sound pressure 2*10^1 N/m^2, sound intensity 10^0 W/m^2
The problem with trying to measure loudness is that it is not linear. That means a doubling in intensity or in pressure does not necessarily correspond to a perceived doubling in loudness! Instead, perceived loudness increases roughly logarithmically with pressure or intensity.
[Figure: loudness plotted against SPL or SIL]
Relationship between SPL and SIL?
Sound intensity is proportional to the square of the pressure amplitude. In other words, as the amplitude doubles, the sound intensity is quadrupled. This relationship requires a bit of a mathematical detour, but here it is sufficient to simply remember that when measuring Sound Pressure Level we use the equation:
SPL = 20 log10(P1/P0) (in decibels)
...and when measuring Sound Intensity Level we use the equation:
SIL = 10 log10(I1/I0) (in decibels)
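The two decibel formulas can be tried out with the threshold values quoted in these slides. In the sketch below the function and constant names are assumptions, and so is the choice of the threshold of hearing (2*10^-5 N/m^2 and 10^-12 W/m^2) as the reference values P0 and I0. With those references both formulas place the threshold of pain at about 120 dB, which shows why the pressure formula uses a factor of 20 and the intensity formula a factor of 10.

```python
import math

REFERENCE_PRESSURE = 2e-5    # N/m^2, threshold of hearing (assumed as P0)
REFERENCE_INTENSITY = 1e-12  # W/m^2, threshold of hearing (assumed as I0)


def spl_db(pressure, reference=REFERENCE_PRESSURE):
    """Sound pressure level in decibels: 20 * log10(P1 / P0)."""
    return 20.0 * math.log10(pressure / reference)


def sil_db(intensity, reference=REFERENCE_INTENSITY):
    """Sound intensity level in decibels: 10 * log10(I1 / I0)."""
    return 10.0 * math.log10(intensity / reference)


print(spl_db(2e1))   # threshold of pain, from pressure
print(sil_db(1.0))   # threshold of pain, from intensity
print(spl_db(4e-5) - spl_db(2e-5))  # doubling pressure adds about 6 dB
print(sil_db(2e-12) - sil_db(1e-12))  # doubling intensity adds about 3 dB
```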
The Decibel Scale
So, by converting the linear pressure and intensity scales to logarithmic (decibel) scales, we now have a scale that follows perceived loudness much more closely. We will see shortly that the relationship between pitch and frequency is also non-linear; logarithmic scales will reappear there too!
[Figure: perceived loudness is non-linear when plotted against SPL or SIL, and linear when plotted against log (SPL or SIL)]
Other ways of measuring Loudness?
So far we've seen that there are two ways of measuring loudness: SIL and SPL. However, SPL is the most commonly used scale for loudness measurement. For example, SPL is used to measure the level of noise on a building site, or in a concert hall in the centre of a busy city.
Musicians use the following dynamic markings to indicate loudness:
ppp - quiet as possible
pp - very quiet
p - quiet
f - loud
ff - very loud
fff - loud as possible
crescendo - gradually getting louder
diminuendo - gradually getting quieter
Loudness Perception
Now that we've found a way to quantify loudness, what can we tell about our experience of it? Is our perception of loudness the same under all circumstances for all sounds? No! In the following slides we will see that loudness is dependent on:
• Frequency (S)
• Presence of other sounds (S)
• Duration (T)
• Adaptation (T)
These factors can be divided into Spectral and Temporal characteristics of loudness, indicated by 'S' and 'T' above.
Frequency: Fletcher Munson Curves
The most comprehensive study of loudness perception at different frequencies is shown in the Fletcher Munson Curves. These curves demonstrate the relationship between sound pressure and frequency and the resulting loudness we perceive.
Fletcher Munson Curves
• Each line is called an equal loudness contour (ELC). The reference point for the Fletcher Munson Curves is a 1000 Hz tone at 40 dB. The 'question' posed by the graph is "by what amount must the SPL be increased for a tone of frequency x before it sounds as loud as a 1000 Hz tone at 40 dB?"
• For example, we can see from the graph that a frequency of 20 Hz at an SPL of ~80 dB will sound as loud as a frequency of 100 Hz at ~35 dB. Staying on the same contour, 1000 Hz will sound as loud at 20 dB, etc.
Important points to note about ELCs:
• The smallest amplitude required to match the 1000 Hz reference tone is at ~3000 - 5000 Hz. This indicates that we are more sensitive to frequencies in this range. (This makes sense considering that the range of the human voice roughly corresponds to this range.)
• Higher ELCs, relating to higher amplitudes, are flatter than lower-level ELCs. So, at higher amplitudes audible frequencies are more similar in loudness than at lower amplitudes.
• At low listening levels there is a fall-off of bass frequencies, i.e., amplitude smooths out at higher frequencies.
Fast Fourier Transform (FFT)
• The DFT appears to be an O(N^2) process.
• Danielson and Lanczos: a DFT of length N can be rewritten as the sum of two DFTs of length N/2, one formed from the even-numbered points and one from the odd-numbered points.
• We can do the same reduction of H_k^0 to the transform of its N/4 even-numbered input data and N/4 odd-numbered data.
• For N = 2^R, we can continue applying the reduction until we subdivide the data into transforms of length 1.
• For every pattern of log2 N 0's and 1's, there is a one-point transform that is just one of the input numbers h_n.
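The even/odd splitting described above can be sketched as a short recursive function. This is an illustrative sketch, not an optimised FFT: the function names and the `sign` parameter (the exponent sign convention, which varies between textbooks) are assumptions. A transform of length N is built from the transforms of its even-numbered and odd-numbered points, down to transforms of length 1, which are just the input values themselves. A naive O(N^2) transform is included only to check the result.

```python
import cmath


def fft(samples, sign=-1):
    """Recursive radix-2 FFT; the length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(samples[0])]  # a one-point transform is the input itself
    even = fft(samples[0::2], sign)   # transform of even-numbered points
    odd = fft(samples[1::2], sign)    # transform of odd-numbered points
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result


def naive_dft(samples, sign=-1):
    """Direct O(N^2) evaluation of the same sum, used here as a reference."""
    n = len(samples)
    return [
        sum(samples[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
            for k in range(n))
        for m in range(n)
    ]


data = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, 3.0, -2.0]
fast = fft(data)
slow = naive_dft(data)
print(max(abs(a - b) for a, b in zip(fast, slow)))  # tiny rounding difference
```

Each level of recursion does O(N) work and there are log2 N levels, which is where the O(N log N) cost of the FFT comes from.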
Signal representation on time-frequency plane (synthetic /a/)