Explore the world of sound, from energy conversion to waveform analysis. Learn about sound frequency, intensity, and digitization techniques, including sampling rate and quantization, and understand digital audio file sizes and compression methods.
Digital Sound Dr. Kairui Chen GGC
What is sound? • Conversion of energy into vibrations in the air or some other elastic medium • Vocal cords • Tuning fork • Guitar strings
Waveforms • Sounds change over time (or sound is a function of time) • e.g. speech changes constantly • Frequency spectrum – relative amplitudes of the frequency components • alters as sound changes • Waveform is a plot of amplitude against time • Provides a graphical view of characteristics of a changing sound • Can identify syllables of speech, rhythm of music, quiet and loud passages, etc
Frequency of Sound Wave • Refers to the number of complete back-and-forth cycles of vibrational motion of the medium particles per unit of time • Unit for frequency: Hz (Hertz) • 1 Hz = 1 cycle/second
Frequency • Suppose the waveform shown spans 1 second and contains 2 complete cycles • Frequency = 2 Hz (i.e., 2 cycles/second)
Frequency • Suppose the waveform shown spans 1 second and contains 4 complete cycles • Frequency = 4 Hz (i.e., 4 cycles/second) • Higher frequency than the previous waveform
Frequency • Sound frequency is often referred to as the pitch of the sound • Higher pitch -> higher frequency • Lower pitch -> lower frequency • Range of human hearing: roughly 20 Hz–20 kHz; varies from person to person and falls as we age
Sound Intensity • Sound intensity: • an objective measurement • can be measured with auditory devices • in decibels (dB) • 0 dB: • Threshold of hearing • minimum sound pressure level at which humans can hear a sound at a given frequency • does NOT mean zero sound intensity • does NOT mean absence of sound wave • about 120 dB: • threshold of pain • sound intensity that is 10^12 times greater than at 0 dB
A Single Tone Sound: A Simple Sine Wave Waveform • A single tone produces a waveform that is a single sine wave
Adding Sound Waves • Most sound sources vibrate in complex ways, producing sounds with components at several different frequencies • Adding a second single tone (a second sine wave) to the first yields a more complex waveform and a more complex sound
Digitizing Sound • Suppose we want to digitize a sound wave
Effects of Sampling Rate • (Figure: the original waveform and its reconstructions at sampling rates of 10 Hz and 20 Hz)
Effects of Sampling Rate • Higher sampling rate: • The reconstructed wave looks closer to the original wave • More sample points, more data to record, and thus a larger file size (see the sketch below)
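A minimal sketch of this tradeoff, assuming NumPy is available; the tone frequency and helper name are illustrative, not from the slides:

```python
import numpy as np

def sample_sine(freq_hz, duration_s, sampling_rate_hz):
    """Sample a sine tone of the given frequency at a chosen sampling rate."""
    t = np.arange(0, duration_s, 1 / sampling_rate_hz)   # sample instants
    return t, np.sin(2 * np.pi * freq_hz * t)

# A 2 Hz tone sampled at 10 Hz vs 20 Hz: the higher rate traces the wave
# more closely, but there are twice as many samples to store.
for rate_hz in (10, 20):
    t, samples = sample_sine(2, 1.0, rate_hz)
    print(f"sampling rate {rate_hz} Hz -> {len(samples)} samples")
```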
Estimate Thresholds of Sampling Rate Based on Human Hearing • Let's consider these two factors: • Human hearing range • A rule called the Nyquist theorem
Nyquist Theorem • We must sample at least 2 points in each sound wave cycle to be able to reconstruct the sound wave satisfactorily • The sampling rate must therefore be at least twice the audio frequency (this minimum is called the Nyquist rate) • Audio with a higher pitch requires a higher sampling rate
Choosing Sampling Rate: Example 1 • If we consider the human ear's most sensitive range of frequency (2,000 Hz to 5,000 Hz), what is the lowest sampling rate that may be used while still satisfying the Nyquist theorem? (A worked answer follows the list.) • 11,025 Hz AM Radio Quality/Speech • 22,050 Hz Near FM Radio Quality (high-end multimedia) • 44,100 Hz CD Quality • 48,000 Hz DAT (digital audio tape) Quality • 96,000 Hz DVD-Audio Quality • 192,000 Hz DVD-Audio Quality
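One worked answer, assuming the question is limited to this most sensitive range: the Nyquist rate is 2 × 5,000 Hz = 10,000 Hz, so 11,025 Hz is the lowest rate in the list that satisfies it.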
Choosing Sampling Rate: Example 2 Given the human hearing range (20 Hz to 20,000 Hz) and the Nyquist Theorem, why do you think the sampling rate (44,100 Hz) for CD-quality audio is reasonable?
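One way to reason about it: twice the upper limit of human hearing is 2 × 20,000 Hz = 40,000 Hz, and 44,100 Hz exceeds this Nyquist rate, with some margin commonly attributed to the roll-off of the anti-aliasing filter applied before sampling.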
Sampling Rate Examples • 11,025 Hz AM Radio Quality/Speech • 22,050 Hz Near FM Radio Quality (high-end multimedia) • 44,100 Hz CD Quality • 48,000 Hz DAT (digital audio tape) Quality • 96,000 Hz DVD-Audio Quality • 192,000 Hz DVD-Audio Quality
Digitization: Quantization • Each of the discrete samples of amplitude values obtained from the sampling step is mapped and rounded to the nearest value on a scale of discrete levels • The number of levels in the scale is expressed as a bit depth: a power of 2 • More levels: more accurate mapping, better quality, but larger file size • Fewer levels: less accurate mapping, worse quality, but smaller file size • The bit depth of a digital audio file is also referred to as its resolution • For digital audio, higher resolution means higher bit depth • An 8-bit audio file allows 2^8 = 256 possible levels in the scale • only use if some distortion is acceptable, e.g. voice communication • CD-quality audio is 16-bit (i.e., 2^16 = 65,536 possible levels)
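A minimal sketch of the rounding step, assuming samples are already normalized to the range -1..1; the function name and values are illustrative:

```python
def quantize(samples, bit_depth):
    """Round each normalized sample (-1.0..1.0) to the nearest of 2**bit_depth levels."""
    levels = 2 ** bit_depth            # e.g. 256 levels for 8-bit audio
    step = 2.0 / (levels - 1)          # spacing between adjacent levels
    return [round((s + 1.0) / step) * step - 1.0 for s in samples]

samples = [0.0, 0.1234, -0.56789, 0.99]
print(quantize(samples, 16))   # very close to the originals
print(quantize(samples, 3))    # only 8 levels: visibly coarse rounding
```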
Digital Audio File Size • File size of uncompressed digital audio is determined by: • Sampling rate (r); • Bit depth (s); • Number of channels; • Mono: single channel; • Stereo: two channels; • Multiple channels; • Duration of the audio in seconds (t);
Let's estimate the file size of a 1-minute CD-quality audio file
1-minute CD Quality Audio • Sampling rate = 44,100 Hz (i.e., 44,100 samples/second) • Bit depth = 16 (i.e., 16 bits/sample) • Stereo (i.e., 2 channels: left and right channels)
File Size of 1-min CD-quality Audio • 1 minute = 60 seconds • Total number of samples = 60 seconds × 44,100 samples/second = 2,646,000 samples • Total number of bits required for these samples = 2,646,000 samples × 16 bits/sample = 42,336,000 bits (this is for one channel) • Total bits for two channels = 42,336,000 bits/channel × 2 channels = 84,672,000 bits
File Size of 1-min CD-quality Audio • 84,672,000 bits / (8 bits/byte) = 10,584,000 bytes • 10,584,000 bytes / (1024 bytes/KB) ≈ 10,336 KB • 10,336 KB / (1024 KB/MB) ≈ 10 MB
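The same calculation as a small sketch; the function name is hypothetical:

```python
def audio_file_size_bytes(duration_s, sampling_rate_hz, bit_depth, channels):
    """Uncompressed size = duration x sampling rate x bit depth x channels (bits), converted to bytes."""
    return duration_s * sampling_rate_hz * bit_depth * channels / 8

size = audio_file_size_bytes(60, 44_100, 16, 2)
print(f"{size:,.0f} bytes")                  # 10,584,000 bytes
print(f"{size / 1024 / 1024:.1f} MB")        # about 10.1 MB
print(audio_file_size_bytes(60, 44_100, 16, 1) / size)   # 0.5: mono halves the size
```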
General Strategies to Reduce Digital Media File Size • Reduce sampling rate • Reduce bit depth • Apply compression • For digital audio, these can also be options: • reducing the number of channels • shortening the length of the audio
Reduce Sampling Rate • Sacrifices the fidelity of the digitized audio • Need to weigh the quality against the file size • Need to consider: • human perception of the audio (e.g., how perceptible is the loss of quality at a lower sampling rate?) • how the audio is used • music: may need a higher sampling rate • short sound clips such as explosions and looping ambient background noise: may work well with a lower sampling rate
Effect of Sampling Rate on File Size File size = duration × sampling rate × bit depth × number of channels • File size is reduced in the same proportion as the reduction of the sampling rate • Example: Reducing the sampling rate from 44,100 Hz to 22,050 Hz will reduce the file size by half.
Effect of Bit Depth on File Size File size = duration × sampling rate × bit depth × number of channels • File size is reduced in the same proportion as the reduction of the bit depth • Example: Reducing the bit depth from 16-bit to 8-bit will reduce the file size by half.
Most Common Choices of Bit Depth • 8-bit • usually sufficient for speech • in general, too low for music • 16-bit • minimal bit depth for music • 24-bit • 32-bit
Effect of Number of Channels on File Size File size = duration × sampling rate × bit depth × number of channels • File size is reduced in the same proportion as the reduction of the number of channels • Example: Reducing the number of channels from 2 (stereo) to 1 (mono) will reduce the file size by half.
Digital Sound Editing • Software: Audacity (tutorial and hands-on activity will be given in class). • Timeline divided into tracks • Sound on each track displayed as a waveform • 'Scrub' over part of a track e.g. to find pauses • Cut and paste, drag and drop • May combine many tracks from different recordings (mix-down)
Effects and Filters • Noise gate: remove hiss from music • Low pass and high pass filters • Notch filter: removes a single narrow frequency band • De-esser: removes the sibilance • Click repairer: removes clicks from recordings taken from old vinyl records • Reverb: echo effect • etc
Audio File Compression • Lossless • Lossy • gets rid of some data, but human perception is taken into consideration so that the data removed causes the least noticeable distortion • e.g. MP3 (good compression rate while preserving the perceivably high quality of the audio)
Compression • In general, lossy methods are required because of the complex and unpredictable nature of audio data • A CD-quality, stereo, 3-minute song requires over 25 Mbytes • The data rate exceeds the bandwidth of a dial-up Internet connection • Differences in the way we perceive sound and images mean a different approach from image compression is needed
Companding • Non-linear quantization • Higher quantization levels spaced further apart than lower ones • Quiet sounds represented in greater detail than loud ones
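The slide doesn't tie companding to a specific scheme; as one concrete illustration, here is a minimal sketch of μ-law companding (a standard non-linear mapping used in telephony), assuming samples are normalized to the range -1..1:

```python
import math

MU = 255  # standard mu-law constant used for 8-bit telephony

def mu_law_compress(x):
    """Map a normalized sample (-1..1) non-linearly so quiet values keep more detail."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y):
    """Inverse mapping, applied after storage or transmission."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

# Quiet samples are spread across more of the output range than loud ones
for x in (0.01, 0.1, 0.5, 1.0):
    print(x, round(mu_law_compress(x), 3))
```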
ADPCM • DPCM (Differential Pulse Code Modulation) • Similar to video inter-frame compression • Compute a predicted value for the next sample, store the difference between prediction and actual value • ADPCM (Adaptive Differential Pulse Code Modulation) • Dynamically vary the step size used to store the quantized differences
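A toy sketch of the idea, not the IMA or Microsoft ADPCM formats: each sample is predicted from the previous decoded value, the difference is quantized to a 4-bit code, and the step size adapts to recent codes. All names and constants here are illustrative:

```python
def adpcm_encode(samples, initial_step=0.05):
    """Simplified sketch: predict each sample as the previous reconstructed value,
    then quantize the difference to a signed 4-bit multiple of an adaptive step."""
    codes, predicted, step = [], 0.0, initial_step
    for s in samples:
        code = max(-8, min(7, round((s - predicted) / step)))  # 4-bit signed code
        codes.append(code)
        predicted += code * step               # same value the decoder will rebuild
        step = max(step * (1.2 if abs(code) >= 6 else 0.9), 1e-4)  # adapt step size
    return codes

def adpcm_decode(codes, initial_step=0.05):
    out, predicted, step = [], 0.0, initial_step
    for code in codes:
        predicted += code * step
        out.append(predicted)
        step = max(step * (1.2 if abs(code) >= 6 else 0.9), 1e-4)
    return out

codes = adpcm_encode([0.0, 0.1, 0.25, 0.3, 0.2, -0.1])
print(adpcm_decode(codes))   # roughly tracks the input values
```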
Perceptually-Based Compression • Identify and discard data that doesn't affect the perception of the signal • Needs a psycho-acoustical model, since the ear and brain do not respond to sound waves in a simple way • Threshold of hearing – sounds too quiet to hear • Masking – sound obscured by some other sound
Compression Algorithm • Split signal into bands of frequencies using filters • Commonly use 32 bands • Compute masking level for each band, based on its average value and a psycho-acoustical model • i.e. approximate masking curve by a single value for each band • Discard signal if it is below masking level • Otherwise quantize using the minimum number of bits that will mask quantization noise
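A very rough sketch of the band-splitting-and-discarding step, assuming NumPy and a 1-D signal array; it uses equal-width FFT bands and a flat per-band threshold in place of a real psycho-acoustic model, so it only illustrates the shape of the algorithm:

```python
import numpy as np

def perceptual_prune(signal, n_bands=32, mask_fraction=0.05):
    """Split the spectrum into equal-width bands, estimate one crude 'masking level'
    per band, and zero out components below it (illustrative only)."""
    spectrum = np.fft.rfft(signal)
    band_size = max(1, len(spectrum) // n_bands)
    for start in range(0, len(spectrum), band_size):
        band = spectrum[start:start + band_size]          # view into the spectrum
        mask_level = mask_fraction * np.abs(band).mean()   # crude per-band threshold
        band[np.abs(band) < mask_level] = 0                # discard "masked" components
    return np.fft.irfft(spectrum, n=len(signal))
```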
MP3 • MPEG Audio, Layer 3 • Three layers of audio compression in MPEG-1 (MPEG-2 audio essentially identical) • From Layer 1 to Layer 3, the encoding process increases in complexity and the data rate for the same quality decreases • e.g. the same quality at 192 kbps with Layer 1, 128 kbps with Layer 2, 64 kbps with Layer 3 • About 10:1 compression ratio at high quality
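As a rough check on the 10:1 figure: uncompressed CD-quality stereo audio runs at 44,100 samples/s × 16 bits × 2 channels = 1,411,200 bits/s (about 1,411 kbps), so an MP3 at 128 kbps corresponds to roughly an 11:1 reduction.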
AAC • Advanced Audio Coding • Defined in MPEG-2 standard, extended and incorporated into MPEG-4 • Not backward compatible with earlier standards • Higher compression ratios and lower bit rates than MP3 • Subjectively better quality than MP3 at the same bit rate
Audio Formats • Platform-specific file formats • AIFF (mac), WAV (windows), AU (unix) • Multimedia formats used as 'container formats' for sound compressed with different codecs • QuickTime, Windows Media, RealAudio • MP3 has its own file format, but MP3 data can be included as audio tracks in QuickTime movies and SWFs
MIDI • Musical Instrument Digital Interface • Instructions about how to produce music, which can be interpreted by suitable hardware and/or software • cf. vector graphics as drawing instructions • Standard protocol for communicating between electronic instruments (synthesizers, samplers, drum machines) • Allows instruments to be controlled by hardware or software sequencers
MIDI and Computers • MIDI interface allows computer to send MIDI data to instruments • Store MIDI sequences in files, exchange them between computers, incorporate into multimedia • Computer can synthesize sounds on a sound card, or play back samples from disk in response to MIDI instructions • Computer becomes primitive musical instrument (quality of sound inferior to dedicated instruments)
MIDI Messages • Instructions that control some aspect of the performance of an instrument • Status byte – indicates type of message • 2 data bytes – values of parameters • e.g. Note On + note number (0..127) + key velocity • Running status – omit status byte if it is the same as preceding one
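For example, a Note On message can be assembled as three bytes; the helper below is illustrative, not part of any particular MIDI library:

```python
def note_on(channel, note, velocity):
    """Build a 3-byte MIDI Note On message: status byte, then two data bytes."""
    assert 0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127
    status = 0x90 | channel            # 0x9n = Note On for channel n
    return bytes([status, note, velocity])

print(note_on(0, 60, 100).hex())       # '903c64' -> middle C at velocity 100
```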
Choosing an Audio File Type Determined by the intended use • File size limitation • Intended audience • Whether as a source file
File Size Limitations • Is your audio used on the Web? • file types that offer high compression • streaming audio file types