Explore the world of sound, from energy conversion to waveform analysis. Learn about sound frequency, intensity, and digitization techniques, including sampling rate and quantization, and understand digital audio file sizes and compression methods.
Digital Sound Dr. Kairui Chen GGC
What is sound? • Conversion of energy into vibrations in the air or some other elastic medium • Vocal cords • Tuning fork • Guitar strings
Waveforms • Sounds change over time (or sound is a function of time) • e.g. speech changes constantly • Frequency spectrum – relative amplitudes of the frequency components • alters as sound changes • Waveform is a plot of amplitude against time • Provides a graphical view of characteristics of a changing sound • Can identify syllables of speech, rhythm of music, quiet and loud passages, etc
Frequency of Sound Wave • Refers to the number of complete back-and-forth cycles of vibrational motion of the medium particles per unit of time • Unit for frequency: Hz (Hertz) • 1 Hz = 1 cycle/second
Frequency • Suppose the waveform shown spans 1 second and contains 2 complete cycles • Frequency = 2 Hz (i.e., 2 cycles/second)
Frequency • Suppose the waveform shown spans 1 second and contains 4 complete cycles • Frequency = 4 Hz (i.e., 4 cycles/second) • Higher frequency than the previous waveform
Frequency • Sound frequency is often referred to as the pitch of the sound • Higher pitch -> higher frequency • Lower pitch -> lower frequency • Range of human hearing: roughly 20 Hz–20 kHz; varies from person to person and falls as we age
Sound Intensity • Sound intensity: • an objective measurement • can be measured with auditory devices • in decibels (dB) • 0 dB: • Threshold of hearing • minimum sound pressure level at which humans can hear a sound at a given frequency • does NOT mean zero sound intensity • does NOT mean absence of sound wave • about 120 dB: • threshold of pain • sound intensity that is 10^12 times greater than at 0 dB
A Single Tone Sound: A Simple Sine Wave Waveform • A single tone produces a waveform that is a single sine wave
Adding Sound Waves • Most sound sources vibrate in complex ways, producing sounds with components at several different frequencies • Adding a second single tone (a second sine wave) to the first yields a more complex waveform and a more complex sound
Digitizing Sound • Suppose we want to digitize a sound wave
Effects of Sampling Rate • (Figure: the original waveform and its reconstructions at sampling rates of 10 Hz and 20 Hz)
Effects of Sampling Rate • Higher sampling rate: • The reconstructed wave looks closer to the original wave • More sample points, more data to record, and thus a larger file size (see the sketch below)
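A minimal sketch of this tradeoff, assuming NumPy is available; the tone frequency and helper name are illustrative, not from the slides:

```python
import numpy as np

def sample_sine(freq_hz, duration_s, sampling_rate_hz):
    """Sample a sine tone of the given frequency at a chosen sampling rate."""
    t = np.arange(0, duration_s, 1 / sampling_rate_hz)   # sample instants
    return t, np.sin(2 * np.pi * freq_hz * t)

# A 2 Hz tone sampled at 10 Hz vs 20 Hz: the higher rate traces the wave
# more closely, but there are twice as many samples to store.
for rate_hz in (10, 20):
    t, samples = sample_sine(2, 1.0, rate_hz)
    print(f"sampling rate {rate_hz} Hz -> {len(samples)} samples")
```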
Estimate Thresholds of Sampling Rate Based on Human Hearing • Let's consider these two factors: • Human hearing range • A rule called the Nyquist theorem
Nyquist Theorem • We must sample at least 2 points in each sound wave cycle to be able to reconstruct the sound wave satisfactorily • The sampling rate must therefore be at least twice the audio frequency (this minimum is called the Nyquist rate) • Audio with a higher pitch requires a higher sampling rate
Choosing Sampling Rate: Example 1 • If we consider the human ear's most sensitive range of frequency (2,000 Hz to 5,000 Hz), what is the lowest sampling rate that may be used while still satisfying the Nyquist theorem? (A worked answer follows the list.) • 11,025 Hz AM Radio Quality/Speech • 22,050 Hz Near FM Radio Quality (high-end multimedia) • 44,100 Hz CD Quality • 48,000 Hz DAT (digital audio tape) Quality • 96,000 Hz DVD-Audio Quality • 192,000 Hz DVD-Audio Quality
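One worked answer, assuming the question is limited to this most sensitive range: the Nyquist rate is 2 × 5,000 Hz = 10,000 Hz, so 11,025 Hz is the lowest rate in the list that satisfies it.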
Choosing Sampling Rate: Example 2 Given the human hearing range (20 Hz to 20,000 Hz) and the Nyquist Theorem, why do you think the sampling rate (44,100 Hz) for CD-quality audio is reasonable?
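One way to reason about it: twice the upper limit of human hearing is 2 × 20,000 Hz = 40,000 Hz, and 44,100 Hz exceeds this Nyquist rate, with some margin commonly attributed to the roll-off of the anti-aliasing filter applied before sampling.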
Sampling Rate Examples • 11,025 Hz AM Radio Quality/Speech • 22,050 Hz Near FM Radio Quality (high-end multimedia) • 44,100 Hz CD Quality • 48,000 Hz DAT (digital audio tape) Quality • 96,000 Hz DVD-Audio Quality • 192,000 Hz DVD-Audio Quality
Digitization: Quantization • Each of the discrete samples of amplitude values obtained from the sampling step is mapped and rounded to the nearest value on a scale of discrete levels • The number of levels in the scale is expressed as a bit depth: a power of 2 • More levels: more accurate mapping, better quality, but larger file size • Fewer levels: less accurate mapping, worse quality, but smaller file size • The bit depth of a digital audio file is also referred to as its resolution • For digital audio, higher resolution means higher bit depth • An 8-bit audio file allows 2^8 = 256 possible levels in the scale • only use if some distortion is acceptable, e.g. voice communication • CD-quality audio is 16-bit (i.e., 2^16 = 65,536 possible levels)
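A minimal sketch of the rounding step, assuming samples are already normalized to the range -1..1; the function name and values are illustrative:

```python
def quantize(samples, bit_depth):
    """Round each normalized sample (-1.0..1.0) to the nearest of 2**bit_depth levels."""
    levels = 2 ** bit_depth            # e.g. 256 levels for 8-bit audio
    step = 2.0 / (levels - 1)          # spacing between adjacent levels
    return [round((s + 1.0) / step) * step - 1.0 for s in samples]

samples = [0.0, 0.1234, -0.56789, 0.99]
print(quantize(samples, 16))   # very close to the originals
print(quantize(samples, 3))    # only 8 levels: visibly coarse rounding
```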
Digital Audio File Size • File size of uncompressed digital audio is determined by: • Sampling rate (r); • Bit depth (s); • Number of channels; • Mono: single channel; • Stereo: two channels; • Multiple channels; • Duration of the audio in seconds (t);
Let's estimate the file size of a 1-minute CD-quality audio file
1-minute CD Quality Audio • Sampling rate = 44,100 Hz (i.e., 44,100 samples/second) • Bit depth = 16 (i.e., 16 bits/sample) • Stereo (i.e., 2 channels: left and right channels)
File Size of 1-min CD-quality Audio • 1 minute = 60 seconds • Total number of samples = 60 seconds × 44,100 samples/second = 2,646,000 samples • Total number of bits required for these samples = 2,646,000 samples × 16 bits/sample = 42,336,000 bits (this is for one channel) • Total bits for two channels = 42,336,000 bits/channel × 2 channels = 84,672,000 bits
File Size of 1-min CD-quality Audio • 84,672,000 bits / (8 bits/byte) = 10,584,000 bytes • 10,584,000 bytes / (1024 bytes/KB) ≈ 10,336 KB • 10,336 KB / (1024 KB/MB) ≈ 10 MB
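The same calculation as a small sketch; the function name is hypothetical:

```python
def audio_file_size_bytes(duration_s, sampling_rate_hz, bit_depth, channels):
    """Uncompressed size = duration x sampling rate x bit depth x channels (bits), converted to bytes."""
    return duration_s * sampling_rate_hz * bit_depth * channels / 8

size = audio_file_size_bytes(60, 44_100, 16, 2)
print(f"{size:,.0f} bytes")                  # 10,584,000 bytes
print(f"{size / 1024 / 1024:.1f} MB")        # about 10.1 MB
print(audio_file_size_bytes(60, 44_100, 16, 1) / size)   # 0.5: mono halves the size
```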
General Strategies to Reduce Digital Media File Size • Reduce sampling rate • Reduce bit depth • Apply compression • For digital audio, these can also be options: • reducing the number of channels • shortening the length of the audio
Reduce Sampling Rate • Sacrifices the fidelity of the digitized audio • Need to weigh the quality against the file size • Need to consider: • human perception of the audio (e.g., how perceptible is the loss of quality at a lower sampling rate?) • how the audio is used • music: may need a higher sampling rate • short sound clips such as explosions and looping ambient background noise: may work well with a lower sampling rate
Effect of Sampling Rate on File Size File size = duration × sampling rate × bit depth × number of channels • File size is reduced in the same proportion as the reduction of the sampling rate • Example: Reducing the sampling rate from 44,100 Hz to 22,050 Hz will reduce the file size by half.
Effect of Bit Depth on File Size File size = duration × sampling rate × bit depth × number of channels • File size is reduced in the same proportion as the reduction of the bit depth • Example: Reducing the bit depth from 16-bit to 8-bit will reduce the file size by half.
Most Common Choices of Bit Depth • 8-bit • usually sufficient for speech • in general, too low for music • 16-bit • minimal bit depth for music • 24-bit • 32-bit
Effect of Number of Channels on File Size File size = duration × sampling rate × bit depth × number of channels • File size is reduced in the same proportion as the reduction of the number of channels • Example: Reducing the number of channels from 2 (stereo) to 1 (mono) will reduce the file size by half.
Digital Sound Editing • Software: Audacity (tutorial and hands-on activity will be given in class). • Timeline divided into tracks • Sound on each track displayed as a waveform • 'Scrub' over part of a track e.g. to find pauses • Cut and paste, drag and drop • May combine many tracks from different recordings (mix-down)
Effects and Filters • Noise gate: remove hiss from music • Low pass and high pass filters • Notch filter: removes a single narrow frequency band • De-esser: removes the sibilance • Click repairer: removes clicks from recordings taken from old vinyl records • Reverb: echo effect • etc
Audio File Compression • Lossless • Lossy • gets rid of some data, but human perception is taken into consideration so that the data removed causes the least noticeable distortion • e.g. MP3 (good compression rate while preserving the perceivably high quality of the audio)
Compression • In general, lossy methods are required because of the complex and unpredictable nature of audio data • A CD-quality, stereo, 3-minute song requires over 25 Mbytes • The data rate exceeds the bandwidth of a dial-up Internet connection • Differences in the way we perceive sound and images mean a different approach from image compression is needed
Companding • Non-linear quantization • Higher quantization levels spaced further apart than lower ones • Quiet sounds represented in greater detail than loud ones
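The slide doesn't tie companding to a specific scheme; as one concrete illustration, here is a minimal sketch of μ-law companding (a standard non-linear mapping used in telephony), assuming samples are normalized to the range -1..1:

```python
import math

MU = 255  # standard mu-law constant used for 8-bit telephony

def mu_law_compress(x):
    """Map a normalized sample (-1..1) non-linearly so quiet values keep more detail."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y):
    """Inverse mapping, applied after storage or transmission."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

# Quiet samples are spread across more of the output range than loud ones
for x in (0.01, 0.1, 0.5, 1.0):
    print(x, round(mu_law_compress(x), 3))
```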
ADPCM • DPCM (Differential Pulse Code Modulation) • Similar to video inter-frame compression • Compute a predicted value for the next sample, store the difference between prediction and actual value • ADPCM (Adaptive Differential Pulse Code Modulation) • Dynamically vary the step size used to store the quantized differences
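A toy sketch of the idea, not the IMA or Microsoft ADPCM formats: each sample is predicted from the previous decoded value, the difference is quantized to a 4-bit code, and the step size adapts to recent codes. All names and constants here are illustrative:

```python
def adpcm_encode(samples, initial_step=0.05):
    """Simplified sketch: predict each sample as the previous reconstructed value,
    then quantize the difference to a signed 4-bit multiple of an adaptive step."""
    codes, predicted, step = [], 0.0, initial_step
    for s in samples:
        code = max(-8, min(7, round((s - predicted) / step)))  # 4-bit signed code
        codes.append(code)
        predicted += code * step               # same value the decoder will rebuild
        step = max(step * (1.2 if abs(code) >= 6 else 0.9), 1e-4)  # adapt step size
    return codes

def adpcm_decode(codes, initial_step=0.05):
    out, predicted, step = [], 0.0, initial_step
    for code in codes:
        predicted += code * step
        out.append(predicted)
        step = max(step * (1.2 if abs(code) >= 6 else 0.9), 1e-4)
    return out

codes = adpcm_encode([0.0, 0.1, 0.25, 0.3, 0.2, -0.1])
print(adpcm_decode(codes))   # roughly tracks the input values
```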
Perceptually-Based Compression • Identify and discard data that doesn't affect the perception of the signal • Needs a psycho-acoustical model, since the ear and brain do not respond to sound waves in a simple way • Threshold of hearing – sounds too quiet to hear • Masking – sound obscured by some other sound
Compression Algorithm • Split signal into bands of frequencies using filters • Commonly use 32 bands • Compute masking level for each band, based on its average value and a psycho-acoustical model • i.e. approximate masking curve by a single value for each band • Discard signal if it is below masking level • Otherwise quantize using the minimum number of bits that will mask quantization noise
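A very rough sketch of the band-splitting-and-discarding step, assuming NumPy and a 1-D signal array; it uses equal-width FFT bands and a flat per-band threshold in place of a real psycho-acoustic model, so it only illustrates the shape of the algorithm:

```python
import numpy as np

def perceptual_prune(signal, n_bands=32, mask_fraction=0.05):
    """Split the spectrum into equal-width bands, estimate one crude 'masking level'
    per band, and zero out components below it (illustrative only)."""
    spectrum = np.fft.rfft(signal)
    band_size = max(1, len(spectrum) // n_bands)
    for start in range(0, len(spectrum), band_size):
        band = spectrum[start:start + band_size]          # view into the spectrum
        mask_level = mask_fraction * np.abs(band).mean()   # crude per-band threshold
        band[np.abs(band) < mask_level] = 0                # discard "masked" components
    return np.fft.irfft(spectrum, n=len(signal))
```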
MP3 • MPEG Audio, Layer 3 • Three layers of audio compression in MPEG-1 (MPEG-2 audio essentially identical) • From Layer 1 to Layer 3, the encoding process increases in complexity and the data rate for the same quality decreases • e.g. the same quality at 192 kbps with Layer 1, 128 kbps with Layer 2, 64 kbps with Layer 3 • About 10:1 compression ratio at high quality
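As a rough check on the 10:1 figure: uncompressed CD-quality stereo audio runs at 44,100 samples/s × 16 bits × 2 channels = 1,411,200 bits/s (about 1,411 kbps), so an MP3 at 128 kbps corresponds to roughly an 11:1 reduction.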
AAC • Advanced Audio Coding • Defined in MPEG-2 standard, extended and incorporated into MPEG-4 • Not backward compatible with earlier standards • Higher compression ratios and lower bit rates than MP3 • Subjectively better quality than MP3 at the same bit rate
Audio Formats • Platform-specific file formats • AIFF (mac), WAV (windows), AU (unix) • Multimedia formats used as 'container formats' for sound compressed with different codecs • QuickTime, Windows Media, RealAudio • MP3 has its own file format, but MP3 data can be included as audio tracks in QuickTime movies and SWFs
MIDI • Musical Instrument Digital Interface • Instructions about how to produce music, which can be interpreted by suitable hardware and/or software • cf. vector graphics as drawing instructions • Standard protocol for communicating between electronic instruments (synthesizers, samplers, drum machines) • Allows instruments to be controlled by hardware or software sequencers
MIDI and Computers • MIDI interface allows computer to send MIDI data to instruments • Store MIDI sequences in files, exchange them between computers, incorporate into multimedia • Computer can synthesize sounds on a sound card, or play back samples from disk in response to MIDI instructions • Computer becomes primitive musical instrument (quality of sound inferior to dedicated instruments)
MIDI Messages • Instructions that control some aspect of the performance of an instrument • Status byte – indicates type of message • 2 data bytes – values of parameters • e.g. Note On + note number (0..127) + key velocity • Running status – omit status byte if it is the same as preceding one
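For example, a Note On message can be assembled as three bytes; the helper below is illustrative, not part of any particular MIDI library:

```python
def note_on(channel, note, velocity):
    """Build a 3-byte MIDI Note On message: status byte, then two data bytes."""
    assert 0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127
    status = 0x90 | channel            # 0x9n = Note On for channel n
    return bytes([status, note, velocity])

print(note_on(0, 60, 100).hex())       # '903c64' -> middle C at velocity 100
```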
Choosing an Audio File Type Determined by the intended use • File size limitation • Intended audience • Whether as a source file
File Size Limitations • Is your audio used on the Web? • file types that offer high compression • streaming audio file types