Music Hath Charms to Soothe the Savage Beast
Introduction to Sound Processing
CSE5060 -- Multimedia on the Web -- Lecture11
Some Sources Used
• Richard E Berg: Physics 102: PHYSICS OF MUSIC, University of Maryland
• Robert Jourdain (1997). Music, the Brain and Ecstasy. Quill.
• Various bits of Wikipedia
• Dolby Sound
Sound Is Analog
• So there’s infinite variation
• Like a rock thrown into a pond, there are waves:
  • Amplitude: how high the waves are -- Loudness
  • Frequency: how many waves per second -- Pitch
• Loudness is measured in decibels (dB)
  • This is a log scale, so 20 dB is ten times the sound intensity of 10 dB, 30 dB is ten times the intensity of 20 dB, and so forth.
  • You can distinguish from just over 0 dB to 120 dB
  • 37 dB: quiet office (no air-conditioning)
  • 59 dB: conversation
  • 76 dB: loud factory
  • 110 dB: really loud night club or rave
  • 140 dB: threshold of pain (well, for some)
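The log scale above can be turned into arithmetic. This is a minimal sketch, assuming the usual definition that a difference of 10 dB is a factor of ten in sound intensity; the function names are made up for illustration.

```python
import math


def intensity_ratio(db_a: float, db_b: float) -> float:
    """Return how many times more intense a sound at db_a is than one at db_b."""
    return 10 ** ((db_a - db_b) / 10)


def db_difference(ratio: float) -> float:
    """Return the decibel difference that corresponds to an intensity ratio."""
    return 10 * math.log10(ratio)


# 20 dB against 10 dB: a factor of ten, as on the slide.
print(intensity_ratio(20, 10))
# Night club (110 dB) against conversation (59 dB).
print(intensity_ratio(110, 59))
```

Each extra 10 dB multiplies the intensity by ten again, so the night club is not "twice as loud" as conversation but more than a hundred thousand times as intense.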
Sound Is Analog, 2
• Pitch is measured in Hz (Hertz, cycles per second) and kHz.
• You can hear from roughly 20 Hz up to 15 - 20 kHz (the upper limit is age dependent)
  • Lowest note on piano 27.5 Hz
  • Highest note on piano 4,186 Hz (4.186 kHz)
  • Lowest vocal sound 80 Hz
  • Highest vocal sound 800 Hz
  • The A above middle C 440 Hz (used to be lower!)
• But just a sine wave at these frequencies sounds sterile: it lacks the overtones, the harmonics, produced by all natural sources of sound.
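The piano figures above can be checked with a short calculation. This sketch assumes equal temperament (each semitone multiplies the frequency by the twelfth root of two) and an 88-key piano where key 49 is the A above middle C; neither assumption is stated on the slide, and the function name is illustrative.

```python
def piano_key_frequency(key: int, a4_hz: float = 440.0) -> float:
    """Frequency in Hz of piano key 1..88, assuming equal temperament."""
    if not 1 <= key <= 88:
        raise ValueError("a standard piano has keys 1 to 88")
    # Key 49 is the A above middle C; every 12 keys doubles the frequency.
    return a4_hz * 2 ** ((key - 49) / 12)


print(piano_key_frequency(1))    # lowest note
print(piano_key_frequency(49))   # the A above middle C
print(piano_key_frequency(88))   # highest note
```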
This is a sine wave, which may represent a “pure” (and so artificial) sound
Its frequency (pitch) is the number of crests per second, measured in hertz; the closer together the crests, the higher the pitch
Its amplitude (loudness) is the height of the crests, heard as loudness and measured in decibels
With frequent sampling we can capture both frequency and amplitude in a single series of numbers
Any sound can be reproduced by overlaying sine waves (Fourier synthesis)
Figure: two sine waves interacting
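To make "a single series of numbers" concrete, here is a small sketch that samples two overlaid sine waves. The frequencies, amplitudes and sample rate are arbitrary choices for illustration, not values from the lecture.

```python
import math


def sample_sines(components, sample_rate, n_samples):
    """Sample a sum of sine waves.

    components is a list of (frequency_hz, amplitude) pairs.
    Returns n_samples values taken 1/sample_rate seconds apart.
    """
    samples = []
    for n in range(n_samples):
        t = n / sample_rate  # time of this sample in seconds
        value = sum(a * math.sin(2 * math.pi * f * t) for f, a in components)
        samples.append(value)
    return samples


# A 440 Hz tone plus a quieter tone one octave higher, sampled 8000 times a second.
wave = sample_sines([(440.0, 1.0), (880.0, 0.5)], sample_rate=8000, n_samples=16)
print([round(v, 3) for v in wave])
```

The one list of numbers carries both waves at once: the size of the values reflects amplitude, and how quickly they rise and fall reflects frequency.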
Notice How Complicated the Vibrations Are
Sound Is Analog, 3
• An instrument vibrating produces lots of sounds above the fundamental tone.
• These overtones sit at whole-number multiples of the fundamental; some of them are whole octaves above it
  • Octave = double the frequency
• To get realistic sound we have to pick up at least the 4th harmonic, 4x the frequency of the fundamental.
• So we have to pick up to about 16 kHz for, say, a realistic flute sound (where the highest fundamental is just under 4 kHz)
• More is better until, say, 20 kHz, where even a 5 year old’s hearing cuts out.
• So how frequently do we need to sample to get “realistic” sounds?
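A quick sketch of the harmonic arithmetic, using the 440 Hz A as an example fundamental. The helper names are illustrative; the only fact relied on is that the k-th harmonic is k times the fundamental and an octave is a doubling.

```python
def harmonics(fundamental_hz: float, count: int) -> list:
    """Return the first `count` harmonics; harmonic 1 is the fundamental itself."""
    return [fundamental_hz * k for k in range(1, count + 1)]


def is_whole_octaves_above(k: int) -> bool:
    """True when harmonic k lies a whole number of octaves above the fundamental.

    That is the case exactly when k is a power of two (1, 2, 4, 8, ...).
    """
    return k > 0 and (k & (k - 1)) == 0


for k, freq in enumerate(harmonics(440.0, 8), start=1):
    label = "octave" if is_whole_octaves_above(k) else ""
    print(k, freq, label)
```

Capturing up to the 4th harmonic means the recording must preserve frequencies up to four times the highest fundamental played.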
Poor Sampling Rate: Graphic
Nyquist-Shannon Sampling Theorem
• You have to sample at a rate of at least 2x the highest frequency you want to catch.
• Remember, we are sampling the volume (loudness) of a sound consisting of lots of superimposed fundamental and harmonic frequencies.
• So CD audio uses 44,100 samples per second, each a 2-byte (16-bit) value between 0 and 65,535 (64k levels)
• Sampling Demo Program
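What goes wrong below the 2x limit is aliasing: a tone that is too high for the sample rate shows up as a lower tone. The sketch below computes that apparent frequency for a pure tone, assuming ideal sampling with no anti-aliasing filter; the function names are illustrative.

```python
def nyquist_frequency(sample_rate: float) -> float:
    """Highest frequency that a given sample rate can represent."""
    return sample_rate / 2


def aliased_frequency(tone_hz: float, sample_rate: float) -> float:
    """Apparent frequency of a pure tone after sampling at sample_rate."""
    folded = tone_hz % sample_rate
    # Frequencies above half the sample rate fold back towards zero.
    return min(folded, sample_rate - folded)


print(nyquist_frequency(44_100))          # limit for CD audio
print(aliased_frequency(20_000, 44_100))  # below the limit: unchanged
print(aliased_frequency(30_000, 44_100))  # above the limit: folds down
```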
MP3
• MPEG-1 Audio Layer 3 sound compression -- intended for movies on CD and DVD
• 90+% compression possible
  • A typical song (50 MB) goes to 5 MB
• Is a lossy compression, so the quality goes down
• No encryption in any way
• No “watermark” (watermark = a secret pattern of bits somewhere which indicates the source of the copy)
• Much music publisher panic with the sudden popularity of the format.
• Much more music publisher panic with the iPod and friends
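The 50 MB figure is about what uncompressed CD-quality audio costs. The sketch below works it out for a 5-minute stereo song, which is an assumed length chosen for illustration, and then computes the saving for the slide's 50 MB to 5 MB example.

```python
def pcm_bytes(seconds: float, sample_rate: int = 44_100,
              bytes_per_sample: int = 2, channels: int = 2) -> float:
    """Size in bytes of uncompressed audio of the given length."""
    return seconds * sample_rate * bytes_per_sample * channels


def compression_percent(original_size: float, compressed_size: float) -> float:
    """Percentage of the original size that compression removed."""
    return 100 * (1 - compressed_size / original_size)


song = pcm_bytes(5 * 60)          # 5 minutes of 16-bit stereo at 44.1 kHz
print(song / 1_000_000, "MB uncompressed")
print(compression_percent(50, 5), "% saved going from 50 MB to 5 MB")
```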
MP3 Sound Isn’t Very Good…
• Having failed at my attempt to demonstrate how bad MP3 is in a tute
• I will now (fail again?) demonstrate the loss of quality in MP3 yet again…
• Roll it, monks…
Sound Into Bits (ADC)
• Something that always confuses me:
  • The 16 bits used (0 - 64k) record the amplitude (loudness)
  • The differences between successive 16-bit samples contain the frequency (pitch)
• Remember, a high wave also has a trough between peaks.
• So how often do we have to sample to get enough samples?
  • At 2x the highest frequency we want to catch.
  • And we want to catch frequencies up to 20 kHz, so we have to sample at least 40,000 times a second.
• So your CDs contain music sampled at just over 44,000 times a second.
• So the digital signal bandwidth must be 16 x 44,000 = 704,000 bits per second, or 88 kilobytes per second (kB/s). With stereo sound, we have to have two such streams, so a 1x CD-ROM bus goes at 176 kB/s, which we already knew!
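A rough sketch of the two steps on this slide: turning an amplitude into a 16-bit number, and working out the resulting data rate. It follows the slide in using an unsigned 0 to 65,535 range (real CD audio stores signed values), and it uses the exact CD rate of 44,100 rather than the rounded 44,000, so the totals come out slightly higher than the slide's. All names are illustrative.

```python
def quantize_16bit(amplitude: float) -> int:
    """Map an amplitude in [-1.0, 1.0] to an unsigned 16-bit code (0..65535)."""
    clipped = max(-1.0, min(1.0, amplitude))  # anything louder is clipped
    return round((clipped + 1.0) / 2.0 * 65535)


def pcm_bit_rate(sample_rate: int, bits_per_sample: int, channels: int) -> int:
    """Bits per second needed for uncompressed audio."""
    return sample_rate * bits_per_sample * channels


print(quantize_16bit(-1.0), quantize_16bit(1.0))

mono_bits = pcm_bit_rate(44_100, 16, 1)
stereo_bits = pcm_bit_rate(44_100, 16, 2)
print(mono_bits, "bits/s =", mono_bits // 8, "bytes/s (mono)")
print(stereo_bits, "bits/s =", stereo_bits // 8, "bytes/s (stereo)")
```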
Bits Into Sound (DAC)
• Amplifiers and speakers are analog devices, so
  • The CD player does DAC and passes the results as an analog signal to your stereo system.
  • It does the same if you listen to music off your CD-ROM drive.
• But where does your sound card/chip do the conversion?
  • Hmmm… Later is (far, far) better, because there’s lots of electrical interference inside your PC. Digital isn’t affected by this, but analog is.
  • So the perfect system would be all digital inside the computer and have its DAC inside the speakers
So We Have DAC for Both Sound and Video
• At the same time
• By two independent sets of hardware and software
• Working on two independent files
• How can we guarantee synchronisation???
• This is a problem with Flash!
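The slide leaves the synchronisation question open. One common approach, offered here only as an assumption and not as what Flash or any particular player does, is to treat the audio as the master clock and pick the video frame from the number of audio samples already played. The 25 frames per second default is also an assumed value.

```python
def frame_for_audio_position(samples_played: int, sample_rate: int = 44_100,
                             frames_per_second: int = 25) -> int:
    """Index of the video frame that should be on screen right now.

    Integer arithmetic avoids drift from accumulated rounding errors.
    """
    return (samples_played * frames_per_second) // sample_rate


print(frame_for_audio_position(0))        # start of playback
print(frame_for_audio_position(44_100))   # after one second of audio
print(frame_for_audio_position(88_200))   # after two seconds of audio
```

If the video falls behind, the player skips to the frame this function returns; if it runs ahead, it holds the current frame until the audio catches up.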
MIDI & General MIDI
• MIDI = Musical Instrument Digital Interface
• MIDI is to sampling exactly what vector is to raster graphics
• A language for describing sounds
  • The notes
  • The instruments, each of which has a number
    • 128 instruments
    • Plus drum kit
  • The note characteristics
    • attack
    • decay
    • sustain
    • release
  • 2+ ways of making those notes
    • FM synthesis
    • Wavetable
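To show how compact this "language" is, the sketch below builds the raw bytes of three standard MIDI channel messages: program change (pick an instrument), note on and note off. The status bytes 0xC0, 0x90 and 0x80 come from the MIDI 1.0 specification; the helper names are made up, and channels and programs are numbered from 0 here.

```python
def _check(value: int, upper: int, name: str) -> int:
    """Validate that a MIDI field fits its allowed range."""
    if not 0 <= value <= upper:
        raise ValueError(f"{name} must be between 0 and {upper}")
    return value


def program_change(channel: int, program: int) -> bytes:
    """Select one of the 128 instruments on a channel."""
    return bytes([0xC0 | _check(channel, 15, "channel"),
                  _check(program, 127, "program")])


def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Start a note; velocity is how hard the key was struck."""
    return bytes([0x90 | _check(channel, 15, "channel"),
                  _check(note, 127, "note"),
                  _check(velocity, 127, "velocity")])


def note_off(channel: int, note: int, velocity: int = 0) -> bytes:
    """Stop a note."""
    return bytes([0x80 | _check(channel, 15, "channel"),
                  _check(note, 127, "note"),
                  _check(velocity, 127, "velocity")])


# Instrument 0, then middle C (note 60) on and off: eight bytes in total.
song = program_change(0, 0) + note_on(0, 60, 100) + note_off(0, 60)
print(song.hex())
```

Compare those eight bytes with the tens of thousands of samples needed to store the same note as recorded sound.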
A MIDI Studio
An Audioacoustic Editing Lab
The Parts of a MIDI Note
• From the MIDI Manufacturers Homepage
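The figure itself is not reproduced in this transcript. Assuming it shows the attack, decay, sustain and release stages listed earlier, a simple straight-line envelope can be sketched as follows; the stage lengths in the example are arbitrary and the function names are illustrative.

```python
def held_level(t: float, attack: float, decay: float, sustain_level: float) -> float:
    """Envelope level (0..1) at t seconds after note-on, while the key is held."""
    if t < attack:
        return t / attack                      # attack: rise from 0 to full level
    if t < attack + decay:
        fallen = (t - attack) / decay
        return 1.0 - (1.0 - sustain_level) * fallen  # decay: fall to sustain level
    return sustain_level                       # sustain: hold steady


def adsr_level(t: float, attack: float, decay: float, sustain_level: float,
               note_length: float, release: float) -> float:
    """Envelope level at time t for a note that is released at note_length."""
    if t < 0:
        return 0.0
    if t < note_length:
        return held_level(t, attack, decay, sustain_level)
    elapsed = t - note_length
    if elapsed >= release:
        return 0.0
    start = held_level(note_length, attack, decay, sustain_level)
    return start * (1.0 - elapsed / release)   # release: fade to silence


for t in (0.0, 0.05, 0.1, 0.5, 1.1, 2.0):
    print(t, round(adsr_level(t, 0.1, 0.1, 0.5, 1.0, 0.2), 3))
```

Multiplying each output sample by this level shapes a steady tone into something that starts, holds and dies away like a played note.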
Making MIDI
• FM Synthesis
  • Sterile sine waves
  • What gave computer music a bad name
• Wavetable Sound Generation
  • The music gives the number of the instrument
  • Samples of the sound of that instrument are stored in ROM/RAM on the sound card
  • The samples are processed to give a far better illusion of the sound of the instrument
  • The more samples, the better, so 64 MB of samples in ROM are better than 512 KB.
  • Wavetables may also be loaded from CD-ROMs
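The two approaches can be contrasted in a few lines. This is a much simplified sketch: the FM function uses the textbook form of one sine wave modulating another, and the wavetable function replays a stored single-cycle table at the requested pitch with linear interpolation. Real sound cards store longer multi-sample recordings and do more processing; the tiny four-entry table and all names here are illustrative.

```python
import math


def fm_sample(t: float, carrier_hz: float, modulator_hz: float, index: float) -> float:
    """One sample of simple FM synthesis: a sine whose phase is wobbled by another sine."""
    return math.sin(2 * math.pi * carrier_hz * t
                    + index * math.sin(2 * math.pi * modulator_hz * t))


def wavetable_note(table, note_hz: float, sample_rate: int, n_samples: int) -> list:
    """Replay one stored cycle of an instrument at note_hz.

    `table` holds a single cycle of the waveform; stepping through it faster
    or slower changes the pitch.
    """
    size = len(table)
    step = note_hz * size / sample_rate   # table entries to advance per output sample
    phase = 0.0
    out = []
    for _ in range(n_samples):
        i = int(phase) % size
        frac = phase - int(phase)
        nxt = table[(i + 1) % size]
        out.append(table[i] * (1 - frac) + nxt * frac)  # linear interpolation
        phase = (phase + step) % size
    return out


table = [0.0, 1.0, 0.0, -1.0]             # a very coarse stored waveform
print(wavetable_note(table, 1.0, 8, 8))
print(fm_sample(0.001, 440.0, 110.0, 2.0))
```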
MIDI Quality
• Well, as always there’s the trade off:
  • Much smaller file size
  • Always somewhat less quality
  • Infinitely cheaper to create -- only one muso necessary
  • May require significant CPU processing
Channels, Voices and Streams
• A channel drives a speaker:
  • 2 channels for standard stereo
  • 4-5 channels for 3D sound (two may be faked)
  • 8+ channels for super sound in theatre movies
• A voice is an instrument, etc. on a channel
  • MIDI supports a large number of voices: 32, 64. This is polyphony
  • The voices are superimposed, in digital or analog form, and then sent to the speakers
  • Again, multiple voices may load down the CPU
• A stream is half voice and half channel
  • Lets you record a sound effect, a stream
  • When we need it, we superimpose it on top of the sound going to a channel
  • The sound card and/or CPU do the work
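Superimposing voices in digital form comes down to adding their samples. A minimal sketch, assuming each voice is a list of signed 16-bit samples of the same length and that overloads are simply clipped; real mixers usually scale the voices as well, and the function name is illustrative.

```python
def mix_voices(voices, limit: int = 32767) -> list:
    """Add several voices sample by sample, clipping to the signed 16-bit range."""
    mixed = []
    for frame in zip(*voices):          # one sample from every voice at the same instant
        total = sum(frame)
        mixed.append(max(-limit - 1, min(limit, total)))
    return mixed


flute = [100, 200, 300, 200]
drum = [50, -300, 0, 25]
print(mix_voices([flute, drum]))
```

The work grows with the number of voices, since every output sample needs one addition per voice, which is why heavy polyphony can load down the CPU.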
Channels, Voices and Streams, 2
• The higher the bandwidth into the sound card/chip
  • The more channels, voices and streams we can get at once
  • And the more processing work has to be done
  • So we either do more on-sound-card/chip processing or bog down the CPU
  • (Sound like the issues related to 3D accelerator cards?!)
• Evolution in sound cards/chips is rather slow
  • Most systems use a sound chip on the motherboard
  • But if you want to play games…
Games and Computer Sound
• Games are one of several factors driving the evolution of graphics boards
• Games are almost the only factor driving the evolution of sound cards
  • Who is sneaking up behind me? We need 3D.
  • What kind of sound does that alien make when exploded? We need lots of streams superimposed.
• 3D illusion
  • Uses 3 speakers (woofer + 2 satellites) and an algorithm to fox the ear by marginally delaying one stereo channel
  • Developed by NASA for space flight simulators
  • Can work well if you don’t move your head
  • With 5 speakers, esp. with 4 channels, can work very well indeed
  • Note: deep tones are non-directional, so we can use just one woofer
• As the musicians are never behind you, not necessary for music. Whoops, sorry Berlioz, Allegri, Tchaikovsky, etc.
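The "marginally delaying one stereo channel" trick can be sketched very simply: the ear that hears a sound first takes it to be the nearer one. The code below only shifts one channel by a fraction of a millisecond; commercial 3D audio systems do considerably more (level differences, filtering), and the delay value and function names are assumptions for illustration.

```python
def delay_channel(samples, delay_samples: int) -> list:
    """Shift a channel later in time by padding the front with silence."""
    if delay_samples < 0:
        raise ValueError("delay must not be negative")
    n = len(samples)
    d = min(delay_samples, n)
    return [0] * d + list(samples[:n - d])   # same length as the input


def place_to_the_left(mono, delay_seconds: float, sample_rate: int = 44_100):
    """Return (left, right) channels where the right ear hears the sound slightly late."""
    delay = round(delay_seconds * sample_rate)
    return list(mono), delay_channel(mono, delay)


left, right = place_to_the_left([10, 20, 30, 40, 50], 0.0005, sample_rate=4000)
print(left)
print(right)
```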
Games and Computer Sound, 2
• Competing 3D positional audio standards
  • A3D
    • From Aureal Semiconductor
    • On their widely used Vortex audio chips
  • Environmental Audio Extensions (EAX)
    • From Creative Labs (who brought us Sound Blaster)
  • DirectSound3D
    • From Microsoft
    • Part of the DirectX set of Windows APIs/extensions, including Direct3D
• Currently the first of these is the standard, but watch out for the rest.