Graphics 2 & Sound CSE5900 -- Intro to MM Computing -- Lecture 3
The First Rule of Computer Graphics is:
• Compress Last
• Take a picture or scan an image at the highest possible resolution.
• Work on that image at that resolution.
• Save the image in both its original and compressed formats.
• Contemporary scanners = 1,200 x 1,200 dots per inch; 32 bit (4 byte) colour, so for (say) a 7 x 5 inch photo we have
• 1,200 x 1,200 x 4 x 7 x 5 = 201,600,000 bytes
• Hmmm. We’re going to need LOTS of RAM. (And hard disk space.)
• (This is why digital cameras are still ‘low resolution’, even with, say, 4,000,000 pixels per image.)
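
The scanner arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch: the function name `uncompressed_bytes` is made up for illustration, and the 4 bytes per pixel used for the camera is an assumption (the slide only gives the pixel count).

```python
def uncompressed_bytes(width_in, height_in, dpi, bytes_per_pixel):
    """Raw size of a scan: pixels across x pixels down x bytes per pixel."""
    pixels_across = width_in * dpi
    pixels_down = height_in * dpi
    return pixels_across * pixels_down * bytes_per_pixel


# A 7 x 5 inch photo at 1,200 dpi with 32 bit (4 byte) colour.
scan_bytes = uncompressed_bytes(7, 5, 1200, 4)

# A 4,000,000 pixel camera image, assuming the same 4 bytes per pixel.
camera_bytes = 4_000_000 * 4

print(scan_bytes)                  # 201600000, roughly 200 million bytes
print(scan_bytes / camera_bytes)   # how many times bigger the scan is
```
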
Oh, Dear
• If we are working with 200 MB files, we have the following problems.
• We can’t actually see the detail without zooming in a lot.
• It will take a fair amount of time to get information on to and off of the hard disk.
• It will take time to get the image to and from:
  • Disk and RAM (video RAM or standard RAM)
  • RAM and CPU
  • CPU to video card
  • Video card to monitor
• Much of the contemporary evolution in computing tries to improve the bandwidth of these connections
Crucial Definitions
• Bandwidth
  • The rate at which data moves from point A to point B. The throughput of a connection, measured in either bits or bytes per second.
• Bus
  • The physical connection(s) between point A and point B
  • Busses come in two flavours
  • Serial - one signal wire, so 1 bit at a time.
    • COM ports
    • USB (universal serial bus)
  • Parallel - many signal wires (8, 16, 32 or 64 at present), many bits at the same time.
    • PCI bus
    • Northbridge or front side bus
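
To see why bus width matters for a 200 MB image, here is a rough sketch of transfer time on a serial versus a parallel bus. It is a simplified model: it assumes one bit per wire per clock tick and ignores protocol overhead. The clock rate and the function name `transfer_seconds` are hypothetical, chosen only for illustration.

```python
def transfer_seconds(size_bytes, wires, transfers_per_second):
    """Time to move size_bytes when each wire carries 1 bit per transfer."""
    bits = size_bytes * 8
    bits_per_second = wires * transfers_per_second
    return bits / bits_per_second


IMAGE_BYTES = 200_000_000   # the roughly 200 MB scan discussed earlier
CLOCK = 33_000_000          # hypothetical: 33 million transfers per second

serial_time = transfer_seconds(IMAGE_BYTES, wires=1, transfers_per_second=CLOCK)
parallel_time = transfer_seconds(IMAGE_BYTES, wires=32, transfers_per_second=CLOCK)

print(f"serial, 1 wire:     {serial_time:.1f} s")
print(f"parallel, 32 wires: {parallel_time:.1f} s")
```

At the same clock rate the 32-wire bus is 32 times faster. Real serial buses make up the difference by running at much higher clock rates.
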
Busses, Words, and Addresses
• We can speak of the capacity of four things in terms of the numbers we are discussing:
  • Bus capacity, in bits or wires
  • Word length: the number of bits a CPU deals with as a unit. Currently usually 64 bits, and moving towards 128.
  • Maximum address size: the highest location address (for RAM, hard disk, etc.) measured in bits.
  • Operating system: the size, in bits, of an instruction. Note that the word size has to be a multiple of the instruction size
In the Beginning (1983)
[Diagram: CPU, RAM, Keyboard, Floppy, Printer, Screen]
A Little Later (1985)
[Diagram: CPU, ALU - Floating point, RAM, Video Card, Video RAM, Screen, Keyboard, Floppy, Hard Disk]
Another View
[Diagram: CPU, RAM, Bus, IO Card, Video Card, Disk Controller, COM1, COM2, LPT1, LPT2, Mouse, Modem, Printer, Floppy, Hard Disk, Screen]
Windows 2 Arrives (1992)
[Diagram: CPU, ALU - Floating point, RAM, Video Card, Video RAM, Screen, Keyboard, Mouse, Floppy, Hard Disk]
Yesterday
[Diagram: CPU, Video Card Windows--2D with Video RAM, Video Card Games--3D with Video RAM, Screen]
Today
[Diagram: CPU, RAM, Video Card (Windows--2D, Games--3D), Video RAM, Screen]
Today - A Second Perspective
[Diagram: CPU, Cache Memory, North Bridge, RAM, Video Card, Video RAM, South Bridge, Disk Controller, Sound Card, Other PCI Cards]
Tomorrow
[Diagram: two CPUs, each with Cache Memory, North Bridge, RAM, Video Card (Rendering, Physics, etc.), Video RAM, South Bridge, Disk Controller, Sound Card, Other PCI Cards]
RAM to Screen
[Diagram: Video RAM (Frame Buffer), Screen Pixels]
• A pixel has a matching chunk of video RAM
The Conversion Step
• But RAM is digital and video is analog, so we need a digital to analog converter (DAC)
[Diagram: RAM (rgb), RAMDAC, Video (composite)]
Colour Information Storage
• 0 to 255 for red: 8 bits
• 0 to 255 for blue: 8 bits
• 0 to 255 for green: 8 bits
• 0 to 255 for alpha: 8 bits
• So 32 bits (4 bytes) per pixel
• 1,024 x 768 x 4 = 3,145,728 bytes per frame in video memory
• So 16 MB of video memory may be OK, where the image is “pre-cooked” (my jargon)
• Assuming there’s enough bandwidth for 24 frames/second
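
The per-pixel and per-frame numbers on this slide can be sketched as follows. The byte order inside the 32 bit word (alpha, red, green, blue from high to low) is an assumption for illustration, since real frame buffers use several layouts. The function names are made up.

```python
def pack_argb(red, green, blue, alpha):
    """Pack four 8 bit channels into one 32 bit pixel (assumed ARGB order)."""
    for value in (red, green, blue, alpha):
        if not 0 <= value <= 255:
            raise ValueError("each channel must fit in 8 bits (0 to 255)")
    return (alpha << 24) | (red << 16) | (green << 8) | blue


def frame_bytes(width, height, bytes_per_pixel=4):
    """Video memory needed for one full frame."""
    return width * height * bytes_per_pixel


frame = frame_bytes(1024, 768)                 # 3145728 bytes
bytes_per_second = frame * 24                  # bandwidth for 24 frames/second
frames_in_16mb = (16 * 1024 * 1024) // frame   # whole frames that fit in 16 MB

print(hex(pack_argb(255, 0, 0, 255)))          # opaque red
print(frame, bytes_per_second, frames_in_16mb)
```
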
Pre-Cooked versus Home Cooked
• Pre-Cooked
  • All of the images are prepared in advance, and are on disk or CD-ROM
  • The processing steps are:
    • Read from hard disk into CPU
    • Decompress compressed images
    • Build the frame in video RAM
    • Run through RAMDAC onto screen
  • The limits are:
    • The bandwidths, especially for video
    • The file sizes versus storage capacities
    • The complexity of decompression
  • Everything has to be completely prepared in advance. It’s all finished art
Pre-Cooked versus Home Cooked, 2
• Home Cooked
  • Where you can’t prepare all images in advance
  • First person perspective games, for example
    • Player can move in any direction, look up, down and around, explore
    • The display has to be created on the fly
    • Curse you, Doom, for starting it all!
• Pre-cooked is sometimes called “2D” and home cooked is sometimes called “3D”, which is completely misleading.
  • The difference is what’s done in advance or in real time
  • Traditional video cards could do interactive 2D (and pre-cooked 3D)
  • 3D accelerator cards can do interactive 3D image creation in real time.
Digression 1: APIs
• API = application programming interface
  • A large set of functions used to extend a standard programming language into a specialized area
  • These functions can be used on the fly as needed
• There are 2 standard 3D graphics APIs
  • OpenGL (originally from Silicon Graphics, now non-proprietary, used for Doom)
  • Direct3D (from Microsoft, part of DirectX)
• They allow work (by games programmers) at
  • Low level -- set up scene using graphic primitives
  • High level -- make small variations to a scene -- think of sprites and larger elements
Digression 2: Building a 3D Image
• We have to come up with a 2D screen rendition of a 3D thing which we’ve built on a 2D screen-based workspace
• First come the wire frame models of the object. This is all that’s really there, and the rest is just processing.
• With the pre-cooked, all the processing takes place in advance and the results (not the wire frames) are distributed.
• With home cooking, the wire frame models are distributed, together with textures and materials, and the processing takes place in real time in the 3D accelerator, usually using a game engine created using an API (Direct3D or OpenGL)
Digression 2: Building a 3D Image, 2
• There has to be a camera position from which we look at the wire frames.
• We have to get rid of things behind other things (hidden-surface removal) and things outside the field of view (clipping)
• We have to provide light sources, with colour mixes, intensities and directions, resulting in shadows, reflections, etc. (ray tracing, usually faked in games)
• We have to apply different textures (e.g., red gum) to the surfaces of the wire frames (texture mapping)
• We have to smooth everything out according to all the above (rendering)
• The 3D accelerator cards provide the tools to do this kind of stuff in real time
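
As a rough illustration of the first step, looking at a wire frame from a camera position, the sketch below projects the corners of a cube onto a 2D image plane using simple perspective division. It is not how any particular accelerator card or API does it. The camera sits on the z axis looking along +z, and `project`, `camera_z` and `focal` are names invented for this example.

```python
from itertools import combinations, product

# Wire frame of a cube: 8 corners, and the 12 edges joining corners
# that differ in exactly one coordinate.
VERTICES = list(product((-1.0, 1.0), repeat=3))
EDGES = [
    (a, b)
    for a, b in combinations(VERTICES, 2)
    if sum(p != q for p, q in zip(a, b)) == 1
]


def project(point, camera_z=-5.0, focal=1.0):
    """Perspective-project a 3D point onto the 2D image plane."""
    x, y, z = point
    depth = z - camera_z          # distance in front of the camera
    if depth <= 0:
        return None               # behind the camera: nothing to draw
    return (focal * x / depth, focal * y / depth)


# Each wire frame edge becomes a 2D line segment on the screen.
screen_lines = [(project(a), project(b)) for a, b in EDGES]

print(len(screen_lines))
print(project((1.0, 1.0, -1.0)))  # near corner
print(project((1.0, 1.0, 1.0)))   # far corner, drawn closer to the centre
```

Corners further from the camera land nearer the centre of the image, which is what gives the flat picture its sense of depth. Hidden-surface removal, lighting and texturing would all come after this step.
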
How Well Do They Work?
• Consider that each frame in an animated film takes a network of computers many minutes to render.
• Consider that the wire frames for animated films have hundreds of thousands or millions of polygons (the primitive used for processing)
• Consider that the audience for 3D accelerators is 16 year old boys who haven’t graduated from killing things (a low resolution occupation) to sex (a high resolution preoccupation)
• The cards are getting good at keeping their intended audience happy
• But they are miles from being able to do Riven in real time
Music Hath Charms to Soothe the Savage Beast
Introduction to Sound Processing
Sound Is Analog
• So there’s infinite variation
• Like a rock thrown into a pond, there are waves:
  • Amplitude: how high the waves are -- Loudness
  • Frequency: how many waves per second -- Pitch
• Loudness is measured in decibels (dB)
  • This is a log scale, so 20 dB is ten times the sound intensity of 10 dB, 30 dB is ten times the intensity of 20 dB, and so forth. (Perceived loudness roughly doubles with each 10 dB step.)
  • You can distinguish from just over 0 dB to 120 dB
    • 37 quiet office (no air-conditioning)
    • 59 conversation
    • 76 loud factory
    • 110 really loud night club or rave
    • 140 threshold of pain (well, for some)
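
The log scale can be made concrete with a short calculation: every 10 dB step multiplies the sound intensity by ten. A minimal sketch, using the example levels from the slide; the function name `intensity_ratio` is made up.

```python
def intensity_ratio(level_db, reference_db):
    """How many times more intense one sound is than another, given both in dB."""
    return 10 ** ((level_db - reference_db) / 10)


print(intensity_ratio(20, 10))    # one 10 dB step
print(intensity_ratio(30, 10))    # two 10 dB steps
print(intensity_ratio(59, 37))    # conversation versus a quiet office
print(intensity_ratio(110, 59))   # night club versus conversation
```
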
Sound Is Analog, 2
• Pitch is measured in Hz (Hertz, cycles per second) and kHz.
• You can distinguish between a few Hz and 15 - 20 kHz (this is age dependent)
  • Lowest note on piano: 27 Hz
  • Highest note on piano: 4,186 Hz (4.186 kHz)
  • Lowest vocal sound: 80 Hz
  • Highest vocal sound: 800 Hz
  • The A above middle C: 440 Hz (used to be lower!)
• But a sine wave just at these frequencies sounds sterile: it lacks the overtones, the harmonics, produced by all natural sources of sound. Think of the FM synthesis sound of a cheap sound card.
Sound Is Analog, 3
• An instrument vibrating produces lots of sounds above the fundamental tone.
• Many of these are various octaves above the fundamental
  • Octave = double the frequency
• To get realistic sound we have to pick up at least the 4th harmonic, 4x the frequency of the fundamental.
• So we have to pick up frequencies up to about 16 kHz for, say, a realistic flute sound (where the highest fundamental is just under 4 kHz)
• More is better until, say, 20 kHz, where a 5 year old’s hearing cuts out.
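
Harmonics are whole-number multiples of the fundamental, and a natural-sounding tone is roughly a sum of sine waves at those frequencies. The sketch below lists the first few harmonics of the 440 Hz A and adds them together. The relative amplitudes are arbitrary values chosen for illustration, not measurements of any real instrument.

```python
import math


def harmonics(fundamental_hz, count):
    """First `count` harmonics: 1x, 2x, 3x, ... the fundamental."""
    return [fundamental_hz * k for k in range(1, count + 1)]


def additive_sample(fundamental_hz, t, amplitudes):
    """Value at time t (seconds) of a tone built from weighted harmonics."""
    return sum(
        amp * math.sin(2 * math.pi * fundamental_hz * (k + 1) * t)
        for k, amp in enumerate(amplitudes)
    )


print(harmonics(440, 4))   # [440, 880, 1320, 1760]

# Assumed amplitudes: each harmonic quieter than the one before.
weights = [1.0, 0.5, 0.25, 0.125]
one_cycle = [additive_sample(440, n / 44100, weights) for n in range(100)]
print(max(one_cycle))
```

Of these, the 2nd and 4th harmonics are one and two octaves above the fundamental; the 3rd falls in between.
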
Sound Into Bits (ADC)
• Something that always confuses me:
  • The 16 bits used (0 - 64k) record the amplitude (loudness)
  • The differences between successive 16 bit samples contain the frequency (pitch)
  • Remember, a high wave also has a trough between peaks.
• So how often do we have to sample to get enough samples?
  • 2x the maximum frequency we want to catch.
  • And we want to catch frequencies up to 20 kHz, so we have to sample at least 40,000 times a second.
• So your CDs contain music sampled at just over 44,000 times a second (44,100, to be exact).
• So the digital signal bandwidth must be 16 x 44,000 = 704,000 bits per second, or about 88 kilobytes per second. With stereo sound, we have to have two such streams, so a 1x CD-ROM bus goes at about 176 kilobytes per second, which we already knew!
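
The sampling arithmetic can be sketched as below, using the exact CD figure of 44,100 samples per second rather than the rounded 44,000. Note that the results are in bytes per second, not bits. The function names are invented for this example.

```python
def min_sample_rate(max_frequency_hz):
    """Sample at least twice per cycle of the highest frequency we want."""
    return 2 * max_frequency_hz


def pcm_bytes_per_second(sample_rate, bits_per_sample, channels):
    """Data rate of uncompressed sampled sound."""
    return sample_rate * bits_per_sample * channels // 8


print(min_sample_rate(20_000))                 # 40000 samples per second
print(pcm_bytes_per_second(44_100, 16, 1))     # 88200 bytes/s, one channel
print(pcm_bytes_per_second(44_100, 16, 2))     # 176400 bytes/s, stereo
```
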
Bits Into Sound (DAC)
• Amplifiers and speakers are analog devices, so
  • The CD player does the DAC step and passes the results as an analog signal to your stereo system.
  • It does the same if you listen to music off your CD-ROM drive.
• But where does your sound card do the conversion?
  • Hmmm... Later is (far, far) better, because there’s lots of electrical interference inside your PC. Digital isn’t affected by this, but analog is.
  • So the perfect system would be all digital inside the computer and have its DAC inside the speakers
  • USB promises to make this real, and Microsoft makes speakers which work this way.
MIDI & General MIDI
• MIDI = Musical Instrument Digital Interface
• MIDI is to sampling exactly what vector is to raster graphics
• A language for describing sounds
  • The notes
  • The instruments, each of which has a number
    • 128 instruments
    • Plus drum kit
  • The note characteristics
    • attack
    • decay
    • sustain
    • release
  • 2+ ways of making those notes
    • FM synthesis
    • Wavetable
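
The four note characteristics are usually drawn as a loudness envelope over time. The sketch below is one simple way a synthesizer might shape a note with straight-line segments; it is not taken from the MIDI specification. The default timings and the name `adsr_level` are assumptions, and the key is assumed to be held at least as long as attack plus decay.

```python
def adsr_level(t, note_length, attack=0.05, decay=0.10, sustain=0.7, release=0.20):
    """Envelope gain (0.0 to 1.0) at t seconds after the key is pressed.

    note_length is how long the key is held; release starts at that moment.
    Assumes note_length >= attack + decay, so release begins at the sustain level.
    """
    if t < 0:
        return 0.0
    if t < note_length:
        if t < attack:                       # attack: ramp up from silence
            return t / attack
        if t < attack + decay:               # decay: fall back to sustain level
            fraction = (t - attack) / decay
            return 1.0 - fraction * (1.0 - sustain)
        return sustain                       # sustain: hold while key is down
    since_release = t - note_length          # release: fade out to silence
    if since_release >= release:
        return 0.0
    return sustain * (1.0 - since_release / release)


for moment in (0.0, 0.025, 0.05, 0.5, 1.1, 2.0):
    print(moment, round(adsr_level(moment, note_length=1.0), 3))
```
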
The Parts of a MIDI Note
[Figure: from the MIDI Manufacturers Homepage]
Making MIDI
• FM Synthesis
  • Sterile sine waves
  • What gave computer music a bad name
• Wavetable Sound Generation
  • The music gives the number of the instrument
  • Samples of the sound of that instrument are stored in ROM/RAM on the sound card
  • The samples are processed to give a far better illusion of the sound of the instrument
  • The more samples, the better, so 4 MB of samples in ROM is better than 512 KB.
  • Wavetables may also be downloaded from CD-ROMs
  • Wavetables can be purchased as daughterboards for better sound cards.
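
One way the stored samples can be "processed" is by playing a single recording back at different speeds to reach different notes. The sketch below shows that idea with linear interpolation. It is an illustration of the general technique, not the method of any particular sound card, and `repitch` is a made-up name.

```python
def repitch(samples, ratio):
    """Play a stored sample at `ratio` times its original pitch.

    ratio 2.0 is an octave up (and half as long); 0.5 is an octave down.
    ratio must be greater than zero.
    """
    out = []
    position = 0.0
    last = len(samples) - 1
    while position <= last:
        index = int(position)
        fraction = position - index
        following = samples[min(index + 1, last)]
        # Blend the two neighbouring stored samples.
        out.append(samples[index] * (1 - fraction) + following * fraction)
        position += ratio
    return out


stored = [0, 1, 2, 3, 4, 5, 6, 7]   # stand-in for a recorded instrument sample
print(repitch(stored, 2.0))         # octave up: every second sample
print(repitch(stored[:4], 0.5))     # octave down: in-between values filled in
```

This is also why more sample memory helps: the further a note is stretched from the recorded one, the less convincing it sounds, so more recordings per instrument mean less stretching.
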
MIDI Quality
• Well, as always there’s the trade off:
  • Much smaller file size
  • Always somewhat less quality
  • Infinitely cheaper to create -- only one muso necessary
  • May require significant CPU processing
Channels, Voices and Streams
• A channel drives a speaker:
  • 2 channels for standard stereo
  • 4 channels for 3D sound (two may be faked)
  • 8+ channels for super sound in theatre movies
• A voice is an instrument, etc. on a channel
  • MIDI supports a large number of voices: 32, 64. This is polyphony
  • The voices are superimposed, in digital or analog form, and then sent to the speakers
  • Again, multiple voices may load down the CPU
• A stream is half voice and half channel
  • Let’s record a sound effect, a stream
  • When we need it, we superimpose it on top of the sound going to a channel
  • The sound card and/or CPU do the work
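
In digital form, superimposing voices or streams amounts to adding their samples together and keeping the result inside the range the hardware can represent. A minimal sketch, assuming signed 16 bit samples (-32768 to 32767); the function name `mix` is invented for this example.

```python
SAMPLE_MIN = -32768   # assumed signed 16 bit sample range
SAMPLE_MAX = 32767


def mix(*voices):
    """Superimpose any number of voices (lists of samples) onto one channel."""
    length = max(len(voice) for voice in voices)
    mixed = []
    for i in range(length):
        # Voices that have already finished contribute nothing.
        total = sum(voice[i] for voice in voices if i < len(voice))
        # Clamp so a loud passage does not wrap around.
        mixed.append(max(SAMPLE_MIN, min(SAMPLE_MAX, total)))
    return mixed


music = [1000, 2000, 3000, 2000]
explosion = [500, -700]
print(mix(music, explosion))       # [1500, 1300, 3000, 2000]
print(mix([30000], [30000]))       # [32767], clamped
```

The work grows with every extra voice, since each output sample needs one more addition, which is why many voices can load down the CPU or the sound card.
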
Channels, Voices and Streams, 2
• The higher the bandwidth into the sound card
  • The more channels, voices and streams we can get at once
  • And the more processing work has to be done
  • So we either do more on-sound-card processing or bog down the CPU
  • (Sound like the issues related to 3D accelerator cards?!)
• The 16 bit ISA bus worked fine for sound until we decided to want more channels, voices, streams, wavetables, etc. (But they still flourish: somebody tried to sell me one yesterday)
• Newer cards use the PCI bus, and make use of it.
• And use a newer generation of sound chips.
• But the evolution here is relatively slow!
Games and Computer Sound
• Games are one of several factors driving the evolution of graphics boards
• Games are almost the only factor driving the evolution of sound cards
  • Who is sneaking up behind me? We need 3D.
  • What kind of sound does that alien make when exploded? We need lots of streams superimposed.
• 3D illusion
  • Uses 3 speakers (woofer + 2 satellites) and an algorithm to fox the ear by marginally delaying one stereo channel
  • Developed by NASA for space flight simulators
  • Can work well if you don’t move your head
  • With 5 speakers, esp. with 4 channels, can work very well indeed
  • As the musicians are never behind you, not necessary for music. Whoops, sorry Berlioz, Allegri, Tchaikovsky, etc.
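
The "marginally delaying one stereo channel" trick can be sketched very simply: the ear that hears a sound later is taken to be further from the source. The code below delays one channel by a number of samples. The 26 sample delay in the usage lines (a little over half a millisecond at 44,100 samples per second) is an assumed value for illustration, and real 3D audio algorithms do a good deal more than this.

```python
def delay_stereo(mono, delay_samples, delay_right=True):
    """Turn a mono signal into (left, right) with one channel delayed.

    Delaying the right channel makes the sound seem to come from the left,
    and vice versa. Both channels are padded to the same length.
    """
    direct = list(mono) + [0] * delay_samples
    delayed = [0] * delay_samples + list(mono)
    return (direct, delayed) if delay_right else (delayed, direct)


left, right = delay_stereo([1, 2, 3], delay_samples=2)
print(left)    # [1, 2, 3, 0, 0]
print(right)   # [0, 0, 1, 2, 3]

# Assumed delay of 26 samples for a footstep sound off to the left.
footstep = [0, 9000, -7000, 4000, -1500, 0]
left, right = delay_stereo(footstep, delay_samples=26)
print(len(left), len(right))
```
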
Games and Computer Sound, 2
• Competing 3D positional audio standards
  • A3D
    • From Aureal Semiconductor (now defunct)
    • Now a non-runner
  • Environmental Audio Extensions (EAX)
    • From Creative Labs (who brought us Sound Blaster)
    • Bought resources of Aureal (A3D)
    • Included A3D technology in EAX
    • In Nomad, Zen and other sound card brands
  • DirectSound3D
    • From Microsoft
    • Part of the DirectX set of Windows APIs/extensions, including Direct3D
  • CRL Sensaura
    • Bought by Creative Technology in 2003
    • Used in Xbox, PlayStation2, Nintendo GameCube, PCs
Digital Signal Processing
• Now most music processing is in analog form on the card.
• Far better if it were all digital, with DAC and amplification on the speakers
• Need DAE: Digital Audio Extraction, from CD/CD-ROM
  • Introduced with 24x CD-ROMs.
• Theory is that we use USB to get the digital sound to the speakers
• Speakers with USB connections now also have standard audio connections for the sounds and use USB for controlling the settings
• Amplifiers are both on the sound card and external.
MP3
• MPEG-1, Layer 3 sound compression -- intended for movies on CD and DVD
• 90+% compression possible
• Compression takes about a third of the length of the music
• A typical song (50 MB) goes to 5 MB
• Is a lossy compression, so the quality goes down, but not much
• No encryption in any way
• No “watermark” (watermark = a secret pattern of bits somewhere which indicates the source of the copy)
• Much music publisher panic with the popularity of the format.
• Much more music publisher panic with the iPod, which can store an entire music library in MP3 format
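
The 50 MB to 5 MB figure can be sanity-checked against the CD data rate. The sketch below works out how long a 50 MB uncompressed song runs and how big it would be as an MP3. The 128,000 bits per second MP3 rate is an assumption (a common encoder setting, not a number from the slide), and the function names are made up.

```python
CD_BYTES_PER_SECOND = 44_100 * 2 * 2   # 44,100 samples/s x 2 bytes x 2 channels


def compression_percent(original_bytes, compressed_bytes):
    """Percentage of the original size that compression removes."""
    return 100.0 * (1.0 - compressed_bytes / original_bytes)


def mp3_bytes(duration_seconds, bitrate_bits_per_second=128_000):
    """Approximate MP3 size at an assumed constant bitrate."""
    return duration_seconds * bitrate_bits_per_second / 8


song_seconds = 50_000_000 / CD_BYTES_PER_SECOND
estimated_mp3 = mp3_bytes(song_seconds)

print(round(song_seconds / 60, 1), "minutes of CD audio in 50 MB")
print(round(estimated_mp3 / 1_000_000, 1), "MB as MP3 at the assumed bitrate")
print(round(compression_percent(50_000_000, 5_000_000)), "% saved going from 50 MB to 5 MB")
```
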