This session explores the concepts of compression in multimedia, including images, audio, and transmission. Topics include lossy and lossless compression, compression formats such as JPEG and GIF, and streaming protocols. The session also covers representation and recognition in multimedia.
Multimedia Class 9 LBSC 690 Information Technology
Agenda • Questions • Compression • Images • Audio • Transmission • Representation & Recognition
Project • Today: Project descriptions • In 2 weeks: Specification • Guidelines on the web site • One week after that: Test Plan • Traceability matrix from specification to tests • Specification is not cast in concrete • But the test plan must match it exactly • And any changes must be reflected in both
Presentations • Typical setup • One presenter up front • Slide flipper at any computer • Ray will display whatever monitor you want • Five minutes each - ruthlessly timed! • Plus one minute for questions • Not graded • Seek to share ideas and learn!
Basic Image Coding • Raster of picture elements (pixels) • Each pixel has a “color” • Binary - black/white (1 bit) • Grayscale (8 bits) • Color (each pixel has three dots) • Red, green, blue • Screen • A 1024x768 image requires 2.4 MB • So a picture is worth 400,000 words!
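The 2.4 MB figure follows directly from width × height × bits per pixel. A minimal sketch of that arithmetic (the function name `raw_image_bytes` is made up for illustration, and decimal megabytes are assumed):

```python
def raw_image_bytes(width, height, bits_per_pixel):
    """Uncompressed size in bytes of a raster image."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes

# The three pixel formats from the slide, for a 1024x768 screen.
for name, bpp in [("binary", 1), ("grayscale", 8), ("color", 24)]:
    size = raw_image_bytes(1024, 768, bpp)
    print(f"{name:9s} {size:>9,d} bytes ({size / 1e6:.2f} MB)")
```

The color case comes to 2,359,296 bytes, which rounds to the 2.4 MB on the slide.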
Some Questions • Use this to answer: • How many images can a 2 GB hard drive store? • Nobody would use images! • How long does it take to send one by modem? • Imagine how slowly web pages would load • But real images don’t have these problems • How do we get around these problems?
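The two questions can be worked out roughly as below. This is only a sketch: the slides do not give a modem speed, so 56,000 bits per second is an assumption, as are decimal gigabytes and the helper names.

```python
IMAGE_BYTES = 1024 * 768 * 3   # uncompressed 24-bit color image (Basic Image Coding slide)
DISK_BYTES = 2 * 10**9         # 2 GB drive, decimal gigabytes assumed
MODEM_BITS_PER_SEC = 56_000    # assumed modem speed; not stated in the slides

def images_per_disk(disk_bytes, image_bytes):
    """How many whole images fit on the disk."""
    return disk_bytes // image_bytes

def seconds_to_send(image_bytes, bits_per_sec):
    """Transfer time for one image, ignoring protocol overhead."""
    return image_bytes * 8 / bits_per_sec

print(images_per_disk(DISK_BYTES, IMAGE_BYTES), "images per disk")
print(round(seconds_to_send(IMAGE_BYTES, MODEM_BITS_PER_SEC)), "seconds per image")
```

Under these assumptions the drive holds fewer than a thousand images and each one takes several minutes to send, which is why uncompressed images are impractical.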
Compression • General goal: reduce redundancy • Send the same information using fewer bits • more telephone calls in one cable • more faxes per minute • more images stored per disk • better quality video images • Two basic strategies: • Lossless, Lossy • The two strategies can be combined
Lossy Compression • Example - Palette selection • No picture uses all 16 million colors • Select a palette of 256 colors • Can represent with 1 byte instead of 3 • Then look up each color in the palette • JPEG (.jpg) • Standard lossy compression for images • Eliminates detail that’s not seen by humans • Uses frequency representation
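Palette selection can be sketched as follows. This assumes the palette has already been chosen (choosing a good one is a separate problem not covered here), and the function names are illustrative only.

```python
def nearest_palette_index(pixel, palette):
    """Index of the palette color closest to pixel (squared RGB distance)."""
    return min(
        range(len(palette)),
        key=lambda i: sum((p - q) ** 2 for p, q in zip(pixel, palette[i])),
    )

def quantize(pixels, palette):
    """Replace each 3-byte RGB pixel with a 1-byte palette index."""
    if len(palette) > 256:
        raise ValueError("a 1-byte index can address at most 256 colors")
    return bytes(nearest_palette_index(px, palette) for px in pixels)

def reconstruct(indices, palette):
    """Look each index up in the palette to get displayable colors back."""
    return [palette[i] for i in indices]

palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0)]   # tiny example palette
pixels = [(10, 10, 10), (250, 240, 245), (200, 30, 20)]
indices = quantize(pixels, palette)
print(list(indices))                  # one byte per pixel instead of three
print(reconstruct(indices, palette))  # palette colors, not the original pixels
```

The reconstructed pixels are the palette colors rather than the originals, which is what makes the scheme lossy.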
Lossless Compression • Run Length Encoding (RLE) • Pixels are organized into lines • Most pixels are the same as the one before • That can be coded in 1 bit (1/24 the space) • Smaller files take less time to transmit • GIF (.gif) • Standard lossless compression format
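Run length encoding can be sketched as below. This uses the common (value, count) variant rather than the 1-bit "same as the one before" flag described on the slide; the idea of exploiting repeated neighboring pixels is the same. Function names are illustrative.

```python
def rle_encode(row):
    """Collapse a line of pixels into (value, run_length) pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # same as the pixel before: extend the run
        else:
            runs.append([value, 1])   # different pixel: start a new run
    return [(v, n) for v, n in runs]

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original line."""
    row = []
    for value, count in runs:
        row.extend([value] * count)
    return row

line = [0, 0, 0, 0, 1, 1, 0, 0, 0]
encoded = rle_encode(line)
print(encoded)
print(rle_decode(encoded) == line)    # lossless: decoding restores every pixel
```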
Moving Images • One image frame is much like the next • An additional source of redundancy • MPEG-1 (.mpg) can handle small screens • Compression requires extensive computation • Special purpose hardware needed to run in real-time • Pentium processors can decode it • MPEG-2 is needed for full-screen video • Not yet widely used by computers • Try video from learn.umd.edu site
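The redundancy between frames can be illustrated by storing only the pixels that changed. This is a simplified sketch and not how MPEG actually works (MPEG uses motion-compensated prediction and frequency transforms); the function names are made up for illustration.

```python
def frame_delta(prev, curr):
    """List (index, new_value) for every pixel that differs between two frames."""
    return [(i, c) for i, (p, c) in enumerate(zip(prev, curr)) if p != c]

def apply_delta(prev, delta):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(prev)
    for index, value in delta:
        frame[index] = value
    return frame

frame1 = [7, 7, 7, 7, 7, 7, 7, 7]
frame2 = [7, 7, 9, 7, 7, 7, 7, 7]    # only one pixel changed
delta = frame_delta(frame1, frame2)
print(delta)                          # far fewer entries than pixels
print(apply_delta(frame1, delta) == frame2)
```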
Audio • In most cases, people care more about high quality audio than high quality images • Sample at twice the highest frequency • One or two bytes per sample • Voice (0-4 kHz) requires 8 kB/s at one byte per sample • Music (0-22 kHz) requires 44 kB/s at one byte per sample • Compression strategies • Lossy is pretty good for voice • Only some of the frequencies are actually used • Lossless is better for music
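The data rates above follow from sampling at twice the highest frequency. A minimal sketch of the arithmetic (the function name and the mono, single-channel default are assumptions):

```python
def audio_bytes_per_second(max_freq_hz, bytes_per_sample=1, channels=1):
    """Uncompressed audio data rate when sampling at twice the highest frequency."""
    sample_rate = 2 * max_freq_hz
    return sample_rate * bytes_per_sample * channels

print(audio_bytes_per_second(4_000))        # voice, 1 byte per sample
print(audio_bytes_per_second(22_000))       # music, 1 byte per sample
print(audio_bytes_per_second(22_000, 2))    # music, 2 bytes per sample
```

With one byte per sample this gives 8,000 and 44,000 bytes per second, matching the slide; two bytes per sample doubles the rate.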
Transmission • MIME • Attach a standard format to message • Formats include .gif, .jpg, .mpg, and .au • Messages include email and web pages • The whole file is sent first (downloaded), then played
Streaming Audio and Video • Streaming protocols • Replay starts almost immediately • RealVideo has emerged as the standard • Streaming video challenges • Sent in small packets • Sometimes arrive out of order • Compensate by storing some in a buffer • Introduces a delay • Modems carry audio better than video • Video data requires high “bandwidth” • RealVideo compensates with lower frame rates
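Buffering to fix out-of-order packets can be sketched like this. It is an illustration under simple assumptions (every packet carries a sequence number and none are lost), not a description of any particular streaming protocol; `playout_order` is a hypothetical name.

```python
import heapq

def playout_order(arrivals, buffer_size):
    """Yield (seq, payload) packets, holding up to buffer_size of them before release.

    A larger buffer tolerates more reordering but delays the start of playback.
    """
    heap = []
    for seq, payload in arrivals:
        heapq.heappush(heap, (seq, payload))
        if len(heap) > buffer_size:
            yield heapq.heappop(heap)   # release the lowest sequence number held
    while heap:                         # stream ended: drain what is left
        yield heapq.heappop(heap)

arrivals = [(1, "a"), (3, "c"), (2, "b"), (4, "d"), (6, "f"), (5, "e")]
print([seq for seq, _ in playout_order(arrivals, buffer_size=2)])
print([seq for seq, _ in playout_order(arrivals, buffer_size=0)])
```

With a buffer of two packets the stream plays in order; with no buffer the packets play in arrival order, glitches included.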
Representation and Recognition • Semantic representations versus pixels • Good representations can lead to good recognition • Speech Recognition • Phonemes, linguistics • Image Recognition • Objects • Movies • Thematic structure
Summary • Compression is needed to make multimedia manageable. • More “semantic” representations are now possible because of more computer power at the receiver (client). • Next week: usability and user interfaces