Multimedia Information: Representation, Manipulation & Management I502
Balance • Balance between quality and efficiency • Quality = accuracy or integrity • Efficiency = reasonable choices for storage and access
Balance • Images - quality depends on spatial resolution and bit-depth • Sound - quality depends on sampling rate and number of bits per sample • Moving image - quality depends on bit-depth, spatial resolution, frame rate, and sound quality • Compression influences all of these!
Images • Two types: raster and vector • A vector image file consists of information about lines & shapes • It describes the “locations” where points need to be drawn (in terms of x,y coordinate values) • Example: an image produced as output by CAD/CAM software
Images • Raster images are described in terms of pixel values (total number of pixels - width x height) and color values of specific pixels (bit-depth) • Example, captured using scanners or produced by digital cameras
Vector Image Example of a vector image: www.rff.com
Raster Image Example of a raster image: www.battelle.org/healthcare/html/gfx/content/idteam.jpg
A Sense of Depth • For black/white one bit for defining each cell “color” may be sufficient • But to achieve higher quality in images, for example shades of gray or to capture/store color, each pixel must be associated with more than one bit • This is commonly called “bit-depth”
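The link between bit-depth, the number of representable shades, and uncompressed storage can be sketched as follows. This is only an illustration, assuming a raw raster with no header, padding, or compression; the function names and the 640 x 480 example size are hypothetical and not from the slides.

```python
def shades(bit_depth):
    """Number of distinct values one pixel can take at this bit-depth."""
    return 2 ** bit_depth


def raw_image_bytes(width, height, bit_depth):
    """Uncompressed size in bytes, assuming `bit_depth` bits per pixel."""
    total_bits = width * height * bit_depth
    return (total_bits + 7) // 8  # round up to whole bytes


if __name__ == "__main__":
    # 1-bit monochrome, 8-bit grayscale, 24-bit color (assumed example depths)
    for depth in (1, 8, 24):
        print(depth, "bits:", shades(depth), "values,",
              raw_image_bytes(640, 480, depth), "bytes for 640 x 480")
```

Each added bit doubles the number of shades, while storage grows only linearly with bit-depth.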
Bit-depth: Monochrome or 1-bit dx.sheridan.com
Example: Grayscale – 8-bit dx.sheridan.com
Example: Color – 8-bit dx.sheridan.com
Images Storage Fact: 20,000,000 bytes needed to store an average book in digitized (image) format
Images • GIF - lossless compression (limited to 256 colors) • about 3:1 compression ratio • PNG - lossless compression • 5% to 25% better than GIF • JPEG (baseline) - lossy • as much as 40:1
Audio • Capturing sound involves converting an analogue (continuous) signal to a digital (discrete) signal • This process is called sampling
Audio • The analogue signal is “broken” up into samples • For each sample a value corresponding to the “wave amplitude” is measured (dynamic range) and stored • The dynamic range is stored using n bits/sample
Audio • The number of samples/second together with bits/sample therefore determine the quality of the digitized sound • CD sound - 44.1 kHz & 16 bits/sample - is used as the benchmark • Sample rate - 44,100 times / second • Dynamic range - 65,536 values (16 bits)
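Sampling and quantization as described above can be sketched in a few lines. This is a minimal illustration, assuming a pure sine tone as the "analogue" input and signed integer samples; the function name `sample_sine` and the 440 Hz test tone are hypothetical choices, not part of the slides.

```python
import math


def sample_sine(freq_hz, duration_s, sample_rate=44100, bits=16):
    """Sample a sine wave and quantize each sample to a signed integer."""
    max_amp = 2 ** (bits - 1) - 1          # 32767 for 16 bits
    n = int(round(duration_s * sample_rate))
    return [round(max_amp * math.sin(2 * math.pi * freq_hz * i / sample_rate))
            for i in range(n)]


if __name__ == "__main__":
    samples = sample_sine(440, 1)
    print(len(samples), "samples in one second")
    print(2 ** 16, "distinct amplitude values at 16 bits")
```

Lowering `sample_rate` reduces how often the wave is measured, and lowering `bits` reduces how finely each measurement is recorded; both shrink the data at the cost of fidelity.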
Audio • Most sound cards support CD-quality sound capture • At the highest quality (stereo), storage needs are about: • Each second = 176 kilobytes • Each minute = 10.5 megabytes • Each hour = 630 megabytes
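The storage figures follow from multiplying sample rate, bytes per sample, and channel count. The sketch below assumes two channels (stereo), which is what makes the per-second figure come out near 176 kilobytes; the function name is hypothetical.

```python
def audio_bytes(seconds, sample_rate=44100, bits=16, channels=2):
    """Uncompressed PCM size in bytes (channels=2 is an assumption: stereo)."""
    return seconds * sample_rate * (bits // 8) * channels


if __name__ == "__main__":
    for label, secs in (("second", 1), ("minute", 60), ("hour", 3600)):
        print("Each", label, "=", audio_bytes(secs), "bytes")
```

The exact per-hour value is a little above 630 megabytes because the slide multiplies the rounded per-minute figure by 60.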
Audio Storage Fact: A CD can store about 650,000,000 bytes of data
Audio • Developers can also use Apple’s QuickTime for storing high quality sound • (recognized by both popular browsers) • MIDI - musical instrument digital interface is similar to “Postscript” in that it does not store sound but instructions to synthesizers to produce sound
Audio • RealAudio - uses “streaming” technology for sending FM stereo quality sound at 28.8 Kbps or near CD quality at ISDN (64 Kbps) • Offers different sample-rate compression - low (AM), medium (FM), and high (CD)
Audio • MP3 - MPEG audio layer 3 • MPEG - Moving Picture Experts Group • It is a sound compression and storage format
Audio • Advantages of MP3: • Compresses regular CD-quality sound files, usually at a 12:1 ratio • One minute of CD sound generally takes up about one megabyte in MP3 (as opposed to 10+) • A typical 4-minute song = 3.5-5 megabytes in MP3 • An MP3 file (like TIFF) allows the creator to include annotations: text (e.g., artist’s name), graphics (e.g., cover art) and URL
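The MP3 size estimates can be checked with simple arithmetic. This sketch takes the 12:1 ratio and the 10.5 megabytes per minute of CD audio from the slides as given; the function name is hypothetical, and real MP3 sizes depend on the chosen bit rate.

```python
def mp3_megabytes(minutes, ratio=12.0, cd_mb_per_minute=10.5):
    """Estimated MP3 size in megabytes for a given compression ratio."""
    return minutes * cd_mb_per_minute / ratio


if __name__ == "__main__":
    print("1 minute:", mp3_megabytes(1), "MB")
    print("4 minutes:", mp3_megabytes(4), "MB")
```

At 12:1 a four-minute song lands at the low end of the 3.5-5 megabyte range; lower compression ratios account for the upper end.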
Video • Frame - Capturing video involves transforming lines of the video signal per frame until a full frame is formed • Motion - many frames are captured in each second
Video • Generally, about 500 lines are captured in each frame and each line has 640 pixels • Also, for “realistic” motion 30 frames are captured and stored in each second
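To see why video depends so heavily on compression, the uncompressed data rate implied by these numbers can be estimated. The slide does not state a bit-depth, so 24 bits per pixel is an assumption here, and audio is ignored; the function name is hypothetical.

```python
def raw_video_bytes_per_second(pixels_per_line=640, lines=500, fps=30,
                               bits_per_pixel=24):
    """Uncompressed video data rate (bits_per_pixel=24 is an assumption)."""
    return pixels_per_line * lines * fps * bits_per_pixel // 8


if __name__ == "__main__":
    rate = raw_video_bytes_per_second()
    print(rate, "bytes per second uncompressed")
    print(rate * 60, "bytes per minute uncompressed")
```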
Video • Relies on different types of compression-decompression (codec) algorithms for storage
Video • MPEG-2 • Allows compression/decompression from 720 x 480 at 30 fps up to 1280 x 720 at 60 fps • Supports variable compression like JPEG - generally about 20:1 maximum • Supports “picture-in-picture”
Video • Cinepak • Achieves compression by reducing the frame size to 320 x 240 and frame rate to 15 fps
Video Storage Fact: 10,000,000,000 bytes in a digitized movie in compressed format; 17,000,000,000 can be stored on a DVD
Compression • Two types: Lossy versus Lossless • Lossy compression sacrifices certain information -- is not reversible • Lossless compression does not sacrifice any information -- reversible
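The difference between reversible and irreversible compression can be demonstrated directly. The sketch below uses Python's standard `zlib` module (DEFLATE) as a stand-in for lossless compression and a simple bit-dropping quantizer as a stand-in for a lossy step; neither is a method named in the slides, and the function names are hypothetical.

```python
import zlib


def lossless_round_trip(data: bytes) -> bytes:
    """Compress and decompress; the result is identical to the input."""
    return zlib.decompress(zlib.compress(data))


def lossy_quantize(data: bytes, keep_bits: int = 4) -> bytes:
    """Keep only the high-order bits of each byte; the rest is discarded."""
    mask = (0xFF << (8 - keep_bits)) & 0xFF
    return bytes(b & mask for b in data)


if __name__ == "__main__":
    original = bytes(range(256)) * 4
    print("lossless reversible:", lossless_round_trip(original) == original)
    print("lossy reversible:", lossy_quantize(original) == original)
```

Once the low-order bits are dropped there is no way to tell which of several original values a quantized byte came from, which is what "not reversible" means in practice.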
Compression • Several lossless compression methods for text: • UNIX compress (LZ - Lempel & Ziv) • LZW (Lempel, Ziv & Welch) • RLE (Run Length Encoding) • The above generally achieve about 3:1 (output roughly 30% of the original size) • Group IV (CCITT fax encoding) can achieve as much as 15:1
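Of the methods listed, run-length encoding is the simplest to show in full. The following is a minimal sketch that stores each run as a (count, value) pair; real RLE formats pack runs into bytes in format-specific ways, and the function names here are hypothetical.

```python
def rle_encode(data: bytes):
    """Collapse runs of identical bytes into (count, value) pairs."""
    runs = []
    for b in data:
        if runs and runs[-1][1] == b:
            runs[-1][0] += 1        # extend the current run
        else:
            runs.append([1, b])     # start a new run
    return [(count, value) for count, value in runs]


def rle_decode(runs):
    """Expand (count, value) pairs back into the original bytes."""
    return bytes(value for count, value in runs for _ in range(count))


if __name__ == "__main__":
    encoded = rle_encode(b"aaaaaaaabbbc")
    print(encoded)
    print(rle_decode(encoded))
```

RLE pays off only when the data contains long runs, as in scanned black-and-white pages; on ordinary text it can make the output larger.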
Manipulating Multimedia • High-level programming languages offer APIs or code libraries to manipulate multimedia • Java provides numerous classes as part of its API library to manipulate multimedia content
Manipulating Images • For an example of simple image manipulation, see: http://xtasy.slis.indiana.edu/jmdocs/java/LoadImageAndScale.html • The code for the above example is here: • http://xtasy.slis.indiana.edu/jmdocs/java/LoadImageAndScale.txt • More advanced image manipulation is possible using the Java Advanced Imaging (JAI) API (discussed later)
Manipulating Audio & Video • Audio can be manipulated in various ways, for example play, stop, loop, etc. • An example: • http://lair.indiana.edu/courses/i502/code/LoadAudioAndPlay.html • The code can be viewed here: • http://lair.indiana.edu/courses/i502/code/LoadAudioAndPlay.txt • Java also allows manipulation of video using the Java Media Framework (JMF) Library • http://java.sun.com/products/java-media/jmf/2.1.1/samples/
Project ViewFinder • It is a project aimed at developing search and browse functions for online “movie” information • The system uses both “key frames” and “text clues” associated with movies
Searching and Browsing in ViewFinder • Interface functions shown: Frames, Search by Descriptors, Play, Detail, Promote
Adding Vector Similarity Search to ViewFinder • We are attempting to generate “feature” vectors for video frames so that we can implement similarity searches • One problem is that images may have many types of “information” embedded in them • At the most basic level we are starting with color information • The Java Advanced Imaging API provides modules to generate color histogram information
Color Histograms • Given an image, one can produce a vector of pixel counts for particular color shades, ranging from 0-255, for each of the three basic colors: red, green, and blue • Sample histogram values (truncated):
238 0 0 0 0 0 0 0 238 0 0 19278 476 23562 0 238 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
0 0 0 0 0 0 0 0 0 0 0 0 0 238 0 0 0 0 0 0 19040 476 0 0 238 238 0 0 0 0 0 0 0 0 ...
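The histogram idea can be sketched without any imaging library. This is plain Python, not the Java Advanced Imaging API mentioned in the slides, and it assumes the image is already available as a sequence of (red, green, blue) tuples with 8-bit values; the function name is hypothetical.

```python
def color_histograms(pixels):
    """Count pixels per shade (0-255) separately for red, green, and blue.

    `pixels` is an iterable of (r, g, b) tuples; returns three 256-element
    lists of counts.
    """
    red, green, blue = [0] * 256, [0] * 256, [0] * 256
    for r, g, b in pixels:
        red[r] += 1
        green[g] += 1
        blue[b] += 1
    return red, green, blue


if __name__ == "__main__":
    # A tiny made-up "image": two pure-red pixels and one mid-green pixel.
    red, green, blue = color_histograms([(255, 0, 0), (255, 0, 0), (0, 128, 0)])
    print(red[255], green[128], blue[0])
```

Each of the three vectors sums to the total number of pixels, so most entries are zero for images that use only a few shades.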
Inner-Product Similarity Averages • The Red, Green, and Blue vector inner-products can be added and an average can be calculated based on the overall sum • The result is an image-frame by image-frame similarity matrix which we are attempting to exploit to generate image associations and image clusters • An explanation of this approach can be found here: • http://ella.slis.indiana.edu/~daalbert/lair/jai_tutorial/
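One possible reading of the averaging step is sketched below: take the inner product of the two frames' red vectors, likewise for green and blue, add the three results, and divide by three. Whether the project normalizes the vectors first is not stated in the slides, so this sketch uses raw counts; all function names are hypothetical.

```python
def inner_product(u, v):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def frame_similarity(hist_a, hist_b):
    """Average of the red, green, and blue inner products of two frames.

    Each argument is a (red, green, blue) tuple of histogram vectors.
    """
    total = sum(inner_product(ca, cb) for ca, cb in zip(hist_a, hist_b))
    return total / 3


def similarity_matrix(frames):
    """Frame-by-frame matrix of similarity scores."""
    return [[frame_similarity(a, b) for b in frames] for a in frames]


if __name__ == "__main__":
    # Two-shade toy histograms standing in for 256-element vectors.
    f1 = ([1, 0], [0, 2], [3, 0])
    f2 = ([1, 1], [1, 1], [1, 1])
    for row in similarity_matrix([f1, f2]):
        print(row)
```

With raw counts, frames with more pixels score higher against everything; dividing each histogram by its pixel count before comparing is one common way to remove that effect.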
Relational DB Support for MM • Different products support different fields • Varying length types • VARCHAR • BLOB • TEXT • IMAGE • CHARACTER VARYING • VARGRAPHIC • LONG RAW • BYTE VARYING • Often directory path to file system is stored (or a URL)
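The two storage choices, keeping the media bytes in a varying-length column or keeping only a path or URL, can be sketched with Python's built-in `sqlite3` module. SQLite is used here purely for illustration and is not one of the products discussed; the table name, column names, and file path are hypothetical.

```python
import sqlite3


def make_media_db():
    """Create an in-memory table with both a BLOB column and a path column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE media ("
        " id INTEGER PRIMARY KEY,"
        " title VARCHAR(100),"
        " content BLOB,"       # media bytes stored inside the database
        " location TEXT)"      # or a file-system path / URL instead
    )
    return conn


if __name__ == "__main__":
    conn = make_media_db()
    conn.execute("INSERT INTO media (title, content) VALUES (?, ?)",
                 ("stored inline", b"\x00\x01\x02"))
    conn.execute("INSERT INTO media (title, location) VALUES (?, ?)",
                 ("stored by path", "/media/frames/0001.jpg"))
    for row in conn.execute("SELECT title, content, location FROM media"):
        print(row)
```

Storing a path keeps the database small but leaves the file outside the database's control, so it can be moved or deleted without the table knowing.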
Case Study of RDBMS • Plexus XDP Imaging DB • Based on INFORMIX Turbo RDBMS • Supports a data type called IMAGE -> up to 2 GB • Supports direct manipulation of disk volumes instead of storing OS directory paths • Volume = platter • Family = a collection of volumes
Case Study: Plexus • A specific storage area, i.e., a family, can be assigned to each IMAGE column • SQL in Plexus • CREATE TABLE Pages ( PAGE_Number Integer, DOCUMENT_Number Integer, PAGE_Image IMAGE IN ImageFamily ) IN CompoundDocumentFamily