Chapter 3 Data Formats
Types of Data • Alphanumeric • Numeric • Image • Sound • Video • Graphics and Fonts All MUST be converted to binary form before they can be used by the computer.
Proprietary Formats • Used by individual programs • Unique: cannot be read by other programs • Example - a Microsoft Word document is different from a WordPerfect document even though they are both word processing documents • As more information is shared over the Web, new proprietary formats are less desirable.
Standards • Formats which are recognized by a large variety of hardware and software programs • Possible to use on different platforms • Possible for different systems to share data
How Are Standards Developed? • People use a format because it is the most popular - a so-called de facto standard • Adobe’s PostScript • A committee is formed because a need for a standard is recognized • MPEG-2 • JPEG • MP3
Characteristics of Standards • A well designed data standard should: • simplify interconnections • usefully reflect the ways the data is used • be recognized by a wide variety of applications
Data Formats • Alphanumeric Character Data • EBCDIC • ASCII • Image data • Bit map images • Object images • Audio data • .MOD • .MIDI • .WAV
Alphanumeric Characters: ASCII • American Standard Code for Information Interchange (pg. 67) • 7-bit code => 128 characters • Includes Latin alphabet, Arabic numerals, standard English punctuation characters, and a few non-printable characters • Collating sequence • Most personal computers use ASCII • Latin-1 (extended ASCII) - 8-bit code for Western European languages
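The 7-bit codes and the collating sequence mentioned above can be inspected directly. This is a minimal sketch; the sample characters and the helper name `is_ascii` are chosen only for illustration.

```python
# Print the 7-bit ASCII code of a few characters in decimal and binary.
for ch in ("A", "a", "0", " "):
    code = ord(ch)
    print(repr(ch), code, format(code, "07b"))


def is_ascii(text):
    """Return True if every character fits in the 7-bit range 0..127."""
    return all(ord(c) < 128 for c in text)


# Collating sequence: sorting by code value puts digits first,
# then uppercase letters, then lowercase letters.
ascii_sorted = sorted(["b", "B", "2", "a", "A"])
print(ascii_sorted)
```

Sorting by code value is what makes "Zebra" sort before "apple" in a plain ASCII comparison.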
Alphanumeric Characters: EBCDIC • Extended Binary Coded Decimal Interchange Code (pg. 68) • 8-bit code => 256 characters • Developed by IBM • Use is generally restricted to IBM and IBM-compatible mainframe computers • Limited: no ~ [ ] ^ representations
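To see how EBCDIC differs from ASCII, the sketch below uses Python's built-in `cp037` codec. Treating `cp037` as representative is an assumption: it is only one of several EBCDIC code pages, and the slides do not say which one they have in mind.

```python
def to_ebcdic(text):
    """Encode text with the cp037 code page (one EBCDIC variant, assumed here)."""
    return text.encode("cp037")


# The same letter has a different 8-bit code in ASCII and in EBCDIC.
print(hex(ord("A")), to_ebcdic("A").hex())

# The collating sequence differs too: in this code page lowercase letters
# come first, then uppercase letters, then digits.
ebcdic_sorted = sorted(["b", "B", "2", "a", "A"], key=to_ebcdic)
print(ebcdic_sorted)
```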
Alphanumeric Characters: Unicode • Supports many additional character alphabets including Cyrillic, Greek, Hebrew, Arabic, Thai, Chinese, Japanese, Korean, etc. • Originally a 16-bit standard => 65,536 characters (49,000 of which were defined); later versions extend beyond 16 bits • Code values 0-255 correspond to the ASCII Latin-1 codes => conversion from ASCII to Unicode is simple (just prepend eight 0 bits to the 8-bit code) • Unicode is standard in Windows OS
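The conversion from an 8-bit Latin-1 code to a 16-bit Unicode value can be sketched as adding a zero high-order byte in front of each code. The function name `latin1_to_utf16be` is hypothetical, and big-endian byte order is assumed so that the zero byte comes first.

```python
def latin1_to_utf16be(data):
    """Widen each 8-bit Latin-1 code to 16 bits by adding a zero high byte."""
    out = bytearray()
    for byte in data:
        out.append(0x00)  # eight leading 0 bits
        out.append(byte)  # original 8-bit code, unchanged
    return bytes(out)


# "A" followed by e-acute (code 0xE9 in Latin-1).
widened = latin1_to_utf16be("A\u00e9".encode("latin-1"))
print(widened.hex())
```

The result matches what Python's own `utf-16-be` encoder produces for the same two characters.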
Keyboard Conversion • A binary scan code is generated by the keyboard circuitry for each keystroke • Computer software converts the scan code to Unicode, ASCII, or EBCDIC • NOTE: This permits different keyboards to be used for different natural languages • Characters are echoed to the monitor
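The two-step conversion (scan code to character, character to code) can be sketched with lookup tables. The scan-code values and the two layout tables below are illustrative assumptions, not a complete or authoritative keyboard map.

```python
# Hypothetical scan-code tables: the same physical key (same scan code)
# produces a different character depending on the selected layout.
US_LAYOUT = {0x10: "q", 0x11: "w", 0x1E: "a", 0x2C: "z"}
FR_LAYOUT = {0x10: "a", 0x11: "z", 0x1E: "q", 0x2C: "w"}


def scan_to_code(scan, layout, encoding="ascii"):
    """Map a scan code to a character, then to its code in the target encoding."""
    ch = layout[scan]
    return ch.encode(encoding)[0]


print(scan_to_code(0x1E, US_LAYOUT))           # ASCII code for the US layout
print(scan_to_code(0x1E, FR_LAYOUT))           # same key, French layout
print(scan_to_code(0x1E, US_LAYOUT, "cp037"))  # same key, an EBCDIC code page
```

Because the layout table lives in software, switching natural languages only requires swapping the table, not the keyboard hardware.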
Image Data • Bit Map (Raster) Image - image is represented by a set of picture elements (pixels - points) • Object (Vector) Image - image is represented by a set of graphical shapes such as lines and curves
Bit Map Images • A pixel, a single point of the image, is stored as binary data • Each pixel contains intensity level and color - both of which may have large ranges of values (requiring perhaps up to 3 bytes/pixel) • A single image requires a large amount of data - 1.5+ MB • Image processing requires large arrays of data - representing pixels and their locations
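The storage estimate follows from width x height x bytes per pixel. The resolutions below are assumptions picked for illustration; the slides do not say which image size they had in mind.

```python
def bitmap_bytes(width, height, bytes_per_pixel=3):
    """Uncompressed size of a bitmap: one fixed-size record per pixel."""
    return width * height * bytes_per_pixel


# A tiny 2x2 bitmap stored as rows of (red, green, blue) pixels.
tiny = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

for w, h in ((640, 480), (800, 600), (1024, 768)):
    print(f"{w}x{h}: {bitmap_bytes(w, h):,} bytes")
```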
Graphics Interchange Format • GIF - most common method of storing bit map images • CompuServe proprietary format - 1987 • GIF89a - latest standard also supporting animated GIF images • GIF used extensively on Web • 8 bits/pixel - 256 colors (not good for details of paintings and photographs) • GIF is better suited for line drawings and simple images such as clip art, logos, and areas with solid colors
Joint Photographic Experts Group • JPEG suited to photographs and paintings. • 24 bits per pixel - 16.7 million colors • Lossy Compression - assumes some data can be lost without noticing • subtle color changes will not show • some clarity is lost in order to have a smaller file • very small file sizes
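The color counts quoted for GIF and JPEG follow from the number of bits per pixel. The helper name `colors_for_depth` is only illustrative.

```python
def colors_for_depth(bits_per_pixel):
    """Number of distinct values that fit in the given number of bits."""
    return 2 ** bits_per_pixel


print(colors_for_depth(8))   # GIF: 8 bits per pixel
print(colors_for_depth(24))  # JPEG: 24 bits per pixel
```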
Object Images • Image is composed of geometric shapes represented mathematically • Efficient • Flexible • Stored in compact form • Images can be easily moved, scaled, rotated
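Because an object image stores shapes mathematically, moving, scaling, and rotating it only means transforming a few coordinates. The sketch below assumes a shape is a plain list of (x, y) points; that representation is chosen for illustration and is not prescribed by the slides.

```python
import math


def translate(points, dx, dy):
    """Move every point by (dx, dy)."""
    return [(x + dx, y + dy) for x, y in points]


def scale(points, factor):
    """Scale every point about the origin."""
    return [(x * factor, y * factor) for x, y in points]


def rotate(points, degrees):
    """Rotate every point counterclockwise about the origin."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [(x * c - y * s, x * s + y * c) for x, y in points]


triangle = [(0, 0), (4, 0), (0, 3)]
print(scale(triangle, 2))
print(translate(triangle, 10, 10))
print(rotate(triangle, 90))
```

A triangle needs three points no matter how large it is drawn, whereas the equivalent bitmap grows with the square of the scale factor.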
PostScript • Page Description Language for storing, transmitting, displaying and printing object images. • Contains procedures and statements that describe each of the objects on a page. • Program is stored in ASCII or Unicode and thus is a text file. • Program in printer or computer interprets PostScript file and creates pages that can be printed or displayed. • Large set of PostScript functions for manipulating objects.
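Since a PostScript program is plain text, it can be generated like any other text file. The sketch below emits a very small program that draws one line using the standard `newpath`, `moveto`, `lineto`, `stroke`, and `showpage` operators. The function name and coordinates are assumptions for illustration, and the output has not been checked against any particular printer or interpreter.

```python
def line_to_postscript(x1, y1, x2, y2):
    """Return a tiny PostScript program that strokes one line and emits the page."""
    return "\n".join([
        "%!PS",
        "newpath",
        f"{x1} {y1} moveto",
        f"{x2} {y2} lineto",
        "stroke",
        "showpage",
    ]) + "\n"


program = line_to_postscript(72, 72, 288, 72)
print(program)

# The program is ordinary ASCII text, so it encodes without error.
ascii_bytes = program.encode("ascii")
```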
Video Images • Video images require massive amounts of data. • Video Camera: 640x480 resolution, true color, 30 frames/second => about 28 MB/sec of data => about 1.66 GB for 1 minute of film • Solution to massive files: • limit number of colors • limit image size • reduce frame rate • compress data
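The data-rate figures can be reproduced with simple arithmetic, assuming true color means 3 bytes per pixel and no compression is applied.

```python
def video_bytes_per_second(width, height, bytes_per_pixel, frames_per_second):
    """Uncompressed video data rate in bytes per second."""
    return width * height * bytes_per_pixel * frames_per_second


rate = video_bytes_per_second(640, 480, 3, 30)
per_minute = rate * 60

print(f"{rate:,} bytes/sec")         # roughly 28 MB/sec
print(f"{per_minute:,} bytes/min")   # roughly 1.66 GB/min

# Effect of the listed remedies: half the width and height, half the frame rate.
reduced = video_bytes_per_second(320, 240, 3, 15)
print(f"{reduced:,} bytes/sec")
```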
Video Compression • MPEG - family of digital video compression standards • high compression achieved by storing only what changes from frame to frame • file sizes still large and take a long time to download • additional hardware support provided for displaying real-time video
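The idea of storing only what changes from frame to frame can be sketched with simple frame differencing. This is not the actual MPEG algorithm, which is considerably more elaborate; frames here are assumed to be flat lists of pixel values.

```python
def frame_delta(prev, curr):
    """Record only the (index, new_value) pairs where the frame changed."""
    return [(i, v) for i, (p, v) in enumerate(zip(prev, curr)) if p != v]


def apply_delta(prev, delta):
    """Rebuild the next frame from the previous frame plus the changes."""
    frame = list(prev)
    for i, v in delta:
        frame[i] = v
    return frame


frame1 = [0, 0, 0, 0, 9, 9, 0, 0]
frame2 = [0, 0, 0, 0, 0, 9, 9, 0]  # the bright spot moved one pixel right

delta = frame_delta(frame1, frame2)
print(delta)
print(apply_delta(frame1, delta) == frame2)
```

When most of the picture is static, the delta is much shorter than the full frame, which is where the savings come from.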
Audio Data • Analog sound wave must be converted to digital form • Analog waveform is sampled at regular time intervals • Amplitude is measured and converted to the binary equivalent (A-to-D converter) • Sampling rate - how often the sound is digitized (e.g., 50,000 times/sec) • Higher rates - better quality but more storage space
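Sampling and A-to-D conversion can be sketched by evaluating a sine wave at regular intervals and rounding each amplitude to an integer. The 440 Hz tone and 16-bit sample size are assumptions for illustration; only the 50,000 samples/sec rate comes from the slide.

```python
import math


def sample_sine(freq_hz, sample_rate, duration_s, bits=16):
    """Sample a sine wave at regular intervals and quantize each amplitude."""
    max_amp = 2 ** (bits - 1) - 1  # largest signed value for this sample size
    n = round(sample_rate * duration_s)
    return [
        round(max_amp * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n)
    ]


def storage_bytes(sample_rate, duration_s, bits=16, channels=1):
    """Bytes needed to store the samples without compression."""
    return round(sample_rate * duration_s) * (bits // 8) * channels


samples = sample_sine(440, 50_000, 0.01)
print(len(samples), samples[:5])
print(storage_bytes(50_000, 1))  # one second, mono, 16-bit
print(storage_bytes(25_000, 1))  # half the rate needs half the space
```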
Audio Formats • MOD - store samples of sounds that are subsequently used to produce a new sound • MIDI - used to coordinate sounds and signals between computer and musical devices (such as keyboards) • MP3 - used for transmission and storage of high-quality audio signals. Popular for Web use. Low-cost, portable devices available for handling MP3 data. • WAV - simple Microsoft format supporting various sampling rates in mono or stereo
MP3 • Popular file format for compressing and playing CD-quality music • 12:1 compression with little perceptible loss in quality (MP3 is a lossy format) • Has caused much controversy in the music world • Procedure: encode a song, distribute over the Internet, download to a PC, transfer to a $200 portable MP3 player or listen on computer • Over 80 MP3 sites and 20,000 songs and growing
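The effect of a 12:1 ratio can be estimated from the size of uncompressed CD audio (44,100 samples/sec, 16 bits, stereo). The four-minute song length is an assumption for illustration.

```python
CD_SAMPLE_RATE = 44_100   # samples per second per channel
CD_BYTES_PER_SAMPLE = 2   # 16-bit samples
CD_CHANNELS = 2           # stereo


def cd_audio_bytes(seconds):
    """Uncompressed size of CD-quality audio."""
    return CD_SAMPLE_RATE * CD_BYTES_PER_SAMPLE * CD_CHANNELS * seconds


def compressed_bytes(raw_bytes, ratio=12):
    """Approximate size after compression at the given ratio."""
    return raw_bytes // ratio


song = cd_audio_bytes(240)  # a four-minute song
print(f"raw: {song:,} bytes")
print(f"12:1: {compressed_bytes(song):,} bytes")
```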
WAV Files • Some are found on your hard drive • Also can be downloaded from the Web • Create your own WAV file from an audio CD • Uses for Sound Files • to signify an event on the system (e.g., a file opening) • add sound to your web page
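A WAV file can also be created from scratch. The sketch below uses Python's standard `wave` module to build a short mono tone in memory; the tone frequency, sample rate, and duration are arbitrary choices, not values from the slides.

```python
import io
import math
import struct
import wave


def make_wav(freq_hz=440, sample_rate=8000, seconds=0.25):
    """Build a mono 16-bit WAV file in memory and return its bytes."""
    n = round(sample_rate * seconds)
    frames = b"".join(
        struct.pack("<h", round(32767 * math.sin(2 * math.pi * freq_hz * i / sample_rate)))
        for i in range(n)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)        # mono
        w.setsampwidth(2)        # 2 bytes = 16 bits per sample
        w.setframerate(sample_rate)
        w.writeframes(frames)
    return buf.getvalue()


wav_bytes = make_wav()
print(len(wav_bytes), wav_bytes[:4], wav_bytes[8:12])
```

Writing `wav_bytes` to a file with a `.wav` extension should give something an ordinary media player can open.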
Streaming Audio and Video • Streaming - data is downloaded continuously from web server or network server • Solution to large file size download time - start the download and play while more is being downloaded in the background • RealAudio and RealVideo most common players - over 125 million registered users • Delivers content to 85% of all streaming enabled web sites • Also used for live broadcasts
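The play-while-downloading idea can be sketched with a generator that hands over data one chunk at a time. No real network or player is involved; the chunk size and function names are assumptions, and "playing" is simulated by collecting the chunks.

```python
def stream_chunks(data, chunk_size):
    """Yield the data in small pieces, as a server would send it."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def play_while_downloading(data, chunk_size=4):
    """Consume each chunk as soon as it arrives instead of waiting for the whole file."""
    played = bytearray()
    chunks_seen = 0
    for chunk in stream_chunks(data, chunk_size):
        played.extend(chunk)  # stand-in for sending the chunk to the audio device
        chunks_seen += 1
    return bytes(played), chunks_seen


media = b"0123456789"
played, chunks_seen = play_while_downloading(media)
print(played, chunks_seen)
```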
Data Compression • Lossless Compression (GIF Files) • no data is lost when decompressed • compression algorithms attempt to eliminate redundancies in data, such as a string of 0-bits • Lossy Compression (JPEG) • assumes the user can live with a certain amount of data degradation • used in multimedia applications
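The difference between the two approaches can be sketched with two toy techniques. Run-length encoding stands in for lossless compression (GIF itself uses LZW, not run-length encoding), and coarse quantization stands in for lossy compression (JPEG's actual method is far more involved). Both are illustrations under those assumptions, not the real file-format algorithms.

```python
def rle_encode(bits):
    """Lossless: replace each run of identical symbols with (symbol, count)."""
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1] = (b, runs[-1][1] + 1)
        else:
            runs.append((b, 1))
    return runs


def rle_decode(runs):
    """Expand the runs back into the exact original string."""
    return "".join(symbol * count for symbol, count in runs)


def quantize(values, step):
    """Lossy: round each value down to a multiple of step; detail is discarded."""
    return [v // step * step for v in values]


original = "0000011000"
encoded = rle_encode(original)
print(encoded)
print(rle_decode(encoded) == original)  # nothing was lost

pixels = [200, 201, 203, 77]
print(quantize(pixels, 4))              # close to the original, but not identical
```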