
Data Representation

Data Representation. CPIS 210 John Beckett. Useful Skills. Understand how data is organized Helps development & debugging Helps understand performance issues for planning & design Understand how to transform data Is it related to other data? Is there a transliteration?


Presentation Transcript


  1. Data Representation CPIS 210 John Beckett

  2. Useful Skills • Understand how data is organized • Helps development & debugging • Helps understand performance issues for planning & design • Understand how to transform data • Is it related to other data? • Is there a transliteration? • What is truncated or expanded in the process? • Understand how to quantify performance

  3. A Brief History • Pictograph – represents ideas by pictures • Alphabet – represents phonemes (sounds) by letters • Morse Code – represents characters by strings of on/off states over time • Electrical Analog – represents variations in physical reality with variations in voltage • Audio: air pressure over time • Video: locations on a raster • New media tend not to replace the old, but to encapsulate it, e.g. movies on TV

  4. Digital Representation • Everything must be funneled into 1’s and 0’s • If the information was discrete symbols (e.g. the alphabet) or events, there is a code list • Hollerith, BCDIC, EBCDIC – IBM • Baudot – 5 bits, required shift for num/char • ASCII – 7 bits • Unicode – variable-width encodings (e.g. UTF-8), extends ASCII • If the information represents physical state, there is a standard (e.g. WAV, MP3)
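
To make the "7 bits" and "variable-width" points concrete, here is a minimal Python sketch. It assumes UTF-8 as the Unicode encoding (the slide does not name one), and the helper name `utf8_width` is invented for illustration.

```python
def utf8_width(ch):
    """Return how many bytes one character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))

# Every ASCII character fits in 7 bits, so its code is below 128.
assert ord("A") < 128

# UTF-8 extends ASCII: an ASCII character encodes to the same single byte.
assert "A".encode("ascii") == "A".encode("utf-8")

# Characters outside ASCII need more bytes (written as escapes to stay ASCII-safe).
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]  # A, e-acute, euro sign, emoji
widths = [utf8_width(ch) for ch in samples]
print(widths)  # [1, 2, 3, 4]
```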

  5. Conversions • Binary to Hex: Group bits by 4, starting from the low-order (right-hand) end, then map each group to the hex sequence: • 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F • Hex to Binary: Turn each digit into 4 bits
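
The two conversions can be sketched in a few lines of Python. This is only an illustration of the grouping rule; the function names are invented here, and the input is assumed to be a string of "0" and "1" characters (or hex digits for the reverse direction).

```python
HEX_DIGITS = "0123456789ABCDEF"

def binary_to_hex(bits):
    """Group bits by 4 from the right, then map each group to a hex digit."""
    # Pad on the left with zeros so the length is a multiple of 4.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)

def hex_to_binary(hex_digits):
    """Turn each hex digit into exactly 4 bits."""
    return "".join(format(HEX_DIGITS.index(d), "04b") for d in hex_digits.upper())

print(binary_to_hex("110101111"))  # 1AF  (0001 1010 1111)
print(hex_to_binary("1AF"))        # 000110101111
```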

  6. IPv4 Addresses • Four bytes, each of which contains 8 bits • 8 bits can hold any whole number from 0 to 255
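
As a small illustration, the sketch below turns the familiar dotted notation into its four bytes and into the single 32-bit number those bytes form. The function name `parse_ipv4` and the sample address are assumptions made for this example.

```python
def parse_ipv4(address):
    """Convert dotted-decimal text such as '192.168.1.10' into 4 bytes."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError("an IPv4 address has exactly four parts")
    octets = [int(part) for part in parts]
    if any(octet < 0 or octet > 255 for octet in octets):
        raise ValueError("each part must fit in 8 bits (0 to 255)")
    return bytes(octets)

raw = parse_ipv4("192.168.1.10")
print(list(raw))                   # [192, 168, 1, 10]
print(int.from_bytes(raw, "big"))  # 3232235786
```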

  7. Masking • If two numbers are combined with the “AND” operator, a “1” bit will result where both bits were “1” but a “0” bit otherwise • This is how IP Net Masks work • If two numbers are combined with the “OR” operator, a “1” bit will result where either bit was “1”, and a “0” bit will only result if both were “0”.
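
A rough sketch of both operators applied to an IP net mask, assuming addresses are held as lists of four octets. The function names and the sample address and mask are illustrative only. AND keeps the network part; OR with the inverted mask fills the host part with 1 bits, which gives the broadcast address.

```python
def network_address(address, mask):
    """AND each octet with the mask: host bits become 0."""
    return [a & m for a, m in zip(address, mask)]

def broadcast_address(address, mask):
    """OR each octet with the inverted mask: host bits become 1."""
    return [a | (m ^ 0xFF) for a, m in zip(address, mask)]

address = [192, 168, 1, 77]
mask = [255, 255, 255, 240]

print(network_address(address, mask))    # [192, 168, 1, 64]
print(broadcast_address(address, mask))  # [192, 168, 1, 79]

# The same rules on single values:
print(bin(0b1100 & 0b1010))  # 0b1000
print(bin(0b1100 | 0b1010))  # 0b1110
```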

  8. Masking • If we know that a certain bit (let’s say, the one with value 8) indicates whether a charge transaction was accepted, we can use an “AND” mask to access only the relevant bit. Example in PHP: $data = 128 + 16 + 8 + 1; $mask = 8; print $data . "<br />"; $result = $data & $mask; print $result . "<br />"; Result with the bit set: 153, then 8. Result with the bit clear ($data = 128 + 16 + 1): 145, then 0.

  9. Representing Multiple Bits • Parallel: Separate in space • Issue: Synchronizing multiple lines • Issue: Multiplies cost of transmission • Serial: Separate in time • Issue: A single line must carry the whole aggregate bit rate, which may be very high • Current practice is gravitating toward serial in all domains except shortest/fastest (e.g. video display)

  10. Re-Coding Methods • If code size is the same and character set is similar, use a lookup table (e.g. EBCDIC to ASCII) • Video and audio re-coding can be particularly complex. A preliminary or intermediate format: • Preserves all nuances of every representation • Probably native to the equipment you use • Probably consumes a great deal of data space • Example: “raw” format in digital cameras
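
The lookup-table approach might look like the sketch below. It assumes Python's built-in `cp037` codec as the EBCDIC variant (EBCDIC has many code pages, and the slide does not say which one), and it replaces any character with no ASCII equivalent by "?". That is one example of what gets lost in a re-coding.

```python
# Build a 256-entry table: index = EBCDIC byte, value = ASCII byte.
# Assumption: code page 037 stands in for "EBCDIC" here.
_ebcdic_chars = bytes(range(256)).decode("cp037", errors="replace")
EBCDIC_TO_ASCII = bytes(
    ord(ch) if ord(ch) < 128 else ord("?")  # no ASCII equivalent -> '?'
    for ch in _ebcdic_chars
)

def ebcdic_to_ascii(data):
    """Re-code bytes one for one by indexing into the lookup table."""
    return data.translate(EBCDIC_TO_ASCII)

ebcdic_hello = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6])  # "HELLO" in EBCDIC
print(ebcdic_to_ascii(ebcdic_hello))  # b'HELLO'
```

Because both codes use one byte per character, the whole conversion is a single table lookup per byte, with no parsing or state.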

  11. Communicating Data • Issue: How do you synchronize? • Async: start and stop bits on every character cost time in high-volume situations • Sync: Must keep the channel active (special “SYN” pattern that can be re-captured easily) • Issue: Which bit goes first? • RS-232 sends low-order bit first • Issue: Routing versus content? • TCP/IP organizes data into “packets” which include both types of information
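
To show the bit-order and async-overhead points together, here is a sketch of framing one byte for an asynchronous serial line. It assumes the common "8N1" setting (8 data bits, no parity, 1 stop bit), which the slide does not specify, and `frame_byte` is a made-up name.

```python
def frame_byte(value):
    """Return the line bits for one byte: start bit, data LSB first, stop bit."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in 8 bits")
    data_bits = [(value >> i) & 1 for i in range(8)]  # low-order bit first
    return [0] + data_bits + [1]  # start bit = 0, stop bit = 1

frame = frame_byte(ord("A"))  # 'A' is 0x41 = 0b01000001
print(frame)  # [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]

# Only 8 of the 10 bits on the line carry data.
efficiency = 8 / len(frame)
print(efficiency)  # 0.8
```

Under these assumptions, a fifth of the line time goes to framing, which is the "lost time" that a synchronous channel avoids.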

  12. Close to the Metal • As we develop more-complex data management schemes, we use up more computer power • Need extreme speed? • Simplify the requirements • Use available speed for performance, not ease-of-use and management features

  13. The Challenge Continues • Was: How to get data into the computer • Then: How to transfer data to new system • Now: How to establish live (or more lively) links easily • One solution: XML • Provides for structure • Separate schema may be used
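
As a small illustration of XML carrying structure along with the data, the sketch below reads a record with Python's standard `xml.etree.ElementTree` module. The record itself, including every element and attribute name, is invented for this example, and no schema is checked.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: the tag and attribute names are assumptions.
DOCUMENT = """
<transaction id="42">
  <amount currency="USD">19.99</amount>
  <accepted>true</accepted>
</transaction>
"""

def read_transaction(xml_text):
    """Pull typed values out of the XML by element name, not by position."""
    root = ET.fromstring(xml_text.strip())
    amount = root.find("amount")
    return {
        "id": int(root.get("id")),
        "amount": float(amount.text),
        "currency": amount.get("currency"),
        "accepted": root.find("accepted").text == "true",
    }

record = read_transaction(DOCUMENT)
print(record["currency"], record["accepted"])  # USD True
```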
