251 likes | 536 Views
Physical, Logical, Conceptual. DSA Lecture 2 2006-7. Abstraction Layers. WHAT. Conceptual What data is held An Image and its meta-data Entity-Relationship model (ERM) Logical How data is organised in storage Block and Directory structure Tables, keys Physical
E N D
Physical, Logical, Conceptual DSA Lecture 2 2006-7
Abstraction Layers WHAT • Conceptual • What data is held • An Image and its meta-data • Entity-Relationship model (ERM) • Logical • How data is organised in storage • Block and Directory structure • Tables, keys • Physical • How data is stored in bits • JPEG as a stream of bytes • A Database as files and records stored in a DBMS-specific format Realisation (Refinement Reification) (Engineering, Model-Driven development Abstraction (Reverse Engineering) HOW
On Flickr API: Image Data More..
FireFox 1.5.0.7 IE 6.0.2800.1106
JPEG and Exif • This image uncompressed would be • 1520 x 2032 x 3 bytes = 9 265 920 • 9 048.75 • File size is only 342kb • Compression ratio 0.0377952756 • JPEG achieves high compression • Lossy – cannot recover the full raw data • Level of compression is variable • JPEG • Joint Photographic Experts Group developed the standard for compression. • Defines how to create a byte stream for the image alone. • Wikipedia entry describes the algorithm. • JPEG is a BSI standard BS ISO/IEC 15444-6:2003 – we have license for these. • JPEG/JFIF • Defines how these bytes are packaged up unto a file for transmission. • This combines a JPEG compressed image (the ‘data’) with ‘meta-data’ (data about the image) in TIFF format (Tagged Image File Format) • Exif defines a set of tags and coded values to define this meta-data • Specification is 154 pages
Meta data • “Data about Data” • The image is the Data • The make of the camera is metadata • The dangers • Thumbnails not updated with the main image • Date/time, location information
Understanding the Physical Layer • Description of the implementation • Standard manual • Informal explanation • Hex viewer or editor • XVI32 (free) • 010 editor (30 day free, $49.95 US license) • Issues • How numbers are stored • How characters are stored • How strings are stored • How data is identified • How data is grouped • ..
Hexadecimal • Hex – 6 in Latin – Hexagon • Deca – 10 in Latin – Decimal • Hexadecimal • Base 16 - 4 bits • Digits 0 – F (0 -15)
Hexadecimal - Decimal • Nibble • 4 bits • 1 hex character • Byte • 8 bits • 2 nibbles • 2 hex characters • D8 • 13 x 16 + 8 = ? • 3C FA • 3 12 15 10 • 3 x 16^3 + 12 x 16 ^2 + 15 x 16 + 10 • ((3 x 16 +12) x 16 + 15) x 16 +10 = ?
All JPEGs start with these bytes Marker : Start of Block Length of Block ‘Exif’ header Start of Data Intel Byte Order Start of IFD Tag No - 42 Offset to IFD (Image File Directory): 8 bytes
Big-endian / Little-endian • Big-endian • Bytes in the order from most significant to least significant • 3C FA = 3C x 256 + FA = 15 610 • Motorola • coded MM in Exif • Little-endian • Bytes in the order from least significant to most significant • 2A 00 = 00 x 256 + 2A = 32 • Intel processors • coded II in Exif • Endianness • Affects addresses and dates • UK addresses are little-endian, Japanese big-endian • Email addresses little-endian, File paths big-endian • IP addresses ?
No of directory entries (11) First entry Tag 010F ‘Make’ Data Type (string) String Length (24) CAMERA ‘Make’ Value (24 bytes) Relative Offset = 92 Absolute Position =Offset + start(0C) spaces Null (end of C string)
Exercise • Each directory entry is 12 bytes long • Entry Type is coded – see table • Tags are coded • Decode 3 entries - distributed around the class
Workshop • Flickr Review • Review approach to research – see the blog for ideas • Get into groups • Physical layer work • The XVI32 editor can be downloaded from the autor’s site or run from the J: drive in J:/xvi • Find a Jpeg and identify some items of metadata • The lecture example is linked here • Find a MP3 file – it probably contains ID3 metadata – see the ID3 site for documentation – and identify some metadata. • Why don’t ID3 and Exif use the same logical structure using directories?