Content Types: Markup and Multimedia
Introduction • Markup languages use extra textual syntax to encode: • Formatting / display information • Structure information • Descriptive metadata • Semantic metadata • Marks are often called tags • The act of adding markup is called tagging • Most markup languages surround the marked text with a start tag and an end tag
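The idea of start and end tags surrounding the marked text can be sketched in a few lines of Python. This is only an illustration: it uses the standard-library `html.parser` module, and the names `TagLogger` and `tag_events` and the sample sentence are made up for this example.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record start tags, end tags, and the text they surround."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Ignore whitespace-only runs between tags.
        if data.strip():
            self.events.append(("text", data.strip()))


def tag_events(marked_text):
    """Return the sequence of markup and text events in marked_text."""
    logger = TagLogger()
    logger.feed(marked_text)
    logger.close()
    return logger.events


# Hypothetical sample: <em> marks the word "tags" inside a paragraph.
events = tag_events("<p>Marks are often called <em>tags</em>.</p>")
for kind, value in events:
    print(kind, value)
```

The event list makes the split visible: the tags carry the markup, and everything between a start tag and its matching end tag is the marked text.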
Standard Generalized Markup Language (SGML) • Metalanguage for markup • Includes rules for defining a markup language • Use of SGML includes: • Description of the structure of the markup • Text marked with tags • Document Type Definition (DTD) • Describes and names the tags and how they are related • Comments are used to express the interpretation of tags (meaning, presentation, …)
SGML DTD Example
<!-- SGML DTD for electronic messages -->
<!ELEMENT e-mail - - (prolog, contents) >
<!ELEMENT prolog - - (sender, address+, subject?, Cc*) >
<!ELEMENT (sender | address | subject | Cc) - O (#PCDATA) >
<!ELEMENT contents - - (par | image | audio)+ >
<!ELEMENT par - O (ref | #PCDATA)+ >
<!ELEMENT ref - O EMPTY >
<!ELEMENT (image | audio) - - (#NDATA) >
<!ATTLIST e-mail
    id ID #REQUIRED
    date_sent DATE #REQUIRED
    status (secret | public) public >
<!ATTLIST ref
    idref IDREF #REQUIRED >
<!ATTLIST (image | audio)
    id ID #REQUIRED >
SGML Example
<!DOCTYPE e-mail SYSTEM "e-mail.dtd">
<e-mail id=94108rby date_sent=02101998>
  <prolog>
    <sender> Pablo Neruda </sender>
    <address> Federico Garcia Lorca </address>
    <address> Ernest Hemingway </address>
    <subject> Picture of my house in Isla
    <Cc> Gabriel Garcia Marquez </Cc>
  </prolog>
  <contents>
    <par>
      Here are two photos. One is of the view (photo <ref idref=F2>).
    </par>
    <image id=F1> "photo1.gif" </image>
    <image id=F2> "photo2.jpg" </image>
  </contents>
</e-mail>
SGML Characteristics • A DTD makes it possible to check whether a given document is valid, i.e. conforms to the declared structure • SGML generally does not specify presentation/appearance • Output specification standards: • DSSSL (Document Style Semantics and Specification Language) • FOSI (Formatting Output Specification Instance)
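Part of checking a document against a DTD is checking each element's children against its content model. A content model such as `(sender, address+, subject?, Cc*)` behaves much like a regular expression over element names. The sketch below hand-translates that one model into a Python regex. The names `PROLOG_MODEL` and `matches_model` are invented for illustration, and a real SGML parser does far more (tag minimization, attributes, entities).

```python
import re

# Hand translation (an assumption of this sketch) of the content model
#   (sender, address+, subject?, Cc*)
# Each child element name is followed by one space in the sequence string.
PROLOG_MODEL = re.compile(r"sender (address )+(subject )?(Cc )*")


def matches_model(child_names, model=PROLOG_MODEL):
    """Return True if the ordered child element names fit the content model."""
    sequence = "".join(name + " " for name in child_names)
    return model.fullmatch(sequence) is not None


# Children of <prolog> in the e-mail example: one sender, two addresses,
# a subject, and one Cc.
ok = matches_model(["sender", "address", "address", "subject", "Cc"])

# address+ requires at least one address, so this sequence is rejected.
missing_address = matches_model(["sender", "subject"])

# The comma in the model fixes the order: sender must come first.
wrong_order = matches_model(["address", "sender"])

print(ok, missing_address, wrong_order)
```

The operators map directly: `,` is sequence, `+` is one or more, `?` is optional, and `*` is zero or more.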
HyperText Markup Language (HTML) • Based on SGML • The HTML DTD is usually not explicitly referenced by documents • HTML documents can have other content embedded within them: • Images or audio • Frames with other HTML documents • When programs (scripts) are included, the result is referred to as Dynamic HTML • Strict HTML includes only non-presentational markup • Cascading Style Sheets (CSS) are used to define presentation • In practice, HTML authoring applications blend presentational and structural markup
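The embedded content mentioned above (images, audio, frames, scripts) shows up in the markup as elements that point at another resource. Below is a minimal sketch that lists them with the standard-library `html.parser`. The set `EMBED_TAGS`, the class `EmbedCollector`, and the sample page with its file names are assumptions made for this example, not a complete list of embedding mechanisms.

```python
from html.parser import HTMLParser

# Elements treated as "embedding" for this sketch (an assumed, partial list).
EMBED_TAGS = {"img", "audio", "frame", "iframe", "script"}


class EmbedCollector(HTMLParser):
    """Collect (tag, src) pairs for elements that embed another resource."""

    def __init__(self):
        super().__init__()
        self.embedded = []

    def handle_starttag(self, tag, attrs):
        src = dict(attrs).get("src")
        if tag in EMBED_TAGS and src:
            self.embedded.append((tag, src))


def find_embedded(html_text):
    """Return the embedded resources referenced by html_text, in order."""
    collector = EmbedCollector()
    collector.feed(html_text)
    collector.close()
    return collector.embedded


# Hypothetical page: one image, one frame, one script.
page = """<html>
<head><script src="clock.js"></script></head>
<frameset>
  <frame src="menu.html">
</frameset>
<body><img src="photo1.gif"></body>
</html>"""

embedded = find_embedded(page)
print(embedded)
```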
HTML Limitations • In contrast to SGML: • Users cannot specify their own tags or attributes. • No support for nested structures that can represent database schemas or object-oriented hierarchies. • No support for validation of document by consuming applications.
eXtensible Markup Language (XML) • XML is a simplified subset of SGML • XML is a metalanguage • XML is designed for semantic markup that is both human- and machine-readable • No DTD is required • All tags must be closed • Extensible Stylesheet Language (XSL) • XML equivalent of CSS • Can be used to convert XML into HTML and CSS
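Two of the points above, "no DTD is required" and "all tags must be closed", can be tried out with Python's standard `xml.etree.ElementTree`. The sketch rewrites a fragment of the earlier e-mail example as XML (quoted attribute values, every tag closed). The helper `is_well_formed` and the variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML; no DTD is consulted."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The e-mail prolog as XML: attribute values quoted, every tag closed.
GOOD = """<e-mail id="94108rby" date_sent="02101998">
  <prolog>
    <sender>Pablo Neruda</sender>
    <address>Federico Garcia Lorca</address>
  </prolog>
</e-mail>"""

# Dropping one end tag is enough to make the document ill-formed.
BAD = GOOD.replace("</sender>", "")

root = ET.fromstring(GOOD)
sender = root.find("prolog/sender").text
message_id = root.get("id")

print(is_well_formed(GOOD), is_well_formed(BAD), sender, message_id)
```

Unlike the SGML version, where the DTD allowed some end tags to be omitted, the XML parser rejects the document as soon as one end tag is missing.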
Multimedia • Lots of data file formats for non-textual data • Images • BMP, GIF, JPEG (JPG), TIFF • Audio • AU, MIDI, WAVE, MP3 • Video • MPEG, AVI, QuickTime • Graphics / Virtual Environments • CGM, VRML, OpenGL
Audio and Video • Data files often have: • Header • Indicates time granularity, number of channels, bits per channel • Somewhat like a DTD • Data • The signal • Data may be compressed • Data may be in frequency domain rather than time domain • Data may be encoded as sequence of differences between consecutive time segments.
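The header/data split and the idea of difference encoding can be sketched with the standard-library `wave` module. The sample values, the 8000 Hz rate, and the helper names are assumptions chosen for this example. The delta coder works per sample, which is a simplification of the "differences between consecutive time segments" described above.

```python
import io
import struct
import wave


def write_wav(samples, sample_rate=8000):
    """Pack 16-bit mono samples into an in-memory WAVE file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(1)           # number of channels
        writer.setsampwidth(2)           # 2 bytes = 16 bits per sample
        writer.setframerate(sample_rate) # time granularity
        writer.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buffer.getvalue()


def read_header(wav_bytes):
    """Read the header fields that describe how to interpret the data."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as reader:
        return {
            "channels": reader.getnchannels(),
            "bits_per_sample": reader.getsampwidth() * 8,
            "sample_rate": reader.getframerate(),
            "frames": reader.getnframes(),
        }


def delta_encode(samples):
    """Store each sample as its difference from the previous one."""
    previous, deltas = 0, []
    for sample in samples:
        deltas.append(sample - previous)
        previous = sample
    return deltas


def delta_decode(deltas):
    """Rebuild the signal by accumulating the differences."""
    current, samples = 0, []
    for delta in deltas:
        current += delta
        samples.append(current)
    return samples


samples = [100, 102, 105, 105, 101]   # made-up signal values
header = read_header(write_wav(samples))
deltas = delta_encode(samples)        # [100, 2, 3, 0, -4]

print(header)
print(deltas, delta_decode(deltas) == samples)
```

After the first value, the differences are much smaller than the samples themselves, which is why this kind of encoding tends to compress well for slowly varying signals.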