XML eXtensible Markup Language
XML
• A method of defining a format for exchanging documents and data
• Allows one to define a dialect of XML
  • A library of tags, with associated structure

<config>
  <descriptor type="FILE" name="source">
    <attribute name="media_type" type="svalue"/>
    <attribute name="frame_rate" type="svalue"/>
  </descriptor>
</config>
The Social Benefits
• An interchange format can be specified concisely and accurately enough that setting up a validation service is easy
• Plenty of software is available for handling XML files and translating from one format into another
Downsides
• Sometimes defining a representation can be a pain
  • Deciding what to leave as content and what to move to attributes
  • XML Schemas are confusing, while DTDs do not offer enough control
• Verbose
  • ViPER file size increased about 2x uncompressed, 4/3x gzip compressed
• Difficult to read
  • Lots of </…> and end tags get in the way of the data
The Real Benefits to the Programmer
• XML Schema (or DTDs) let you validate a document without writing code to examine it yourself
• XPath lets you specify a node, or set of nodes, in a document quickly and easily
• SAX makes it easy to write a quick parser
• DOM makes it so you don't even have to do that
• XSLT lets you transform an XML document into another document, possibly not even XML (for example, HTML or plain text)
• Etc.
XML As A File Format
• Makes parsing simpler, but there are currently no methods that make saving easier
• Saves you from dealing with things like character encoding and date formatting
• No more difficult than making up your own format
• An unfamiliar or forgotten XML file grants more affordances than a custom text or binary file
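As a rough illustration of the character-encoding point, the sketch below writes a small document and reads it back with Python's standard `xml.etree.ElementTree` module. The element and attribute names follow the `<config>` example from the earlier slide; the attribute value is made up for the demonstration, and this is only one possible way to do it.

```python
import xml.etree.ElementTree as ET

# Build a tiny document shaped like the <config> example.
# The value of "name" is invented to exercise escaping and non-ASCII text.
tricky_name = "café & <source>"
config = ET.Element("config")
ET.SubElement(config, "descriptor", {"type": "FILE", "name": tricky_name})

# Serialize to UTF-8 bytes; the library escapes & and < in the attribute value.
data = ET.tostring(config, encoding="utf-8")

# Parse the bytes back; escaping and encoding are undone by the parser.
restored = ET.fromstring(data)
round_tripped = restored.find("descriptor").get("name")

print(data)
print(round_tripped)
```

The application code never touches entity escapes or byte encodings directly; both directions are handled by the XML library.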
Defining A Dialect
• XML Schema: structure and data
  • Define elements and attributes
  • Associate them with data types

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://lamp.cfar.umd.edu/viper"
            xmlns:viper="http://lamp.cfar.umd.edu/viper"
            elementFormDefault="qualified">
  <xsd:element name="viper"/>
  <xsd:element name="config"/>
</xsd:schema>
Schema Datatypes
• Can create and assign datatypes to attributes and elements. For example:

<xsd:element name="data" type="xsd:base64Binary"/>
<xsd:attribute name="span" type="viper:framespanType"/>
<xsd:simpleType name="framespanType">
  <xsd:restriction base="xsd:string">
    <xsd:pattern value="\d+:\d+"/>
  </xsd:restriction>
</xsd:simpleType>
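To get a feel for what the framespanType pattern accepts (digits, a colon, digits), the check can be imitated with an ordinary regular expression. This is only a sketch of the pattern facet in Python, not schema validation; the function name `is_framespan` is made up for illustration.

```python
import re

# Digits, a colon, digits: the same shape as the framespanType pattern.
# XML Schema patterns must match the whole value, so fullmatch() is used
# to mimic that implicit anchoring.
FRAMESPAN = re.compile(r"\d+:\d+")


def is_framespan(value: str) -> bool:
    """Return True if value has the form 'start:end', such as '1:250'."""
    return FRAMESPAN.fullmatch(value) is not None


for candidate in ["1:250", "1-250", ":5", "12:"]:
    print(candidate, is_framespan(candidate))
```

A real validator would apply this check automatically to every `span` attribute, which is the point of declaring the type in the schema.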
Schema Structures
• Can specify order and contents of elements
• Sequence, choice, mixed, etc. allow specifying how and where elements appear
• Substitution groups allow one tag to take the place of another
• Can group elements without placing them into types
Extensibility
• Inheritance
  • Can extend complex types by adding more attributes and elements to the end
  • Can restrict the data using the <restriction/> elements
• The <any/> and <anyAttribute/> elements
  • The ultimate in extensibility: allow any valid XML from a given namespace or range of namespaces
Parsing
• Using the DOM:
  • The DOM provides a tree structure that represents the document
  • Memory heavy
• Using SAX:
  • Event driven
  • Lightweight
  • Better for large documents
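The two styles can be compared side by side. The sketch below pulls the attribute names out of the `<config>` example twice, once with Python's `xml.dom.minidom` and once with `xml.sax`; the handler class name `AttributeNames` is invented for this example.

```python
import xml.sax
from xml.dom import minidom

DOC = """<config>
  <descriptor type="FILE" name="source">
    <attribute name="media_type" type="svalue"/>
    <attribute name="frame_rate" type="svalue"/>
  </descriptor>
</config>"""

# DOM: the whole document is loaded into a tree that can be queried freely.
dom = minidom.parseString(DOC)
dom_names = [node.getAttribute("name")
             for node in dom.getElementsByTagName("attribute")]


# SAX: the parser calls back on each start tag; nothing is stored
# unless the handler chooses to store it.
class AttributeNames(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        if name == "attribute":
            self.names.append(attrs.get("name"))


handler = AttributeNames()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_names = handler.names

print(dom_names)
print(sax_names)
```

The DOM version is shorter, but it holds the full tree in memory; the SAX version keeps only the list it builds, which is why it scales better to large documents.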
XPath
• The common language for selecting individual pieces of an XML document, shared between XPointer and XSLT
• Also used for defining uniqueness constraints in Schemas
• DOM Level 3 will support selecting by XPath
• Looks sort of like a JavaScript DOM call:
  /viper/config/descriptor[@type="FILE"]
  • Selects all of the descriptor nodes whose type attribute is "FILE"
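The expression above can be tried out with the limited XPath subset in Python's `xml.etree.ElementTree`. This is a sketch under some assumptions: the sample document is made up (including the second descriptor with type "OBJECT"), and namespaces are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only the element names come from the slides.
DOC = """<viper>
  <config>
    <descriptor type="FILE" name="source"/>
    <descriptor type="OBJECT" name="person"/>
  </config>
</viper>"""

root = ET.fromstring(DOC)  # root is the <viper> element

# ElementTree paths are relative to the element they are called on, so the
# absolute path /viper/config/descriptor[...] becomes config/descriptor[...].
matches = root.findall("config/descriptor[@type='FILE']")
names = [d.get("name") for d in matches]

print(names)
```

A full XPath engine would accept the absolute form directly; the predicate syntax in square brackets is the same in both cases.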
Resources
• www.xml.com
  • O'Reilly's XML resource
• www.w3.org
  • The standards themselves, and lots of good links to implementations
• xml.apache.org
  • DOM, SAX, and XSLT for C++ and Java