Extensible Markup Language (XML) CS422 Dick Steflik
What is XML • A markup language for giving a text document contextual structure • its parent is Standard Generalized Markup Language (SGML; ISO 8879) • specifies a document's structure and attributes, not processing • should be declarative • a set of rules for encoding documents that is both human and machine readable
Things to note in the example • Every start tag is paired with an end tag • end tags have the same name preceded by "/" • a start tag, its end tag, and the content between them constitute an XML element • Tags are in lower case by convention; XML is case sensitive, so a start tag and its end tag must match exactly • Documents must be "well formed" • tags may be nested one inside another (never cross matched) • every opening tag must have a closing tag (or use the empty-element form, ex. <tag/>)
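The nesting and pairing rules can be checked mechanically by handing a document to any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample strings are made up for illustration and are not part of the course material.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Properly nested: every opening tag is closed in reverse order of opening.
nested = "<message><subject>Hi</subject><text>Body</text></message>"
# Cross matched: <text> is opened inside <subject> but closed outside it.
crossed = "<message><subject>Hi<text></subject>Body</text></message>"
# Missing end tag: <subject> is never closed.
unclosed = "<message><subject>Hi</message>"

for sample in (nested, crossed, unclosed):
    print(is_well_formed(sample), sample)
```

Only the first sample is accepted; the parser rejects the other two before any application code sees the data.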
Tag Attributes • Every tag may have a set of attributes • specified as part of the start tag as name/value pairs (ex. id="abc") • values must be quoted; valueless keywords (ex. noform), which SGML and HTML allow, are not well formed in XML • attributes specify additional information about the tag • attributes are separated by one or more spaces • commas will generate errors
Attribute example
<message to="you@yourAddress.com" from="me@myAddress.com">
  <subject>Another XML Example</subject>
  <text> This is the message body. </text>
</message>
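To show how an application sees attributes, here is a short sketch that parses a quoted form of the message above with Python's standard `xml.etree.ElementTree` and reads the name/value pairs. The `cc` attribute and the `example.com` addresses are hypothetical and only illustrate a missing attribute and the comma rule.

```python
import xml.etree.ElementTree as ET

doc = """<message to="you@yourAddress.com" from="me@myAddress.com">
  <subject>Another XML Example</subject>
  <text> This is the message body. </text>
</message>"""

message = ET.fromstring(doc)

# Attributes arrive as a plain dictionary of name/value pairs.
print(message.attrib)
# Individual lookup; the second argument is a default for absent attributes.
print(message.get("to"))
print(message.get("cc", "(none)"))

# Separating attributes with a comma is not well formed and is rejected.
try:
    ET.fromstring('<message to="a@example.com", from="b@example.com"/>')
    comma_ok = True
except ET.ParseError:
    comma_ok = False
print("comma accepted:", comma_ok)
```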
XML Prolog • XML files should start with a prolog line (the XML declaration) • <?xml version="1.0"?> • other attributes: • encoding – identifies the character set used to encode the data • standalone – identifies that the document stands alone, i.e. doesn't require any external references.
Example
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<message>
  <to>you@youraddress.com</to>
  <from>me@myaddress.com</from>
  <subject>Another XML Example</subject>
  <text> This is the message body. </text>
</message>
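The encoding attribute matters as soon as the data contains non-ASCII characters. The sketch below, again using Python's `xml.etree.ElementTree`, stores a message as ISO-8859-1 bytes and lets the parser decode it based on the declaration. The subject text "Café menu" is an invented value chosen because "é" is a single byte (0xE9) in ISO-8859-1.

```python
import xml.etree.ElementTree as ET

xml_text = (
    '<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>\n'
    "<message>\n"
    "  <to>you@youraddress.com</to>\n"
    "  <from>me@myaddress.com</from>\n"
    "  <subject>Caf\u00e9 menu</subject>\n"
    "</message>"
)

# Encode the document the way the declaration says it is encoded.
raw = xml_text.encode("iso-8859-1")

# Parsing from bytes makes the parser honour the declared encoding.
root = ET.fromstring(raw)
subject = root.findtext("subject")
print(subject)
```

If the bytes and the declared encoding disagree, the parser may either fail or silently produce wrong characters, which is why the declaration should describe how the file was actually saved.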
Comments in XML files • <!-- comment text --> • the string "--" may not appear inside a comment
Processing Instructions • Since XML is a portable document format, the same document may be processed by a number of applications; this processing can be specified in the file • Each instruction should be of the form: • <?target instructions ?> • target – the name of the processing application • instructions – a string of characters that specify the processing commands or parameters
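As an illustration of the target/instructions split, the following sketch reads processing instructions with Python's standard `xml.dom.minidom`. It uses the widely used `xml-stylesheet` target as the example; the file name `message.xsl` is hypothetical.

```python
from xml.dom import minidom

doc_text = (
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/xsl" href="message.xsl"?>'
    "<message><subject>Another XML Example</subject></message>"
)

document = minidom.parseString(doc_text)

# Collect (target, instructions) for each processing instruction that
# appears at the top level of the document, before the root element.
instructions = [
    (node.target, node.data)
    for node in document.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]
print(instructions)
```

Note that the XML declaration itself is not reported as a processing instruction, even though it uses the same `<? ... ?>` delimiters.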
Why is XML Important? • Not a binary format • can be transported across a network easily • easy to create manually or programmatically • makes debugging easier • can describe very complex objects • easy to store in a database • more scalable than binary
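As a small illustration of creating XML programmatically, this sketch builds the message document from the earlier slides with Python's `xml.etree.ElementTree` and serializes it to text. It is one possible approach, not a method prescribed by the course.

```python
import xml.etree.ElementTree as ET

# Build the element tree in memory.
message = ET.Element("message")
ET.SubElement(message, "to").text = "you@youraddress.com"
ET.SubElement(message, "from").text = "me@myaddress.com"
ET.SubElement(message, "subject").text = "Another XML Example"
ET.SubElement(message, "text").text = "This is the message body."

# Serialize to a plain string that can be saved, logged, or sent.
serialized = ET.tostring(message, encoding="unicode")
print(serialized)

# Because the result is plain text, it can be parsed back unchanged.
round_trip = ET.fromstring(serialized)
print(round_trip.findtext("subject"))
```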
Data Identification • Since the tags describe the structure of the data, it makes the same data more usable by multiple applications • looking at the previous example: • it is easily searchable by a search program • easily displayable by a viewer • easy to store in a database
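Because the tags name each piece of data, a search program needs no knowledge of the layout beyond the tag names. A minimal sketch, assuming a hypothetical `<mailbox>` wrapper holding several messages of the kind shown earlier; the `subjects_from` helper and the extra messages are invented for illustration.

```python
import xml.etree.ElementTree as ET

mailbox = ET.fromstring("""<mailbox>
  <message>
    <from>me@myaddress.com</from>
    <subject>Another XML Example</subject>
  </message>
  <message>
    <from>boss@example.com</from>
    <subject>Status report</subject>
  </message>
  <message>
    <from>me@myaddress.com</from>
    <subject>Lunch?</subject>
  </message>
</mailbox>""")


def subjects_from(root, sender):
    """Return the subjects of all messages whose <from> equals `sender`."""
    return [
        msg.findtext("subject")
        for msg in root.findall("message")
        if msg.findtext("from") == sender
    ]


print(subjects_from(mailbox, "me@myaddress.com"))
```

The same tree could just as easily be walked by a viewer or mapped to database columns, since each value is labeled by its element name.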
Stylability • For applications where rendering is important (word processors, browsers, publishing) use Extensible Stylesheet Language (XSL) • XSLT • XSL-FO • XPath
Inline reusability • Unlike HTML, XML documents can include other inline documents • this allows the construction of very complex objects from: • other simpler objects • other hosts
XML Parsers • To make the data from an XML document useful it must be parsed out of the document. This can be easily done two ways • SAX (Simple API for XML) • Java API that parses XML and reports the data as the tags are encountered • DOM (Document Object Model) • as an XML or XHTML document is loaded into the browser it is parsed into a document tree and then made available for processing via JavaScript • More on DOM and SAX later in the course
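The difference between the two styles can be sketched outside Java and the browser as well. The example below uses Python's standard `xml.sax` and `xml.dom.minidom` modules as stand-ins for the Java SAX API and the browser DOM named above; the `TagLogger` class is a hypothetical handler written for this illustration.

```python
import xml.sax
from xml.dom import minidom

DOC = (
    "<message>"
    "<to>you@youraddress.com</to>"
    "<subject>Another XML Example</subject>"
    "</message>"
)


class TagLogger(xml.sax.ContentHandler):
    """SAX style: the parser calls these methods as it meets each tag."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, content):
        # Text may arrive in several chunks; skip whitespace-only pieces.
        if content.strip():
            self.events.append(("text", content))

    def endElement(self, name):
        self.events.append(("end", name))


# SAX: data is delivered as a stream of events; no tree is kept.
handler = TagLogger()
xml.sax.parseString(DOC.encode("utf-8"), handler)
for event in handler.events:
    print(event)

# DOM: the whole document is loaded into a tree, then queried.
tree = minidom.parseString(DOC)
subject = tree.getElementsByTagName("subject")[0].firstChild.data
print(subject)
```

SAX suits large documents read once from start to finish, while DOM suits documents that need to be navigated or modified after loading.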