Introducing XML: Table of Contents

1. From HTML to XML
2. Well-Formed XML
3. Validity / DTDs
4. Encodings
5. XML Namespaces
6. XML Schema
7. XML Tools
8. XML APIs / SAX
9. XML APIs / DOM
10. Stylesheets : CSS & XSL
11. XML Query Language (XQL)
1. From HTML to XML

    <HTML>
    <HEAD><TITLE>Drei Sonaten und drei Partiten für Violine solo</TITLE></HEAD>
    <BODY>
    <H1>Drei Sonaten und drei Partiten für Violine solo</H1>
    <P>Publisher : Bärenreiter</P>
    <P>Composer : J. S. Bach</P>
    …

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!DOCTYPE music SYSTEM "../DTDs/music.dtd">
    <music ismn="M-006-46489-0" type="concert">
    <title>Drei Sonaten und drei Partiten für Violine solo</title>
    <publisher>Bärenreiter</publisher>
    <composer>J. S. Bach</composer>
    ...
2. Well-Formed XML

An XML document is called well-formed if

1. It starts with the XML declaration, e.g. <?xml version="1.0"?> (recommended, although XML 1.0 makes the declaration optional), ...
2. The tags are properly nested,
3. There is exactly one root element,
...

The XML spec : http://www.w3.org/TR/REC-xml
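
The nesting and single-root rules can be tried out with any XML parser. The following is a minimal sketch using the JAXP parser that ships with the JDK; the class name WellFormedCheck and the sample strings are made up for illustration. A fatal parse error is taken to mean "not well-formed".

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {

    // Returns true if the JAXP parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser reported a well-formedness violation
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String decl = "<?xml version=\"1.0\"?>";
        // Properly nested, one root element.
        System.out.println(isWellFormed(decl + "<music><title>Sonata I</title></music>")); // true
        // End tags in the wrong order.
        System.out.println(isWellFormed(decl + "<music><title>Sonata I</music></title>")); // false
        // Two root elements.
        System.out.println(isWellFormed(decl + "<music/><music/>")); // false
    }
}
```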
3. Validity / DTDs

An XML document is valid if it has an associated document type declaration and complies with the constraints expressed in it.

    <!-- DTD for Music -->
    <!ELEMENT music (title, publisher, composer?, opus?, remarks?, instruments?, pieces)>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT publisher (#PCDATA)>
    <!ELEMENT composer (#PCDATA)>
    <!ATTLIST music ismn CDATA #IMPLIED
    ...

A Gentle Introduction to SGML : http://www-tei.uic.edu/orgs/tei/sgml/teip3sg/
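
Validity can be checked by switching a parser into validating mode. The sketch below uses the JDK's JAXP parser and, to stay self-contained, a hypothetical trimmed-down DTD placed in the internal subset rather than the full music.dtd; the class name DtdValidation is likewise an assumption. Validity violations arrive as recoverable errors, so the handler turns them into exceptions.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class DtdValidation {

    // Hypothetical, shortened DTD (not the full music.dtd).
    static final String DOCTYPE =
        "<!DOCTYPE music [\n"
        + "<!ELEMENT music (title, publisher, composer?)>\n"
        + "<!ELEMENT title (#PCDATA)>\n"
        + "<!ELEMENT publisher (#PCDATA)>\n"
        + "<!ELEMENT composer (#PCDATA)>\n"
        + "<!ATTLIST music ismn CDATA #IMPLIED>\n"
        + "]>\n";

    // Returns true if the document parses without any validity error.
    static boolean isValid(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // enable DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e; // treat validity errors as failures
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String ok = DOCTYPE + "<music ismn=\"M-006-46489-0\">"
            + "<title>Drei Sonaten</title><publisher>B\u00e4renreiter</publisher></music>";
        String missingPublisher = DOCTYPE + "<music><title>Drei Sonaten</title></music>";
        System.out.println(isValid(ok));
        System.out.println(isValid(missingPublisher));
    }
}
```

A document without any document type declaration is well-formed but, checked this way, not valid.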
4. Encodings

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!DOCTYPE music SYSTEM "../DTDs/music.dtd">
    <music ismn="M-006-46489-0" type="concert">
    <title>Drei Sonaten und drei Partiten für Violine solo</title>
    <publisher>Bärenreiter</publisher>
    …

• ISO-8859-1: Western European, one byte per character, superset of US-ASCII
• UTF-8: one to four bytes per character, superset of US-ASCII
• UTF-16: two bytes per character (four for characters outside the Basic Multilingual Plane), byte order (endianness) must be known
• UTF-32: four bytes per character, byte order (endianness) must be known

Unicode Consortium Homepage: http://www.unicode.org/
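
The size differences are easy to measure. This sketch encodes the publisher name from the example and, as an added illustration not taken from the slides, the musical G clef (U+1D11E), which lies outside the Basic Multilingual Plane. The class name EncodingSizes is made up; the UTF-32 size is computed as four bytes per code point instead of being encoded, because older JDKs do not guarantee a UTF-32 charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingSizes {

    // Number of bytes the text occupies in the given encoding.
    static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    // UTF-32 uses exactly four bytes per Unicode code point.
    static int utf32Length(String text) {
        return 4 * text.codePointCount(0, text.length());
    }

    public static void main(String[] args) {
        String publisher = "B\u00e4renreiter"; // 11 characters, one of them non-ASCII
        System.out.println(byteLength(publisher, StandardCharsets.ISO_8859_1)); // 11
        System.out.println(byteLength(publisher, StandardCharsets.UTF_8));      // 12
        System.out.println(byteLength(publisher, StandardCharsets.UTF_16BE));   // 22
        System.out.println(utf32Length(publisher));                             // 44

        String clef = "\uD834\uDD1E"; // U+1D11E, a surrogate pair in UTF-16
        System.out.println(byteLength(clef, StandardCharsets.UTF_8));    // 4
        System.out.println(byteLength(clef, StandardCharsets.UTF_16BE)); // 4

        // Endianness: the same character, two different byte orders.
        byte[] big = "B".getBytes(StandardCharsets.UTF_16BE);    // 00 42
        byte[] little = "B".getBytes(StandardCharsets.UTF_16LE); // 42 00
        System.out.println(big[0] + " " + big[1] + " / " + little[0] + " " + little[1]);
    }
}
```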
5. XML Namespaces

XML namespaces provide a simple method for qualifying element and attribute names used in XML documents by associating them with namespaces identified by URI references.

    <library xmlns:m="http://www.somewhere.com/">
    <m:music><m:title>…
    <m:music><m:title>…
    ...
    </library>

W3C Rec., Namespaces in XML : http://www.w3.org/TR/REC-xml-names/
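
What matters to a program is the namespace URI, not the prefix. The sketch below assumes the JDK's JAXP DOM parser and a made-up class name NamespaceLookup; the completed library document in main is illustrative. It counts elements by (URI, local name) and therefore finds them whatever prefix the document chose.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class NamespaceLookup {

    // Counts the elements with the given namespace URI and local name.
    static int countInNamespace(String xml, String namespaceUri, String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // JAXP parsers ignore namespaces by default
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagNameNS(namespaceUri, localName).getLength();
        } catch (SAXException | IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String uri = "http://www.somewhere.com/";
        String library = "<library xmlns:m=\"" + uri + "\">"
            + "<m:music><m:title>Sonata I</m:title></m:music>"
            + "<m:music><m:title>Partita I</m:title></m:music>"
            + "</library>";
        System.out.println(countInNamespace(library, uri, "music")); // 2
    }
}
```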
6. XML Schema

    <?xml version='1.0'?>
    <schema name='music' version='1.0'>
    <elementType name='music'>
    <sequence>
    <elementTypeRef name='title'/>
    <elementTypeRef name='publisher'/>
    <elementTypeRef name='composer' minOccur="0" maxOccur="1"/>
    …
    <attrDecl name='ismn'>
    <datatypeRef name='string'/>
    </attrDecl>
    ...

XML Schema Part 1: Structures : http://www.w3.org/TR/xmlschema-1/
XML Schema Part 2: Datatypes : http://www.w3.org/TR/xmlschema-2/
7. XML Tools

XML Parsers
• expat, James Clark, C, http://www.jclark.com/xml/expat.html
• XP, James Clark, Java, http://www.jclark.com/xml/xp/
• IE5
• XJParser, DataChannel, http://xdev.datachannel.com/downloads/xjparser/

XML Editors
• Notepad
• XMetaL, SoftQuad, http://www.softquad.com/products/xmetal/index.html
• Adept, Arbortext, http://www.arbortext.com/Products/ADEPT_Series/adept_series.html
8. XML APIs / SAX

SAX 1.0: a free API for event-based XML parsing

    import org.xml.sax.Parser;
    import org.xml.sax.DocumentHandler;
    import org.xml.sax.helpers.ParserFactory;

    Parser parser = ParserFactory.makeParser("com.microstar.xml.SAXDriver");
    DocumentHandler handler = new MyHandler();
    parser.setDocumentHandler(handler);
    parser.parse("http://pcjhb.software-ag.de/xml/closedXml/instances/m1.xml");
    ...

    import org.xml.sax.HandlerBase;
    import org.xml.sax.AttributeList;

    public class MyHandler extends HandlerBase {
        public void startElement (String name, AttributeList atts) {
        ...

SAX Homepage : http://www.megginson.com/SAX/
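
The SAX 1.0 interfaces used above (Parser, DocumentHandler, HandlerBase) have since been deprecated in favour of SAX2. As a rough SAX2 counterpart, and not the code from the slide, the following sketch uses the JDK's SAXParserFactory and DefaultHandler to record the name of every element as its start tag is reported. The class name ElementLister is made up, and the input is a string rather than a URL so that the example runs on its own.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class ElementLister {

    // Returns the element names in the order their start tags are reported.
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add(qName); // called once per start tag, in document order
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException | IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<music><title>Drei Sonaten</title>"
            + "<publisher>B\u00e4renreiter</publisher></music>";
        System.out.println(elementNames(xml)); // [music, title, publisher]
    }
}
```

The parser never builds a tree; the handler sees each event once, which is why SAX suits large documents.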
9. XML APIs / DOM

The Document Object Model (DOM) is a platform- and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure and style of XML and HTML documents.

    import org.w3c.dom.*;
    import com.docuverse.dom.*;

    DOM dom = new com.docuverse.dom.DOM();
    ...
    doc = dom.openDocument(url);
    BasicElement rootNode = (BasicElement) doc.getDocumentElement();
    NodeList list = rootNode.getChildNodes();
    // The start of this statement is missing on the slide; System.out.println is assumed.
    System.out.println(Integer.toString(list.getLength()) + " Children");
    ...

DOM Level 1 Specification : http://www.w3.org/TR/REC-DOM-Level-1/
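
The same child count can be reproduced with the DOM parser bundled with the JDK instead of the Docuverse SDK. This is a sketch under that substitution, with a made-up class name DomChildren. It also shows a common surprise: getChildNodes() counts whitespace text nodes as well as elements.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomChildren {

    // Parses the text and returns the document's root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (SAXException | IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // All child nodes of the root: elements, text, comments, ...
    static int childNodeCount(String xml) {
        return parseRoot(xml).getChildNodes().getLength();
    }

    // Only the child nodes of the root that are elements.
    static int childElementCount(String xml) {
        NodeList list = parseRoot(xml).getChildNodes();
        int count = 0;
        for (int i = 0; i < list.getLength(); i++) {
            if (list.item(i).getNodeType() == Node.ELEMENT_NODE) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String compact = "<music><title>a</title><publisher>b</publisher></music>";
        String indented = "<music>\n  <title>a</title>\n</music>";
        System.out.println(childNodeCount(compact) + " Children");     // 2 Children
        System.out.println(childNodeCount(indented) + " Children");    // 3 Children
        System.out.println(childElementCount(indented) + " Elements"); // 1 Elements
    }
}
```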
10. Stylesheets

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <?xml-stylesheet type="text/xsl" href="music.xsl"?>
    <!DOCTYPE music SYSTEM "music.dtd">
    <music ismn="M-006-46489-0" type="concert">
    …

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
    ...
    <xsl:template match="music">
    <xsl:apply-templates select="opus"/>
    ...

XSL spec : http://www.w3.org/TR/WD-xsl/
The W3C's CSS Homepage : http://www.w3.org/Style/CSS/
DSSSL Page at Mulberry Technologies : http://www.mulberrytech.com/dsssl/index.html
11. XML Query Language (XQL)

The XML Query Language (XQL) is a query language for XML using the structure of XML as its data model.

    <pieces>
    <piece>
    <title>Sonata I</title>
    <opus>BWV 1001</opus>
    <movements>
    <movement><title>Adagio</title></movement>
    <movement><title>Fuga Allegro</title></movement>
    ...

    //piece//movement[title="Adagio"]

A W3C Proposal : http://www.w3.org/TandS/QL/QL98/pp/xql.html
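
The JDK has no XQL engine, but the query shown happens to be a valid XPath 1.0 expression too, so it can be tried with javax.xml.xpath. The sketch below therefore evaluates the expression as XPath, not as XQL; the class name XqlAsXPath and the completed sample document are assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XqlAsXPath {

    // Evaluates the expression as XPath 1.0 and returns the number of matching nodes.
    static int countMatches(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList hits = (NodeList) xpath.evaluate(
                expression,
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
            return hits.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String pieces = "<pieces><piece><title>Sonata I</title><opus>BWV 1001</opus>"
            + "<movements>"
            + "<movement><title>Adagio</title></movement>"
            + "<movement><title>Fuga Allegro</title></movement>"
            + "</movements></piece></pieces>";
        // Movements at any depth below a piece whose title child is "Adagio".
        System.out.println(countMatches(pieces, "//piece//movement[title='Adagio']")); // 1
        System.out.println(countMatches(pieces, "//piece//movement"));                 // 2
    }
}
```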