An Introduction to XML Based on the W3C XML Recommendations
Agenda
• XML Syntax
  • XML vs HTML
  • Data Types – Elements, Attributes
  • White Space – Optional, Mandatory, & Preserved
  • Empty Content
  • Valid vs Well Formed
• XML Schema
  • Used to Validate XML data
  • Before XML Schema – DTDs
  • Simple Types vs Complex Types
  • Restricting data with Regular Expressions
• Namespaces
  • Avoiding Tag Name Conflicts
• XML Tools
  • XML Spy and Other Tools
• Corresponding Sample XML, XSD, DTD, XSL, and XHTML files
• XML Resources on the web
  • http://www.w3schools.com - an excellent site
  • http://www.xml.com
  • http://www.w3.org
XML vs HTML
As you can see, XML looks similar to HTML.
<?xml version="1.0"?>
<root>
  <child>
    <subchild attribute="metadata">Data</subchild>
  </child>
</root>
XML vs HTML
• Unlike HTML:
  • XML is case sensitive
  • Tags must be properly nested
  • All start tags must have a corresponding end tag to close the element
  • All XML documents must have exactly one root element
  • Attribute values must be quoted (single or double quotes)
  • White space between start and end tags is preserved
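The rules above can be checked with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample strings are illustrative assumptions, not part of the original slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples, one per rule on the slide.
samples = {
    "matching case": "<root><child>Data</child></root>",
    "mismatched case": "<root><Child>Data</child></root>",
    "improper nesting": "<root><a><b>Data</a></b></root>",
    "missing end tag": "<root><child>Data</root>",
    "two root elements": "<root/><root/>",
    "unquoted attribute": "<root><child attribute=metadata/></root>",
    "single-quoted attribute": "<root><child attribute='metadata'/></root>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well formed' if ok else 'rejected'}")
```

Only the first and last samples are accepted; every other sample breaks one of the rules listed on the slide.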
XML vs HTML
• Special Characters
  • Handled the same way as in HTML, using entity references
  • For example:
      <      >      '        "        &
      &lt;   &gt;   &apos;   &quot;   &amp;
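As a small illustration of the special characters, the sketch below escapes a string with the standard library's xml.sax.saxutils.escape and then parses it back. The variable names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "< > ' \" &"

# escape() handles &, < and > by default; the two quote characters
# are added through the optional entities mapping.
escaped = escape(raw, {"'": "&apos;", '"': "&quot;"})
print(escaped)

# Parsing the escaped text turns the entity references back into characters.
parsed = ET.fromstring(f"<subchild>{escaped}</subchild>")
print(parsed.text == raw)
```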
Elements
• XML elements are extensible and they have parent/child relationships.
• XML elements must follow these naming rules:
  • Names can contain letters, numbers, underscores, periods, colons, and hyphens (the last three are not normally used in element names)
  • Names must not start with a number or a punctuation character (a leading underscore is allowed)
  • Names must not start with the letters xml (or XML or Xml)
  • Names cannot contain spaces
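A quick way to try out candidate element names is to hand them to a parser. This is a minimal sketch with Python's standard library; the helper parses_as_single_element and the candidate names are hypothetical. The rule about names beginning with xml is not exercised here.

```python
import xml.etree.ElementTree as ET


def parses_as_single_element(name: str) -> bool:
    """Return True if <name/> parses to one element whose tag is exactly `name`."""
    try:
        return ET.fromstring(f"<{name}/>").tag == name
    except ET.ParseError:
        return False


# Hypothetical candidate names chosen to exercise the rules on the slide.
candidates = [
    "order",          # letters only
    "order_item",     # underscore inside the name
    "order-item.v2",  # hyphen, period, and digit after the first character
    "_order",         # leading underscore
    "2order",         # starts with a number
    "-order",         # starts with punctuation
    "order item",     # contains a space
]

accepted = {name: parses_as_single_element(name) for name in candidates}
for name, ok in accepted.items():
    print(f"{name!r}: {'accepted' if ok else 'rejected'}")
```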
Attributes
• Attributes are normally used to store metadata (data about data); the real data is stored in the element content between the start and end tags.
• Single or double quotes can be used around attribute values.
White Space
• White space includes:
  • Carriage returns, line feeds, spaces, horizontal tabs
• Optional White Space
  • White space between tags is optional in XML files
• Mandatory White Space
  • White space must separate an element name from its attributes
• Preserved White Space
  • Between start/end tag pairs
Optional White Space
Valid
<?xml version="1.0"?>
<root>
  <child>
    <subchild>Data</subchild>
  </child>
</root>
Valid
<?xml version="1.0"?><root><child><subchild>Data</subchild></child></root>
Mandatory White Space
Valid
<?xml version="1.0"?>
<root>
  <child attribute="metadata"/>
</root>
Invalid
<?xml version="1.0"?>
<root>
  <childattribute="metadata"/>
</root>
There must be white space between the element name (child) and the attribute name (attribute).
Preserved White Space
Valid
<?xml version="1.0"?>
<root>
  <child>
    <subchild>White space between start/end tag pairs will be preserved</subchild>
  </child>
</root>
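The three kinds of white space can be seen side by side in a short sketch using Python's standard xml.etree.ElementTree module. The helper parses and the sample strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def parses(text: str) -> bool:
    """Return True if the text is accepted by the parser."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Optional white space: indentation between tags does not change the data.
indented = ET.fromstring(
    "<root>\n  <child>\n    <subchild>Data</subchild>\n  </child>\n</root>"
)
compact = ET.fromstring("<root><child><subchild>Data</subchild></child></root>")
same_data = indented.findtext("child/subchild") == compact.findtext("child/subchild")

# Mandatory white space: the element name and attribute name must be separated.
with_space = parses('<root><child attribute="metadata"/></root>')
without_space = parses('<root><childattribute="metadata"/></root>')

# Preserved white space: text between a start/end tag pair is kept as written.
padded = ET.fromstring("<subchild>  spaced   out  </subchild>").text

print(same_data, with_space, without_space, repr(padded))
```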
Empty Content
• If no data is held between a start/end tag pair, two formats may be used:
    <tag></tag>
    <tag/>
• The second format is called an empty-element tag (also known as an empty tag or null tag) and is commonly used when only an attribute is needed:
    <tag attribute="data"/>
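A parser treats the two empty formats as the same thing. The sketch below, which assumes Python's standard xml.etree.ElementTree module and illustrative variable names, compares them.

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring("<tag></tag>")
short_form = ET.fromstring("<tag/>")
with_attr = ET.fromstring('<tag attribute="data"/>')

# Both forms produce an element with no text and no children.
both_empty = (
    long_form.text is None
    and short_form.text is None
    and len(long_form) == 0
    and len(short_form) == 0
)

# Both forms also serialize to the same bytes.
same_output = ET.tostring(long_form) == ET.tostring(short_form)

print(both_empty, same_output, with_attr.get("attribute"))
```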
Valid vs Well Formed
• XML data is valid if it conforms to a definition of its structure, most commonly:
  • XML Schemas
  • DTDs (Document Type Definitions)
• XML data is well formed if it follows the syntax rules of the W3C XML Recommendation, Version 1.0
  • This includes:
    • The start/end tags matching up
    • White space used properly
• NOTE: XML Spy does both checks
XML Schema
• Used to define and validate XML
• In order for the XML file to be validated by a schema, the schema’s location is referenced as an attribute of the root element. The attribute value pairs the namespace name with the location of the schema file:
    <FirmOrder … xsi:schemaLocation="http://www.telcordia.com/SGG/FO C:\13.0\Documentation\xsd\firmOrder.xsd"/>
XML Schema
• Before XML Schema, most XML documents were validated against a DTD
DTD
<!ELEMENT File (Record)*>
<!ELEMENT Record (Fill)>
<!ELEMENT Fill (#PCDATA)>
XML Schema
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.sample.org/xml"
            xmlns="http://www.sample.org/xml"
            elementFormDefault="qualified">
  <xsd:element name="File">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Record" minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Record">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Fill" minOccurs="1" maxOccurs="1"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Fill" type="xsd:string"/>
</xsd:schema>
[Figure: the equivalent Mercator Type Tree]
XML Schema
• Element Data Types
  • XML Schema’s Simple Types
    • Similar to Items in Mercator
    • 44 Simple types built into XML Schema
  • XML Schema’s Complex Types
    • Similar to Groups in Mercator
    • 36 Complex types built into XML Schema
  • XML Schema’s Attributes
    • Similar to the Properties of Items and Groups in Mercator
XML Schema
• Simple Types can be restricted using Regular Expressions:
<xsd:simpleType name="alphaString">
  <xsd:restriction base="xsd:string">
    <xsd:pattern value="([A-Z]|[a-z]|[ ])*"/>
  </xsd:restriction>
</xsd:simpleType>
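To get a feel for what the alphaString pattern accepts, the sketch below tries it with Python's re module. This is an approximation, not schema validation: XML Schema patterns must match the whole value, which re.fullmatch imitates, and the two regular expression dialects differ in general even though they agree for this simple pattern. The helper name and the test strings are assumptions.

```python
import re

# Same pattern as the xsd:pattern facet on the slide.
ALPHA_STRING = re.compile(r"([A-Z]|[a-z]|[ ])*")


def is_alpha_string(value: str) -> bool:
    """Return True if the whole value consists of letters A-Z, a-z, and spaces."""
    return ALPHA_STRING.fullmatch(value) is not None


checks = {value: is_alpha_string(value) for value in ["Hello World", "", "Hello123", "tab\there"]}
for value, ok in checks.items():
    print(f"{value!r}: {'allowed' if ok else 'rejected'}")
```

Note that the empty string is allowed, because the pattern ends in `*` (zero or more characters).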
Namespaces
• XML Namespaces provide a method to avoid element name conflicts.
• Since element names are not predefined as in HTML, a name conflict can occur when combining two documents that use the same name for two different elements.
Namespaces, cont.
• If the following two XML documents were added together, there would be an element name conflict because both documents contain a <table> element with different content and definition.
<table>
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>
<table>
  <name>Tea Table</name>
  <width>80</width>
  <length>120</length>
</table>
Solving Name Conflicts using a Prefix
<h:table>
  <h:tr>
    <h:td>Apples</h:td>
    <h:td>Bananas</h:td>
  </h:tr>
</h:table>
<f:table>
  <f:name>African Coffee Table</f:name>
  <f:width>80</f:width>
  <f:length>120</f:length>
</f:table>
Using Namespaces
<h:table xmlns:h="http://www.w3.org/TR/html4/">
  <h:tr>
    <h:td>Apples</h:td>
    <h:td>Bananas</h:td>
  </h:tr>
</h:table>
<f:table xmlns:f="http://www.w3schools.com/furniture">
  <f:name>African Coffee Table</f:name>
  <f:width>80</f:width>
  <f:length>120</f:length>
</f:table>
Namespaces
• URIs are used as the namespace name
• The most commonly used kind of URI is a URL
  • A URL built on a company's own domain name is effectively unique to that company
  • The URL does NOT need to point to an actual resource
  • It is used to create uniqueness, not to validate your tags
• Many companies publish “help” documentation about their namespace, tags, and/or XML Schemas at the namespace URL
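The sketch below shows how a namespace-aware parser keeps the two table elements apart. It uses Python's standard xml.etree.ElementTree module; the wrapping <root> element and the variable names are assumptions added so that both fragments fit in one document. The parser never fetches the namespace URLs, it only compares them as strings.

```python
import xml.etree.ElementTree as ET

# Both fragments from the slides, wrapped in a hypothetical <root> element.
doc = """<root>
  <h:table xmlns:h="http://www.w3.org/TR/html4/">
    <h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr>
  </h:table>
  <f:table xmlns:f="http://www.w3schools.com/furniture">
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
</root>"""

root = ET.fromstring(doc)

# ElementTree stores each tag as {namespace-name}local-name,
# so the two table elements no longer share a name.
table_tags = [child.tag for child in root]

# The prefixes used in queries are local to this mapping; only the URIs matter.
ns = {"h": "http://www.w3.org/TR/html4/", "f": "http://www.w3schools.com/furniture"}
fruit = [td.text for td in root.findall("h:table/h:tr/h:td", ns)]
furniture_name = root.findtext("f:table/f:name", namespaces=ns)

print(table_tags)
print(fruit, furniture_name)
```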
XML Samples
• The next five slides have different types of files that correspond to each other:
  • XML Data Document
  • XML Schema
  • DTD (these are not written in XML)
  • XSL – style sheet
  • XHTML generated by the style sheet
XML Data Sample
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="xmlxsl.xsl"?>
<root>
  <child>
    <name>Optional name tag used in this child tag</name>
    <description>First description start/end tag pair in child tag</description>
  </child>
  <child>
    <name>Optional name tag used in this child tag</name>
    <description>First description start/end tag pair in child tag</description>
    <description>Second description start/end tag pair in child tag</description>
  </child>
</root>
XML Schema Sample
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.sample.org/xml"
            xmlns="http://www.sample.org/xml"
            elementFormDefault="qualified">
  <xsd:element name="root">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="child" minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="child">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="name" minOccurs="0"/>
        <xsd:element ref="description" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="name" type="xsd:string"/>
  <xsd:element name="description" type="xsd:string"/>
</xsd:schema>
XML DTD Sample
<!ELEMENT root (child)*>
<!ELEMENT child (name?, description+)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT description (#PCDATA)>
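The content model in the sample DTD reads: root holds zero or more child elements, and each child holds an optional name followed by one or more description elements. The sketch below is a rough, hand-rolled check of just that structure in Python. It is not DTD validation (the standard library does not validate), it ignores the #PCDATA rules, and the function name matches_sample_dtd is an assumption.

```python
import re
import xml.etree.ElementTree as ET

# (name?, description+) written over a comma-terminated list of child tag names.
CHILD_MODEL = re.compile(r"(name,)?(description,)+")


def matches_sample_dtd(text: str) -> bool:
    """Rough structural check against the element order in the sample DTD."""
    root = ET.fromstring(text)
    if root.tag != "root":
        return False
    for child in root:  # (child)* : zero or more, nothing else allowed
        if child.tag != "child":
            return False
        sequence = "".join(f"{element.tag}," for element in child)
        if not CHILD_MODEL.fullmatch(sequence):
            return False
    return True


good = "<root><child><name>n</name><description>d1</description><description>d2</description></child></root>"
no_description = "<root><child><name>n</name></child></root>"
wrong_order = "<root><child><description>d</description><name>n</name></child></root>"

print(matches_sample_dtd(good), matches_sample_dtd(no_description), matches_sample_dtd(wrong_order))
```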
Style Sheet (XSL) Sample
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <h2>XHTML Sample</h2>
        <table border="1">
          <tr bgcolor="gray">
            <th>Name</th>
            <th>Description</th>
          </tr>
          <xsl:for-each select="root/child">
            <tr>
              <td><xsl:value-of select="name" /></td>
              <td><xsl:value-of select="description" /></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
XHTML Generated
<html>
  <body>
    <h2>XHTML Sample</h2>
    <table border="1">
      <tr bgcolor="gray">
        <th>Name</th>
        <th>Description</th>
      </tr>
      <tr>
        <td>Optional name tag used in this child tag</td>
        <td>First description start/end tag pair in child tag</td>
      </tr>
      <tr>
        <td>Optional name tag used in this child tag</td>
        <td>First description start/end tag pair in child tag</td>
      </tr>
    </table>
  </body>
</html>
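The generated table shows only the first description of the second child, because xsl:value-of in XSLT 1.0 outputs the first node that its select expression matches. The sketch below imitates the style sheet's loop in plain Python to make that visible. It is not an XSLT processor (the standard library does not include one), and the variable names are assumptions.

```python
import xml.etree.ElementTree as ET

data = """<root>
  <child>
    <name>Optional name tag used in this child tag</name>
    <description>First description start/end tag pair in child tag</description>
  </child>
  <child>
    <name>Optional name tag used in this child tag</name>
    <description>First description start/end tag pair in child tag</description>
    <description>Second description start/end tag pair in child tag</description>
  </child>
</root>"""

root = ET.fromstring(data)

rows = []
for child in root.findall("child"):  # plays the role of xsl:for-each select="root/child"
    # findtext returns the text of the first match only, like xsl:value-of.
    name = child.findtext("name", default="")
    description = child.findtext("description", default="")
    rows.append((name, description))

for name, description in rows:
    print(f"<tr><td>{name}</td><td>{description}</td></tr>")
```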
XML Spy
• Accomplishes several XML tasks including:
  • Editing a variety of XML data graphically
  • Allowing multiple views including:
    • Text, browser, grid, structure (schema design)
  • Creating test data from XML Schemas
  • Generating XML Schemas from XML files
  • Validating data
  • Checking data for well-formedness
Other Tools
• Internet Explorer
  • Displays XML data using a default style sheet
  • Checks XML for well-formedness and displays an error message for troubleshooting
• UltraEdit
XML Resources on the web
• There are hundreds of XML resources on the web.
  • http://www.w3schools.com (an excellent site)
  • http://www.xml.com
  • http://www.w3.org
• The easiest way to find information about a specific XML topic or syntax is to type it into google.com
Contact Us Barry DeBruin debruinconsulting.com 919-434-5399