Advanced XML and Web Services September 12, 2006 Robert Richards rrichards@php.net http://www.cdatazone.org/files/workshop.zip
Agenda • Introduction to Terms and Concepts • Libxml • DOM • SimpleXML • SAX (ext/xml) • XMLReader • XSL • XMLWriter • SOAP (ext/soap)
XML Namespaces • An XML Namespace is a collection of names identified by a URI. • Namespaces apply to elements and attributes. • A namespace may or may not be associated with a prefix. • xmlns:rob="urn:rob" • xmlns="http://www.example.com/rob" • Attributes never reside within a default namespace. • It is illegal to have two attributes with the same local name and the same namespace on the same element.
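The rule that attributes never reside within a default namespace can be checked with ext/dom. The following is a minimal sketch; the sample document and the rob:priority attribute are made up for illustration and are not part of the workshop files.

```php
<?php
// Hypothetical sample: one default namespace and one prefixed namespace.
$xml = '<order xmlns="urn:order" xmlns:rob="urn:rob" num="1001" rob:priority="high"/>';

$dom = new DOMDocument();
$dom->loadXML($xml);
$order = $dom->documentElement;

// The unprefixed element picks up the default namespace.
$elementNs = $order->namespaceURI;

// The unprefixed attribute is in no namespace, despite xmlns="urn:order".
$plainAttrNs = $order->getAttributeNode('num')->namespaceURI;

// The prefixed attribute is in the namespace bound to its prefix.
$prefixedAttrNs = $order->getAttributeNodeNS('urn:rob', 'priority')->namespaceURI;

var_dump($elementNs, $plainAttrNs, $prefixedAttrNs);
```

The element reports urn:order, the plain attribute reports NULL, and only the prefixed attribute carries a namespace URI.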
XML Namespace Example <order num="1001"> <shipping> <name type="care_of">John Smith</name> <address>123 Here</address> </shipping> <billing> <name type="legal">Jane Doe</name> <address>456 Somewhere else</address> </billing> </order>
XML Namespace Example <order num="1001" xmlns="urn:order" xmlns:ship="urn:shipping" xmlns:bill="urn:billing"> <ship:shipping> <ship:name type="care_of">John Smith</ship:name> <ship:address>123 Here</ship:address> </ship:shipping> <bill:billing> <bill:name type="legal">Jane Doe</bill:name> <bill:address>456 Somewhere else</bill:address> </bill:billing> </order>
Illegal Namespace Usage
<order num="1001" xmlns="urn:order" xmlns:order="urn:order" xmlns:ship="urn:order">
  <!-- Valid: ship:type is in urn:order, while the unprefixed type is in no namespace -->
  <shipping ship:type="fed_ex" type="fed_ex">
    <!-- Illegal: ship:type and order:type share a local name and both resolve to urn:order -->
    <name ship:type="care_of" order:type="legal">John Smith</name>
  </shipping>
</order>
Reserved Namespaces and Prefixes • The prefix xml is bound to http://www.w3.org/XML/1998/namespace. • The prefix xmlns is bound to http://www.w3.org/2000/xmlns/. • Prefixes should also not begin with the characters xml.
Schemas and Validation • Validation ensures an XML document conforms to a set of defined rules. • Multiple mechanisms exist to write document rule sets: • Document Type Definition (DTD) • XML Schema • RelaxNG
Document Type Definition (DTD)
validation/courses-dtd.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE courses [
  <!ELEMENT courses (course+)>
  <!ELEMENT course (title, description, credits, lastmodified)>
  <!ATTLIST course cid ID #REQUIRED>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT description (#PCDATA)>
  <!ELEMENT credits (#PCDATA)>
  <!ELEMENT lastmodified (#PCDATA)>
]>
<courses>
  <course cid="c1">
    <title>Basic Languages</title>
    <description>Introduction to Languages</description>
    <credits>1.5</credits>
    <lastmodified>2004-09-01T11:13:01</lastmodified>
  </course>
  <course cid="c2"> . . . </course>
</courses>
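A document carrying this DTD in its internal subset can be checked with DOMDocument::validate(). The sketch below uses a shortened inline copy of the course document rather than the workshop file, and isValidAgainstDtd() is a hypothetical helper name.

```php
<?php
// Inline copy of the course DTD (internal subset only).
$dtd = <<<'DTD'
<!DOCTYPE courses [
  <!ELEMENT courses (course+)>
  <!ELEMENT course (title, description, credits, lastmodified)>
  <!ATTLIST course cid ID #REQUIRED>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT description (#PCDATA)>
  <!ELEMENT credits (#PCDATA)>
  <!ELEMENT lastmodified (#PCDATA)>
]>
DTD;

$good = $dtd . '<courses><course cid="c1"><title>Basic Languages</title>'
      . '<description>Introduction to Languages</description>'
      . '<credits>1.5</credits>'
      . '<lastmodified>2004-09-01T11:13:01</lastmodified></course></courses>';

// Same document without the required cid attribute and the credits element.
$bad = $dtd . '<courses><course><title>Basic Languages</title>'
     . '<description>Introduction to Languages</description>'
     . '<lastmodified>2004-09-01T11:13:01</lastmodified></course></courses>';

// Collect validation messages internally instead of raising PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical helper: parse the string, then validate against its own DTD.
function isValidAgainstDtd(string $xml): bool
{
    $dom = new DOMDocument();
    if (!$dom->loadXML($xml)) {
        return false;
    }
    return $dom->validate();
}

var_dump(isValidAgainstDtd($good), isValidAgainstDtd($bad));
libxml_clear_errors();
```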
DTD and IDs
validation/course-id.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE courses [
  <!ATTLIST course cid ID #REQUIRED>
]>
<courses>
  <course cid="c1">
    <title xml:id="t1">Basic Languages</title>
    <description>Introduction to Languages</description>
  </course>
  <course cid="c2">
    <title xml:id="t3">French I</title>
    <description>Introduction to French</description>
  </course>
  <course cid="c3">
    <title xml:id="t3">French II</title>
    <description>Intermediate French</description>
  </course>
</courses>
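Attributes typed as ID, whether declared in a DTD or written as xml:id, are what DOMDocument::getElementById() looks up. The following is a rough sketch with a made-up two-course document that uses unique xml:id values. The PHP manual recommends validate() or validateOnParse for DTD-declared IDs, so the cid lookup is written defensively and its outcome is treated as an assumption.

```php
<?php
$xml = <<<'XML'
<!DOCTYPE courses [
  <!ATTLIST course cid ID #REQUIRED>
]>
<courses>
  <course cid="c1"><title xml:id="t1">Basic Languages</title></course>
  <course cid="c2"><title xml:id="t2">French I</title></course>
</courses>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

// xml:id is recognised as an ID without any DTD declaration.
$title = $dom->getElementById('t1');
print "t1: " . $title->textContent . "\n";

// cid is an ID only because of the ATTLIST declaration above.
$course = $dom->getElementById('c2');
print "c2: " . ($course ? $course->nodeName : "not registered as an ID") . "\n";
```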
XML Schema
validation/course.xsd
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="courses">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="course" minOccurs="1" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="title" type="xsd:string"/>
              <xsd:element name="description" type="xsd:string"/>
              <xsd:element name="credits" type="xsd:decimal"/>
              <xsd:element name="lastmodified" type="xsd:dateTime"/>
            </xsd:sequence>
            <xsd:attribute name="cid" type="xsd:ID"/>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
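An XML Schema like this one can be applied from PHP with DOMDocument::schemaValidate() or schemaValidateSource(). The sketch below uses a trimmed inline schema (title and credits only) so that it is self-contained; matchesSchema() is a hypothetical helper name.

```php
<?php
// Trimmed version of the course schema: only title and credits are kept.
$xsd = <<<'XSD'
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="courses">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="course" minOccurs="1" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="title" type="xsd:string"/>
              <xsd:element name="credits" type="xsd:decimal"/>
            </xsd:sequence>
            <xsd:attribute name="cid" type="xsd:ID"/>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
XSD;

// Keep schema violations out of the PHP warning stream.
libxml_use_internal_errors(true);

// Hypothetical helper: parse an XML string and validate it against the schema.
function matchesSchema(string $xml, string $xsd): bool
{
    $dom = new DOMDocument();
    $dom->loadXML($xml);
    return $dom->schemaValidateSource($xsd);
}

$ok  = matchesSchema('<courses><course cid="c1"><title>French I</title><credits>3.0</credits></course></courses>', $xsd);
// "three" is not a valid xsd:decimal.
$bad = matchesSchema('<courses><course cid="c1"><title>French I</title><credits>three</credits></course></courses>', $xsd);

var_dump($ok, $bad);
libxml_clear_errors();
```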
RelaxNG
validation/course.rng
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <element name="courses">
      <zeroOrMore>
        <element name="course">
          <attribute name="cid"><data type="ID"/></attribute>
          <element name="title"><text/></element>
          <element name="description"><text/></element>
          <element name="credits"><data type="decimal"/></element>
          <element name="lastmodified"><data type="dateTime"/></element>
        </element>
      </zeroOrMore>
    </element>
  </start>
</grammar>
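RelaxNG grammars are applied with DOMDocument::relaxNGValidate() or relaxNGValidateSource(). This sketch uses a cut-down inline grammar (no cid attribute, only title and credits), so it illustrates the call and is not a copy of course.rng. Because the grammar uses zeroOrMore, an empty courses element is accepted, unlike the course+ rule in the DTD.

```php
<?php
// Cut-down grammar: courses may contain any number of course elements.
$rng = <<<'RNG'
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <element name="courses">
      <zeroOrMore>
        <element name="course">
          <element name="title"><text/></element>
          <element name="credits"><data type="decimal"/></element>
        </element>
      </zeroOrMore>
    </element>
  </start>
</grammar>
RNG;

libxml_use_internal_errors(true);

// Hypothetical helper: parse an XML string and validate it against the grammar.
function matchesGrammar(string $xml, string $rng): bool
{
    $dom = new DOMDocument();
    $dom->loadXML($xml);
    return $dom->relaxNGValidateSource($rng);
}

$empty = matchesGrammar('<courses/>', $rng);
$ok    = matchesGrammar('<courses><course><title>French I</title><credits>3.0</credits></course></courses>', $rng);
$bad   = matchesGrammar('<courses><course><title>French I</title><credits>three</credits></course></courses>', $rng);

var_dump($empty, $ok, $bad);
libxml_clear_errors();
```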
XPath • Language to locate and retrieve information from an XML document • A foundation for XSLT • An XML document is a tree containing nodes • The XML document is the root node • Locations are addressable similar to the syntax for a filesystem
XPath Reference Document
xpath/courses.xml
<courses xmlns:t="http://www.example.com/title">
  <course xml:id="c1">
    <t:title>Basic Languages</t:title>
    <description>Introduction to Languages</description>
  </course>
  <course xml:id="c2">
    <t:title>French I</t:title>
    <description>Introduction to French</description>
  </course>
  <course xml:id="c3">
    <t:title>French II</t:title>
    <description>Intermediate French</description>
    <pre-requisite cref="c2" />
    <?phpx A PI Node ?>
    <defns xmlns="urn:default">content</defns>
  </course>
</courses>
XPath Location Example
xpath/location.php
Expression (each of the following selects the same nodes):
  /courses/course/description
  //description
  /courses/*/description
  //description[ancestor::course]
Resulting Node-set:
  <description>Introduction to Languages</description>
  <description>Introduction to French</description>
  <description>Intermediate French</description>
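The four location paths can be tried out with DOMXPath. This is a minimal sketch against a reduced inline copy of the reference document (descriptions only), not the workshop's location.php.

```php
<?php
$xml = <<<'XML'
<courses>
  <course xml:id="c1"><description>Introduction to Languages</description></course>
  <course xml:id="c2"><description>Introduction to French</description></course>
  <course xml:id="c3"><description>Intermediate French</description></course>
</courses>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

$expressions = [
    '/courses/course/description',
    '//description',
    '/courses/*/description',
    '//description[ancestor::course]',
];

// Collect the string value of every matched node, per expression.
$results = [];
foreach ($expressions as $expr) {
    $values = [];
    foreach ($xpath->query($expr) as $node) {
        $values[] = $node->nodeValue;
    }
    $results[$expr] = $values;
    print $expr . ' => ' . implode(' | ', $values) . "\n";
}
```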
XPath Function Example
xpath/function.php
Expression:
  string(/courses/course/pre-requisite[@cref="c2"]/..)
Result:
  French II Intermediate French content
XPath and Namespaces
xpath/namespaces.php
//title
  Empty NodeSet
//t:title
  <t:title>Basic Languages</t:title>
  <t:title>French I</t:title>
  <t:title>French II</t:title>
//defns
  Empty NodeSet
//*[local-name()="defns"]
  <defns xmlns="urn:default">content</defns>
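As an alternative to local-name(), a prefix can be bound to the default namespace URI just for the query. The sketch below assumes a reduced inline document, and the prefix d is an arbitrary choice made for this example.

```php
<?php
$xml = '<courses xmlns:t="http://www.example.com/title">'
     . '<course><t:title>French II</t:title>'
     . '<defns xmlns="urn:default">content</defns></course></courses>';

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Prefixes used in an expression only have to match by URI, not by the
// prefix (or lack of one) used in the document itself.
$xpath->registerNamespace('t', 'http://www.example.com/title');
$xpath->registerNamespace('d', 'urn:default'); // "d" is a made-up prefix

$counts = [
    '//title'                   => $xpath->query('//title')->length,
    '//t:title'                 => $xpath->query('//t:title')->length,
    '//defns'                   => $xpath->query('//defns')->length,
    '//d:defns'                 => $xpath->query('//d:defns')->length,
    '//*[local-name()="defns"]' => $xpath->query('//*[local-name()="defns"]')->length,
];
print_r($counts);
```

Unprefixed names in XPath 1.0 always mean "no namespace", which is why //title and //defns match nothing.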
PHP and XML • PHP 5 introduced numerous interfaces for working with XML • The libxml2 library (http://www.xmlsoft.org/) was chosen to provide XML support • The sister library libxslt provides XSLT support • I/O is handled via PHP streams
XML Extensions for PHP 5 • ext/libxml • ext/xml (SAX push parser) • ext/dom • ext/simplexml • ext/xmlreader (pull parser) • ext/xmlwriter • ext/xsl • ext/wddx • ext/soap
Libxml • Contains common functionality shared across extensions. • Defines constants to modify parse time behavior. • Provides access to streams context. • Allows modification of error handling behavior for XML based extensions.
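The parse-time constants are passed as the options argument of the loading functions. The sketch below shows two of them, LIBXML_NOBLANKS and LIBXML_NOCDATA, on made-up input strings.

```php
<?php
$xml = "<courses>\n  <course/>\n  <course/>\n</courses>";

// Default parse: whitespace-only text nodes between elements are kept.
$default = new DOMDocument();
$default->loadXML($xml);
$defaultCount = $default->documentElement->childNodes->length;

// LIBXML_NOBLANKS drops those blank text nodes during parsing.
$noBlanks = new DOMDocument();
$noBlanks->loadXML($xml, LIBXML_NOBLANKS);
$noBlanksCount = $noBlanks->documentElement->childNodes->length;

// LIBXML_NOCDATA merges CDATA sections into ordinary text nodes.
$noCdata = new DOMDocument();
$noCdata->loadXML('<note><![CDATA[a < b]]></note>', LIBXML_NOCDATA);
$nodeType = $noCdata->documentElement->firstChild->nodeType;

print "default: $defaultCount, noblanks: $noBlanksCount\n";
print "first child is text node: " . var_export($nodeType === XML_TEXT_NODE, true) . "\n";
```

Options can be combined with the bitwise OR operator, for example LIBXML_NOBLANKS | LIBXML_NOCDATA.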
Libxml: Error Handling
bool libxml_use_internal_errors ([bool use_errors])
void libxml_clear_errors ( void )
LibXMLError libxml_get_last_error ( void )
array libxml_get_errors ( void )
Libxml: LibXMLError
Class: LibXMLError
Properties (Read-Only): (int) level (int) code (int) column (string) message (string) file (int) line
LibXMLError::level Values: LIBXML_ERR_NONE LIBXML_ERR_WARNING LIBXML_ERR_ERROR LIBXML_ERR_FATAL
LibXMLError Example
libxml/error.php
<?php
/* Regular Error Handling */
$dom = new DOMDocument();
$dom->loadXML('<root>');

/* New Error Handling */
libxml_use_internal_errors(TRUE);
if (! $dom->loadXML('root')) {
    $arrError = libxml_get_errors();
    foreach ($arrError AS $xmlError) {
        var_dump($xmlError);
    }
} else {
    print "Document Loaded";
}
?>
LibXMLError Result
PHP Warning: DOMDocument::loadXML(): Premature end of data in tag root line 1 in Entity, line: 1 in /home/rrichards/workshop/libxml/error.php on line 4
Warning: DOMDocument::loadXML(): Premature end of data in tag root line 1 in Entity, line: 1 in /home/rrichards/workshop/libxml/error.php on line 4
New Error Handling:
object(LibXMLError)#2 (6) {
  ["level"]=> int(3)
  ["code"]=> int(4)
  ["column"]=> int(1)
  ["message"]=> string(34) "Start tag expected, '<' not found"
  ["file"]=> string(0) ""
  ["line"]=> int(1)
}
DOM • Tree-based parser • Allows for creation and editing of XML documents • W3C Specification with DOM Level 2/3 compliance • Provides XPath support • Provides XInclude support • Ability to work with HTML documents • Zero-copy interoperability with SimpleXML • Replacement for ext/domxml from PHP 4
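The zero-copy interoperability with SimpleXML means both APIs operate on the same underlying tree. A minimal sketch using simplexml_import_dom() and dom_import_simplexml() on a made-up one-course document:

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<courses><course cid="c1"><title>Basic Languages</title></course></courses>');

// Wrap the existing DOM tree in a SimpleXML object (no copy is made).
$sxe = simplexml_import_dom($dom);

// A change made through SimpleXML is visible through DOM ...
$sxe->course[0]->title = 'Basic Languages (revised)';
$titleFromDom = $dom->getElementsByTagName('title')->item(0)->nodeValue;

// ... and a change made through DOM is visible through SimpleXML.
$courseElement = dom_import_simplexml($sxe->course[0]);
$courseElement->setAttribute('cid', 'c100');
$cidFromSimpleXml = (string) $sxe->course[0]['cid'];

print "$titleFromDom / $cidFromSimpleXml\n";
```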
DOMNode Classes • DOMDocument • DOMElement • DOMAttr • DOMComment • DOMDocumentType • DOMNotation • DOMEntity • DOMEntityReference • DOMProcessingInstruction • DOMNameSpaceNode • DOMDocumentFragment • DOMCharacterData • DOMText • DOMCdataSection
Additional DOM Classes • DOMException • DOMImplementation • DOMNodeList • DOMNamedNodeMap • DOMXPath
DOM: Sample Document <courses> <course cid="c1"> <title>Basic Languages</title> <description>Introduction to Languages</description> <credits>1.5</credits> <lastmodified>2004-09-01T11:13:01</lastmodified> </course> <course cid="c2"> <title>French I</title> <description>Introduction to French</description> <credits>3.0</credits> <lastmodified>2005-06-01T14:21:37</lastmodified> </course> <course cid="c3"> <title>French II</title> <description>Intermediate French</description> <credits>3.0</credits> <lastmodified>2005-03-12T15:45:44</lastmodified> </course> </courses>
DOM: Document Navigation
dom/navigate.php
/* Find first description element in subtrees */
function locateDescription($nodeset) {
    foreach ($nodeset AS $node) {
        if ($node->nodeType == XML_ELEMENT_NODE && $node->nodeName == 'description') {
            $GLOBALS['arNodeSet'][] = $node;
            return;
        }
        if ($node->hasChildNodes()) {
            locateDescription($node->childNodes);
        }
    }
}

$dom = new DOMDocument();
$dom->load('course.xml');
$root = $dom->documentElement;
$arNodeSet = array();
if ($root->hasChildNodes()) {
    locateDescription($root->childNodes);
}
foreach ($arNodeSet AS $key=>$node) {
    print "#$key: ".$node->nodeValue."\n";
}
DOM: Document Navigation Results #0: Introduction to Languages #1: Introduction to French #2: Intermediate French
DOM: Document Navigation #2
dom/navigate-2.php
<?php
$dom = new DOMDocument();
$dom->load('course.xml');
$nodelist = $dom->getElementsByTagName('description');
foreach ($nodelist AS $key=>$node) {
    print "#$key: ".$node->nodeValue."\n";
}
?>
Results:
#0: Introduction to Languages
#1: Introduction to French
#2: Intermediate French
DOM: Navigation Optimized
dom/navigate-optimized.php
function locateDescription($node) {
    while($node) {
        if ($node->nodeType == XML_ELEMENT_NODE && $node->nodeName == 'description') {
            $GLOBALS['arNodeSet'][] = $node;
            return;
        }
        locateDescription($node->firstChild);
        $node = $node->nextSibling;
    }
}

$dom = new DOMDocument();
$dom->load('course.xml');
$root = $dom->documentElement;
$arNodeSet = array();
locateDescription($root->firstChild);
foreach ($arNodeSet AS $key=>$node) {
    print "#$key: ".$node->nodeValue."\n";
}
DOM: Creating a Simple Tree
dom/create_simple_tree.php
$doc = new DOMDocument();
$root = $doc->createElement("tree");
$doc->appendChild($root);
$root->setAttribute("att1", "att1 value");
$attr2 = $doc->createAttribute("att2");
$attr2->appendChild($doc->createTextNode("att2 value"));
$root->setAttributeNode($attr2);
$child = $root->appendChild($doc->createElement("child"));
$comment = $doc->createComment("My first Document");
$doc->insertBefore($comment, $root);
$pi = $doc->createProcessingInstruction("php", 'echo "Hello World!"');
$root->appendChild($pi);
$cdata = $doc->createCdataSection("special chars: & < > '");
$child->appendChild($cdata);
DOM: Simple Tree Output <?xml version="1.0"?> <!--My first Document--> <tree att1="att1 value" att2="att2 value"> <child><![CDATA[special chars: & < > ']]></child> <?php echo "Hello World!"?> </tree>
DOM: Creating an Atom Feed
dom/atom_feed_creation.php
define('ATOMNS', 'http://www.w3.org/2005/Atom');
$feed_title = "Example Atom Feed";
$alt_url = "http://www.example.org/";
$feed = "http://www.example.org/atom/";
$doc = new DOMDocument("1.0", "UTF-8");

/* Create an element in the Atom namespace and append it to $parent when one is given */
function create_append_Atom_elements($doc, $name, $value=NULL, $parent=NULL) {
    if ($value) {
        $newelem = $doc->createElementNS(ATOMNS, $name, $value);
    } else {
        $newelem = $doc->createElementNS(ATOMNS, $name);
    }
    if ($parent) {
        return $parent->appendChild($newelem);
    }
    /* No parent given: hand back the unattached element instead of NULL */
    return $newelem;
}

$feed = create_append_Atom_elements($doc, 'feed', NULL, $doc);
create_append_Atom_elements($doc, 'title', $feed_title, $feed);
create_append_Atom_elements($doc, 'subtitle', $feed_title, $feed);
create_append_Atom_elements($doc, 'id', $alt_url, $feed);
create_append_Atom_elements($doc, 'updated', date('c'), $feed);
$doc->formatOutput = TRUE;
print $doc->saveXML();
DOM: Creating an Atom Feed Result (initial structure) <?xml version="1.0" encoding="UTF-8"?> <feed xmlns="http://www.w3.org/2005/Atom"> <title>Example Atom Feed</title> <subtitle>Example Atom Feed</subtitle> <id>http://www.example.org/</id> <updated>2006-03-23T01:39:40-05:00</updated> </feed>
DOM: Creating an Atom Feed
dom/atom_feed_creation.php
$entry = create_append_Atom_elements($doc, 'entry', NULL, $feed);
$title = create_append_Atom_elements($doc, 'title', 'My first entry', $entry);
$title->setAttribute('type', 'text');
$link = create_append_Atom_elements($doc, 'link', NULL, $entry);
$link->setAttribute('type', 'text/html');
$link->setAttribute('rel', 'alternate');
$link->setAttribute('href', 'http://www.example.org/entry-url');
$link->setAttribute('title', 'My first entry');
$author = create_append_Atom_elements($doc, 'author', NULL, $entry);
create_append_Atom_elements($doc, 'name', 'Rob', $author);
create_append_Atom_elements($doc, 'id', 'http://www.example.org/entry-guid', $entry);
create_append_Atom_elements($doc, 'updated', date('c'), $entry);
create_append_Atom_elements($doc, 'published', date('c'), $entry);
$content = create_append_Atom_elements($doc, 'content', NULL, $entry);
$cdata = $doc->createCDATASection('This is my first Atom entry!<br />More to follow');
$content->appendChild($cdata);
$doc->formatOutput = TRUE;
print $doc->saveXML();
DOM: Creating an Atom Feed Result
dom/atomoutput.xml
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Atom Feed</title>
  <subtitle>Example Atom Feed</subtitle>
  <id>http://www.example.org/</id>
  <updated>2006-03-23T01:53:59-05:00</updated>
  <entry>
    <title type="text">My first entry</title>
    <link type="text/html" rel="alternate" href="http://www.example.org/entry-url" title="My first entry"/>
    <author>
      <name>Rob</name>
    </author>
    <id>http://www.example.org/entry-guid</id>
    <updated>2006-03-23T01:53:59-05:00</updated>
    <published>2006-03-23T01:53:59-05:00</published>
    <content><![CDATA[This is my first Atom entry!<br />More to follow]]></content>
  </entry>
</feed>
DOM: Document Editing
dom/editing.php
$dom = new DOMDocument();
$dom->load('atomoutput.xml');

/* Locate the first entry element */
$child = $dom->documentElement->firstChild;
while($child && $child->nodeName != "entry") {
    $child = $child->nextSibling;
}
if ($child && ($child = $child->firstChild)) {
    /* Locate the title element within the entry */
    while($child && $child->nodeName != "title") {
        $child = $child->nextSibling;
    }
    if ($child) {
        $child->setAttribute('type', 'html');
        $text = $child->firstChild;
        $text->nodeValue = "<em>My first entry</em>";
        /* Continue through the following siblings to find updated */
        while($child) {
            if ($child->nodeName == "updated") {
                $text = $child->firstChild;
                $text->nodeValue = date('c');
                break;
            }
            $child = $child->nextSibling;
        }
    }
}
print $dom->saveXML();
DOM: Editing
dom/new_atomoutput.xml
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Atom Feed</title>
  <subtitle>Example Atom Feed</subtitle>
  <id>http://www.example.org/</id>
  <updated>2006-03-23T01:53:59-05:00</updated>
  <entry>
    <title type="html">&lt;em&gt;My first entry&lt;/em&gt;</title>
    <link type="text/html" rel="alternate" href="http://www.example.org/entry-url" title="My first entry"/>
    <author>
      <name>Rob</name>
    </author>
    <id>http://www.example.org/entry-guid</id>
    <updated>2006-03-23T02:29:22-05:00</updated>
    <published>2006-03-23T01:53:59-05:00</published>
    <content><![CDATA[This is my first Atom entry!<br />More to follow]]></content>
  </entry>
</feed>
DOM: Document Modification
dom/modify.php
/* Assume $entry refers to the first entry element within the Atom document */

/* These will work */
$children = $entry->childNodes;
$length = $children->length - 1;
for ($x=$length; $x >=0; $x--) {
    $entry->removeChild($children->item($x));
}
/* OR */
$elem = $entry->cloneNode(FALSE);
$entry->parentNode->replaceChild($elem, $entry);
/* OR */
while ($entry->hasChildNodes()) {
    $entry->removeChild($entry->firstChild);
}
/* OR */
$node = $entry->lastChild;
while($node) {
    $prev = $node->previousSibling;
    $entry->removeChild($node);
    $node = $prev;
}

/* This Will Not Work! */
foreach($entry->childNodes AS $node) {
    $entry->removeChild($node);
}
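The foreach variant fails because childNodes is a live list that changes while it is being iterated. One further option, not shown in the workshop code, is to copy the list into a plain PHP array first. A small sketch with a made-up entry element:

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<feed><entry><title>My first entry</title><id>urn:x</id><updated>2006</updated></entry></feed>');
$entry = $dom->getElementsByTagName('entry')->item(0);

// Take a static snapshot of the live node list before modifying the tree.
$snapshot = iterator_to_array($entry->childNodes);
$removed = 0;
foreach ($snapshot as $node) {
    $entry->removeChild($node);
    $removed++;
}

print "removed $removed nodes, children left: " . $entry->childNodes->length . "\n";
```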
DOM and Namespaces <xsd:complexType xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/" name="ArrayOfint"> <xsd:complexContent> <xsd:restriction base="soapenc:Array"> <xsd:attribute ref="soapenc:arrayType" wsdl:arrayType="xsd:int[ ]"/> </xsd:restriction> </xsd:complexContent> </xsd:complexType>
DOM and Namespaces
dom/namespace.php
define("SCHEMA_NS", "http://www.w3.org/2001/XMLSchema");
define("WSDL_NS", "http://schemas.xmlsoap.org/wsdl/");
$dom = new DOMDocument();
$root = $dom->createElementNS(SCHEMA_NS, "xsd:complexType");
$dom->appendChild($root);
$root->setAttributeNS("http://www.w3.org/2000/xmlns/", "xmlns:wsdl", WSDL_NS);
$root->setAttribute("name", "ArrayOfint");
$content = $root->appendChild(new DOMElement("xsd:complexContent", NULL, SCHEMA_NS));
$restriction = $content->appendChild(new DOMElement("xsd:restriction", NULL, SCHEMA_NS));
$restriction->setAttribute("base", "soapenc:Array");
$attribute = $restriction->appendChild(new DOMElement("xsd:attribute", NULL, SCHEMA_NS));
$attribute->setAttribute("ref", "soapenc:arrayType");
$attribute->setAttributeNS(WSDL_NS, "wsdl:arrayType", "xsd:int[]");
DOM and XPath
dom/xpath/dom-xpath.xml
<store>
  <books>
    <rare>
      <book qty="4">
        <name>Cannery Row</name>
        <price>400.00</price>
        <edition>1</edition>
      </book>
    </rare>
    <classics>
      <book qty="25">
        <name>Grapes of Wrath</name>
        <price>12.99</price>
      </book>
      <book qty="25">
        <name>Of Mice and Men</name>
        <price>9.99</price>
      </book>
    </classics>
  </books>
</store>
DOM and XPath
dom/xpath/dom-xpath.php
$doc = new DOMDocument();
$doc->load('dom-xpath.xml');
$xpath = new DOMXPath($doc);

$nodelist = $xpath->query("//name");
print "Last Book Title: ".$nodelist->item($nodelist->length - 1)->textContent."\n";

$nodelist = $xpath->query("//name[ancestor::rare]");
print "Last Rare Book Title: ".$nodelist->item($nodelist->length - 1)->nodeValue."\n";

$inventory = $xpath->evaluate("sum(//book/@qty)");
print "Total Books: ".$inventory."\n";

$inventory = $xpath->evaluate("sum(//classics/book/@qty)");
print "Total Classic Books: ".$inventory."\n";

$inventory = $xpath->evaluate("count(//book[parent::classics])");
print "Distinct Classic Book Titles: ".$inventory."\n";
DOM and XPath Results
/* $nodelist = $xpath->query("//name")
   $nodelist->item($nodelist->length - 1)->textContent */
Last Book Title: Of Mice and Men

/* $xpath->query("//name[ancestor::rare]");
   $nodelist->item($nodelist->length - 1)->nodeValue */
Last Rare Book Title: Cannery Row

/* $xpath->evaluate("sum(//book/@qty)") */
Total Books: 54

/* $xpath->evaluate("sum(//classics/book/@qty)") */
Total Classic Books: 50

/* $xpath->evaluate("count(//book[parent::classics])") */
Distinct Classic Book Titles: 2
DOM and XPath w/Namespaces
dom/xpath/dom-xpathns.xml
<store xmlns="http://www.example.com/store" xmlns:bk="http://www.example.com/book">
  <books>
    <rare>
      <bk:book qty="4">
        <bk:name>Cannery Row</bk:name>
        <bk:price>400.00</bk:price>
        <bk:edition>1</bk:edition>
      </bk:book>
    </rare>
    <classics>
      <bk:book qty="25">
        <bk:name>Grapes of Wrath</bk:name>
        <bk:price>12.99</bk:price>
      </bk:book>
      <bk:book qty="25" xmlns:bk="http://www.example.com/classicbook">
        <bk:name>Of Mice and Men</bk:name>
        <bk:price>9.99</bk:price>
      </bk:book>
    </classics>
    <classics xmlns="http://www.example.com/ExteralClassics">
      <book qty="33">
        <name>To Kill a Mockingbird</name>
        <price>10.99</price>
      </book>
    </classics>
  </books>
</store>
DOM and XPath w/Namespaces
dom/xpath/dom-xpathns.php
$doc = new DOMDocument();
$doc->load('dom-xpathns.xml');
$xpath = new DOMXPath($doc);

$nodelist = $xpath->query("//name");
print "Last Book Title: ".$nodelist->item($nodelist->length - 1)->textContent."\n";
// Last Book Title:
/* Why empty? */

$nodelist = $xpath->query("//bk:name");
print "Last Book Title: ".$nodelist->item($nodelist->length - 1)->textContent."\n";
// Last Book Title: Grapes of Wrath
/* Why not "Of Mice and Men" */

$nodelist = $xpath->query("//bk:name[ancestor::rare]");
print "Last Rare Book Title: ".$nodelist->item($nodelist->length - 1)->nodeValue."\n";
// Last Rare Book Title:
/* Why empty? */

$xpath->registerNamespace("rt", "http://www.example.com/store");
$nodelist = $xpath->query("//bk:name[ancestor::rt:rare]");
print "Last Rare Book Title: ".$nodelist->item($nodelist->length - 1)->nodeValue."\n";
// Last Rare Book Title: Cannery Row

$xpath->registerNamespace("ext", "http://www.example.com/ExteralClassics");
$nodelist = $xpath->query("(//bk:name) | (//ext:name)");
print "Last Book Title: ".$nodelist->item($nodelist->length - 1)->textContent."\n";
// Last Book Title: To Kill a Mockingbird