DIG 3134 – Internet Software Design. Lecture 17 - XML: eXtensible Markup Language. J. Michael Moshell, University of Central Florida. Original image* by Moshell et al. Imagery is from Wikimedia except where marked with *. Licensing is listed.
Purposes of XML:
• Make data more easily used
• Make data last longer (across generations of technology)
Strategy of XML:
• Provide a basis for creating 'dialects' for special purposes
  - Thus, XML is a meta-language
• Provide tools you can use, rather than re-invent
Structure of XML:
• Inject <tags> into text files
XML Syntax:
Declaration:
  <?xml version="1.0" encoding="UTF-8"?>
Nested elements:
  <student>
    <person>
      <last-name>Wilson</last-name>
      <first-name>Henry</first-name>
      <address>122 Smith Road</address>
    </person>
    <major>Digital Media</major>
    <transcript>
      <course semester="Fall 06">DIG 4921c</course>
      <course semester="Fall 06">DIG 4526</course>
    </transcript>
  </student>
XML Syntax: terminology
• Content: the text between a start tag and its end tag, e.g. Wilson in <last-name>Wilson</last-name>
• Attribute: a name="value" pair inside a start tag, e.g. semester="Fall 06" in <course semester="Fall 06">
• In that attribute, the name is semester and the value is "Fall 06"
Real World Example: E-commerce (Euro processing) in a PHP application
  function sendResponse($status, $statusmessage, $neworderid, $batchid)
  {
      // Emit the XML declaration, then one element per field.
      // Note: the values are written as-is, with no escaping.
      echo '<?xml version="1.0" encoding="utf-8"?>';
      echo "<responsemessage>";
      echo "<status>".$status."</status>";
      echo "<statusmessage>".$statusmessage."</statusmessage>";
      echo "<neworderid>".$neworderid."</neworderid>";
      echo "<batchid>".$batchid."</batchid>";
      echo "</responsemessage>";
  }
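One thing sendResponse( ) does not do is escape its values, so a status message containing & or < would produce XML that is not well-formed. Here is a minimal sketch of one way around that. The name buildResponse( ), the choice to return a string rather than echo, and the sample values are all assumptions made for illustration; this is not the application's actual code.

```php
<?php
// Hypothetical variant of sendResponse(): builds the same message as a string
// and escapes each value so that & and < cannot break the markup.
function buildResponse(string $status, string $statusmessage, string $neworderid, string $batchid): string
{
    $fields = [
        'status'        => $status,
        'statusmessage' => $statusmessage,
        'neworderid'    => $neworderid,
        'batchid'       => $batchid,
    ];

    $xml = '<?xml version="1.0" encoding="utf-8"?>';
    $xml .= '<responsemessage>';
    foreach ($fields as $tag => $value) {
        // ENT_XML1 selects XML escaping rules; ENT_QUOTES also escapes quote marks.
        $xml .= "<$tag>" . htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8') . "</$tag>";
    }
    $xml .= '</responsemessage>';
    return $xml;
}

// Sample values (made up): the ampersand is the interesting part.
$response = buildResponse('OK', 'Paid & shipped', '1001', '7');
echo $response;
```

Reading the result back with simplexml_load_string( ) gives the original text again, because the parser turns &amp; back into &.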
This raises a question: how does one represent the 'grammar' of an element? E.g., a transcript will consist of zero or more courses.
Nested elements:
  <student>
    <person>
      <last-name>Wilson</last-name>
      <first-name>Henry</first-name>
      <address>122 Smith Road</address>
    </person>
    <major>Digital Media</major>
    <transcript>
      <course semester="Fall 06">DIG 4921c</course>
      <course semester="Fall 06">DIG 4526</course>
      <gradepoint>3.62</gradepoint>
    </transcript>
  </student>
This will be done via a SCHEMA.
Two kinds of "grammaticality":
1. Well-formedness (standard XML)
2. Validity (based on a schema)
Well-formed:
• One ROOT ELEMENT per document, e.g. <student> ... </student>
• All non-empty elements are delimited with start and end tags.
• Empty elements are delimited properly:
  - intentionally empty placemarkers: <thisway />
  - temporarily empty placemarkers: <likethis></likethis>
• All attribute values are quoted.
• Tags do not overlap.
• The document conforms to its declared character encoding.
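The well-formedness rules can be tried out directly, since PHP's XML parser refuses documents that break them. The sketch below is one way to do that with SimpleXML; isWellFormed( ) is a hypothetical helper name, and the test strings are made up for illustration. It checks well-formedness only, not validity against a schema.

```php
<?php
// Returns true when $text parses as well-formed XML, false otherwise.
// isWellFormed() is a hypothetical helper, not a built-in PHP function.
function isWellFormed(string $text): bool
{
    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string($text);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    // Compare against false explicitly: an empty element is a valid result.
    return $doc !== false;
}

$good     = '<student><major>Digital Media</major></student>';
$overlap  = '<student><major>Digital Media</student></major>'; // tags overlap
$unquoted = '<course semester=Fall>DIG 3134</course>';         // attribute value not quoted
$tworoots = '<student></student><student></student>';          // more than one root element

var_dump(isWellFormed($good));
var_dump(isWellFormed($overlap));
var_dump(isWellFormed($unquoted));
var_dump(isWellFormed($tworoots));
```

Only the first string should pass; each of the others breaks exactly one rule from the list.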
Schemas represent a particular "language" subset of XML.
We're not going to explore schemas in this lecture. But we do have ANOTHER issue to mention: NAMESPACES ... consider this piece of XML:
  <student>
    <person>
      <last-name>Wilson</last-name>
      <first-name>Henry</first-name>
      <address>122 Smith Road</address>
    </person>
    <major>Digital Media</major>
Namespaces
  <student>
    <person>
      <last-name>Wilson</last-name>
      <first-name>Henry</first-name>
      <address>122 Smith Road</address>
    </person>
    <major>Digital Media</major>
What's a 'major'? What are the legal values? What is its relationship to a particular university? Does it have a relationship to any national or world standards?
Namespaces
  <student>
    <person>
      <last-name>Wilson</last-name>
      <first-name>Henry</first-name>
      <address>122 Smith Road</address>
    </person>
    <major xmlns="http://example.org/academicmajor">Digital Media</major>
The namespace name is a "URI" (similar to a URL). It acts as a globally unique identifier for the vocabulary that 'major' belongs to. An XML parser does not fetch it, although by convention the owner may publish a description there of what kinds of things can be put into the 'major' field. This allows people to establish and share standards. (example.org is a domain reserved for use in examples ... it's a placeholder.)
Namespaces: Another example
  <root>
    <h:table xmlns:h="http://www.w3.org/TR/html4/">
      <h:tr>
        <h:td>Apples</h:td>
        <h:td>Bananas</h:td>
      </h:tr>
    </h:table>
    <f:table xmlns:f="http://www.w3schools.com/furniture">
      <f:name>African Coffee Table</f:name>
      <f:width>80</f:width>
      <f:length>120</f:length>
    </f:table>
  </root>
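Namespaces also change how you get at the data in PHP. Here is a minimal sketch of reading the two <table> elements with SimpleXML. The variable names are made up for illustration, and the sketch assumes the document is held in a string rather than a file.

```php
<?php
$doc = <<<XML
<root>
  <h:table xmlns:h="http://www.w3.org/TR/html4/">
    <h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr>
  </h:table>
  <f:table xmlns:f="http://www.w3schools.com/furniture">
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
</root>
XML;

$H = 'http://www.w3.org/TR/html4/';
$F = 'http://www.w3schools.com/furniture';

$root = simplexml_load_string($doc);

// Prefixed elements are not visible through plain property access.
$plain = count($root->table);

// children($uri) returns only the child elements in that namespace.
$htmlTable = $root->children($H)->table;
$furniture = $root->children($F)->table;

// Ask for the namespace explicitly at each level on the way down.
$cells = [];
foreach ($htmlTable->children($H)->tr->children($H)->td as $td) {
    $cells[] = (string) $td;
}

$name  = (string) $furniture->children($F)->name;
$width = (int) $furniture->children($F)->width;

print "plain=$plain, cells=" . implode(',', $cells) . ", name=$name, width=$width";
```

The two elements are both called 'table', but the namespace URI tells them apart, which is the whole point of the prefixes.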
XML into PHP: Start simple...
  <person>
    <lastname>Wilson</lastname>
    <firstname>Henry</firstname>
    <address>122 Smith Road</address>
  </person>
Then I ran this code to suck it into PHP and look at it:
  $xml = simplexml_load_file($filename);
  print "Raw xml:";
  print_r($xml);
  print "<br /><br />";
And the results:
  <person>
    <lastname>Wilson</lastname>
    <firstname>Henry</firstname>
    <address>122 Smith Road</address>
  </person>
The resulting object looked like this (via print_r):
  Raw xml:SimpleXMLElement Object ( [lastname] => Wilson [firstname] => Henry [address] => 122 Smith Road )
To access some piece (e.g. lastname):
  <person>
    <lastname>Wilson</lastname>
    <firstname>Henry</firstname>
    <address>122 Smith Road</address>
  </person>
  $lastname = $xml->lastname;   // $xml is an object
  print "ln=$lastname <br />";
  print "<br /><br />";
And the output was
  ln=Wilson
A more complex record: see xample.php (as .txt)
  <student>
    <person>
      <lastname>Wilson</lastname>
      <firstname>Henry</firstname>
      <address>122 Smith Road</address>
    </person>
    <transcript>
      <course level='undergrad'>
        <title>DIG 3134</title>
        <semester>Fall 2011</semester>
        <grade>A</grade>
      </course>
      <course>
        <title>DIG 3353</title>
        <semester>Spring 2011</semester>
        <grade>C</grade>
      </course>
    </transcript>
  </student>
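To walk through a record like this one, you loop over the repeated <course> elements. The sketch below is one way to do it and is not the code in the file mentioned on the slide. It assumes the XML is in a string, and the 'unspecified' label for a course with no level attribute is made up for illustration.

```php
<?php
$record = <<<XML
<student>
  <person>
    <lastname>Wilson</lastname>
    <firstname>Henry</firstname>
    <address>122 Smith Road</address>
  </person>
  <transcript>
    <course level='undergrad'>
      <title>DIG 3134</title>
      <semester>Fall 2011</semester>
      <grade>A</grade>
    </course>
    <course>
      <title>DIG 3353</title>
      <semester>Spring 2011</semester>
      <grade>C</grade>
    </course>
  </transcript>
</student>
XML;

$student = simplexml_load_string($record);

$rows = [];
// $student->transcript->course stands for ALL the <course> children.
foreach ($student->transcript->course as $course) {
    // The second course has no level attribute, so check before reading it.
    $level = isset($course['level']) ? (string) $course['level'] : 'unspecified';
    // Cast each piece to a string so we store plain text, not objects.
    $rows[] = sprintf(
        '%s (%s): %s [%s]',
        (string) $course->title,
        (string) $course->semester,
        (string) $course->grade,
        $level
    );
}

print_r($rows);
```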
The Battleship XML
This example represents two ships for the 'black' team.
  <ocean>
    <ship number='1'>
      <x>A</x>
      <y>1</y>
      <orientation>horizontal</orientation>
      <type>black</type>
    </ship>
    <ship number='2'>
      <x>H</x>
      <y>1</y>
      <orientation>vertical</orientation>
      <type>black</type>
    </ship>
  </ocean>
Attribute-value pair: number='1'
How do we read XML?
  $xml = simplexml_load_file($shipfilename);
But what is now in the variable '$xml'? To find out, we use the print_r function.
  print_r($xml);
What do we get?
A mess!
  SimpleXMLElement Object ( [ship] => Array ( [0] => SimpleXMLElement Object ( [@attributes] => Array ( [number] => 1 ) [x] => A [y] => 1 [orientation] => horizontal [type] => black ) [1] => SimpleXMLElement Object ( [@attributes] => Array ( [number] => 2 ) [x] => H [y] => 1 [orientation] => vertical [type] => black ) ) )
Prettyprint it:
  SimpleXMLElement Object
  (
      [ship] => Array
          (
              [0] => SimpleXMLElement Object
                  (
                      [@attributes] => Array
                          (
                              [number] => 1
                          )
                      [x] => A
                      [y] => 1
                      [orientation] => horizontal
                      [type] => black
                  )
              [1] => SimpleXMLElement Object
                  (
                      [@attributes] => Array
                          (
                              [number] => 2
                          )
                      [x] => H
                      [y] => 1
                      [orientation] => vertical
                      [type] => black
                  )
          )
  )
Prettyprint it: If you "View Source" on a print_r output, you will see the prettyprinted version. This is easy with Firefox (Command-U).
Getting at the data
  $ships = $xml->ship;
  foreach ($ships as $ship) {
      $xc = $ship->x;                        // x-character (like 'A')
      $xlo = (ord(strtoupper($xc)) - 64);    // 'A' is ASCII 65, so A becomes 1, B becomes 2, ...
      $ylo = $ship->y;                       // y-smallest number (like 1)
      $orientation = $ship->orientation;     // like 'vertical'
      // more to come
Checking the data
  $xc = $ship->x;                        // x-character (like 'A')
  $xlo = (ord(strtoupper($xc)) - 64);    // get it?
  $ylo = $ship->y;                       // y-smallest number (like 1)
  $orientation = $ship->orientation;     // like 'vertical'
  logprint("xc=$xc,ylo=$ylo,or=$orientation", 5);
output:
  xc=A, ylo=1, or=horizontal
Ominous storm clouds: everything looks normal and reasonable ... BUT ... (cue the scary organ music…) TROUBLE ahead!
Meanwhile … getting at the data
  $shipattributes = $ship->attributes( );   // eh?
  // The attributes of the <ship> element are returned by a special
  // built-in method. The result is itself a SimpleXMLElement, but it
  // can be indexed like an array. We saw:
  //   [ship] => Array ( [0] => SimpleXMLElement Object ( [@attributes] => Array ( [number] => 1 )
  // so, to 'peel' the info, we access an array element.
  $shipnumber = $shipattributes['number'];
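Here is a small, self-contained sketch of attribute access. It uses a cut-down version of the battleship file held in a string (an assumption, to keep the example short), and shows both the attributes( ) method and the array-style shorthand on the element itself. The variable names are made up.

```php
<?php
// A trimmed-down ocean: just the ship numbers and x positions.
$ocean = simplexml_load_string(
    "<ocean><ship number='1'><x>A</x></ship><ship number='2'><x>H</x></ship></ocean>"
);

$numbers = [];
foreach ($ocean->ship as $ship) {
    // attributes() hands back a SimpleXMLElement that can be indexed like an array.
    $attrs = $ship->attributes();
    $viaMethod = (int) $attrs['number'];

    // Shorthand: index the element itself with the attribute name.
    $viaIndex = (int) $ship['number'];

    $numbers[] = [$viaMethod, $viaIndex];
}

print_r($numbers);
```

The (int) casts matter: without them, each value would still be an object rather than a plain number.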
Continuing the story …
  $ships = $xml->ship;
  foreach ($ships as $ship) {
      // … further down the foreach loop:
      $type = $ship->type;
      if ($type == 'black')
          $fillcolor = BLACK;
      else
          $fillcolor = GOLD;
      if ($orientation == 'horizontal') {
          // and we get ready to draw a ship into $Grid
Then something WEIRD happened
  for ($x = $xlo; $x <= $xlo + 4; $x++) {
      $Grid[$x][$y] = $fillcolor;
And this is what happened:
  Warning: Illegal offset type in /Applications/MAMP/htdocs/DIG3134/battleship/battleship12.php on line 428
What's that? An 'offset' is an index, like [$x] or [$y]. What is wrong with $x or $y? So I whip out my trusty print_r:
Continuing the story …
  print "xc is "; print_r($xc);
  print "ylo is "; print_r($ylo);
Output:
  xc is SimpleXMLElement Object ( [0] => A )
  ylo is SimpleXMLElement Object ( [0] => 1 )
What?? When I printed xc, it just looked like A.
So the moral is: you get OBJECTS, all the way.
PHP: you always gotta watch it, and be ready to jump.
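Fixing the problem
  $y = $ylo + 0;   // looks weird. Add zero? Why?
  print "y is "; print_r($y);
Output:
  y is 1   // problem solved.
How did this work? PHP assigns a data type to a new variable like $y based on the value assigned to it. In arithmetic, the SimpleXMLElement is converted to a number, so $y = $ylo + 0; (object + number) results in a plain number. An explicit cast such as $y = (int) $ylo; does the same job and states the intent.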
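The same fix can be written with explicit casts at the moment the values are read, so that no objects ever reach the grid. This is a minimal sketch, not the battleship program itself: the single ship, the 'BLACK' string standing in for the fill color, and the column 'c' are all made up for illustration.

```php
<?php
$ship = simplexml_load_string('<ship><x>c</x><y>4</y></ship>');

// What SimpleXML hands back is an object, even though it prints like text.
$raw = $ship->y;

// Cast once, right away, and work with plain values from then on.
$xc  = (string) $ship->x;            // 'c'
$xlo = ord(strtoupper($xc)) - 64;    // 'C' is ASCII 67, and 67 - 64 = 3
$ylo = (int) $ship->y;               // 4

// Plain integers are legal array offsets, so no warning here.
$Grid = [];
for ($x = $xlo; $x <= $xlo + 4; $x++) {
    $Grid[$x][$ylo] = 'BLACK';
}

print "raw is " . gettype($raw) . ", ylo is " . gettype($ylo) . "\n";
print_r(array_keys($Grid));
```

Casting at the point of reading keeps the "objects all the way" surprise in one place instead of letting it leak into the drawing code.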
Creating XML from Objects
  <?php
  // example.php -- here's one way to create a complex string.
  // Note the three < characters that open a heredoc.
  $xmlstr = <<<XML
  <?xml version='1.0' standalone='yes'?>
  <movie>
    <title>PHP: Behind the Parser</title>
    <characters>
      <character>
        <name>Ms. Coder</name>
        <actor>Onlivia Actora</actor>
      </character>
    </characters>
    <plot>
      To save space, nothing here.
    </plot>
  </movie>
  XML;
  ?>
Creating XML from Objects
  <?php
  include 'example.php';
  $movie = new SimpleXMLElement($xmlstr);
  $character = $movie->characters->addChild('character');
  $character->addChild('name', 'Mr. Parser');
  $character->addChild('actor', 'John Doe');
  $rating = $movie->addChild('rating', 'PG');
  $rating->addAttribute('type', 'mpaa');
  $stringout = $movie->asXML();   // then write out text file.
  ?>
Take-away:
1. print_r is your friend, in times of confusion. It can print (and "explain") any PHP variable.
2. simplexml is an easy-to-use tool, but you gotta understand objects to use it.
3. Attribute-value pairs are accessed via a special method, not simply as object variables.
Looking forward:
* Creating useful outputs (PDF, XLS)
* Reading XLS files directly
* Communicating with other websites (cURL)
* Advanced topics:
  - recursion, JSON