Chapter 5 – Creating Markup with XML

perreault

  1. Chapter 5 – Creating Markup with XML • Outline • 5.1 Introduction • 5.2 Introduction to XML Markup • 5.3 Parsers and Well-formed XML Documents • 5.4 Parsing an XML Document with msxml • 5.5 Characters • 5.5.1 Character Set • 5.5.2 Characters vs. Markup • 5.5.3 White Space, Entity References and Built-in Entities • 5.5.4 Using Unicode in an XML Document • 5.6 Markup • 5.7 CDATA Sections • 5.8 XML Namespaces • 5.9 Case Study: A Day Planner Application

  2. 5.1 Introduction • XML • Technology for creating markup languages • Enables document authors to describe data of any type • Allows creating new tags • HTML limits document authors to fixed tag set

  3. 5.2 Introduction to XML Markup • XML document (intro.xml) • Marks up message as XML • Commonly stored in text files • Extension .xml

  4. Fig. 5.1 Simple XML document containing a message. Line numbers are not part of the XML document; we include them for clarity.

     1  <?xml version = "1.0"?>
     2
     3  <!-- Fig. 5.1 : intro.xml -->
     4  <!-- Simple introduction to XML markup -->
     5
     6  <myMessage>
     7     <message>Welcome to XML!</message>
     8  </myMessage>

     • Line 1: document begins with declaration that specifies XML version 1.0 • Lines 3-4: comments • Lines 6-8: element message is child element of root element myMessage

  5. 5.2 Introduction to XML Markup (cont.) • XML documents • Must contain exactly one root element • Attempting to create more than one root element is an error • Elements must be nested properly • Incorrect: <x><y>hello</x></y> • Correct: <x><y>hello</y></x>

  6. 5.3 Parsers and Well-formed XML Documents • XML parser • Processes XML document • Reads XML document • Checks syntax • Reports errors (if any) • Allows programmatic access to document’s contents

  7. 5.3 Parsers and Well-formed XML Documents (cont.) • XML document syntax • Considered well formed if syntactically correct • Single root element • Each element has start tag and end tag • Tags properly nested • Attribute values (discussed later) enclosed in quotes • Matching capitalization in start and end tags • XML is case sensitive
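
The well-formedness rules above can be tried out with any XML parser. The following is a minimal sketch that uses Python's standard xml.etree.ElementTree module rather than msxml; the helper name is_well_formed and the sample strings are illustrative assumptions, not part of the original slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples, one per rule listed on the slide.
samples = {
    "properly nested": "<x><y>hello</y></x>",
    "improperly nested": "<x><y>hello</x></y>",
    "two root elements": "<x>one</x><x>two</x>",
    "case mismatch in tags": "<x>hello</X>",
    "unquoted attribute value": "<car doors = 4/>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first sample is accepted; each of the others violates one rule and makes the parser report an error.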

  8. 5.3 Parsers and Well-formed XML Documents (cont.) • XML parsers support • Document Object Model (DOM) • Builds tree structure containing document data in memory • Simple API for XML (SAX) • Generates events when tags, comments, etc. are encountered • (Events are notifications to the application)
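
To make the difference between the two models concrete, here is a small sketch that reads the message from Fig. 5.1 both ways. It assumes Python's standard xml.dom.minidom and xml.sax modules as stand-ins for a DOM parser and a SAX parser; the class name EventLogger is hypothetical.

```python
import xml.sax
from xml.dom import minidom

DOCUMENT = "<myMessage><message>Welcome to XML!</message></myMessage>"

# DOM: the whole document is built as a tree in memory, then navigated.
dom = minidom.parseString(DOCUMENT)
message_node = dom.getElementsByTagName("message")[0]
dom_text = message_node.firstChild.data


# SAX: the parser notifies the application as it encounters each item.
class EventLogger(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("text", content))


logger = EventLogger()
xml.sax.parseString(DOCUMENT.encode("utf-8"), logger)

print(dom_text)
for event in logger.events:
    print(event)
```

The DOM version can revisit any node after parsing, while the SAX version sees each event once and keeps only what the handler chooses to store.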

  9. 5.4 Parsing an XML Document with msxml • XML document • Contains data • Does not contain formatting information • Load XML document into Internet Explorer 5.0 • Document is parsed by msxml • Places plus (+) or minus (-) signs next to container elements • Plus sign indicates that all child elements are hidden • Clicking plus sign expands container element • Displays children • Minus sign indicates that all child elements are visible • Clicking minus sign collapses container element • Hides children • Error generated if document is not well formed

  10. Fig. 5.2 XML document shown in IE5.

  11. Fig. 5.3 Error message for a missing end tag.

  12. 5.5 Characters • Character set • Characters that may be represented in XML document • e.g., ASCII character set • Letters of English alphabet • Digits (0-9) • Punctuation characters, such as !, - and ?

  13. 5.5.1 Character Set • XML documents may contain • Carriage returns • Line feeds • Unicode characters (Section 5.5.4) • Enables computers to process characters for several languages

  14. 5.5.2 Characters vs. Markup • XML must differentiate between • Markup text • Enclosed in angle brackets (< and >) • e.g., child elements • Character data • Text between start tag and end tag • e.g., Fig. 5.1, line 7: Welcome to XML!

  15. 5.5.3 White Space, Entity References and Built-in Entities • Whitespace characters • Spaces, tabs, line feeds and carriage returns • Significant (preserved by application) • Insignificant (not preserved by application) • Normalization • Whitespace collapsed into single whitespace character • Sometimes whitespace removed entirely • e.g., <markup>This    is   character    data</markup> after normalization becomes <markup>This is character data</markup>
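
The normalization step can be sketched in a few lines. This example assumes Python's xml.etree.ElementTree, which hands the character data to the application with its whitespace intact, so the collapsing is done by a hypothetical helper named normalize_space.

```python
import xml.etree.ElementTree as ET


def normalize_space(text):
    """Collapse each run of whitespace to one space and trim both ends."""
    return " ".join(text.split())


# The element content mixes spaces, a tab and a line feed.
element = ET.fromstring("<markup>This   is \t character\n   data</markup>")

print(repr(element.text))                   # whitespace as the parser delivered it
print(repr(normalize_space(element.text)))  # whitespace after normalization
```

Whether whitespace is significant is the application's decision, which is why the collapsing lives in application code here rather than in the parser.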

  16. 5.5.3 White Space, Entity References and Built-in Entities (cont.) • XML-reserved characters • Ampersand (&) • Left-angle bracket (<) • Right-angle bracket (>) • Apostrophe (') • Double quote (") • Entity references • Allow XML-reserved characters to be used as character data • Begin with ampersand (&) and end with semicolon (;) • Prevent parser from misinterpreting character data as markup

  17. 5.5.3 White Space, Entity References and Built-in Entities (cont.) • Built-in entities • Ampersand (&amp;) • Left-angle bracket (&lt;) • Right-angle bracket (&gt;) • Apostrophe (&apos;) • Quotation mark (&quot;) • Marking up characters "<>&" in element message: <message>&lt;&gt;&amp;</message>
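
Applications rarely write entity references by hand. As a hedged illustration, the sketch below uses the escape function from Python's standard xml.sax.saxutils module to produce the built-in entities and then parses the result back; the variable names and the sample strings are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "<>&"

# escape() always replaces &, < and > with their built-in entities.
escaped = escape(raw)
print(escaped)

# Parsing the marked-up message recovers the original characters.
element = ET.fromstring(f"<message>{escaped}</message>")
print(element.text)

# Quotes are only replaced when an extra mapping is supplied.
quote_entities = {'"': "&quot;", "'": "&apos;"}
escaped_quotes = escape("Joe's \"party\"", quote_entities)
print(escaped_quotes)
```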

  18. 5.5.4 Using Unicode in an XML Document • XML Unicode support • e.g., Fig. 5.4 displays Arabic words • Arabic characters • represented by entity references for Unicode characters

  19. Fig. 5.4 XML document that contains Arabic words.

      <?xml version = "1.0"?>

      <!-- Fig. 5.4 : lang.xml -->
      <!-- Demonstrating Unicode -->

      <!DOCTYPE welcome SYSTEM "lang.dtd">

      <welcome>
         <from>

            <!-- Deitel and Associates -->
            &#1583;&#1575;&#1610;&#1578;&#1614;&#1604;
            &#1571;&#1606;&#1583;

            <!-- entity -->
            &assoc;
         </from>

         <subject>

            <!-- Welcome to the world of Unicode -->
            &#1571;&#1607;&#1604;&#1575;&#1611;
            &#1576;&#1603;&#1605;
            &#1601;&#1610;&#1616;
            &#1593;&#1575;&#1604;&#1605;

            <!-- entity -->
            &text;
         </subject>
      </welcome>

      • Document type definition (DTD) defines document structure and entities • Root element welcome contains child elements from and subject • Sequence of entity references for Unicode characters in Arabic alphabet • lang.dtd defines entities assoc and text
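
A reference such as &#1583; names a Unicode character by its decimal code point. The sketch below, which assumes Python's xml.etree.ElementTree, parses the first Arabic word from Fig. 5.4 and lists the code points the parser produced. It leaves out the DTD and the entities assoc and text, because lang.dtd is not shown in the slides.

```python
import xml.etree.ElementTree as ET

# First word of the from element in Fig. 5.4, written as decimal references.
element = ET.fromstring(
    "<from>&#1583;&#1575;&#1610;&#1578;&#1614;&#1604;</from>"
)

# The parser replaces each reference with the character it names.
code_points = [ord(ch) for ch in element.text]
print(code_points)

# The same character can also be written in hexadecimal: 1583 is 0x62F.
decimal_form = ET.fromstring("<c>&#1583;</c>").text
hex_form = ET.fromstring("<c>&#x62F;</c>").text
print(decimal_form == hex_form)
```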

  20. Fig. 5.4 XML document that contains Arabic words.

  21. 5.6 Markup • XML element markup • Consists of • Start tag • Content • End tag • All elements must have corresponding end tag • <img src = "img.gif"> is correct in HTML, but not XML • XML requires end tag or forward slash (/) for termination • <img src = "img.gif"></img> or <img src = "img.gif"/> is correct XML syntax

  22. 5.6 Markup (cont.) • Elements • Define structure • May (or may not) contain content • Child elements, character data, etc. • Attributes • Describe elements • Elements may have associated attributes • Placed within element's start tag • Values are enclosed in quotes • Element car contains attribute doors, which has value "4": <car doors = "4"/>
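
The following sketch shows how an application might read the attribute from the car element, and checks the end-tag rule for the img element. It assumes Python's xml.etree.ElementTree; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Attribute values are always delivered to the application as strings.
car = ET.fromstring('<car doors = "4"/>')
print(car.tag, car.get("doors"))

# Both XML forms of an empty element describe the same content.
long_form = ET.fromstring('<img src = "img.gif"></img>')
short_form = ET.fromstring('<img src = "img.gif"/>')
same_element = ET.tostring(long_form) == ET.tostring(short_form)
print(same_element)

# The HTML form, with no end tag and no slash, is rejected by an XML parser.
try:
    ET.fromstring('<img src = "img.gif">')
    html_form_accepted = True
except ET.ParseError:
    html_form_accepted = False
print(html_form_accepted)
```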

  23. 5.6 Markup (cont.) • Processing instruction (PI) • Passed to application using XML document • Provides application-specific document information • Delimited by <? and ?>

  24. Fig. 5.5 XML document that marks up information about a fictitious book.

      <?xml version = "1.0"?>

      <!-- Fig. 5.5 : usage.xml -->
      <!-- Usage of elements and attributes -->

      <?xml:stylesheet type = "text/xsl" href = "usage.xsl"?>

      <book isbn = "999-99999-9-X">
         <title>Deitel&apos;s XML Primer</title>

         <author>
            <firstName>Paul</firstName>
            <lastName>Deitel</lastName>
         </author>

         <chapters>
            <preface num = "1" pages = "2">Welcome</preface>
            <chapter num = "1" pages = "4">Easy XML</chapter>
            <chapter num = "2" pages = "2">XML Elements?</chapter>
            <appendix num = "1" pages = "9">Entities</appendix>
         </chapters>

         <media type = "CD"/>
      </book>

      • Processing instruction specifies stylesheet (discussed in Chapter 12) • Root element book contains child elements title, author, chapters and media • Element book contains attribute isbn, which has value 999-99999-9-X • Element chapters contains four child elements, each of which contains two attributes
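
A processing instruction reaches the application as a target plus an uninterpreted data string. The sketch below reads one with Python's xml.dom.minidom. Note one deliberate difference from Fig. 5.5: the figure uses the target xml:stylesheet, whereas this sketch uses xml-stylesheet, the target name standardized by the W3C. The shortened book element is also an assumption made to keep the example small.

```python
from xml.dom import minidom

DOCUMENT = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="usage.xsl"?>
<book isbn="999-99999-9-X"/>"""

dom = minidom.parseString(DOCUMENT)

# Processing instructions outside the root element are children of the
# document node. The XML declaration on the first line is not one of them.
instructions = [
    (node.target, node.data)
    for node in dom.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]

for target, data in instructions:
    print(target, "->", data)
```

The parser does not split the data into attributes; interpreting type and href is left to whichever application recognizes the target.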

  25. Fig. 5.5 XML document that marks up information about a fictitious book.

  26. Fig. 5.6 XML document that marks up a letter.

      <?xml version = "1.0"?>

      <!-- Fig. 5.6: letter.xml -->
      <!-- Business letter formatted with XML -->

      <letter>

         <contact type = "from">
            <name>Jane Doe</name>
            <address1>Box 12345</address1>
            <address2>15 Any Ave.</address2>
            <city>Othertown</city>
            <state>Otherstate</state>
            <zip>67890</zip>
            <phone>555-4321</phone>
            <flag gender = "F"/>
         </contact>

         <contact type = "to">
            <name>Jane Doe</name>
            <address1>123 Main St.</address1>
            <address2></address2>
            <city>Anytown</city>
            <state>Anystate</state>
            <zip>12345</zip>
            <phone>555-1234</phone>
            <flag gender = "M"/>
         </contact>

  27. Fig. 5.6 XML document that marks up a letter. (Part 2)

         <salutation>Dear Sir:</salutation>

         <paragraph>It is our privilege to inform you about our new
            database managed with <bold>XML</bold>. This new system
            allows you to reduce the load on your inventory list
            server by having the client machine perform the work of
            sorting and filtering the data.</paragraph>

         <paragraph>The data in an XML element is normalized, so
            plain-text diagrams such as
               /---\
               |   |
               \---/
            will become gibberish.</paragraph>

         <closing>Sincerely</closing>
         <signature>Ms. Doe</signature>

      </letter>

  28. Fig. 5.6 XML document that marks up a letter.

  29. 5.7 CDATA Sections • CDATA sections • May contain text, reserved characters and whitespace • Reserved characters need not be replaced by entity references • Not processed by XML parser • Commonly used for scripting code (e.g., JavaScript) • Begin with <![CDATA[ • Terminate with ]]>

  30. Fig. 5.7 Using a CDATA section.

      <?xml version = "1.0"?>

      <!-- Fig. 5.7 : cdata.xml -->
      <!-- CDATA section containing C++ code -->

      <book title = "C++ How to Program" edition = "3">

         <sample>
            // C++ comment
            if ( this-&gt;getX() &lt; 5 &amp;&amp; value[ 0 ] != 3 )
               cerr &lt;&lt; this-&gt;displayError();
         </sample>

         <sample>
            <![CDATA[

               // C++ comment
               if ( this->getX() < 5 && value[ 0 ] != 3 )
                  cerr << this->displayError();
            ]]>
         </sample>

         C++ How to Program by Deitel &amp; Deitel
      </book>

      • Entity references required if not in CDATA section • XML parser does not process CDATA section • Note the simplicity offered by CDATA section
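
The two sample elements in Fig. 5.7 carry the same character data; only the markup differs. A short sketch that checks this, assuming Python's xml.etree.ElementTree and using one line of the C++ code as a hypothetical test string:

```python
import xml.etree.ElementTree as ET

CODE = "if ( this->getX() < 5 && value[ 0 ] != 3 )"

# Outside a CDATA section, reserved characters need entity references.
with_entities = ET.fromstring(
    "<sample>if ( this-&gt;getX() &lt; 5 &amp;&amp; value[ 0 ] != 3 )</sample>"
)

# Inside a CDATA section, the text is written exactly as it should appear.
with_cdata = ET.fromstring(
    "<sample><![CDATA[if ( this->getX() < 5 && value[ 0 ] != 3 )]]></sample>"
)

print(with_entities.text)
print(with_cdata.text)
print(with_entities.text == with_cdata.text == CODE)
```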

  31. Fig. 5.7 Using a CDATA section

  32. 5.8 XML Namespaces • Naming collisions • Two different elements have same name • <subject>Math</subject> • <subject>Thrombosis</subject> • Namespaces • Differentiate elements that have same name • <school:subject>Math</school:subject> • <medical:subject>Thrombosis</medical:subject> • school and medical are namespace prefixes • Prepended to element and attribute names • Tied to uniform resource identifier (URI) • Series of characters for differentiating names

  33. 5.8 XML Namespaces (cont.) • Creating namespaces • Use xmlns keyword • xmlns:text = "urn:deitel:textInfo" • xmlns:image = "urn:deitel:imageInfo" • Creates two namespace prefixes text and image • urn:deitel:textInfo is URI for prefix text • urn:deitel:imageInfo is URI for prefix image • Default namespaces • Declared without a prefix: xmlns = "urn:deitel:textInfo" • Child elements in this namespace do not need prefix

  34. Fig. 5.8 Listing for namespace.xml.

      <?xml version = "1.0"?>

      <!-- Fig. 5.8 : namespace.xml -->
      <!-- Namespaces -->

      <directory xmlns:text = "urn:deitel:textInfo"
         xmlns:image = "urn:deitel:imageInfo">

         <text:file filename = "book.xml">
            <text:description>A book list</text:description>
         </text:file>

         <image:file filename = "funny.jpg">
            <image:description>A funny picture</image:description>
            <image:size width = "200" height = "100"/>
         </image:file>

      </directory>

      • Element directory contains two namespace prefixes • Use prefix text to describe elements file and description • Apply prefix image to describe elements file, description and size

  35. Fig. 5.9 Using default namespaces.

      <?xml version = "1.0"?>

      <!-- Fig. 5.9 : defaultnamespace.xml -->
      <!-- Using Default Namespaces -->

      <directory xmlns = "urn:deitel:textInfo"
         xmlns:image = "urn:deitel:imageInfo">

         <file filename = "book.xml">
            <description>A book list</description>
         </file>

         <image:file filename = "funny.jpg">
            <image:description>A funny picture</image:description>
            <image:size width = "200" height = "100"/>
         </image:file>

      </directory>

      • urn:deitel:textInfo is default namespace • Element file is in default namespace • Specify namespace
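
A namespace-aware parser identifies each element by its URI and local name, so the prefix chosen in the document (or the absence of one) does not matter to the application. The sketch below parses the directory from Fig. 5.9 with Python's xml.etree.ElementTree; the prefix-to-URI mapping named ns is local to this code and is an assumption of the example.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """
<directory xmlns = "urn:deitel:textInfo"
           xmlns:image = "urn:deitel:imageInfo">
   <file filename = "book.xml">
      <description>A book list</description>
   </file>
   <image:file filename = "funny.jpg">
      <image:description>A funny picture</image:description>
      <image:size width = "200" height = "100"/>
   </image:file>
</directory>
"""

root = ET.fromstring(DOCUMENT)

# ElementTree expands every tag to the form {URI}localname.
print(root.tag)

# Prefixes used for searching; they need not match those in the document.
ns = {"text": "urn:deitel:textInfo", "image": "urn:deitel:imageInfo"}
text_files = [f.get("filename") for f in root.findall("text:file", ns)]
image_files = [f.get("filename") for f in root.findall("image:file", ns)]

print(text_files)
print(image_files)
```

The two file elements no longer collide: one is found only under the text URI and the other only under the image URI.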

  36. 5.9 Case Study: A Day Planner Application • Markup for Day-Planner application • Scheduling appointments and tasks • Date • Time • Appointment type

  37. Fig. 5.10 Day planner XML document planner.xml.

      <?xml version = "1.0"?>

      <!-- Fig. 5.10 : planner.xml -->
      <!-- Day Planner XML document -->

      <planner>

         <year value = "2000">

            <date month = "7" day = "15">
               <note time = "1430">Doctor&apos;s appointment</note>
               <note time = "1620">Physics class at BH291C</note>
            </date>

            <date month = "7" day = "4">
               <note>Independence Day</note>
            </date>

      • Root element planner holds all appointments • date elements store specific dates with attributes month and day • note elements mark up appointments

  38. Fig. 5.10 Day planner XML document planner.xml. (Part 2)

            <date month = "7" day = "20">
               <note time = "0900">General Meeting in room 32-A</note>
            </date>

            <date month = "7" day = "20">
               <note time = "1900">Party at Joe&apos;s</note>
            </date>

            <date month = "7" day = "20">
               <note time = "1300">Financial Meeting in room 14-C</note>
            </date>

         </year>

      </planner>
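
As a rough idea of how an application could query this markup, the sketch below looks up the notes for one date. It is not the application from the case study; it assumes Python's xml.etree.ElementTree, a shortened copy of planner.xml and a hypothetical helper named list_notes.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the planner document from Fig. 5.10.
PLANNER = """<planner>
   <year value = "2000">
      <date month = "7" day = "15">
         <note time = "1430">Doctor&apos;s appointment</note>
         <note time = "1620">Physics class at BH291C</note>
      </date>
      <date month = "7" day = "4">
         <note>Independence Day</note>
      </date>
      <date month = "7" day = "20">
         <note time = "0900">General Meeting in room 32-A</note>
      </date>
   </year>
</planner>"""


def list_notes(planner_xml, month, day):
    """Return (time, text) pairs for one date; time is None if not given."""
    root = ET.fromstring(planner_xml)
    notes = []
    for date in root.iter("date"):
        if date.get("month") == str(month) and date.get("day") == str(day):
            for note in date.findall("note"):
                notes.append((note.get("time"), note.text))
    # Notes without a time attribute sort before timed notes.
    return sorted(notes, key=lambda pair: pair[0] or "")


for time, text in list_notes(PLANNER, 7, 15):
    print(time, text)
```

Because several date elements may describe the same day, the helper scans every date element instead of stopping at the first match.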

  39. Fig. 5.11 Application that uses the day planner.
