We will cover …


  1. We will cover… • What is XML? • Components of an XML Document • Document Type Definition (DTD) • XML Data Islands • Parsing XML and DOM • XML presentation with CSS, XSL and XSLT • XPath, XML and Database Integration • Why use XML? • Creating your own XML vocabulary • Review of XML Applications and Tools • XML Resources (c) Nizar Mabroukeh 2001

  2. What is XML?

  3. Markup Languages • SGML: • Standard Generalized Markup Language • Mother of Markup Languages • HTML: • Most popular presentation language for the web • XML: • Draws heavily on the merits & shortcomings of HTML & SGML

  4. Issues with HTML Merits: • Very easy to use & learn • Presentation technology • It is the most popular Shortcomings: • NOT a data technology • Poor searching • No intelligence about content/data • We lose the association between meaning and content • Data cannot be represented hierarchically • Limited set of tags

  5. How does XML look? • Simple XML data would look like: <book> <title> XML Tech </title> <author>YAG</author> <level> Freshman </level> </book> • <book> is called the root node

  6. XML vs. HTML • Similar in appearance • Both are based on SGML BUT… • XML describes data • HTML displays data

  7. XML (eXtensible Markup Language)

  8. What Is XML? • “XML is a • platform-independent, • self-describing, • expandable, • standard data exchange format • that can be used either independently or embedded and used within other solutions.”

  9. Platform Independent • Windows • Unix • Macintosh • Mainframe

  10. Self-Describing • Example: <DATE>July 26, 1998</DATE> • Describes the information, not the presentation. Format flexible.

  11. Expandable = Extensible • HTML has a fixed set of tags <H1>, <B>, <PRE> • XML lets you have your own tags <dangerous-substance>, <Shakespearean-character>, <cash-equivalent>

  12. Standard • W3C (World Wide Web Consortium) www.w3c.org • The XML 1.0 specification was issued in February 1998 as a standards-based text format for data interchange. • The W3C XML Working Group designed XML as a simplified subset of SGML

  13. Standard • The XML specification does not define any particular tag names (as HTML does); instead, it defines general syntactic rules that enable developers to create their own domain-specific vocabularies of tags.

  14. Context • Greater context to the information • Tree structure is natural in XML • Hierarchy: balance sheet → total-assets → asset (current) → amount (1998) → $41,000,000 <balance-sheet> <total-assets> <asset type="current"> <amount period="1998"> $41,000,000 </amount> </asset> </total-assets> </balance-sheet>
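
The tree structure described on this slide can be made visible with a short sketch. It assumes the asset type and the period are written as attributes named `type` and `period` (an assumption made here so that the markup is well-formed), and it uses Python's standard `xml.etree.ElementTree` module. The helper name `outline` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed well-formed rendering of the balance-sheet slide:
# "type" and "period" are attribute names chosen for this sketch.
BALANCE_SHEET = """
<balance-sheet>
  <total-assets>
    <asset type="current">
      <amount period="1998">$41,000,000</amount>
    </asset>
  </total-assets>
</balance-sheet>
"""


def outline(element, depth=0):
    """Return one indented line per element, visiting the tree depth-first."""
    attrs = " ".join(f'{key}="{value}"' for key, value in element.attrib.items())
    text = (element.text or "").strip()
    label = " ".join(part for part in (element.tag, attrs, text) if part)
    lines = ["  " * depth + label]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(BALANCE_SHEET)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

Each level of indentation in the printed outline corresponds to one level of nesting in the document, which is the "greater context" the slide refers to.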

  15. Freedom • Extensible markup language • Customized tags • Tags give meaning to the content • Separates data from style

  16. Why XML? • Derived as a subset of SGML • Allows you to define your own tags XML: <author>YAG</author> HTML: <B> YAG </B> • Provides meaningful & readable data • Meaningful searches can be performed • Much simpler than SGML • SGML spec = 300 pages, XML = 33 pages • Purely a Data Technology • Supports compound documents.

  17. XML Advantages • Web based • Extensible • License-free • Platform independent • Single end-to-end IT solution for electronic information exchanges

  18. XML Documents Authoring XML

  19. XML Elements • An XML element is made up of a start tag, an end tag, and the data in between. The start and end tags describe the data within them, which is considered the value of the element. • For example, the following XML element is a <chairman> element with the value "Sadiq Sait": <chairman>Sadiq Sait</chairman> • An element can also be empty, in which case it can be written as a single tag: <chairman/>

  20. Element = Opening Tag + Content + Closing Tag, e.g. <book> Codebook 6.0 </book>

  21. XML Attributes • An element can optionally contain one or more attributes. An attribute is a name-value pair separated by an equal sign (=). Example: <CITY ZIP="31261">Dhahran</CITY> • Here, ZIP="31261" is an attribute of the <CITY> element. Attributes are used as meta information
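
As a small illustration of the difference between an element's value and its attributes, the sketch below reads the <CITY> example with Python's standard `xml.etree.ElementTree` module. The variable names are chosen for this example only.

```python
import xml.etree.ElementTree as ET

# The <CITY> element from the slide, with straight quotes around the value.
city = ET.fromstring('<CITY ZIP="31261">Dhahran</CITY>')

tag = city.tag              # element name
value = city.text           # element value (the data between the tags)
zip_code = city.get("ZIP")  # attribute value, looked up by attribute name
missing = city.get("zip")   # names are case-sensitive, so this finds nothing

print(tag, value, zip_code, missing)
```

The lookup with a lowercase name returns `None`, which is a quick way to see that XML names are case-sensitive.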

  22. XML is… …Case-sensitive

  23. Parts of an XML document • Version Declaration • Document Type Definition (DTD) • Body: <root> BODY </root>

  24. Version Declaration <?xml version="1.0" encoding="UTF-8" standalone="yes"?> • Encoding: • Supports Unicode encodings such as UTF-8 and UTF-16, among others • In short: Provides for multi-lingual data • Standalone: • Indicates whether the document has any markup declarations that are external to the document

  25. XML data • This is what XML data looks like (the body of the document): <books> <book> <name>Codebook 6.0 </name> <author> YAG </author> <level> Intermediate &amp; Advanced </level> </book> <book> <name>Java for Beginners </name> <author> Dale </author> <level> Beginner </level> </book> </books>

  26. XML Document Rules • FIRST item must be the XML declaration (when one is present): <?xml version="1.0"?> • ONE root element: <books> • ALL tags start AND end: <book></book> • Tags cannot overlap and are case sensitive • Attribute values enclosed in quotes: <book ISBN="21-458-65-0"> • Attributes not repeated in an element: <para id="1" s:id="4">

  27. Two types of XML documents • Well-formed XML documents • Valid XML documents

  28. Well-formed document • Must contain one or more elements • Must contain exactly one root element that encloses all the others • All other elements within the root element must be nested correctly • An XML parser will reject malformed documents (the method of rejection varies from parser to parser) • Documents that contain both XML and HTML tags are common • HTML within an XML document must be well-formed
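
One way to see how a parser rejects malformed input is the sketch below. It uses Python's standard `xml.etree.ElementTree` module, which is non-validating and signals a well-formedness error by raising `ParseError`; other parsers report errors differently. The function name `is_well_formed` and the sample strings are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text, False if it rejects it."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Hypothetical samples, each breaking one rule (except the first).
samples = {
    "ok": "<books><book></book></books>",
    "overlapping_tags": "<b><i>text</b></i>",
    "case_mismatch": "<book></Book>",
    "two_roots": "<book/><book/>",
    "unquoted_attribute": "<book ISBN=21></book>",
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, accepted in results.items():
    print(f"{name}: {'well-formed' if accepted else 'rejected'}")
```

Note that this checks well-formedness only; it says nothing about validity against a DTD.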

  29. Valid XML document • The XML document must be well formed • It must contain or reference a Document Type Definition (DTD) • A DTD is a schema that contains the constraints for the XML document • It contains element definitions and their attributes • Attribute values should comply with the following rule: • They cannot contain <, an unescaped &, or the quote character (' or ") used to delimit the value • Elements must be nested correctly
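
The restriction on special characters is usually handled by escaping rather than by avoiding the characters. The sketch below uses `escape` and `quoteattr` from Python's standard `xml.sax.saxutils` module; the `title` attribute and its value are hypothetical and are not part of the book examples in these slides.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

level = "Intermediate & Advanced"
title = "Q & A"  # hypothetical attribute value, for illustration only

escaped_level = escape(level)    # & becomes &amp; in element content
quoted_title = quoteattr(title)  # escapes the value and adds the quotes

xml_text = f"<book title={quoted_title}><level>{escaped_level}</level></book>"
print(xml_text)

# Parsing the result gives back the original, unescaped strings.
book = ET.fromstring(xml_text)
round_trip_title = book.get("title")
round_trip_level = book.find("level").text
```

Writing the raw `&` into the document instead would make it malformed, so a parser would reject it before validation even starts.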

  30. Document Type Definition • A DTD is a text document that defines the lexicon of legal names for tags in a particular XML vocabulary • It also defines how tags should be nested • It can be written inside the XML file or specified externally as a separate text file with the extension .dtd

  31. Sample DTD <!-- Uses EBNF (Extended Backus Naur Form) --> <!DOCTYPE book [ <!ELEMENT book (name,author,level)+> <!ELEMENT name (#PCDATA)> <!ELEMENT author (#PCDATA)> <!ELEMENT level (#PCDATA)> <!ATTLIST author email CDATA #IMPLIED> ]> • A DTD may be specified externally with the .dtd extension: <!DOCTYPE book SYSTEM "book.dtd">

  32. More on DTD • Special software can help you build your DTD visually instead of having to write all this weird code; one example is "XML Authority" from Extensibility • An XML document is associated with its DTD for validation using the <!DOCTYPE…> declaration

  33. Why use a DTD? • Application independent way of sharing data • Industries or trading parties can agree on a standard for interchanging data • Verification that data received from trading parties is valid.

  34. Complete XML document <?xml version="1.0" ?> <!DOCTYPE book [ <!ELEMENT book (name,author,level?)+> <!ELEMENT name (#PCDATA)> <!ELEMENT author (#PCDATA)> <!ELEMENT level (#PCDATA)> ]> <book> <name>Codebook 6.0 </name> <author> YAG </author> <level> Intermediate &amp; Advanced </level> </book>

  35. OR <?xml version="1.0" ?> <!DOCTYPE book SYSTEM "book.dtd"> <book> <name>Codebook 6.0 </name> <author> YAG </author> <level> Intermediate &amp; Advanced </level> </book>

  36. XML Document Pluses • Tightly Structured • Extensible • Easily models data • Useful for applications and transfer between applications • Interchangeable

  37. XML Data Islands XML inside HTML

  38. XML Data Islands • A data island is an XML document that exists within an HTML page. • It allows you to script against the XML document without having to load it through script or through the <OBJECT> tag. • Almost anything that can be in a well-formed XML document can be inside a data island

  39. How to create XML data island • The XML for a data island in HTML can be either: • Inline using ID, • or called from an outside xml file using SRC, • or created using a <script> tag

  40. Inline data island • The <XML> element marks the beginning of the data island, and its ID attribute provides a name that you can use to reference the data island. <XML ID="XMLID"> <customer> <name>Mark Hanson</name><custID>81422</custID> </customer> </XML>

  41. XML referenced from outside file • referenced through a SRC attribute on the <XML> tag: <XML ID="XMLID" SRC="customer.xml"></XML>

  42. Created using <script> tag <SCRIPT LANGUAGE="xml" ID="XMLID"> <customer> <name>Mark Hanson</name><custID>81422</custID> </customer> </SCRIPT>

  43. Parsing XML

  44. What is XML Parsing? • For a computer program to access the structured information in the document in a meaningful way, parsing is required • The parser first reads the stream of characters and recognizes the syntactic details of elements, attributes and text in the document • Then, the parser exposes the hierarchical set of information in the document as a tree of related elements, attributes and text items
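
The two stages described above (reading the character stream, then exposing a tree) can be watched with an incremental parser. The following is a minimal sketch using `iterparse` from Python's standard `xml.etree.ElementTree` module, applied to the simple book document from earlier in the deck; it is one possible illustration, not the only way parsers are structured.

```python
import io
import xml.etree.ElementTree as ET

DOCUMENT = (
    "<book><title>XML Tech</title>"
    "<author>YAG</author><level>Freshman</level></book>"
)

# Record each start tag and end tag in the order the parser recognizes them.
events = []
stream = io.BytesIO(DOCUMENT.encode("utf-8"))
for event, element in ET.iterparse(stream, events=("start", "end")):
    events.append((event, element.tag))

for event, tag in events:
    print(event, tag)
```

The first and last events both belong to <book>, which shows that the root element encloses everything else in the resulting tree.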

  45. The logical tree of information items created after parsing the XML document is called the Information Set, or Infoset • This can then be manipulated in different ways, and data can be extracted for use in applications, databases, etc.

  46. XML Parsers • Always check for well-formedness • Can be validating or non-validating • Validation requires association with a DTD • A parser is included in Microsoft Internet Explorer 5.0 • Language-neutral programming model • By using W3C XML 1.0 and the XML DOM, it supports JavaScript, VBScript, Java, C++, Perl

  47. Manipulating XML using the DOM • W3C provides a standard API called the Document Object Model (DOM) to access an XML document’s infoset • The DOM API provides a complete set of operations to programmatically manipulate the node tree including navigating the nodes in the hierarchy, creating and appending new nodes, removing nodes, etc.

  48. Once you are done making changes to the node tree, you can save it by serializing the infoset back into an XML document • Diagram: XML → parsing → infoset → serialization → XML
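
The parse, manipulate, serialize cycle can be sketched with Python's standard `xml.dom.minidom` module, which implements a subset of the W3C DOM. This is an illustration under that assumption, not the Internet Explorer parser the slides refer to; the book titles are reused from the earlier slides.

```python
from xml.dom import minidom

# Parsing: XML text -> node tree.
doc = minidom.parseString("<books><book><name>Codebook 6.0</name></book></books>")
root = doc.documentElement

# Creating and appending new nodes.
book = doc.createElement("book")
name = doc.createElement("name")
name.appendChild(doc.createTextNode("Java for Beginners"))
book.appendChild(name)
root.appendChild(book)

# Removing a node: drop the first <book> child of the root.
first_book = root.getElementsByTagName("book")[0]
root.removeChild(first_book)

# Serialization: node tree -> XML text.
serialized = root.toxml()
print(serialized)
```

After these steps the serialized document contains only the newly created book.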

  49. DOM Properties & Methods • An XML document object is created when an XML data island is loaded and parsed, and it has properties & methods: • XMLDocument: Returns a reference to the XML DOM exposed by the object • documentElement: Returns the root element • childNodes: Returns a node list containing children (if any) • item(id): Accesses individual nodes through an index (zero based) • text: Returns the text content of the node Let’s look at an example…

  50. DOM Example • Data island: <XML ID="xmlDocument"> <class> <student studentID="13429"> <name>Jane Smith</name> <GPA>3.8</GPA> </student> </class> </XML> • All of the expressions below begin with xmlDocument.documentElement.childNodes.item(0) • .childNodes.item(0).text returns "Jane Smith" • .childNodes.item(1).text returns "3.8" • .text returns "Jane Smith 3.8", i.e. name & GPA • Note: Everything is case sensitive here
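
The same navigation can be tried outside a browser. The sketch below uses Python's standard `xml.dom.minidom` module, which provides `documentElement`, `childNodes` and `item()` but has no `text` property; the helper `text_of` is a hypothetical stand-in that only approximates it. The XML is written without whitespace between tags because minidom keeps whitespace-only text nodes, which would otherwise shift the indexes.

```python
from xml.dom import minidom

CLASS_XML = (
    '<class><student studentID="13429">'
    "<name>Jane Smith</name><GPA>3.8</GPA>"
    "</student></class>"
)


def text_of(node):
    """Approximate the slide's .text property: join all descendant text."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    parts = [text_of(child) for child in node.childNodes]
    return " ".join(part for part in parts if part)


doc = minidom.parseString(CLASS_XML)
student = doc.documentElement.childNodes.item(0)

name = text_of(student.childNodes.item(0))
gpa = text_of(student.childNodes.item(1))
both = text_of(student)
student_id = student.getAttribute("studentID")

print(name, gpa, both, student_id)
```

Attribute access is shown with `getAttribute`, since the `studentID` attribute is not part of the text content.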
