XML- Extensible Markup Language


Presentation Transcript


  1. XML- Extensible Markup Language

  2. HTML to XML • HTML documents • Emerging Web standards: XML • XML is good for data interchange across platforms, enterprise-wide • Conversion from HTML to XML: IBM, Microsoft

  3. XML - Motivation • In HTML, both the tags and their semantics are fixed. Tags have a limited and strict interpretation. • HTML has been widely successful in disseminating documents across the Internet. • Though data can be disseminated through HTML, extracting it is painful and laborious. • EDI has been a predominant mode of exchanging data among businesses, but it has a very rigid format that requires highly customized applications.

  4. XML - Introduction • XML aims to combine the ease of authoring HTML documents with the ease of data exchange that is possible with EDI. • Tags are used to mark up documents. • XML is a meta-language for describing markup languages. • XML provides a facility to define tags and the structural relationships between them. • No predefined tag set implies no preconceived semantics; the semantics of an XML document are defined by the applications that process it.

  5. XML - Goals • Straightforward to use over the Internet • Support a wide variety of applications: authoring, browsing, content analysis, etc. • Easy to write programs that process XML documents and validate them. • XML documents must be human-legible and reasonably clear. • The design of XML shall be formal and concise, expressed in EBNF (Extended Backus-Naur Form) and amenable to modern compiler tools and techniques.

  6. XML - Features • Some structure, but not rigid • Extensibility: user-defined tags • Nested elements • Validation: documents may specify their own grammar • DTD (Document Type Definition): the schema exists with the data as tag names • Applications: EDI, extraction, conversion, transformation, integration • Can be modeled using DOM

  7. More terminology • RDF - Resource Description Framework - a method to describe metadata for XML documents • XSL - Extensible Stylesheet Language - a language for transforming and formatting XML. • Transformation and linking languages - XSLT, XPath, XPointer, XLink

  8. Example - HTML • Printed output: Sanjay Madria, Web Warehouse Tutorial, ADBIS’99 • HTML: <H2> Sanjay Madria </H2> <I> Web Warehouse Tutorial, ADBIS’99</I> • Very difficult to understand: the structure is hidden and the markup describes only appearance

  9. XML • <Ref> <Speaker> <Firstname> Sanjay </Firstname> <Lastname> Madria </Lastname> </Speaker> <Title> Web Warehouse Tutorial </Title> <Conference> ADBIS’99 </Conference> </Ref> • Another format: <Firstname Value="Sanjay"/>

  10. XML can Separate Data from HTML • XML is used to Exchange Data • XML can be used to Share Data • XML can be used to Store Data • XML can be used to Create new Languages (WML)

  11. XML • <Person> - a start-tag • </Person> - an end-tag • Tags are also called markup. • Tags must be balanced: they close in the inverse order of their opening • Tags are defined by users; there are no predefined tags

  12. <person> <name> Alan </name> <age> 42 </age> <email> agb@abc.com </email> </person> • Element: <person> … </person> • Subelement: <age>

  13. XML elements must follow these naming rules: • Names can contain letters, digits, and other characters such as "-", "_" and "." • Names must start with a letter or "_" (underscore), not with a digit or other punctuation • Names must not start with the letters xml (or XML or Xml ..), which are reserved • Names cannot contain spaces
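
The naming rules can be sketched as a small checker. This is a simplified, ASCII-only approximation (the XML 1.0 specification also allows many non-ASCII name characters); the helper name `is_valid_name` is an assumption made for illustration.

```python
import re

# Simplified ASCII-only approximation of an XML name:
# the first character is a letter, "_" or ":"; later characters
# may also be digits, "-" or ".". Spaces are never allowed.
_NAME_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")


def is_valid_name(name: str) -> bool:
    """Return True if `name` passes the simplified element-name rules."""
    if not _NAME_RE.fullmatch(name):
        return False
    # Names beginning with "xml" (in any letter case) are reserved.
    return not name.lower().startswith("xml")


if __name__ == "__main__":
    for candidate in ["person", "_note", "first-name", "2nd", "XmlData", "first name"]:
        print(candidate, "->", is_valid_name(candidate))
```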

  14. <table> <description> People on the fourth floor </description> <people> <person> <name> Alan </name> <age> 42 </age> <email> agb@abc.com </email> </person> <person> <name> Patsy </name> <age> 36 </age> <email> ptn@abc.com </email> </person> <person> <name> Ryan </name> <age> 58 </age> <email> rgz@abc.com </email> </person> </people> </table>

  15. <married></married> Can be abbreviated to <married/>

  16. XML Attributes • An attribute is a (name, value) pair • <product> <name language="French"> trompette six trous </name> <price currency="Euro"> 420.12 </price> <address format="XLB56" language="French"> <street>31 rue Croix-Bosset</street> <zip>92310</zip><city>Sevres</city> <country>France</country> </address> </product>

  17. Attribute values are always strings, enclosed in quotes ("..") • A given attribute may occur only once within a tag, while subelements with the same tag may be repeated

  18. XML tags are case sensitive • With XML, white space is preserved • XML elements must be properly nested • <b><i>This text is bold and italic</b></i> is tolerated in HTML but is not well-formed XML • <b><i>This text is bold and italic</i></b> is correctly nested
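
These three rules can be checked with a standard parser. The sketch below uses Python's `xml.etree.ElementTree`; the helper name `is_well_formed` is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Improperly nested: </b> closes before </i>.
badly_nested = "<b><i>This text is bold and italic</b></i>"
# Properly nested: tags close in the inverse order of their opening.
well_nested = "<b><i>This text is bold and italic</i></b>"
# Case sensitivity: <Person> and </person> are different tags.
mixed_case = "<Person>Alan</person>"

if __name__ == "__main__":
    print(is_well_formed(badly_nested))
    print(is_well_formed(well_nested))
    print(is_well_formed(mixed_case))
    # White space inside an element is preserved by the parser.
    print(repr(ET.fromstring("<name> Alan </name>").text))
```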

  19. XML Elements are Extensible • An application extracts the <to>, <from>, and <body> elements of a note to produce this output: • MESSAGE • To: Tove • From: Jani • Don't forget me this weekend!

  20. <?xml version="1.0" ?> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>

  21. <note> <date>1999-08-01</date> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note> • No problem: an application that extracts <to>, <from>, and <body> still works after <date> is added
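
A minimal sketch of such an extracting application, assuming Python's `xml.etree.ElementTree`. The function name `render_message` and the exact output layout are assumptions made for illustration. Because the code looks elements up by name, the added <date> element does not change the result.

```python
import xml.etree.ElementTree as ET

ORIGINAL_NOTE = """<note><to>Tove</to><from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body></note>"""

# The same note, extended with a <date> element.
EXTENDED_NOTE = """<note><date>1999-08-01</date><to>Tove</to><from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body></note>"""


def render_message(xml_text: str) -> str:
    """Extract <to>, <from> and <body> by name and format them as text."""
    note = ET.fromstring(xml_text)
    return "MESSAGE\nTo: {}\nFrom: {}\n{}".format(
        note.findtext("to"), note.findtext("from"), note.findtext("body")
    )


if __name__ == "__main__":
    print(render_message(ORIGINAL_NOTE))
    # The extended document yields the same output.
    print(render_message(EXTENDED_NOTE) == render_message(ORIGINAL_NOTE))
```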

  22. Book Title: My First XML • Chapter 1: Introduction to XML • What is HTML • What is XML • Chapter 2: XML Syntax • Elements must have a closing tag • Elements must be correctly nested

  23. <book> <title>My First XML</title> <prod id="33-657" media="paper"></prod> <chapter>Introduction to XML <para>What is HTML</para> <para>What is XML</para> </chapter> <chapter>XML Syntax <para>Elements must have a closing tag</para> <para>Elements must be properly nested</para> </chapter> </book>

  24. Data as an attribute: <person sex="female"> <firstname>Anna</firstname> <lastname>Smith</lastname> </person> • The same data as a subelement: <person> <sex>female</sex> <firstname>Anna</firstname> <lastname>Smith</lastname> </person>

  25. Bad Design • <note day="12" month="11" year="99" to="Tove" from="Jani" heading="Reminder" body="Don't forget me this weekend!"> </note>

  26. <note date="12/11/99"> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>

  27. <note> <date>12/11/99</date> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>

  28. <note> <date> <day>12</day> <month>11</month> <year>99</year> </date> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>

  29. PCDATA • PCDATA means parsed character data; XML parsers normally parse all the text in a document. • When an XML element is parsed, the text between the XML tags is also parsed, so markup characters such as "<" and "&" must be escaped. • CDATA • Everything inside a CDATA section is ignored by the parser and kept as literal text. • A CDATA section starts with "<![CDATA[" and ends with "]]>".
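
The difference between parsed text and a CDATA section can be seen with a standard parser. This is a small sketch using Python's `xml.etree.ElementTree`; the <script> element and its content are made-up sample data.

```python
import xml.etree.ElementTree as ET

# Inside CDATA, "<" and "&" are taken literally.
with_cdata = "<script><![CDATA[if (a < b && b < c) return 1;]]></script>"

# Without CDATA, the same characters must be escaped as entities.
with_escapes = "<script>if (a &lt; b &amp;&amp; b &lt; c) return 1;</script>"

# Without CDATA or escaping, the text is parsed as markup and rejected.
unescaped = "<script>if (a < b && b < c) return 1;</script>"


def text_or_error(xml_text: str) -> str:
    """Return the element text, or the string 'ParseError' if parsing fails."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError:
        return "ParseError"


if __name__ == "__main__":
    print(text_or_error(with_cdata))
    print(text_or_error(with_escapes))
    print(text_or_error(unescaped))
```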

  30. <person> <name> Alan </name> <age> 42 </age> <email> agb@abc.com </email> </person> or <person name="Alan" age="42" email="agb@abc.com"/> or <person age="42"> <name> Alan </name> <email> agb@abc.com </email> </person>

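  31. [Figure: two tree diagrams of the person data. Each tree has the root person with children name, age, and email and leaf values Alan, 42, and agb@abc.com; the two trees list the children in different orders.]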

  32. XML can associate a unique identifier with an element as the value of a certain attribute, called id • Other elements refer to that element using idref

  33. <messages> <note ID="501"> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note> <note ID="502"> <to>Jani</to> <from>Tove</from> <heading>Re: Reminder</heading> <body>I will not!</body> </note> </messages>

  34. <state id="s2"> <scode>NV</scode> <sname>Nevada</sname> </state> <city id="c2"> <ccode>CCN</ccode> <cname>Carson City</cname> <state-of idref="s2"/> </city>

  35. [Figure: graph with nodes labeled a, a, c, and b]

  36. <a><b id="&o123"> some string </b></a> <a c="&o123"/> • Assume c is a reference attribute • <a b="&o123"/> <a><c id="&o123"> some string </c></a> • Assume b is a reference attribute

  37. <geography> <states> <state id="s1"> <scode>ID</scode> <sname>Idaho</sname> <capital idref="c1"/> <cities-in idref="c1"/><cities-in idref="c3"/>…… </state> <state id="s2"> <scode>NV</scode> <sname>Nevada</sname> <capital idref="c2"/> <cities-in idref="c2"/>……. </state> …. </states>

  38. <cities> <city id="c1"> <ccode>BOI</ccode> <cname>Boise</cname> <state-of idref="s1"/> </city> <city id="c2"> <ccode>CCN</ccode> <cname>Carson City</cname> <state-of idref="s2"/> </city> <city id="c3"> <ccode>MOC</ccode> <cname>Moscow</cname> <state-of idref="s1"/> </city> … </cities> </geography>
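
The id/idref links in the geography document can be followed by building an index from id values to elements. The sketch below uses Python's `xml.etree.ElementTree` on a cut-down copy of the Idaho data only. ElementTree does not read DTDs, so the attribute names `id` and `idref` are treated as identifiers and references purely by convention, and the helper `deref` is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

GEOGRAPHY = """
<geography>
  <states>
    <state id="s1"><scode>ID</scode><sname>Idaho</sname>
      <capital idref="c1"/><cities-in idref="c1"/><cities-in idref="c3"/>
    </state>
  </states>
  <cities>
    <city id="c1"><ccode>BOI</ccode><cname>Boise</cname><state-of idref="s1"/></city>
    <city id="c3"><ccode>MOC</ccode><cname>Moscow</cname><state-of idref="s1"/></city>
  </cities>
</geography>
"""

root = ET.fromstring(GEOGRAPHY)

# Index every element that carries an id attribute.
by_id = {e.get("id"): e for e in root.iter() if e.get("id") is not None}


def deref(elem):
    """Follow the idref attribute of `elem` to the element it points at."""
    return by_id[elem.get("idref")]


idaho = by_id["s1"]
capital = deref(idaho.find("capital"))
cities_in_idaho = [deref(c).findtext("cname") for c in idaho.findall("cities-in")]

if __name__ == "__main__":
    print(capital.findtext("cname"))
    print(cities_in_idaho)
    # The link also works in the other direction, from city to state.
    print(deref(capital.find("state-of")).findtext("sname"))
```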

  39. Ordering • person:{firstname: "John", lastname: "Smith"} • person:{lastname: "Smith", firstname: "John"} • As semistructured data (SSD), both are the same

  40. These two are not the same as XML documents, because subelements are ordered: • <person><firstname>John</firstname> <lastname>Smith </lastname></person> • <person><lastname>Smith </lastname> <firstname>John</firstname></person> • The following two are equivalent, as attributes are not ordered: • <person firstname="John" lastname="Smith"/> • <person lastname="Smith" firstname="John"/>
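
A short sketch of the ordering rules, assuming Python's `xml.etree.ElementTree`: child elements come back as an ordered sequence, while attributes are exposed as a dictionary whose equality ignores order. The helper `child_tags` is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET


def child_tags(elem):
    """Return the tags of the direct children, in document order."""
    return [child.tag for child in elem]


first = ET.fromstring(
    "<person><firstname>John</firstname><lastname>Smith</lastname></person>"
)
second = ET.fromstring(
    "<person><lastname>Smith</lastname><firstname>John</firstname></person>"
)

attr_a = ET.fromstring('<person firstname="John" lastname="Smith"/>')
attr_b = ET.fromstring('<person lastname="Smith" firstname="John"/>')

if __name__ == "__main__":
    # Subelements are ordered, so the two sequences differ.
    print(child_tags(first), child_tags(second))
    # Attributes are unordered, so the two dictionaries compare equal.
    print(attr_a.attrib == attr_b.attrib)
```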

  41. Mixing elements and Text <Person> This is my best friend <Name> Alan </Name> <Age> 42 </Age> I am not too sure of the following email <Email> agb@abc.com </Email > </Person>

  42. <!-- this is a comment --> • Comments are allowed anywhere except inside markup; they are not part of the document's character data. • <?xml-stylesheet href="book.css" type="text/css"?> • Processing instructions (PIs) carry instructions for applications. • <?xml version="1.0"?> • This is the XML declaration, not a PI; it is not passed to the application. • <![CDATA[<start>this is an incorrect element </end>]]> • <!DOCTYPE name [markupdeclarations]> • Overall document layout: <?xml….?> <!DOCTYPE name [markupdeclarations]> <name>…</name>

  43. <db> <person> <name> Alan </name> <age> 42 </age> <email> agb@abc.com </email> </person> <person>… </person> … </db> • <!DOCTYPE db [ <!ELEMENT db (person*)> <!ELEMENT person (name,age,email)> <!ELEMENT name (#PCDATA)> <!ELEMENT age (#PCDATA)> <!ELEMENT email (#PCDATA)> ]>

  44. Recursion • <!ELEMENT node (leaf | (node,node))> <!ELEMENT leaf (#PCDATA)> • An example of such an XML document is: <node> <node> <node> <leaf> 1 </leaf> </node> <node> <leaf> 2 </leaf> </node> </node> <node> <leaf> 3 </leaf> </node> </node>

  45. <db> <r1><a> a1 </a><b> b1 </b><c> c1 </c></r1> <r1><a> a2 </a><b> b2 </b><c> c2 </c></r1> <r2><c> c2 </c><d> d2 </d></r2> <r2><c> c3 </c><d> d3 </d></r2> <r2><c> c4 </c><d> d4 </d></r2> </db>

  46. <!DOCTYPE db [ <!ELEMENT db (r1*,r2*)> <!ELEMENT r1 (a,b,c)> <!ELEMENT r2 (c,d)> <!ELEMENT a (#PCDATA)> <!ELEMENT b (#PCDATA)> <!ELEMENT c (#PCDATA)> <!ELEMENT d (#PCDATA)> ]>

  47. <!ELEMENT r2 ((c,d) | (d,c))> • <!ELEMENT db ((r1|r2)*)> • <!ELEMENT r1 (a,b?,c+)> • <!DOCTYPE db [<!ELEMENT …>…]> • <!DOCTYPE db SYSTEM "schema.dtd"> • <!DOCTYPE db SYSTEM "http://www.schemaauthority.com/schema.dtd">
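
DTD content models such as (a,b?,c+) are regular expressions over child element names, so they can be checked with an ordinary regex engine. The sketch below is a hand-rolled illustration, not a DTD validator: it handles only element-name models (no #PCDATA or mixed content), and the helpers `model_to_regex` and `children_match` are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET


def model_to_regex(model: str):
    """Translate an element-only content model into a compiled regex.

    Each child is encoded as "<tag>;" so that names cannot run together;
    the DTD operators |, *, + and ? carry over to the regex unchanged.
    """
    pattern = re.sub(
        r"[A-Za-z_][\w.\-]*",
        lambda m: "(?:" + re.escape(m.group(0)) + ";)",
        model,
    )
    # In a content model "," means sequence, which is plain concatenation.
    pattern = pattern.replace(",", "").replace(" ", "")
    return re.compile(pattern)


def children_match(elem, model: str) -> bool:
    """Check the direct children of `elem` against a content model."""
    encoded = "".join(child.tag + ";" for child in elem)
    return model_to_regex(model).fullmatch(encoded) is not None


r1_without_b = ET.fromstring("<r1><a/><c/><c/></r1>")
r2_reversed = ET.fromstring("<r2><d/><c/></r2>")
db_interleaved = ET.fromstring("<db><r2/><r1/><r2/></db>")

if __name__ == "__main__":
    print(children_match(r1_without_b, "(a,b?,c+)"))
    print(children_match(r2_reversed, "(c,d)"))
    print(children_match(r2_reversed, "((c,d) | (d,c))"))
    print(children_match(db_interleaved, "(r1*,r2*)"))
    print(children_match(db_interleaved, "((r1|r2)*)"))
```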

  48. <product> <name language="French" department="music"> trompette six trous </name> <price currency="Euro"> 420.12 </price> </product> • <!ATTLIST name language CDATA #REQUIRED department CDATA #IMPLIED> • <!ATTLIST price currency CDATA #IMPLIED>

  49. IDREF - the attribute's value is some other element's identifier • IDREFS - the attribute's value is a list of identifiers, separated by spaces • <!DOCTYPE family [ <!ELEMENT family (person*)> <!ELEMENT person (name)> <!ELEMENT name (#PCDATA)> <!ATTLIST person id ID #REQUIRED mother IDREF #IMPLIED father IDREF #IMPLIED children IDREFS #IMPLIED> ]>

  50. <family> <person id="jane" mother="mary" father="john"> <name> Jane Doe </name> </person> <person id="john" children="jane jack"> <name> John Doe </name> </person> <person id="mary" children="jane jack"> <name> Mary Smith </name> </person> <person id="jack" mother="mary" father="john"> <name> Jack Smith </name> </person> </family>
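
A validating parser checks that every IDREF or IDREFS value matches some ID in the document. The sketch below imitates that check by hand with Python's `xml.etree.ElementTree`, which does not read DTDs, so the attribute types are hard-coded from the family DTD. The sample document and its dangling reference "ghost" are made up for illustration, as is the helper name `dangling_references`.

```python
import xml.etree.ElementTree as ET

# Attribute types copied by hand from the family DTD.
IDREF_ATTRS = {"mother", "father"}
IDREFS_ATTRS = {"children"}

# Hypothetical sample: jack's mother refers to an id that does not exist.
FAMILY = """
<family>
  <person id="jane" mother="mary" father="john"><name>Jane Doe</name></person>
  <person id="john" children="jane jack"><name>John Doe</name></person>
  <person id="mary" children="jane jack"><name>Mary Smith</name></person>
  <person id="jack" mother="ghost" father="john"><name>Jack Smith</name></person>
</family>
"""


def dangling_references(root):
    """Return (person id, attribute, target) for every unresolved reference."""
    ids = {person.get("id") for person in root.iter("person")}
    problems = []
    for person in root.iter("person"):
        for attr, value in person.attrib.items():
            if attr in IDREF_ATTRS:
                targets = [value]
            elif attr in IDREFS_ATTRS:
                targets = value.split()  # IDREFS: identifiers separated by spaces
            else:
                continue
            for target in targets:
                if target not in ids:
                    problems.append((person.get("id"), attr, target))
    return problems


if __name__ == "__main__":
    print(dangling_references(ET.fromstring(FAMILY)))
```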
