Introduction to XML

An introduction to XML: a meta-markup language derived from SGML, the anatomy of an XML document, conformance and validation, and how XML is reshaping data representation on the web.

guyp

Presentation Transcript


  1. Introduction to XML • John Arnett, MSc • Standards Modeller, Information and Statistics Division, NHSScotland • Tel: 0131 551 8073 (x2073) • mailto:John.Arnett@isd.csa.scot.nhs.uk • http://isdscotland.org/xml

  2. Contents • What is XML? • Anatomy of an XML Document • Conformance and Validation • Summary • Find Out More

  3. What is XML? • XML is not… • a programming language • a software panacea • an object-oriented technology • HTML with funny tags • a replacement for HTML… but it is re-shaping publishing on the web

  4. What is XML? • Stands for Extensible Markup Language • Meta-markup language derived from SGML (Standard Generalised Markup Language) • Open Standard, currently XML 1.0 2nd edition (W3C Recommendation 6 October 2000)

  5. What is XML? • W3C says • XML is the universal format for structured documents and data on the Web • A data object is an XML document if it is well-formed, as defined in [the W3C] specification (more on this later)

  6. What is XML? • Data Content and Presentation • Sample dataset (flat file, database, spreadsheet, etc.)

     ID      SURNAME   FORENAME  SEX  DOB
     134376  Jones     Ian       0    06011971
     198457  McKenzie  Alison    1    23081983
     111672  Martin    Lesley    0    12111979
     147678  Jackson   Sarah     1    15061976

  7. What is XML? • Record – data oriented structure: 111672 Martin Lesley 0 12111979 • Structured • Searchable • Easy to understand • Portable

  8. What is XML? • HTML – document oriented structure • Record Id: 11672 Surname: Martin Given Name: Lesley Sex: Male Date of Birth: 12 November 1979 • <h1>Record Id: <font color="red">11672</font></h1> <table><colgroup><col align="left"></colgroup> <tr><th>Surname:</th><td>Martin</td></tr> <tr><th>Given Name:</th><td>Lesley</td></tr> <tr><th>Sex:</th><td>Male</td></tr> <tr><th>Date of Birth:</th><td>12 November 1979</td></tr> </table> • Easy to understand • Portable • Structured • Searchable

  9. What is XML? • XML to the rescue! <Record recordId="11672"> <Surname>Martin</Surname> <GivenName>Lesley</GivenName> <Sex>M</Sex> <DateOfBirth> <Day>12</Day><Month>11</Month><Year>1979</Year> </DateOfBirth> </Record> • Easy to understand • Portable • Structured • Searchable
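
Because each value in the record above is labelled by its tag, a program can look fields up by name rather than by column position. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the helper name `read_record` and the dictionary keys are illustrative assumptions, not part of the presentation.

```python
import xml.etree.ElementTree as ET

# The record from the slide, with plain ASCII quotes around the attribute value.
RECORD_XML = """
<Record recordId="11672">
  <Surname>Martin</Surname>
  <GivenName>Lesley</GivenName>
  <Sex>M</Sex>
  <DateOfBirth>
    <Day>12</Day><Month>11</Month><Year>1979</Year>
  </DateOfBirth>
</Record>
"""


def read_record(xml_text):
    """Return the record's fields as a dict, looked up by tag name (hypothetical helper)."""
    record = ET.fromstring(xml_text)
    dob = record.find("DateOfBirth")
    return {
        "id": record.get("recordId"),              # attribute on the root element
        "surname": record.findtext("Surname"),     # text of a child element
        "given_name": record.findtext("GivenName"),
        "sex": record.findtext("Sex"),
        # Reassemble the three date parts as YYYY-MM-DD.
        "date_of_birth": "-".join(
            dob.findtext(part) for part in ("Year", "Month", "Day")
        ),
    }


record = read_record(RECORD_XML)
print(record["surname"], record["date_of_birth"])
```

The lookup does not depend on the order of the child elements, which is one practical sense in which the record is "self-describing".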

  10. What is XML? • HTML and XML are… • Text based • Open standards • Widely used

  11. What is XML? • But XML also… • Structured • Separates data from presentation • Self-describing • Searchable • Extensible • i.e. any number of tags allowed

  12. Anatomy of an XML Document • XML documents consist of text • Unicode characters • tab, carriage return and line feed • markup • character data

  13. Anatomy of an XML Document • Markup <?xml version="1.0" encoding="UTF-8"?> <Message> <!-- this is an xml comment --> <MessageBody>Hello, World Wide Web!</MessageBody> </Message> • start-, end- and empty element tags • tag names are case sensitive! • entity and character references • comments

  14. Anatomy of an XML Document • Character data <?xml version="1.0" encoding="UTF-8"?> <Message> <!-- this is an xml comment --> <MessageBody>Hello, World Wide Web!</MessageBody> </Message> • Reserved characters • &, <, >, ' and "

  15. Anatomy of an XML Document • Declaration <?xml version="1.0" encoding="UTF-8"?> <Message> <!-- this is an xml comment --> <MessageBody>Hello, World Wide Web!</MessageBody> </Message> • Optional first line of markup (but W3C recommended) • Used to match documents to parsers

  16. Anatomy of an XML Document • Root Element <?xml version="1.0" encoding="UTF-8"?> <Message> <!-- this is an xml comment --> <MessageBody>Hello, World Wide Web!</MessageBody> </Message> • Uniquely named element • Contains all the data and links to other documents

  17. Anatomy of an XML Document • Elements <Book>XML Bible <Price>24.99</Price> <img src="book.gif"/> <Author>E.R. Harold</Author> <Publisher>J. Forbes</Publisher> </Book> • Define the content of the XML document • May contain other elements, character data or can be empty
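
The three kinds of element content mentioned above (other elements, character data, or nothing) can be seen by walking the `<Book>` example. This is a small sketch with Python's standard `xml.etree.ElementTree`; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The <Book> element from the slide, with plain ASCII quotes.
BOOK_XML = """<Book>XML Bible
  <Price>24.99</Price>
  <img src="book.gif"/>
  <Author>E.R. Harold</Author>
  <Publisher>J. Forbes</Publisher>
</Book>"""

book = ET.fromstring(BOOK_XML)

# Character data that appears before the first child element.
title = book.text.strip()

# Child elements, in document order.
child_tags = [child.tag for child in book]

# An empty element carries no character data and no children,
# although it may still have attributes.
img = book.find("img")
img_is_empty = img.text is None and len(img) == 0

print(title, child_tags, img_is_empty)
```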

  18. Anatomy of an XML Document • Attributes <BookCatalog Subject="XML"> <Book Title="XML Bible" Price="24.99"/> <Book Title="XML How To Program" Price="19.99"/> <Book Title="Definitive XML Schema" Price="44.99"/> </BookCatalog> • Add data about the elements
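
Attribute values always arrive as text, so a consumer converts them as needed. The sketch below reads the catalogue with Python's standard `xml.etree.ElementTree`, assuming the root tag is `BookCatalog` with a `Subject` attribute and that all quotes are plain ASCII; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

CATALOG_XML = """
<BookCatalog Subject="XML">
  <Book Title="XML Bible" Price="24.99"/>
  <Book Title="XML How To Program" Price="19.99"/>
  <Book Title="Definitive XML Schema" Price="44.99"/>
</BookCatalog>
"""

catalog = ET.fromstring(CATALOG_XML)

# Attribute on the root element.
subject = catalog.get("Subject")

# Map each title to its price, converting the attribute text to a number.
prices = {
    book.get("Title"): float(book.get("Price"))
    for book in catalog.iter("Book")
}

# Title of the book with the lowest price.
cheapest = min(prices, key=prices.get)
print(subject, cheapest)
```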

  19. Anatomy of an XML Document • Handling reserved characters • Built-in entities: & = &amp; " = &quot; < = &lt; > = &gt; ' = &apos; • CDATA Sections <CodeSnippet> <![CDATA[if(this->getX() < 5 && values[0] >= 10) cerr << "out of range";]]> </CodeSnippet>
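
Both techniques above should hand the same character data to the application. The sketch below checks that with Python's standard library: `xml.sax.saxutils.escape` produces the built-in entities, and `xml.etree.ElementTree` parses both forms back. The snippet text assumes the comparison in the slide was meant to be `>=`; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Text containing reserved characters (assumed form of the slide's snippet).
raw = 'if(this->getX() < 5 && values[0] >= 10) cerr << "out of range";'

# escape() handles &, < and > by default; quotes are added via the extra mapping.
escaped = escape(raw, {'"': "&quot;", "'": "&apos;"})

# Option 1: built-in entity references.
via_entities = "<CodeSnippet>" + escaped + "</CodeSnippet>"

# Option 2: a CDATA section, where markup characters are left as-is.
# (Only safe when the text does not contain the sequence "]]>".)
via_cdata = "<CodeSnippet><![CDATA[" + raw + "]]></CodeSnippet>"

text_from_entities = ET.fromstring(via_entities).text
text_from_cdata = ET.fromstring(via_cdata).text
print(text_from_entities == raw, text_from_cdata == raw)
```

Entities are usually the better choice for the odd reserved character; CDATA is convenient for longer passages such as code listings.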

  20. Anatomy of an XML Document • Namespaces • Preventing naming collisions <order xmlns:cust="http://www.example.com/custDetails" xmlns:book="http://www.example.com/bookDetails" xmlns="http://www.example.com/order"> <cust:title>Dr</cust:title> <cust:name>Peter Parker</cust:name> <book:title>White Teeth</book:title> <book:price>5.99</book:price> <orderNumber>AYT2379</orderNumber> </order>
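
The two `title` elements above stay distinct because each is qualified by a different namespace URI. A minimal sketch with Python's standard `xml.etree.ElementTree` follows; the prefix `ord` in the lookup table is an assumption made here for the default namespace, since prefixes used for searching need not match those in the document.

```python
import xml.etree.ElementTree as ET

ORDER_XML = """
<order xmlns:cust="http://www.example.com/custDetails"
       xmlns:book="http://www.example.com/bookDetails"
       xmlns="http://www.example.com/order">
  <cust:title>Dr</cust:title>
  <cust:name>Peter Parker</cust:name>
  <book:title>White Teeth</book:title>
  <book:price>5.99</book:price>
  <orderNumber>AYT2379</orderNumber>
</order>
"""

# Prefix-to-URI map used only for lookups; "ord" is a made-up prefix
# standing in for the document's default namespace.
NS = {
    "cust": "http://www.example.com/custDetails",
    "book": "http://www.example.com/bookDetails",
    "ord": "http://www.example.com/order",
}

order = ET.fromstring(ORDER_XML)
customer_title = order.findtext("cust:title", namespaces=NS)
book_title = order.findtext("book:title", namespaces=NS)
order_number = order.findtext("ord:orderNumber", namespaces=NS)
print(customer_title, book_title, order_number)
```

Note that unprefixed elements such as `orderNumber` belong to the default namespace, so they also need a qualified name when searched for.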

  21. Conformance and Validation • All XML processors must check well-formedness constraints • One root element • Start and end tags match <Tag>content</Tag> • Empty elements are terminated as <Tag/> • Tags are correctly nested <Parent><Child></Child></Parent> • All attribute values enclosed in "quotes"
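
The well-formedness rules above can be exercised with any XML parser, since a conforming parser must reject documents that break them. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample strings are illustrative.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# One sample per rule: first the well-formed cases from the slide...
good = [
    "<Tag>content</Tag>",
    "<Tag/>",
    "<Parent><Child></Child></Parent>",
    '<Tag attr="value"/>',
]
# ...then one violation of each rule.
bad = [
    "<A/><B/>",                          # more than one root element
    "<Tag>content</tag>",                # end tag does not match (case sensitive)
    "<Tag>",                             # element never terminated
    "<Parent><Child></Parent></Child>",  # incorrect nesting
    "<Tag attr=value/>",                 # attribute value not quoted
]

for sample in good + bad:
    print(sample, is_well_formed(sample))
```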

  22. Conformance and Validation • Validating XML processors check against validity constraints • specified in Document Type Definitions (DTDs) or Schemas • a valid XML document must be well-formed • a well-formed document need not necessarily be valid

  23. Document Type Definitions • DTD syntax able to specify • Structure and order of child elements <!ELEMENT Product (Name, Size?)> <!ELEMENT Name (#PCDATA)> <!ELEMENT Size (#PCDATA)> • Element attributes <!ATTLIST Product EffDate CDATA #IMPLIED> • limited number of data types • default and fixed attribute values
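
To make the content model `(Name, Size?)` concrete, the sketch below checks it by hand: a `Product` must contain one `Name`, optionally followed by one `Size`. This is not a DTD validator. Python's standard `xml.etree.ElementTree` does not validate against DTDs, so the function `matches_product_model` is a hypothetical stand-in that covers only the order of child elements and ignores the `#PCDATA` and attribute declarations.

```python
import xml.etree.ElementTree as ET


def matches_product_model(xml_text):
    """Hand-rolled check of <!ELEMENT Product (Name, Size?)> (illustration only)."""
    product = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    child_tags = [child.tag for child in product]
    # Name is required; Size is optional and must come after Name.
    return product.tag == "Product" and child_tags in (["Name"], ["Name", "Size"])


samples = [
    "<Product><Name>Widget</Name></Product>",
    "<Product><Name>Widget</Name><Size>3</Size></Product>",
    "<Product><Size>3</Size><Name>Widget</Name></Product>",  # wrong order
    "<Product><Size>3</Size></Product>",                     # Name missing
]

for sample in samples:
    print(sample, matches_product_model(sample))
```

All four samples are well-formed, but only the first two follow the declared structure, which illustrates the difference between well-formedness and validity.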

  24. Document Type Definitions • DTDs • Easy to understand and implement • Lightweight alternative to schemas • But… • use non-XML syntax • only limited support for data typing and namespaces • difficult to extend

  25. Schemas • W3C Schema • Uses XML syntax • Provides built-in and supports user-defined data types • Supports namespaces • Provides several extensibility mechanisms

  26. Schemas • Schemas therefore more flexible… <xs:element name="Product"> <xs:complexType> <xs:sequence> <xs:element name="Name" type="xs:string"/> <xs:element name="Size" type="xs:positiveInteger" minOccurs="0"/> </xs:sequence> <xs:attribute name="EffDate" type="xs:date"/> </xs:complexType> </xs:element> • but harder to understand than DTDs <!ELEMENT Product (Name, Size?)> <!ELEMENT Name (#PCDATA)> <!ELEMENT Size (#PCDATA)> <!ATTLIST Product EffDate CDATA #IMPLIED>

  27. In Summary… • A language for describing markup languages • Extensible, i.e. define your own tags • Readable, structured and self-describing • Documents must be well-formed • Documents may be validated using DTDs and/or Schemas

  28. Find Out More • World Wide Web Consortium • www.w3.org • W3C XML v1.0 Specification • http://www.w3.org/TR/REC-xml

  29. Find Out More • The XML Industry Portal • www.xml.org • O’Reilly XML site • www.xml.com • XML Cover Pages • www.oasis-open.org/cover/ • Café Con Leche • www.ibiblio.org/xml/

  30. Find Out More • Scottish Health and Community Care XML Steering Group • www.isdscotland.org/xml

  31. XML Tools • XSV - Open Source XML Schema Validator • www.ltg.ed.ac.uk/~ht/xsv-status.html • MSXML 4.0 • www.microsoft.com/downloads/details.aspx?FamilyID=3144b72b-b4f2-46da-b4b6-c5d7485f2b42

  32. XML Tools • XML Spy 2004 IDE • www.altova.com/products_ide.html • Free XML Tools and Software • www.garshol.priv.no/download/xmltools/

  33. Printed Sources • Numerous printed sources – for more information visit • Charles F. Goldfarb's www.xmlbooks.com • www.amazon.com
