Database & Information Systems
XML Motivation & Syntax
Monica Farrow, G30
email: monica@macs.hw.ac.uk
XML Topics
• Motivation
• Syntax
• Describing the document
  • DTD, XML Schema
• Accessing the elements using XPath
• Using XML
• Transforming and querying XML
  • XSLT, XPath, XQuery
• XML & Databases
• Programming APIs (DOM, SAX) used with XML
XML - Motivation & Syntax
XML in One Slide
• Basically, XML is an annotated text file. The format is similar to HTML.
• However, in XML you can use any tag names you want to describe the data.
• Example:
  <person>
    <name>Lisa Simpson</name>
    <tel>0131-828-1234</tel>
    <tel>078-4701-7775</tel>
    <email>lisa@macs.hw.ac.uk</email>
  </person>
Motivation
• XML allows us to create machine-readable text files, enabling
  • Exchange of data over a network
  • Separation of content from presentation
    • “Write once, read anywhere”
• The Semantic Web
  • A machine-understandable Web
  • The meaning of data (i.e., the semantics of data) should be encoded together with the data
Newsfeeds
• News can be exported as RSS, and this data can easily be used by a program
• Browsers such as Firefox enable you to add RSS feeds to your webpage
RSS example
• RSS stands for Really Simple Syndication
• The latest news on topics you’ve subscribed to arrives at your RSS reader (here the browser)
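To show how a program can use RSS data, here is a minimal sketch in Python using the standard-library xml.etree.ElementTree module. The feed text, its titles and its URLs are all made up for illustration, and the helper name read_headlines is hypothetical. It assumes the usual RSS 2.0 layout of item elements, each with a title and a link.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed: the channel, headlines and links are invented.
RSS_TEXT = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://www.example.com/</link>
    <description>Made-up headlines</description>
    <item>
      <title>First headline</title>
      <link>http://www.example.com/1</link>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://www.example.com/2</link>
    </item>
  </channel>
</rss>"""


def read_headlines(rss_text):
    """Return a list of (title, link) pairs, one per <item> in the feed."""
    root = ET.fromstring(rss_text)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


for title, link in read_headlines(RSS_TEXT):
    print(title, "->", link)
```

A real RSS reader would download the feed from a URL and poll it for new items. The parsing step would look much the same.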
Business data exchange
• Solution: use XML at every step of the way for data exchange
Application data
• A standard method to access information, making it easier for applications and devices of all kinds to use, store, transmit, and display data
• For example, an application may store data in XML files to keep track of the updates used
  • Version number, file names, installation time, etc.
Write Once, Use Everywhere
[Figure: one XML document is transformed by XSL stylesheets into WML (hand-held devices), HTML (web browser) and text (Excel).]
Semantic integration: Doctor’s Appointment (“The Semantic Web”, Scientific American, May 2001)
[Figure: Mom needs treatment. Software agents (Physician’s Agent, Lucy’s Agent, Pete’s Agent) consult insurance company, rating and provider sites (required treatment, in-plan?, close-by?, specialist?) and driving schedules to arrange treatment and schedule an appointment. Lucy and Pete will each drive her there if free.]
Some existing XML languages
• XHTML
  • XML-compatible version of HTML
• DocBook
  • For any documentation. Tags such as title, chapter, para, etc.
• ODF
  • OpenDocument Format. For office documents such as word processing or spreadsheets. Used by OpenOffice.
• MathML
  • To describe mathematical formulae
XML Syntax
XML Overview
• XML is a ‘human-legible’ simplified subset of the Standard Generalized Markup Language (SGML), on which HTML is also based
• Data is divided into elements and attributes. Each element is surrounded by a start tag and an end tag.
  <tel>0131-444 7777</tel>
• Tag names are chosen to reflect the meaning of the element content
  • (In HTML, tag names are chosen to indicate page structure)
[Figure: relationship between SGML, XML and HTML]
Terminology
• The segment of an XML document between an opening tag and the corresponding closing tag is called an element
• Elements may contain text or other elements
  <person>
    <name>Bart Simpson</name>
    <tel>0131-444 7777</tel>
    <tel>078-4011 6022</tel>
    <email>bart@ed.ac.uk</email>
  </person>
• <person> is an element that contains other elements
• <name> is an element that contains text
• There can be more than one element with the same tag name (here, <tel>)
XML Document is a Tree
  person
    name   Bart Simpson
    tel    0131-444 7777
    tel    078-4011 6022
    email  bart@ed.ac.uk
• XML documents are abstractly modeled as trees, as reflected by their nesting
• Sometimes, XML documents are graphs (by using IDs and IDREFs to link elements)
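The tree view can be made concrete with a short sketch in Python, using the standard-library xml.etree.ElementTree module. The helper name outline is hypothetical. It walks the parsed person element and produces one indented line per node, so the nesting in the text becomes visible as depth in the tree.

```python
import xml.etree.ElementTree as ET

PERSON_XML = """<person>
  <name>Bart Simpson</name>
  <tel>0131-444 7777</tel>
  <tel>078-4011 6022</tel>
  <email>bart@ed.ac.uk</email>
</person>"""


def outline(element, depth=0):
    """Return one indented line per element: the tag name plus any text."""
    text = (element.text or "").strip()
    lines = ["  " * depth + element.tag + (": " + text if text else "")]
    for child in element:  # children come back in document order
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(PERSON_XML)
print("\n".join(outline(root)))
# Two elements share the tag name "tel":
print(len(root.findall("tel")))
```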
Elements Can Be Nested
  <addresses>
    <person>
      <name>Donald Duck</name>
      <tel>0131-8281345</tel>
      <tel>0131-8281374</tel>
      <email>donald@macs.hw.ac.uk</email>
    </person>
    <person>
      <name>Mickey Mouse</name>
      <tel>0141-4261142</tel>
    </person>
  </addresses>
A Complete XML Document
  <?xml version="1.0" encoding="UTF-8"?>
  <!DOCTYPE addresses SYSTEM "http://www.addbook.com/addresses.dtd">
  <addresses>
    <person>
      <name>Lisa Simpson</name>
      <tel>0131-828 1234</tel>
      <tel>078-4701 7775</tel>
      <email>lisa@macs.hw.ac.uk</email>
    </person>
  </addresses>
[Callouts on the slide mark parts of the document as Required or Optional]
Attributes
• An opening tag may contain attributes
• These are typically used to describe the contents of an element
  <entry>
    <word language="en">cheese</word>
    <word language="fr">fromage</word>
    <word language="ro">branza</word>
    <meaning>A food made …</meaning>
  </entry>
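As a sketch of how a program reads attributes, the Python below uses the standard-library xml.etree.ElementTree module. The helper name words_by_language is hypothetical. Attribute values are read with get(), while the element content is read from text.

```python
import xml.etree.ElementTree as ET

ENTRY_XML = """<entry>
  <word language="en">cheese</word>
  <word language="fr">fromage</word>
  <word language="ro">branza</word>
  <meaning>A food made …</meaning>
</entry>"""


def words_by_language(entry_xml):
    """Map each <word> element's language attribute to the word itself."""
    entry = ET.fromstring(entry_xml)
    return {word.get("language"): word.text for word in entry.findall("word")}


print(words_by_language(ENTRY_XML))
```

Here the attribute says how to interpret the content (which language the word is in), and the content is the data itself.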
When to Use Attributes
• It’s not always clear when to use attributes
• As a child element:
  <person>
    <ssno>123 4589</ssno>
    <name>L. Simpson</name>
    <email>lisa@macs.hw.ac.uk</email>
    ...
  </person>
• As an attribute:
  <person ssno="123 4589">
    <name>L. Simpson</name>
    <email>lisa@macs.hw.ac.uk</email>
    ...
  </person>
When to Use Attributes
• It’s not always clear when to use attributes
• General rule:
  • Use an attribute to describe how the data should be interpreted (e.g. language, currency)
  • Use an attribute for “IDs”, i.e., identifying data (covered later)
Rules for XML (1)
• XML is order-sensitive, i.e., the following are different:
  <entry>
    <word language="en">cheese</word>
    <word language="fr">fromage</word>
  </entry>
  <entry>
    <word language="fr">fromage</word>
    <word language="en">cheese</word>
  </entry>
• XML is case-sensitive, i.e., the following are different:
  <person>, <Person>, <PERSON>
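Both rules can be observed from a program. The sketch below, in Python with the standard-library xml.etree.ElementTree module, parses the two entries and compares the order of their children, then compares tag names that differ only in case. The helper names child_texts and tag_of are hypothetical.

```python
import xml.etree.ElementTree as ET

ENTRY_A = ('<entry><word language="en">cheese</word>'
           '<word language="fr">fromage</word></entry>')
ENTRY_B = ('<entry><word language="fr">fromage</word>'
           '<word language="en">cheese</word></entry>')


def child_texts(xml_text):
    """Return the text of each child element, in document order."""
    return [child.text for child in ET.fromstring(xml_text)]


def tag_of(xml_text):
    """Return the tag name of the root element, exactly as written."""
    return ET.fromstring(xml_text).tag


# Order matters: the same words in a different order give a different document.
print(child_texts(ENTRY_A) == child_texts(ENTRY_B))
# Case matters: these are three different tag names.
print(tag_of("<person/>"), tag_of("<Person/>"), tag_of("<PERSON/>"))
```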
Rules for XML (2)
• Tags come in pairs: <date> ... </date>
• They must be properly nested
  • Good: <date> ... <day> ... </day> ... </date>
  • Bad: <date> ... <day> ... </date> ... </day>
  • Bad: <date> ... </Date>
• There is a special shortcut for tags that have no text or sub-elements in between them (empty elements, also called bachelor tags)
  • <img src="myPic.jpg" /> instead of
  • <img src="myPic.jpg"></img>
Rules for XML (3)
• There should be exactly one top-level element
  • This element is also called the root element
• Legal:
  <?xml version="1.0"?>
  <Question> This is legal </Question>
• Not legal (two top-level elements):
  <?xml version="1.0"?>
  <Question> Is this legal? </Question>
  <Answer> No. </Answer>
Well Formed Documents
• A document is well-formed if
  • It has exactly one top-level element
  • Tags come in properly nested, case-sensitive pairs
    • Empty elements may use the accepted shortcut />
  • Attribute values are enclosed in quotes
  • Attribute names are not repeated within a tag
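These rules can be tried out with a parser. The sketch below uses Python's standard-library xml.etree.ElementTree module, which raises ParseError when its parser rejects the input. The helper name is_well_formed and the example strings are hypothetical. Treat it as a quick check under those assumptions, not a full conformance test.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


EXAMPLES = {
    "one root, nested pairs": "<date><day>1</day></date>",
    "empty-element shortcut": '<img src="myPic.jpg" />',
    "badly nested": "<date><day>1</date></day>",
    "case mismatch": "<date>1</Date>",
    "two top-level elements": "<Question>Legal?</Question><Answer>No.</Answer>",
    "unquoted attribute value": "<img src=myPic.jpg />",
    "repeated attribute name": '<person phone="1" phone="2" />',
}

for label, text in EXAMPLES.items():
    print(label, "->", is_well_formed(text))
```

Only the first two examples are accepted. Each of the others breaks one of the rules listed above.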
Why is this not well-formed?
  <?XML version ="1.0" encoding="UTF-8">
  <person phone= 0131-828 1234 phone=078-4701 7775 >
    <Name>
      <first>Homer <second>Simpson </first></second>
    </name>
  <person phone= 0131-828 1235 >
    <Name>
      <first>Lisa <second>Simpson </first></second>
    </name>
IDs and Referencing
• Unique elements can be identified with an id, and referred to from other elements
• In this way, relationships between elements can be shown without repetition
• E.g.
  • Each person has an ID. Each person can contain a reference to the ID of their mother, father, children
  • Books and authors can be listed. But each book may have more than one author, and each author might write more than one book. So the book can contain a reference to the author, etc.
Referencing example
  <family>
    <person id="lisa" mother="marge" father="homer">
      <name>Lisa Simpson</name>
    </person>
    <person id="bart" mother="marge" father="homer">
      <name>Bart Simpson</name>
    </person>
    <person id="marge" children="bart lisa">
      <name>Marge Simpson</name>
    </person>
    <person id="homer" children="bart lisa">
      <name>Homer Simpson</name>
    </person>
  </family>
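One way a program might follow these references is sketched below in Python, using the standard-library xml.etree.ElementTree module. The helper names index_by_id, parents_of and children_of are hypothetical. Without a DTD declaring them as ID and IDREF types, the parser treats id, mother, father and children as ordinary attributes, so the sketch builds the lookup table by hand.

```python
import xml.etree.ElementTree as ET

FAMILY_XML = """<family>
  <person id="lisa" mother="marge" father="homer">
    <name>Lisa Simpson</name>
  </person>
  <person id="bart" mother="marge" father="homer">
    <name>Bart Simpson</name>
  </person>
  <person id="marge" children="bart lisa">
    <name>Marge Simpson</name>
  </person>
  <person id="homer" children="bart lisa">
    <name>Homer Simpson</name>
  </person>
</family>"""


def index_by_id(root):
    """Map each person's id attribute to its element."""
    return {person.get("id"): person for person in root.iter("person")}


def name_of(person):
    return person.findtext("name").strip()


def parents_of(people, person_id):
    """Follow the mother and father references, if present."""
    person = people[person_id]
    return [name_of(people[person.get(role)])
            for role in ("mother", "father") if person.get(role)]


def children_of(people, person_id):
    """Follow the space-separated list of ids in the children attribute."""
    refs = people[person_id].get("children", "")
    return [name_of(people[ref]) for ref in refs.split()]


people = index_by_id(ET.fromstring(FAMILY_XML))
print(parents_of(people, "lisa"))
print(children_of(people, "marge"))
```

Each person's name is stored once. The relationships are expressed only through ids, so nothing is repeated.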
XML Authoring
• There are many authoring tools available to facilitate the creation of XML documents
  • E.g., XMLSpy, XMetaL
• However, you may as well start off using a simple text editor, ideally an XML-aware one
  • XML is, after all, just a text file
  • You are then responsible for checking that the XML is correct!
Viewing and checking XML
• Loading the file into a browser is perhaps the simplest way to check that XML is well formed
• If well-formed XML is loaded into your browser, it will be displayed as a tree structure
Viewing and checking XML
• If incorrect XML is loaded into your browser, error messages will be displayed
Defining the structure of an XML file
• We can check if an XML file is well-formed
  • By looking at it, maybe
  • By loading it into a browser
    • If well-formed, it will be displayed
• However, how can we check that the well-formed file contains the correct elements in the correct quantities?
  • We need to write a specification for the XML file
  • See the next lecture
Exercise
• Write an example of an XML file containing 2 or 3 records which holds information about holiday homes for rent
  • Each home has an id, a name and a location
  • Additionally, each home has one or more sets of contact details. Contact details consist of a name and a phone number, and optionally an email address and website.
  • In your example, demonstrate optional or repeated elements