XPath Laks V.S. Lakshmanan UBC CPSC 534B
Overview • data model recap • XPath examples • some advanced features • summary
XPath in the beginning • used to be part of XSLT • A formal semantics of patterns in XSLT (Phil Wadler) • also influenced XLink, XPointer • resources: • www.w3c.org/TR/xpath • Galax (complete reference impl. of XQuery) http://db.bell-labs.com/galax/ • (w3c.org: a major resource for many XML and other web-related topics, incl. XQuery, semantic web, etc.)
Example XML DB
<bib>
  <book price="15">
    <title> What is the name of this book?</title>
    <author nationality="american"> <first> Raymond</first> <last>Smullyan</last> </author>
    <publisher>Penguin</publisher>
    <year>1970</year>
  </book>
  <book>
    <publisher><name>Bentam Books</name><address>New York</address></publisher>
    <author><first>Douglas</first><mi>R</mi><last>Hofstadter</last></author>
    <author>D.C. Dennett</author>
    <title>The Mind’s I: Reflections on Self and Soul</title>
    <year>1981</year>
  </book>
</bib>
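Corresponding Tree [Figure: tree for the example DB. The root node has comments, processing instructions and the document element bib as children; bib has book children; a book has a price attribute and title, author and publisher element children.] • Note the distinction between the two roots: the root node and the document (root) element. • Attributes: unordered; usually single valued (exception: IDREFS). • Elements: ordered; no a priori cardinality constraint.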
Simple Examples • /bib/book/publisher • answer: <publisher>Penguin</publisher> <publisher><name>Bentam Books</name> <address>New York</address></publisher> • /bib/book/author/name • what’s the answer? • / -- returns the root node, while • /bib -- returns the document (root) element, i.e., the bib element under the root.
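The two simple paths above can be tried out with a minimal sketch in Python's standard xml.etree.ElementTree, which supports only a subset of XPath. Assumptions not in the slides: attribute values are quoted so the example parses, the variable names are illustrative, and paths are written relative to the bib element because ElementTree does not evaluate absolute paths on an element.

```python
import xml.etree.ElementTree as ET

# The example DB from the slides, with attribute values quoted (assumption).
BIB = """<bib>
<book price="15">
<title> What is the name of this book?</title>
<author nationality="american"><first> Raymond</first><last>Smullyan</last></author>
<publisher>Penguin</publisher>
<year>1970</year>
</book>
<book>
<publisher><name>Bentam Books</name><address>New York</address></publisher>
<author><first>Douglas</first><mi>R</mi><last>Hofstadter</last></author>
<author>D.C. Dennett</author>
<title>The Mind’s I: Reflections on Self and Soul</title>
<year>1981</year>
</book>
</bib>"""

bib = ET.fromstring(BIB)  # the document element <bib>

# /bib/book/publisher, written relative to bib
publishers = bib.findall("./book/publisher")

# /bib/book/author/name, written relative to bib
names = bib.findall("./book/author/name")

for p in publishers:
    print(ET.tostring(p, encoding="unicode").strip())
print(names)
```

Running it prints both publisher elements, and shows what the second query selects on this document.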
Descendants • /bib/book//address • answer: <address>New York</address> • /bib/book//mi • answer: <mi>R</mi> • //title • answer: <title> What is the name of this book?</title> <title>The Mind’s I: Reflections on Self and Soul</title> Note: results are ordered as per input document order.
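A small sketch of the descendant step, again using Python's xml.etree.ElementTree as a stand-in XPath engine (an assumption; the slides name no engine). Paths are written relative to the bib element, and attribute values in the data are quoted so that it parses.

```python
import xml.etree.ElementTree as ET

# The example DB from the slides, with attribute values quoted (assumption).
BIB = """<bib>
<book price="15">
<title> What is the name of this book?</title>
<author nationality="american"><first> Raymond</first><last>Smullyan</last></author>
<publisher>Penguin</publisher>
<year>1970</year>
</book>
<book>
<publisher><name>Bentam Books</name><address>New York</address></publisher>
<author><first>Douglas</first><mi>R</mi><last>Hofstadter</last></author>
<author>D.C. Dennett</author>
<title>The Mind’s I: Reflections on Self and Soul</title>
<year>1981</year>
</book>
</bib>"""

bib = ET.fromstring(BIB)

addresses = bib.findall("./book//address")  # /bib/book//address
mis = bib.findall("./book//mi")             # /bib/book//mi
titles = bib.findall(".//title")            # //title

# findall returns matches in document order.
print([a.text for a in addresses])
print([m.text for m in mis])
print([t.text.strip() for t in titles])
```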
Wildcard • //author/* • answer: <first> Raymond</first> <last>Smullyan</last> <first>Douglas</first><mi>R</mi><last>Hofstadter</last> • Why is information returned for only two authors? • Note: * matches any element. • What does //* return? • Is the answer identical to that for /bib?
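The wildcard queries can be explored with the following sketch in Python's xml.etree.ElementTree (a stand-in engine, not something the slides prescribe). One caveat that is specific to this sketch: the paths are evaluated relative to the bib element, so ".//*" lists the descendants of bib and leaves out bib itself, whereas //* in XPath also selects bib.

```python
import xml.etree.ElementTree as ET

# The example DB from the slides, with attribute values quoted (assumption).
BIB = """<bib>
<book price="15">
<title> What is the name of this book?</title>
<author nationality="american"><first> Raymond</first><last>Smullyan</last></author>
<publisher>Penguin</publisher>
<year>1970</year>
</book>
<book>
<publisher><name>Bentam Books</name><address>New York</address></publisher>
<author><first>Douglas</first><mi>R</mi><last>Hofstadter</last></author>
<author>D.C. Dennett</author>
<title>The Mind’s I: Reflections on Self and Soul</title>
<year>1981</year>
</book>
</bib>"""

bib = ET.fromstring(BIB)

# //author/* : element children of every author
author_children = bib.findall(".//author/*")
print([e.tag for e in author_children])

# //* relative to bib: every element below bib, in document order
all_below_bib = bib.findall(".//*")
print(len(all_below_bib), [e.tag for e in all_below_bib])
```

The third author, D.C. Dennett, contributes nothing to the first result because that author element has only text content and no element children.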
Attributes • XML data model: different kinds of nodes: element, attribute, text, comment, processing instruction, ... • /bib/book/@price • answer: "15". Contrast with the answers to the previous queries. • /bib/book/@* • what do you think it should return?
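Python's xml.etree.ElementTree cannot return attribute nodes from a path, so the sketch below emulates the two attribute queries: it selects the book elements with a path and then reads their attributes through the element API. This emulation, the quoted attribute values and the variable names are assumptions of the sketch.

```python
import xml.etree.ElementTree as ET

# The example DB from the slides, with attribute values quoted (assumption).
BIB = """<bib>
<book price="15">
<title> What is the name of this book?</title>
<author nationality="american"><first> Raymond</first><last>Smullyan</last></author>
<publisher>Penguin</publisher>
<year>1970</year>
</book>
<book>
<publisher><name>Bentam Books</name><address>New York</address></publisher>
<author><first>Douglas</first><mi>R</mi><last>Hofstadter</last></author>
<author>D.C. Dennett</author>
<title>The Mind’s I: Reflections on Self and Soul</title>
<year>1981</year>
</book>
</bib>"""

bib = ET.fromstring(BIB)

# Emulates /bib/book/@price: books that have a price attribute, then its value.
prices = [b.get("price") for b in bib.findall("./book[@price]")]

# Emulates /bib/book/@*: all attributes of each book, as name/value pairs.
all_attrs = [dict(b.attrib) for b in bib.findall("./book")]

print(prices)
print(all_attrs)
```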
Branching & Qualifiers/Predicates • /bib/book/author[mi] • returns only the author who has an mi child (an author of the second book). • /bib/book[author/@nationality='american'] • returns only the first book. • /bib/book[publisher[address][name]][@price<20]//title • returns the titles of books that have a publisher with a name and an address, and a price < 20.
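The three predicate queries can be checked with a sketch in Python's xml.etree.ElementTree. That module supports simple predicates such as [mi] and [@attr='value'], but not path-valued or numeric predicates, so the second and third queries are emulated with plain Python filters. The helper name cheap_with_full_publisher is hypothetical, and attribute values are assumed to be quoted.

```python
import xml.etree.ElementTree as ET

# The example DB from the slides, with attribute values quoted (assumption).
BIB = """<bib>
<book price="15">
<title> What is the name of this book?</title>
<author nationality="american"><first> Raymond</first><last>Smullyan</last></author>
<publisher>Penguin</publisher>
<year>1970</year>
</book>
<book>
<publisher><name>Bentam Books</name><address>New York</address></publisher>
<author><first>Douglas</first><mi>R</mi><last>Hofstadter</last></author>
<author>D.C. Dennett</author>
<title>The Mind’s I: Reflections on Self and Soul</title>
<year>1981</year>
</book>
</bib>"""

bib = ET.fromstring(BIB)

# /bib/book/author[mi]
authors_with_mi = bib.findall("./book/author[mi]")

# /bib/book[author/@nationality='american'], emulated with a filter
american_books = [
    b for b in bib.findall("./book")
    if b.find("./author[@nationality='american']") is not None
]

def cheap_with_full_publisher(book):
    """True if book matches [publisher[address][name]][@price<20]."""
    has_pub = any(
        p.find("address") is not None and p.find("name") is not None
        for p in book.findall("publisher")
    )
    price = book.get("price")
    return has_pub and price is not None and float(price) < 20

# /bib/book[publisher[address][name]][@price<20]//title
titles = [
    t for b in bib.findall("./book")
    if cheap_with_full_publisher(b)
    for t in b.iter("title")
]

print([a.find("last").text for a in authors_with_mi])
print([b.get("price") for b in american_books])
print(titles)
```

On this particular document the third query selects nothing: the only book with a price has a publisher without name and address children, and the only publisher with both belongs to a book without a price.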
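Reaching out at other nodes • XPath has the node tests text() and node(), and the function name(). Meanings illustrated below. • /bib/book/publisher/text() • answer: Penguin • why doesn’t the second publisher appear? • /bib/book/node() • returns all child nodes of book, regardless of type (element, text, comment, processing instruction); attributes are not children, so they are not returned. • /bib/*/name() -- returns the tag of the current element (a function call as the last step requires XPath 2.0).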
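Python's xml.etree.ElementTree has no text(), node() or name() in its path language, so the sketch below approximates them through the element API: .text for the text directly inside an element, iteration for element children, and .tag for the name. This mapping is an assumption of the sketch and is only approximate, since ElementTree does not model text, comment and processing-instruction nodes the way the XPath data model does.

```python
import xml.etree.ElementTree as ET

# The example DB from the slides, with attribute values quoted (assumption).
BIB = """<bib>
<book price="15">
<title> What is the name of this book?</title>
<author nationality="american"><first> Raymond</first><last>Smullyan</last></author>
<publisher>Penguin</publisher>
<year>1970</year>
</book>
<book>
<publisher><name>Bentam Books</name><address>New York</address></publisher>
<author><first>Douglas</first><mi>R</mi><last>Hofstadter</last></author>
<author>D.C. Dennett</author>
<title>The Mind’s I: Reflections on Self and Soul</title>
<year>1981</year>
</book>
</bib>"""

bib = ET.fromstring(BIB)

# Approximates /bib/book/publisher/text(): non-blank text directly inside publisher.
pub_text = [
    p.text for p in bib.findall("./book/publisher")
    if p.text and p.text.strip()
]

# Approximates the element part of /bib/book/node(): child elements of each book.
child_tags = [[c.tag for c in book] for book in bib.findall("./book")]

# Approximates /bib/*/name(): the tag of each child of bib.
child_names = [e.tag for e in bib.findall("./*")]

print(pub_text)
print(child_tags)
print(child_names)
```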
Mixing it all • /bib/book[author[hobby='tennis']][title/text()]//year • what does it say? • Features of XPath seen so far: tree pattern queries. [Figure: a tree pattern query over variables $x, $y, $z, $w with one distinguished node, annotated with conditions such as $x.tag=bib & $y.tag=comment & $z.tag=publisher ...]
XPath – Summary • / -- matches the root. • /bib -- matches the bib element under the root. • bib -- matches any bib element. • * -- matches any element. • bib/book -- matches any book element under a bib element. • bib//book -- ditto, but at any depth. • //book -- matches any book element at any depth in the document. • author|editor -- matches any author or editor element. • @hobby -- matches any hobby attribute. • //author/@hobby -- matches any hobby attribute of an author at any depth of the doc. • /bib/book[author[@hobby]][@price<20]//publisher -- what does it match?
XPath – The 13 axes • child • descendant • attribute • descendant-or-self • following • following-sibling • ancestor • ancestor-or-self • parent • preceding • preceding-sibling • self • namespace
Some Abbreviations • child::book/child::author ≡ book/author • child::book/descendant::mi ≡ book//mi • child::first/parent::* ≡ first/.. • child::book/attribute::price ≡ book/@price • child::book/child::author/parent::*/child::year ≡ book[author]/year • /bib//mi[ancestor::book] ≡ ? • /bib//mi/ancestor::book//publisher ≡ ? • /bib//mi/ancestor::*//publisher ≡ ?
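A sketch of some of these axis steps in Python's xml.etree.ElementTree, which understands the abbreviation ".." but has no ancestor axis. The ancestor step is therefore emulated with a child-to-parent map; that map and the ancestors helper are assumptions of the sketch, not part of XPath or of the slides.

```python
import xml.etree.ElementTree as ET

# The example DB from the slides, with attribute values quoted (assumption).
BIB = """<bib>
<book price="15">
<title> What is the name of this book?</title>
<author nationality="american"><first> Raymond</first><last>Smullyan</last></author>
<publisher>Penguin</publisher>
<year>1970</year>
</book>
<book>
<publisher><name>Bentam Books</name><address>New York</address></publisher>
<author><first>Douglas</first><mi>R</mi><last>Hofstadter</last></author>
<author>D.C. Dennett</author>
<title>The Mind’s I: Reflections on Self and Soul</title>
<year>1981</year>
</book>
</bib>"""

bib = ET.fromstring(BIB)

# first/.. (i.e. child::first/parent::*), here applied to every first element
parents_of_first = bib.findall(".//first/..")

# book[author]/year (i.e. child::book/child::author/parent::*/child::year)
years = bib.findall("./book[author]/year")

# Emulated ancestor axis: map each element to its parent, then walk upwards.
parent = {child: p for p in bib.iter() for child in p}

def ancestors(node):
    """Yield the proper ancestors of node, nearest first (hypothetical helper)."""
    while node in parent:
        node = parent[node]
        yield node

# /bib//mi/ancestor::book//publisher
publishers = [
    pub
    for mi in bib.iter("mi")
    for a in ancestors(mi)
    if a.tag == "book"
    for pub in a.iter("publisher")
]

print([e.tag for e in parents_of_first])
print([y.text for y in years])
print([p.find("name").text for p in publishers])
```

Unlike an XPath engine, the list comprehension would not remove duplicates if several mi elements shared a book; with this document there is only one mi, so the point does not arise.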
More examples • /bib/descendant::*[name()='address'] ≡ /bib//address • /bib//book//first/parent::*[name()='author'] • /bib//book//mi[ancestor::*[name()='author' or name()='editor']] • navigation axes increase expressive power • BUT, when the schema is known, XPath expressions can often be simplified
Simplifying XPE with schema • bib example schema graph S [Figure: schema graph S over the elements bib, book, editor, author, title, publisher, year, first, mi, last, name and address; edges carry cardinality labels (*, +, ?, 1).] • /bib//book//first/parent::*[name()='author'] ≡ /bib//book//author[first] • /bib//book//mi[ancestor::*[name()='author' or name()='editor']] ≡ (under S) /bib//book//*[name()='author' or name()='editor']//mi
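The first rewriting can be sanity-checked on the example document with a short sketch in Python's xml.etree.ElementTree. This only shows that both forms select the same nodes on one document; it is not a proof of equivalence. The name() test is emulated by comparing .tag, which is an assumption of the sketch.

```python
import xml.etree.ElementTree as ET

# The example DB from the slides, with attribute values quoted (assumption).
BIB = """<bib>
<book price="15">
<title> What is the name of this book?</title>
<author nationality="american"><first> Raymond</first><last>Smullyan</last></author>
<publisher>Penguin</publisher>
<year>1970</year>
</book>
<book>
<publisher><name>Bentam Books</name><address>New York</address></publisher>
<author><first>Douglas</first><mi>R</mi><last>Hofstadter</last></author>
<author>D.C. Dennett</author>
<title>The Mind’s I: Reflections on Self and Soul</title>
<year>1981</year>
</book>
</bib>"""

bib = ET.fromstring(BIB)

# /bib//book//first/parent::*[name()='author'], with name() emulated via .tag
with_parent_axis = [
    e for e in bib.findall(".//book//first/..") if e.tag == "author"
]

# /bib//book//author[first]
simplified = bib.findall(".//book//author[first]")

# Same nodes, in the same document order, on this document.
same_nodes = len(with_parent_axis) == len(simplified) and all(
    a is b for a, b in zip(with_parent_axis, simplified)
)
print(same_nodes, [e.find("last").text for e in simplified])
```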
XPath, formally speaking • An XPE denotes a binary relation over document nodes: p(context node, answer node). • basic cases: • “.” is self, i.e., (x, x) • “..” is parent, i.e., (current node, its parent) • publisher/address is (current node, address node reachable from the current node via a publisher child) • book/*/mi/../name() ≡ book/*[mi]/name(): what is the relationship captured by this XPE?
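The binary-relation view can be made concrete with a sketch that enumerates (context node, answer node) pairs: every element of the example document is tried as the context node, and the path is evaluated from it. The relation helper is hypothetical, Python's xml.etree.ElementTree stands in for an XPath engine, and only element nodes are considered.

```python
import xml.etree.ElementTree as ET

# The example DB from the slides, with attribute values quoted (assumption).
BIB = """<bib>
<book price="15">
<title> What is the name of this book?</title>
<author nationality="american"><first> Raymond</first><last>Smullyan</last></author>
<publisher>Penguin</publisher>
<year>1970</year>
</book>
<book>
<publisher><name>Bentam Books</name><address>New York</address></publisher>
<author><first>Douglas</first><mi>R</mi><last>Hofstadter</last></author>
<author>D.C. Dennett</author>
<title>The Mind’s I: Reflections on Self and Soul</title>
<year>1981</year>
</book>
</bib>"""

bib = ET.fromstring(BIB)

def relation(doc, path):
    """All (context, answer) pairs of a relative path over doc's elements."""
    return [(x, y) for x in doc.iter() for y in x.findall(path)]

self_pairs = relation(bib, ".")                   # "." relates each x to itself
pub_addr = relation(bib, "./publisher/address")   # publisher/address

print(len(self_pairs), all(x is y for x, y in self_pairs))
print([(x.tag, y.text) for x, y in pub_addr])
```

Only the second book appears as a context node for publisher/address, since it is the only element with a publisher child that has an address child.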
XPath, formally speaking • Relative path expressions: like every XPE E we have seen so far, except that E does not start at the root; E may be used as a predicate or as an extension to an absolute XPE. • Absolute XPE: how you get a (unary) query out of an XPE. • E.g.: author/mi and publisher/address are relative XPEs; //author/mi, /bib/book[author/mi] and /bib/book[author/mi][publisher/address]//year are all absolute XPEs. • More details: see resources and stay tuned for homework.