Learn the basics of XPath query language and how to use it for querying XML documents. This tutorial covers XPath expressions, functions, qualifiers, and navigation axes.
CSE 636 Data Integration XML Query Languages XPath
XPath • http://www.w3.org/TR/xpath (11/99) • Building block for other W3C standards: • XSL Transformations (XSLT) • XML Link (XLink) • XML Pointer (XPointer) • XQuery • Was originally part of XSL
Example for XPath Queries <bib> <book> <publisher> Addison-Wesley </publisher> <author> Serge Abiteboul </author> <author> <first-name> Rick </first-name> <last-name> Hull </last-name> </author> <author> Victor Vianu </author> <title> Foundations of Databases </title> <year> 1995 </year> </book> <book price="55"> <publisher> Freeman </publisher> <author> Jeffrey D. Ullman </author> <title> Principles of Database and Knowledge Base Systems </title> <year> 1998 </year> </book> </bib>
Data Model for XPath / Document The root XML PI Comment Element bib The root element Element book Element book … Element publisher Element author … Text Addison-Wesley Text Serge Abiteboul
XPath: Simple Expressions /bib/book/year Result: <year> 1995 </year> <year> 1998 </year> /bib/paper/year Result: empty (there were no papers)
XPath: Restricted Kleene Closure //author Result: <author> Serge Abiteboul </author> <author> <first-name> Rick </first-name> <last-name> Hull </last-name> </author> <author> Victor Vianu </author> <author> Jeffrey D. Ullman </author> /bib//first-name Result: <first-name> Rick </first-name>
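The child and descendant steps above can be tried with Python's standard `xml.etree.ElementTree`, which implements only a small subset of XPath. This is a minimal sketch, assuming a trimmed copy of the example document. ElementTree evaluates paths relative to an element, so `/bib/book/year` is written as `book/year` from the `bib` element and `//author` as `.//author`.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the bib example (shortened here as an assumption, for brevity).
BIB = """
<bib>
  <book>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <year>1995</year>
  </book>
  <book price="55">
    <author>Jeffrey D. Ullman</author>
    <year>1998</year>
  </book>
</bib>
"""

bib = ET.fromstring(BIB)  # the <bib> document element

years = [y.text for y in bib.findall("book/year")]             # /bib/book/year
papers = bib.findall("paper/year")                             # /bib/paper/year
authors = bib.findall(".//author")                             # //author
first_names = [f.text for f in bib.findall(".//first-name")]   # /bib//first-name

print(years, len(papers), len(authors), first_names)
```

A path that matches nothing, such as `paper/year`, yields an empty list rather than an error, mirroring the empty XPath result.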
XPath: Functions /bib/book/author/text() Result: Serge Abiteboul Victor Vianu Jeffrey D. Ullman Rick Hull doesn't appear because his name is split into first-name and last-name elements Functions in XPath: • text() = matches the text value • node() = matches any node (= * or @* or text()) • name() = returns the name of the current tag
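ElementTree has no `text()` or `name()`, but the same information is available as the `.text` and `.tag` attributes of an element. The sketch below is a rough analogue only: `.text` holds just the text before the first child element, and the filter on non-blank text is an assumption made to mimic the slide's result.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring(
    "<book>"
    "<author>Serge Abiteboul</author>"
    "<author><first-name>Rick</first-name><last-name>Hull</last-name></author>"
    "<author>Victor Vianu</author>"
    "</book>"
)

# Rough analogue of author/text(): keep only authors with direct, non-blank text.
texts = [a.text.strip() for a in book.findall("author") if a.text and a.text.strip()]

# Rough analogue of name(): the tag of each child of the structured author.
names = [child.tag for child in book.findall("author/*")]

print(texts)
print(names)
```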
XPath: Wildcard //author/* Result: <first-name> Rick </first-name> <last-name> Hull </last-name> * Matches any element
XPath: Attribute Nodes /bib/book/@price Result: "55" @price means that price has to be an attribute
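ElementTree cannot return attribute nodes from a path, so a common workaround is to select the elements that carry the attribute and then read it. A minimal sketch, assuming a two-book document reduced to titles:

```python
import xml.etree.ElementTree as ET

bib = ET.fromstring(
    '<bib>'
    '<book><title>Foundations of Databases</title></book>'
    '<book price="55"><title>Principles of Database and Knowledge Base Systems</title></book>'
    '</bib>'
)

# book[@price] keeps only books that carry the attribute; .get() reads its value.
prices = [b.get("price") for b in bib.findall("book[@price]")]
print(prices)
```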
XPath: Qualifiers /bib/book/author[first-name] Result:<author> <first-name> Rick </first-name> <last-name> Hull </last-name> </author>
XPath: More Qualifiers /bib/book/author[firstname][address[//zip][city]]/lastname Result: <lastname> … </lastname> <lastname> … </lastname>
XPath: More Qualifiers /bib/book[@price < "60"] /bib/book[author/@age < "25"] /bib/book[author/text()]
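In XPath 1.0 the `<` operator converts both operands to numbers, so `@price < "60"` is a numeric comparison even though "60" is written as a string. ElementTree does not support `<` in predicates, so the sketch below applies the test in Python instead. The second and third books and the price 80 are hypothetical data added for contrast.

```python
import xml.etree.ElementTree as ET

bib = ET.fromstring(
    '<bib>'
    '<book price="55"><author>Jeffrey D. Ullman</author></book>'
    '<book price="80"><author><last-name>Hull</last-name></author></book>'
    '<book><author>Serge Abiteboul</author></book>'
    '</bib>'
)

def price_below(book, limit):
    """Mimic [@price < limit]: a missing attribute never satisfies the test."""
    price = book.get("price")
    return price is not None and float(price) < limit

cheap = [b for b in bib.findall("book") if price_below(b, 60)]

# Rough analogue of [author/text()]: at least one author with direct, non-blank text.
with_text = [
    b for b in bib.findall("book")
    if any(a.text and a.text.strip() for a in b.findall("author"))
]

print([b.get("price") for b in cheap])
print(len(with_text))
```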
XPath: Summary • bib matches a bib element • * matches any element • / matches the root • /bib matches a bib element under root • bib/paper matches a paper in bib • bib//paper matches a paper in bib, at any depth • //paper matches a paper at any depth • paper|book matches a paper or a book • @price matches a price attribute • bib/book/@price matches price attribute in book, in bib • bib/book[@price<"55"]/author/lastname matches…
XPath: More Details • An XPath expression, p, establishes a relation between: • A context node, and • A node in the answer set • In other words, p denotes a function: • S[p] : Nodes → {Nodes} • Examples: • author/firstname • . = self • .. = parent • part/*/*/subpart/../name = part/*/*[subpart]/name
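The last equivalence can be checked on a small document, since ElementTree supports both `..` and the `[tag]` qualifier. A sketch under assumptions: the catalogue below is hypothetical, and element names other than part, subpart and name are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical parts catalogue with one item that has a subpart and one that does not.
doc = ET.fromstring(
    "<catalog>"
    "<part><group>"
    "<item><subpart/><name>with-subpart</name></item>"
    "<item><name>without-subpart</name></item>"
    "</group></part>"
    "</catalog>"
)

via_parent = doc.findall("part/*/*/subpart/../name")     # step down, then back up
via_qualifier = doc.findall("part/*/*[subpart]/name")    # test for subpart in place

print([n.text for n in via_parent])
print(via_parent == via_qualifier)
```

Both paths select the name of every second-level descendant of part that has a subpart child; the qualifier form simply avoids the detour through `..`.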
The Root and the Root <bib> <paper> 1 </paper> <paper> 2 </paper> </bib> • bib is the “document element” • The “root” is above bib • /bib = returns the document element • / = returns the root • Why? • Because we may have comments before and after <bib> • They become siblings of <bib>
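The distinction between the root and the document element is visible in any DOM API. A minimal sketch using Python's standard `xml.dom.minidom`, where the `Document` node plays the role of the XPath root; the two comments are added here as an assumption to show the siblings of `<bib>`.

```python
from xml.dom import minidom, Node

doc = minidom.parseString(
    "<!-- before --><bib><paper>1</paper><paper>2</paper></bib><!-- after -->"
)

# The Document node corresponds to "/": its children are the comments and <bib>.
root_children = [n.nodeType for n in doc.childNodes]

# documentElement corresponds to what /bib returns.
document_element = doc.documentElement.tagName

print(root_children == [Node.COMMENT_NODE, Node.ELEMENT_NODE, Node.COMMENT_NODE])
print(document_element)
```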
XPath: More Details • We can navigate along 13 axes: ancestor, ancestor-or-self, attribute, child, descendant, descendant-or-self, following, following-sibling, namespace, parent, preceding, preceding-sibling, self • So far we have only used child, descendant-or-self, parent, attribute and self, through their abbreviated forms
XPath: More Details • Examples: • child::author/child::lastname = author/lastname • child::author/descendant-or-self::node()/child::zip = author//zip • child::author/parent::* = author/.. • child::author/attribute::age = author/@age • What does this mean? • /bib/book/publisher/parent::*/author • /bib//address[ancestor::book] • /bib//author/ancestor::*//zip
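ElementTree does not implement named axes, but the `ancestor` axis can be emulated with a child-to-parent map. The sketch below evaluates `/bib//address[ancestor::book]` by hand; the document, including the `library` element and the zip values, is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one address inside a book, one outside any book.
bib = ET.fromstring(
    "<bib>"
    "<book><author><address><zip>14260</zip></address></author></book>"
    "<library><address><zip>98195</zip></address></library>"
    "</bib>"
)

# ElementTree elements have no parent pointer, so build a child -> parent map.
parent_of = {child: parent for parent in bib.iter() for child in parent}

def ancestors(elem):
    """Yield the ancestors of elem, nearest first (the ancestor:: axis)."""
    while elem in parent_of:
        elem = parent_of[elem]
        yield elem

# /bib//address[ancestor::book]
selected = [
    a for a in bib.iter("address")
    if any(anc.tag == "book" for anc in ancestors(a))
]
print([a.findtext("zip") for a in selected])
```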
XPath: Even More Details • name() = the name of the current node • /bib//*[name()="book"] same as /bib//book • What does this mean? /bib//*[ancestor::*[name()!="book"]] • Is it equivalent to the following? • /bib//* • /bib//*[name()!="book"]//* • Navigation axis gives us strictly more power!
XPath: Example How do we evaluate this XPath expression? /bib//*[name()!="book"]//* Let's take it one step at a time bib A B book C D
XPath: Example /bib returns the following list of one node: bib A B book C D
XPath: Example /bib//* when executed on the previous node list, returns the following new list of nodes: A B book C D book C D C D
XPath: Example /bib//*[name()!="book"] when executed on the previous node list, it eliminates one node: A B book C D C D
XPath: Example /bib//*[name()!="book"]//* gives us the resulting node list of the XPath expression: book C D C D
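The step-by-step evaluation can be replayed in Python by treating each step as a function from a node list to a node list. This is a sketch under a stated assumption: the tree is taken to be a single chain bib > A > B > book > C > D, which may differ from the tree drawn on the slides. The helper `descendants` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Assumed shape (for illustration only): a single chain bib > A > B > book > C > D.
bib = ET.fromstring("<bib><A><B><book><C><D/></C></book></B></A></bib>")

def descendants(nodes):
    """Apply //* to a node list: all proper descendants, without duplicates."""
    seen, out = set(), []
    for node in nodes:
        for d in node.iter():          # iter() yields node itself, then its descendants
            if d is not node and id(d) not in seen:
                seen.add(id(d))
                out.append(d)
    return out

step1 = [bib]                                    # /bib
step2 = descendants(step1)                       # /bib//*
step3 = [n for n in step2 if n.tag != "book"]    # /bib//*[name()!="book"]
step4 = descendants(step3)                       # /bib//*[name()!="book"]//*

for label, step in [("step2", step2), ("step3", step3), ("step4", step4)]:
    print(label, [n.tag for n in step])
```

Note that book reappears in the final list: it was filtered out as a context node in step 3, but it is still a descendant of A and B, so the last `//*` step selects it again.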
Keys in XML Schema • We forgot something about XML Schema • Keys • Key References • Why? • XPath is used for keys and key references
Keys in XML Schema XML: <purchaseReport> <regions> <zip code="95819"> <part number="872-AA" quantity="1"/> <part number="926-AA" quantity="1"/> <part number="833-AA" quantity="1"/> <part number="455-BX" quantity="1"/> </zip> <zip code="63143"> <part number="455-BX" quantity="4"/> </zip> </regions> <parts> <part number="872-AA">Lawnmower</part> <part number="926-AA">Baby Monitor</part> <part number="833-AA">Lapis Necklace</part> <part number="455-BX">Sturdy Shelves</part> </parts> </purchaseReport> XML Schema: <key name="NumKey"> <selector xpath="parts/part"/> <field xpath="@number"/> </key>
Keys in XML Schema XML Schema: <xs:element name="purchaseReport"> <xs:complexType> <xs:sequence> <xs:element name="regions"> … </xs:element> <xs:element name="parts"> … </xs:element> </xs:sequence> </xs:complexType> <xs:key name="numKey"> <xs:selector xpath="parts/part" /> <xs:field xpath="@number" /> </xs:key> <xs:keyref name="numKeyRef" refer="numKey"> <xs:selector xpath="regions/zip/part" /> <xs:field xpath="@number" /> </xs:keyref> </xs:element>
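What the key and keyref constraints demand of the purchase report can be illustrated by checking them by hand. The sketch below is not a schema validator: it evaluates the selector and field paths with ElementTree on a shortened copy of the report (an assumption made for brevity) and tests uniqueness, existence and referential integrity directly.

```python
import xml.etree.ElementTree as ET

report = ET.fromstring("""
<purchaseReport>
  <regions>
    <zip code="95819">
      <part number="872-AA" quantity="1"/>
      <part number="455-BX" quantity="1"/>
    </zip>
    <zip code="63143">
      <part number="455-BX" quantity="4"/>
    </zip>
  </regions>
  <parts>
    <part number="872-AA">Lawnmower</part>
    <part number="455-BX">Sturdy Shelves</part>
  </parts>
</purchaseReport>
""")

# key numKey: selector parts/part, field @number
key_values = [p.get("number") for p in report.findall("parts/part")]
key_ok = None not in key_values and len(key_values) == len(set(key_values))

# keyref numKeyRef: selector regions/zip/part, field @number
ref_values = [p.get("number") for p in report.findall("regions/zip/part")]
dangling = [v for v in ref_values if v not in set(key_values)]

print(key_ok, dangling)
```

Both paths are evaluated from the purchaseReport element, matching the rule that the XPath expressions start at the element on which the constraint is declared.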
Keys in XML Schema • In general, two flavors: <key name="someNameHere"> <selector xpath="p"/> <field xpath="p1"/> <field xpath="p2"/> … <field xpath="pk"/> </key> <unique name="someNameHere"> <selector xpath="p"/> <field xpath="p1"/> <field xpath="p2"/> … <field xpath="pk"/> </unique> Note • All XPath expressions "start" at the element currently being defined • The fields must identify a single node
Keys in XML Schema • Unique = guarantees uniqueness • Key = guarantees uniqueness and existence • All XPath expressions are “restricted”: • /a/b | /a/c OK for selector • //a/b/*/c OK for field • Note: better than DTD’s ID mechanism
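The difference between unique and key can be made concrete with a small checker. This is a sketch under assumptions, not how a schema processor is implemented: the function names are hypothetical, only attribute fields (`@name`) and child-element fields (`name`) are handled, and unique is taken to constrain only those elements for which every field is present.

```python
import xml.etree.ElementTree as ET

def field_tuples(scope, selector, fields):
    """For each selected element, collect one value per field (None if absent)."""
    rows = []
    for elem in scope.findall(selector):
        row = tuple(
            elem.get(f[1:]) if f.startswith("@") else elem.findtext(f)
            for f in fields
        )
        rows.append(row)
    return rows

def satisfies_unique(rows):
    # unique: rows with every field present must be pairwise distinct
    complete = [r for r in rows if None not in r]
    return len(complete) == len(set(complete))

def satisfies_key(rows):
    # key: every field must be present, and all rows must be distinct
    return all(None not in r for r in rows) and len(rows) == len(set(rows))

# Hypothetical document: the third item has no id attribute.
doc = ET.fromstring('<items><item id="a"/><item id="b"/><item/></items>')
rows = field_tuples(doc, ".//item", ["@id"])
print(satisfies_unique(rows), satisfies_key(rows))
```

The item without an id is ignored by the unique check but violates the key check, which is the "existence" part of the key guarantee.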
Keys in XML Schema • Examples <key name="fullName"> <selector xpath=".//person"/> <field xpath="forename"/> <field xpath="surname"/> </key> <unique name="nearlyID"> <selector xpath=".//*"/> <field xpath="@id"/> </unique> Recall: must have a single forename, surname
Foreign Keys in XML Schema • Examples <keyref name="personRef" refer="fullName"> <selector xpath=".//personPointer"/> <field xpath="@first"/> <field xpath="@last"/> </keyref>
References • Lecture Slides • Dan Suciu • http://www.cs.washington.edu/homes/suciu/COURSES/590DS/06xpath.htm • http://www.cs.washington.edu/homes/suciu/COURSES/590DS/14constraintkeys.htm • BRICS XML Tutorial • A. Moeller, M. Schwartzbach • http://www.brics.dk/~amoeller/XML/index.html • W3C's XPath homepage • http://www.w3.org/TR/xpath • W3C's XML Schema homepage • http://www.w3.org/XML/Schema • XML School • http://www.w3schools.com