
XPath


zita

Presentation Transcript


  1. XPath XML Path Language

  2. Outline • XML Path Language (XPath) • Data Model Description • Node values • XPath expressions • Relative expressions • Simple subset of XPath • Predicates • Node-Set Functions • Full location Steps • Axes • Node Tests • Abbreviated syntax • Links to more information

  3. XML Path Language (XPath) • XPath 1.0 is a W3C Recommendation (16 November 1999) • used for addressing elements (in XPointer) • used for matching elements (in XSLT and XQuery) • declarative language for specifying paths in trees • syntax somewhat similar to that used for paths in file hierarchies

  4. Data Model - example document • document is viewed as a tree of nodes • e.g. document
     <Book>
       <chapter>
         <heading>The First Chapter</heading>
         <section>... ...</section>
       </chapter>
       <chapter>
         <heading>The Second Chapter</heading>
         <section>... ...</section>
         <section>... ...</section>
       </chapter>
     </Book>

  5. Data Model - example document

  6. Data Model Description • 7 types of node: • root, element, attribute, text, namespace, comment, processing instruction • root of tree is different from (and parent of) root element of the document (Book in example) • in example • root node is red • element nodes are yellow • text nodes are green • element nodes have associated set of attribute nodes • attribute nodes are not children of element nodes • order of child element nodes is significant

  7. Data Model - example document • More complex document
     <CD publisher="Deutsche Grammophon" length="PT1H13M37S">
       <composer>Johannes Brahms</composer>
       <performance>
         <composition>Piano Concerto No. 2</composition>
         <soloist>Emil Gilels</soloist>
         <orchestra>Berlin Philharmonic</orchestra>
         <conductor>Eugen Jochum</conductor>
       </performance>
       <performance>
         <composition>Fantasias Op. 116</composition>
         <soloist>Emil Gilels</soloist>
       </performance>
     </CD>

  8. Data Model - example document

  9. Node values • each attribute and element node has a (string) value • e.g., value of length attribute node is PT1H13M37S • value of an element node is the concatenation of all text node descendants, in document order • e.g., value of composer element node is Johannes Brahms • e.g., value of second performance element node is Fantasias Op. 116 Emil Gilels • e.g., value of CD element node is Johannes Brahms Piano Concerto No. 2 Emil Gilels Berlin Philharmonic Eugen Jochum Fantasias Op. 116 Emil Gilels • note: attribute values are not included
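The node values on this slide can be reproduced with a short Python sketch. It uses the standard library's `xml.etree.ElementTree`, which is not an XPath data model (it has no separate text nodes), so `string_value` is a hypothetical helper that approximates the concatenation rule with `itertext()`. Whitespace is collapsed only to match the way the slide prints the values.

```python
import xml.etree.ElementTree as ET

# The CD document from slide 7, reproduced so the sketch is self-contained.
CD_XML = """
<CD publisher="Deutsche Grammophon" length="PT1H13M37S">
  <composer>Johannes Brahms</composer>
  <performance>
    <composition>Piano Concerto No. 2</composition>
    <soloist>Emil Gilels</soloist>
    <orchestra>Berlin Philharmonic</orchestra>
    <conductor>Eugen Jochum</conductor>
  </performance>
  <performance>
    <composition>Fantasias Op. 116</composition>
    <soloist>Emil Gilels</soloist>
  </performance>
</CD>
"""


def string_value(element):
    """Concatenate all descendant text in document order.

    Hypothetical helper: whitespace is collapsed for readability only;
    XPath itself keeps whitespace text nodes unchanged.
    """
    return " ".join("".join(element.itertext()).split())


cd = ET.fromstring(CD_XML)
composer_value = string_value(cd.find("composer"))
second_performance_value = string_value(cd.findall("performance")[1])
cd_value = string_value(cd)
length_value = cd.get("length")  # attribute values are looked up separately

print(composer_value)
print(second_performance_value)
print(cd_value)
print(length_value)
```

Note that neither attribute value appears in `cd_value`, which matches the last bullet of the slide.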

  10. XPath expressions • an XPath expression is either • an absolute expression or • a relative expression • an absolute expression • starts with '/' • is optionally followed by a relative expression ('/' on its own selects the root node) • and is evaluated starting at the root node • a relative expression is • a sequence of location steps • each separated by '/' • example (absolute expression comprising 2 steps): /CD/composer

  11. Relative expressions • relative expression is evaluated with respect to an initial context (set of nodes) • initial context is defined externally (by XPointer or XSLT) <xsl:template match="CD"> <xsl:value-of select="composer"/> </xsl:template> context for composer given by CD • each location step • is evaluated with respect to some context • produces a set of nodes which • provides the context for the next location step
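The way each location step feeds its result to the next one can be sketched in a few lines of Python. This is an illustration only, not how an XPath engine is implemented: `evaluate_relative` is a hypothetical helper that handles nothing but plain child-element steps, and the context is passed in as a list of `ElementTree` elements.

```python
import xml.etree.ElementTree as ET

# The CD document from slide 7, written compactly.
CD_XML = (
    '<CD publisher="Deutsche Grammophon" length="PT1H13M37S">'
    "<composer>Johannes Brahms</composer>"
    "<performance><composition>Piano Concerto No. 2</composition>"
    "<soloist>Emil Gilels</soloist><orchestra>Berlin Philharmonic</orchestra>"
    "<conductor>Eugen Jochum</conductor></performance>"
    "<performance><composition>Fantasias Op. 116</composition>"
    "<soloist>Emil Gilels</soloist></performance>"
    "</CD>"
)


def evaluate_relative(path, context):
    """Evaluate a path of plain child-element steps against a context node-set."""
    for step in path.split("/"):
        # The node-set produced by one step is the context for the next step.
        context = [child for node in context for child in node if child.tag == step]
    return context


cd = ET.fromstring(CD_XML)

# Same situation as the xsl:template on the slide: the context is the CD element.
composers = evaluate_relative("composer", [cd])
soloists = evaluate_relative("performance/soloist", [cd])

print([e.text for e in composers])
print([e.text for e in soloists])
```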

  12. Simple subset of XPath • subset uses abbreviated syntax • a location step has one of 3 forms: • it is empty, i.e., // • element-name predicates • @attribute-name predicates • an empty step means search the context node and all its descendants • element-name means find all child elements of the context node which have the given name • @attribute-name means find the attribute node of the context node which has the given name • optional predicates (each enclosed in [ and ]) further filter the nodes found

  13. Examples – cd.xml
      <?xml version="1.0" ?>
      <CDlist>
        <CD>
          <composer>Johannes Brahms</composer>
          <soloist>Emil Gilels</soloist>
          <orchestra>Berlin Philharmonic</orchestra>
          <conductor>Eugen Jochum</conductor>
          <date>1972</date>
          <performance>
            <composition>Piano Concerto No. 1</composition>
          </performance>
          <publisher>Deutsche Grammophon</publisher>
          <number>419159-2</number>
        </CD>
        <CD>
          <soloist>Martha Argerich</soloist>
          <orchestra>London Symphony Orchestra</orchestra>
          <conductor>Claudio Abbado</conductor>
          <date>1968</date>
          <performance>
            <composer>Frederic Chopin</composer>
            <composition>Piano Concerto No. 1</composition>
          </performance>
          <performance>
            <composer>Franz Liszt</composer>
            <composition>Piano Concerto No. 1</composition>
            <conductor>Antal Dorati</conductor>
            <date>1984</date>
          </performance>
          <publisher>Deutsche Grammophon</publisher>
          <number>449719-2</number>
        </CD>
      </CDlist>

  14. Examples – cd.xml • /CDlist/CD • all child CD elements of the CDlist element that is the child of the root • //composer • all composer elements that are descendants of the root • //performance/composer • all composer child elements of performance elements which are descendants of the root • //performance[composer] • all performance elements that have a composer element as a child • //CD[performance/date] • all CD elements that have a performance element as a child that has a date element as a child • //performance[conductor][date] • all performance elements that have both conductor and date elements as children
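Most of these paths can be tried with Python's standard `xml.etree.ElementTree`, which supports a limited XPath subset. Two assumptions apply to this sketch: ElementTree has no root node above `CDlist`, so `//x` is written `.//x` relative to the document element, and a predicate containing a path (`[performance/date]`) is emulated with a list comprehension instead of being passed to `findall`.

```python
import xml.etree.ElementTree as ET

# cd.xml from slide 13, written compactly.
CDLIST_XML = """<CDlist>
<CD><composer>Johannes Brahms</composer><soloist>Emil Gilels</soloist>
<orchestra>Berlin Philharmonic</orchestra><conductor>Eugen Jochum</conductor>
<date>1972</date>
<performance><composition>Piano Concerto No. 1</composition></performance>
<publisher>Deutsche Grammophon</publisher><number>419159-2</number></CD>
<CD><soloist>Martha Argerich</soloist><orchestra>London Symphony Orchestra</orchestra>
<conductor>Claudio Abbado</conductor><date>1968</date>
<performance><composer>Frederic Chopin</composer>
<composition>Piano Concerto No. 1</composition></performance>
<performance><composer>Franz Liszt</composer>
<composition>Piano Concerto No. 1</composition>
<conductor>Antal Dorati</conductor><date>1984</date></performance>
<publisher>Deutsche Grammophon</publisher><number>449719-2</number></CD>
</CDlist>"""

root = ET.fromstring(CDLIST_XML)  # the CDlist element

all_cds = root.findall("CD")                                  # /CDlist/CD
all_composers = root.findall(".//composer")                   # //composer
perf_composers = root.findall(".//performance/composer")      # //performance/composer
perfs_with_composer = root.findall(".//performance[composer]")
perfs_with_both = root.findall(".//performance[conductor][date]")

# //CD[performance/date], emulated in Python.
cds_with_dated_performance = [
    cd for cd in root.findall(".//CD") if cd.find("performance/date") is not None
]

print(len(all_cds))
print([c.text for c in all_composers])
print([c.text for c in perf_composers])
print(len(perfs_with_composer), len(perfs_with_both))
print([cd.findtext("number") for cd in cds_with_dated_performance])
```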

  15. Predicates • predicates filter out more nodes from a node-set S • evaluate predicate on each node x in node-set S with • x as the context node • the size of S as the context size • the position of x in S as the context position • predicate comprises • Boolean expressions: using and, or, not, =, ... • numerical expressions: using +, -, ... • node-set expressions: location paths filtered by predicates • node-set functions

  16. Node-Set Functions • last(): returns context size • position(): returns context position • count(S): returns number of nodes in S • name(S): returns name of the first node in S (in document order) • id(S): returns the elements that have an ID-type attribute with a value in S • e.g. • position()=2: true if node is 2nd in the context • position()=last(): true if node is last in the context

  17. Examples • count(//performance): the number of performance elements • //performance[not(date)]: performance elements that do not have a date element as a child • all CD elements that have "Deutsche Grammophon" as publisher and have more than 1 performance element as child: //CD [publisher="Deutsche Grammophon" and count(performance) > 1] • or //CD [publisher="Deutsche Grammophon"] [count(performance) > 1] • or //CD [count(performance) > 1] [publisher="Deutsche Grammophon"]
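The standard library's `ElementTree` does not implement `count()`, `not()` or `and`, so the sketch below expresses the same three queries with ordinary Python over the cd.xml document. The variable names are illustrative, and the comprehensions are an emulation of the predicates, not an XPath evaluation.

```python
import xml.etree.ElementTree as ET

# cd.xml from slide 13, written compactly.
CDLIST_XML = """<CDlist>
<CD><composer>Johannes Brahms</composer><soloist>Emil Gilels</soloist>
<orchestra>Berlin Philharmonic</orchestra><conductor>Eugen Jochum</conductor>
<date>1972</date>
<performance><composition>Piano Concerto No. 1</composition></performance>
<publisher>Deutsche Grammophon</publisher><number>419159-2</number></CD>
<CD><soloist>Martha Argerich</soloist><orchestra>London Symphony Orchestra</orchestra>
<conductor>Claudio Abbado</conductor><date>1968</date>
<performance><composer>Frederic Chopin</composer>
<composition>Piano Concerto No. 1</composition></performance>
<performance><composer>Franz Liszt</composer>
<composition>Piano Concerto No. 1</composition>
<conductor>Antal Dorati</conductor><date>1984</date></performance>
<publisher>Deutsche Grammophon</publisher><number>449719-2</number></CD>
</CDlist>"""

root = ET.fromstring(CDLIST_XML)
performances = root.findall(".//performance")

# count(//performance)
performance_count = len(performances)

# //performance[not(date)]
undated = [p for p in performances if p.find("date") is None]

# //CD[publisher="Deutsche Grammophon" and count(performance) > 1]
dg_with_several = [
    cd
    for cd in root.findall(".//CD")
    if any(p.text == "Deutsche Grammophon" for p in cd.findall("publisher"))
    and len(cd.findall("performance")) > 1
]

print(performance_count)
print(len(undated))
print([cd.findtext("number") for cd in dg_with_several])
```

Because both conditions are tested on the same CD element and neither depends on position, splitting them into two predicates in either order selects the same CD, as the slide states.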

  18. More examples • //CD/performance[position()=2] • returns the second performance of each CD • //CD/performance[position()=2][date] • returns the second performance of each CD if it has a date (otherwise, returns nothing) • //CD/performance[date and position()=2] • returns the same • //CD/performance[date][position()=2] • returns the second of those performance children of each CD that have a date (if any)
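The difference between `[position()=2][date]` and `[date][position()=2]` comes from the context position being recomputed after every predicate. The sketch below makes that explicit; `filter_nodes`, `select`, `has_date` and `is_second` are hypothetical helpers written for this illustration, applied to the cd.xml document. ElementTree's own positional predicates are avoided here because they count siblings with the same tag instead of positions in the filtered set.

```python
import xml.etree.ElementTree as ET

# cd.xml from slide 13, reduced to the elements these examples touch.
CDLIST_XML = """<CDlist>
<CD><performance><composition>Piano Concerto No. 1</composition></performance>
<number>419159-2</number></CD>
<CD><performance><composer>Frederic Chopin</composer>
<composition>Piano Concerto No. 1</composition></performance>
<performance><composer>Franz Liszt</composer>
<composition>Piano Concerto No. 1</composition>
<conductor>Antal Dorati</conductor><date>1984</date></performance>
<number>449719-2</number></CD>
</CDlist>"""


def filter_nodes(nodes, predicate):
    """Apply one predicate; position and size refer to the node-set passed in."""
    size = len(nodes)
    return [n for pos, n in enumerate(nodes, start=1) if predicate(n, pos, size)]


def has_date(node, position, size):
    return node.find("date") is not None


def is_second(node, position, size):
    return position == 2


def select(root, predicates):
    """Emulate //CD/performance[p1][p2]... for the given predicate functions."""
    result = []
    for cd in root.findall(".//CD"):
        nodes = cd.findall("performance")  # the step is evaluated per CD
        for predicate in predicates:
            nodes = filter_nodes(nodes, predicate)
        result.extend(nodes)
    return result


root = ET.fromstring(CDLIST_XML)
second_then_date = select(root, [is_second, has_date])  # [position()=2][date]
date_then_second = select(root, [has_date, is_second])  # [date][position()=2]

print([p.findtext("composer") for p in second_then_date])
print(date_then_second)
```

In cd.xml only one performance of the second CD has a date, so filtering by date first leaves a one-element set in which no node has position 2.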

  19. Full location Steps • using full, not abbreviated, syntax • a location step has the form axis :: node-test predicates where • axis selects a set of candidate nodes • node-test filters candidates based on node type or name • optional predicates • in child::CD[attribute::publisher="Deutsche Grammophon"] • child and attribute are axes • CD and publisher are node-tests

  20. Axes • axis specifies what nodes, relative to context node(s), to consider • there are 13 axes defined • self: the context node itself • parent: the parent of the context node (note: parent of root is empty) • attribute: all attributes of the context node • namespace: all namespace nodes of the context node • child, ancestor, descendant (see later) • ancestor-or-self: ancestors and the context node • descendant-or-self: descendants and the context node • preceding-sibling, following-sibling, preceding, following (see later)

  21. Axes: parent, child, ... • context node (and self axis) in yellow • nodes in parent axis in black • nodes in child axis in white • nodes in preceding-sibling axis in green • nodes in following-sibling red

  22. Axes: ancestor, descendant, ... • context node C (and self axis) in yellow • ancestor (black): elements whose start tag precedes start tag of C and whose end tag follows end tag of C • descendant (white): elements whose start tag follows start tag of C and whose end tag precedes end tag of C • preceding (green): elements whose end tag precedes start tag of C • following (red): elements whose start tag follows end tag of C • preceding, following, ancestor, descendant and self together partition the tree (ignoring attribute and namespace nodes) into 5 disjoint sets
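The partition property can be checked on the Book document from slide 4 with a small Python sketch. It is restricted to element nodes, because `xml.etree.ElementTree` models neither text nodes as separate nodes nor a root node above `Book`; `axes` is a hypothetical helper that derives the four axes from document order and a parent map.

```python
import xml.etree.ElementTree as ET

# The Book document from slide 4, written compactly.
BOOK_XML = (
    "<Book>"
    "<chapter><heading>The First Chapter</heading>"
    "<section>... ...</section></chapter>"
    "<chapter><heading>The Second Chapter</heading>"
    "<section>... ...</section><section>... ...</section></chapter>"
    "</Book>"
)

book = ET.fromstring(BOOK_XML)
elements = list(book.iter())  # all element nodes in document order
parent_of = {child: parent for parent in elements for child in parent}


def axes(context):
    """Return the element nodes on four axes of the context node."""
    ancestors = []
    node = context
    while node in parent_of:
        node = parent_of[node]
        ancestors.append(node)
    descendants = list(context.iter())[1:]  # iter() yields the context first
    index = elements.index(context)
    preceding = [e for e in elements[:index] if e not in ancestors]
    following = [e for e in elements[index + 1:] if e not in descendants]
    return {
        "ancestor": ancestors,
        "descendant": descendants,
        "preceding": preceding,
        "following": following,
    }


# Context node: the heading of the second chapter.
context = book.findall("chapter")[1].find("heading")
result = axes(context)

# self plus the four axes should cover every element exactly once.
covered = [context] + [e for nodes in result.values() for e in nodes]
is_partition = len(covered) == len(elements) and all(e in covered for e in elements)

for axis, nodes in result.items():
    print(axis, [e.tag for e in nodes])
print(is_partition)
```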

  23. Node Tests • axes other than attribute and namespace include elements, text nodes, comments and processing instructions • principal type of these axes is element • node test further restricts nodes considered • by node name • chapter: nodes with name "chapter" • *: nodes with any name (of the axis principal type) • by node type • node(): all nodes • text(): character data nodes • comment(): comment nodes • processing-instruction(): processing instruction nodes

  24. Examples • child::*[position()=2] • second child element • descendant::node() • all descendant nodes (elements, text nodes, comments or processing instructions) • following-sibling::*[position()=last()] • rightmost sibling element • child::section[position()=2]/child::subsection[position()=1] • first subsection of the second section

  25. Abbreviated syntax • if path starts with //, initial context is the root
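As a rough illustration of how abbreviated steps correspond to the full syntax, the sketch below rewrites a path using the abbreviations defined in the XPath 1.0 Recommendation: an omitted axis means `child::`, `@` means `attribute::`, `.` means `self::node()`, `..` means `parent::node()`, `//` means `/descendant-or-self::node()/`, and `[n]` means `[position()=n]`. The `expand` function is a hypothetical toy: it works on plain strings, leaves the contents of non-numeric predicates untouched, and will mis-split any path with a `/` inside a predicate.

```python
import re


def expand(path):
    """Rewrite an abbreviated location path in full syntax (toy version)."""
    path = path.replace("//", "/descendant-or-self::node()/")
    steps = []
    for step in path.split("/"):
        if step == "":
            steps.append(step)  # empty piece before a leading '/'
        elif step == ".":
            steps.append("self::node()")
        elif step == "..":
            steps.append("parent::node()")
        elif "::" in step:
            steps.append(step)  # axis already written out
        elif step.startswith("@"):
            steps.append("attribute::" + step[1:])
        else:
            steps.append("child::" + step)  # child is the default axis
    full = "/".join(steps)
    # A bare number in a predicate is shorthand for a position test.
    return re.sub(r"\[(\d+)\]", r"[position()=\1]", full)


for abbreviated in ["*[2]", "//*", "section[2]/subsection[1]", ".//@href", "../title"]:
    print(abbreviated, "->", expand(abbreviated))
```

The expansion of `section[2]/subsection[1]` is the full-syntax path shown earlier in the deck as `child::section[position()=2]/child::subsection[position()=1]`.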

  26. Examples using abbreviations • *[2] • second child element • //* • all descendant elements of the root • //text() • all text node descendants of the root • section[2]/subsection[1] • first subsection of the second section • .//@href • all href attributes of the context node(s) and their descendants • //section[.//image]/title • the titles of sections which contain images

  27. The family example
      <?xml version="1.0"?>
      <family>
        <parent pno="p1" role="mother" spouse="p2">
          <name>Janet</name>
        </parent>
        <parent pno="p2" role="father" spouse="p1">
          <name>John</name>
        </parent>
        <child cno="c1" siblings="c2 c3">
          <name>Tom</name>
        </child>
        <child cno="c2" siblings="c1 c3">
          <name>Dick</name>
        </child>
        <child cno="c3" siblings="c1 c2">
          <name>Harry</name>
        </child>
      </family>

  28. Location paths on family • in the context of the family element, • parent[@spouse='p2']/name produces <name>Janet</name> the name of the person whose spouse attribute has value 'p2' • parent[id(@spouse)[name='John']]/name produces <name>Janet</name> the name of the person whose spouse is named John

  29. More location paths on family • in the context of the family element, • id(child[name='Dick']/@siblings)/name produces <name>Tom</name> <name>Harry</name> the names of Dick's siblings • child[id(@siblings)[name='Tom']]/name produces <name>Dick</name> <name>Harry</name> the names of the children who have Tom as a sibling
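The `id()` lookups on the family document can be emulated in Python. The sketch assumes that `pno` and `cno` are declared as ID-type attributes in a DTD, which the slides do not show but which `id()` requires; `id_lookup` is a hypothetical stand-in for `id()` that splits a whitespace-separated list of IDs and returns the matching elements in document order.

```python
import xml.etree.ElementTree as ET

# The family document from slide 27.
FAMILY_XML = """<family>
<parent pno="p1" role="mother" spouse="p2"><name>Janet</name></parent>
<parent pno="p2" role="father" spouse="p1"><name>John</name></parent>
<child cno="c1" siblings="c2 c3"><name>Tom</name></child>
<child cno="c2" siblings="c1 c3"><name>Dick</name></child>
<child cno="c3" siblings="c1 c2"><name>Harry</name></child>
</family>"""

family = ET.fromstring(FAMILY_XML)


def id_lookup(value):
    """Emulate id(): elements whose ID is one of the tokens in `value`.

    Assumption: pno and cno are the ID-type attributes of this document.
    """
    wanted = set(value.split())
    return [e for e in family if (e.get("pno") or e.get("cno")) in wanted]


# parent[id(@spouse)[name='John']]/name
spouse_of_john = [
    p.findtext("name")
    for p in family.findall("parent")
    if any(e.findtext("name") == "John" for e in id_lookup(p.get("spouse")))
]

# id(child[name='Dick']/@siblings)/name
siblings_of_dick = [
    e.findtext("name")
    for c in family.findall("child[name='Dick']")
    for e in id_lookup(c.get("siblings"))
]

# child[id(@siblings)[name='Tom']]/name
children_with_tom_as_sibling = [
    c.findtext("name")
    for c in family.findall("child")
    if any(e.findtext("name") == "Tom" for e in id_lookup(c.get("siblings")))
]

print(spouse_of_john)
print(siblings_of_dick)
print(children_with_tom_as_sibling)
```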

  30. More Information • www.w3.org/TR/xpath W3C's Recommendation on XPath • www.vbxml.com/xpathvisualizer/ page for downloading the XPath visualiser • www.w3schools.com/xpath/ XPath tutorial • www.vbxml.com/xsl/tutorials/intro/default.asp XSLT and XPath tutorial
