XPath. Tao Wan, March 04, 2002. What is XPath? A language designed to be used by XSL Transformations (XSLT), XLink, XPointer and XML Query. Primary purpose: to address parts of an XML document, and to provide basic facilities for manipulating strings, numbers and booleans.
XPath Tao Wan March 04, 2002
What is XPath?
• A language designed to be used by XSL Transformations (XSLT), XLink, XPointer and XML Query.
• Primary purpose: to address parts of an XML document, and to provide basic facilities for manipulating strings, numbers and booleans.
Outline
• Introduction
• Data Model
• XPath Syntax
• Location Path
• General XPath Expressions
• Core Function Library
• XPath Utilities
• Conclusion
Introduction
• W3C Recommendation, November 16, 1999.
• Latest version: http://www.w3.org/TR/xpath
• XPath uses a compact, string-based syntax rather than an XML element-based one.
• It operates on the abstract, logical structure of an XML document rather than on its surface syntax.
• It uses a path notation (as in URLs) to navigate the document's hierarchical tree structure.
Introduction Cont.
• XPath models an XML document as a tree of nodes and defines a way to compute a string-value for each type of node.
• Supports namespaces.
• The expression (Expr) is the primary syntactic construct of XPath.
Data Model
• The data model is the way XPath represents an XML document: as a tree of nodes.
• The tree contains seven types of node:
  • Root node
  • Element nodes
  • Attribute nodes
  • Namespace nodes
  • Processing instruction nodes
  • Comment nodes
  • Text nodes
• Nodes are ordered in document order: each element takes the position at which its start-tag occurs in the XML document.
Data Model Example
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="bib.xsl"?>
<!-- simple XML document -->
<bib>
  <book price="25.00" pages="400">
    <publisher>IDG books</publisher>
    <author>
      <first-name>Rick</first-name>
      <last-name>Hull</last-name>
    </author>
    <author>Simon North</author>
    <title>XML complete</title>
    <year>1997</year>
  </book>
  <book>
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database</title>
    <year>1998</year>
  </book>
</bib>
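To make the node types concrete, here is a minimal sketch that walks a shortened version of the example document with Python's standard `xml.dom.minidom` and lists each node in document order. It rests on some assumptions: the DOM is a different model from the XPath data model, so the `KIND` mapping and the `walk` helper are illustrative names only, namespace nodes are not represented, and the XML declaration does not appear because it is not a node in either model.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

XML = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="bib.xsl"?>
<!-- simple XML document -->
<bib><book price="25.00" pages="400"><title>XML complete</title><year>1997</year></book></bib>"""

# Illustrative mapping from DOM node-type constants to XPath node kinds.
KIND = {
    Node.DOCUMENT_NODE: "root",
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing-instruction",
}


def walk(node, out):
    """Append (kind, name) pairs for node and everything below it, in document order."""
    out.append((KIND.get(node.nodeType, "other"), node.nodeName))
    if node.nodeType == Node.ELEMENT_NODE:
        # In XPath document order, attribute nodes follow their element
        # and precede its children.
        for i in range(node.attributes.length):
            out.append(("attribute", node.attributes.item(i).nodeName))
    for child in node.childNodes:
        walk(child, out)
    return out


nodes = walk(minidom.parseString(XML), [])
for kind, name in nodes:
    print(kind, name)
```

The relative order of the attributes of a single element is implementation-dependent in XPath, so the order in which `price` and `pages` are printed should not be relied on.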
XPath Syntax
• The expression is the primary syntactic construct in XPath.
• An expression is evaluated to yield an object of one of four basic types:
  • node-set (an unordered collection of nodes without duplicates)
  • boolean (true or false)
  • number (a floating-point number)
  • string (a sequence of UCS characters)
• Expression evaluation occurs with respect to a context; XSLT and XPointer each specify how that context is determined.
• The location path is one important kind of expression.
• Location paths select a set of nodes relative to the context node.
Location Path
• A location path provides the mechanism for addressing parts of an XML document, much like a path in a file system.
  Ex: /bib/book/year (selects the year element children of the book children of the bib document element)
• Every location path can be expressed in a straightforward but rather verbose syntax; common cases also have an abbreviated form:
  • Unabbreviated (verbose) syntax. Ex: child::* (selects all element children of the context node)
  • Abbreviated syntax. Ex: * (equivalent to the unabbreviated form above)
Location Path Cont.
• Two types of path: relative and absolute.
• Relative location path: a sequence of one or more location steps separated by /
  Ex: child::bib/child::book (selects the book element children of the bib element children of the context node)
• Absolute location path: / optionally followed by a relative location path
  Ex: / (selects the root node of the document containing the context node)
Location Path Examples
• Examples in the verbose (unabbreviated) syntax, which has syntactic abbreviations for common cases:
  • child::book selects the book element children of the context node
  • child::* selects all element children of the context node
  • attribute::price selects the price attribute of the context node
  • descendant::book selects all book descendants of the context node
  • self::book selects the context node if it is a book element (otherwise selects nothing)
  • child::*/child::book selects all book grandchildren of the context node
  • / selects the document root (which is always the parent of the document element)
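A few of the paths above can be tried with Python's standard `xml.etree.ElementTree`. This is only a sketch under assumptions: ElementTree implements a limited subset of XPath and accepts only the abbreviated syntax, so each unabbreviated form appears as a comment next to its abbreviated equivalent. The context node is taken to be the `bib` element, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

bib = ET.fromstring("""
<bib>
  <book price="25.00" pages="400">
    <publisher>IDG books</publisher>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Simon North</author>
    <title>XML complete</title>
    <year>1997</year>
  </book>
  <book>
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database</title>
    <year>1998</year>
  </book>
</bib>
""")

# child::book/child::year, relative to the context node <bib>
years = [y.text for y in bib.findall("book/year")]

# child::* (all element children of the context node)
children = [c.tag for c in bib.findall("*")]

# child::*/child::title (title grandchildren of the context node)
grandchild_titles = [t.text for t in bib.findall("*/title")]

# descendant::author, written in the abbreviated form ElementTree accepts
descendant_authors = bib.findall(".//author")

print(years, children, grandchild_titles, len(descendant_authors))
```

Paths such as `attribute::price` or `self::book` have no direct ElementTree equivalent, because ElementTree returns elements only; a full XPath 1.0 processor is needed for those.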
Location Steps
• A location step has three parts:
  • an axis (specifies the relationship between the selected nodes and the context node)
  • a node test (specifies the node type and expanded-name of the selected nodes)
  • zero or more predicates (arbitrary expressions that refine the selected set of nodes)
• Syntax: the axis name and the node test, separated by a double colon, followed by zero or more expressions, each in square brackets.
• Evaluating a location step generates an initial node-set from the axis (relationship to the context node) and the node test (node type and expanded-name), then filters that node-set by each of the predicates in turn.
  Ex: child::book[position()=1] (child is the name of the axis, book is the node test, and [position()=1] is a predicate)
  Ex: descendant::book[position()=1] (first selects all book element descendants of the context node, then keeps only the one that is the first book descendant)
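The three-stage evaluation described above (axis, then node test, then each predicate in turn) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not an XPath implementation: `axis_nodes` and `evaluate_step` are hypothetical helpers, only the forward axes child, descendant and self are covered, node tests are limited to an element name or `*`, and a predicate is modelled as a Python function of the node, its position and the size of the current set.

```python
import xml.etree.ElementTree as ET


def axis_nodes(axis, context):
    """Initial node-set for the given axis, in document order (elements only)."""
    if axis == "child":
        return list(context)
    if axis == "descendant":
        return list(context.iter())[1:]  # iter() yields the context node first
    if axis == "self":
        return [context]
    raise ValueError(f"axis not covered by this sketch: {axis}")


def evaluate_step(context, axis, name, predicates=()):
    nodes = axis_nodes(axis, context)                           # 1. axis
    nodes = [n for n in nodes if name == "*" or n.tag == name]  # 2. node test
    for pred in predicates:                                     # 3. predicates, in turn
        size = len(nodes)
        # Positions are recomputed against the set left by the previous predicate.
        nodes = [n for pos, n in enumerate(nodes, start=1) if pred(n, pos, size)]
    return nodes


def is_first(node, pos, size):
    return pos == 1  # models [position()=1]


def has_400_pages(node, pos, size):
    return node.get("pages") == "400"  # models [attribute::pages="400"]


bib = ET.fromstring(
    "<bib>"
    "<book><title>A</title></book>"
    "<book pages='400'><title>B</title></book>"
    "<shelf><book pages='400'><title>C</title></book></shelf>"
    "</bib>"
)


def titles(nodes):
    return [n.findtext("title") for n in nodes]


first_child = titles(evaluate_step(bib, "child", "book", [is_first]))
all_descendants = titles(evaluate_step(bib, "descendant", "book"))
filter_then_first = titles(evaluate_step(bib, "child", "book", [has_400_pages, is_first]))
first_then_filter = titles(evaluate_step(bib, "child", "book", [is_first, has_400_pages]))
print(first_child, all_descendants, filter_then_first, first_then_filter)
```

The last two calls differ only in predicate order, which shows why the filtering happens "in turn": the position seen by a later predicate depends on what the earlier one kept.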
Location Steps
• Axes
  • XPath defines 13 axes (only child, descendant, self and attribute have appeared in the examples so far):
  • ancestor, ancestor-or-self
  • attribute
  • child
  • descendant, descendant-or-self
  • self
  • following
  • preceding
  • following-sibling, preceding-sibling
  • namespace
  • parent
• Node test
  • Identifies the type and expanded-name of the nodes to select.
  • Can be a name, the wildcard *, or a node-type test such as text().
  Ex: child::text() selects the text node children of the context node.
  Ex: child::book selects the book element children of the context node.
  Ex: attribute::* selects all attributes of the context node.
Location Step Cont.
• Predicate
  • A predicate filters a node-set with respect to an axis to produce a new node-set.
  • Predicates are XPath expressions (normally boolean expressions) in square brackets following the basis (axis and node test).
  Ex: child::book[attribute::price="25"] (selects all book children of the context node that have a price attribute with value 25)
  • A PredicateExpr is evaluated by evaluating the Expr and converting the result to a boolean (true or false). If the result is a number, it is instead compared with the context position, so [3] means [position()=3].
Examples
• Axis and node test:
  • descendant::publisher (selects the publisher elements that are descendants of the context node)
  • attribute::* (selects all attributes of the context node)
• Basis and predicate:
  • child::book[3] (selects the third book child of the context node)
  • child::*[self::author or self::year][position()=last()] (selects the last author or year child of the context node)
  • child::book[attribute::page="400"][5] (selects the fifth book child of the context node that has a page attribute with value 400)
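Simple predicates of this kind can be tried with Python's standard `xml.etree.ElementTree`, again as a sketch under assumptions. ElementTree supports only a subset of XPath in abbreviated syntax, and its positional predicates count among same-named siblings rather than against the previously filtered set, so chained predicates such as `book[...][5]` are deliberately not shown. The shortened document and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

bib = ET.fromstring(
    "<bib>"
    "<book price='25.00' pages='400'>"
    "<author>Rick Hull</author><author>Simon North</author>"
    "<title>XML complete</title><year>1997</year>"
    "</book>"
    "<book>"
    "<author>Jeffrey D. Ullman</author>"
    "<title>Principles of Database</title><year>1998</year>"
    "</book>"
    "</bib>"
)

# child::book/child::author[2]: the second author child of each book
second_authors = [a.text for a in bib.findall("book/author[2]")]

# child::book/child::author[position()=last()]: the last author child of each book
last_authors = [a.text for a in bib.findall("book/author[last()]")]

# child::book[child::year="1998"]: books that have a year child with that string-value
books_1998 = [b.findtext("title") for b in bib.findall("book[year='1998']")]

# child::book[attribute::pages="400"]: books whose pages attribute equals "400"
books_400 = [b.findtext("title") for b in bib.findall("book[@pages='400']")]

print(second_authors, last_authors, books_1998, books_400)
```

Note that a positional predicate is applied per parent: `book/author[2]` looks for a second author inside each book, so the second book, which has a single author, contributes nothing.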
Abbreviated Syntax
• The abbreviated syntax is a simpler way to write location paths.
• Abbreviations cover the common cases concisely, but not every case.
• Every abbreviated form can be converted to an unabbreviated one:
  • child:: can be omitted from a location step (child is the default axis). Ex: bib/book is equivalent to child::bib/child::book
  • attribute:: can be abbreviated to @. Ex: book[@price="25"] is short for child::book[attribute::price="25"]
  • // is short for /descendant-or-self::node()/. Ex: book//author is short for child::book/descendant-or-self::node()/child::author
  • A location step of . is short for self::node(). Ex: .//book is short for self::node()/descendant-or-self::node()/child::book
  • A location step of .. is short for parent::node(). Ex: ../title is short for parent::node()/child::title
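The abbreviations listed above happen to be the part of XPath that Python's standard `xml.etree.ElementTree` accepts, so they can be tried directly. The following is a small sketch, assuming the `bib` element is the context node; the variable names are illustrative, and ElementTree remains a subset implementation rather than a full XPath processor.

```python
import xml.etree.ElementTree as ET

bib = ET.fromstring(
    "<bib>"
    "<book price='25.00' pages='400'>"
    "<author><first-name>Rick</first-name><last-name>Hull</last-name></author>"
    "<title>XML complete</title><year>1997</year>"
    "</book>"
    "<book>"
    "<author>Jeffrey D. Ullman</author>"
    "<title>Principles of Database</title><year>1998</year>"
    "</book>"
    "</bib>"
)

# @ abbreviates attribute::
priced = [b.findtext("title") for b in bib.findall("book[@price='25.00']")]

# // abbreviates /descendant-or-self::node()/ and so reaches below the children of book
last_names = [n.text for n in bib.findall("book//last-name")]

# . abbreviates self::node()
all_books = bib.findall(".//book")

# .. abbreviates parent::node(): the parents of all year elements
year_parents = [p.tag for p in bib.findall(".//year/..")]

print(priced, last_names, len(all_books), year_parents)
```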
Expressions
• Function Calls
• Node-sets
• Booleans
• Numbers
• Strings
Function Calls
• A function call expression is evaluated by using the FunctionName to identify a function in the function library of the expression evaluation context, evaluating each argument, converting it to the type the function requires, and then calling the function.
• An argument is converted:
  • to type string as if by calling the string function,
  • to type boolean as if by calling the boolean function,
  • to type number as if by calling the number function.
• An argument that is not of type node-set cannot be converted to a node-set.
Ex: position() returns the position of the context node in the context node list, as a number.
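The conversions performed "as if by calling" the boolean and number functions follow fixed rules in XPath 1.0, sketched below in Python. The helpers `xpath_boolean` and `xpath_number` are hypothetical names, and the sketch assumes a node-set is modelled as a Python list of string-values in document order. Conversion to string is left out because formatting numbers as strings has more cases than fit here.

```python
import math
import re

# Optional whitespace, optional minus sign, digits with an optional fraction.
# XPath 1.0 number syntax has no exponent and no leading plus sign.
_NUMBER = re.compile(r"[ \t\r\n]*-?([0-9]+(\.[0-9]*)?|\.[0-9]+)[ \t\r\n]*")


def xpath_boolean(value):
    """Model of boolean(): numbers are true unless zero or NaN;
    strings and node-sets are true unless empty."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return not (value == 0 or math.isnan(value))
    if isinstance(value, (str, list)):
        return len(value) > 0
    raise TypeError(f"not an XPath value in this sketch: {value!r}")


def xpath_number(value):
    """Model of number(): strings that are not numbers become NaN."""
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, list):
        # A node-set is first converted to a string: the string-value of its
        # first node in document order, or the empty string if it is empty.
        value = value[0] if value else ""
    if isinstance(value, str):
        return float(value.strip(" \t\r\n")) if _NUMBER.fullmatch(value) else math.nan
    raise TypeError(f"not an XPath value in this sketch: {value!r}")


print(xpath_boolean("false"), xpath_boolean(0), xpath_boolean([]))
print(xpath_number(" 12.5 "), xpath_number("1e3"), xpath_number(["25.00", "n/a"]))
```

One consequence worth noticing is that the string "false" converts to true, because any non-empty string does.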
Node-sets
• A location path can be used as an expression.
• The expression returns the set of nodes selected by the path.
Booleans
• A boolean has one of two values: true or false.
• The following operators can be used in boolean expressions, or to combine two boolean expressions according to the usual rules of boolean logic:
  • or
  • and
  • =, !=
  • <=, <, >=, >
Ex: book='XML complete' or book='Principles of Database'
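When one side of = or != is a node-set and the other a string, XPath 1.0 defines the comparison existentially: it is true if some node in the set has a string-value that satisfies it. The sketch below models that rule on top of `xml.etree.ElementTree`. The helper names are hypothetical, and the string-value of an element is taken to be the concatenation of its descendant text.

```python
import xml.etree.ElementTree as ET


def string_value(elem):
    """String-value of an element: all descendant text, concatenated in document order."""
    return "".join(elem.itertext())


def nodeset_eq(nodes, text):
    """Models node-set = string: true if any node's string-value equals text."""
    return any(string_value(n) == text for n in nodes)


def nodeset_ne(nodes, text):
    """Models node-set != string: true if any node's string-value differs from text."""
    return any(string_value(n) != text for n in nodes)


bib = ET.fromstring(
    "<bib>"
    "<book><title>XML complete</title></book>"
    "<book><title>Principles of Database</title></book>"
    "</bib>"
)
titles = bib.findall("book/title")

either = nodeset_eq(titles, "XML complete") or nodeset_eq(titles, "Principles of Database")
both_eq_and_ne = nodeset_eq(titles, "XML complete") and nodeset_ne(titles, "XML complete")
nothing_matches = nodeset_eq([], "XML complete") or nodeset_ne([], "XML complete")
print(either, both_eq_and_ne, nothing_matches)
```

Under this rule = and != are not opposites for node-sets: both can be true for the same set, and both are false for an empty set.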
Numbers
• A number is a double-precision (IEEE 754) floating-point value; XPath has no separate integer type.
• The basic arithmetic operators are +, -, *, div and mod.
Ex: @id div 10
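Because every XPath number is a floating-point value, div and mod behave differently from integer arithmetic in many programming languages. The Python sketch below models them with the hypothetical helpers `xpath_div` and `xpath_mod`. It assumes plain IEEE 754 behaviour, where division by zero yields an infinity or NaN rather than an error, and a mod result that takes the sign of the dividend.

```python
import math


def xpath_div(a, b):
    """Models XPath 'div': floating-point division, including division by zero."""
    a, b = float(a), float(b)
    if math.isnan(a) or math.isnan(b):
        return math.nan
    if b == 0.0:
        if a == 0.0:
            return math.nan  # 0 div 0 is NaN
        sign = math.copysign(1.0, a) * math.copysign(1.0, b)
        return math.copysign(math.inf, sign)  # 1 div 0 is Infinity
    return a / b


def xpath_mod(a, b):
    """Models XPath 'mod': remainder of a truncating division."""
    a, b = float(a), float(b)
    if math.isnan(a) or math.isnan(b) or math.isinf(a) or b == 0.0:
        return math.nan
    return math.fmod(a, b)  # result has the sign of the dividend


print(xpath_div(25, 10))                   # models: @id div 10, with @id = "25"
print(xpath_div(1, 0), xpath_div(0, 0))
print(xpath_mod(5, 2), xpath_mod(-5, 2), xpath_mod(5, -2))
```

Python's own `%` operator would give a different answer for `-5 mod 2`, which is why the sketch uses `math.fmod`.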
Strings
• A string consists of a sequence of zero or more characters.
• String literals may be enclosed in either single or double quotes.
• Comparison operators: =, !=
Core Function Library
• XPath defines a core set of functions for use in expressions.
• All implementations of XPath must implement the core function library.
• Four types of function:
  • Node-set functions: operate on, or return information about, node-sets.
  • String functions: basic string operations. Ex: substring("12345", 0, 3) returns "12"
  • Boolean functions: all return true or false.
  • Number functions: basic number operations.
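The result "12" for substring("12345", 0, 3) looks odd until the rule behind it is spelled out: characters are numbered from 1, and a character is returned when its position is at least round(start) and less than round(start) + round(length). Position 0 lies before the first character, so only positions 1 and 2 qualify. The Python sketch below models that rule; `xpath_round` and `xpath_substring` are hypothetical helper names, and rounding is assumed to go to the nearest integer with halves rounded up, as the XPath round function does.

```python
import math


def xpath_round(x):
    """Models XPath round(): nearest integer, halves rounded towards +infinity."""
    x = float(x)
    if math.isnan(x) or math.isinf(x):
        return x
    return math.floor(x + 0.5)


def xpath_substring(s, start, length=None):
    """Models XPath substring() with 1-based character positions."""
    first = xpath_round(start)
    if length is None:
        return "".join(ch for pos, ch in enumerate(s, start=1) if pos >= first)
    end = first + xpath_round(length)  # exclusive upper bound on the position
    return "".join(ch for pos, ch in enumerate(s, start=1) if first <= pos < end)


print(xpath_substring("12345", 0, 3))
print(xpath_substring("12345", 2, 3))
print(xpath_substring("12345", 1.5, 2.6))
print(xpath_substring("12345", 2))
```

Comparisons with NaN are false in Python just as in XPath, so a NaN start or length yields the empty string without special handling.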
XPath Utilities
• Miscellaneous utilities related to XPath: http://www.xmlsoftware.com/xpath/
• XPath Visualiser:
  • A tool that evaluates an XPath expression and presents the resulting node-set visually.
  • Lets you experiment with XPath until you find the correct expression.
  • Displays the XML source document much like the default IE view, with the same syntax colouring and collapsible, expandable container nodes.
  • Makes learning XPath straightforward.
XPath Visualiser
[Screenshot of XPath Visualiser with callouts: context node, XPath input, tree view of the XML document, XPath evaluation result (the result is highlighted)]
Conclusion
• XPath is a complete pattern-matching language.
• It provides a concise way of addressing parts of an XML document.
• It is the basis for XSLT, XPointer and the work of the XML Query Working Group, and is supported by the W3C.
• Using XPath in practice mainly requires learning the abbreviated syntax of location path expressions and the functions of the core library.
Reference
• XML Path Language (XPath) V1.0: http://www.w3.org/TR/xpath
• XML in a Nutshell: http://www.oreilly.com/catalog/xmlnut/chapter/ch09.html
• Managing XML and Semistructured Data: http://www.cs.washington.edu/homes/suciu/COURSES/590DS/06xpath.htm
• XPath utilities: http://www.xmlsoftware.com/xpath/