XPath Query Evaluation - A Top Down Approach
Mohammed Pithapurwala (mp66@cse.buffalo.edu)
Pejus Das (pejusdas@cse.buffalo.edu)
Introduction
• XPath query evaluation
• Uses:
  • Select nodes in an XML document
  • XSLT, XQuery
• Polynomial vs. exponential evaluation time
• Top-down algorithm
XPath
• What is XPath?
  • child::section[position()<6] / descendant::cite / attribute::href
  • Selects all href attributes of cite elements in the first 5 sections of an article document
• Structure of an XPath expression
  • Axes
  • Node types
  • Node tests
• Returns
  • Number, node set, string, or boolean
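The example query can be tried out with the XPath 1.0 engine that ships with the JDK (`javax.xml.xpath`). The following is only a sketch: the sample article document, the class name `XPathExample`, and the helper `select` are invented for this illustration and are not part of the original slides.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XPathExample {
    // Hypothetical sample document, invented for this illustration.
    static final String ARTICLE =
        "<article>"
        + "<section><p><cite href='a.html'/></p></section>"
        + "<section><cite href='b.html'/><cite/></section>"
        + "</article>";

    /**
     * Evaluates a node-set query with the document element as context node
     * and returns the node value of each selected node.
     */
    static List<String> select(String xml, String query) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                query, doc.getDocumentElement(), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getNodeValue());
            }
            return values;
        } catch (Exception e) {
            // Wrap checked parser/XPath exceptions to keep the sketch short.
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // The query from the slide, evaluated relative to the article element.
        System.out.println(select(ARTICLE,
            "child::section[position()<6]/descendant::cite/attribute::href"));
    }
}
```

The cite element without an href attribute contributes nothing to the result, and tightening the predicate to `position()<2` restricts the selection to the first section.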
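Implementation
• XPath axes
  • Child
  • Parent
  • Descendant
• Axis functions
  • firstchild
  • nextsibling
• Axes expressed through the axis functions (. is composition, $^{*}$ is reflexive-transitive closure, $^{-1}$ is the inverse relation)
  • $\text{child} := \text{firstchild}.\text{nextsibling}^{*}$
  • $\text{parent} := (\text{nextsibling}^{-1})^{*}.\text{firstchild}^{-1}$
  • $\text{descendant} := \text{firstchild}.(\text{firstchild} \cup \text{nextsibling})^{*}$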
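Code Snippet

    import java.util.Iterator;
    import java.util.List;

    // Element is assumed to be JDOM's Element class, whose getChildren()
    // returns the list of child elements.
    // Returns the first child element of currNode, or null if it has none.
    public static Element firstChild(Element currNode) {
        Element fChild = null;
        List childNode = currNode.getChildren();
        Iterator iterator = childNode.iterator();
        if (iterator.hasNext()) {
            fChild = (Element) iterator.next();
        }
        return fChild;
    }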
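The child, parent, and descendant axes can be derived from the two primitives firstchild and nextsibling. The sketch below does this with the JDK's standard `org.w3c.dom.Node` API (`getFirstChild`, `getNextSibling`, `getPreviousSibling`, `getParentNode`) rather than the JDOM API used in the snippet above. The class name `Axes` and the helpers `parseRoot` and `names` are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class Axes {
    /** child := firstchild.nextsibling* */
    static List<Node> child(Node x) {
        List<Node> out = new ArrayList<>();
        for (Node y = x.getFirstChild(); y != null; y = y.getNextSibling()) {
            out.add(y);
        }
        return out;
    }

    /** descendant := firstchild.(firstchild U nextsibling)* */
    static List<Node> descendant(Node x) {
        List<Node> out = new ArrayList<>();
        Deque<Node> stack = new ArrayDeque<>();
        if (x.getFirstChild() != null) stack.push(x.getFirstChild());
        while (!stack.isEmpty()) {
            Node y = stack.pop();
            out.add(y);
            // Push the sibling first so the subtree of y is visited before it,
            // which yields document order.
            if (y.getNextSibling() != null) stack.push(y.getNextSibling());
            if (y.getFirstChild() != null) stack.push(y.getFirstChild());
        }
        return out;
    }

    /** parent := (nextsibling^-1)*.firstchild^-1 */
    static Node parent(Node x) {
        Node y = x;
        // Walk back to the first sibling: (nextsibling^-1)*
        while (y.getPreviousSibling() != null) y = y.getPreviousSibling();
        // The first sibling is the first child of the parent: firstchild^-1
        return y.getParentNode();
    }

    /** Hypothetical helper: parses xml and returns its document element. */
    static Node parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    /** Hypothetical helper: node names, for printing and comparison. */
    static List<String> names(List<Node> nodes) {
        List<String> out = new ArrayList<>();
        for (Node n : nodes) out.add(n.getNodeName());
        return out;
    }

    public static void main(String[] args) {
        Node a = parseRoot("<a><b><c/></b><d/></a>");
        System.out.println(names(child(a)));       // prints [b, d]
        System.out.println(names(descendant(a)));  // prints [b, c, d]
    }
}
```

Each node is reachable from its parent's first child by exactly one chain of firstchild and nextsibling steps, so the descendant traversal visits every node once without needing a visited set.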
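Node Test & Expressions
• Node test expressions
  • $T(\text{node}())$ = all nodes in the document
  • $T(\text{attribute}(\text{href}))$ = all nodes labelled href
  • $\text{attribute}(S) := \text{child}(S) \cap T(\text{attribute}())$
• Node numbering
  • $<_{doc,\chi}$: the node order relative to the axis $\chi$ (document order for forward axes, reverse document order for reverse axes)
  • $idx_{\chi}(x, S)$: the index of node $x$ in the node set $S$ with respect to $<_{doc,\chi}$, starting at 1
• Context
  • $c = \langle x, k, n \rangle$
  • $x$: context node
  • $k$: position of the node
  • $n$: context size
• XPath expressions are evaluated relative to a context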
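XPath Evaluation
• Location step: $\chi::t[e]$
  • $\chi \in \{\text{child}, \text{parent}, \text{descendant}, \ldots\}$: axis
  • $t$: node test expression
  • $e$: expression
• Expressions
  • The value of $e$ is a node set, number, string, or boolean
  • $\text{ArithOp} \in \{+, -, *, \text{div}, \text{mod}\}$
  • $\text{EqOp} \in \{=, \neq\}$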
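XPath Semantics
• $[\![\pi]\!](\langle x, k, n \rangle) := P[\![\pi]\!](x)$ for a location path $\pi$
• $[\![\text{position}()]\!](\langle x, k, n \rangle) := k$
• $[\![\text{last}()]\!](\langle x, k, n \rangle) := n$
• For all other kinds of expressions, $e = Op(e_1, \ldots, e_m)$:
  $[\![Op(e_1, \ldots, e_m)]\!](c) := Op([\![e_1]\!](c), \ldots, [\![e_m]\!](c))$
• $[\![e]\!]$ maps a context to a value of the type of $e$.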
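Intuitive Algorithm
$P[\![\chi::t[e_1]\cdots[e_m]]\!](x)$ :=
begin
  $S := \{ y \mid x \chi y,\ y \in T(t) \}$;
  for $1 \le i \le m$ (in ascending order) do
    $S := \{ y \in S \mid [\![e_i]\!](\langle y, idx_{\chi}(y, S), |S| \rangle) = \text{true} \}$;
  return $S$;
end;
$P[\![\pi_1 | \pi_2]\!](x) := P[\![\pi_1]\!](x) \cup P[\![\pi_2]\!](x)$
$P[\![/\pi]\!](x) := P[\![\pi]\!](\text{root})$
$P[\![\pi_1/\pi_2]\!](x) := \bigcup_{y \in P[\![\pi_1]\!](x)} P[\![\pi_2]\!](y)$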
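A single location step with predicates can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the tree model `Node`, the `Context` class for the triple (x, k, n), and the representation of predicates as `Predicate<Context>` are all hypothetical, and the axis function is assumed to return nodes already sorted in axis order so that a list index plays the role of idx.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

final class IntuitiveEval {
    /** Hypothetical minimal tree node: a label plus ordered children. */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();
        Node(String label, Node... kids) {
            this.label = label;
            children.addAll(Arrays.asList(kids));
        }
    }

    /** Context triple: node x, position k, context size n. */
    static final class Context {
        final Node x;
        final int k;
        final int n;
        Context(Node x, int k, int n) { this.x = x; this.k = k; this.n = n; }
    }

    /**
     * Evaluates one step axis::t[e1]...[em] from context node x.
     * The axis function must return nodes in axis order.
     */
    static List<Node> step(Node x,
                           Function<Node, List<Node>> axis,
                           String t,
                           List<Predicate<Context>> predicates) {
        // S := { y | x axis y, y in T(t) }
        List<Node> s = new ArrayList<>();
        for (Node y : axis.apply(x)) {
            if (t.equals("*") || t.equals(y.label)) s.add(y);
        }
        // Apply the predicates in ascending order; positions and sizes
        // are recomputed against the current S each time.
        for (Predicate<Context> e : predicates) {
            List<Node> kept = new ArrayList<>();
            int n = s.size();
            for (int i = 0; i < n; i++) {
                if (e.test(new Context(s.get(i), i + 1, n))) kept.add(s.get(i));
            }
            s = kept;
        }
        return s;
    }

    public static void main(String[] args) {
        Node s1 = new Node("section");
        Node s2 = new Node("section");
        Node s3 = new Node("section");
        Node article = new Node("article", new Node("title"), s1, s2, s3);

        // child::section[position() < 3][position() = last()]
        List<Predicate<Context>> preds = new ArrayList<>();
        preds.add(c -> c.k < 3);
        preds.add(c -> c.k == c.n);
        List<Node> result = step(article, n -> n.children, "section", preds);
        System.out.println(result.size() == 1 && result.get(0) == s2); // prints true
    }
}
```

The second predicate sees positions relative to the two sections that survived the first predicate, which is why last() is 2 there and the second section is selected.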
Runtime
• Example:
  • Document: <a><b/><b/></a>
  • Query: //a/b/parent::a/b/parent::a/b
  • Construct longer queries by appending /parent::a/b
• procedure process-location-step(n0, Q)
  /* n0 is the context node; query Q is a list of location steps */
  begin
    node set S := apply Q.head to node n0;
    if (Q.tail is not empty) then
      for each node $n \in S$ do process-location-step(n, Q.tail);
  end
• Complexity: $\text{Time}(|Q|) = |D|^{|Q|}$, i.e. exponential in the size of the query, where $|D|$ is the size of the document and $|Q|$ the size of the query
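The blow-up can be observed by counting how often process-location-step is invoked on the example document. The sketch below is a hypothetical re-implementation of the procedure above on a minimal tree model. The names `NaiveEval`, `Node`, `Step`, and `countCalls` are invented here, and `//a` is simplified to a descendant step from a virtual document root.

```java
import java.util.ArrayList;
import java.util.List;

final class NaiveEval {
    enum Axis { CHILD, PARENT, DESCENDANT }

    /** Hypothetical minimal tree node with a parent pointer. */
    static final class Node {
        final String label;
        Node parent;
        final List<Node> children = new ArrayList<>();
        Node(String label, Node... kids) {
            this.label = label;
            for (Node kid : kids) { kid.parent = this; children.add(kid); }
        }
    }

    /** One location step: an axis plus a label used as node test. */
    static final class Step {
        final Axis axis;
        final String label;
        Step(Axis axis, String label) { this.axis = axis; this.label = label; }
    }

    static long calls;

    /** Applies a single location step to a single context node. */
    static List<Node> apply(Step s, Node n) {
        List<Node> out = new ArrayList<>();
        switch (s.axis) {
            case CHILD:
                for (Node c : n.children) if (c.label.equals(s.label)) out.add(c);
                break;
            case PARENT:
                if (n.parent != null && n.parent.label.equals(s.label)) out.add(n.parent);
                break;
            case DESCENDANT:
                for (Node c : n.children) {
                    if (c.label.equals(s.label)) out.add(c);
                    out.addAll(apply(s, c));
                }
                break;
        }
        return out;
    }

    /** Direct transcription of process-location-step(n0, Q). */
    static void processLocationStep(Node n0, List<Step> q) {
        calls++;
        List<Node> s = apply(q.get(0), n0);
        List<Step> tail = q.subList(1, q.size());
        if (!tail.isEmpty()) {
            for (Node n : s) processLocationStep(n, tail);
        }
    }

    /** Runs //a/b followed by extraPairs copies of /parent::a/b; returns the call count. */
    static long countCalls(int extraPairs) {
        Node doc = new Node("#doc", new Node("a", new Node("b"), new Node("b")));
        List<Step> q = new ArrayList<>();
        q.add(new Step(Axis.DESCENDANT, "a"));
        q.add(new Step(Axis.CHILD, "b"));
        for (int i = 0; i < extraPairs; i++) {
            q.add(new Step(Axis.PARENT, "a"));
            q.add(new Step(Axis.CHILD, "b"));
        }
        calls = 0;
        processLocationStep(doc, q);
        return calls;
    }

    public static void main(String[] args) {
        for (int j = 0; j <= 10; j += 2) {
            System.out.println(j + " extra pairs: " + countCalls(j) + " calls");
        }
    }
}
```

On this document every appended /parent::a/b pair doubles the number of contexts, so the call count for j pairs is 2^(j+2) - 2: the query from the slide (j = 2) needs 14 calls and j = 10 already needs 4094.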
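Algorithm
$S_{\downarrow}[\![\chi::t[e_1]\cdots[e_m]]\!](X_1, \ldots, X_k)$ :=
begin
  $S := \{ \langle x, y \rangle \mid x \in \bigcup_{i=1}^{k} X_i,\ x \chi y,\ \text{and } y \in T(t) \}$;
  for each $1 \le i \le m$ (in ascending order) do
  begin
    fix some order $\vec{S} = \langle x_1, y_1 \rangle, \ldots, \langle x_l, y_l \rangle$ for $S$;
    $\langle r_1, \ldots, r_l \rangle := E_{\downarrow}[\![e_i]\!](t_1, \ldots, t_l)$ where $t_j = \langle y_j, idx_{\chi}(y_j, S_j), |S_j| \rangle$ and $S_j := \{ z \mid \langle x_j, z \rangle \in S \}$;
    $S := \{ \langle x_j, y_j \rangle \mid r_j \text{ is true} \}$;
  end;
  for each $1 \le i \le k$ do
    $R_i := \{ y \mid \langle x, y \rangle \in S,\ x \in X_i \}$;
  return $\langle R_1, \ldots, R_k \rangle$;
end;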
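Algorithm (contd.)
$S_{\downarrow}[\![/\pi]\!](X_1, \ldots, X_k) := S_{\downarrow}[\![\pi]\!](\{\text{root}\}, \ldots, \{\text{root}\})$, with $\{\text{root}\}$ repeated $k$ times
$S_{\downarrow}[\![\pi_1/\pi_2]\!](X_1, \ldots, X_k) := S_{\downarrow}[\![\pi_2]\!](S_{\downarrow}[\![\pi_1]\!](X_1, \ldots, X_k))$
$S_{\downarrow}[\![\pi_1 | \pi_2]\!](X_1, \ldots, X_k) := S_{\downarrow}[\![\pi_1]\!](X_1, \ldots, X_k) \cup S_{\downarrow}[\![\pi_2]\!](X_1, \ldots, X_k)$, with the union taken componentwise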
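The effect of evaluating a step once for a whole vector of node sets, instead of once per context node, can be sketched as follows. This is a deliberately simplified illustration and not the full algorithm: predicates are omitted, only the composition rule for paths is implemented, and all names (`TopDownEval`, `Node`, `Step`, `nodeVisits`, `run`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TopDownEval {
    enum Axis { CHILD, PARENT, DESCENDANT }

    /** Hypothetical minimal tree node with a parent pointer. */
    static final class Node {
        final String label;
        Node parent;
        final List<Node> children = new ArrayList<>();
        Node(String label, Node... kids) {
            this.label = label;
            for (Node kid : kids) { kid.parent = this; children.add(kid); }
        }
    }

    static final class Step {
        final Axis axis;
        final String label;
        Step(Axis axis, String label) { this.axis = axis; this.label = label; }
    }

    /** Counts how many (step, context node) pairs are evaluated. */
    static long nodeVisits;

    /** { y | x axis y, y in T(t) } for a single node x. */
    static Set<Node> image(Step s, Node x) {
        Set<Node> out = new LinkedHashSet<>();
        switch (s.axis) {
            case CHILD:
                for (Node c : x.children) if (c.label.equals(s.label)) out.add(c);
                break;
            case PARENT:
                if (x.parent != null && x.parent.label.equals(s.label)) out.add(x.parent);
                break;
            case DESCENDANT:
                for (Node c : x.children) {
                    if (c.label.equals(s.label)) out.add(c);
                    out.addAll(image(s, c));
                }
                break;
        }
        return out;
    }

    /** One step without predicates, applied to the vector (X1, ..., Xk). */
    static List<Set<Node>> step(Step s, List<Set<Node>> xs) {
        // S = { <x, y> }, stored as x -> { y } and computed once per distinct x.
        Map<Node, Set<Node>> pairs = new HashMap<>();
        for (Set<Node> xi : xs) {
            for (Node x : xi) {
                if (!pairs.containsKey(x)) {
                    nodeVisits++;
                    pairs.put(x, image(s, x));
                }
            }
        }
        // Ri := { y | <x, y> in S, x in Xi }
        List<Set<Node>> rs = new ArrayList<>();
        for (Set<Node> xi : xs) {
            Set<Node> ri = new LinkedHashSet<>();
            for (Node x : xi) ri.addAll(pairs.get(x));
            rs.add(ri);
        }
        return rs;
    }

    /** Composition: feed the result vector of each step into the next step. */
    static List<Set<Node>> path(List<Step> q, List<Set<Node>> xs) {
        for (Step s : q) xs = step(s, xs);
        return xs;
    }

    /** Runs //a/b plus extraPairs copies of /parent::a/b; returns {result size, nodeVisits}. */
    static long[] run(int extraPairs) {
        Node doc = new Node("#doc", new Node("a", new Node("b"), new Node("b")));
        List<Step> q = new ArrayList<>();
        q.add(new Step(Axis.DESCENDANT, "a"));
        q.add(new Step(Axis.CHILD, "b"));
        for (int i = 0; i < extraPairs; i++) {
            q.add(new Step(Axis.PARENT, "a"));
            q.add(new Step(Axis.CHILD, "b"));
        }
        Set<Node> start = new LinkedHashSet<>();
        start.add(doc);
        List<Set<Node>> xs = new ArrayList<>();
        xs.add(start);
        nodeVisits = 0;
        List<Set<Node>> result = path(q, xs);
        return new long[] { result.get(0).size(), nodeVisits };
    }

    public static void main(String[] args) {
        long[] r = run(10);
        System.out.println("result size " + r[0] + ", node visits " + r[1]);
    }
}
```

Because duplicate context nodes collapse inside the node sets, each step touches at most every document node once. On the example document the work grows by 3 node visits per appended /parent::a/b pair, so 10 pairs need 32 visits in total.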
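Semantics Function
$E_{\downarrow}[\![\pi]\!](\langle x_1, k_1, n_1 \rangle, \ldots, \langle x_l, k_l, n_l \rangle) := S_{\downarrow}[\![\pi]\!](\{x_1\}, \ldots, \{x_l\})$
$E_{\downarrow}[\![\text{position}()]\!](\langle x_1, k_1, n_1 \rangle, \ldots, \langle x_l, k_l, n_l \rangle) := \langle k_1, \ldots, k_l \rangle$
$E_{\downarrow}[\![\text{last}()]\!](\langle x_1, k_1, n_1 \rangle, \ldots, \langle x_l, k_l, n_l \rangle) := \langle n_1, \ldots, n_l \rangle$
For the remaining kinds of expressions:
$E_{\downarrow}[\![Op(e_1, \ldots, e_m)]\!](c_1, \ldots, c_l) := Op(E_{\downarrow}[\![e_1]\!](c_1, \ldots, c_l), \ldots, E_{\downarrow}[\![e_m]\!](c_1, \ldots, c_l))$, with $Op$ applied componentwise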
References
• G. Gottlob, Ch. Koch, R. Pichler: XPath Processing in a Nutshell. SIGMOD Record, March 2003.
• G. Gottlob, Ch. Koch, R. Pichler: Efficient Algorithms for Processing XPath Queries. ACM TODS, to appear.