Bottom-up Evaluation of XPath Queries

Presentation Transcript


  1. Bottom-up Evaluation of XPath Queries • Stephanie H. Li, Zhiping Zou

  2. Outline • Overview of XPath • Motivation • Algorithms: bottom-up evaluation • Design and implementation

  3. Introduction: Overview • Overview of XPath • XPath is a query language designed for addressing nodes of XML documents. • Data model • Syntax • Expressions • Location paths • Operators • Functions • Evaluation (context)

  4. Data Model • XML document = tree of nodes • 7 kinds of nodes: • Element • Attribute • Text • Namespace • Processing-instruction • Comment • Document (root)

  5. Data Model (Example) • Document: <a> <b/> <b/> <b/> <b/> </a> • Tree: r (the root node) has one child a (the root element), which has four children b, b, b, b

  6. Expression • XPath uses expressions to select nodes from XML documents • The main types of expressions are location paths, functions, and operators

  7. Location Paths • Although there are many different kinds of XPath expressions, the one that’s of primary use in Java programs is the location path. • Location path: /child::movies/child::movie[position()=5] • A location path is a sequence of steps; in the step child::movie[position()=5], child is the axis, movie is the node test, and [position()=5] is the predicate

  8. Location Step • axis::nodetest[predicates] • Axis: chooses the direction to move from the context node • Node test: determines what kinds of nodes will be selected along that axis • Predicates: further filter the node-set.

  9. XPath Axis • The axis is the main navigator for an XML document • ancestor: nodes along the path to the root • ancestor-or-self: same, but including the context node • child: children of the context node • descendant: descendants of the context node • descendant-or-self: same, but including the context node • following: nodes after the context node in document order, excluding descendants • following-sibling: following siblings of the context node • parent: the parent of the context node • preceding: nodes before the context node in document order, excluding ancestors • preceding-sibling: preceding siblings of the context node
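A few of these axes can be sketched in Python over the four-b example document from slide 5. This is only an illustration, assuming the standard-library xml.etree.ElementTree parser; the helper names (parent_of, child, descendant, and so on) are hypothetical and not part of the presented system, which uses Java and JDom.

```python
import xml.etree.ElementTree as ET

# Example document from slide 5: root element <a> with four <b/> children.
doc = ET.fromstring("<a><b/><b/><b/><b/></a>")

# ElementTree elements carry no parent pointer, so build a child -> parent map.
parent_of = {c: p for p in doc.iter() for c in p}

def child(x):
    return list(x)

def descendant(x):
    # iter() yields x itself first, then its descendants in document order.
    return list(x.iter())[1:]

def parent(x):
    return [parent_of[x]] if x in parent_of else []

def ancestor(x):
    out = []
    while x in parent_of:
        x = parent_of[x]
        out.append(x)
    return out

def following_sibling(x):
    if x not in parent_of:
        return []
    siblings = list(parent_of[x])
    return siblings[siblings.index(x) + 1:]

def preceding_sibling(x):
    # Returned in document order here; XPath numbers positions on this
    # reverse axis starting from the nearest sibling.
    if x not in parent_of:
        return []
    siblings = list(parent_of[x])
    return siblings[:siblings.index(x)]

b1 = doc[0]
print(len(following_sibling(b1)))  # 3
print(ancestor(b1) == [doc])       # True
```

Each axis is a function from one context node to a list of nodes, which is the shape the evaluation strategies on the later slides rely on.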

  10. Node Test • Node type test • Example • T(root()) = {r} • T(element()) = {a, b1, ..., b4} • T(element(a)) = {a} • T(element(b)) = {b1, ..., b4} • Node name test • Element node name
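The element tests in this example can be reproduced with a short Python sketch. It assumes the standard-library xml.etree.ElementTree parser, and the function name T_element is made up for illustration. ElementTree does not expose the XPath root node r, so T(root()) is not modelled here.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<a><b/><b/><b/><b/></a>")

def T_element(document, name=None):
    """All element nodes in document order, optionally restricted to one name."""
    return [n for n in document.iter() if name is None or n.tag == name]

print(len(T_element(doc)))       # 5  -> {a, b1, ..., b4}
print(len(T_element(doc, "a")))  # 1  -> {a}
print(len(T_element(doc, "b")))  # 4  -> {b1, ..., b4}
```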

  11. Operators and Functions • Arithmetic operators • Operators for comparisons and boolean logic: • {<, >, <=, >=, =, !=} {or, and} • Functions • position() • last()

  12. XPath Query Evaluation • Query evaluation is a major algorithmic problem • The main construct is the expression • Each expression is evaluated to yield an object of one of these four types: • Node-set (an unordered collection of nodes without duplicates) • Boolean (true or false) • Number (a floating-point number) • String

  13. Context • All XPath expressions are evaluated w.r.t. a context, which consists of • A context node • A context position (int) • A context size (int) • The input context for query evaluation is chosen by the user.

  14. Motivation • Claim: • The way XPath is defined in the W3C XPath recommendation motivates an inefficient (exponential-time) implementation. • The paper proposes a more efficient (polynomial-time) approach.

  15. Basic query evaluation strategy
      procedure process-location-step(n0, Q)
      /* n0 is the context node; query Q is a list of location steps */
      begin
        node set S := apply Q.first to node n0;
        if (Q.tail is not empty) then
          for each node n ∈ S do
            process-location-step(n, Q.tail);
      end
      • The algorithm recursively evaluates each remaining step for each matching node of the current step.
      • Running time: $\mathrm{Time}(|Q|) = |D| \cdot \mathrm{Time}(|Q|-1)$ when $|Q| > 0$ and $\mathrm{Time}(|Q|) = 1$ when $|Q| = 0$, which gives $\mathrm{Time}(|Q|) = |D|^{|Q|}$.
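The blow-up can be observed on a tiny document. The following Python sketch is an illustration only, assuming xml.etree.ElementTree and a query encoded as a list of (axis, name) pairs; that encoding, the stats counter, and the returned result list are assumptions added for the demonstration and are not in the slide's pseudocode. It applies the same recursion to the query child::b/parent::a repeated k times on <a><b/><b/></a>.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<a><b/><b/></a>")
parent_of = {c: p for p in doc.iter() for c in p}

# Only the two axes needed for the demonstration.
AXES = {
    "child": lambda x: list(x),
    "parent": lambda x: [parent_of[x]] if x in parent_of else [],
}

def process_location_step(n0, query, stats):
    """Naive recursion: re-evaluate the query tail for every matching node."""
    stats["calls"] += 1
    axis, name = query[0]
    S = [n for n in AXES[axis](n0) if n.tag == name]
    if len(query) == 1:
        return S
    result = []
    for n in S:
        for m in process_location_step(n, query[1:], stats):
            if m not in result:  # node-sets contain no duplicates
                result.append(m)
    return result

def count_calls(k):
    """Number of recursive calls for child::b/parent::a repeated k times."""
    stats = {"calls": 0}
    process_location_step(doc, [("child", "b"), ("parent", "a")] * k, stats)
    return stats["calls"]

print([count_calls(k) for k in range(1, 5)])  # [3, 9, 21, 45]
```

The call count is 3(2^k - 1): it roughly doubles with every repetition, although the answer is always the single node a.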

  16. XPath Evaluation in PTime • Theorem: Let e be an arbitrary XPath expression. Then, for context node x, position k, and size n, the value of e is v, where v is the unique value such that <x,k,n,v> ∈ E↑[e]. • The main principle the paper proposes for obtaining an XPath evaluation algorithm with PTime complexity is the notion of a context-value table (CVT).

  17. Context-value table principle • Given an expression e, the CVT of e specifies all valid combinations of contexts c = <x,k,n> and values v such that e evaluates to v in context c. • The table for e is obtained by first computing the CVTs of the direct subexpressions of e and then combining them into the CVT for e. • The size of each CVT has a polynomial bound. • Each combination step can be carried out in PTime. • Thus, query evaluation under this principle also has a PTime bound overall.
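A small Python sketch may make the table idea concrete. It assumes the five element nodes of the slide 5 example, written as plain strings, and represents each CVT as a dictionary from a context <x,k,n> to a value; this representation is an assumption for illustration, not the layout used in ContextValTable.java.

```python
# Element nodes of the example document <a><b/><b/><b/><b/></a>.
nodes = ["a", "b1", "b2", "b3", "b4"]
D = len(nodes)

# All contexts <x, k, n> with 1 <= k <= n <= |D|.
contexts = [(x, k, n) for x in nodes
            for n in range(1, D + 1)
            for k in range(1, n + 1)]

# CVTs of the two atomic subexpressions.
cvt_position = {c: c[1] for c in contexts}  # position() returns k
cvt_last = {c: c[2] for c in contexts}      # last() returns n

# CVT of position() != last(), combined row by row from its subexpressions.
cvt_neq = {c: cvt_position[c] != cvt_last[c] for c in contexts}

print(len(cvt_neq))            # 75
print(cvt_neq[("b1", 1, 3)])   # True
print(cvt_neq[("b1", 3, 3)])   # False
```

The table has |D| · |D|(|D|+1)/2 rows, a polynomial in |D|, and combining two tables is a single pass over those rows.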

  18. Bottom-up evaluation of XPath

  19. Bottom-up evaluation of XPath
      Algorithm (Bottom-up algorithm for XPath)
      Input: an XPath query Q
      Output: E↑[Q]
      Method:
        let Tree(Q) be the parse tree of query Q;
        R := Ø;
        for each atomic expression l ∈ leaves(Tree(Q)) do
          compute table E↑[l] and add it to R;   [Note: we use JDom to do this]
        while E↑[root(Tree(Q))] ∉ R do
        begin
          take an Op(l1,…,ln) ∈ nodes(Tree(Q)) s.t. E↑[l1],…,E↑[ln] ∈ R;
          compute E↑[Op(l1,…,ln)] using E↑[l1],…,E↑[ln];
          add E↑[Op(l1,…,ln)] to R;
        end;
        return E↑[root(Tree(Q))]
      • By a bottom-up algorithm we mean a method of processing XPath while traversing the parse tree of the query from its leaves up to its root.
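The control flow of this algorithm can be sketched in Python for a restricted fragment. The sketch assumes parse trees written as nested tuples (operator, arg1, arg2), only the atoms position(), last(), and numeric constants, and only the operators listed in OPS; location paths are left out, and every name here is hypothetical rather than taken from QueryEval.java.

```python
import operator

nodes = ["a", "b1", "b2", "b3", "b4"]
D = len(nodes)
contexts = [(x, k, n) for x in nodes
            for n in range(1, D + 1)
            for k in range(1, n + 1)]

ATOMS = {"position()": lambda c: c[1], "last()": lambda c: c[2]}
OPS = {"!=": operator.ne, "=": operator.eq, "and": lambda a, b: a and b}

def subexpressions(expr):
    """Yield every node of the parse tree; tuples are operator nodes."""
    yield expr
    if isinstance(expr, tuple):
        for arg in expr[1:]:
            yield from subexpressions(arg)

def bottom_up(query):
    tree_nodes = list(subexpressions(query))
    R = {}  # maps each evaluated subexpression to its context-value table
    # Leaves first: atomic expressions and constants.
    for e in tree_nodes:
        if not isinstance(e, tuple):
            atom = ATOMS.get(e, lambda c, value=e: value)
            R[e] = {c: atom(c) for c in contexts}
    # Then any operator node whose argument tables are all in R.
    while query not in R:
        for e in tree_nodes:
            if e not in R and all(arg in R for arg in e[1:]):
                op = OPS[e[0]]
                R[e] = {c: op(*(R[arg][c] for arg in e[1:])) for c in contexts}
    return R[query]

# position() != last() and position() = 1
query = ("and", ("!=", "position()", "last()"), ("=", "position()", 1))
table = bottom_up(query)
print(table[("b1", 1, 3)])  # True
print(table[("b1", 2, 3)])  # False
```

Each subexpression is evaluated exactly once for all contexts, so the work is bounded by the number of parse-tree nodes times the table size.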

  20. Bottom-up evaluation of XPath • Example • XML:
      <?xml version="1.0"?>
      <people>
        <person born="1912" died="1954" id="p342">
          <name> Alan Turing </name>
          <!-- Did the word computer scientist exist in Turing's day? -->
          <profession>computer scientist</profession>
          <profession>mathematician</profession>
          <profession>cryptographer</profession>
          <homepage>href="http://www.turing.org.uk/"</homepage>
        </person>
        <person born="1918" died="1988" id="p4567">
          <name>Richard M. Feynman</name>
          <profession>physicist</profession>
          <hobby>Playing the bongoes</hobby>
        </person>
      </people>

  21. Example: XML Doc Tree

  22. Example: XPath Query tree • Parse tree • XPath query: descendant::profession/following-sibling::*[position() != last()]

  23. Example: Evaluate subexpressions

  24. Example: Evaluate subexpressions

  25. Example: Evaluate subexpressions
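The figures for these example slides are not in the transcript. As a rough cross-check of the example query, the Python sketch below evaluates it step by step on the XML document from slide 20. It assumes the context node is the people element (the slides do not state the context), uses xml.etree.ElementTree, omits the XML comment since it cannot match following-sibling::*, and evaluates the steps directly rather than through context-value tables.

```python
import xml.etree.ElementTree as ET

XML = """<people>
  <person born="1912" died="1954" id="p342">
    <name> Alan Turing </name>
    <profession>computer scientist</profession>
    <profession>mathematician</profession>
    <profession>cryptographer</profession>
    <homepage>href="http://www.turing.org.uk/"</homepage>
  </person>
  <person born="1918" died="1988" id="p4567">
    <name>Richard M. Feynman</name>
    <profession>physicist</profession>
    <hobby>Playing the bongoes</hobby>
  </person>
</people>"""

doc = ET.fromstring(XML)
parent_of = {c: p for p in doc.iter() for c in p}

def following_sibling(x):
    if x not in parent_of:
        return []
    siblings = list(parent_of[x])
    return siblings[siblings.index(x) + 1:]

# Step 1: descendant::profession, starting from the assumed context node.
step1 = [n for n in doc.iter() if n is not doc and n.tag == "profession"]

# Step 2: following-sibling::*[position() != last()] for each node of step 1.
result = []
for x in step1:
    candidates = following_sibling(x)
    size = len(candidates)                    # context size, i.e. last()
    for pos, y in enumerate(candidates, start=1):
        if pos != size and y not in result:   # keep the node-set duplicate-free
            result.append(y)

print([y.text for y in result])  # ['mathematician', 'cryptographer']
```

Only the second and third profession elements of the first person qualify: each has at least one later sibling, so it is never in the last position of a following-sibling node-set that contains it.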

  26. Design and Implementation • Environment • Java, JDK 1.5.0 • JDom 1.0 • XPath 1.0 • Features: • Only element nodes are queried • Abbreviated XPath expressions are not supported • Location steps inside predicates are not supported

  27. System Structure • User input (MyDriver.java): XML file, query, context node • JDom XML parser (org.jdom.input.SAXBuilder): produces the XML document tree • Query parser (Parser.java, BinaryTree.java, Node.java): produces the query tree • Evaluator (QueryEval.java) • Context-value tables (ContextValTable.java and others) • Result for the full XPath query

  28. Conclusion • The XPath query evaluation algorithm runs in polynomial time with respect to the size of both the data and the query (linear in the size of queries and quadratic in the size of data) • No optimization is applied; the implementation strictly adheres to the specification given in the paper

  29. References • G. Gottlob, C. Koch, and R. Pichler. "XPath Processing in a Nutshell". In Proceedings of the 19th IEEE International Conference on Data Engineering (ICDE'03), Bangalore, India, Mar. 2003. • G. Gottlob, C. Koch, and R. Pichler. "Efficient Algorithms for Processing XPath Queries". In Proceedings of the 28th International Conference on Very Large Data Bases (VLDB'02), Hong Kong, China, Aug. 2002. • G. Gottlob, C. Koch, and R. Pichler. "XPath Query Evaluation: Improving Time and Space Efficiency". In Proceedings of the 19th IEEE International Conference on Data Engineering (ICDE'03), Bangalore, India, Mar. 2003. • http://www.ibiblio.org/xml/books/xmljava/chapters/ch16.html
