1 / 34

Database Systems I Query Languages for XML

Database Systems I Query Languages for XML. Query Languages for XML. XPath is a simple query language based on describing similar paths in XML documents. XQuery extends XPath in a style similar to SQL, introducing iterations, subqueries, etc.

Download Presentation

Database Systems I Query Languages for XML

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Database Systems I Query Languages for XML

  2. Query Languages for XML • XPath is a simple query language based on describing similar paths in XML documents. • XQuery extends XPath in a style similar to SQL, introducing iterations, subqueries, etc. • XPath and XQuery expressions are applied to an XML document and return a sequence of qualifying items. • Items can be primitive values or nodes (elements, attributes, documents). • The items returned do not need to be of the same type.

  3. XPath • A path expression returns the sequence of all qualifying items that are reachable from the input item following the specified path. • A path expression is a sequence consisting of tags or attributes and special characters such as slashes (“/”). • Absolute path expressions are applied to some XML document and returns all elements that are reachable from the document’s root element following the specified path. • Relative path expressions are applied to an arbitrary node.

  4. XPath <?XML version=“1.0” standalone =“yes” ?> <bibliography> <book bookID = “b100“> <title> Foundations… </title> <author> Abiteboul </author> <author> Hull </author> <author> Vianu </author> <publisher> Addison Wesley </publisher> <year> 1995 </year> </book> … </bibliography> • Applied to the above document, the XPath expression /bibliography/book/author returns the sequence <author> Abiteboul </author> <author> Hull </author> <author> Vianu </author> . . .

  5. Attributes • If we do not want to return the qualifying elements, but the value one of their attributes, we end the path expression with @attribute. • Applied to the above document, the XPath expression /bibliography/book/@bookID returns the sequence “b100“ . . .

  6. Axes • XPath provides a variety of axes, i.e. modes of navigation through semistructured data. • At each step of a path expression, we can prefix a tag or attribute name by an axis name and a colon. • For example, the path expression /child::bibliography/child::book/attribute::bookID is equivalent to /bibliography/book/@bookID. • Descendants are all direct and indirect children of a node.

  7. Axes • Axes include • parent, • ancestor, • descendant, • next-sibling, • previous-sibling, • self, and • descendant-or-self. • XPath has the following shorthands for axes: / child, // descendant-or-self, @ attribute, . self, .. parent.

  8. Axes <bibliography> <book bookID = “b100“> <title> Foundations… </title> <author affiliation = “IBM“> Abiteboul </author> <author> Hull </author> . . . </book> <article articleID = “a245“> <header> <author authorID = “a739“> Codd </author> <title> A relational database model </title> </header> <body> . . . </body> </article> </bibliography> • Applied to the above document, the path expression /bibliography//author returns the sequence <author> Abiteboul </author> <author> Hull </author> <author> Codd </author> .

  9. Wildcards • We can use wildcards instead of actual tags and attributes:* means any tag, and @* means any attribute. • Examples/bibliography/*/author returns the sequence <author> Abiteboul </author> <author> Hull </author>./bibliography//author/@* returns the sequence “IBM“ “a739“.

  10. Conditions • We can restrict the qualifying paths to those that satisfy a given condition, surrounded by square brackets. • Conditions can be anything returning a boolean value. • In particular, conditions can be: [<subpath>=<value>] there exists a subpath with the specified value [i] the element is the i-th element of the specified type • Example /bibliography/book[/title=“Foundations…”]/author[2] returns <author> Hull </author>.

  11. for/let clauses sequence of items where clause sequence of items order-by/return clause XQuery • XQuery extends XPath, i.e. every XPath expression is an XQuery expression. • Beyond XPath expressions, XQuery introduces FLWOR expressions. • Format: for let where order-by return

  12. XQuery • FLWOR expressions are similar to SQL select . . from . . . where . . . queries. • XQuery allows zero, one or more for and let clauses. • The where clause is optional. • There is one optional order-by clause. • Finally, there is exactly one return clause. • XQuery is case-sensitive. • XQuery (and XPath) is a W3C standard.

  13. XQuery • XQuery is a functional language. • Any XQuery expression can be used in any place that an expression is expected. • SQL also allows subqueries in many places. However, SQL does, e.g., not allow any subquery to be any operand of any comparison in a WHERE clause. • This implies that every XQuery operator must be defined for operands that are sequences of items, not just for individual items.

  14. XQuery Clauses • for $x in expr • Defines node variable$x. • The expression expr evaluates to a sequence of items. • The variable $x is assigned to each item, in turn, and the body of the for clause is executed once for each assignment. • let $x := expr • Defines collection variable$x. • The expression expr evaluates to a sequence of items. • The variable is bound to the entire sequence of items. • Useful for common subexpressions and for aggregations.

  15. XQuery Clauses • where condition • The condition is a boolean expression. • The clause is applied to some item. • If and only if the condition evaluates to true, the following return clause is executed for that item. • return expression • The result of a FLWOR clause is a sequence of items. • Expression defines the result format for the current (qualifying) item. • The sequence of items produced by expression is appended to the sequence of items produced so far.

  16. Document Nodes • The context for a for or let clause is often provided by a document node. • Typically, the document comes from a file. • The doc function constructs a document node from a file with a given name. • Examplesdoc("bib.xml") doc(“infolab.stanford.edu/~hector/movies.xml”)

  17. Interpretation as XQuery Expression • XQuery expressions can be used wherever an XML expression of any kind is permitted. • Any text string is acceptable as content of a tag or value of an attribute. • If a string contains an XQuery expression that should be evaluated, this substring must be surrounded by curly brackets {}. • Examplefor $b indoc("bib.xml")/bibliography/bookreturn <result id = {$b/@bookID}>{$b/title}</result>

  18. XQuery Examples • Find all books. • for vs. let Returns: <result> <book>...</book></result> <result> <book>...</book></result> <result> <book>...</book></result> ... for$xindoc("bib.xml")/bibliography/book return <result> {$x} </result> Returns: <result> <book>...</book> <book>...</book> <book>...</book> ... </result> let$x:=doc("bib.xml")/bibliography/book return <result> {$x} </result>

  19. XQuery Examples • Find all titles of books published after 1995. for$xindoc("bib.xml")/bibliography/book where$x/year > 1995 return$x/title Result: <title> abc </title> <title> def </title> <title> ghi </title>

  20. Ordering the Query Result • The order-by clause allows you to order the results of an XQuery expression. order-by list of expressions • The sort order is based on the value of the first expression. Ties are broken based on the value of the second (if necessary third etc.) expression. • By default, the order is ascending. • A descending sort order can be specified using descending.

  21. Elimination of Duplicates • The built-in function distinct-values eliminates duplicates from a sequence of result items. • In principle, it applies only to primitive (atomic) types. • It can also be applied to elements, but then it will remove their tags, replacing them by quotes “”. • Example If return $b/title produces <title> aaa </title> <title> bbb </title> <title> aaa </title> then distinct-values (return $b/title) produces “aaa” “bbb”.

  22. XQuery Examples • Find all books published by Morgan Kaufman and list them in descending order of their prices. • Uses order-by withoption descending. for$bindoc("bib.xml")/bibliography/book[publisher=“Morgan Kaufmann”]) order-by $b/price descending return $b

  23. XQuery Examples • For each author of a book published by Morgan Kaufmann, list the author and the titles of all books she published. • Uses nested subquery andfunction distinct-values. for$aindistinct-values(doc("bib.xml")/bibliography/book[publisher=“Morgan Kaufmann”]/author) return <result> {$a} {for$tin /bib/book[author=$a]/title return$t} </result>

  24. XQuery Examples Result: <result> <author>Jones</author> <title> abc </title> <title> def </title> </result> <result> <author> Smith </author> <title> ghi </title> </result>

  25. Joins • We can join two or more documents, by using one variable for each of the documents . • We let a variable range over the elements of the corresponding document, within a for-clause. • Need to be careful when comparing elements for equality, since their equality is by element identity, not by element content. • Typically, we want to compare the element content. • The built-in function data(E) returns the content of an element E.

  26. Example • Find all pairs of titles of books from the same year. • Uses two variables ranging over booksand the data function applied to their year elements. let $books:=doc("bib.xml") for$b1indoc("bib.xml")/bibliography/book,$b2indoc("bib.xml")/bibliography/book wheredata($b1/year) = data($b2 /year) return <result>{$b1/title} {$b2/title} </result>

  27. Comparison Operators • XQuery supports the standard comparison operators such as <, >, =. • Comparison operators are applied to a sequence of items. • Comparisons have an existential nature. I.e., they return true if and only if at least one of the items satisfies the condition of the comparison. • for $b in doc("bib.xml")/bibliography/book/ where $b/author/firstname = “A” and $b/author/lastname = “B” return $b Books returned can have one author with firstname A and another author with lastname B.

  28. Comparison Operators • XQuery also supports special comparison operators that only compare sequences consisting of a single item: eq, ne, lt, gt, ge. • These comparisons fail if one of the operands contains more than one item. • XQuery also provides built-in functions for approximate string matching, in particular contains($p, "windsurfing").

  29. Quantification • XQuery supports the existential and the universal quantifier. • Universal quantifierevery $v in expression1 satisfies expression 2 • Existential quantifier some $v in expression1 satisfies expression 2 • Expression1 evaluates to a sequence of items, expression 2 is a boolean expression.

  30. Aggregation • XQuery provides built-in functions for the standard aggregations such as SUM, MIN, COUNT and AVG. • They can be applied to any XQuery expression, i.e. to any sequence of items. • Exampleavg(doc("bib.xml")/bibliography/book/price)count(doc("bib.xml")/bibliography/book/price)Computes the average book price and the number of books, resp.

  31. XQuery Examples • Find books whose price is larger than the average price. • Uses aggregate operator (avg), applied to the result of a path expression. let$a:=avg(doc("bib.xml")/bibliography/book/price) for$bindoc("bib.xml")/bibliography/book where$b/price > $a return$b

  32. XQuery Examples • Find title of books with a paragraph containing the terms “sailing” and “windsurfing”. • Uses existential quantifier (some) and string matching (contains). for$bindoc("bib.xml")//book where some$pin$b//parasatisfies contains($p, "sailing") and contains($p, "windsurfing") return$b/title

  33. XQuery Examples • Find the title of books where every paragraph contains the terms “sailing”. • Uses universal quantifier (every) and string matching (contains). for$bindoc("bib.xml")//book where every$pin$b//parasatisfies contains($p, "sailing") return$b/title

  34. Summary • XQuery is the standard XML query language. • It is a functional language, i.e. any XQuery expression can be used in any place where an expression is expected. • An XQuery expression consists of for, let, where, order and return clauses, of which some are optional. • The main new concept compared to SQL are path expressions that return sets of elements reachable via the given path. • Path expressions are defined in XPath, a sublanguage of XQuery. • In addition, XQuery has equivalent constructs for most of the main SQL constructs, in particular quantifiers and aggregate functions.

More Related