XQuery

XML vs. Relational Data
• A relation written as nested data:
  { row: { name: “John”, phone: 3634 },
    row: { name: “Sue”, phone: 6343 },
    row: { name: “Dick”, phone: 6363 } }
• [Figure: the same relation drawn as a tree with three row nodes, each having a name child and a phone child]
• Relation … in XML

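To make the relation-as-XML idea concrete, here is a minimal sketch in XQuery 1.0 syntax. The wrapper element name `table` and the choice of query are assumptions made for illustration; only the three rows come from the slide.

```xquery
(: Hypothetical inline document: the three-row relation written as XML.
   The <table> wrapper element is an assumption, not part of the slide. :)
let $table :=
  <table>
    <row><name>John</name><phone>3634</phone></row>
    <row><name>Sue</name><phone>6343</phone></row>
    <row><name>Dick</name><phone>6363</phone></row>
  </table>
(: Roughly the analogue of: SELECT phone FROM table WHERE name = 'Sue' :)
for $r in $table/row
where $r/name = "Sue"
return $r/phone
```

Each row becomes an element and each column a child element, so a selection on a column turns into a condition on a child path.
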
Query Language for XML
• Must be high-level: “SQL for XML”
• Must conform to XML Schema
• But must also work in the absence of schema information
• Must support simple and complex/nested datatypes
• Must support universal and existential quantifiers, and aggregation
• Must support operations on sequences and hierarchies of document structures
• Must be able to transform and create XML structures

“A query language that uses the structure of XML intelligently and can express queries across all kinds of data, whether physically stored in XML or viewed as XML via middleware. This specification describes a query language called XQuery, which is designed to be broadly applicable across many types of XML data sources.”

XQuery
• is an emerging standard for querying XML documents
• is strongly influenced by OQL
• is a functional language in which a query is represented as an expression (as opposed to OQL and SQL, which are declarative)
• expressions can be nested
• filters can strip out fields
• has grouping

Uses of XQuery
• Extracting information from a database
• Generating summary reports on stored data
• Searching textual documents on the web
• Selecting and transforming XML data to XHTML
• Pulling data from databases for application integration
• Splitting up an XML document

XQuery Design Goals
• Useful for both structured and unstructured data
• Protocol independent (evaluation with predictable results)
• Able to accept a collection of multiple documents
• Compatible with other W3C standards

Overview of XQuery
• Path expressions
• Element constructors
• FLWOR (“flower”) expressions
• Several other kinds of expressions as well, including conditional expressions, list expressions, quantified expressions, etc.
• Expressions are evaluated w.r.t. a context:
  • Context item (current node)
  • Context position (in the sequence being processed)
  • Context size (of the sequence being processed)
  • The context also includes namespaces, variables, functions, date, etc.

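The three parts of the context listed above can be seen inside a predicate. The following is a small sketch in XQuery 1.0 syntax over a hypothetical inline document (the titles are placeholders, not data from the slides).

```xquery
(: Hypothetical three-book document, used only for illustration. :)
let $bib :=
  <bib>
    <book><title>abc</title></book>
    <book><title>def</title></book>
    <book><title>ghi</title></book>
  </bib>
return (
  (: context position: [1] is shorthand for [position() = 1] :)
  $bib/book[1]/title,
  (: context size: last() is the length of the sequence being filtered :)
  $bib/book[position() = last()]/title,
  (: context item: the relative path "title" is evaluated against each book :)
  $bib/book[title = "def"]
)
```

Inside each pair of square brackets the processor sets the context item to one book at a time, which is why `title` needs no variable in front of it.
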
Path Expressions
• Examples:
  • Bib/paper
  • Bib/book/publisher
  • Bib/paper/author/lastname
• Given an XML document, the value of a path expression p is a sequence of objects (nodes), in document order

Path Expression Examples
• [Figure: the document Doc drawn as a tree rooted at Bib (&o1), with children paper (&o12), book (&o24) and paper (&o29); the nodes below carry labels such as references, author, title, year, publisher, page, http, firstname and lastname, with leaf values including “Serge”, “Abiteboul”, “Victor”, “Vianu”, 1997, 122 and 133]
• Bib/paper = <&o12,&o29>
• Bib/book/publisher = <&o51>
• Bib/paper/author/lastname = <&o71,&206>
• Note that the order of elements matters!

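The three paths can be tried on a small stand-in for the figure. This is only a sketch in XQuery 1.0 syntax: the document below is hypothetical and much smaller than the tree on the slide, and the book title and publisher name are placeholders.

```xquery
(: Hypothetical stand-in for the Bib tree; a document node is built
   so that the paths can start at the Bib root element. :)
let $doc := document {
  <Bib>
    <paper>
      <author><firstname>Serge</firstname><lastname>Abiteboul</lastname></author>
      <year>1997</year>
    </paper>
    <book>
      <title>Placeholder Title</title>
      <publisher>Placeholder Press</publisher>
    </book>
    <paper>
      <author><firstname>Victor</firstname><lastname>Vianu</lastname></author>
    </paper>
  </Bib>
}
return (
  $doc/Bib/paper,                  (: both paper elements, in document order :)
  $doc/Bib/book/publisher,         (: the single publisher element :)
  $doc/Bib/paper/author/lastname   (: one lastname per paper author, in document order :)
)
```

Each step selects children of every node produced by the previous step, and the final result is delivered in document order.
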
Element Construction
• An XQuery expression can construct new values or structures
• Example: consider the path expressions from the previous slide
  • Each of them returns a newly constructed sequence of elements
• Key point: we don’t just return existing structures or atomic values; we can re-arrange them as we wish into new structures

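A short sketch of an element constructor may help here. It is written in XQuery 1.0 syntax; the element names `entry` and `by`, and the sample books, are assumptions chosen for illustration.

```xquery
(: Hypothetical input; the output element names are invented for this sketch. :)
let $bib :=
  <bib>
    <book><title>abc</title><author>Jones</author><year>1996</year></book>
    <book><title>ghi</title><author>Smith</author><year>1999</year></book>
  </bib>
for $b in $bib/book
return
  (: Literal tags are copied to the output; expressions in { } are evaluated. :)
  <entry year="{ $b/year }">
    <by>{ string($b/author) }</by>
    { $b/title }
  </entry>
```

The year moves from a child element into an attribute, the author's text is re-wrapped in a new element, and the title element is copied as is, which is the kind of re-arrangement the slide describes.
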
FLWOR Expressions
• FOR-LET-WHERE-ORDERBY-RETURN = FLWOR
• FOR / LET clauses → list of tuples
• WHERE clause → list of tuples
• ORDER BY / RETURN clause → instance of the XQuery data model

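The pipeline above can be sketched with one query that uses all five clauses. This is an illustrative example in XQuery 1.0 syntax; the inline data, the price threshold and the `item` element are assumptions.

```xquery
(: Hypothetical data for illustration only. :)
let $bib :=
  <bib>
    <book><title>abc</title><price>30</price></book>
    <book><title>def</title><price>15</price></book>
    <book><title>ghi</title><price>25.50</price></book>
  </bib>
for $b in $bib/book               (: for: one tuple per book :)
let $p := xs:decimal($b/price)    (: let: adds a binding to each tuple :)
where $p > 20                     (: where: drops tuples that fail the test :)
order by $p descending            (: order by: reorders the surviving tuples :)
return <item price="{ $p }">{ string($b/title) }</item>
```

With this data the tuples for abc and ghi survive the where clause, and abc is returned before ghi because of the descending order.
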
• The for clause uses XPath expressions, and the variable in the for clause ranges over the values in the set returned by XPath
• Simple FLWOR expression in XQuery
  • Find all accounts with balance > 400, with each result enclosed in an <account_number> .. </account_number> tag:
    for $x in /bank-2/account
    let $acctno := $x/@account_number
    where $x/balance > 400
    return <account_number> { $acctno } </account_number>
• Items in the return clause are XML text unless enclosed in {}, in which case they are evaluated
• The let clause is not really needed in this query, and the selection can be done in XPath. The query can be written as:
    for $x in /bank-2/account[balance>400]
    return <account_number> { $x/@account_number } </account_number>

FOR vs. LET
• for $x in list-expr
  • Binds $x in turn to each value in the list expr
• let $x := list-expr
  • Binds $x to the entire list expr
  • Useful for common sub-expressions and for aggregations

FOR vs. LET: Example
    for $x in doc("bib.xml")/bib/book
    return <result>{ $x }</result>
• Returns: <result> <book>...</book></result> <result> <book>...</book></result> <result> <book>...</book></result> ...
• Notice that the result has several elements
    let $x := doc("bib.xml")/bib/book
    return <result>{ $x }</result>
• Returns: <result> <book>...</book> <book>...</book> <book>...</book> ... </result>
• Notice that the result has exactly one element

XQuery Example 1
• Find the titles of all books published after 1995:
    for $x in doc("bib.xml")/bib/book
    where $x/year > 1995
    return $x/title
• Result: <title> abc </title> <title> def </title> <title> ghi </title>

XQuery Example 2
• For each author of a book published by Morgan Kaufmann, list all the books she has published:
    for $a in distinct-values(doc("bib.xml")/bib/book[publisher = "Morgan Kaufmann"]/author)
    return
      <result>
        <author>{ $a }</author>
        {
          for $t in doc("bib.xml")/bib/book[author = $a]/title
          return $t
        }
      </result>
• distinct-values = a function that eliminates duplicates (after converting its inputs to atomic values)

Results for Example 2
    <result>
      <author>Jones</author>
      <title> abc </title>
      <title> def </title>
    </result>
    <result>
      <author> Smith </author>
      <title> ghi </title>
    </result>
• Observe how the nested structure of the result elements is determined by the nested structure of the query.

XQuery Example 3
    <big_publishers>
    {
      for $p in distinct-values(doc("bib.xml")//publisher)
      let $b := doc("bib.xml")/bib/book[publisher = $p]
      where count($b) > 100
      return <publisher>{ $p }</publisher>
    }
    </big_publishers>
• For each publisher p:
  • Let b be the list of books published by p
  • Count the number of books in b, and return p if the count exceeds 100
• count = (aggregate) function that returns the number of elements

XQuery Example 4
• Find books whose price is larger than average:
    let $a := avg(doc("bib.xml")/bib/book/price)
    for $b in doc("bib.xml")/bib/book
    where $b/price > $a
    return $b

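The average-price query can be run without a bib.xml file by binding a small document to a variable. This is a sketch in XQuery 1.0 syntax; the titles and prices are made up for illustration, and the query returns titles rather than whole books to keep the output short.

```xquery
(: Hypothetical prices: 10, 20 and 60 give an average of 30. :)
let $bib :=
  <bib>
    <book><title>abc</title><price>10</price></book>
    <book><title>def</title><price>20</price></book>
    <book><title>ghi</title><price>60</price></book>
  </bib>
let $a := avg($bib/book/price)   (: let binds the whole price sequence, so avg sees all of it :)
for $b in $bib/book
where $b/price > $a
return $b/title
```

Only the book priced at 60 lies above the average, so the query returns the title ghi. The aggregate has to be computed under let, since a for variable would see one price at a time.
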
Collections in XQuery
• Ordered and unordered collections
  • /bib/book/author = an ordered collection
  • distinct-values(/bib/book/author) = an unordered collection
• Examples:
  • let $a := /bib/book binds $a to a collection; a for clause over $a iterates over all books in the collection
  • $b/author is also a collection (several authors...)
• However, return <result>{ $b/author }</result> returns a single collection:
    <result> <author>...</author> <author>...</author> <author>...</author> ... </result>

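Collections in XQuery
• What about collections in expressions?
  • $b/price: a list of n prices
  • $b/price * 0.7: a list of n numbers??
  • $b/price * $b/quantity: a list of n x m numbers??
  • No: arithmetic is valid only if each of the two operand sequences has at most one item
• Atomization: nodes are converted to their atomic values before comparison or arithmetic
  • $book1/author eq "Kennedy" is a value comparison (each side must be at most a single item)
  • $book1/author = "Kennedy" is a general comparison (true if any author equals "Kennedy")
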
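The difference between the two comparisons, and the single-item rule for arithmetic, can be checked with a small sketch. It uses XQuery 1.0 syntax and a hypothetical book with two authors; the second author, the price and the quantity are invented for illustration.

```xquery
(: Hypothetical book with two authors, one price and one quantity. :)
let $book :=
  <book>
    <author>Kennedy</author>
    <author>Smith</author>
    <price>10</price>
    <quantity>3</quantity>
  </book>
return (
  (: general comparison: true because at least one author matches :)
  $book/author = "Kennedy",
  (: value comparison: fine here because [1] leaves a single item;
     $book/author eq "Kennedy" would raise a type error (two authors) :)
  $book/author[1] eq "Kennedy",
  (: arithmetic: both operands atomize to a single value, so this is valid :)
  $book/price * $book/quantity
)
```

The first two items are both true, and the product is computed from the atomized values 10 and 3.
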
Sorting in XQuery
    <publisher_list>
    {
      for $p in distinct-values(doc("bib.xml")//publisher)
      order by $p
      return
        <publisher>
          <name>{ $p }</name>
          {
            for $b in doc("bib.xml")//book[publisher = $p]
            order by number($b/price) descending
            return <book>{ $b/title, $b/price }</book>
          }
        </publisher>
    }
    </publisher_list>
• number() makes the prices sort numerically; untyped values would otherwise be ordered as strings

Conditional Expressions: If-Then-Else
    for $h in //holding
    order by $h/title
    return
      <holding>
        {
          $h/title,
          if ($h/@type = "Journal") then $h/editor else $h/author
        }
      </holding>

Existential Quantifiers
    for $b in //book
    where some $p in $b//para satisfies
          (contains($p, "sailing") and contains($p, "windsurfing"))
    return $b/title

Universal Quantifiers
    for $b in //book
    where every $p in $b//para satisfies contains($p, "sailing")
    return $b/title

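To see the two quantifiers side by side, here is a self-contained sketch in XQuery 1.0 syntax. The two books and their paragraphs are hypothetical and chosen so that each query picks a different book.

```xquery
(: Hypothetical books: A mentions both words in one paragraph,
   B mentions "sailing" in every paragraph. :)
let $books :=
  <books>
    <book><title>A</title><para>sailing and windsurfing</para><para>cooking</para></book>
    <book><title>B</title><para>sailing</para><para>more sailing</para></book>
  </books>
return (
  (: some: at least one paragraph satisfies the condition :)
  for $b in $books/book
  where some $p in $b//para satisfies
        (contains($p, "sailing") and contains($p, "windsurfing"))
  return string($b/title),
  (: every: all paragraphs satisfy the condition :)
  for $b in $books/book
  where every $p in $b//para satisfies contains($p, "sailing")
  return string($b/title)
)
```

The first query returns A and the second returns B. Note that every is also true for a book with no para elements at all, so a universal query may need an extra existence test.
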
Other Stuff in XQuery
• Before and After
  • for dealing with order in the input
• Filter
  • deletes some edges in the result tree
• Recursive functions
• Namespaces
• References, links …
• Lots more stuff …

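As one illustration of the recursive functions mentioned above, the following sketch computes the depth of an element tree. It is written in XQuery 1.0 syntax; the function name `local:depth` and the sample tree are assumptions made for this example.

```xquery
(: Hypothetical helper: depth of an element tree, counting the root as 1. :)
declare function local:depth($n as element()) as xs:integer {
  if (empty($n/*))
  then 1                                                  (: a leaf element :)
  else 1 + max(for $c in $n/* return local:depth($c))     (: recurse into the children :)
};

local:depth(<a><b><c/></b><d/></a>)
```

For the sample tree the deepest path is a, b, c, so the call evaluates to 3. User-defined functions live in the prolog and are called like built-in functions.
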
XQuery Tools
• XQuery Editor
• XQuery Mapper
• XQuery Debugger
• XQuery Profiler
• XQuery Documentation Generator
• XML Schema Aware Query processing
• Invoking XQuery from web services