300 likes | 528 Views
Introduction to XQuery. Resources: Official URL: www.w3.org/TR/ xquery Short intros: http://www.xml.com/pub/a/2002/10/16/xquery.html www.brics.dk/~amoeller/XML/querying Or see Ramakrishnan & Gehrke text. Lecture modified from slides by Dan Suciu. { row : { name : “John”, phone : 3634 },
E N D
Introduction to XQuery Resources: Official URL: www.w3.org/TR/xquery Short intros: http://www.xml.com/pub/a/2002/10/16/xquery.html www.brics.dk/~amoeller/XML/querying Or see Ramakrishnan & Gehrke text Lecture modified from slides by Dan Suciu
{ row: { name: “John”, phone: 3634 }, row: { name: “Sue”, phone: 6343 }, row: { name: “Dick”, phone: 6363 } } XML vs. Relational Data row row row phone phone phone name name name “Sue” “John” 3634 6343 “Dick” 6363 Relation … in XML
Relational to XML Data • A relation instance is basically a tree with: • Unbounded fanout at level 1 (i.e., any # of rows) • Fixed fanout at level 2 (i.e., fixed # fields) • XML data is essentially an arbitrary tree • Unbounded fanout at all nodes/levels • Any number of levels • Variable # of children at different nodes, variable path lengths
Query Language for XML • Must be high-level; “SQL for XML” • Must conform to XSchema • But also work in absence of schema info • Support simple and complex/nested datatypes • Support universal and existential quantifiers, aggregation • Operations on sequences and hierarchies of doc structures • Capability to transform and create XML structures
XQuery • Influenced by XML-QL, Lorel, Quilt, YATL • Also, XPath and XML Schema • Reads a sequence of XML fragments or atomic values and returns a sequence of XML fragments or atomic values • Inputs/outputs are objects defined by XML-Query data model, rather than strings in XML syntax
Overview of XQuery • Path expressions • Element constructors • FLWOR (“flower”) expressions • Several other kinds of expressions as well, including conditional expressions, list expressions, quantified expressions, etc. • Expressions evaluated w.r.t. a context: • Context item (current node) • Context position (in sequence being processed) • Context size (of the sequence being processed) • Context also includes namespaces, variables, functions, date, etc.
Path Expressions Examples: • Bib/paper • Bib/book/publisher • Bib/paper/author/lastname Given an XML document, the value of a path expression p is a set of objects
Path Expression Examples Bib &o1 paper paper book Doc = references &o12 &o24 &o29 references references author page author year author title http title title publisher author author author &o43 &25 &o44 &o45 &o46 &o52 &96 1997 &o51 &o50 &o49 &o47 &o48 first last firstname lastname lastname firstname &o70 &o71 &243 &206 “Serge” “Abiteboul” “Victor” 122 133 “Vianu” Bib/paper = <&o12,&o29> Bib/book/publisher = <&o51> Bib/paper/author/lastname = <&o71,&206> Note that order of elements matters!
Element Construction • An XQuery expression can construct new values or structures • Example: Consider the path expressions from the previous slide. • Each of them returns a newly constructed sequence of elements • Key point is that we don’t just return existing structures or atomic values; we can re-arrange them as we wish into new structures
FLWOR Expressions • FOR-LET-WHERE-ORDERBY-RETURN = FLWOR FOR / LET Clauses List of tuples WHERE Clause List of tuples ORDERBY/RETURN Clause Instance of XQuery data model
FOR vs. LET • FOR$xIN list-expr • Binds $x in turn to each value in the list expr • LET$x = list-expr • Binds $x to the entire list expr • Useful for common sub-expressions and for aggregations
FOR vs. LET: Example Returns: <result> <book>...</book></result> <result> <book>...</book></result> <result> <book>...</book></result> ... FOR$xINdocument("bib.xml")/bib/book RETURN <result> $x </result> Notice that result has several elements Returns: <result> <book>...</book> <book>...</book> <book>...</book> ... </result> LET$xINdocument("bib.xml")/bib/book RETURN <result> $x </result> Notice that result has exactly one element
XQuery Example 1 Find all book titles published after 1995: FOR$xINdocument("bib.xml")/bib/book WHERE$x/year > 1995 RETURN$x/title Result: <title> abc </title> <title> def </title> <title> ghi </title>
XQuery Example 2 For each author of a book by Morgan Kaufmann, list all books she published: FOR$aINdistinct(document("bib.xml")/bib/book[publisher=“Morgan Kaufmann”]/author) RETURN <result> $a, FOR$tIN /bib/book[author=$a]/title RETURN$t </result> distinct = a function that eliminates duplicates (after converting inputs to atomic values)
Results for Example 2 <result> <author>Jones</author> <title> abc </title> <title> def </title> </result> <result> <author> Smith </author> <title> ghi </title> </result> Observe how nested structure of result elements is determined by the nested structure of the query.
XQuery Example 3 <big_publishers> FOR$pINdistinct(document("bib.xml")//publisher) LET$b := document("bib.xml")/book[publisher = $p] WHEREcount($b) > 100 RETURN$p </big_publishers> For each publisher p • Let the list of books • published by p be b Count the # books in b, and return p if b > 100 count = (aggregate) function that returns the number of elements
XQuery Example 4 Find books whose price is larger than average: LET$a=avg(document("bib.xml")/bib/book/price) FOR$b in document("bib.xml")/bib/book WHERE$b/price > $a RETURN$b
Collections in XQuery • Ordered and unordered collections • /bib/book/author = an ordered collection • Distinct(/bib/book/author) = an unordered collection • Examples: • LET$a = /bib/book $a is a collection; stmt iterates over all books in collecion • $b/author also a collection (several authors...) Returns a single collection! <result> <author>...</author> <author>...</author> <author>...</author> ... </result> However: RETURN <result> $b/author </result>
Collections in XQuery What about collections in expressions ? • $b/price list of n prices • $b/price * 0.7 list of n numbers?? • $b/price * $b/quantity list of n x m numbers ?? • Valid only if the two sequences have at most one element • Atomization • $book1/author eq "Kennedy" - Value Comparison • $book1/author = "Kennedy" - General Comparison
Sorting in XQuery <publisher_list> FOR$pINdistinct(document("bib.xml")//publisher) ORDERBY $p RETURN <publisher> <name> $p/text() </name> , FOR$bIN document("bib.xml")//book[publisher = $p] ORDERBY$b/priceDESCENDING RETURN <book> $b/title , $b/price </book> </publisher> </publisher_list>
Conditional Expressions: If-Then-Else FOR$h IN //holding ORDERBY $h/title RETURN <holding> $h/title, IF$h/@type = "Journal" THEN$h/editor ELSE$h/author </holding>
Existential Quantifiers FOR$b IN //book WHERESOME$p IN $b//paraSATISFIES contains($p, "sailing") AND contains($p, "windsurfing") RETURN$b/title
Universal Quantifiers FOR$b IN //book WHEREEVERY$p IN $b//paraSATISFIES contains($p, "sailing") RETURN$b/title
Other Stuff in XQuery • Before and After • for dealing with order in the input • Filter • deletes some edges in the result tree • Recursive functions • Namespaces • References, links … • Lots more stuff …
XML Schema • Includes primitive data types (integers, strings, dates, etc.) • Supports value-based constraints (integers > 100) • User-definable structured types • Inheritance (extension or restriction) • Foreign keys • Element-type reference constraints
Sample XML Schema <schema version=“1.0” xmlns=“http://www.w3.org/1999/XMLSchema”> <element name=“author” type=“string” /> <element name=“date” type = “date” /> <element name=“abstract”> <type> … </type> </element> <element name=“paper”> <type> <attribute name=“keywords” type=“string”/> <element ref=“author” minOccurs=“0” maxOccurs=“*” /> <element ref=“date” /> <element ref=“abstract” minOccurs=“0” maxOccurs=“1” /> <element ref=“body” /> </type> </element> </schema>
XML-Query Data Model • Describes XML data as a tree • Node ::= DocNode | ElemNode |ValueNode | AttrNode | NSNode | PINode | CommentNode | InfoItemNode | RefNode http://www.w3.org/TR/query-datamodel/2/2001
XML-Query Data Model Element node (simplified definition): • elemNode : (QNameValue, {AttrNode }, [ ElemNode | ValueNode])ElemNode • QNameValue = means “a tag name” Reads: “Give me a tag, a set of attributes, a list of elements/values, and I will return an element”
XML Query Data Model Example: book1= elemNode(book, {price2, currency3}, [title4, author5, author6, author7, year8]) price2 = attrNode(…) /* next */currency3 = attrNode(…)title4 = elemNode(title, string9)… <bookprice = “55” currency = “USD”> <title> Foundations … </title> <author> Abiteboul </author> <author> Hull </author> <author> Vianu </author> <year> 1995 </year> </book>