MonetDB/XQuery Technology Preview 1
Stefan Manegold
Centrum voor Wiskunde en Informatica, Amsterdam
http://monetdb.cwi.nl/ - http://pathfinder-xquery.org/
Stefan Manegold, MonetDB/XQuery, HollandOpen, Amsterdam, 31.5.2005
European Pathfinder Team
• CWI, Amsterdam (Netherlands)
  • Peter Boncz, Stefan Manegold, Sjoerd Mullender
• University of Twente (Netherlands)
  • Maurice van Keulen, Jan Flokstra
• University of Konstanz (Germany)
  • Torsten Grust, Jens Teubner, Jan Rittinger
Results: Performance (1)
XMark benchmark, 110 MB: MonetDB/XQuery vs. X-Hive & Galax
Results: Performance (2)
XMark benchmark, 1.1 GB: MonetDB/XQuery vs. X-Hive
Story
• XQuery Example
• Relational XQuery
• System Architecture
• XML Encoding
• Science & Research
• Scalability
• Outlook
• Conclusions
• Roadmaps
• Release & References
XQuery Example
• For each author, return the number of books and the receipts for books published in the past 2 years, ordered by name

    let $cat := fn:doc("www.bn.com/catalog.xml"),                              (: Documents :)
        $sales := fn:doc("www.publishersweekly.com/sales.xml")
    for $author in distinct-values($cat//author)                               (: Grouping :)
    let $books := $cat//book[@year >= 2003 and author = $author],              (: Selection :)
        $receipts := $sales/book[@isbn = $books/@isbn]/receipts                (: Join :)
    order by $author                                                           (: Ordering :)
    return <sales>                                                             (: XML Construction :)
             { $author }
             <count> { fn:count($books) } </count>                             (: Aggregation :)
             <total> { fn:sum($receipts) } </total>
           </sales>
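The example query combines grouping, selection, a join, ordering, and aggregation. To make those steps concrete, here is a minimal Python sketch of what the query computes. It is not how MonetDB/XQuery evaluates it. The two tiny XML documents, the function name `sales_per_author`, and the `<catalog>`/`<sales>` root elements are assumptions made up for illustration, and the result is returned as plain tuples rather than constructed `<sales>` elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for catalog.xml and sales.xml (invented toy data).
CATALOG = """<catalog>
  <book isbn="1" year="2004"><author>Ann</author></book>
  <book isbn="2" year="2001"><author>Ann</author></book>
  <book isbn="3" year="2003"><author>Bob</author></book>
  <book isbn="4" year="2005"><author>Ann</author></book>
</catalog>"""

SALES = """<sales>
  <book isbn="1"><receipts>10</receipts></book>
  <book isbn="2"><receipts>100</receipts></book>
  <book isbn="3"><receipts>5</receipts></book>
  <book isbn="4"><receipts>7</receipts></book>
</sales>"""


def sales_per_author(catalog_xml, sales_xml, min_year=2003):
    """Return (author, book count, total receipts) tuples, ordered by author."""
    cat = ET.fromstring(catalog_xml)
    sales = ET.fromstring(sales_xml)

    # Join side: index the receipts of every sales entry by ISBN.
    receipts_by_isbn = {}
    for book in sales.findall("book"):
        values = [float(r.text) for r in book.findall("receipts")]
        receipts_by_isbn.setdefault(book.get("isbn"), []).extend(values)

    # Grouping + ordering: distinct author names, sorted.
    authors = sorted({a.text for a in cat.iter("author")})

    result = []
    for author in authors:
        # Selection: recent books written by this author.
        books = [
            b for b in cat.iter("book")
            if int(b.get("year")) >= min_year
            and any(a.text == author for a in b.findall("author"))
        ]
        # Join + aggregation: sum the receipts of the selected books.
        isbns = {b.get("isbn") for b in books}
        total = sum(sum(receipts_by_isbn.get(i, [])) for i in isbns)
        result.append((author, len(books), total))
    return result


for row in sales_per_author(CATALOG, SALES):
    print(row)
```

The nested loop mirrors the query text literally. Turning such nested iteration into a real join is the kind of rewriting a query optimizer has to perform.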
XQuery Systems: 2 Approaches
• Existing “native” XML/XQuery systems are built from scratch
  • Galax, Saxon, …
  • X-Hive, Tamino, …
  • (Still have to) re-invent optimization technology
• Our approach:
  • Build XQuery system on top of an RDBMS
  • Leverage mature relational technology to achieve efficient XQuery processing
Architecture
XML in an RDBMS: XPath Accelerator
Node-based relational encoding of XQuery's data model

Example document, annotated with the pre-order rank before each start tag and the post-order rank after each end tag:

    0<a>
      1<b> 2<c/>0 </b>1
      3<d/>2
      4<e>
        5<f> 6<g/>3 7<h/>4 </f>5
        8<i> 9<j/>6 </i>7
      </e>8
    </a>9

• f/following: SELECT * FROM pre_post WHERE pre > f.pre AND post > f.post
• f/descendant: SELECT * FROM pre_post WHERE pre > f.pre AND post < f.post
• f/preceding: SELECT * FROM pre_post WHERE pre < f.pre AND post < f.post
• f/ancestor: SELECT * FROM pre_post WHERE pre < f.pre AND post > f.post
Similar queries for all 13 XPath axes
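The pre/post conditions on this slide can be tried out directly. The following is a minimal sketch that loads the ten nodes of the example document into an in-memory SQLite table and evaluates the four axis queries with `f` as the context node. SQLite stands in for the RDBMS here. The `tag` column, the self-join aliases `v` and `c`, and the helper names `build` and `axis` are assumptions added for illustration and are not taken from MonetDB/XQuery.

```python
import sqlite3

# (tag, pre, post) ranks of the example document on the slide.
NODES = [
    ("a", 0, 9), ("b", 1, 1), ("c", 2, 0), ("d", 3, 2), ("e", 4, 8),
    ("f", 5, 5), ("g", 6, 3), ("h", 7, 4), ("i", 8, 7), ("j", 9, 6),
]

# Axis predicates from the slide; v is a candidate node, c the context node.
AXES = {
    "following":  "v.pre > c.pre AND v.post > c.post",
    "descendant": "v.pre > c.pre AND v.post < c.post",
    "preceding":  "v.pre < c.pre AND v.post < c.post",
    "ancestor":   "v.pre < c.pre AND v.post > c.post",
}


def build():
    """Create an in-memory pre_post table holding the example document."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pre_post (tag TEXT, pre INTEGER PRIMARY KEY, post INTEGER)"
    )
    conn.executemany("INSERT INTO pre_post VALUES (?, ?, ?)", NODES)
    return conn


def axis(conn, context_tag, axis_name):
    """Return the tags on the given axis of the context node, in document order."""
    sql = (
        "SELECT v.tag FROM pre_post AS v, pre_post AS c "
        f"WHERE c.tag = ? AND {AXES[axis_name]} ORDER BY v.pre"
    )
    return [row[0] for row in conn.execute(sql, (context_tag,))]


conn = build()
for name in AXES:
    print(f"f/{name}:", axis(conn, "f", name))
# f/following:  ['i', 'j']
# f/descendant: ['g', 'h']
# f/preceding:  ['b', 'c', 'd']
# f/ancestor:   ['a', 'e']
```

Each axis is a plain range predicate over two integer columns, so the database can answer it with ordinary scans and indexes and needs no XML-specific operators.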
Science & Research
• More research led to more optimizations
  • Join Recognition
  • Embedded XPath processing
  • Order Awareness
• Various scientific publications
Results: Scalability (3)
Unsurpassed scalability
• Standard Opteron PC, 8 GB RAM, 64-bit Linux
• Can process 11 GB documents!
Mostly linear scaling with document size
Conclusions
• Relational approach
  • Works
  • Is fast
  • Is scalable
• Crucial Optimizations
  • Join recognition
  • Embedded XPath processing
  • Order awareness
• Research turned into open-source release
Roadmap
• 30.05.05: MonetDB/XQuery 4.8/0.8 “Mercurius”
  • Developers Release / Technology Preview 1
• 30.09.05: MonetDB/XQuery 4.10/0.10 “Venus”
  • Student Release / Technology Preview 2
  • XUpdate, More Optimization
• 30.12.05: MonetDB/XQuery 4.12/1.12 “Mars”
  • Final Release
  • Application Programming Interfaces
  • End-User Front-Ends
Open Source Release & References
• MonetDB + Pathfinder on SourceForge
  • Mozilla-like License
• MonetDB homepage
  • http://monetdb.cwi.nl/
• Pathfinder homepage
  • http://pathfinder-xquery.org/
• Developers website
  • http://sf.net/projects/monetdb/
You are welcome to join the MonetDB/XQuery team!
Results: Performance (4)
XMark performance in seconds: MonetDB/XQuery vs. Galax & X-Hive