
OrientX: A Native XML Database System

Presentation Transcript


  1. OrientX: A Native XML Database System XML Group

  2. Outline • Preliminaries • Architecture and Features • Storage management • Achievement • Conclusion and Future Work

  3. Outline • Preliminaries • Architecture and Features • Storage management • Achievement • Conclusion and Future Work

  4. XML • XML document and document tree <bib> <vendor> <name>LongMark</name> <book isbn="isbn1001"> <title>C++</title> <author> <fname>Rose</fname> <lname>Smith</lname> </author> <price>50</price> </book> <book isbn="isbn1002"> <title>XML</title> <author> <fname>Steven</fname> <lname>Tom</lname> </author> <price>80</price> </book> </vendor> </bib> [Figure 1: XML document and document tree; legend: element node, text node]

  5. XPath&XQuery • XPath: XPath is a language for addressing parts of an XML document. • Example expressions: bib; /bib/vendor/book; //book; bib//book; //@lang; /bib/vendor/book[last()]; //book[price>50]; //book/title | //book/price [Figure: document tree of bib.xml; legend: element node, text node]
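The path expressions on this slide can be tried against the sample document from slide 4. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`, not OrientX itself. ElementTree implements only a subset of XPath, so the numeric predicate `//book[price>50]` and the union `|` are emulated in plain Python here; that substitution is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

# Sample document from the slides (closing tag written as </bib>).
BIB_XML = """
<bib>
  <vendor>
    <name>LongMark</name>
    <book isbn="isbn1001">
      <title>C++</title>
      <author><fname>Rose</fname><lname>Smith</lname></author>
      <price>50</price>
    </book>
    <book isbn="isbn1002">
      <title>XML</title>
      <author><fname>Steven</fname><lname>Tom</lname></author>
      <price>80</price>
    </book>
  </vendor>
</bib>
"""

root = ET.fromstring(BIB_XML)  # root is the <bib> element

# /bib/vendor/book : paths are written relative to the <bib> root
books = root.findall("vendor/book")

# //book : all book descendants
all_books = root.findall(".//book")

# /bib/vendor/book[last()] : positional predicate, supported by ElementTree
last_book = root.find("vendor/book[last()]")

# //book[price>50] : ElementTree has no numeric comparison predicates,
# so the filter is applied in Python instead
expensive = [b for b in all_books if float(b.findtext("price")) > 50]

# //book/title | //book/price : union emulated by walking each book's
# children in document order and keeping the two tags of interest
titles_and_prices = [e for b in all_books for e in b if e.tag in ("title", "price")]

print([b.get("isbn") for b in books])
print(last_book.findtext("title"))
print([b.findtext("title") for b in expensive])
print([e.text for e in titles_and_prices])
```

A full XPath engine would evaluate all of these expressions directly; the Python-side filtering only stands in for the parts ElementTree does not cover.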

  6. XPath&XQuery • XQuery FLWOR: "For, Let, Where, Order by, Return" for $x in doc("bib.xml")/bib/vendor/book where $x/price>30 order by $x/title return $x/title • bib.xml: <bib> <vendor> <name>LongMark</name> <book isbn="isbn1001"> <title>C++</title> <author> <fname>Rose</fname> <lname>Smith</lname> </author> <price>50</price> </book> <book isbn="isbn1002"> <title>XML</title> <author> <fname>Steven</fname> <lname>Tom</lname> </author> <price>80</price> </book> </vendor> </bib>
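To make the clause-by-clause meaning of the FLWOR query concrete, here is a rough Python equivalent over the same bib.xml content. It is only an illustration of the semantics, not how OrientX evaluates XQuery; the function name `titles_over` is hypothetical, and the sort uses Python's default string ordering rather than an XQuery collation.

```python
import xml.etree.ElementTree as ET

BIB_XML = (
    "<bib><vendor><name>LongMark</name>"
    '<book isbn="isbn1001"><title>C++</title>'
    "<author><fname>Rose</fname><lname>Smith</lname></author>"
    "<price>50</price></book>"
    '<book isbn="isbn1002"><title>XML</title>'
    "<author><fname>Steven</fname><lname>Tom</lname></author>"
    "<price>80</price></book>"
    "</vendor></bib>"
)


def titles_over(xml_text, min_price):
    """Hypothetical helper mirroring the FLWOR query clause by clause."""
    root = ET.fromstring(xml_text)
    # for $x in doc("bib.xml")/bib/vendor/book
    books = root.findall("vendor/book")
    # where $x/price > min_price
    selected = [b for b in books if float(b.findtext("price")) > min_price]
    # order by $x/title
    selected.sort(key=lambda b: b.findtext("title"))
    # return $x/title
    return [b.find("title") for b in selected]


result = [t.text for t in titles_over(BIB_XML, 30)]
print(result)  # ['C++', 'XML']
```

Both books cost more than 30, so both titles are returned, ordered by title.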

  7. XQuery/Update • Insert • Delete • Replace • Replacing a Node • Replacing the Value of a Node • Rename • Transform
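As a rough picture of what each update operation does to a document tree, the sketch below applies analogous in-memory edits with Python's `xml.etree.ElementTree`. This is not XQuery/Update syntax and not OrientX's implementation; the `year` element and the replacement values are made up for illustration.

```python
import copy
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<vendor><name>LongMark</name>"
    '<book isbn="isbn1001"><title>C++</title><price>50</price></book>'
    "</vendor>"
)
book = doc.find("book")

# Insert: add a new child node (the <year> element is hypothetical)
year = ET.SubElement(book, "year")
year.text = "2007"

# Replace the value of a node: keep the element, change its content
book.find("price").text = "45"

# Replace a node: swap the whole <title> element for a new one in place
old_title = book.find("title")
new_title = ET.Element("title")
new_title.text = "Modern C++"
position = list(book).index(old_title)
book.remove(old_title)
book.insert(position, new_title)

# Rename: change the node's name, keep its content
book.find("year").tag = "published"

# Delete: remove a node from its parent
doc.remove(doc.find("name"))

# Transform: modify a copy and return it, leaving the original untouched
transformed = copy.deepcopy(doc)
transformed.find("book").set("isbn", "isbn9999")

print(ET.tostring(doc, encoding="unicode"))
print(ET.tostring(transformed, encoding="unicode"))
```

The first five operations change the stored document, whereas transform only produces a modified copy.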

  8. Outline • Introduction of XML • Architecture and Features • Storage management • Achievement • Conclusion and Future Work

  9. Introduction of OrientX • OrientX means: Original RUC IDKE Native XML Database • RUC: Renmin University of China • IDKE: Institute of Data and Knowledge Engineering • Native XML database: exposes a logical model for storing and retrieving XML documents. (A non-native XML database is, for example, one built on a relational database.)

  10. System Architecture [Figure: OrientX3.0 system architecture]

  11. Features • Full support for XML Schema • Support for XQuery1.0, XPath2.0, and XQuery/Update (except transform) • A set of programming APIs • Various native storage techniques • Multiple query processing strategies based on native storage • Friendly UI (Java-based)

  12. History • OrientX1.0 (2002-2003) • OrientXStore, schema manager, document importing and exporting. • OrientX1.5 (2003-2004) • XPath execution, XML numbering, and index manager. • OrientX2.0 (2004-2005) • XQuery execution engine based on navigation. • OrientX2.5 (2005-2006) • XQuery execution engine based on XML algebra. • OrientX3.0 (2006-2007) • XQuery/Update: insert, delete, etc.

  13. Outline • Preliminaries • Architecture and Features • Storage management • Achievement • Conclusion and Future Work

  14. Different storage granularities • Document: • The document is not decomposed; an index is built over it to navigate the structure. • Query complexity and efficiency are limited by the power of the index. • Sub-tree: • The document is decomposed into sub-trees according to the storage space partition. • The structure is persisted within each sub-tree. • Saves space. • Node: • The document is decomposed into a sequence of nodes, each node corresponding to a type (element, attribute, …). • May need too many links to persist the relationships between nodes.

  15. Storage Techniques in OrientX • DEB: one element is a record, stored in depth-first order of the tree. • CEB: one element is a record, but all elements with the same tag name are stored clustered together. • DSB: like DEB, the storage order is depth-first, but each record is a sub-tree whose size is close to the physical page size. • CSB: similar to DSB, each record is a sub-tree, but all sub-trees with the same structure are stored clustered together. • On the original slide, implemented techniques are marked in red.

  16. Example: Element based • DEB • CEB [Figure: a source document tree with nodes r, t1, a1, a2, l1, f1, l2, f2, shown next to its DEB and CEB record layouts]
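The two element-based layouts can be sketched in a few lines. The tree shape below (root r with children t1, a1, a2, where a1 holds l1, f1 and a2 holds l2, f2) is an assumption reconstructed from the node labels on the slide, and the functions `deb_layout` and `ceb_layout` are illustrative names, not OrientX APIs.

```python
from collections import defaultdict

# Each node is (tag, label, children). The shape is assumed, not taken
# from the original figure.
TREE = ("r", "r", [
    ("t", "t1", []),
    ("a", "a1", [("l", "l1", []), ("f", "f1", [])]),
    ("a", "a2", [("l", "l2", []), ("f", "f2", [])]),
])


def depth_first(node):
    """Yield (tag, label) for every element in depth-first document order."""
    tag, label, children = node
    yield tag, label
    for child in children:
        yield from depth_first(child)


def deb_layout(tree):
    """DEB-style: one record per element, stored in depth-first order."""
    return [label for _tag, label in depth_first(tree)]


def ceb_layout(tree):
    """CEB-style: one record per element, clustered by tag name."""
    clusters = defaultdict(list)
    for tag, label in depth_first(tree):
        clusters[tag].append(label)
    return dict(clusters)


print(deb_layout(TREE))  # ['r', 't1', 'a1', 'l1', 'f1', 'a2', 'l2', 'f2']
print(ceb_layout(TREE))
```

In the depth-first layout, the records of one sub-tree sit next to each other, which suits navigation. In the clustered layout, all elements with one tag name are adjacent, which suits scans such as `//a`.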

  17. Example: Subtree based • DSB (depth-first sub-tree based) • CSB (clustered sub-tree based) [Figure: the document tree with nodes r, t1, a1, a2, l1, f1, l2, f2 split into sub-tree records; a proxy node (virtual node) stands in for a sub-tree stored in another record; CSB also has proxy nodes]
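One way to picture sub-tree records and proxy nodes is a greedy depth-first partition. This is a sketch under several assumptions: the tree shape is reconstructed from the slide's node labels, the page size is modelled as a node count (`capacity`), and the splitting rule is invented for illustration rather than taken from OrientX.

```python
# Each node is (label, children). The shape is assumed, not taken from
# the original figure.
TREE = ("r", [
    ("t1", []),
    ("a1", [("l1", []), ("f1", [])]),
    ("a2", [("l2", []), ("f2", [])]),
])


def subtree_size(node):
    """Number of nodes in the sub-tree rooted at node."""
    _label, children = node
    return 1 + sum(subtree_size(child) for child in children)


def dsb_records(tree, capacity):
    """Split a tree into sub-tree records of at most `capacity` nodes.

    A child whose whole sub-tree does not fit in the current record is
    replaced there by a proxy entry and starts a record of its own.
    Records are created in depth-first order.
    """
    records = []

    def new_record(root):
        record = {"nodes": [], "proxies": []}
        records.append(record)
        place(root, record)

    def place(node, record):
        label, children = node
        record["nodes"].append(label)
        for child in children:
            free = capacity - len(record["nodes"])
            if subtree_size(child) <= free:
                place(child, record)  # whole sub-tree fits inline
            else:
                record["proxies"].append(child[0])  # proxy points to child
                new_record(child)

    new_record(tree)
    return records


for record in dsb_records(TREE, capacity=4):
    print(record)
```

With a capacity of 4 nodes, the root record keeps r and t1 plus proxies for a1 and a2, and each of those sub-trees becomes its own record. A CSB-style layout would additionally group records that share the same structure, as the a1 and a2 records do here.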

  18. OrientX3.0 Demo

  19. Outline • Preliminaries • Architecture and Features • Storage management • Achievement • Conclusion and Future Work

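  20. Listed by W3C • Developed OrientX, the first XML database prototype system in China, and released it on the W3C website in May 2004, where it was recognized by international peers.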

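  21. High praise from peers • In November 2006, the chair of the "Dagstuhl Seminar on XQuery Implementation Paradigms" in Germany specifically noted in the invitation letter that research on the OrientX system is an important contribution to work in this field ("Your native XML database system OrientX is clearly recognized as a highly significant contribution in this research area"). • Professor Timos Sellis of the National Technical University of Athens (a well-known database expert and inventor of the R+-tree) contacted us on his own initiative to pursue collaborative research, which received support from the Ministry of Science and Technology's China-Greece international cooperation and exchange project "Context-based XML Data Management Research".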

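  22. Published papers • Because of the structural characteristics of XML documents, managing XML data with relational systems runs into problems such as data redundancy and low query efficiency. To address this, and with the goal of building a native XML database, we carried out systematic, in-depth research on XML data storage, encoding, indexing, query algebra, and optimization. Papers were published in conferences and journals including VLDB2003, SIGMOD2004, ICDE2005, DASFAA2003, WWWJ, and the Journal of Software (软件学报), and have been cited 39 times by international conferences and journals including ICDE2007, VLDB2007, WWW2007, VLDB2005, IEEE Internet Computing 9(2), SIGMOD WebDB, CIKM2005, EDBT2005, and DKE2005.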

  23. Outline • Preliminaries • Architecture and Features • Storage management • Achievement • Conclusion and Future Work

  24. Conclusion and Future Work • Conclusion: • OrientX is an integrated, schema-based native XML database system. • It implements storage and querying of XML data. • Future work: • XQuery/Update: transform • Further implementation of the XML algebra query engine.

  25. Thanks! Q&A
