This paper explores schema-based XML storage and query translation algorithms, focusing on recursive XML schemas and queries. It discusses mapping-aware query translation algorithms and the interaction between relational decomposition and query translation.
Processing XML data using a relational database: Schema-Based XML Storage By Khang Nguyen Based on the paper of Rajasekar Krishnamurthy
Three main points on the query translation problem • Developing query translation algorithms for the case when the XML schema and/or the XML query may be recursive. • Designing algorithms that make better use of the XML-to-Relational mapping information during the query translation process. • Studying the interaction between the two subproblems: choosing a good relational decomposition for storing the XML data and choosing a query translation algorithm.
Recursive Schemas and Recursive Queries • There has been a lot of work on alternative relational decompositions for XML data, but not much on query translation algorithms. • Recursive XML schemas are important: [Choi02] found that 35 of the 60 XML schemas analyzed were recursive. • The descendant operator (//) specifies ancestor-descendant relationships. • For example, the query //section/title is a recursive query.
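To make "recursive schema" concrete, here is a minimal sketch that treats a schema as a graph from each element name to its allowed child elements and checks whether any element can contain itself, directly or transitively. The dictionary encoding, the function name `is_recursive`, and both sample schemas are assumptions made for illustration; they are not taken from the paper or from [Choi02].

```python
from typing import Dict, List

def is_recursive(schema: Dict[str, List[str]]) -> bool:
    """Return True if some element can (transitively) contain itself."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    state = {name: WHITE for name in schema}

    def visit(name: str) -> bool:
        state[name] = GREY
        for child in schema.get(name, []):
            child_state = state.get(child, WHITE)
            if child_state == GREY:       # back edge: child is its own ancestor
                return True
            if child_state == WHITE and visit(child):
                return True
        state[name] = BLACK
        return False

    return any(state[name] == WHITE and visit(name) for name in schema)

# Hypothetical schema: a section may contain nested sections.
book_schema = {"book": ["title", "section"],
               "section": ["title", "section"],
               "title": []}
# Hypothetical schema with no self-containment.
flat_schema = {"book": ["title", "chapter"],
               "chapter": ["title"],
               "title": []}

print(is_recursive(book_schema))  # True
print(is_recursive(flat_schema))  # False
```

With a schema like `book_schema`, a path such as //section/title can match titles at any nesting depth, which is why the translation may need recursion.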
Recursive Schemas and Recursive Queries (Cont.) – Interesting Issues • How do we translate path expression queries over arbitrary XML-to-Relational mappings into equivalent SQL queries? • Is the support for recursion in SQL3 sufficient for supporting path expression queries over arbitrary XML-to-Relational mappings? • Are there any issues in the translation process when the XML schema is non-recursive? • Does XPath semantics introduce any interesting challenges?
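As one illustration of what SQL recursion can express, the sketch below evaluates a path like /book//section/title with a recursive common table expression, run through Python's built-in sqlite3 module. It is not the paper's translation algorithm. The tables, the `Article` relation, the sample rows, and the parentcode values (1 = parent is a book, 2 = parent is a section, 3 = parent is an article) are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Book    (id INTEGER PRIMARY KEY);
CREATE TABLE Article (id INTEGER PRIMARY KEY);
CREATE TABLE Section (id INTEGER PRIMARY KEY, parentid INTEGER,
                      parentcode INTEGER, title TEXT);
INSERT INTO Book VALUES (1);
INSERT INTO Article VALUES (1);
INSERT INTO Section VALUES
  (10,  1, 1, 'Intro'),       -- child of book 1
  (11, 10, 2, 'Background'),  -- child of section 10
  (12, 11, 2, 'History'),     -- child of section 11
  (20,  1, 3, 'Abstract'),    -- child of article 1
  (21, 20, 2, 'Scope');       -- child of section 20
""")

def book_section_titles(db: sqlite3.Connection) -> list:
    """Titles of all sections that have a book as an ancestor."""
    rows = db.execute("""
        WITH RECURSIVE BookSection(id, title) AS (
            -- base case: sections directly under a book
            SELECT S.id, S.title
            FROM Book B JOIN Section S
              ON S.parentid = B.id AND S.parentcode = 1
            UNION ALL
            -- recursive step: sections nested under a section found so far
            SELECT C.id, C.title
            FROM BookSection P JOIN Section C
              ON C.parentid = P.id AND C.parentcode = 2
        )
        SELECT title FROM BookSection ORDER BY id
    """).fetchall()
    return [title for (title,) in rows]

titles = book_section_titles(conn)
print(titles)  # ['Intro', 'Background', 'History']
```

The sections under the article are excluded because the recursion only starts from rows whose parent is a book. Whether this style of recursion covers every path expression over every mapping is exactly the open question the slide raises.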
Mapping-aware Query Translation Algorithm (Cont.) • Query: retrieve all the top-level section titles. • XQuery: for $title in document(*)/book/section/title • SQL query: SELECT S.title FROM Book B, Section S WHERE B.id = S.parentid AND S.parentcode = 1 • Mapping-aware algorithm query: SELECT title FROM Section WHERE parentcode = 1
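The two SQL queries above can be compared on a small instance. The sketch below loads hypothetical rows into an in-memory SQLite database and runs both. The sample data, and the reading that parentcode = 1 marks a section whose parent is a book while 2 marks a nested section, are assumptions for illustration. The simpler query is only safe if the mapping guarantees that every such row has a matching Book row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Book    (id INTEGER PRIMARY KEY);
CREATE TABLE Section (id INTEGER PRIMARY KEY, parentid INTEGER,
                      parentcode INTEGER, title TEXT);
INSERT INTO Book VALUES (1), (2);
INSERT INTO Section VALUES
  (1, 1, 1, 'Overview'),  -- top-level section of book 1
  (2, 1, 2, 'Details'),   -- nested under section 1 (same parentid as above)
  (3, 2, 1, 'Preface'),   -- top-level section of book 2
  (4, 3, 2, 'Notes');     -- nested under section 3
""")

# Translation that joins with Book, as in the slide's first SQL query.
join_query = """
    SELECT S.title FROM Book B, Section S
    WHERE B.id = S.parentid AND S.parentcode = 1
    ORDER BY S.id
"""
# Mapping-aware translation: the parentcode test alone identifies the rows.
mapping_aware_query = """
    SELECT title FROM Section
    WHERE parentcode = 1
    ORDER BY id
"""

join_titles = [t for (t,) in conn.execute(join_query)]
aware_titles = [t for (t,) in conn.execute(mapping_aware_query)]
print(join_titles)   # ['Overview', 'Preface']
print(aware_titles)  # ['Overview', 'Preface']
```

Row 2 shares its parentid with a book id, so the parentcode test is what keeps it out of the result in both queries. The mapping-aware version gets the same answer without the join.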
Are the two subproblems independent? • One subproblem is to pick a good relational decomposition; the other is to translate queries over the resulting XML-to-Relational mapping. • The two subproblems cannot be solved in isolation. • There exist query translation algorithms T1 and T2 and relational decompositions D1 and D2 such that D1 is better than D2 under T1, while D2 is better than D1 under T2.
Yes, the two subproblems are dependent (Cont.) • On the 100 MB XMark dataset [11], we noticed that XQ2fg was about three times faster than XQ2fp. • So, for query Q, the fully partitioned strategy is better with algorithm NaiveTranslation, whereas the fully grouped strategy is better with algorithm MultipleScan. • As a result, the quality of a decomposition is closely tied to the query translation algorithm used.