XML-to-SQL Query Mapping in the Presence of Multi-valued Schema Mappings and Recursive XML Schemas
Speaker: Artem Chebotko
Mustafa Atay, Artem Chebotko, Shiyong Lu, and Farshad Fotouhi
Department of Computer Science, Wayne State University, Detroit, MI 48202, USA
{matay, artem, shiyong, fotouhi}@wayne.edu
Outline of Talk
• Motivation
  • XML-to-Relational Mappings
  • Recursion in Query Mapping
• Problem Statement and Contributions
• Our Proposed Solution
  • Path-based XML-to-Relational Mapping
  • Unfolded XML Schema Graph
  • Proposed generic query mapping algorithm
• Conclusions and Future Work
DEXA'07, Regensburg, Germany
XML-to-Relational Mappings
• Single-valued
  • Each XML element type is mapped to exactly one relation
  • e.g., Shared, Shanmugasundaram et al., VLDB, 1999
  • e.g., ODTDMap, Atay et al., Information Systems, 2007
• Multi-valued
  • Each XML element type can be mapped to multiple relations
  • e.g., Basic and Hybrid, Shanmugasundaram et al., VLDB, 1999
Single-valued vs. Multi-valued
Single-valued vs. Multi-valued Motivating Example
• XPath expression
  • /A/B1/C/D3
• SQL query
  • Select T4.ID From s(A) T1, s(B1) T2, s(C) T3, s(D3) T4 Where T1.ID=T2.parentID And T2.ID=T3.parentID And T3.ID=T4.parentID
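A minimal sketch of how the SQL query on this slide can be derived from the XPath expression under a single-valued mapping. The function name `simple_path_to_sql` and the dictionary `s` are illustrative assumptions, not part of the talk; the values of `s` reuse the slide's s(...) notation, where a real system would use actual relation names. Only child-axis paths are handled.

```python
def simple_path_to_sql(path, s):
    """Translate a child-axis-only XPath such as /A/B1/C/D3 into SQL.

    s is a single-valued mapping: element type -> the one relation storing it.
    Each step gets an alias T1..Tn; consecutive steps are joined on
    parent.ID = child.parentID, as in the slide's query.
    """
    steps = [step for step in path.split("/") if step]
    aliases = [f"T{i}" for i in range(1, len(steps) + 1)]
    from_clause = ", ".join(f"{s[e]} {t}" for e, t in zip(steps, aliases))
    joins = [f"{parent}.ID={child}.parentID"
             for parent, child in zip(aliases, aliases[1:])]
    sql = f"Select {aliases[-1]}.ID From {from_clause}"
    if joins:
        sql += " Where " + " And ".join(joins)
    return sql


# Hypothetical mapping that mirrors the slide's s(...) notation.
s = {"A": "s(A)", "B1": "s(B1)", "C": "s(C)", "D3": "s(D3)"}
print(simple_path_to_sql("/A/B1/C/D3", s))
```

With a multi-valued mapping, `s[e]` would no longer be a single relation, which is why a lookup keyed on the element type alone stops being sufficient.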
Recursion in Query Mapping
• Challenge: when both an XML query and its underlying schema are recursive, there may be infinitely many matching paths
• Solutions
  • Krishnamurthy et al., ICDE, 2004
    • requires the WITH construct of SQL’99
  • Fan et al., VLDB, 2005
    • requires LFP (least fixpoint operator)
Problem Statement
• Existing query mapping algorithms support only single-valued mappings
  • A query mapping algorithm that supports multi-valued mappings is missing
• Existing query mapping algorithms need special operators to deal with recursion
  • There is a need for a query mapping algorithm that can be implemented on any RDBMS
Contributions
• We propose a generic query mapping algorithm that supports both single-valued and multi-valued mappings
• Our proposed algorithm requires only the traditional relational operators to handle recursion
  • It can be implemented in any RDBMS
Our Proposed Solution
• Path-based XML-to-Relational Mapping
  • sp-Mapping
  • supports multi-valued mappings (i)
• Unfolded XML Schema Graph
  • UXG
  • helps identify a finite number of paths for a given recursive query (ii)
• Proposed generic query mapping algorithm
  • ID-XMLtoSQL
sp-Mapping
• Provides a solution to the problem of supporting multi-valued mappings
• Combines the following to find a mapping
  • XML-to-Relational Mapping (s-Mapping)
  • Path structure of the input XML query p=e1/e2/…/en
sp-Mapping Example
/A/B1/C/D3
• SQL query
  • Select T4.ID From s(A) T1, s(B1) T2, s(C) T3, s(D3) T4 Where T1.ID=T2.parentID And T2.ID=T3.parentID And T3.ID=T4.parentID
UXG (Unfolded XML Schema Graph)
• UXG provides a solution to the problem of finding a finite number of matching paths for a recursive XML query
• We convert a cyclic XML schema graph into a directed acyclic graph (the UXG) by unfolding the cycles in the original graph
  • Static and dynamic approaches
• UXG always guarantees a finite number of matching paths for an arbitrary XML query
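The slides do not spell out the unfolding rule, so the following is only one plausible static unfolding, not necessarily the authors' construction. It assumes that an element type already present on the current root-to-node path is copied once as a leaf and not expanded further. The function name `unfold` and the small recursive schema (A → D3 → E → D1 → E) are assumptions made for illustration.

```python
def unfold(schema, root):
    """Unfold a possibly cyclic schema graph into a tree-shaped DAG.

    schema maps an element type to the list of its child element types.
    Returns (labels, children): labels maps a fresh node id to its element
    type, children maps a node id to the ids of its child nodes.
    """
    labels = {0: root}
    children = {}

    def visit(node_id, on_path):
        for child_type in schema.get(labels[node_id], []):
            child_id = len(labels)          # fresh id for every copy
            labels[child_id] = child_type
            children.setdefault(node_id, []).append(child_id)
            # Assumption: a type that already occurs on the current path is
            # copied once as a leaf and never expanded again, so the
            # traversal terminates even when the schema is cyclic.
            if child_type not in on_path:
                visit(child_id, on_path | {child_type})

    visit(0, {root})
    return labels, children


# Hypothetical recursive schema: E contains D1, and D1 contains E again.
schema = {"A": ["D3"], "D3": ["E"], "E": ["D1"], "D1": ["E"]}
labels, children = unfold(schema, "A")
print(labels)    # node id -> element type
print(children)  # node id -> child node ids
```

Because every root-to-leaf path in the result is finite and there are finitely many of them, any path expression can match only finitely many paths in the unfolded graph.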
UXG Example
Algorithm ID-XMLtoSQL
• Given a path expression P and UXG Gu
  • Extract all matching paths pi
  • Identify the sp-Mappings for each pi
  • Call SPathToSQL() to generate an SQL query for each pi
  • Take the union of the output SQL queries
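The steps above can be sketched as follows. This is a hedged outline rather than the authors' implementation: the UXG is assumed to be given as two dictionaries (`labels`, `children`), only the `/` and `//` axes are handled, and `path_to_sql` is a hypothetical callback standing in for the sp-Mapping lookup plus SPathToSQL().

```python
import re


def matching_paths(expr, labels, children, root=0):
    """Enumerate the simple paths in an acyclic UXG that match expr."""
    steps = re.findall(r"(//|/)([^/]+)", expr)   # e.g. [("/", "A"), ("//", "E")]
    results = []

    def reachable(node, axis):
        # node None is a virtual document root whose only child is `root`.
        kids = [root] if node is None else children.get(node, [])
        for kid in kids:
            yield kid, [labels[kid]]
            if axis == "//":                     # descendant axis: go deeper
                for target, tail in reachable(kid, "//"):
                    yield target, [labels[kid]] + tail

    def walk(node, i, path):
        if i == len(steps):
            results.append("/" + "/".join(path))
            return
        axis, name = steps[i]
        for target, segment in reachable(node, axis):
            if labels[target] == name:
                walk(target, i + 1, path + segment)

    walk(None, 0, [])
    return results


def id_xml_to_sql(expr, labels, children, path_to_sql):
    """Union the SQL generated for each matching path (stub for SPathToSQL)."""
    return " UNION ALL ".join(
        path_to_sql(p) for p in matching_paths(expr, labels, children))


# Hypothetical UXG: A -> D3 -> E -> D1 -> E (the second E is an unfolded copy).
labels = {0: "A", 1: "D3", 2: "E", 3: "D1", 4: "E"}
children = {0: [1], 1: [2], 2: [3], 3: [4]}
print(matching_paths("/A/D3//E", labels, children))
```

Since the UXG is acyclic, the descendant step can be expanded by plain enumeration, and the final query needs nothing beyond joins and a union.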
Clustering
• A cluster is a set of consecutive elements in a path expression that are mapped to the same relation
• We use the notion of a cluster to optimize the output SQL query in SPathToSQL()
• e.g., /A/B1/C/D3/E
  Element:  A    B1   C    D3   E
  Relation: A    B1   B1   B1   E
  Cluster:  c1   c2   c2   c2   c3
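A small sketch of the clustering step, assuming the sp-Mapping of a path is available as a list of (element, relation) pairs. The function name `clusters` is illustrative; the input reproduces the slide's example, in which B1, C and D3 are all stored in relation B1.

```python
from itertools import groupby


def clusters(sp_mapping):
    """Group consecutive path elements that are mapped to the same relation.

    sp_mapping is a list of (element, relation) pairs in path order.
    Returns a list of (relation, [elements]) pairs, one per cluster.
    """
    grouped = groupby(sp_mapping, key=lambda pair: pair[1])
    return [(relation, [element for element, _ in members])
            for relation, members in grouped]


# /A/B1/C/D3/E with relations A, B1, B1, B1, E
example = [("A", "A"), ("B1", "B1"), ("C", "B1"), ("D3", "B1"), ("E", "E")]
print(clusters(example))
# [('A', ['A']), ('B1', ['B1', 'C', 'D3']), ('E', ['E'])]
```

Each cluster contributes one relation to the From clause, so the five-step path needs three relations and two joins instead of five relations and four joins.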
Algorithm SPathToSQL
Example
• Given path expression
  • /A/D3//E
• Extracted paths
  • /A/D3/E
  • /A/D3/E/D1/E
• Identified sp-Mappings
  • {(A,A), (D3,A), (E,E)}
  • {(A,A), (D3,A), (E,E), (D1,D1), (E,E)}
• Output SQL query
  Select E.ID From A, E Where A.D3.ID=E.parentID
  UNION ALL
  Select T2.ID From A, E T1, D1, E T2 Where A.D3.ID=T1.parentID And T1.ID=D1.parentID And D1.ID=T2.parentID
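The body of SPathToSQL is not reproduced in these slides, so the sketch below is reverse-engineered from this example only. It assumes that a relation gets an alias T1, T2, … only when it occurs in more than one cluster, that the ID of an element inlined in another relation is addressed as Relation.Element.ID, and that consecutive clusters are joined on parentID. The function name `spath_to_sql` is hypothetical. Note that in the second branch the sketch selects T2.ID, the alias of the last E, because the bare name E is no longer usable once the relation is aliased twice.

```python
from collections import Counter
from itertools import groupby


def spath_to_sql(sp_mapping):
    """Generate one SQL query from a list of (element, relation) pairs."""
    # One cluster per run of consecutive elements stored in the same relation.
    groups = [(rel, [e for e, _ in members])
              for rel, members in groupby(sp_mapping, key=lambda p: p[1])]
    counts = Counter(rel for rel, _ in groups)

    aliases, from_items, n = [], [], 0
    for rel, _ in groups:
        if counts[rel] > 1:              # assumed rule: alias only repeats
            n += 1
            aliases.append(f"T{n}")
            from_items.append(f"{rel} T{n}")
        else:
            aliases.append(rel)
            from_items.append(rel)

    def id_column(alias, rel, element):
        # Assumed naming: the relation's own element uses ID, inlined
        # elements use <element>.ID.
        return f"{alias}.ID" if element == rel else f"{alias}.{element}.ID"

    joins = []
    for k in range(1, len(groups)):
        prev_rel, prev_elems = groups[k - 1]
        left = id_column(aliases[k - 1], prev_rel, prev_elems[-1])
        joins.append(f"{left}={aliases[k]}.parentID")

    last_rel, last_elems = groups[-1]
    sql = (f"Select {id_column(aliases[-1], last_rel, last_elems[-1])} "
           f"From {', '.join(from_items)}")
    if joins:
        sql += " Where " + " And ".join(joins)
    return sql


p1 = [("A", "A"), ("D3", "A"), ("E", "E")]
p2 = [("A", "A"), ("D3", "A"), ("E", "E"), ("D1", "D1"), ("E", "E")]
print(" UNION ALL ".join(spath_to_sql(p) for p in (p1, p2)))
```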
Performance Study
• We compared our ID-XMLtoSQL with SQLGen of Krishnamurthy et al., ICDE, 2004
• We selected 9 queries from the XMark benchmark
• ID-XMLtoSQL outperformed SQLGen on all the test queries
Conclusions and Future Work
• We proposed a generic query mapping algorithm for schema-based relational XML storage
• We proposed an efficient way of handling recursion in query mapping that is applicable to all RDBMSs
• As potential future work, we consider augmenting our proposed ID-based algorithm with interval-based and path-based mapping schemes
Thank you! Questions?