XML-to-Relational Data Mapping Algorithms OXInsert and SDM. Speaker: Artem Chebotko* Email: artem@wayne.edu Wayne State University *Joint work with Mustafa Atay, Shiyong Lu and Farshad Fotouhi.
Introduction • XML has emerged as the standard for representing and exchanging data on the World Wide Web. • The growing number of XML documents creates a need to store and query them efficiently.
Current approaches to storing and querying XML documents • Native XML repositories, e.g., Software AG’s Tamino, eXcelon’s XIS. • XML-enabled commercial database systems such as SQL Server, Oracle, and DB2. • Using an RDBMS/ODBMS to store and query XML documents.
Issues of the relational approach • Schema Mapping • The XML data model needs to be mapped into the relational model. • Data Mapping • XML documents need to be shredded and composed into tuples to be inserted into the relational database. • Query Mapping • XML queries need to be translated into SQL queries. • Reverse Data Mapping • Query results need to be tagged back into XML format.
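The four tasks above can be illustrated end to end on a toy document. The following is only a minimal sketch, not the method presented in this talk: the document, the `book` table, and the helper names `shred`, `query_titles`, and `tag` are all hypothetical and chosen for illustration, using Python's standard `sqlite3` and `xml.etree.ElementTree` modules.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical input document, invented for this illustration.
DOC = (
    "<library>"
    "<book id='b1'><title>XML Basics</title></book>"
    "<book id='b2'><title>SQL Basics</title></book>"
    "</library>"
)

def shred(doc, conn):
    """Schema and data mapping: one <book> element becomes one tuple."""
    # Schema mapping (assumed): book(id, title).
    conn.execute("CREATE TABLE book (id TEXT PRIMARY KEY, title TEXT)")
    for book in ET.fromstring(doc).findall("book"):
        conn.execute(
            "INSERT INTO book VALUES (?, ?)",
            (book.get("id"), book.findtext("title")),
        )

def query_titles(conn):
    """Query mapping: a path query such as /library/book/title, written as SQL."""
    return conn.execute("SELECT id, title FROM book ORDER BY id").fetchall()

def tag(rows):
    """Reverse data mapping: tag the SQL result rows back into XML."""
    result = ET.Element("result")
    for book_id, title in rows:
        element = ET.SubElement(result, "title", {"book": book_id})
        element.text = title
    return ET.tostring(result, encoding="unicode")

conn = sqlite3.connect(":memory:")
shred(DOC, conn)
xml_out = tag(query_titles(conn))
```

Here `xml_out` holds the query result re-serialized as XML, so the round trip covers all four mapping steps on a single in-memory database.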
Our contributions • We propose an efficient DOM-based linear data mapping algorithm, OXInsert, which shreds and composes input XML documents into relational tuples and inserts them into the relational database according to the schema generated by ODTDMap. • We propose an efficient and linear SAX-based data mapping algorithm, SDM, which shreds and composes ordered XML documents into relational tuples and inserts them into the relational database according to the schema generated by ODTDMap.
Outline of the talk • Main issues for data mapping • Data mapping algorithm OXInsert • Data mapping algorithm SDM • Complete example • Conclusions and future work
Main issues for data mapping • Varying document structure. XML documents have varying structures due to the optional occurrence operators '?' and '*' and the choice operator '|' used in the underlying DTD, unlike relational tables, which always have a fixed structure. • Scalability. • Preservation of document order.
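To make the first and third issues concrete, here is a small sketch under assumed conditions: a hypothetical DTD rule `<!ELEMENT book (title, subtitle?, author*)>`, with the optional `subtitle` stored as a nullable column and the repeated `author` stored in a separate table that records its position. The table layout and the function name `shred_books` are illustrative assumptions, not the schema produced by ODTDMap.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: the second book omits the optional and repeated children.
DOC = """<library>
  <book><title>A</title><subtitle>S</subtitle><author>X</author><author>Y</author></book>
  <book><title>B</title></book>
</library>"""

def shred_books(doc):
    """Return rows for two fixed-structure tables built from varying elements."""
    book_rows = []    # (book_id, title, subtitle or None)
    author_rows = []  # (book_id, position, author)
    books = ET.fromstring(doc).findall("book")
    for book_id, book in enumerate(books, start=1):
        # findtext returns None for a missing optional child: a NULL column.
        book_rows.append((book_id, book.findtext("title"), book.findtext("subtitle")))
        # The position column keeps the document order of repeated children.
        for position, author in enumerate(book.findall("author"), start=1):
            author_rows.append((book_id, position, author.text))
    return book_rows, author_rows

book_rows, author_rows = shred_books(DOC)
```

Both books end up as rows of the same width even though their XML structure differs, and the `position` value is what allows the original author order to be restored later.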
XML Tree • Definition 4.1 (XML Tree) We model an XML document D as an XML element tree (XML Tree) T, in which nodes represent XML elements and edges represent parent-child relationships between XML elements. The XML Tree T is an ordered tree and its nodes can have attributes and values associated with them. The root of XML Tree T is denoted by T.root.
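Definition 4.1 can be read as a small data structure. The sketch below is one possible rendering, with hypothetical class and field names (`XMLTree`, `XMLTreeNode`, `attributes`, `value`, `children`); it builds the tree from Python's `xml.etree.ElementTree` and walks it in document order.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET

@dataclass
class XMLTreeNode:
    name: str                                      # element name
    attributes: dict                               # attribute name -> value
    value: str                                     # text directly inside the element
    children: list = field(default_factory=list)   # ordered child nodes

@dataclass
class XMLTree:
    root: XMLTreeNode                              # corresponds to T.root

def build_tree(doc):
    """Convert an XML string into the ordered element tree of Definition 4.1."""
    def convert(element):
        node = XMLTreeNode(element.tag, dict(element.attrib), (element.text or "").strip())
        node.children = [convert(child) for child in element]
        return node
    return XMLTree(convert(ET.fromstring(doc)))

def document_order(tree):
    """Pre-order traversal, which visits elements in document order."""
    names, stack = [], [tree.root]
    while stack:
        node = stack.pop()
        names.append(node.name)
        stack.extend(reversed(node.children))  # leftmost child is popped first
    return names

tree = build_tree("<a x='1'><b>hi</b><c><d/></c></a>")
order = document_order(tree)
```

Because `children` is a list rather than a set, the sibling order of the source document is part of the model, which is what makes the tree "ordered".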
XML Tree (cont’d) • For each element node e in T, we use the following notations:
OXInsert time complexity • Lemma 4.2 Each non-inlinable element e in XML Tree T is enqueued into queue q exactly once, and q only contains non-inlinable elements. • Lemma 4.3 Each XML element e, except the root element in XML Tree T, is enqueued into queue r exactly once. • Theorem 4.4 (Time Complexity) The time complexity of algorithm OXInsert is O(n), where DTD Graph G, Relational Schema R and Schema Mapping s are fixed and n is the total number of XML elements and attribute values in XML Tree T.
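The intuition behind these lemmas is that a traversal which enqueues every element at most once does a bounded amount of work per element. The sketch below is not OXInsert itself; it is a simplified, assumed traversal in which a hypothetical set `non_inlinable` of element names decides which elements go through the queue `q`, while the remaining elements are walked directly under their nearest non-inlinable ancestor. The root is assumed to be non-inlinable.

```python
from collections import deque
import xml.etree.ElementTree as ET

def count_work(doc, non_inlinable):
    """Return (number of enqueues into q, number of element visits)."""
    root = ET.fromstring(doc)
    q = deque([root])      # holds only non-inlinable elements
    enqueues, visits = 1, 0
    while q:
        element = q.popleft()
        pending = [element]  # this element plus its inlinable descendants
        while pending:
            node = pending.pop()
            visits += 1
            for child in node:
                if child.tag in non_inlinable:
                    q.append(child)      # deferred: enqueued exactly once
                    enqueues += 1
                else:
                    pending.append(child)  # handled with its ancestor
    return enqueues, visits

DOC = "<lib><book><title/><author/></book><book><title/></book></lib>"
enqueues, visits = count_work(DOC, {"lib", "book"})
```

In this sketch every element is visited exactly once and only non-inlinable elements pass through `q`, so the total work grows linearly with the number of elements, which is the shape of the argument stated in Theorem 4.4.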
SDM time complexity • Theorem 4.6 (Time complexity) The time complexity of algorithm SDM is O(n), where n is the number of elements and attribute values in the input XML document.
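A SAX-based mapper can be linear because it sees each element once as a start event and once as an end event, keeping only the currently open elements on a stack. The following is a minimal single-pass sketch using Python's standard `xml.sax` module; it is not the SDM algorithm, and the handler name `ShredHandler` and the tuple layout `(order, parent_order, name, attributes, text)` are assumptions made for illustration.

```python
import xml.sax

class ShredHandler(xml.sax.ContentHandler):
    """Emit one tuple per element in a single pass over the SAX events."""

    def __init__(self):
        super().__init__()
        self.stack = []    # currently open elements
        self.tuples = []   # (order, parent_order, name, attributes, text)
        self.counter = 0   # document-order number assigned at the start tag

    def startElement(self, name, attrs):
        self.counter += 1
        self.stack.append((self.counter, name, dict(attrs.items()), []))

    def characters(self, content):
        if self.stack:
            self.stack[-1][3].append(content)

    def endElement(self, name):
        order, name, attributes, parts = self.stack.pop()
        parent_order = self.stack[-1][0] if self.stack else None
        text = "".join(parts).strip()
        self.tuples.append((order, parent_order, name, attributes, text))

handler = ShredHandler()
xml.sax.parseString(b"<a><b k='v'>x</b><c>y</c></a>", handler)
rows = sorted(handler.tuples, key=lambda row: row[0])  # restore document order
```

Tuples are emitted when an element closes, so children appear before their parent; the `order` number recorded at the start tag is what lets document order be recovered. Memory use is bounded by the depth of the document rather than its size, which is the usual reason to prefer SAX over DOM for large inputs.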
Conclusions • We identified several challenging issues for the data mapping problem and proposed two linear data mapping algorithms, OXInsert and SDM, based on the two well-known XML parsing models, DOM and SAX, respectively. • We compared their performance. Experimental studies showed that both algorithms are efficient and scale well with the size of the input documents.
Future work • How to take semantic integrity constraints into account during data mapping remains to be investigated.