XML DOCUMENTS AND DATABASES
Introduction • This part is about how XML Documents can be stored and retrieved.
Approaches to Storing XML Documents • Several approaches to organizing the contents of XML documents to facilitate their subsequent querying and retrieval have been proposed. The following are the most common approaches.
Using a DBMS to store the documents as text • A relational or object DBMS can be used to store whole XML documents as text fields within the DBMS records or objects. • This approach can be used if the DBMS has a special module for document processing, and it would work for storing schemaless and document-centric XML documents.
Using a DBMS to store the documents as text • The keyword indexing functions of the document processing module can be used to index and speed up search and retrieval of the documents
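As a rough illustration of the text-storage approach, the sketch below keeps whole XML documents in a single text column and searches them by keyword. It assumes SQLite (via Python's `sqlite3` module) as a stand-in DBMS; the table name `xml_docs`, the helper functions, and the sample documents are hypothetical. A plain `LIKE` scan stands in for the keyword index that a real document-processing module would provide.

```python
import sqlite3

def make_store():
    # In-memory database with one text column holding the whole document.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE xml_docs (id INTEGER PRIMARY KEY, name TEXT, body TEXT)"
    )
    return conn

def add_document(conn, name, xml_text):
    # The document is stored as-is; no schema or DTD is required.
    conn.execute(
        "INSERT INTO xml_docs (name, body) VALUES (?, ?)", (name, xml_text)
    )

def search(conn, keyword):
    # Assumption: a substring scan replaces a real keyword index here.
    rows = conn.execute(
        "SELECT name FROM xml_docs WHERE body LIKE ? ORDER BY name",
        (f"%{keyword}%",),
    )
    return [row[0] for row in rows]

conn = make_store()
add_document(conn, "memo1",
             "<memo><to>Registrar</to><body>Grades are due Friday.</body></memo>")
add_document(conn, "memo2", "<note>The parking lot is closed.</note>")
print(search(conn, "Grades"))
```

Because the DBMS treats each document as opaque text, the two sample documents can have entirely different structures.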
Using a DBMS to store the document contents as data elements • This approach would work for storing a collection of documents that follow a specific XML DTD or XML schema. • Because all the documents have the same structure, one can design a relational database to store the leaf data elements within the XML documents.
Using a DBMS to store the document contents as data elements • This approach requires mapping algorithms to design a database schema that is compatible with the XML document structure as specified in the XML schema or DTD, and to recreate the XML documents from the stored data.
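The two directions of this mapping can be sketched for a single, fixed document structure. The example assumes a hypothetical `<student>` document with three leaf elements (`ssn`, `name`, `major`) and a matching SQLite table; a real mapping algorithm would derive the table design from the DTD or XML schema rather than hard-coding it.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document that follows one fixed structure.
DOC = "<student><ssn>123</ssn><name>Smith</name><major>CS</major></student>"
LEAVES = ("ssn", "name", "major")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (ssn TEXT PRIMARY KEY, name TEXT, major TEXT)")

def shred(conn, xml_text):
    # Store each leaf data element in the column of the same name.
    root = ET.fromstring(xml_text)
    values = [root.findtext(tag) for tag in LEAVES]
    conn.execute("INSERT INTO student (ssn, name, major) VALUES (?, ?, ?)", values)

def rebuild(conn, ssn):
    # Recreate the XML document from the stored row.
    row = conn.execute(
        "SELECT ssn, name, major FROM student WHERE ssn = ?", (ssn,)
    ).fetchone()
    root = ET.Element("student")
    for tag, value in zip(LEAVES, row):
        ET.SubElement(root, tag).text = value
    return ET.tostring(root, encoding="unicode")

shred(conn, DOC)
print(rebuild(conn, "123"))
```

The round trip works only because every document shares the same structure, which is the precondition this approach relies on.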
Designing a specialized system for storing native XML data • A new type of database system based on the tree model could be designed and implemented. • The system would include specialized indexing and would work for all types of XML documents. • Data compression techniques can also be used in this approach.
Creating and publishing customized XML documents from preexisting relational databases • There are enormous amounts of data already stored in relational databases. • Parts of this data may need to be formatted as documents for exchanging or displaying over the Web
Creating and publishing customized XML documents from preexisting relational databases • This approach would use a separate middleware software layer to handle the conversions needed between the XML documents and the relational database.
Extracting XML Documents from Relational Databases • In this part we focus on the last approach. • XML uses a tree model to represent documents. When referential integrity constraints are added, a relational schema can be considered a graph structure. • Similarly, the ER model represents data using graph-like structures. • There are straightforward mappings between the ER and relational models.
Extracting XML Documents from Relational Databases • So we can conceptually represent a relational database schema using the corresponding ER schema. • The issue is therefore reconciling the differences between the tree and graph models. Solving it also solves the problem of converting relational data to XML.
Example • Objective: Create an application to extract XML documents for student, course, and grade information from the University DB.
Example • To do this, we need the attributes of COURSE, SECTION, and STUDENT.
Example • When we convert this to a tree model, we must select a root. In this example, any of the three can be chosen as the root, which gives three different hierarchies.
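The effect of the root choice can be sketched with a small adjacency list. The links below (COURSE to SECTION, SECTION to STUDENT) are an assumption about how the three entity types are related; the function `hierarchy` is hypothetical and only works when the relationships contain no cycles.

```python
# Assumed relationships among the three entity types (no cycles).
LINKS = {
    "COURSE": ["SECTION"],
    "SECTION": ["COURSE", "STUDENT"],
    "STUDENT": ["SECTION"],
}

def hierarchy(root, parent=None):
    # Every neighbor except the one we arrived from becomes a child.
    children = [hierarchy(n, root) for n in LINKS[root] if n != parent]
    return {root: children}

for root in ("COURSE", "SECTION", "STUDENT"):
    print(hierarchy(root))
```

Each root yields a different document hierarchy over the same three entity types: COURSE at the top nests sections and then students, STUDENT at the top reverses that order, and SECTION at the top has the other two as siblings.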
Breaking Cycles to Convert Graphs into Trees • It is possible to have a more complex subset with one or more cycles, indicating multiple relationships among entities. • In this case it is more complex to decide how to create the document hierarchies. • Additional duplication of entities may be needed to represent the multiple relationships.
Breaking Cycles to Convert Graphs into Trees • One way to break the cycles is to replicate the entity types involved in them.
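One simple way to carry out this replication is sketched below. It is an illustrative strategy, not necessarily the one the original slides had in mind: a breadth-first traversal from the chosen root builds a tree, and whenever an edge leads to an entity type that is already in the tree, a replica of that entity type is attached instead. The edge list (including INSTRUCTOR and DEPARTMENT), the function name `break_cycles`, and the `_copyN` naming are all assumptions made for the example.

```python
from collections import deque

# Hypothetical schema graph with one cycle:
# SECTION - COURSE - DEPARTMENT - INSTRUCTOR - SECTION
EDGES = [
    ("STUDENT", "SECTION"),
    ("SECTION", "COURSE"),
    ("SECTION", "INSTRUCTOR"),
    ("COURSE", "DEPARTMENT"),
    ("INSTRUCTOR", "DEPARTMENT"),
]

def break_cycles(edges, root):
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    children = {root: []}   # tree as parent -> list of children
    used = set()            # edges already represented in the tree
    copies = 0
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for other in neighbors[node]:
            edge = frozenset((node, other))
            if edge in used:
                continue
            used.add(edge)
            if other not in children:
                # First time we reach this entity type: normal tree edge.
                children[node].append(other)
                children[other] = []
                queue.append(other)
            else:
                # Edge would close a cycle: attach a replica instead.
                copies += 1
                replica = f"{other}_copy{copies}"
                children[node].append(replica)
                children[replica] = []
    return children

tree = break_cycles(EDGES, "STUDENT")
for parent, kids in tree.items():
    print(parent, "->", kids)
```

In this run DEPARTMENT appears twice, once under COURSE and once as a replica under INSTRUCTOR, so both relationships survive in the tree at the cost of duplicated data.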
Other Steps for Extracting XML Documents from Databases • It is necessary to create the correct SQL query to extract the desired information for the XML document. • Once the query is executed, its result must be restructured from the flat relational form into the XML tree structure.
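Both steps can be sketched together: a join query produces flat rows, and a small function regroups them into a tree with COURSE as the root. The table layout, column names, sample rows, and XML element names below are hypothetical stand-ins for the University DB, and SQLite is assumed as the DBMS.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (course_no TEXT PRIMARY KEY, title TEXT);
CREATE TABLE section (section_id INTEGER PRIMARY KEY,
                      course_no TEXT REFERENCES course(course_no),
                      semester TEXT);
CREATE TABLE student (ssn TEXT PRIMARY KEY, name TEXT);
CREATE TABLE grade_report (ssn TEXT REFERENCES student(ssn),
                           section_id INTEGER REFERENCES section(section_id),
                           grade TEXT);
INSERT INTO course VALUES ('CS3380', 'Database');
INSERT INTO section VALUES (1, 'CS3380', 'Fall'), (2, 'CS3380', 'Spring');
INSERT INTO student VALUES ('111', 'Smith'), ('222', 'Brown');
INSERT INTO grade_report VALUES ('111', 1, 'A'), ('222', 1, 'B'), ('222', 2, 'A');
""")

# Step 1: the SQL query returns one flat row per (course, section, student).
QUERY = """
SELECT c.course_no, c.title, s.section_id, s.semester, st.name, g.grade
FROM course c
JOIN section s ON s.course_no = c.course_no
JOIN grade_report g ON g.section_id = s.section_id
JOIN student st ON st.ssn = g.ssn
ORDER BY c.course_no, s.section_id, st.name
"""

# Step 2: regroup the flat rows into a COURSE > SECTION > STUDENT tree.
def rows_to_xml(rows):
    root = ET.Element("courses")
    courses, sections = {}, {}
    for course_no, title, section_id, semester, name, grade in rows:
        if course_no not in courses:
            course = ET.SubElement(root, "course", number=course_no)
            ET.SubElement(course, "title").text = title
            courses[course_no] = course
        if section_id not in sections:
            sections[section_id] = ET.SubElement(
                courses[course_no], "section",
                id=str(section_id), semester=semester)
        student = ET.SubElement(sections[section_id], "student")
        ET.SubElement(student, "name").text = name
        ET.SubElement(student, "grade").text = grade
    return root

xml_root = rows_to_xml(conn.execute(QUERY))
print(ET.tostring(xml_root, encoding="unicode"))
```

The course title is repeated on every flat row but is emitted only once in the XML, which is the main difference between the relational result and the tree form.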