XML Structures for Relational Data Wenyue Du, Mong Li Lee, Tok Wang Ling Department of Computer Science School of Computing National University of Singapore {duwenyue, leeml, lingtw}@comp.nus.edu.sg
Contents
• Introduction
  • Motivation
  • Related Works
  • Our Approach
• Background
  • XML
  • XML DTD
  • Semantic Enrichment
• Proposed Relational to XML Translation
• Comparison
• Conclusion
1. Introduction • Outline • Motivation • Related Works • Our Approach
Introduction Motivation • XML is emerging as a standard for information publishing on the World Wide Web. However, the underlying data is often stored in traditional relational databases. Some mechanism is needed to translate the relational data into XML data.
Introduction Motivation (cont.) Our goals: • Generate XML structures that describe the semantics and structure of the underlying relational database. • Obtain properly structured XML data without unnecessary redundancy or a proliferation of disconnected XML elements.
Introduction / Related Works
• [1, 5, 6] focus on translating a single relation. To handle a set of related relations, the relations are first denormalized into one relation.
• The resulting flat XML structure does not show the structure of the data well.
• It introduces a lot of redundancy.
Relations:
  Dept (D#, Dname)
  Employee (E#, Ename, JoinDate, D#)
Maps to:
  <!ELEMENT Results (Employee*)>
  <!ELEMENT Employee EMPTY>
  <!ATTLIST Employee
      E# CDATA #REQUIRED
      Ename CDATA #IMPLIED
      JoinDate CDATA #IMPLIED
      D# CDATA #REQUIRED
      DNAME CDATA #IMPLIED>
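The redundancy caused by the flat translation can be seen in a minimal Python sketch. The sample rows are invented for illustration, and E# and D# are renamed ENO and DNO because '#' is not a legal character in an XML name. This is not the code of any of the cited systems, only an illustration of denormalize-then-emit.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample rows (not from the slides).
dept = [{"DNO": "d1", "DNAME": "CS"}]
employee = [
    {"ENO": "e1", "ENAME": "Tan", "JOINDATE": "2001-01-05", "DNO": "d1"},
    {"ENO": "e2", "ENAME": "Lim", "JOINDATE": "2001-03-12", "DNO": "d1"},
]


def flat_translation(employee_rows, dept_rows):
    """Join Employee with Dept, then emit one flat element per joined row."""
    dname_by_dno = {d["DNO"]: d["DNAME"] for d in dept_rows}
    results = ET.Element("Results")
    for e in employee_rows:
        # Denormalization: the department name is copied into every row.
        row = dict(e, DNAME=dname_by_dno[e["DNO"]])
        ET.SubElement(results, "Employee", row)
    return results


results = flat_translation(employee, dept)
flat_xml = ET.tostring(results, encoding="unicode")
# DNAME is stored once in Dept but appears once per employee here.
dname_copies = [emp.get("DNAME") for emp in results.iter("Employee")]
```

With two employees in the same department, the department name is stored twice in the output, which is the redundancy the slide refers to.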
Introduction / Related Works (cont.)
• [7] developed a method to generate a hierarchical DTD for XML data from a relational schema.
• It lacks semantic enrichment, so it cannot handle more complex situations.
Relations:
  Dept (D#, Dname)
  Employee (E#, Ename, JoinDate, D#)
Maps to:
  <!ELEMENT Results (Employee*)>
  <!ELEMENT Employee (Dept)>
  <!ATTLIST Employee
      E# ID #REQUIRED
      Ename CDATA #IMPLIED
      JoinDate CDATA #IMPLIED>
  <!ELEMENT Dept EMPTY>
  <!ATTLIST Dept … >
Open question: is JoinDate an attribute of the object or of the relationship?
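Introduction Our Approach XML structures for relational data can be obtained by the following steps: (1) semantically enrich the relational schema; (2) translate the enriched relational schema into an ORA-SS schema; (3) translate the ORA-SS schema into an XML DTD.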
2. Background Outline • XML • XML DTD • Semantic Enrichment
Background / XML XML • Basic constructs of XML: • Element • Attribute • Reference (link) : a relationship between resources (e.g. elements). It is specified by attaching specific attributes or sub-elements.
Background / XML DTD
A Document Type Definition (DTD) describes the structure of an XML document.
XML document:
  <RESULTS>
    <CUSTOMER CID="C980054Z">
      <CNAME>J. Tan</CNAME>
      <AGE>36</AGE>
    </CUSTOMER>
    …
  </RESULTS>
Corresponding DTD:
  <!ELEMENT RESULTS (CUSTOMER*)>
  <!ELEMENT CUSTOMER (CNAME, AGE)>
  <!ATTLIST CUSTOMER CID ID #REQUIRED>
  <!ELEMENT CNAME (#PCDATA)>
  <!ELEMENT AGE (#PCDATA)>
Background / Semantic Enrichment Semantic Enrichment • Semantic enrichment is a process that upgrades the semantics of a database by making explicit the semantics that are implicit in the data, such as the various relationship types and cardinality constraints.
Background / Semantic Enrichment Extra information needed: • Functional dependencies (FDs) and keys • Inclusion dependencies (INDs), e.g. for STUDENT (S#, SNAME) and HOBBIES (S#, HOBBY): HOBBIES[S#] ⊆ STUDENT[S#] • Semantic dependencies (SDs) (T.W. Ling & M.L. Lee, 1995)
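An inclusion dependency such as the one between HOBBIES[S#] and STUDENT[S#] can be checked against sample data with a few lines of Python. This is a minimal sketch: the function name `satisfies_ind` and the sample rows are assumptions made for illustration, and checking data can only refute an IND, not prove that it holds as a schema constraint.

```python
def satisfies_ind(child_rows, child_attr, parent_rows, parent_attr):
    """Return True if every child_attr value also occurs as a parent_attr value."""
    child_values = {row[child_attr] for row in child_rows}
    parent_values = {row[parent_attr] for row in parent_rows}
    return child_values <= parent_values  # subset test


# Hypothetical instances of STUDENT(S#, SNAME) and HOBBIES(S#, HOBBY).
student = [{"S#": "s1", "SNAME": "Ann"}, {"S#": "s2", "SNAME": "Bob"}]
hobbies = [{"S#": "s1", "HOBBY": "chess"}, {"S#": "s1", "HOBBY": "tennis"}]

ind_holds = satisfies_ind(hobbies, "S#", student, "S#")
```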
Background / Semantic Enrichment Semantic Dependencies EMPLOYEE(E#, ENAME, JOINDATE, D#) • JOINDATE is functionally dependent only on E#. • If JOINDATE refers to the date on which an employee assumes duty with the department, we say that JOINDATE is semantically dependent on {E#, D#}.
Background / Semantic Enrichment Semantic Enrichment using SD together with FD and IND To obtain: • Object relations and object attributes that represent regular and weak entity types, and their properties. • Relationship relations and relationship attributes that represent various relationship types such as binary, n-ary, recursive and ISA (inheritance), and their properties. • Mix-type relations: We need to split them into object relations and relationship relations • Fragments of object relations or relationship relations that represent multi-valued attributes of entity types or relationship types. • Cardinality constraints
Background / Semantic Enrichment
An Original Relational Schema
  COURSE (CODE, TITLE)
  DEPT (D#, DNAME)
  STUDENT (S#, SNAME)
  TUTORIAL (T#, TUTORIALTITLE)
  HOBBIES (S#, HOBBY)
  STUDENTDEPT (S#, D#)
  C_S (CODE, S#, GRADE)
  ATTEND (CODE, T#, S#)
  COURSEMEETING (CODE, S#, MEETINGHISTORY)
Background / Semantic Enrichment
The Semantically Enriched Schema
Object Relations:
  COURSE (CODE, TITLE)
  DEPT (D#, DNAME)
  STUDENT (S#, SNAME)
  TUTORIAL (T#, TUTORIALTITLE)
Fragment of Object Relations:
  HOBBIES (S#, HOBBY)
Relationship Relations:
  STUDENTDEPT (S#, D#)
  C_S (CODE, S#, GRADE)
  ATTEND (CODE, T#, S#)
Fragment of Relationship Relations:
  COURSEMEETING (CODE, S#, MEETINGHISTORY) // fragment of C_S
3. Proposed Relational to XML Translation Outline • ORA-SS Model • Relational Schema to ORA-SS Translation • ORA-SS to XML Schema Translation
Proposed Relational to XML Translation / ORA-SS ORA-SS Model ORA-SS (Object-Relationship-Attribute model for Semi-Structured data) G. Dobbie, X.Y. Wu, T.W. Ling, M.L. Lee, “ORA-SS: An Object-Relationship-Attribute Model for Semi-structured Data”, TR 21/00, National Univ. of Singapore, 2001
Proposed Relational to XML Translation / ORA-SS Concepts of ORA-SS (cont.) [Figure: ORA-SS schema diagram with callouts for object class, binary relationship, ternary relationship, reference, identifier, and relationship attribute]
Enriched Relational Schema to ORA-SS Schema Translation Objectives: • Identify object classes and their attributes from object relations • Identify relationship types and their attributes from relationship relations • Identify hierarchical structure • Generate ORA-SS schema
Enriched Relational Schema to ORA-SS Schema Translation Overview of Translation Rules • Object relation rules: translate object relations • Relationship relation rules: translate relationship relations • Combination rule: applied to the results of the object and relationship relation rules to generate the final ORA-SS schema.
Enriched Relational Schema to ORA-SS Schema Translation / Object Relation Translation Rules Rule O1: Mapping object relations. STUDENT(S#, SNAME) maps to an object class [figure: ORA-SS diagram; callout: single-valued attribute]
Enriched Relational Schema to ORA-SS Schema Translation / Object Relation Translation Rules Rule O2: Mapping fragment of object relations. STUDENT(S#, SNAME) with HOBBIES(S#, HOBBY) maps to [figure: ORA-SS diagram; callout: multivalued attribute]
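Rules O1 and O2 can be sketched together in Python. The dictionary layout and the function name `object_class` are assumptions made for this illustration, not the authors' data structures: an object relation contributes an identifier and single-valued attributes, and each fragment of that relation contributes one multivalued attribute.

```python
def object_class(name, key, attributes, fragments=()):
    """Build an object-class description from an object relation.

    Rule O1: the key becomes the identifier and the remaining attributes
             become single-valued attributes.
    Rule O2: each fragment (relation_name, attribute) adds a multivalued
             attribute to the same object class.
    """
    return {
        "name": name,
        "identifier": key,
        "single_valued": [a for a in attributes if a != key],
        "multivalued": [attr for _relation, attr in fragments],
    }


# STUDENT(S#, SNAME) with its fragment HOBBIES(S#, HOBBY).
student = object_class(
    "STUDENT", "S#", ["S#", "SNAME"], fragments=[("HOBBIES", "HOBBY")]
)
```

Here SNAME ends up as a single-valued attribute of STUDENT and HOBBY as a multivalued one, so no separate HOBBIES object class is created.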
Enriched Relational Schema to ORA-SS Schema Translation / Relationship Relation Translation Rules
Rule R1: Mapping 1-m/1-1 relationship relation
Objectives:
• Reduce disconnected elements: use parent-child structure
• Avoid unnecessary redundancies: use references
Example:
  ADVISOR(STAFF#, POSITION) // object relation
  STUDENT(S#, SNAME) // object relation
  STU_ADV(S#, STAFF#) // 1-m relationship relation
Enriched Relational Schema to ORA-SS Schema Translation / Relationship Relation Translation Rules Rule R1: Mapping 1-m/1-1 relationship relation (cont.) Case 1: All the objects (instances) of STUDENT participate in the relationship type STU_ADV. STU_ADV maps to a parent-child structure with ADVISOR as parent and STUDENT as child, labeled STU_ADV, 2, 0:n, 1:1.
Enriched Relational Schema to ORA-SS Schema Translation / Relationship Relation Translation Rules Rule R1: Mapping 1-m/1-1 relationship relation (cont.) Case 2: • Not all the objects of STUDENT participate in STU_ADV, or • STUDENT is already a child object and all the objects of ADVISOR participate in STU_ADV. STU_ADV maps to a parent-child structure with STUDENT as parent and ADVISOR as child, labeled STU_ADV, 2, 0:1, 1:n.
Enriched Relational Schema to ORA-SS Schema Translation / Relationship Relation Translation Rules Rule R1: Mapping 1-m/1-1 relationship relation (cont.) Case 3: There exist objects of both STUDENT and ADVISOR that do not participate in STU_ADV. Use a reference. [Figure: two alternative ORA-SS diagrams for STUDENT and ADVISOR, each labeled STU_ADV, 2, *, ?; one uses a referencing object class ADVISOR1 (A_Ref), the other a referencing object class STUDENT1 (S_Ref)]
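The three cases of Rule R1 amount to a small decision procedure, sketched below in Python for the STU_ADV example. The function name, its parameters, and the return format are hypothetical. The grouping of the Case 2 conditions is also an assumed reading: Case 2 is taken to apply whenever Case 1 does not and every ADVISOR participates.

```python
def rule_r1(all_many_participate, all_one_participate, many_is_already_child=False):
    """Choose a structure for a 1-m relationship such as STU_ADV.

    The 'many' side is STUDENT and the 'one' side is ADVISOR.
    Returns (structure, parent, child); parent and child are None
    when the relationship is represented by references.
    """
    # Case 1: every STUDENT participates, so STUDENT nests under ADVISOR.
    if all_many_participate and not many_is_already_child:
        return ("parent-child", "ADVISOR", "STUDENT")
    # Case 2 (assumed reading): every ADVISOR participates, so ADVISOR
    # nests under STUDENT instead.
    if all_one_participate:
        return ("parent-child", "STUDENT", "ADVISOR")
    # Case 3: neither side participates fully, so use references.
    return ("reference", None, None)
```

For example, `rule_r1(True, False)` selects ADVISOR as parent of STUDENT, while `rule_r1(False, False)` falls through to the reference structure.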
Enriched Relational Schema to ORA-SS Schema Translation / Relationship Relation Translation Rules Rule R2: Mapping m-n binary relationship relation. Example: COURSE(CODE, TITLE), C_S(S#, CODE, GRADE), STUDENT(S#, SNAME). There are three ways to map. [Figure: three alternative ORA-SS diagrams, one marked as the preferred mapping]
Enriched Relational Schema to ORA-SS Schema Translation / Relationship Relation Translation Rules Other relationship relation rules • A fragment of a relationship relation is translated in the same way as a fragment of an object relation. • An n-ary relationship relation is translated using reference structures. The level of each referencing object may be determined by the aggregations. • If B ISA A, then B is mapped to a child object class (O_B) of O_A.
Enriched Relational Schema to ORA-SS Schema Translation / Combination Rule
Combination Rule:
• Applied to the results of the object and relationship relation rules to generate the final ORA-SS schema.
Example:
  PERSON(SSNO, RACE) // object relation
  STUDENT(S#, SSNO, MAJOR) // object relation
  DEPT(D#, DNAME) // object relation
  STU_DEPT(S#, D#) // relationship relation
STUDENT ISA PERSON, and one DEPT has many STUDENTs. In this case, STUDENT potentially has multiple parents (i.e., DEPT and PERSON).
Enriched Relational Schema to ORA-SS Schema Translation / Combination Rule
Combination Rule: Current solution:
• Use references (K. Williams, et al., January 2001). This causes too many disconnected elements.
  <!ELEMENT Results (PERSON*, STUDENT*, DEPT*)>
  <!ELEMENT PERSON EMPTY>
  <!ATTLIST PERSON
      SSNO ID #REQUIRED
      RACE CDATA #IMPLIED
      STU_REF1 IDREF #REQUIRED>
  <!ELEMENT STUDENT EMPTY>
  <!ATTLIST STUDENT
      S# ID #REQUIRED
      MAJOR CDATA #IMPLIED>
  <!ELEMENT DEPT EMPTY>
  <!ATTLIST DEPT
      D# ID #REQUIRED
      DNAME CDATA #IMPLIED
      STU_REF2 IDREFS #REQUIRED>
Enriched Relational Schema to ORA-SS Schema Translation / Combination Rule
Combination Rule: (cont.) Our approach:
• Translations are produced sequentially according to their priorities.
• The translation with the lowest priority is carried out last.
The priorities of translations (in descending order):
  1. ISA and other semantic relationship relations and their fragments // high semantic cohesion among the participating object classes
  2. 1-1 and 1-m relationship relations and their fragments // potentially represented as a hierarchy (parent-child) structure
  3. m-1 relationship relations and their fragments // potentially represented as a hierarchy structure; preferably viewed as 1-m
  4. m-n and n-ary relationship relations and their fragments
This rule is used to avoid or reduce potential multiple parents.
Enriched Relational Schema to ORA-SS Schema Translation / Combination Rule
Combination Rule: (cont.) We first map STUDENT to a child object class of PERSON, then map DEPT according to the 1-m relationship relation rule. This may give the following result:
  <!ELEMENT OurSolution (PERSON*, DEPT*)>
  <!ELEMENT PERSON (STUDENT)>
  <!ATTLIST PERSON
      SSNO ID #REQUIRED
      RACE CDATA #IMPLIED>
  <!ELEMENT STUDENT EMPTY>
  <!ATTLIST STUDENT
      S# ID #REQUIRED
      MAJOR CDATA #IMPLIED>
  <!ELEMENT DEPT EMPTY>
  <!ATTLIST DEPT
      D# ID #REQUIRED
      DNAME CDATA #IMPLIED
      D_S_REF IDREFS #REQUIRED>
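The ordering step of the combination rule can be sketched in Python as a stable sort on a priority table. The table `PRIORITY`, the function `translation_order`, and the (name, kind) tuples are assumptions made for this illustration; only the relative order of the four priority levels comes from the slides.

```python
# Lower number = higher priority = translated earlier.
PRIORITY = {
    "ISA": 0,    # semantic relationship relations and their fragments
    "1-1": 1,
    "1-m": 1,
    "m-1": 2,
    "m-n": 3,
    "n-ary": 3,
}


def translation_order(relationships):
    """Sort (name, kind) pairs so higher-priority translations come first.

    sorted() is stable, so relationships with equal priority keep
    their input order.
    """
    return sorted(relationships, key=lambda rel: PRIORITY[rel[1]])


# The PERSON / STUDENT / DEPT example: the ISA relationship is handled
# before the 1-m relationship STU_DEPT.
relationships = [("STU_DEPT", "1-m"), ("STUDENT ISA PERSON", "ISA")]
order = [name for name, _kind in translation_order(relationships)]
```

Because the ISA relationship is translated first, STUDENT is already a child of PERSON when STU_DEPT is processed, so DEPT is linked through a reference rather than becoming a second parent.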
Enriched Relational Schema to ORA-SS Schema Translation
A possible ORA-SS schema diagram derived from the university database
Object Relations:
  COURSE (CODE, TITLE)
  DEPT (D#, DNAME)
  STUDENT (S#, SNAME)
  TUTORIAL (T#, TUTORIALTITLE)
Fragment of Object Relations:
  HOBBIES (S#, HOBBY)
Relationship Relations:
  STUDENTDEPT (S#, D#)
  C_S (CODE, S#, GRADE)
  ATTEND (CODE, T#, S#)
Fragment of Relationship Relations:
  COURSEMEETING (CODE, S#, MEETINGHISTORY) // fragment of C_S
Algorithm: Mapping ORA-SS Schema Diagram to XML DTD
Input: an ORA-SS schema diagram SD
Output: an XML DTD
Begin
Start from the top of SD and proceed downward. For each object class O encountered do:
Step 1. For the sub-object classes of O, generate <!ELEMENT O (subelementsList)>
Step 2. For each attribute A of O:
  Case (1) A is a single-valued simple attribute: <!ATTLIST O A type>
  Case (2) A is a single-valued composite attribute: replace A with its components and add each to <!ATTLIST O attributeName type>
  Case (3) A is a multivalued simple attribute: <!ELEMENT A (#PCDATA)>
  Case (4) A is a multivalued composite attribute: <!ELEMENT A EMPTY>, with A's components in <!ATTLIST A componentName type>
Step 3. For each relationship attribute A under O, add A to subelementsList in <!ELEMENT O (subelementsList)>.
  Case (1) A is a simple attribute: <!ELEMENT A (#PCDATA)>
  Case (2) A is a composite attribute: <!ELEMENT A EMPTY>, with A's components in <!ATTLIST A componentName type>
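A simplified version of this algorithm can be sketched in Python. The `ObjectClass` dataclass and the `to_dtd` function are assumptions made for this illustration, not the authors' implementation. Only Step 1 and Cases (1) and (3) of Step 2 are covered; composite attributes and relationship attributes are omitted, and multivalued attributes are assumed to appear as starred subelements of their owner, as HOBBIES does in the STUDENT declaration of the resulting DTD.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectClass:
    name: str
    # Single-valued simple attributes as (name, dtd_type, default) triples.
    attributes: list = field(default_factory=list)
    # Names of multivalued simple attributes.
    multivalued: list = field(default_factory=list)
    # Sub-object classes.
    children: list = field(default_factory=list)


def to_dtd(root):
    """Walk the schema top-down and return the DTD declarations as strings."""
    lines = []

    def visit(o):
        # Step 1: sub-object classes (and multivalued attributes) become subelements.
        subelements = [c.name + "*" for c in o.children]
        subelements += [a + "*" for a in o.multivalued]
        content = "(" + ", ".join(subelements) + ")" if subelements else "EMPTY"
        lines.append(f"<!ELEMENT {o.name} {content}>")
        # Step 2, Case (1): single-valued simple attributes go into one ATTLIST.
        if o.attributes:
            decls = " ".join(f"{n} {t} {d}" for n, t, d in o.attributes)
            lines.append(f"<!ATTLIST {o.name} {decls}>")
        # Step 2, Case (3): each multivalued simple attribute is a #PCDATA element.
        for a in o.multivalued:
            lines.append(f"<!ELEMENT {a} (#PCDATA)>")
        for c in o.children:
            visit(c)

    visit(root)
    return lines


student = ObjectClass(
    "STUDENT",
    attributes=[("S#", "ID", "#REQUIRED"), ("SNAME", "CDATA", "#IMPLIED")],
    multivalued=["HOBBIES"],
)
university = ObjectClass("UNIVERSITY", children=[student])
dtd = to_dtd(university)
```

Applied to this small schema, the sketch yields the STUDENT and HOBBIES declarations in the same form as the DTD obtained for the university database.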
Algorithm: Mapping ORA-SS Schema Diagram to XML DTD
The obtained XML structures (DTD):
  <!ELEMENT UNIVERSITY (COURSE*, STUDENT*, DEPT*, TUTORIAL*)>
  <!ELEMENT COURSE (STUDENT1*)>
  <!ATTLIST COURSE CODE ID #REQUIRED TITLE CDATA #IMPLIED>
  <!ELEMENT STUDENT1 (MEETINGHIS*, TUTORIAL1*)>
  <!ATTLIST STUDENT1 C_S_REF IDREF #REQUIRED GRADE CDATA #IMPLIED>
  <!ELEMENT MEETINGHIS (#PCDATA)>
  <!ELEMENT TUTORIAL1 EMPTY>
  <!ATTLIST TUTORIAL1 T_REF IDREF #REQUIRED>
  <!ELEMENT STUDENT (HOBBIES*)>
  <!ATTLIST STUDENT S# ID #REQUIRED SNAME CDATA #IMPLIED>
  <!ELEMENT HOBBIES (#PCDATA)>
  <!ELEMENT DEPT (STUDENT2*)>
  <!ATTLIST DEPT D# ID #REQUIRED DNAME CDATA #IMPLIED>
  <!ELEMENT STUDENT2 EMPTY>
  <!ATTLIST STUDENT2 D_S_REF IDREF #IMPLIED>
  <!ELEMENT TUTORIAL EMPTY>
  <!ATTLIST TUTORIAL T# ID #REQUIRED TUTORIAL_TITLE CDATA #IMPLIED>
5. Conclusion The method proposed in this paper makes possible: • Generation of semantically sound XML structures for relational data • Generation of properly structured XML data without unnecessary redundancies or a proliferation of disconnected XML elements
References
[1] S. Banerjee, et al., “Oracle 8i – The XML Enabled Data Management System”, Proc. 16th Int’l Conf. on Data Engineering, 2000
[2] G. Dobbie, X.Y. Wu, T.W. Ling, M.L. Lee, “ORA-SS: An Object-Relationship-Attribute Model for Semi-structured Data”, TR 21/00, NUS, 2001
[3] D.W. Lee, M. Mani, F. Chiu, W.W. Chu, “Nesting-based Relational-to-XML Schema Translation”, Proc. 4th Int’l Workshop on Web and Databases, 2001
[4] T.W. Ling, M.L. Lee, “Relational to Entity-Relationship Schema Translation Using Semantic and Inclusion Dependencies”, Journal of Integrated Computer-Aided Engineering, pages 125-145, 1995
[5] SYBASE, “Using XML with the Sybase Adaptive Server SQL Databases, A Technical Whitepaper”, http://www.sybase.com, 2000
[6] V. Turau, “Making Legacy Data Accessible for XML Applications”, http://www.informatik.fh-wiesbaden.de/~turau/veroeff.html, 1999
[7] K. Williams, et al., “XML Structures for Existing Databases”, http://www-106.ibm.com/developerworks/library/x-struct/, January 2001
[8] W.Y. Du, M.L. Lee, T.W. Ling, “XML Structures for Relational Data”, Proc. 2nd Int’l Conf. on Web Information Systems Engineering (WISE), IEEE Computer Society, 2001