100 likes | 269 Views
Translating Relational Schemas to XML Schemas. Dongwon Lee, Murali Mani, Frank Chiu, Wesley. W. Chu UCLA. Our goals. Map from relational to XML model automatically. Maintain semantic constraints during the mapping.
E N D
Translating Relational Schemas to XML Schemas Dongwon Lee, Murali Mani, Frank Chiu, Wesley. W. Chu UCLA
Our goals. • Map from relational to XML model automatically. • Maintain semantic constraints during the mapping. • Existing work such as XML Extender (from IBM), db2XML, XML-DBMS, SilkRoute, EXPERANTO rely on user-specified mapping, and do not maintain semantic constraints.
NeT: Nesting-based Translation • Nesting on a single attribute, X C, where C is column set of a table t “groups” the tuples of t with the same value for (C – X). • NeT is especially useful to: • Remove redundancies due to MVDs. • Group on attributes to get a more intuitive schema. • Semantic constraints are maintained.
nestC(t) t nestA(nestC(t))
NeT optimization • Perform minimum number of nesting. • Properties used for optimization: • Idempotency of nesting: we need not perform nesting more than once on any single attribute. • Nesting need not be done on an attribute that does not participate in any one candidate key. • Experimental results: NeT almost always decreases the data size tremendously – for one particular data set, size of the nested table is 6% of the size of the original table.
CoT: Constraint-based Translation: Step 1 • Consider IND s[] t[], where X, Y, is primary key, and is non-nullable. • If is unique, M(t) = (Y, s?), else M(t) = (Y, s*) • M(s) = (X - ) • Key for s is (Ks - ) M(professor) = (Pname, Age, student*) M(student) = (Sname, Course) <professor> <Pname>Muntz</Pname> <Age>60</Age> <student> <Sname>John</Sname> <Course>DB</Course> </student> <student> <Sname>John</Sname> <Course>N/W</Course> </student> </professor> student <professor> <Pname>Chu</Pname> <Age>55</Age> </professor> professor
CoT: Step 2 • Consider tables, s, t1, t2 with column set X, Y1, Y2, and INDs s[] t1[1], and s[] t2 [2], where 1, 2 are primary keys and , are non-nullable • Translate one IND as in Step1, and translate the other to IDREF as: M(t1) = (Y1, s*), M(t2) = (Y2), M(s) = (X - - ), A(t2) = {ID_t2::ID}, A(s) = {Ref_t2::IDREF} <professor> <Pname>Muntz</Pname> <Age>60</Age> <student Ref_course=“DB”> <Sname>John</Sname> </student> <student Ref_course=“N/W”> <Sname>John</Sname> </student> </professor> <course ID_course=“DB”/> <course ID_course=“N/W”/> <professor> <Pname>Chu</Pname> <Age>55</Age> </professor> course
CoT: Step 3 • Consider a relational schema with tables {t1, t2,…,tn} and INDs ti[i] tj[j] • Construct an IND-graph • Identify top-nodes as: • Nodes that do not have any IND are top-nodes • In a strongly connected component formed from table-set, S, if there is no IND from a node in S to a node outside S, then one of the nodes in S must be a top-node. • Perform BFS and translate the IND as in Step 1 or Step 2.
M (course) = (Cid, Title, Room, course*) M(prof) = (student*) M(student) = (Sid, Name) M(emp) = (Eid, Name, dept*, proj*) M(dept) = (Dno) M(proj) = (Pname) A(emp) = {ID_emp::ID, Ref_proj::IDREF} A(prof) = {Ref_emp::IDREF} A(proj) = {ID_proj::ID} IND-Graph
Conclusions • Automatically map from relational to a good XML model. • Maintain semantic constraints. • Remove some of the redundancies that could have been present in the relational model.