Data Integration by Bi-Directional Schema Transformation Rules. By Peter McBrien and Alexandria Poulovassilis Presented by Suman Paladugu. Introduction. A new approach to data integration called both as view (BAV)
Data Integration by Bi-Directional Schema Transformation Rules By Peter McBrien and Alexandria Poulovassilis Presented by Suman Paladugu
Introduction • A new approach to data integration called both-as-view (BAV) • BAV is based on reversible sequences of schema transformations • Global-as-view (GAV) and local-as-view (LAV) view definitions can be derived from BAV schema transformation sequences • BAV supports the evolution of both global and local schemas • The BAV approach is implemented within the AutoMed system
Example Local and Global Schemas • Sg: student(id, name, left#, degree), monitors(sno, id), staff(sno, sname, dept#) • S1: ug(id, name, left#, degree, sno), tutor(sno, sname) • S2: phd(id, name, left#, title), supervises(sno, id), supervisor(sno, sname, dept)
Example Local and Global Schemas • GAV rule G1: student(id, name, left, degree) = {x, y, z, w | (x, y, z, w, _) ∈ ug ∧ ¬∃ (x, _, _, _) ∈ phd ∨ (x, y, z, _) ∈ phd ∧ w = 'phd'}
Example Local and Global Schemas • GAV rule G2: monitors(sno, id) = {x, y | (y, _, _, _, x) ∈ ug ∧ ¬∃ (y, _, _, _) ∈ phd ∨ (x, y) ∈ supervises}
Example Local and Global Schemas • GAV rule G3: staff(sno, sname, dept) = {x, y, z | (x, y) ∈ tutor ∧ ¬∃ (x, _, _) ∈ supervisor ∧ z = null ∨ (x, y, z) ∈ supervisor}
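To make rules G1 to G3 concrete, here is a minimal Python sketch that evaluates them over small, made-up extents. It assumes the reading of the rules in which the phd and supervisor membership tests in the first disjunct are negated (so that each student or staff member is contributed only once) and in which staff taken from tutor get a null dept. The sample tuples and the use of None for null are illustrative assumptions, not data from the paper.

```python
# Hypothetical sample extents for the local schemas S1 and S2 (None stands for null).
ug = {(1, "Ann", None, "bsc", 10)}            # ug(id, name, left, degree, sno)
tutor = {(10, "Tom")}                         # tutor(sno, sname)
phd = {(2, "Bob", None, "Schema evolution")}  # phd(id, name, left, title)
supervises = {(20, 2)}                        # supervises(sno, id)
supervisor = {(20, "Sue", "CS")}              # supervisor(sno, sname, dept)

phd_ids = {row[0] for row in phd}
supervisor_snos = {row[0] for row in supervisor}

# G1: undergraduates who are not PhD students, plus PhD students with degree 'phd'.
student = (
    {(i, n, l, d) for (i, n, l, d, _sno) in ug if i not in phd_ids}
    | {(i, n, l, "phd") for (i, n, l, _title) in phd}
)

# G2: tutor links of non-PhD undergraduates, plus all supervision links.
monitors = (
    {(sno, i) for (i, _n, _l, _d, sno) in ug if i not in phd_ids}
    | supervises
)

# G3: tutors who are not supervisors (dept unknown), plus all supervisors.
staff = (
    {(sno, sname, None) for (sno, sname) in tutor if sno not in supervisor_snos}
    | supervisor
)
```

Each global relation is computed purely from local extents, which is what makes these definitions GAV.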
Example Local and Global Schemas • LAV rule L1: tutor(sno, sname) = {x, y | (x, y, _) ∈ staff ∧ (x, z) ∈ monitors ∧ (z, _, _, w) ∈ student ∧ w ≠ 'phd'}
Example Local and Global Schemas • LAV rule L2: ug(id, name, left, degree, sno) = {x, y, z, w, v | (x, y, z, w) ∈ student ∧ (v, x) ∈ monitors ∧ w ≠ 'phd'}
Example Local and Global Schemas • LAV rule L3: phd(id, name, left, title) = {x, y, z, w | (x, y, z, v) ∈ student ∧ v = 'phd' ∧ w = null}
Example Local and Global Schemas • LAV rule L4: supervises(sno, id) = {x, y | (x, y) ∈ monitors ∧ (y, _, _, z) ∈ student ∧ z = 'phd'}
Example Local and Global Schemas • LAV rule L5: supervisor(sno, sname, dept) = {x, y, z | (x, y, z) ∈ staff ∧ (x, w) ∈ monitors ∧ (w, _, _, v) ∈ student ∧ v = 'phd'}
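The LAV rules L1 to L5 run in the opposite direction: each local relation is defined over the global schema. A minimal Python sketch under assumed sample data follows. It reads the degree tests in L1 and L2 as inequalities (w ≠ 'phd') and joins monitors to student on the student id; the tuples and the use of None for null are hypothetical.

```python
# Hypothetical sample extents for the global schema Sg (None stands for null).
student = {(1, "Ann", None, "bsc"), (2, "Bob", None, "phd")}  # (id, name, left, degree)
monitors = {(10, 1), (20, 2)}                                 # (sno, id)
staff = {(10, "Tom", None), (20, "Sue", "CS")}                # (sno, sname, dept)

degree_of = {i: d for (i, _n, _l, d) in student}

# L1: staff who monitor at least one non-PhD student.
tutor = {
    (sno, sname)
    for (sno, sname, _dept) in staff
    for (m_sno, sid) in monitors
    if m_sno == sno and sid in degree_of and degree_of[sid] != "phd"
}

# L2: non-PhD students together with the staff member who monitors them.
ug = {
    (i, n, l, d, sno)
    for (i, n, l, d) in student
    for (sno, sid) in monitors
    if sid == i and d != "phd"
}

# L3: PhD students; the title is not held in Sg, so it comes back as null.
phd = {(i, n, l, None) for (i, n, l, d) in student if d == "phd"}

# L4: monitoring links whose student is a PhD student.
supervises = {(sno, sid) for (sno, sid) in monitors if degree_of.get(sid) == "phd"}

# L5: staff who monitor at least one PhD student.
supervisor = {
    (sno, sname, dept)
    for (sno, sname, dept) in staff
    for (m_sno, sid) in monitors
    if m_sno == sno and degree_of.get(sid) == "phd"
}
```

Note that L3 cannot recover the thesis title, because Sg does not store it.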
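Evolution Problems of GAV and LAV • GAV does not readily support the evolution of local schemas • In LAV, a change to a local schema affects only the derivation rules defined for that schema • However, LAV has the converse problem: it does not readily support evolution of the global schema, because the rules for every local schema are defined in terms of it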
BAV Integration • Common data model: the hypergraph data model (HDM) • Schemas are incrementally transformed by applying to them a sequence of primitive transformation steps t1, t2, t3, …, tn • Intermediate (and final) schemas may contain constructs of more than one modeling language
BAV Integration (contd.) • Each add or del transformation is accompanied by a query specifying the extent of the new or deleted construct in terms of the rest of the constructs in the schema • This allows automatic translation of data and queries between schemas linked by a transformation pathway, e.g. for global query processing
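Because every step carries its query, a pathway can be reversed mechanically. The following is a minimal Python sketch of that idea, not AutoMed's actual API: the `Step` class, its field names, and the placeholder query strings are all hypothetical. It assumes that an add is undone by a del with the same arguments, and that the extend and contract steps shown on later slides pair up in the same way.

```python
from dataclasses import dataclass

# Each kind of primitive step and the kind that undoes it (assumed pairing).
_INVERSE = {"add": "del", "del": "add", "ext": "con", "con": "ext"}


@dataclass(frozen=True)
class Step:
    kind: str       # one of 'add', 'del', 'ext', 'con'
    construct: str  # the schema construct, e.g. "((student, id))"
    query: str      # query giving the construct's extent (placeholder text here)

    def reverse(self) -> "Step":
        """Return the step that undoes this one, keeping construct and query."""
        return Step(_INVERSE[self.kind], self.construct, self.query)


def reverse_pathway(steps):
    """Reverse a pathway: invert each step and apply them in the opposite order."""
    return [step.reverse() for step in reversed(steps)]


# Hypothetical two-step pathway fragment.
pathway = [
    Step("add", "((student, id))", "q1"),
    Step("del", "((ug, id))", "q2"),
]
backwards = reverse_pathway(pathway)
```

Reversing twice gives back the original pathway, which is the property that lets one specification serve both directions.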
Example: A Simple Relational Model • k1, k2, k3, …, kn (n ≥ 1) are the primary key attributes of a relation R • a1, a2, a3, …, am (m ≥ 0) are its non-primary key attributes
Primitive Transformations of this Model • addRel(((R, k1, k2, k3, …, kn)), q) adds to the schema a new relation R • addAtt(((R, a)), c, q) adds to the schema a non-primary key attribute a for relation R, where c is a constraint such as notnull • delRel(((R, k1, k2, k3, …, kn)), q) deletes relation R • delAtt(((R, a)), c, q) deletes attribute a of relation R
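Primitive Transformations of this Model (contd.) • extRel(((R, k1, k2, k3, …, kn)), q) and extAtt(((R, a)), c, q) are the 'extend' counterparts of addRel and addAtt • conRel(((R, k1, k2, k3, …, kn)), q) and conAtt(((R, a)), c, q) are the 'contract' counterparts of delRel and delAtt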
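As a rough illustration of how such primitives act on a schema, the sketch below models a relational schema as a Python dictionary and implements the structural effect of addRel, addAtt, delAtt and delRel. The function names, the dictionary layout, and the rule that a relation's non-key attributes must be deleted before the relation itself are assumptions of this sketch; the queries q and constraints c are omitted entirely.

```python
def add_rel(schema, rel, keys):
    """addRel: return a new schema containing relation `rel` with its key attributes."""
    if rel in schema:
        raise ValueError(f"relation {rel!r} already exists")
    return {**schema, rel: {"keys": tuple(keys), "atts": ()}}


def add_att(schema, rel, att):
    """addAtt: return a new schema with non-key attribute `att` added to `rel`."""
    entry = schema[rel]
    if att in entry["keys"] or att in entry["atts"]:
        raise ValueError(f"attribute {att!r} already exists in {rel!r}")
    return {**schema, rel: {"keys": entry["keys"], "atts": entry["atts"] + (att,)}}


def del_att(schema, rel, att):
    """delAtt: return a new schema with non-key attribute `att` removed from `rel`."""
    entry = schema[rel]
    if att not in entry["atts"]:
        raise ValueError(f"{att!r} is not a non-key attribute of {rel!r}")
    remaining = tuple(a for a in entry["atts"] if a != att)
    return {**schema, rel: {"keys": entry["keys"], "atts": remaining}}


def del_rel(schema, rel):
    """delRel: return a new schema without `rel` (assumed to have no non-key attributes left)."""
    if schema[rel]["atts"]:
        raise ValueError(f"delete the non-key attributes of {rel!r} first")
    return {name: entry for name, entry in schema.items() if name != rel}


# Build the student relation of Sg step by step, then undo the steps in reverse order.
built = add_rel({}, "student", ["id"])
for att in ("name", "left", "degree"):
    built = add_att(built, "student", att)

undone = built
for att in ("degree", "left", "name"):
    undone = del_att(undone, "student", att)
undone = del_rel(undone, "student")
```

Every function returns a new dictionary rather than mutating its input, so each intermediate schema along a pathway remains available.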
BAV integration of S1 and S2 into Sg : ‘delete’ and ‘contract’ Steps
Correspondence between GAV/LAV • The ‘add’ steps correspond to GAV since global schema constructs are being defined in terms of local ones • The ‘del’ and ‘con’ steps correspond to LAV since local schema constructs are being defined in terms of global ones
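This correspondence suggests a simple way to separate the two kinds of information held in one pathway. The sketch below is illustrative only: steps are represented as hypothetical (kind, construct) pairs, and ext steps are grouped with add steps, as the later slide on deriving GAV from BAV does.

```python
def split_pathway(steps):
    """Partition pathway steps into those defining global constructs (GAV-like)
    and those defining local constructs (LAV-like)."""
    gav_like = [s for s in steps if s[0] in ("add", "ext")]
    lav_like = [s for s in steps if s[0] in ("del", "con")]
    return gav_like, lav_like


# Hypothetical pathway fragment; construct names follow the example schemas.
steps = [
    ("add", "((student, id))"),
    ("add", "((student, name))"),
    ("del", "((ug, name))"),
    ("con", "((phd, title))"),
]
gav_like, lav_like = split_pathway(steps)
```

The relative order of steps is preserved within each group, which matters because later steps may refer to constructs introduced by earlier ones.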
Correspondence between BAV and GAV/LAV • A GAV or LAV definition can be converted into a partial BAV definition • A complete GAV or LAV definition can be derived from a BAV definition • BAV thus combines the benefits of GAV and LAV, in the sense that any reasoning or processing which is possible with the view definitions of GAV or LAV will also be possible with the BAV definition
Deriving BAV from GAV • A GAV definition contains only some of the information present in a BAV definition, so the derived BAV definition is partial • First, a decomposition rule is applied to each GAV rule: G1 generates steps 1–4, G2 generates step 8, and G3 generates steps 5–7 • Second, each construct c of type T in the source schemas is removed using a transformation step of the form conT(c, void): • conAtt(((tutor, sname)), notnull, void) • conAtt(((phd, title)), notnull, void) • conRel(((phd, id)), void)
Deriving BAV from LAV • A LAV definition likewise contains only some of the information present in a BAV definition: rules L1 to L5 generate the reverse of transformation steps 23–9 • All the BAV transformation steps generated must be 'extend' rather than 'add' ones • extRel(((phd, id)), {x | x ∈ ((student, id)) ∧ (x, 'phd') ∈ ((student, degree))}) • extAtt(((tutor, sname)), notnull, {x, y | (x, y) ∈ ((staff, sname)) ∧ x ∈ ((tutor, sno))})
Deriving GAV from BAV • Take the subset, G, of the add and ext steps in the transformation sequence from S1 ∪ S2 ∪ … to Sg • Take each addRel/extRel step in G, together with all addAtt/extAtt steps for the same relation • Form a join of the schemes ((R, a1)), …, ((R, am)) to restore relation R
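The final join can be sketched in a few lines of Python. This is an illustration under simplifying assumptions rather than the paper's algorithm: the relation has a single key attribute, each attribute scheme's extent is a set of (key, value) pairs, and the join is an inner join, so keys lacking a value for some attribute are dropped (nullable attributes may call for an outer join instead). The function name and sample extents are hypothetical.

```python
def restore_relation(key_extent, att_extents):
    """Join the key extent with each attribute extent on the key value.

    key_extent: set of key values for ((R, k)).
    att_extents: list of sets of (key, value) pairs, one per ((R, a_i)).
    Returns a list of tuples (k, a_1, ..., a_m), sorted by key.
    """
    lookups = [dict(extent) for extent in att_extents]
    rows = []
    for k in sorted(key_extent):
        # Inner join: keep k only if every attribute has a value for it.
        if all(k in lookup for lookup in lookups):
            rows.append((k, *(lookup[k] for lookup in lookups)))
    return rows


# Hypothetical extents for ((student, id)), ((student, name)), ((student, degree)).
ids = {1, 2, 3}
names = {(1, "Ann"), (2, "Bob")}
degrees = {(1, "bsc"), (2, "phd"), (3, "msc")}
restored = restore_relation(ids, [names, degrees])
```

Student 3 is dropped here because no name was supplied for it, which shows the effect of choosing an inner join.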
Deriving LAV from BAV • Take the subset, L, of the del and con steps on constructs of Si in the transformation sequence from S1 ∪ S2 ∪ … to Sg • Construction of the LAV view definitions from L proceeds in a similar fashion to the construction of GAV view definitions • E.g. the steps forming L for schema Si are 9–15 above • Rule L1 can then be derived from steps 9–10 and rule L2 from steps 11–15
BAV Support for Global Schema Evolution • If a global schema S evolves to a new schema S', the evolution is specified as a transformation pathway S → S' • Each step t in the pathway falls into one of three cases: 1. If t is an add or del, then S' is semantically equivalent to S. 2. If t is a contract, then some information that used to be present in S is no longer available from S'. 3. If t is an extend, then domain knowledge is required to determine whether the new construct in S' can in fact be completely derived from the local data sources
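The three cases amount to a small decision table keyed on the kind of step. A minimal Python sketch follows; the function name and the returned labels are hypothetical shorthand for the cases listed above.

```python
def global_evolution_effect(kind):
    """Classify the effect of one evolution step t on the global schema."""
    if kind in ("add", "del"):
        # Case 1: the new schema is semantically equivalent to the old one.
        return "equivalent"
    if kind == "con":
        # Case 2: some information is no longer available from the new schema.
        return "information lost"
    if kind == "ext":
        # Case 3: a human must decide whether local sources can derive the construct.
        return "needs domain knowledge"
    raise ValueError(f"unknown step kind: {kind!r}")


# Effects for a hypothetical evolution pathway of three steps.
effects = [global_evolution_effect(k) for k in ("add", "con", "ext")]
```

Only the last case cannot be handled automatically, which is why extend steps are the ones singled out for manual review.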
BAV Support for Local Schema Evolution • Slightly more complex than global schema evolution • Suppose that some local schema S evolves to S'. The evolution is again defined as a transformation pathway S → S' • Each transformation step t in this pathway is again considered in turn • As with global schema evolution, domain knowledge is required only if t is an extend
Conclusions • GAV and LAV views can be derived from a BAV specification • BAV thus combines the benefits of GAV and LAV, in that any reasoning or processing which is possible with GAV or LAV view definitions will also be possible with a BAV specification • A key advantage of BAV is that it readily supports the evolution of both local and global schemas, allowing transformation pathways and schemas to be incrementally modified