920 likes | 1.09k Views
Dagstuhl Seminar January 17, 2011 - Bidirectional Transformations. Transformations in Database Engineering Jean-Luc Hainaut, Anthony Cleve. Objectives of the tutorial to presents some important aspects of transformations in DB in an informal and intuitive way
E N D
Dagstuhl Seminar January 17, 2011 - Bidirectional Transformations Transformations in Database EngineeringJean-Luc Hainaut, Anthony Cleve
Objectives of the tutorial to presents some important aspects of transformations in DB in an informal and intuitive way with practical applications in mind mainly (but not exclusively) for non-DB communities
1. The context of databases 2. The concept of transformation in the DB context 3. Elementary and complex transformations 4. Representation of schemas in DB Engineering 5. About property preservation 6. Applications of transformations 7. Challenges Contents
What is a Database? The structured collection of the data necessary to keep the memory of an organization (structures, rules and facts) to act as a reliable and efficient data server for an application system data 1. The context of databases - Concepts, terms, issues Client program schema Information/data structures are known (a.o., for processing) through schema(s)
The schemas and models of a database Schemas in DBMS's (1973-2011) conceptual schema = uses interface = mapping data = instance of 1. The context of databases - Concepts, terms, issues Client program view n view 2 view 1 logical schema physical schema
The schemas and models of a database Standard DB design methodologies (1974-2011) Users requirements 1. The context of databases - Concepts, terms, issues Conceptual design conceptual schema Logical design logical schema Physical design View design physical schema view n Coding view 2 DDL code view 1
transformations OUVRAGE Numéro Titre AUTEUR Editeur écrit 0-N 1-N Nom Date 1re parution Prénom Mot clé[0-10] Présentation[0-1] id: Numéro 0-N de 1-1 EXEMPLAIRE Num série Date acquisition Localisation Etage Rayon Travée conceptual schema Etat[0-1] id: de.OUVRAGE Num série Logical schema (relational) physical schema (Oracle 11) 1. The context of databases - Concepts, terms, issues Un ouvrage est une oeuvre littéraire publiée. Il est caractérisé par son numéro identifiant, son titre, son éditeur, sa date de première parution, ses mots-clés (10 au maximum), une brève note de présentation (ces notes sont en cours de constitution), le nom et le prénom de ses auteurs. A un ouvrage correspondent un certain nombre d'exemplaires, qui en sont la matérialisation physique. ... Database design create database BIB create dbspace BIB_DATA; create table OUVRAGE ( NUMERO char(18) not null, TITRE varchar(60) not null, EDITEUR char(32) not null, DATE_1RE_PARUTION date not null, PRESENTATION varchar(255), primary key (NUMERO)) in BIB_DATA; . . . alter table EXEMPLAIRE add constraint FKDE foreign key (NUMERO)references OUVRAGE; . . . create unique index IDOUVRAGE on OUVRAGE (NUMERO); . . .
in UML: a meta-model is a formal system of abstract constructs that can be used to describe any situation pertaining to a modeling domain; the notation is an integral part of a meta-model; a model is an artefact made up of instances of constructs of a meta-model, and that specifies the structures of a definite situation of an application domain in the Database realm: a model is a formal system of abstract constructs that can be used to describe any situation pertaining to a modeling domain; can be given several notations; a schema is an artefact using the constructs of a model, and that specifies the structures of one definite situation of an application domain Examples: the relational model the Entity-relationship model the relational schema of the DAGSTUHL-ORG database. 1. The context of databases - Concepts, terms, issues The schemas and models of a database
Seen as an evolving bag of facts modelled by is a domain of comply with philosophy a way to see the world fact classes describes describes describes describes instance of instance of instance of schema meta-schema meta-meta-schema expressed into expressed into expressed into real world system instance of instance of meta- meta-model model meta-model data 1. The context of databases - Concepts, terms, issues The schemas and models of a database
1. The context of databases - Concepts, terms, issues The schemas and models of a database Abstraction levels and paradigms abstraction levels Paradigms (aka "data model") conceptual ER; EER; OO (UML; etc.); ORM; Bachman; RDF; . . . logical/view relational; OO (UML; etc.); object-relational; XML DTD; XML Schema; standard file; network; hierarchical; . . . physical Oracle 11g; Oracle 8; DB2 9.7; MySQL 5.5; IDS2; IMS; . . . code Oracle 11g SQL-DDL; IDS2 DDL; . . .
DB Analysis and Design across abstraction levels (from abstract to concrete) and modelling paradigms DB Reverse Engineering across abstraction levels (from concrete to abstract) and modelling paradigms DB Evolution same abstraction level - same modelling paradigm DB Migration same abstraction level - change of modelling paradigm others : refactoring, integration, view derivation, ETL, . . . several abstraction levels - several modelling paradigms 1. The context of databases - Engineering processes Database engineering processes
1. The context of databases - Engineering processes Database engineering processes DB Analysis and Design ER EER OO ORM conceptual relational OO Obj-relat network logical/view Oracle 8 Oracle 11g DB2 9.7 IDS2 physical Oracle 8 DDL Oracle 11g DDL DB2 9.7 DDL IDS2 DDL code
1. The context of databases - Engineering processes Database engineering processes DB Reverse Engineering ER EER OO ORM conceptual relational OO Obj-relat network logical/view Oracle 8 Oracle 11g DB2 9.7 IDS2 physical Oracle 8 DDL Oracle 11g DDL DB2 9.7 DDL IDS2 DDL code
1. The context of databases - Engineering processes Database engineering processes DB Evolution ER EER OO ORM conceptual relational OO Obj-relat network logical/view Oracle 8 Oracle 11g DB2 9.7 IDS2 physical Oracle 8 DDL Oracle 11g DDL DB2 9.7 DDL IDS2 DDL code Not recommended
1. The context of databases - Engineering processes Database engineering processes DB Migration ER EER OO ORM conceptual relational OO Obj-relat network logical/view Oracle 8 Oracle 11g DB2 9.7 IDS2 physical Oracle 8 DDL Oracle 11g DDL DB2 9.7 DDL IDS2 DDL code Not recommended
DDL code = DB-design(Users Requirements) DB-design = Coding o PhysDoLogD o ConcD Conceptual schema = ConcD(Users Requirements) Logical schema = LogD(Conceptual schema) Physical schema = PhysD(Logical schema) DDL code = Coding(Physical schema) ConcD = Analysis o NormalisationoIntegration etc. 2. The concept of transformation in the DB context DB engineering process modelling Most engineering processes are artefact transformations Users requirements Conceptual design Conceptual schema Logical design Logical schema Physical design Physical schema Coding DDL code
2. The concept of transformation in the DB context DB engineering process modelling An example (relational logical design)
2. The concept of transformation in the DB context DB engineering process modelling An example (transforming multivalued attributes)
2. The concept of transformation in the DB context DB engineering process modelling An example (transforming many-to-many relationship types)
2. The concept of transformation in the DB context DB engineering process modelling An example (transforming one-to-many relationship types)
A transformation T replaces a construct C in a schema S1 with another construct C', leading to schema S2 T schemas S2 S1 C C' 2. The concept of transformation in the DB context The concept of transformation
If the schema describes actual data, the transformation should also tell how to convert the data (t) ... 2. The concept of transformation in the DB context The concept of transformation T schemas S2 S1 C C' t data c' c
A transformation S is defined by two mappings T and t S= <T,t> T: structural mapping = syntax of S t: instance mapping = semantics of S T C C' = T(C) inst_of inst_of t c c' = t(c) 2. The concept of transformation in the DB context The concept of transformation - Definition
Mapping T can be specified with two predicates: P: minimal pre-condition Q: maximal post-condition S= <T,t> = <P,Q,t> 2. The concept of transformation in the DB context The concept of transformation - Definition
Expressing structural predicates P and Q • entity-type(E) there exists an entity type with name E 2. The concept of transformation in the DB context Specifying a transformation Value-based (more concise, a name denotes an object) Object-based (more general, a name is a property of an object) • entity-type(e) there exists an entity type denoted by e • name(e,E) the name of e is E must allow specification and reasoning (e.g., FOL, DL)
2. The concept of transformation in the DB context Specifying a transformation • Expressing structural predicates P and Q • entity-type(E) there exists an entity type with name E • attribute(O,A,m,M,T) object (with name) O has an attribute with name A, cardinality m-M and type T • id(O,Cp) object (with name) O has an identifier comprising components Cp • rel-type(R) there exists a rel-type with name R • role(R,r,E,m,M) rel-type R has a role with name r, played by E, with cardinality m-M
entity-type(CUSTOMER) attribute(CUSTOMER,Cust#,1,1,integer) attribute(CUSTOMER,Name,1,1,string) attribute(CUSTOMER,Phone,0,5,string) id(CUSTOMER,{Cust#}) = 2. The concept of transformation in the DB context Specifying a transformation • Expressing structural predicates P and Q
2. The concept of transformation in the DB context Specifying a transformation P Q P = entity-type(CUSTOMER) attribute(CUSTOMER,Cust#,1,1,integer) attribute(CUSTOMER,Name,1,1,string) attribute(CUSTOMER,Phone,0,5,string) id(CUSTOMER,{Cust#}) Q = entity-type(CUSTOMER) attribute(CUSTOMER,Cust#,1,1,integer) attribute(CUSTOMER,Name,1,1,string) id(CUSTOMER,{Cust#}) entity-type(PHONE) attribute(PHONE,Phone,1,1,string) id(PHONE,{Phone}) rel-type(has) role(has,,CUSTOMER,0,5) role(has,,PHONE,1,N) = = CUSTOMER PHONE Cust# Phone Name id: Phone id: Cust# has 0-5 1-N
From now on: CUSTOMER PHONE Cust# Phone Name id: Phone id: Cust# has 0-5 1-N 2. The concept of transformation in the DB context Specifying a transformation P Q
S2= S1iff C: P1(C) C = T2(T1(C)) CUSTOMER PHONE Cust# Phone Name id: Phone id: Cust# has 0-5 1-N 2. The concept of transformation in the DB context Inverse transformations -1 T1 T2 Intuitively, S2undoes the effect of S1 at the structural level mappingt ignored
A transformation can ... augment the information contents of the schema decrease the information contents of the schema preserve the information contents of the schema more complex patterns exist 2. The concept of transformation in the DB context Reversible transformations
2. The concept of transformation in the DB context Reversible transformations • Transformation S1 is reversible if it preserves • the information contents of the source schema reversible= semantics preserving mappingt involved
A transformation can be ... not reversible: not semantics-preserving reversible: "one-way" semantics-preserving symmetrically reversible: fully semantics-preserving 2. The concept of transformation in the DB context Reversible transformations
Examples P: R(A,B,C); A B|C Q: R1(A,B); R2(A,C); R1[A] = R2[C]; P: R(A,B,C); A B|C reversible (Fagin's theorem) Q: R1(A,B); R2(A,C); symmetrically reversible 2. The concept of transformation in the DB context Reversible transformations P: R(A,B,C); not reversible Q: R1(A,B); R2(A,C);
A transformation is reversible if there is an inverse mapping for instances as well S1 is reversibleiff S2 = S1 : C: P(C) C = T2(T1(C)) c inst(C): c = t2(t1(c)) 2. The concept of transformation in the DB context Reversible transformations -1
S is symmetrically reversibleiffboth S and S are reversible S= <P,Q,t>S= <Q,P,t-1> SR-transformations are the most desirable operators in analysis, design, reverse engineering,migration, refactoring, and (partially) evolution processes 2. The concept of transformation in the DB context Symmetrically reversible transformations -1 -1
3. Elementary and complex transformations Elementary : cannot be decomposed into smaller SR-transformations Complex : can be decomposed into (more) elementary SR-transformations
DOCUMENT DocID Title BOOK Date-Published ISBN Keyword[0-10] Publisher id: DocID DOCUMENT id: ISBN COPY BOOK DocID AUTHOR ISBN ISBN Title Name 0-N Serial-No Publisher written 0-N 0-N Date-Published First-Name Date-Acquired id: ISBN Keyword[0-10] Origin of id: ISBN id: DocID Serial-No 1-1 ref: ISBN COPY Serial-No Date-Acquired id: of.BOOK Serial-No DOCUMENT DocID AUTHOR Title Name Date-Published First-Name Keyword[0-10] Origin DOCUMENT id: DocID DocID KEYWORD 0-N Title describe Keyword 0-10 1-N 0-N Date-Published id: Keyword WRITTEN id: DocID doc by 1-1 1-1 id: doc.DOCUMENT by.AUTHOR 3. Elementary and complex transformations Elementary transformations
Elementary transformations are building blocks for more complex operators challenge: Developing higher-level SR transformations with elementary SR-transformations 3. Elementary and complex transformations Elementary and complex SR-transformations
Three classes of complex SR-transformations compound transformations predicate-driven transformations model-driven transformations 3. Elementary and complex transformations
Compound transformations The composition of two transformations is a transformation The composition of two SR-transformations is an SR-transformation S1= <T1, t1> S2= <T2, t2> S12= S2oS1= <T2oT1, t2ot1> 3. Elementary and complex transformations
Compound transformations known known known known 3. Elementary and complex transformations new!
Transformations that apply to a set of qualified objects in the current schema S (p) whereS is a transformation p is a structural predicate interpretation: apply S to all the objects that satisfy p 3. Elementary and complex transformations Predicate-driven (conditional) transformations
We need a language for p structural (e.g., DL): complex and leading to huge expressions ad hoc : expressive, concise, parametric, but not generic, not closed ROLE_per_RT(I J): the number of roles of the current rel-type is between I and J ONE_ROLE_per_RT(1 2): the number of "one" roles (with cardinality ?-1) is between I and J MAX_CARD_of_ATT(I J): the maximum cardinality of the current attribute is between I and J DEPTH_of_ATT(I J): the level of the current attribute is between I and J 3. Elementary and complex transformations Predicate-driven (conditional) transformations
S (p) RT_into_ET(ROLE_per_RT(3 N)): transform all rel-types into an entity type (if they have at least 3 roles) RT_into_REF(ROLE_per_RT(2 2) and ONE_ROLE_per_RT(1 2)): transform all rel-types into referential attributes (if they are binary and one-to-many or one-to-one) INSTANTIATE(MAX_CARD_of_ATT(2 4)): instanciate amm attributes (if they are "slightly" multivalued: from 2 to 4values) ATT_into_ET_VAL(DEPTH_of_ATT(1 1) and MAX_CARD_of_ATT(5 N)): transform all attributes into an entity type (if they are at the top level and they are "strongly" multivalued: at least 5 values) 3. Elementary and complex transformations Predicate-driven (conditional) transformations
Goal: considering schema S1 in model M1, transform S1 into S2 that complies with model M2. Of course, as far as possible through SR-transformations! Example: considering the Entity-relationship schema S1, transform S1 into S2 that complies with the relational model. Of course, as far as possible without information loss! Structure: a compound transformation comprising predicate-driven trans-formations. Practical form: a transformation plan. 3. Elementary and complex transformations Model-driven transformation
Model-driven transformation Building principles: 1. Identify the constructs of M1 that violate M2 (called invalid) 2. For each invalid construct C, apply a transformation <T,t> = <P,Q,t> such that P(C) and T(C) satisfies M2 Things may be a bit more complex, requiring a compound transformation. Example: processing N-ary rel-types for relational compliance requires two successive transformations 3. Elementary and complex transformations