Model-independent solutions to model management problems
Francesca Bugiotti, Università Roma Tre
Model management
• What it is
  • A systematic approach to metadata management, which handles schemas by means of a set of predefined operators.
• Its goals
  • Enhance the productivity of software developers by offering them techniques that allow for high-level specifications and abstraction over recurring tasks involving the manipulation of schemas.
Model management
• Model management systems
  • Handle schemas and mappings and support a wide variety of operations on them.
• MIDST
  • We propose MIDST [1,2,3], a platform originally conceived for model-independent schema and data translation, as the basis for building a model management system.
  • The resulting model management system aims to be both model-independent and model-aware.
What model management addresses
• Concrete needs: model management problems are a formalization of concrete and frequent database maintenance problems:
  • data integration over heterogeneous databases
  • data exchange between independent databases
  • ETL
  • wrapper generation for access to relational databases from object-oriented applications
  • web site generation from databases.
What model management addresses
• Model management solutions to formalized problems:
  • schema integration
  • schema evolution
  • forward engineering
  • round-trip engineering
  • …
Schema integration (figure: schemas S1, S2 and S3, with mappings map12 and map23)
Forward engineering (figure: V1, V2, S1 and S2, with mappings map1 and map2)
Round-trip engineering (figure: S1, S2, I1 and I2, with mappings map1 and map2)
Solving model management problems
• Solutions to model management problems are given in terms of scripts.
• A script is a set of model management operators that are executed according to a specific control flow.
Operators
• The operators involved in the script specifications are:
  • Match
  • Diff
  • Merge
  • Compose
  • Modelgen
  • Copy
  • …
Match
• Given two schemas S1 and S2, we define map12 = MATCH(S1, S2), where MATCH is the operator that identifies correspondences between the two schemas and hence yields a possible mapping.
• There are several algorithms implementing the MATCH operator.
Match (figure: schemas S1 and S2, with elements labelled A to E; Match(S1, S2) = ?)
Match (figure: the same schemas S1 and S2; Match(S1, S2) = map12)
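As a rough illustration of MATCH, the sketch below pairs the elements of two schemas whose names coincide, ignoring case. Representing a schema as a list of element names, the function name `match_by_name` and the sample schemas are all assumptions made for this example; actual MATCH algorithms rely on much richer evidence than names.

```python
def match_by_name(s1, s2):
    """Return candidate correspondences between two schemas.

    Each schema is assumed to be a list of element names; two elements
    correspond when their names are equal ignoring case.
    """
    return [(a, b) for a in s1 for b in s2 if a.lower() == b.lower()]


# Hypothetical schemas, not the ones drawn on the slide.
s1 = ["A", "B", "C"]
s2 = ["a", "B", "F"]
map12 = match_by_name(s1, s2)  # [("A", "a"), ("B", "B")]
```

The result is only a possible mapping: a different matcher over the same schemas could legitimately propose different correspondences.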
Diff
• Given two schemas S and S1, the difference diff(S, S1) is a schema S2 that contains all the schema elements of S that do not appear in S1.
• It can be interpreted as a set-oriented difference.
Example (figure: schemas S and S1, with elements labelled A to E; Diff(S, S1) = ?)
Example (figure: the same schemas S and S1; Diff(S, S1) = S2)
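The set-oriented reading of DIFF can be sketched as follows. Schemas are assumed to be lists of element names, and the `equivalent` parameter (plain equality by default) is a stand-in for whatever notion of construct equivalence is adopted; both are assumptions of this example, as are the sample schemas.

```python
def diff(s, s1, equivalent=lambda a, b: a == b):
    """Return the elements of s that have no equivalent element in s1."""
    return [e for e in s if not any(equivalent(e, f) for f in s1)]


# Hypothetical schemas, not the ones drawn on the slide.
s = ["A", "B", "C", "D", "E"]
s1 = ["A", "B"]
s2 = diff(s, s1)  # ["C", "D", "E"]
```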
Merge
• Given S and S1, their merge merge(S, S1) is a schema S2 that contains the schema elements that appear in at least one of S or S1, modulo equivalence.
• It can be interpreted as a set-oriented union.
Example (figure: schemas S and S1, with elements labelled A to F; Merge(S, S1) = ?)
Example (figure: schemas S1 and S2 merged into S3; Merge(S1, S2) = S3)
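A minimal sketch of MERGE as a union modulo equivalence is given below. As before, schemas are assumed to be lists of element names and `equivalent` is a placeholder for the chosen equivalence test; the names and sample data are hypothetical.

```python
def merge(s, s1, equivalent=lambda a, b: a == b):
    """Return the elements of s, followed by every element of s1
    that has no equivalent element already in the result."""
    result = list(s)
    for e in s1:
        if not any(equivalent(e, r) for r in result):
            result.append(e)
    return result


# Hypothetical schemas, not the ones drawn on the slide.
s = ["A", "B", "C", "D", "E"]
s1 = ["A", "B", "F"]
s2 = merge(s, s1)  # ["A", "B", "C", "D", "E", "F"]
```

Elements that are equivalent in the two inputs (here A and B) appear only once in the result, which is what "modulo equivalence" asks for.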
Compose
• Given three schemas S1, S2 and S3 and two mappings, map12 between S1 and S2 and map23 between S2 and S3, we define map13, the composition of map12 and map23, as the resulting mapping between S1 and S3:
  Compose(S1, S2, S3, map12, map23) = map13
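If mappings are simplified to sets of element correspondences, COMPOSE reduces to relational composition, as in the sketch below. This representation is an assumption made for illustration (real mappings can be arbitrary expressions), and the element names are hypothetical.

```python
def compose(map12, map23):
    """Compose two mappings given as sets of (source, target) pairs.

    A pair (a, c) is in the result when some element b of the middle
    schema is the target of a in map12 and the source of c in map23.
    """
    return {(a, c) for (a, b) in map12 for (b2, c) in map23 if b == b2}


# Hypothetical correspondences between elements of S1, S2 and S3.
map12 = {("s1.A", "s2.A"), ("s1.B", "s2.B")}
map23 = {("s2.A", "s3.A"), ("s2.C", "s3.C")}
map13 = compose(map12, map23)  # {("s1.A", "s3.A")}
```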
Modelgen
• Given a schema S of a source model M and a target model M1, the translation modelgen(S, M1) is a schema S1 of M1 that corresponds to S.
Modelgen (figure: schema S, with M = ER model and M1 = relational model; Modelgen(S, M1) = ?)
Example (figure: schemas S and S1; Modelgen(S, M1) = S1)
Operators
• A major goal is to provide model-independent operators, which guarantee some kind of model closure property.
• We start from a simplified version of Bernstein's solving procedure for the round-trip engineering problem [4], in order to introduce the operators needed and explain how they are implemented in a model-independent fashion.
Round-trip engineering
• One of the most meaningful model management problems.
• Let us take it as an example to illustrate our approach to model management problems.
• S1: the specification schema
• I1: an implementation schema obtained from S1
• I2: a modified version of the implementation
• S2: a new specification which corresponds to I2
(figure: schemas S1, S2, I1 and I2)
Round-trip engineering
S1 is the specification schema, which is translated into its corresponding implementation schema I1. This is a common scenario, in which the specification is expressed in ER and the implementation is relational. The translation might be performed using MIDST itself, since it was conceived as an implementation of the MODELGEN operator.
S1 (ER diagram): entities Project (PCode, Title) and Manager (SSN, EID, Name), linked by a relationship with cardinalities (0,N) and (1,1)
I1:
  Project (PCode, Title, MGRSSN*)
  Manager (SSN, EID, Name)
Round-trip engineering
I2 is the implementation schema, a modified version of I1. The transformation changes the key of a referenced relation: the key of Manager, which is referenced by MGRSSN of Project in I1, becomes EID in I2. As a consequence, the column MGRSSN of Project, which referenced SSN of Manager, now has to reference EID. MGRID is the version of MGRSSN modified accordingly.
I1:
  Project (PCode, Title, MGRSSN*)
  Manager (SSN, EID, Name)
I2:
  Project (PCode, Title, MGRID*)
  Manager (SSN, EID, Name, Degree)
Round-trip engineering
Our goal is to generate S2, the appropriately revised version of the specification schema, such that its corresponding implementation is I2.
I2:
  Project (PCode, Title, MGRID*)
  Manager (SSN, EID, Name, Degree)
S2 (ER): ?
Operators in scripts
• The solution provided for the round-trip engineering problem is based on a set of model management operators: DIFF, MERGE and MODELGEN.
• DIFF and MERGE are used to compute the difference and the union of schemas.
• MODELGEN is used to translate the specification schema into the implementation and to compute the reversed differences.
The round-trip solving script (figure)
MIDST and MODELGEN
• The MIDST platform was originally conceived as a framework for performing model-independent schema and data translations.
• MIDST was designed as a model-generic implementation of MODELGEN.
Translations (figure: translations among the models Entity Relationship, Object Oriented, Object Relational, Relational, XSD and WSM)
The metamodel approach
• The constructs in the various models are rather similar:
  • They can be classified into a few categories (“metaconstructs”).
  • For example, the Entity of the ER model and the Object of the OO model can be traced back to the same abstract concept, the “Abstract” of our supermodel.
The supermodel
• A model that includes all the metaconstructs (in their most general forms).
• Each model is subsumed by the supermodel (modulo construct renaming).
• Each schema for any model is also a schema for the supermodel (modulo construct renaming).
Translation specification
• Translations can be defined on metaconstructs.
• There are standard, accepted ways to deal with the translation of metaconstructs.
• Translations can be performed within the supermodel.
• Each translation from the supermodel SM to a target model M is also a translation from any other model to M.
Translation specification
• Datalog is used to specify translations.
• A translation script in our tool is a set of Datalog rules.
Datalog
• Declarative language.
• We specify the condition for the insertion.
• For every set of constructs that matches the conditions in B, we create a new construct A:
  A <- B
Datalog rule example
• We generate a new Abstract for each Aggregation:
  Abstract ( OID: SK1(oid), Name: name )
  <-
  Aggregation ( OID: oid, Name: name );
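For readers less used to Datalog, the effect of this rule can be restated procedurally. In the sketch below, constructs are assumed to be Python dictionaries and the Skolem function SK1 is modelled as a function that wraps the source OID in a tuple, so that the same input always yields the same new identifier. These representations and the sample data are assumptions of the example, not the way MIDST stores constructs.

```python
def sk1(oid):
    """Skolem function: derive a new, reproducible OID from an existing one."""
    return ("SK1", oid)


def abstracts_from_aggregations(aggregations):
    """Apply the rule
    Abstract(OID: SK1(oid), Name: name) <- Aggregation(OID: oid, Name: name)
    to every Aggregation, producing one Abstract per match."""
    return [{"OID": sk1(agg["OID"]), "Name": agg["Name"]} for agg in aggregations]


# Hypothetical input constructs.
aggregations = [{"OID": 1, "Name": "Project"}, {"OID": 2, "Name": "Manager"}]
abstracts = abstracts_from_aggregations(aggregations)
```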
Another rule
• We copy only the Lexicals of Aggregations:
  Lexical ( OID: SK1(oid), aggregationOID: SK2(aggOID), Name: name, isIdentifier: isId, isNullable: isN, isOptional: isO, type: t )
  <-
  Lexical ( OID: oid, aggregationOID: aggOID, Name: name, isIdentifier: isId, isNullable: isN, isOptional: isO, type: t ),
  Aggregation ( OID: aggOID );
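The body of this rule joins each Lexical with the Aggregation it belongs to, so Lexicals owned by other constructs are left out. A possible procedural reading is sketched below; the dictionary representation, the `skolem` helper and the sample data are assumptions made for illustration.

```python
def skolem(functor):
    """Build a Skolem function: same input OID, same generated OID."""
    return lambda oid: (functor, oid)


sk1 = skolem("SK1")  # generates OIDs for the copied Lexicals
sk2 = skolem("SK2")  # generates OIDs for the owning Aggregations


def copy_lexicals_of_aggregations(lexicals, aggregations):
    """Copy every Lexical whose aggregationOID refers to an existing Aggregation."""
    aggregation_oids = {agg["OID"] for agg in aggregations}
    copied = []
    for lex in lexicals:
        if lex["aggregationOID"] in aggregation_oids:  # join on aggOID
            new_lex = dict(lex)  # properties are copied unchanged
            new_lex["OID"] = sk1(lex["OID"])
            new_lex["aggregationOID"] = sk2(lex["aggregationOID"])
            copied.append(new_lex)
    return copied


# Hypothetical input: one Lexical belongs to an Aggregation, one does not.
aggregations = [{"OID": 1, "Name": "Manager"}]
lexicals = [
    {"OID": 10, "aggregationOID": 1, "Name": "SSN", "isIdentifier": True,
     "isNullable": False, "isOptional": False, "type": "string"},
    {"OID": 11, "aggregationOID": None, "Name": "Title", "isIdentifier": False,
     "isNullable": True, "isOptional": False, "type": "string"},
]
copied = copy_lexicals_of_aggregations(lexicals, aggregations)
```

Because SK2 is deterministic, any other rule that also produces an OID with SK2(aggOID) yields the same identifier, which keeps the references of the copied Lexicals consistent.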
Approach
• Is it possible to apply the same approach to other model management operators?
• How can we define other operators with respect to our supermodel?
Construct characteristics
• Every construct has:
  • An identifier (OID)
  • A name
  • A set of properties
  • A set of references
Construct characteristics
• Every construct has:
  • An identifier (OID)
  • A name
  • A set of properties
  • A set of references
  SM_Lexical ( OID: SK1 oid, aggregationOID: aggOID, Name: name, isIdentifier: isId, isNullable: isN, isOptional: isO, type: t )
Construct equivalence
• Two constructs are equivalent if they:
  • have the same name
  • have the same set of properties
  • refer to equivalent constructs
Comparison
• The definition of equivalence is recursive.
• We can order the constructs and start the matching from the constructs without references.
Construct characteristics
  SM_Lexical ( OID: SK1(oid), aggregationOID: SK2(aggOID), Name: name, isIdentifier: isId, isNullable: isN, isOptional: isO, type: t )
  <-
  SM_Lexical ( OID: oid, aggregationOID: aggOID, Name: name, isIdentifier: isId, isNullable: isN, isOptional: isO, type: t ),
  SM_Aggregation ( OID: aggOID );
• These characteristics can also be found in the rules:
  • An identifier (OID)
  • A name
  • A set of properties
  • A set of references
Example
• An equivalence comparison may work as follows:
  1. comparison of the aggregations or abstracts without any references;
  2. comparison of the constructs which may refer to them.
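The two-step comparison above can be generalised into a loop that keeps adding equivalent pairs until nothing changes: constructs without references are matched in the first pass, and constructs referring to them follow. The sketch below is one possible reading of the recursive definition; the `Construct` class, its fields and the sample schemas are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construct:
    oid: int
    kind: str          # e.g. "Aggregation" or "Lexical"
    name: str
    props: tuple = ()  # (property, value) pairs
    refs: tuple = ()   # OIDs of the constructs this one refers to


def equivalent_pairs(schema1, schema2):
    """Return the set of (oid1, oid2) pairs of equivalent constructs."""
    equiv = set()
    changed = True
    while changed:  # stop when a full pass adds no new pair
        changed = False
        for c1 in schema1:
            for c2 in schema2:
                if (c1.oid, c2.oid) in equiv:
                    continue
                same_shape = (
                    (c1.kind, c1.name, c1.props) == (c2.kind, c2.name, c2.props)
                    and len(c1.refs) == len(c2.refs)
                )
                # References must point to constructs already known to be equivalent.
                if same_shape and all(
                    (r1, r2) in equiv for r1, r2 in zip(c1.refs, c2.refs)
                ):
                    equiv.add((c1.oid, c2.oid))
                    changed = True
    return equiv


# Hypothetical schemas: both contain Manager.SSN, only the second has Manager.EID.
schema1 = [
    Construct(1, "Aggregation", "Manager"),
    Construct(2, "Lexical", "SSN", (("isIdentifier", True),), (1,)),
]
schema2 = [
    Construct(10, "Aggregation", "Manager"),
    Construct(20, "Lexical", "SSN", (("isIdentifier", True),), (10,)),
    Construct(21, "Lexical", "EID", (("isIdentifier", False),), (10,)),
]
pairs = equivalent_pairs(schema1, schema2)  # {(1, 10), (2, 20)}
```

The loop terminates because the set of pairs only grows and is bounded by the number of construct pairs.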
Model management operators by examples
An example of a possible implementation of model management operators follows. The language adopted is Datalog. The tool is MIDST.
Datalog implementation of equivalence
• Fundamental functional block to compare two constructs:
  EQUIV_Aggregation [DEST] ( OID1: oid1, OID2: oid2 )
  <-
  SM_Aggregation [SOURCE_1] ( OID: oid1, Name: name ),
  SM_Aggregation [SOURCE_2] ( OID: oid2, Name: name );
Datalog implementation of difference and merge
• Fundamental functional block used to implement a SELECTIVE COPY:
  SM_Aggregation ( OID: SK(oid), Name: name )
  <-
  SM_Aggregation ( OID: oid, Name: name ),
  !EQUIV_Aggregation ( OID1: oid );
• Used both in difference and in merge.
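Read together, the equivalence rule and the selective copy amount to a join on the name followed by a negated lookup. The sketch below restates both steps for Aggregations only; the dictionary representation, the function names and the sample data are assumptions of the example rather than MIDST internals.

```python
def equiv_aggregation(source_1, source_2):
    """Pairs of OIDs of Aggregations that share the same name
    (the join performed by the EQUIV_Aggregation rule)."""
    return {
        (a1["OID"], a2["OID"])
        for a1 in source_1
        for a2 in source_2
        if a1["Name"] == a2["Name"]
    }


def selective_copy(source_1, equiv):
    """Copy the Aggregations of source_1 that have no equivalent,
    mirroring the negated literal !EQUIV_Aggregation(OID1: oid)."""
    matched = {oid1 for (oid1, _) in equiv}
    return [
        {"OID": ("SK", agg["OID"]), "Name": agg["Name"]}
        for agg in source_1
        if agg["OID"] not in matched
    ]


# Hypothetical schemas: only Manager appears in both.
source_1 = [
    {"OID": 1, "Name": "Project"},
    {"OID": 2, "Name": "Manager"},
    {"OID": 3, "Name": "Department"},
]
source_2 = [{"OID": 7, "Name": "Manager"}]

equiv = equiv_aggregation(source_1, source_2)
difference = selective_copy(source_1, equiv)
```

Here `difference` holds the Aggregations of the first schema that are missing from the second. A merge could plausibly reuse the same block by adding these copies to a full copy of the second schema.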