This paper addresses the problem of adapting schema mappings when schemas evolve over time and proposes a composition-based approach to mapping adaptation. The evolution is itself described as a schema mapping, which can be constructed with schema mapping tools, and the adapted mapping is obtained by composing it with the original mapping, which gives adaptation a formal semantics. The paper also studies the interplay between schema evolution and mapping composition and shows that the composition-based approach can be practical.
Semantic Adaptation of Schema Mappings when Schemas Evolve
Cong Yu, University of Michigan
Lucian Popa, IBM Almaden Research Center
VLDB’05, Trondheim, Norway, Sep 2, 2005
Schema Mappings
[Diagram: Schema S and Schema T, with labels I, J, q, q’]
• Schema mappings are logical, declarative assertions that describe relationships between schemas.
  • They carry enough semantics to guide run-time, instance-level transformation.
  • e.g., GLAV mappings (or tuple-generating dependencies)
• They are key elements in two main areas of information integration:
  • Data Exchange/Translation
  • Query Answering/Rewriting (or Federation)
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
Schema Evolution and Mapping Adaptation
• Schemas evolve over time, and mappings may become invalid!
• A lot of effort goes into establishing mappings. How do we reuse them?
• Mapping Adaptation Problem [VMP’03]
  • Given:
    • a mapping M from S to T,
    • changes/evolution of S to S’, or T to T’, or both,
  • Derive a “best” mapping M’ that:
    • is valid with respect to the new schemas, and
    • reflects the original mapping as much as possible.
Prior Solution: Incremental Method
[Diagram: mapping M between S and T; a chain of atomic changes (move elem, add elem, delete constraint, rename elem, …) yields T1, T2, T3, … with adapted mappings M1, M2, M3, …]
• [VMP’03] incrementally adapts the mapping after each atomic change in the schemas (source and/or target).
• Efficient and intuitive for one or a few changes.
• However, for non-incremental evolution, there are drawbacks …
[Diagram: different evolution paths from T, ending in Tn with adapted mapping Mn]
• The new schema may be radically different.
• The list of changes may not be known.
  • The evolution path must be discovered, and it is not necessarily unique.
• The method will ultimately be inefficient:
  • The algorithm must be applied at each atomic change.
• As we shall see, the resulting mapping may not be the expected one.
Our Approach: Composition-Based
[Diagram: schemas S, T, and T’; mapping M, evolution mapping E, and adapted mapping M’ = M ° E. Schema mapping tools (e.g., Clio) can be used to construct E.]
• Evolution itself is described as a schema mapping.
  • Concise, declarative, and expressive description of evolution.
  • Enables efficiency and can deal with arbitrary evolution.
• The adapted mapping is then obtained via composition.
  • Formal semantics of adaptation.
• At a high level, this is part of the model management vision [Ber03].
Main Contributions
• We study the interplay between schema evolution and mapping composition.
  • Interesting in terms of both semantics and implementation.
• We show that the composition-based approach to mapping adaptation can be practical.
Outline of the Rest of the Talk
• Incremental Approach vs. Composition Approach
  • Example (showing why composition is important)
• Composition: Semantics and Algorithm
  • Transformation semantics: specialized, more suitable for schema evolution, also more challenging
• Optimization and Experiments
  • Compose only when necessary (some mapping formulas are unaffected by the change)
• Conclusion
Simplified Example
[Diagram: Source’ (LineItem: li, s, p, o, qty), Source (SuppPart: s, p; PartOrder: p, o), Target (PotentialSupp: s, o); mapping m goes from Source to Target]
m: SuppPart (s, p) ∧ PartOrder (p, o) -> PotentialSupp (s, o)
(GLAV mapping [Halevy01], or source-to-target tgd [FKMP03])
• The mapping m “exports” orders o and all their potential suppliers s.
• Schema evolution scenario:
  • Data arrives in “long” tuples, each relating an order, a part, and an available supplier.
  • The mapping m must be adapted to use the new schema Source’.
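To make the reading of m concrete, here is a minimal Python sketch that treats relations as sets of tuples and evaluates m as a join on the part p. The function name apply_m and the sample data are illustrative assumptions, not part of the talk.

```python
def apply_m(supp_part, part_order):
    """m: SuppPart(s, p) AND PartOrder(p, o) -> PotentialSupp(s, o).

    Relations are modeled as Python sets of tuples (an assumption
    made for illustration only).
    """
    return {
        (s, o)
        for (s, p) in supp_part
        for (p2, o) in part_order
        if p == p2  # join on the shared part p
    }


# Hypothetical sample instance of the Source schema.
supp_part = {("s1", "p1"), ("s2", "p1"), ("s3", "p2")}
part_order = {("p1", "o1"), ("p2", "o2")}

potential_supp = apply_m(supp_part, part_order)
print(sorted(potential_supp))
```

Every supplier of a part is paired with every order for that part, which is the sense in which m exports orders together with all their potential suppliers.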
Incremental Approach [VMP’03]
[Diagram: Source’ (LineItem: li, s, p, o, qty), Source (SuppPart: s, p; PartOrder: p, o), Target (PotentialSupp: s, o)]
m: SuppPart (s, p) ∧ PartOrder (p, o) -> PotentialSupp (s, o)
• Pick a list of changes from Source to Source’ and rewrite the mapping after each change.
  • (1) Move element SuppPart/s to PartOrder/s: SuppPart (p) ∧ PartOrder (s, p, o) -> PotentialSupp (s, o)
  • (2) Delete SuppPart/p and (3) delete SuppPart.
  • (4) Rename PartOrder to LineItem, (5) add LineItem/li, and (6) add LineItem/qty:
    m’: LineItem (li, s, p, o, qty) -> PotentialSupp (s, o)
• Although small, our example already needs 6 schema changes.
  • For large schemas, this can become challenging.
• Furthermore, and somewhat surprisingly, the semantics of the adapted mapping may not be the “expected” one!
Loss of Semantics
[Diagram: Source’ (LineItem: li, s, p, o, qty), Source (SuppPart: s, p; PartOrder: p, o), Target (PotentialSupp: s, o)]
m: SuppPart (s, p) ∧ PartOrder (p, o) -> PotentialSupp (s, o)
m’: LineItem (li, s, p, o, qty) -> PotentialSupp (s, o)
• The original mapping m joins orders with suppliers.
• However, m’ loses relevant suppliers.
  • It pairs an order with a supplier only if they appear in the same LineItem tuple.
• To retain the original semantics, we must look in different tuples!
m’’: LineItem (li, s, p, o, qty) ∧ LineItem (li’, s’, p, o’, qty’) -> PotentialSupp (s’, o)
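The difference between m’ and m’’ can be seen on a tiny instance. The sketch below is illustrative only: relations are modeled as sets of tuples, and the function names and the two sample LineItem tuples are assumptions, not data from the talk.

```python
def m_prime(line_item):
    # m': LineItem(li, s, p, o, qty) -> PotentialSupp(s, o)
    return {(s, o) for (_li, s, _p, o, _qty) in line_item}


def m_double_prime(line_item):
    # m'': LineItem(li, s, p, o, qty) AND LineItem(li', s', p, o', qty')
    #      -> PotentialSupp(s', o)
    # Self-join on the part p, so supplier and order may come from
    # different tuples.
    return {
        (s2, o)
        for (_li, _s, p, o, _qty) in line_item
        for (_li2, s2, p2, _o2, _qty2) in line_item
        if p == p2
    }


# Hypothetical instance: two suppliers of the same part p1,
# each listed with a different order.
line_item = {
    (1, "s1", "p1", "o1", 10),
    (2, "s2", "p1", "o2", 5),
}

lost = m_double_prime(line_item) - m_prime(line_item)
print(sorted(lost))
```

On this instance m’ misses the pairs (s1, o2) and (s2, o1): both suppliers can supply part p1, so both are potential suppliers for both orders, but each pair spans two different LineItem tuples.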
• The incremental approach is a “mechanical” procedure that makes local changes to the mapping.
• A sequence of good local changes may not necessarily yield the best global adaptation …
Mapping Composition Approach
[Diagram: Source’ (LineItem: li, s, p, o, qty), Source (SuppPart: s, p; PartOrder: p, o), Target (PotentialSupp: s, o); evolution formulas e1 and e2 go from Source’ to Source, mapping m goes from Source to Target]
• We look at the evolution globally:
  • Describe the evolution through a schema mapping Source’ -> Source.
    e1: LineItem (l, s, p, o, q) -> SuppPart (s, p)
    e2: LineItem (l, s, p, o, q) -> PartOrder (p, o)
  • Define the adapted mapping to be a mapping Source’ -> Target that is equivalent (e.g., same data movement) to the sequence of the evolution mapping and the original mapping.
• The previous m’’ satisfies the conditions for {e1, e2} and {m}.
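The claim that m’’ is equivalent to running {e1, e2} followed by m can be checked by brute force on small instances. The following Python sketch does this under simplifying assumptions: relations are sets of tuples, the function names are hypothetical, and the check covers only instances with at most two LineItem tuples over a small made-up domain, so it is a sanity check rather than a proof.

```python
from itertools import combinations, product


def evolve(line_item):
    # e1: LineItem(l, s, p, o, q) -> SuppPart(s, p)
    supp_part = {(s, p) for (_l, s, p, _o, _q) in line_item}
    # e2: LineItem(l, s, p, o, q) -> PartOrder(p, o)
    part_order = {(p, o) for (_l, _s, p, o, _q) in line_item}
    return supp_part, part_order


def m(supp_part, part_order):
    # m: SuppPart(s, p) AND PartOrder(p, o) -> PotentialSupp(s, o)
    return {(s, o) for (s, p) in supp_part for (p2, o) in part_order if p == p2}


def adapted(line_item):
    # m'': self-join of LineItem on the part p, from Source' directly to Target.
    return {
        (s2, o)
        for (_l, _s, p, o, _q) in line_item
        for (_l2, s2, p2, _o2, _q2) in line_item
        if p == p2
    }


def agree_on_small_instances():
    # Hypothetical domain: 2 suppliers x 2 parts x 2 orders = 8 candidate tuples.
    rows = [
        (i, s, p, o, 1)
        for i, (s, p, o) in enumerate(product(["s1", "s2"], ["p1", "p2"], ["o1", "o2"]))
    ]
    for size in range(3):  # instances with 0, 1, or 2 tuples
        for inst in combinations(rows, size):
            source_prime = set(inst)
            if m(*evolve(source_prime)) != adapted(source_prime):
                return False
    return True


print(agree_on_small_instances())
```

The two-step pipeline materializes the intermediate Source instance, while the adapted mapping goes from Source’ to Target in one step and avoids the middle schema.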
• The composition approach is a more systematic approach, with precise semantics, guaranteed to behave the “right” way in all situations.
• Although it may appear simple in the previous example, mapping composition poses challenges …
Challenges in Composition Approach
• Mapping language:
  • Must handle nesting and complex types (as in XML Schema); details are in the paper.
  • Furthermore, the usual mapping languages (GLAV, tuple-generating dependencies) are not closed under composition!
  • Recent extension that ensures composability: second-order tgds [FKPT04].
    • Main idea: add functions to gain the needed expressive power.
• Semantics and Algorithm (next)
• Efficiency/Scalability
Composition: Semantics and Algorithm
Composition: Semantics
• In mapping composition, we want to replace a sequence of schema mappings with one that is “equivalent” and avoids the middle schema.
• What does “equivalent” mean?
• There are two semantics that we considered:
  • Relationship semantics
    • More general
  • Transformation semantics
    • More suitable, specialized
Relationship Semantics
• Mappings can be viewed as describing relationships between instances over the two schemas:
  Rel (M12) = { (I1, I2) | (I1, I2) satisfies M12 }
• Composition of relationships:
  Rel (M12) ◦ Rel (M23) = { (I1, I3) | there is I2 such that (I1, I2) satisfies M12 and (I2, I3) satisfies M23 }
• [FKPT04, Melnik04] A mapping M13 is equivalent to the sequence of M12 and M23, under the relationship semantics, if:
  Rel (M13) = Rel (M12) ◦ Rel (M23)
Example: Semantics and Algorithm
[Diagram: S1 (Takes: name, course), S2 (Student: sid, name; Enrolls: sid, course), S3 (Takes’: sid, name, course); mapping M goes from S1 to S2, evolution mapping E from S2 to S3]
M: Takes (n, c) -> Student (F(n), n) ∧ Enrolls (F(n), c)
   (second-order tgd [FKPT04]; F(n) stands for the “unknown” student id)
E: Student (s, n) ∧ Enrolls (s, c) -> Takes’ (s, n, c)
1. Substitution
M13: Takes (n, c’) ∧ Takes (n’, c) ∧ F(n) = F(n’) -> Takes’ (F(n), n, c)
• M13 correctly captures the equivalent relationship between instances of S1 and S3.
• Instances (and the function F) can exist a priori.
• A student n must be paired with a course c
  • even when c is listed under a different student name n’,
  • provided the student id is the same: F(n) = F(n’).
However, if we assume that the function F is one-to-one, an important simplification can be made …
M13: Takes (n, c’) ∧ Takes (n’, c) ∧ F(n) = F(n’) -> Takes’ (F(n), n, c)   (equivalent relationship)
2. Reduction: F(n) = F(n’) implies n = n’
M’13: Takes (n, c’) ∧ Takes (n, c) -> Takes’ (F(n), n, c)
3. Minimization
M’’13: Takes (n, c) -> Takes’ (F(n), n, c)   (equivalent transformation)
• We can always make this assumption if mappings are meant to describe transformations (i.e., generation of a target instance).
• F is a Skolem function assigning unique student ids: n -> F(n)
Transformation Semantics
• A mapping is a process (in the spirit of data exchange [FKMP03]): I2 = M12(I1)
  • Each mapping formula is a “generator” of target facts.
  • Functions are one-to-one value generators.
• Theorem. Our composition algorithm produces the schema mapping with the equivalent transformation semantics:
  M13 (I1) = M23( M12(I1) ) (up to the renaming of nulls)
• Advantage of transformation semantics in adaptation: simpler and more intuitive formulas!
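The equation M13 (I1) = M23( M12(I1) ) can be illustrated on the Takes/Student/Enrolls example from the earlier slides. The Python sketch below is an illustration under assumptions: relations are sets of tuples, the Skolem function F is modeled as a term constructor ("F", n), which is one-to-one by construction, and the sample names and courses are made up. It also shows what happens when F is not one-to-one.

```python
def skolem(n):
    # One-to-one value generator: distinct names give distinct terms.
    return ("F", n)


def m12(takes, f):
    # M: Takes(n, c) -> Student(F(n), n) AND Enrolls(F(n), c)
    student = {(f(n), n) for (n, _c) in takes}
    enrolls = {(f(n), c) for (n, c) in takes}
    return student, enrolls


def m23(student, enrolls):
    # E: Student(s, n) AND Enrolls(s, c) -> Takes'(s, n, c)
    return {(s, n, c) for (s, n) in student for (s2, c) in enrolls if s == s2}


def m13(takes, f):
    # Minimized composition M''13: Takes(n, c) -> Takes'(F(n), n, c)
    return {(f(n), n, c) for (n, c) in takes}


# Hypothetical S1 instance.
takes = {("Ann", "DB"), ("Ann", "AI"), ("Bob", "DB")}

# With a one-to-one F, the composed mapping matches the two-step process.
same_with_skolem = m13(takes, skolem) == m23(*m12(takes, skolem))

# With a non-injective F (every student gets id 0), the simplified
# formula no longer matches: the join pairs Bob with Ann's courses.
def constant(n):
    return 0

same_with_constant = m13(takes, constant) == m23(*m12(takes, constant))

print(same_with_skolem, same_with_constant)
```

Because both sides use the same Skolem terms here, the results are literally equal; in general the theorem only promises equality up to the renaming of nulls.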
Composition Algorithm: Further Details
• The substitution step is more complex than shown:
  • Must handle nesting.
  • Generate parameterized rules for set types in the middle schema.
  • Reuse some of the mapping-based query rewriting techniques [YP04].
• Minimization:
  • Good: it simplifies formulas and generates intuitive mappings (all this is enabled by the transformation semantics).
  • Bad: it can be expensive (same as tableau minimization) …
Optimization and Experimental Results
Full Adaptation
• Full adaptation: compose “whole” schema mappings (compose all the formulas in the original mapping with all the formulas in the evolution mapping).
  • Inevitable when the schema evolution is drastic and affects most of the original mapping (non-incremental evolution).
  • Inefficient when the changes are small and localized (incremental evolution).
Compose Only When Necessary
Mapping Pruning:
1. Detect those parts (formulas) M’o of the original mapping Mo that are affected by the evolution.
   • Only M’o needs to be adapted.
2. Only a subset M’e of the formulas in the evolution mapping Me plays a role in the composition with M’o.
   • The rest are redundant (PTIME containment-like analysis, see paper).
3. Compose M’o with M’e.
Big performance gain for incremental evolution and overall.
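The pruning steps above can be sketched roughly as follows. This is a deliberately crude approximation based only on relation names, assuming evolution of the source schema: it is not the PTIME containment-like analysis from the paper, and the Formula type, the prune function, and the extra formulas m2 and e3 are all hypothetical.

```python
from typing import NamedTuple


class Formula(NamedTuple):
    name: str
    body: frozenset  # relation names the formula reads
    head: frozenset  # relation names the formula generates


def prune(original, evolution, changed):
    """Name-based stand-in for mapping pruning (an assumption, not the paper's test).

    Step 1: keep original formulas whose body reads a changed relation.
    Step 2: keep evolution formulas that generate a relation read by one
            of the affected original formulas.
    """
    affected = [f for f in original if f.body & changed]
    needed = set().union(*(f.body for f in affected))
    relevant = [e for e in evolution if e.head & needed]
    return affected, relevant


# Original mapping Mo: m is from the talk's example, m2 is made up.
original = [
    Formula("m", frozenset({"SuppPart", "PartOrder"}), frozenset({"PotentialSupp"})),
    Formula("m2", frozenset({"Customer"}), frozenset({"Client"})),
]
# Evolution mapping Me (Source' -> Source): e1 and e2 are from the talk, e3 is made up.
evolution = [
    Formula("e1", frozenset({"LineItem"}), frozenset({"SuppPart"})),
    Formula("e2", frozenset({"LineItem"}), frozenset({"PartOrder"})),
    Formula("e3", frozenset({"Depot"}), frozenset({"Warehouse"})),
]

affected, relevant = prune(original, evolution, changed={"SuppPart", "PartOrder"})
print([f.name for f in affected], [e.name for e in relevant])
```

Step 3 would then compose only the affected formulas with the relevant evolution formulas and carry the unaffected ones (here m2) over unchanged.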
Analysis of Evolution Scenarios
• Results based on Clio.
• Benefits = 1 - adapted mappings / (blank-sheet mappings + missed mappings)
• We also have synthetic scenarios that show the scalability of Mapping Pruning with increasing schema and mapping complexity.
Conclusion
• We studied:
  • Mapping composition techniques for mapping adaptation
  • Transformation semantics in the context of schema evolution
• Designed and implemented a practical adaptation system
  • Mapping pruning (schema evolution specific)
• To Do:
  • Optimization of composition in general
  • Improve performance of minimization