
Semantic Adaptation of Schema Mappings when Schemas Evolve

This paper discusses the problem of adapting schema mappings when schemas evolve over time and proposes a composition-based approach to mapping adaptation. The approach describes the evolution itself as a schema mapping, which schema mapping tools can help construct, and gives adaptation a formal semantics. The paper also studies the interplay between schema evolution and mapping composition and shows that the composition-based approach can be practical.


Presentation Transcript


  1. Semantic Adaptation of Schema Mappings when Schemas Evolve. Cong Yu (University of Michigan), Lucian Popa (IBM Almaden Research Center). VLDB’05, Trondheim, Norway, Sep 2, 2005

  2. Schema Mappings [Figure: Schema S and Schema T, with instances I and J and queries q and q’] • Schema mappings are logical, declarative assertions that describe relationships between schemas. • They carry enough semantics to guide run-time, instance-level transformation • e.g., GLAV mappings (or tuple-generating dependencies) • They are key elements in two main areas of information integration: • Data Exchange/Translation • Query Answering/Rewriting (or Federation) Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05

  3. Schema Evolution and Mapping Adaptation • Schemas evolve over time … mappings may become invalid! • A lot of effort goes into establishing mappings. How do we reuse them? • Mapping Adaptation Problem [VMP’03] • Given: • a mapping M from S to T, • changes/evolution of S to S’, or T to T’, or both, • Derive a “best” mapping M’ that: • is valid with respect to the new schemas, and • reflects the original mapping as much as possible

  4. Prior Solution: Incremental Method [Figure: mapping M from S to T; T evolves through atomic changes (move elem, add elem, delete constraint, rename elem, …) into T1, T2, T3, …, and M is adapted to M1, M2, M3, …] • [VMP’03] incrementally adapts the mapping after each atomic change in the schemas (source and/or target). • Efficient and intuitive for one or a few changes. • However, for non-incremental evolution, there are drawbacks …

  5. [Figure: mapping M from S to T; different evolution paths lead from T to Tn, with adapted mapping Mn] • The new schema may be radically different. • The list of changes may not be known. • The evolution path must be discovered → it is not necessarily unique. • The method will ultimately be inefficient: • The algorithm must be applied at each atomic change. • As we shall see, the resulting mapping may not be the expected one.

  6. Our Approach: Composition-Based [Figure: mapping M from S to T, evolution mapping E between T and T’, adapted mapping M’ = M ∘ E; schema mapping tools (e.g., Clio) can be used to construct E] • Evolution itself is described as a schema mapping. • Concise, declarative, and expressive description of evolution. • Enables efficiency and can deal with arbitrary evolution. • The adapted mapping is then obtained via composition. • Formal semantics of adaptation. • At a high level, this is part of the model management vision [Ber03].

  7. Main Contributions • We study the interplay between schema evolution and mapping composition • interesting in terms of both semantics and implementation • We show that the composition-based approach for mapping adaptation can be practical

  8. Outline of the Rest of the Talk • Incremental Approach vs. Composition Approach • Example (showing why composition is important) • Composition: Semantics and Algorithm • Transformation semantics → specialized, more suitable for schema evolution, also more challenging • Optimization and Experiments • Compose only when necessary (some mapping formulas are unaffected by the change) • Conclusion

  9. Simplified Example [Figure: Source’ = LineItem(li, s, p, o, qty); Source = SuppPart(s, p), PartOrder(p, o); Target = PotentialSupp(s, o); mapping m from Source to Target] m: SuppPart(s, p) ∧ PartOrder(p, o) → PotentialSupp(s, o) (GLAV mapping [Halevy01], or source-to-target tgd [FKMP03]) • The mapping m “exports” orders o and all their potential suppliers s. • Schema evolution scenario: • Data arrives in “long” tuples, each relating an order, a part and an available supplier. • The mapping m must be adapted to use the new schema Source’.
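
A minimal Python sketch (not from the talk) of what m does on a toy source instance. Relations are modelled as sets of tuples; the function name `apply_m` and the sample values are illustrative assumptions.

```python
def apply_m(supp_part, part_order):
    """m: SuppPart(s, p) AND PartOrder(p, o) -> PotentialSupp(s, o).

    Joins the two source relations on the part p and keeps (s, o).
    """
    return {
        (s, o)
        for (s, p) in supp_part
        for (p2, o) in part_order
        if p == p2
    }


# Hypothetical data: two suppliers offer part p1, and order o1 needs p1.
supp_part = {("s1", "p1"), ("s2", "p1")}
part_order = {("p1", "o1")}

potential_supp = apply_m(supp_part, part_order)
print(sorted(potential_supp))
```

Both suppliers are exported for order o1, which is the behaviour an adapted mapping is expected to preserve.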

  10. Incremental Approach [VMP’03] [Figure: Source’ = LineItem(li, s, p, o, qty); Source = SuppPart(s, p), PartOrder(p, o); Target = PotentialSupp(s, o)] m: SuppPart(s, p) ∧ PartOrder(p, o) → PotentialSupp(s, o) • Pick a list of changes from Source to Source’ and rewrite the mapping after each change. • (1) Move element SuppPart/s to PartOrder/s: SuppPart(p) ∧ PartOrder(s, p, o) → PotentialSupp(s, o) • (2) Delete SuppPart/p and (3) delete SuppPart. • (4) Rename PartOrder to LineItem, (5) add LineItem/li and (6) add LineItem/qty: m’: LineItem(li, s, p, o, qty) → PotentialSupp(s, o)

  11. Although small, our example already needs 6 schema changes. • For large schemas, this can become challenging. • Furthermore, and somewhat surprisingly, the semantics of the adapted mapping may not be the “expected” one!

  12. Loss of Semantics [Figure: Source’ = LineItem(li, s, p, o, qty); Source = SuppPart(s, p), PartOrder(p, o); Target = PotentialSupp(s, o)] m: SuppPart(s, p) ∧ PartOrder(p, o) → PotentialSupp(s, o) m’: LineItem(li, s, p, o, qty) → PotentialSupp(s, o) • The original mapping m joins orders with suppliers. • However, m’ loses relevant suppliers. • It only pairs an order with a supplier if they appear in the same LineItem tuple. • To retain the original semantics, we must look in different tuples! m’’: LineItem(li, s, p, o, qty) ∧ LineItem(li’, s’, p, o’, qty’) → PotentialSupp(s’, o)
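
A small Python sketch (an illustration, not code from the talk) contrasting m’ and m’’ on a toy LineItem instance. The function names and the two sample tuples are assumptions made for the example.

```python
def apply_m_prime(line_item):
    """m': LineItem(li, s, p, o, qty) -> PotentialSupp(s, o)."""
    return {(s, o) for (_li, s, _p, o, _qty) in line_item}


def apply_m_double_prime(line_item):
    """m'': LineItem(li, s, p, o, qty) AND LineItem(li', s', p, o', qty')
    -> PotentialSupp(s', o).  A self-join of LineItem on the part p."""
    return {
        (s2, o)
        for (_li, _s, p, o, _qty) in line_item
        for (_li2, s2, p2, _o2, _qty2) in line_item
        if p == p2
    }


# Hypothetical data: suppliers s1 and s2 both supply part p1,
# but each appears in a different order's tuple.
line_item = {
    ("li1", "s1", "p1", "o1", 10),
    ("li2", "s2", "p1", "o2", 5),
}

same_tuple_only = apply_m_prime(line_item)
across_tuples = apply_m_double_prime(line_item)
lost = across_tuples - same_tuple_only
print(sorted(lost))
```

On this instance `lost` holds the pairs (s1, o2) and (s2, o1): suppliers of the right part that m’ never reports because they sit in a different tuple than the order.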

  13. The incremental approach is a “mechanical” procedure that makes local changes to the mapping. • A sequence of good local changes may not necessarily yield the best global adaptation …

  14. Mapping Composition Approach [Figure: Source’ = LineItem(li, s, p, o, qty); Source = SuppPart(s, p), PartOrder(p, o); Target = PotentialSupp(s, o); evolution mappings e1, e2 from Source’ to Source, mapping m from Source to Target] • We look at the evolution globally: • Describe the evolution through a schema mapping Source’ → Source. e1: LineItem(l, s, p, o, q) → SuppPart(s, p) e2: LineItem(l, s, p, o, q) → PartOrder(p, o) • Define the adapted mapping to be a mapping Source’ → Target that is equivalent (e.g., same data movement) to the sequence of the evolution mapping and the original mapping. • The previous m’’ satisfies the conditions for {e1, e2} and {m}.
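
The claim that m’’ moves the same data as {e1, e2} followed by {m} can be spot-checked on a toy instance. This is a hedged Python sketch, not the composition algorithm of the paper: the function names and the sample LineItem tuples are assumptions, and one instance is evidence, not a proof.

```python
def evolve(line_item):
    """Evolution mapping {e1, e2} from Source' to Source."""
    supp_part = {(s, p) for (_l, s, p, _o, _q) in line_item}   # e1
    part_order = {(p, o) for (_l, _s, p, o, _q) in line_item}  # e2
    return supp_part, part_order


def apply_m(supp_part, part_order):
    """m: SuppPart(s, p) AND PartOrder(p, o) -> PotentialSupp(s, o)."""
    return {
        (s, o)
        for (s, p) in supp_part
        for (p2, o) in part_order
        if p == p2
    }


def apply_m_double_prime(line_item):
    """m'': self-join of LineItem on p, producing PotentialSupp(s', o)."""
    return {
        (s2, o)
        for (_l, _s, p, o, _q) in line_item
        for (_l2, s2, p2, _o2, _q2) in line_item
        if p == p2
    }


# Hypothetical Source' instance.
line_item = {
    ("li1", "s1", "p1", "o1", 10),
    ("li2", "s2", "p1", "o2", 5),
    ("li3", "s3", "p2", "o3", 1),
}

via_sequence = apply_m(*evolve(line_item))  # first {e1, e2}, then m
direct = apply_m_double_prime(line_item)    # adapted mapping m''
print(via_sequence == direct)
```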

  15. The composition approach is a more systematic approach, with precise semantics, guaranteed to behave the “right” way in all situations. • Although it may appear simple in the previous example, mapping composition poses challenges …

  16. Challenges in Composition Approach • Mapping language: • Must handle nesting and complex types (as in XML Schema) • (details in the paper) • Furthermore, the usual mapping languages (GLAV, tuple-generating dependencies) are not closed under composition! • Recent extension that ensures composability: second-order tgds [FKPT04]. • Main idea: add functions to gain the needed expressive power • Semantics and Algorithm • Efficiency/Scalability Next …

  17. Composition: Semantics and Algorithm

  18. Composition: Semantics • In mapping composition, we want to replace a sequence of schema mappings with one that is “equivalent” and avoids the middle schema. • What does “equivalent” mean? • There are two semantics that we considered: • Relationship semantics • More general • Transformation semantics • More suitable, specialized

  19. Relationship Semantics • Mappings can be viewed as describing relationships between instances over the two schemas: Rel(M12) = { (I1, I2) | (I1, I2) satisfies M12 } • Composition of relationships: Rel(M12) ∘ Rel(M23) = { (I1, I3) | there is I2 such that (I1, I2) satisfies M12 and (I2, I3) satisfies M23 } • [FKPT04, Melnik04] A mapping M13 is equivalent to the sequence of M12 and M23, under the relationship semantics, if: Rel(M13) = Rel(M12) ∘ Rel(M23)

  20. Example: Semantics and Algorithm [Figure: S1 = Takes(name, course); S2 = Student(sid, name), Enrolls(sid, course); S3 = Takes’(sid, name, course); mapping M from S1 to S2, evolution mapping E from S2 to S3; F(n) stands for the “unknown” student id; second-order tgd [FKPT04]] M: Takes(n, c) → Student(F(n), n) ∧ Enrolls(F(n), c) E: Student(s, n) ∧ Enrolls(s, c) → Takes’(s, n, c) 1. Substitution M13: Takes(n, c’) ∧ Takes(n’, c) ∧ F(n) = F(n’) → Takes’(F(n), n, c) • M13 correctly captures the equivalent relationship between instances of S1 and S3. • Instances (and function F) can exist a priori. • A student n must be paired with a course c • even when c is listed under a different student name n’, • provided the student id is the same: F(n) = F(n’)

  21. However, if we assume that the function F is one-to-one, an important simplification can be made … M13: Takes(n, c’) ∧ Takes(n’, c) ∧ F(n) = F(n’) → Takes’(F(n), n, c) 2. Reduction (F(n) = F(n’) ⇒ n = n’) M’13: Takes(n, c’) ∧ Takes(n, c) → Takes’(F(n), n, c) 3. Minimization (equivalent relationship) M’’13: Takes(n, c) → Takes’(F(n), n, c) (equivalent transformation) • We can always make this assumption if mappings are meant to describe transformations (i.e., generation of a target instance). • F is a Skolem function assigning unique student ids: n → F(n)

  22. Transformation Semantics • A mapping is a process (in the spirit of data exchange [FKMP03]): I2 = M12(I1) • Each mapping formula is a “generator” of target facts • Functions are one-to-one value generators • Theorem. Our composition algorithm produces the schema mapping with the equivalent transformation semantics: M13(I1) = M23(M12(I1)) (up to the renaming of nulls) • Advantage of transformation semantics in adaptation: → simpler and more intuitive formulas!
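
A hedged Python sketch of the equation M13(I1) = M23(M12(I1)) for the Takes/Student/Enrolls example from the earlier slides. The one-to-one function F is modelled as a tagged tuple, which is an assumption of this sketch; the function names and sample tuples are illustrative too. Because both sides build the same Skolem terms here, they agree exactly, without any renaming of nulls.

```python
def F(name):
    """Skolem function: a one-to-one generator of 'unknown' student ids.

    Modelled as a tagged tuple, so F(n) == F(n') holds only if n == n'.
    """
    return ("F", name)


def apply_M(takes):
    """M: Takes(n, c) -> Student(F(n), n) AND Enrolls(F(n), c)."""
    student = {(F(n), n) for (n, _c) in takes}
    enrolls = {(F(n), c) for (n, c) in takes}
    return student, enrolls


def apply_E(student, enrolls):
    """E: Student(s, n) AND Enrolls(s, c) -> Takes'(s, n, c)."""
    return {
        (s, n, c)
        for (s, n) in student
        for (s2, c) in enrolls
        if s == s2
    }


def apply_M13(takes):
    """Minimized composition M''13: Takes(n, c) -> Takes'(F(n), n, c)."""
    return {(F(n), n, c) for (n, c) in takes}


# Hypothetical S1 instance.
takes = {("Ann", "DB"), ("Ann", "OS"), ("Bob", "DB")}

composed = apply_E(*apply_M(takes))  # M first, then E
direct = apply_M13(takes)            # single adapted mapping
print(composed == direct)
```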

  23. Composition Algorithm: Further Details • The substitution step is more complex than shown: • Must handle nesting • Generate parameterized rules for set types in the middle schema • Reuse some of the mapping-based query rewriting techniques [YP04] • Minimization: • Good: it simplifies formulas and generates intuitive mappings (all this is enabled by the transformation semantics). • Bad: it can be expensive (same as tableau minimization) …

  24. Optimization and Experimental Results

  25. Full Adaptation • Compose “whole” schema mappings (compose all the formulas in the original mapping with all the formulas in the evolution mapping) • Inevitable when the schema evolution is drastic and affects most of the original mapping (non-incremental evolution) • Inefficient when the changes are small and localized (incremental evolution)

  26. Compose Only When Necessary • Mapping Pruning: 1. Detect those parts (formulas) M’o of the original mapping Mo that are affected by the evolution. • Only M’o needs to be adapted. 2. Only a subset M’e of the formulas in the evolution mapping Me plays a role in the composition with M’o. • The rest are redundant (PTIME containment-like analysis, see paper). 3. Compose M’o with M’e. → Big performance gain for incremental evolution and overall.
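
A rough Python sketch of the pruning idea, under strong simplifying assumptions: each formula is reduced to the relation names it reads and writes, the set of changed relations is given as input, and relevance is decided by name overlap. The paper's actual test is a PTIME containment-like analysis, which this sketch does not implement; `Formula`, `prune` and all relation names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Formula:
    name: str
    reads: frozenset   # relations mentioned in the premise
    writes: frozenset  # relations mentioned in the conclusion


def prune(original, evolution, changed):
    """Return (M'o, M'e): the formulas that actually need composing."""
    # Step 1: original formulas that read a relation touched by evolution.
    affected = [f for f in original if f.reads & changed]
    # Step 2: evolution formulas that produce a relation read by M'o.
    needed = frozenset().union(*(f.reads for f in affected))
    relevant = [e for e in evolution if e.writes & needed]
    return affected, relevant


# Hypothetical scenario: only SuppPart and PartOrder evolve.
original = [
    Formula("m1", frozenset({"SuppPart", "PartOrder"}), frozenset({"PotentialSupp"})),
    Formula("m2", frozenset({"Customer"}), frozenset({"Client"})),
]
evolution = [
    Formula("e1", frozenset({"LineItem"}), frozenset({"SuppPart"})),
    Formula("e2", frozenset({"LineItem"}), frozenset({"PartOrder"})),
    Formula("e3", frozenset({"Customer"}), frozenset({"Customer"})),
]
changed = frozenset({"SuppPart", "PartOrder"})

affected, relevant = prune(original, evolution, changed)
print([f.name for f in affected], [e.name for e in relevant])
```

Step 3 would then compose only `affected` with `relevant`, leaving m2 untouched.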

  27. Analysis of Evolution Scenarios • Results based on Clio • Benefits = 1 - adapted mappings / (blank-sheet mappings + missed mappings) • We also have synthetic scenarios that show the scalability of Mapping Pruning with increasing schema and mapping complexity

  28. Conclusion • We studied: • Mapping composition techniques for mapping adaptation • Transformation semantics in the context of schema evolution • Designed and implemented a practical adaptation system • Mapping pruning (schema evolution specific) • To Do: • Optimization of composition in general • Improve performance of minimization
