160 likes | 408 Views
Schema Mapping as Query Discovery. Renee J. Miller Laura M. Haas Mauricio A. Hernandez Presented by: Helen Chen. Introduction. Modern applications need schema mappings Current schema mapping process is done manually In Clio , schema mapping = query discovery
E N D
Schema Mapping as Query Discovery Renee J. Miller Laura M. Haas Mauricio A. Hernandez Presented by: Helen Chen
Introduction • Modern applications need schema mappings • Current schema mapping process is done manually • In Clio, schema mapping = query discovery • Modern DBMS manage not only data but also queries
Introduction (cont’) • Schema mappings cannot be fully automated • Outside sources are needed • Clio is a prototype tool for semi-automated schema mapping/query discovering
Characteristics of Clio • Clio is VC driven • VCs are an appropriate abstraction for eliciting information from the user or DBA • Using reasoning about queries and query containment can help the user derive correct schema mappings
Principle in Mapping Construction • All possible values in source target • Use union rather than join • A value from the source target • Use join rather than cross product • Override the principles is permitted once
Search Space • Vertical compositions (join) • Requires to consider mappings between schemas with constraints and dependencies • Horizontal compositions (set operators) • Source and target schemas do not represent the same information
Query Discovery Notation • Let S1, … Snrepresent the n source relation • Let T1, … Tmrepresent the m target relation • Use symbol A to denote source attributes • The domain of an attribute A is denoted dom(A) • The meta-data associated with A is denoted (A) • Use symbol B to denote target attributes
Query Discovery Notation (cont’) • Value correspondence i = <fi, pi> • A function (fi) • q >=1 • fi: dom(A1) x … dom(Aq) x m(A1) x … m(Aq) dom(B) • A filter (pi) • pi: dom(A1) x … dom(Ar) x m(A1) x … m(Ar) boolean
Potential Sets P Candidate Sets G All fi A Cover All source relations All pi Core Query Discovery Algorithm
Example • Consider the following value correspondences • f1: S1.A T.C • f2: S2.A T.D • f3: S2.B T.C • All three filters are True
Example (cont’) • P = {{1, 2},{2, 3},{1},{2},{3}} • G = {{1, 2},{2, 3},{1},{2},{3}} • Cover 1 = {{1, 2},{2, 3}} 2 = {{1},{2, 3}} … • SQL Query
q2: SELECT P.HrRate*W.Hrs FROM PayRate P, WorksOn W, Student S WHERE P.Rank = W.ProjRank AND S.Yr = P.Rank q1: SELECT P.HrRate*W.Hrs FROM PayRate P, WorksOn W WHERE P.Rank = W.ProjRank Another Example f1: PayRate(HrRate)*WorkdOn(Hrs) Personnel(Sal)
Another Example (cont’) f2: Professor(Sal) Personnel(Sal) p2: True f1: PayRate(HrRate)*WorkdOn(Hrs) Personnel(Sal) p1: True q3: SELECT P.HrRate*W.Hrs FROM PayRate P, WorksOn W, Student S WHERE P.Rank = W.ProjRank AND S.Yr = P.Rank UNION ALL SELECT Sal FROM Professor = {{1}, {2}}
Add/Delete a Value Correspondence u i i+1 Incremental Query Discovery Algorithm u’ … SQL Query
Conclusion • Schema mapping construction process is searching for the most reasonable mapping • Clio uses VCs to help users create schema mappings • Clio can produce both flat and nested relational targets • VC framework can be extended to both GAV and LAV
Limitation • VCs are entered by user of linguistic techniques – semi-automated