130 likes | 216 Views
CSE 636 Data Integration. SchemaSQL Implementation. Architecture. Resident SQL Engine. Answers to queries Q 1 …Q n collected. Final Answer. Final Series of SQL Queries. . SchemaSQL Query. Final Answer. SchemaSQL Server. Federation User. Optimized local query Q 1.
E N D
CSE 636Data Integration SchemaSQL Implementation
Architecture Resident SQL Engine Answers to queries Q1…Qn collected Final Answer Final Series of SQL Queries SchemaSQL Query Final Answer SchemaSQL Server Federation User Optimized local query Q1 Optimized local query Qn DBMS1 DBMSn answer(Q1) … answer(Qn)
SchemaSQL Server • Maintains a Federation System Table (FST) • FST(db-name, rel-name, attr-name) • Names of databases, relations and attributes in the federation • Compiles the instantiations of the variables in the query • Enforces conditions, groupings, aggregations and mergings
Query Processing Fixed Output Schema Phase 1 • Corresponding to a set of variable declarations in the FROM clause, create VITs using one or more SQL queries against some local databases and/or the FST • VIT: Variable Instantiation Table whose schema consists of all the variables in one or more variable declarations in the FROM clause Phase 2 • Rewrite the original SchemaSQL query against the federation into an “equivalent” query against the set of VIT relations and compute it using the resident SQL server
Example SELECTRelC, C.salFloor FROM univ-CRelC, univ-C::RelCC, univ-D::salInfoD WHERERelC = D.dept AND C.salFloor > D.technician AND C.category = ‘technician’ univ-C: cs math univ-D: salInfo
Example – Phase 1 • VITRelC(RelC): • SELECT rel-name ASRelC • FROM FST • WHERE db-name = ‘univ-C’
Example – Phase 1 • VITC(RelC, CsalFloor): • SELECTRelCFROM VITRelC • If {r1, …, rn} is the answer in step 1, then VITC is computed by the following SQL query to univ-CSELECT ‘r1’ ASRelC, salFloor AS CsalFloorFROM r1WHERE category = ‘technician’UNION…UNION SELECT ‘rn’ ASRelC, salFloor AS CsalFloorFROM rnWHERE category = ‘technician’
Example – Phase 1 • VITD(Ddept, Dtechnician): • SELECT dept AS Ddept, technician AS Dtechnician • FROM salInfo
Example – Phase 1 VITRelCVITCVITD
Example – Phase 2 Joined Variable Instantiation Table (JVIT) is the (natural) join of the VITs generated during Phase 1 • CREATE VIEW JVIT(RelC,CsalFloor, Ddept, Dtechnician) ASSELECT VITRelC.RelC,VITC.CsalFloor, VITD.Ddept, VITD.Dtechnician FROM VITRelC,VITC, VITD WHERE VITRelC.RelC = VITD.Ddept AND VITRelC.CsalFloor > VITD.Dtechnician AND VITRelC.RelC = VITC.RelC • SELECTRelC,CsalFloorFROM JVIT
Example – Phase 2 (Aggregation) Q: Find the average salary floor across all departments for each employee category in database univ-B SELECT T.category, avg(T.D)FROM univ-B::salInfo D, univ-B::salInfo TWHERED <> ‘category’GROUP BYT.category univ-B: salInfo
Example – Phase 2 (Aggregation) Q: Find the average salary floor across all departments for each employee category in database univ-B SELECT T.category, avg(T.D)FROM univ-B::salInfo D, univ-B::salInfo TWHERED <> ‘category’GROUP BYT.category Aggregation After Phase 2 SELECT Tcategory, avg(TD)FROM JVITGROUP BY Tcategory
References • L. V. S. Lakshmanan, F. Sadri, I. N. Subramanian:SchemaSQL – A Language for Interoperability in Relational Multi-database SystemsVLDB, 1996 • L. V. S. Lakshmanan, F. Sadri, S. N. Subramanian:SchemaSQL – An Extension to SQL for Multidatabase InteroperabilityTODS, 2001