820 likes | 1.13k Views
Multidatabase manipulations Part 2. Witold Litwin. http://ceria.dauphine.fr/witold.html. Multidatabase manipulations (Kandinsky: Ligne avec Accompagnement, 1937 ). Multidatabase manipulations. MSQL (Litwin, Abdellatif, Nicolas, Zeroual, 1989 L. Suardi, M. Rusinkiewicz, 1992).
E N D
Multidatabase manipulations Part 2 Witold Litwin http://ceria.dauphine.fr/witold.html
Multidatabase manipulations(Kandinsky: Ligne avec Accompagnement, 1937 )
MSQL(Litwin, Abdellatif, Nicolas, Zeroual, 1989L. Suardi, M. Rusinkiewicz, 1992) • An extension to SQL • Contains by definition every SQL-x • Allows for non-procedural multidatabase base manipulations • MSQL queries impossible to formulate in SQL • An MSQL query may replace several SQL queries • Developed in 1986-89 • INRIA, projet B A BA, • initialement sous projet du projet-pilote SIRIUS (J. Le Bihan, puis W. Litwin) • Dr. Thesis of MM. Abdellatif, Nicolas, Zeroual • Compiler implemented at Houston University • Team od Prof. M. Rusinkiewicz, 1990-1993
MSQL(Litwin, Abdellatif, Nicolas, Zeroual, 1989L. Suardi, M. Rusinkiewicz, 1992) • Research vehicle for functions for the MBD environment • to address relations in different databases • to manipulate semantically heterogeneous data • to create MDB views • to transfer data (and schemas) between DBs • to define MDB dependencies • Present to limited extent in most of commercial DBMSs &DAMSs
MSQL(Basic new properties) • SQL Query • Uses 1st order predicate calculus • Is compiled for optimization into the relational algebra • Result is a table • MSQL Query • May use higher-order predicate calculus • Is compiled for optimization into the multirelational algebra • Result is a multitable • A set of relations (tables) • May be constituted from one or no tables
MSQL(More on functions specific to MDB env.) • Addressing of tables in different DBs • Implicitly or by qualification by (multi)database names • Introduced around 1985 by relational multibase system prototype MRDSM • B A BA project at INRIA • Unknown at that time of any relational language • See the overview of relational DBMSs existing in 1987 (M. Brodie)
MSQL(More on functions specific to MDB env.) • Manipulation of semantically heterogeneous data • Multiple Queries • With multiples identifiers • With semantic variables • Ranging over data names • Scale and Precision • Units of measure • Implicit joins • Capabilities still unknown of SQL • Capabilities known at present to some dialects • Limited with respect to MSQL
MSQL : example View View SIL = Internal Logical Schema
Conceptual Schemas (the multischema) DB bnp : br (br#, brname, street, street#, city, zipcode, tel)account (acc#, cl#, balance, br#)client (cl#, clname, cltel, cltype, street, street#, city, zipcode)spe-acc (acc#, br#, cl#, balance, curr) DB sg : branch (bra#, braname, street, s#, town, zip, t#, class)acc (acc#, bra#, c#, balance)client (c#, cname, ct#, ctype, street, s#, town, zip) DB cic : br (br#, brname, street, street#, city, zipcode, tel)account (ac#, br#, cl#, balance, open_date)client(cl#, clname, cltel, cltype, street, street#, city, zipcode)
Semantic Heterogeneity In Banks • Same names can designate different data • Different names can designate same data • same client, same town.. • The value of a primary key is valid only in one DB • how to identify same client in diff. banks ?
MSQL Commands • CREATE TABLE CREATE DATABASE • CREATE MULTIDATABASE CREATE VIEW • ALTER TABLE ALTER VIEW • ALTER MULTIDATABASE • DROP TABLE DROP DATABASE • DROP MULTIDATABASE DROP VIEW
MSQLCREATE DATABASE • > MSQL • CREATE DATABASE boulogne ; • CREATE DB |.com.org.user.boulogne ; • CREATE MULTIDATABASE Banks (bnp cic sg ); • USE Banks; • CREATE DATABASE boulogne FROM bnp ; Query scope
MSQLCREATE MULTIDATABASE • MSQL • CREATE MDB EC-Banks (f-banks-i-banks, s-banks, g-banks, e-banks ); CREATE MULTIDATABASE can create : • flat MDBs (only contain DBs) • nested MDBs (DBs or MDBs) • can be potentially any network of DBs or MDBs • like through the links on the WEB • what about cycles ?
MSQLCREATE TABLE Import use banks ; CREATE TABLE boulogne.loan FROM bnp.loan ; CREATE TABLE fake_checks (Chq# INT, Montant_Euro CURRENCY [EURO] .... ); One has created four (empty) tables :bnp. fake_checks , cic. fake_checks ... boulogne.fake_checks CREATE TABLE boulogne.client (c#, cn, ct#) FROM bnp.client (cl#, clname, cltel) PRIMARY KEY (c#) (cn, ct#) OUTER REFERENCES (clname, cltel); Unit of mesure
MSQLCREATE TABLE with References USE AuPrintemps /* MDB AuPrintemps CREATE TABLE MusicDep.Inventory …. FOREIGN KEY (Item#) REFERENCES Central.Stock(I#); No unauthorized Item# in the inventory of the Music Department • Other options • PRIMARY KEY (…) REFERENCES T(…) ; • [T1(A)] [LEFT|RIGHT] REFERENCES T2(B) ; • Generates implicit equijoin, or left or right implicit outerjoins when a query selects attributes A and B.
MSQLALTER MULTIDATABASE use banks ; alter banks include vernesremove cic • Alter MDB can create • flat MDBs (only contain DBs) • nested MDBs
MSQL Elementary queries Prefixing with DB names was unknown to SQL - and is in DB2 SQL since last year only
MSQL Default DB Tables of the default database are not prefixed
MSQL Elementary queries without prefixed names Table names are unique within the query scope
Updates USE (bnp b) sg ; UPDATE account SET account.balance = account.balance + 500 WHERE account.balance > acc.balance AND b.client.clname = sg.client.cname AND b.client.street = sg.client.street ; • What does it mean ?
Multiple Updates Begin Use BanksUpdate cl*set street = 'Charles de Gaulle"where street = 'Etoile' If SQLCODE <> 0 then Rollback ; Commit Use Banks vital cicUpdate cl*set street = 'Charles de Gaulle"where street = 'Etoile' • MSQL transaction semantics is more general than ACID • may include COMP (compensation) statement, list of accept. states....
Semantic Variables in MSQL use bnp sglet x be town cityselect * from b%where x = 'Paris' and street = 'r. de Rivoli'
Semantic Variables in MSQL use bnp sglet x be town cityselect * from b%where x = 'Paris' and street = 'r. de Rivoli' use bnp select * from br where town = 'Paris' and street = 'r. de Rivoli' use sg select * from branch where city = 'Paris' and street = 'r. de Rivoli'
Semantic Variables in MSQL use bnp sglet x be town cityselect * from b%where x = 'Paris' and street = 'r. de Rivoli' Alternatively: use bnp sglet x be to% cityselect * from b%where x = 'Paris' and street = 'r. de Rivoli'
Semantic Variables in MSQL use banks let X be banks.* select a%, balance, c%namefrom X.a% a, X.c% c where a. a%= c. c% • The query illustrates the multitable pair-wise join • Semantic variable a over relation name account is not necessary, but simplifies the typing of the query
banks FD Semantic Variables in MSQL • Semantic variables can be compound and with values selected by queries from some dictionaries use bankslet (x, y) be :select X.attr Y.attr from FD X, FD Y where X.mean = tel and Y.mean = city select * from clientwhere x = '123' and y = 'Paris' FD mean attr tel t# tel tel city city city town city burgh
Semantic Variables in MSQL • Can be applied to MSQL DD statements use banks create database cic2 ; let x be a% b% c% create table cic2.x from cic.x ; • Copies cic schema except for one table
Name homogenizationThe labels USE Banks ; LET t BE tel t#SELECT %name branch_name, t tel#, s%# street#FROM br% brWHERE street = ‘Champs Elysées’ ; The result : multitable: {( bnp.br.branch_name, bnp.br.tel#, bnp.br.street# ), ( sg.br.branch_name, sg.br.tel#, sg.br.street# )( cic.br. branch_name, cic.br.tel#, cic.br.street# )}
Multidatabase Views bnp my_bank sg A partial view of DBs bnp and sg in DB my_bank The views in my_bank can be considered Import Schemes
Multidatabase Union Views Use Banks Create View bnp.all-banks as Use banks let x be town city let y be banks.* Select y.br% ( y, br#, br%name branch, street, street#, x city, zip% zip, t% tel) Union * • Union * unions all the tables of the selected multitable • It scales to all the tables named br% of Banks, if new banks enters the MDB Banks in the future • Current DBMS, e.g., SQL Server, require to alter the union view definition in such a case
Key words and Aggregate Functions in MSQL • Key words and Aggregate Functions of SQL • par definition • DISTINCT, GROUP BY, ORDER BY • COUNT, AVG, SUM… • operate at each table of a multitable • Their extensions to multitables • MDISTINCT, MCOUNT, MGROUP BY, MORDER BYMAVG, MSUM... • operate at whole multitable • important for warehousing
Example USE BanksSELECT COUNT (*)FROM br% brWHERE street = 'champs elysées' ;
Example USE BanksSELECT COUNT (*)FROM br% brWHERE street = 'champs elysées' ; bnp.br2 cic.br2 sg.br2
Example USE BanksSELECT MCOUNT (*)FROM br% brWHERE street = 'champs elysées' ;
Example USE BanksSELECT MCOUNT (*)FROM br% brWHERE street = 'champs elysées' ; br6 • Exercises in warehousing : • -Average balance per client in each bank • Average balance per client in BANKS • Sum of client assets per bank • Sum of client assets in BANKS
Aggregate Functions IMPLEMENTATION ISSUES • All-in-one (traditional computation) • Possibly in parallel • The calculus can take long time. • Successive approximations • Some kind of sampling • result1, from any 1st DB to come • (result1 + result2) / 2 • … • sampling within each database • several ACM-Sigmod & VLDB papers dealt with query evaluation using sampling • Precomputing • Incremental evaluation using interdatabase dependencies • Common to warehousing
Aggregate Functions MERGE ON • form a single tuple from all the tuples of the same objet in the multitable • Uses outer jointures • Find millionaires in Banks and form the tuple for each millionaire USE Banks ;LET x.y BE clname.cltel cname.ct#LET z BE Banks.*SELECT *FROM z.a% WHERE z.a%.c%# = z.client.c%#AND z.a%.balance > 1 000 000MERGE ON x y ;
Aggregate Functions MERGE ON nulls nulls nulls • USE Banks ;LET x.y BE clname.cltel cname.ct#LET z BE Banks.*SELECT *FROM z.a% WHERE z.a%.c%# = z.client.c%#AND z.a%.balance > 1 000 000MERGE ON x y ;
Aggregate Functions NAME • Transform a name (table, attribute..) into attribute value USE Banks ;LET x.y BE br.city branch.townSELECT %name branch_name, NAME (.x) bankFROM xWHERE y = 'Nice' UNION * ; Note: Union * unions all the tables of the selected multitable • the result is the table :branch_name bank Jaures CIC DeGaulle BNP
Aggregate Functions CHOOSE • Chooses at most n tuples among the selected ones • the 1st found as does the function TOP (default) in any or some order, specified by ORDER BY (default) • strictly random (RND) • these that were not chosen by the previous execution of the query in the same transaction (NEW) • preferably in the DBs listed, and in the listed order • at most j per DB • selecting at most m tuples sharing the values of the attr. in the list A, supposed global key of some objet. CHOOSE (n, (m, <A>), [<B>]| j, [<B>], [RND | NEW] <A> ::= <list of attr.> <B> ::= <list of DBs>
Aggregate Functions CHOOSE • Choose a millionaire randomly USE Banks ;SELECT c.*FROM c% c, a% aWHERE c.c%# = a.c%# AND a.ba% > 1.000.000CHOOSE (1) RND ; • Function very important in MBD environment • information overload
Aggregate Functions TIMEOUT • Fix time limit of a query • the system should possibly deliver all the relevant tuples • however, any query arriving to timeout is considered executed successfully TIMEOUT (t [unit]) ;<unit> := ms | s | m | h | ds - seconds (default) USE BanksSELECT * FROM br%WHERE street = 'champs elysées' TIMEOUT (10) ;
Aggregate Functions POST • Make a query continuous • One manipulates each tuple found during the life time of the query • Even those created after the query start • TIMEOUT may be used to limit the life time USE Immo LaCentrale Orpi ;SELECT * FROM logem%WHERE prix < 1,000,000 AND Ville = 'Paris' POST ;
Aggregate Functions ESTIMATE • Compute the cost of a query before the execution and can start the execution after an authorizationESTIMATE (type, price, time, count, size, report) [WITH EXEC_PROMPT] • type of estimate : • exact (can be long to compute) • approximate • price of the query (in $, FF...). • completion time • number of tuples • size of the resultant, in bytes • report on the estimate itself • precision...
Privileges in MSQL USE bnp sg cic ; GRANT SELECT ON client TO Nicolas Abdellatif ; client is a multitable : client = (bnp.client, sg.client, cic.client) GRANT ALL ON etoile.account TO Nicolas Abdellatif FROM bnp.account ; GRANT ALL ON etoile.account TO Nicolas FROM Zeroual ON bnp.account ;
Interdatabase Queries • Transfer data between DBs • Source and target are multitables INSERT...