1 / 37

Multidatabase and Distributed Manipulations Part 1

Multidatabase and Distributed Manipulations Part 1. Witold Litwin. Plan. Introduction Technical problems Origin of the concept Approach : Centralized DB (ANSI-SPARC) Approach : DDB (top-down) Approach : Global Schema (bottom-up) Reference architectures

lan
Download Presentation

Multidatabase and Distributed Manipulations Part 1

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Multidatabase and Distributed ManipulationsPart 1 Witold Litwin

  2. Plan • Introduction • Technical problems • Origin of the concept • Approach : Centralized DB (ANSI-SPARC) • Approach : DDB (top-down) • Approach : Global Schema (bottom-up) • Reference architectures • Multidatabase architecture • Federated architecture • Autonomy, semantic heterogeneity , common model

  3. Multidatabase Model • Single database model ANSI-SPARC : • The real universe is modeled as one DB • The real universe is modeled as multiple dbs • Autonomous • Semantically heterogeneous • Manipulated through a multibase language • "Multidatabase interoperability". Litwin, W. Abdellatif, A. Multidatabase systems: An advanced solution for global information sharing. Hurson, A., R., Bright, M., W., Pakzad, S., H., (Ed.). IEEE press, 1993

  4. Multidatabase Model Autres BDs sur Internet Rest. Cours & étudiants Mes-amis Bibliothèque Privé Paris 9 Employés Folio Cine Teletel

  5. Problems • Reference architecture • Semantic heterogeneity in presence of local autonomy • Common data model • Fonctions of MDB langage • Transactions • Protocols & standards • Performance

  6. Reference Architectures • Multidatabase architecture • Generalization of the ANSI-SPARC architecture • Federated dbs architecture • Generalization of the federated DB architecture • Others

  7. ANSI-SPARC Architecture Centralized Integrated BD (1960-70) ES - External Schema CS - Conceptual Schema PS - Physical or Internal Schema

  8. Distributed DB • Origin of the concept (years 1970) • WAN development ( 20 kb/s) • Overload of centralized dbs

  9. Ddb • Idea : distribution of functions other than local communication ("top-down" approach) • Which ones ? • Distr. Execution (OS) • File access • The DB • Then, what data model for CS ? • Network (codasyl) ? • Relational

  10. Relation Fragmentation Hotels (H#, Ville, Cat, #Chambres) 1 Fragment (H#, Cat, #Chambres) (H#, Ville)

  11. Problems • GS scalability • GS utility for a local user • Query performance (bad case) • Data migration from ldbs • IMS, IDMS, Socrate • Local applications

  12. Problemes With the "Bottom-up" Approach ES ES • Creation of GS • Semantic heterogeneity • Time for the integration / local restructuring autonomy • In other words: scalability of GS • Updates • Performance • Heterogeneos views GS CS CS CS PS PS PS GS Approach ("bottom-up")

  13. MDB Architecture Absence of (GS) ES ES • User may have data in multiple dabs ANSI-SPARC compatibles • One may face multiple css • In general, it will be impossible to create a GS GS CS CS CS PS PS PS GS Approach ("bottom-up")

  14. MBD Architecture Concept of the MDB Language • Language for definition and manipulationof collections of dbs (multidatabases) • Definition of MDB ESs • Perhaps GSs • Definition of MDB dependancies • Semantics, integrity, security, manipulation... • Formulation of MDB queries (explicitly) • Referencing DB names • With MDB joins... • Find in DB michelin and in DB gaumont all restaurants '**' and cinemas at the same street

  15. MBD Architecture Multibases • A multidatabase (MDB) is a collection of dbs with MDB language • E.G. MSQL • A collection of dbs without MDB langage is not an MDB, but just a collection of dbs • As a collection of flat files (tables) without a DB language , SQL for example, is not a DB

  16. Potential Multibases Other DBs sur Internet Rest. Cours & étudiants Mes-amis Bibliothèque Privé Paris 9 Employés Folio Cine Teletel

  17. MDB Architecture Concept of Internal Logical Sublayer • The legacy data models can be heterogeneous • Different SQL dialects • Relational, hierarchical, network • OO and object-relational • It is preferable to have one model at MDB layer • One needs a sub-layer for translations • Also local DBA may wish not to show some of the data at MDB layer • Solution: ILS - internal logical schema • Unknown of ANSI-SPARC • Called also gateway

  18. Multibase Architecture Result of All This : (W. Litwin & Al, Années 1980) Req. MDB ES multibase

  19. Federated Architecture (Hambiger & Mcleod, Années 1980) • Every DB should be autonomous • In general, there will be no GS • Global integration is against the autonomy • The dbs used in common should form a federation of autonomous dbs • Every DB in a federation should be provided with three schemes: • ES: export schema • IS: import schema • PS: private schema : for all the private data, of ES and of IS included • There should be some federation dictionary (FD)

  20. Federated Architecture (Hambiger & Mcleod, Années 1980)

  21. Comparison • MDB architecture focuses on the concept of MDB language • Federated architecture focuses on the concept of autonomy • No MDB language • But there is the notion of autonomy also in MDB arch. • MDB architecture is + decentralized • No equivalent of FD • Several DSs • Both architectures are popular

  22. Comparison MDB <-> Fed.

  23. DB Autonomy(localautonomy) • Capability to control local DB data by the DBA • Naming • Value Types • Data Structures • Physical Structures • Query Execution • Security • Priority to local queries

  24. Multidatabase Autonomy • Capability to controle multiple DBs by a DBA • Same aspects as for the local autonomy • Naming... • May create a conflict with a DB autonomy • Priority to local autonomy B1 B2 B3

  25. Semantic Heterogeneity • Differences in representations of the same real properties • Names André Andrew • Value Types • Representations • units of measure cm/s pied/h • precision 1 g  1 Kg • Data Structures One table in 2 NF several tables in 3 NF

  26. Solutions (partial) • Schemas + descriptive • Protocols + descriptive • Data Dictionaries • Thesaurus • Automatic Representation Conversions • Automatic Unit Conversion • Implicit Joints • Higher level models and manip. languages • IDL (Krishnamourthy, Litwin, Kent, ACM-Sigmod 92

  27. Common Models • Ext. Relational • EDA-SQL • MSQL (research) • ODBC Microsoft SQL • Object-Relational • SQL 3 • CCS language for inf. retr. DBs • Numerous gateways towards SQL • IMS SQL • Codasyl SQL

  28. UniSQL/M UniSQL/M IMS UniSQL Sybase Oracle

  29. Other gateways UniSQL/M IMS UniSQL Sybase Oracle

  30. Yet other gateways UniSQL/M IMS UniSQL Sybase Oracle EDA-SQL

  31. The future (ODBC, JDBC, UDA…) ODBC x

  32. Conclusion • Modern DBMSs are in general MDBMSs • UniSQL/M, Oracle, Sybase, MsAccess, SQL Server, InterBase, DB2... • MDB access requires new functions for the management of • autonomy • semantic heterogeneity • physical distribution of data

  33. Conclusion • technical solutions are basd on : • new reference architectures • multidatabase architecture • federated architecture • common data models • relational and object-relational • Gateways in rapid development • Any DBMS towards any other using ODBC, JDBC… • Relational or XML wrappers

  34. Conclusion • MDB Languages • MSQL et SQL-x ; x > 2 • New transaction models • Protocoles et Standards • ODBC, JDBC… • All this is dealt with more in depth • later on • in the class books & several others

  35. Exercises & Research Problems • All these already in the text • Difference between the concepts of a DB, DDB,MDB and FDB. • What does it mean «  reference architecture », as ANSI-SPARC for example ? • Differences between the architectures: « top-down » « bottom-up », multibase et federated. • Comment on the actual architecture of federated DBs in DB2 V. 6 and later (see DB2 Help or RedBooks at the IBM web site). • Comment on the actual architecture of federated DBs in SQL Server 2000 (see SQL Server 2000 Help or the papers at the MS web site). • Design SQL SELECT statements realizing the fragmentation of the Hotels DB. • Can you propose then how to dealwith the updates ? • Comment on the concept of ILS, of gateway and of mediator • What multidatabase common model is most used today ? • Comment the concept of local autonomy (what, why, how) • Provide examples of various types of semantic heterogeneity • Prove that usual associability of equijoints does not hold if units of measure of values to join may have different precisions • What are consequences for relational DBMSs ? • Propose an extension to SQL for units of measure and the corresponding query optimization (PH.D Thesis).

  36. TheEND

More Related