
Distributed Database Design

This project explores the reference architecture of a Distributed Database Management System (DDBMS), including the design of global, fragmentation, and allocation schemas. It also covers the component architecture of a DDBMS, including the roles of Local DBMS, Data Communications, and Global System Catalog.



Presentation Transcript


  1. Distributed Database Design Manal Ahmad Project Coordinator

  2. Reference Architecture of DDBMS • A set of global external schemas; • A global conceptual schema; • A fragmentation schema and allocation schema; • A set of schemas for each local DBMS conforming to the ANSI-SPARC three-level architecture.

  3. Reference Architecture of DDBMS

  4. Reference Architecture of DDBMS • Global conceptual schema The global conceptual schema is a logical description of the whole database, as if it were not distributed. This level corresponds to the conceptual level of the ANSI-SPARC architecture and contains definitions of entities, relationships, constraints, security, and integrity information.

  5. Reference Architecture of DDBMS • Fragmentation and allocation schemas The fragmentation schema is a description of how the data is to be logically partitioned. The allocation schema is a description of where the data is to be located, taking account of any replication.
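The two schemas can be pictured as a pair of lookup tables. The following is a minimal sketch, not part of the original slides: the fragment names (P1, P2), the site names, and the helper functions are all hypothetical and chosen only for illustration.

```python
# Hypothetical illustration: fragment names, site names, and predicates are assumptions.

# Fragmentation schema: how each global relation is logically partitioned.
fragmentation_schema = {
    "PropertyForRent": {
        "P1": "type = 'house'",
        "P2": "type = 'flat'",
    },
}

# Allocation schema: where each fragment is stored.
# A fragment listed at more than one site is replicated.
allocation_schema = {
    "P1": ["site_A"],
    "P2": ["site_B", "site_C"],
}


def sites_for_relation(relation):
    """Return every site that holds at least one fragment of the relation."""
    sites = set()
    for fragment in fragmentation_schema[relation]:
        sites.update(allocation_schema[fragment])
    return sites


def is_replicated(fragment):
    """A fragment is replicated when it is allocated to more than one site."""
    return len(allocation_schema[fragment]) > 1
```

Keeping the two mappings separate mirrors the slide: the first answers "how is the data split?", the second answers "where does each piece live?".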

  6. Reference Architecture of DDBMS • Local schemas Each local DBMS has its own set of schemas. The local conceptual and local internal schemas correspond to the equivalent levels of the ANSI-SPARC architecture. The local mapping schema maps fragments in the allocation schema into external objects in the local database.

  7. Reference Architecture for a Federated MDBS

  8. Component Architecture for a DDBMS • Local DBMS (LDBMS) component; • Data communications (DC) component; • Global system catalog (GSC); • Distributed DBMS (DDBMS) component;

  9. Component Architecture for a DDBMS

  10. Component Architecture for a DDBMS • Local DBMS (LDBMS) component The LDBMS component is a standard DBMS, responsible for controlling the local data at each site that has a database. It has its own local system catalog that stores information about the data held at that site. • Data communications (DC) component The DC component is the software that enables all sites to communicate with each other. The DC component contains information about the sites and the links.

  11. Component Architecture for a DDBMS • Global system catalog (GSC) The GSC has the same functionality as the system catalog of a centralized system. The GSC holds information specific to the distributed nature of the system, such as the fragmentation, replication, and allocation schemas. • Distributed DBMS (DDBMS) component The DDBMS component is the controlling unit of the entire system.

  12. Distributed Relational Database Design • Fragmentation • Allocation • Replication

  13. Distributed Relational Database Design • Fragmentation A relation may be divided into a number of subrelations, called fragments, which are then distributed. There are two main types of fragmentation: horizontal and vertical. Horizontal fragments are subsets of tuples and vertical fragments are subsets of attributes.

  14. Distributed Relational Database Design • Allocation Each fragment is stored at the site with “optimal” distribution. • Replication The DDBMS may maintain a copy of a fragment at several different sites.

  15. Fragmentation • Definition and allocation of fragments carried out strategically to achieve: • Locality of Reference • Improved Reliability and Availability • Improved Performance • Balanced Storage Capacities and Costs • Minimal Communication Costs.

  16. Fragmentation • Locality of Reference Where possible, data should be stored close to where it is used. If a fragment is used at several sites, it may be advantageous to store copies of the fragment at these sites. • Improved Reliability and Availability Reliability and availability are improved by replication: there is another copy of the fragment available at another site in the event of one site failing.

  17. Fragmentation • Improved Performance Bad allocation may result in bottlenecks occurring; that is, a site may become inundated with requests from other sites, perhaps causing a significant degradation in performance. Alternatively, bad allocation may result in underutilization of resources. • Balanced Storage Capacities and Costs Consideration should be given to the availability and cost of storage at each site, so that cheap mass storage can be used where possible. This must be balanced against locality of reference.

  18. Fragmentation • Minimal Communication Cost Consideration should be given to the cost of remote requests. Retrieval costs are minimized when locality of reference is maximized or when each site has its own copy of the data. However, when replicated data is updated, the update has to be performed at all sites holding a duplicate copy, thereby increasing communication costs.
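The trade-off between cheap reads and expensive updates can be made concrete with a toy cost model. This is a sketch under stated assumptions, not a model from the slides: every remote message costs one unit, a read is free when the issuing site holds a copy, and an update must be sent to every copy held elsewhere. The function name and the workload figures are hypothetical.

```python
def communication_cost(copy_sites, reads, updates, unit_cost=1):
    """Toy communication cost model (an assumption, not from the slides).

    copy_sites: set of sites holding a copy of the fragment
    reads, updates: dicts mapping site -> number of requests issued at that site
    """
    # A read is remote only if the issuing site has no local copy.
    read_cost = sum(
        n * unit_cost for site, n in reads.items() if site not in copy_sites
    )
    # An update is propagated to every copy stored at another site.
    update_cost = sum(
        n * unit_cost * len(copy_sites - {site}) for site, n in updates.items()
    )
    return read_cost + update_cost


reads = {"A": 100, "B": 100, "C": 100}

# Read-heavy workload: replication everywhere is cheaper.
single_copy = communication_cost({"A"}, reads, {"A": 10})            # 200
fully_replicated = communication_cost({"A", "B", "C"}, reads, {"A": 10})  # 20

# Update-heavy workload: the same replication now costs more than one copy.
update_heavy = communication_cost({"A", "B", "C"}, reads, {"A": 150})  # 300
```

With few updates, full replication removes all remote reads; once updates dominate, propagating each one to every copy outweighs the saving.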

  19. Data Allocation • Four alternative strategies regarding placement of data: • Centralized • Partitioned (or Fragmented) • Complete Replication • Selective Replication

  20. Data Allocation • Centralized Consists of a single database and DBMS stored at one site, with users distributed across the network. • Partitioned The database is partitioned into disjoint fragments, with each fragment assigned to one site.

  21. Data Allocation • Complete Replication Consists of maintaining a complete copy of the database at each site. • Selective Replication A combination of partitioning, replication, and centralization.
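The four strategies can be told apart by looking at where each fragment is placed. The sketch below is a simplified classification written for illustration only; the function name, the fragment names (F1, F2), and the site names are hypothetical, and real systems rarely fall this cleanly into one category.

```python
def classify_allocation(allocation, all_sites):
    """Classify a placement of fragments (simplified, illustrative rules).

    allocation: dict mapping fragment name -> list of sites holding it
    all_sites: list of every site in the network
    """
    placements = [set(sites) for sites in allocation.values()]
    used_sites = set().union(*placements)

    # Every fragment at every site: the whole database is copied everywhere.
    if len(all_sites) > 1 and all(p == set(all_sites) for p in placements):
        return "complete replication"
    # Each fragment stored exactly once.
    if all(len(p) == 1 for p in placements):
        return "centralized" if len(used_sites) == 1 else "partitioned"
    # Some fragments replicated, others not.
    return "selective replication"


sites = ["A", "B"]
centralized = classify_allocation({"F1": ["A"], "F2": ["A"]}, sites)
partitioned = classify_allocation({"F1": ["A"], "F2": ["B"]}, sites)
complete = classify_allocation({"F1": ["A", "B"], "F2": ["A", "B"]}, sites)
selective = classify_allocation({"F1": ["A", "B"], "F2": ["B"]}, sites)
```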

  22. Fragmentation • Why Fragmenting? • Usage • Efficiency • Parallelism • Security

  23. Why Fragmenting? • Usage Applications work with views rather than entire relations. Therefore, for data distribution, it seems appropriate to work with subsets of relations as the unit of distribution. • Efficiency Data is stored close to where it is most frequently used. In addition, data that is not needed by local applications is not stored. • Parallelism With fragments as the unit of distribution, a transaction can be divided into several subqueries that operate on fragments. This should increase the degree of concurrency, or parallelism, in the system, thereby allowing transactions that can do so safely to execute in parallel.

  24. Why Fragmenting? • Security Data not required by local applications is not stored and consequently not available to unauthorized users.

  25. Why Fragmenting? • Disadvantages of fragmentation • Performance The performance of global applications that require data from several fragments located at different sites may be slower. • Integrity Integrity control may be more difficult if data and functional dependencies are fragmented and located at different sites.

  26. Correctness of fragmentation • Completeness • Reconstruction • Disjointness

  27. Correctness of fragmentation • Completeness If a relation instance R is decomposed into fragments R1, R2, . . ., Rn, each data item that can be found in R must appear in at least one fragment. This rule is necessary to ensure that there is no loss of data during fragmentation.

  28. Correctness of fragmentation • Reconstruction It must be possible to define a relational operation that will reconstruct the relation R from the fragments. This rule ensures that functional dependencies are preserved. • Disjointness If a data item di appears in fragment Ri, then it should not appear in any other fragment. Vertical fragmentation is the exception to this rule, where primary key attributes must be repeated to allow reconstruction. This rule ensures minimal data redundancy.
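The three correctness rules can be checked mechanically for horizontal fragments. This is a minimal sketch, assuming a relation is modelled as a Python set of tuples and that reconstruction is done by union; the sample rows and function names are hypothetical.

```python
def is_complete(relation, fragments):
    """Completeness: every tuple of the relation appears in at least one fragment."""
    return all(any(row in frag for frag in fragments) for row in relation)


def reconstructs(relation, fragments):
    """Reconstruction (horizontal case): the union of the fragments equals the relation."""
    return set().union(*fragments) == set(relation)


def is_disjoint(fragments):
    """Disjointness: no tuple appears in more than one fragment."""
    seen = set()
    for frag in fragments:
        if seen & frag:
            return False
        seen |= frag
    return True


# Hypothetical sample relation: (property number, type)
R = {("PA14", "house"), ("PL94", "flat"), ("PG4", "flat")}
R1 = {row for row in R if row[1] == "house"}
R2 = {row for row in R if row[1] == "flat"}

good = [R1, R2]
overlapping = [R1, R2 | R1]   # R1's tuple is stored twice
lossy = [R1]                  # the flats are lost
```

Here `good` satisfies all three rules, `overlapping` is complete and reconstructible but not disjoint, and `lossy` fails completeness.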

  29. Types of fragmentation • Horizontal • Vertical • Derived

  30. Types of fragmentation • Horizontal fragmentation Consists of a subset of the tuples of a relation.

  31. Horizontal Fragmentation • Horizontal fragmentation groups together the tuples in a relation that are collectively used by the important transactions. A horizontal fragment is produced by specifying a predicate that performs a restriction on the tuples in the relation. It is defined using the Selection operation (σ) of the relational algebra. • P1: σ_{type = "house"}(PropertyForRent) • P2: σ_{type = "flat"}(PropertyForRent)
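The two selections P1 and P2 can be sketched in a few lines. This is an illustration only: the relation is modelled as a list of dictionaries, and the sample rows and the attributes other than `type` (`propertyNo`, `rent`) are assumptions, not data from the slides.

```python
# Hypothetical sample data; only the "type" attribute comes from the slide.
property_for_rent = [
    {"propertyNo": "PA14", "type": "house", "rent": 650},
    {"propertyNo": "PL94", "type": "flat", "rent": 400},
    {"propertyNo": "PG4", "type": "flat", "rent": 350},
    {"propertyNo": "PG21", "type": "house", "rent": 600},
]


def select(relation, predicate):
    """Relational Selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


# P1 and P2 as defined on the slide.
P1 = select(property_for_rent, lambda row: row["type"] == "house")
P2 = select(property_for_rent, lambda row: row["type"] == "flat")
```

Because every row is either a house or a flat in this sample, the two fragments together cover the relation without overlap; a row with any other `type` value would fall into neither fragment, which is why the predicates must be chosen to cover all values.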

  32. Horizontal Fragmentation

  33. Types of fragmentation • Vertical Fragment Consists of a subset of the attributes of a relation.

  34. Vertical Fragmentation • Vertical fragmentation groups together the attributes in a relation that are used jointly by the important transactions. A vertical fragment is defined using the Projection operation (π) of the relational algebra. • S1: π_{staffno, position, sex, DOB, salary}(Staff) • S2: π_{staffno, fname, lname, branchno}(Staff)
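The projections S1 and S2, and their reconstruction by joining on the repeated primary key, can be sketched as follows. The attribute names come from the slide; the sample rows and the helper functions are hypothetical, and the join shown is a simple key lookup rather than a general natural join.

```python
# Hypothetical sample rows; attribute names follow the slide.
staff = [
    {"staffno": "S1", "fname": "Ann", "lname": "Lee", "position": "Manager",
     "sex": "F", "DOB": "1970-01-01", "salary": 30000, "branchno": "B1"},
    {"staffno": "S2", "fname": "Raj", "lname": "Patel", "position": "Assistant",
     "sex": "M", "DOB": "1985-06-15", "salary": 12000, "branchno": "B2"},
]


def project(relation, attributes):
    """Relational Projection: keep only the named attributes.

    No duplicate elimination is needed here because each fragment
    keeps the primary key, so its rows stay unique.
    """
    return [{a: row[a] for a in attributes} for row in relation]


def join_on(left, right, key):
    """Join two fragments on a shared key attribute."""
    index = {row[key]: row for row in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]


S1 = project(staff, ["staffno", "position", "sex", "DOB", "salary"])
S2 = project(staff, ["staffno", "fname", "lname", "branchno"])

# Reconstruction: joining on the repeated primary key restores Staff.
rebuilt = join_on(S1, S2, "staffno")
```

Both fragments repeat `staffno`, which is the exception to the disjointness rule that makes reconstruction possible.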

  35. Vertical Fragmentation

  36. Types of fragmentation • Mixed Fragmentation Consists of a horizontal fragment that is subsequently vertically fragmented, or a vertical fragment that is then horizontally fragmented.
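Mixed fragmentation is a composition of the two basic operations. The sketch below first takes a vertical fragment of a Staff relation and then splits it horizontally by branch; the sample rows, the choice of `branchno` as the splitting attribute, and the fragment names are assumptions made for illustration.

```python
# Hypothetical sample rows for a cut-down Staff relation.
staff = [
    {"staffno": "S1", "fname": "Ann", "lname": "Lee", "branchno": "B1", "salary": 30000},
    {"staffno": "S2", "fname": "Raj", "lname": "Patel", "branchno": "B2", "salary": 12000},
    {"staffno": "S3", "fname": "Mei", "lname": "Chen", "branchno": "B1", "salary": 18000},
]


def project(relation, attributes):
    """Relational Projection: keep only the named attributes."""
    return [{a: row[a] for a in attributes} for row in relation]


def select(relation, predicate):
    """Relational Selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


# Step 1: vertical fragment (drop salary).
S2 = project(staff, ["staffno", "fname", "lname", "branchno"])

# Step 2: horizontally fragment that vertical fragment by branch.
S2_B1 = select(S2, lambda row: row["branchno"] == "B1")
S2_B2 = select(S2, lambda row: row["branchno"] == "B2")
```

Applying the steps in the opposite order (select first, then project) gives the other form of mixed fragmentation described on the slide.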
