410 likes | 648 Views
PMIT-6102 Advanced Database Systems. By- Jesmin Akhter Assistant Professor, IIT, Jahangirnagar University. Lecture 05 Distributed Database Design . Outline. Distributed Database Design Distributed Design Issues Fragmentation Data Allocation. Degree of Fragmentation.
E N D
PMIT-6102Advanced Database Systems By- JesminAkhter Assistant Professor, IIT, Jahangirnagar University
Outline • Distributed Database Design • Distributed Design Issues • Fragmentation • Data Allocation
Degree of Fragmentation finite number of alternatives Relations(Not to fragment at all) To fragment to the level of individuals tuples or To fragment to the level of individuals attributes Finding the suitable level of partitioning within this range
Correctness of Fragmentation • Completeness • Decomposition of relation R into fragments R1, R2, ..., Rn is complete if and only if each data item in R can also be found in some Ri • This property, which is identical to the lossless decomposition property of normalization • it ensures that the data in a global relation are mapped into fragments without any loss
Correctness of Fragmentation • Reconstruction • If relation R is decomposed into fragments FR={R1, R2, ..., Rn}, then there should exist some relational operator ∇such that R = ∇ Ri , • The reconstructability of the relation from its fragments ensures that constraints defined on the data in the form of dependencies are preserved.
Correctness of Fragmentation • Disjointness • If relation R is horizontally decomposed into fragments FR={R1, R2, ..., Rn}, and data item djis in Rj, then dj should not be in any other fragment Rk (k ≠ j ). • This criterion ensures that the horizontal fragments are disjoint. • If relation R is vertically decomposed, its primary key attributes are typically repeated in all its fragments (for reconstruction). • Therefore, in case of vertical partitioning, disjointness is defined only on the non-primary key attributes of a relation.
Allocation Alternatives • Non-replicated database • partitioned : each fragment resides at only one site • Replicated database • fully replicated : each fragment at each site • partially replicated : each fragment at some of the sites
Information Requirements • The information needed for distribution design can be divided into four categories: • Database information • Application information • Communication network information • Computer system information
Fragmentation • Horizontal Fragmentation (HF) • Vertical Fragmentation (VF) • Hybrid Fragmentation (HF)
Horizontal Fragmentation (HF) • Horizontal fragmentation partitions a relation along its tuples. • each fragment has a subset of the tuples of the relation. • There are two versions of horizontal partitioning: • Primary horizontal fragmentation • Derived horizontal fragmentation • Primary horizontal fragmentation of a relation is performed • using predicates that are defined on that relation. • Derived horizontal fragmentation is the partitioning of a relation • results from predicates being defined on another relation.
Database Information how the database relations are connected to one another, especially with joins. In the relational model, these relationships are also depicted as relations. cardinality of each relation: card(R) Information Requirements of HF SKILL TITLE, SAL L1 PROJ EMP ENO, ENAME, TITLE PNO, PNAME, BUDGET, LOC L2 L3 ASG ENO, PNO, RESP, DUR Expression of Relationships Among Relations Using Links
Information Requirements of HF The direction of the link shows a one to many relationship. For each title there are multiple employees with their title. Thus there is a link between PAY and EMP relations.
Information Requirements of HF • Application Information • simple predicates: Given R[A1, A2, …, An], a simple predicate pj is pj : AiθValue where θ {=,<,≤,>,≥,≠}, Value Diand Di is the domain of Ai. For relation R we define Pr= {p1, p2, …,pm} Example : PNAME = "Maintenance" BUDGET ≤ 200000
Information Requirements of HF • Application Information • mintermpredicates: Given R and Pr = {p1, p2, …,pm} define M = {m1,m2,…,mr} as M = { mi | mi= pjPrpj* }, 1≤j≤m, 1≤i≤z wherepj* = pj orpj* = ¬(pj). Example m1: PNAME="Maintenance" BUDGET≤200000 m2: NOT(PNAME="Maintenance") BUDGET≤200000 m3: PNAME= "Maintenance" NOT(BUDGET≤200000) m4: NOT(PNAME="Maintenance")NOT(BUDGET≤200000)
Information Requirements of HF • Application Information • In terms of quantitative information about user applications, we need to have two sets of data. • mintermselectivities: sel(mi) • The number of tuples of the relation that would be accessed by a user query which is specified according to a given minterm predicate mi. • access frequencies: acc(qi) • The frequency with which a user application qi accesses data. • Access frequency for a minterm predicate can also be defined.
Primary Horizontal Fragmentation • Definition: A primary horizontal fragmentation is defined by a selection operation on the owner relations of a database schema. Therefore, given relation R, its horizontal fragments are given by Ri= Fi(R), 1 ≤ i≤ w • where Fiis a selection formula used to obtain fragment Ri. If Fi is in conjunctive normal form, it is a minterm predicate (mi). • A horizontal fragment Riof relation R consists of all the tuples of R which satisfy a minterm predicate mi. • Given a set of minterm predicates M, there are as many horizontal fragments of relation R as there are minterm predicates. • Set of horizontal fragments also referred to as minterm fragments.
Primary Horizontal Fragmentation • PROJ1 = LOC=“Montreal” (PROJ) • PROJ2 = LOC=“New York” (PROJ) • PROJ3 = LOC=“Paris” (PROJ) Primary Horizontal Fragmentation of Relation PROJ
PHF – Algorithm Given: A relation R, the set of simple predicates Pr Output: The set of fragments of R = {R1,R2,…,Rw} which obey the fragmentation rules. Preliminaries : • Pr should be complete • Pr should be minimal
Completeness of Simple Predicates • A set of simple predicates Pr is said to be complete if and only if there is an equal probability of access by every application to any tuple belonging to any minterm fragment that is defined according to Pr. • Example : • Assume PROJ[PNO,PNAME,BUDGET,LOC] has two applications defined on it. • Find the budgets of projects at each location. (1) • Find projects with budgets less than $200000. (2)
Completeness of Simple Predicates According to (1), Pr={LOC=“Montreal”,LOC=“New York”,LOC=“Paris”} which is not complete with respect to (2). Modify Pr ={LOC=“Montreal”,LOC=“New York”,LOC=“Paris”, BUDGET≤200000,BUDGET>200000} which is complete.
Minimality of Simple Predicates • If a predicate influences how fragmentation is performed, (i.e., causes a fragment f to be further fragmented into, say, fiandfj) then there should be at least one application that accesses fi andfj differently. • In other words, the simple predicate should be relevant in determining a fragmentation. • If all the predicates of a set Pr are relevant, then Pr is minimal. • Pi is relevant if and only if • acc(mi) the access frequency of a mintermmi.
Minimality of Simple Predicates Example : Pr ={LOC=“Montreal”,LOC=“New York”, LOC=“Paris”, BUDGET≤200000,BUDGET>200000} is minimal (in addition to being complete). However, if we add PNAME = “Instrumentation” then Pr is not minimal.
COM_MIN Algorithm Given: a relation R and a set of simple predicates Pr Output: a complete and minimal set of simple predicates Pr' for Pr Rule 1: a relation or fragment is partitioned into at least two parts which are accessed differently by at least one application.
COM_MIN Algorithm • Initialization : • find a pi Prsuch that pi partitions R according to Rule 1 • set Pr' = pi ; PrPr– {pi}; F {fi} • Iteratively add predicates to Pr' until it is complete • find a pj Prsuch that pj partitions some fk defined according to minterm predicate over Pr' according to Rule 1 • set Pr' = Pr' {pj}; Pr Pr– {pj}; F F {fj} • if pk Pr' which is nonrelevant then Pr'Pr' – {pk} FF – {fk}
PHORIZONTAL Algorithm Makes use of COM_MIN to perform fragmentation. Input: a relation R and a set of simple predicates Pr Output: a set of minterm predicates M according to which relation R is to be fragmented • Pr' COM_MIN (R,Pr) • determine the set M of mintermpredicates • determine the set I of implications among piPr‘ • Iteratively eliminate the contradictory minterms from M • M M-mi
PHF – Example • Two candidate relations : PAY and PROJ. • Fragmentation of relation PAY • Application: Check the salary info and determine raise. • Employee records kept at two sites application run at two sites • Simple predicates p1 : SAL ≤ 30000 p2 : SAL > 30000 Pr = {p1,p2} which is complete and minimal Pr'=Pr • Minterm predicates m1 : (SAL ≤ 30000) m2 : NOT(SAL ≤ 30000) = (SAL > 30000)
SAL TITLE Mech. Eng. 27000 Programmer 24000 PHF – Example PAY1 PAY2 SAL TITLE Elect. Eng. 40000 Syst. Anal. 34000
PHF – Example • Fragmentation of relation PROJ • Applications: • Find the name and budget of projects given their location. • Issued at three sites • Access project information according to budget • one site accesses ≤200000 other accesses >200000 • Simple predicates • For application (1) p1 : LOC = “Montreal” p2 : LOC = “New York” p3 : LOC = “Paris” • For application (2) p4 : BUDGET ≤ 200000 p5 : BUDGET > 200000 • Pr = Pr' = {p1,p2,p3,p4,p5}
Fragmentation of relation PROJ continued Minterm fragments left after elimination m1 : (LOC = “Montreal”) (BUDGET ≤ 200000) m2 : (LOC = “Montreal”) (BUDGET > 200000) m3 : (LOC = “New York”) (BUDGET ≤ 200000) m4 : (LOC = “New York”) (BUDGET > 200000) m5 : (LOC = “Paris”) (BUDGET ≤ 200000) m6 : (LOC = “Paris”) (BUDGET > 200000) PHF – Example
PNO PNAME BUDGET LOC PHF – Example The result of the primary horizontal fragmentation of PROJ forms six fragments FPROJ = {PROJ1, PROJ2, PROJ3, PROJ4, PROJ5, PROJ} according to the minterm predicates M . Since fragments PROJ2, and PROJ5 are empty, they are not depicted in Figure PROJ3 PROJ1 PNO PNAME BUDGET LOC PNO PNAME BUDGET LOC New York P2 Database Develop. P1 Instrumentation 150000 Montreal 135000 PROJ4 PROJ6 P3 CAD/CAM 250000 New York P4 Maintenance 310000 Paris PNO PNAME BUDGET LOC
Completeness Since Pr' is complete and minimal, the selection predicates are complete Reconstruction If relation R is fragmented into FR= {R1,R2,…,Rr} R = RiFRRi Disjointness Minterm predicates that form the basis of fragmentation should be mutually exclusive. PHF – Correctness
Thank You Slide 39