Gamma DBMS (Part 2): Failure Management and Query Processing. Shahram Ghandeharizadeh, Computer Science Department, University of Southern California.
Failure Management Techniques • Teradata’s Interleaved Declustering • A partitioned table has a primary and a backup copy. • The primary copy is constructed using one of the partitioning techniques. • The secondary copy is constructed by: • Dividing the nodes into clusters (cluster size 4 in this example), and • Partitioning each primary fragment across the remaining nodes of its cluster; e.g., R0 on node 0 is partitioned across nodes 1, 2, and 3, realizing the backup fragments r0.0, r0.1, and r0.2 (see the placement sketch below).
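This placement can be made concrete in a few lines of code. The sketch below (Python, illustration only; the function name and node numbering are assumptions, not Teradata's implementation) assigns the backup fragments of node i's primary copy to the nodes that follow i within its cluster, wrapping around, which reproduces the fragment-to-node mapping used in these slides.

```python
CLUSTER_SIZE = 4  # cluster size used in the slides' example

def backup_placement(node):
    """Map each backup fragment of R<node> to a node of <node>'s cluster.

    Fragments are placed on the nodes after <node>, wrapping around the
    cluster, so reconstruction work spreads over all surviving nodes.
    """
    base = (node // CLUSTER_SIZE) * CLUSTER_SIZE   # first node of the cluster
    offset = node - base                           # position within the cluster
    return {
        f"r{node}.{j}": base + (offset + 1 + j) % CLUSTER_SIZE
        for j in range(CLUSTER_SIZE - 1)
    }

print(backup_placement(0))  # {'r0.0': 1, 'r0.1': 2, 'r0.2': 3}
print(backup_placement(1))  # {'r1.0': 2, 'r1.1': 3, 'r1.2': 0}
```

Note that the mapping for node 1 matches the failure discussion that follows: r1.0, r1.1, and r1.2 live on nodes 2, 3, and 0.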
Teradata’s Interleaved Declustering • When a node (say node 1) fails, its backup fragments process the requests directed towards the primary copy of R1. • The three backup fragments r1.0, r1.1, and r1.2 reside on nodes 2, 3, and 0, respectively. • Note that the load of R1 is distributed across the remaining nodes of the cluster.
Teradata’s Interleaved Declustering • Repairing the failure (MTTR) involves: • Replacing the failed node with a new one. • Reconstructing the primary copy of the fragment assigned to the failed node, R1, by reading r1.2, r1.0, and r1.1 from nodes 0, 2, and 3. • Reconstructing the backup fragments assigned to the failed node: r0.0, r2.2, and r3.1.
Teradata’s Interleaved Declustering • When does data become unavailable? • When a second node in a cluster fails before the first failed node has been repaired. • (The full analysis is a bit more involved than this discussion.)
Teradata’s Interleaved Declustering • What is the advantage of making the cluster size equal to 8? • Better distribution of the workload across the nodes in the presence of a failure. • What is the disadvantage of making the cluster size equal to 8? • A higher likelihood of data becoming unavailable. • There is a tradeoff between load balancing (in the presence of a failure) and data availability.
Gamma’s Chained Declustering • Nodes are divided into disjoint groups called relation clusters. • A relation is assigned to one relation cluster and its records are declustered across the nodes of that relation cluster using a partitioning strategy (Range, Hash). • Given a primary fragment Ri, its backup copy is assigned to node (i+1) mod M (M is the number of nodes in the relation cluster).
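A minimal sketch of this placement rule (Python, illustration only; the hash-partitioning helper and node numbering are assumptions, not Gamma's code):

```python
M = 8  # example relation-cluster size

def home_nodes(record_key):
    """Return (primary_node, backup_node) for a hash-partitioned record.

    The primary fragment Ri lives on node i; its chained backup ri lives
    on the next node of the relation cluster, (i + 1) mod M.
    """
    primary = hash(record_key) % M    # node holding the record's primary fragment
    backup = (primary + 1) % M        # chained backup on the neighboring node
    return primary, backup

print(home_nodes("emp-42"))  # e.g., (5, 6): the backup is always primary+1 mod M
```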
Gamma’s Chained Declustering • During normal mode of operation: • Read requests are directed to the fragments of primary copy, • Write requests update both primary and backup copies.
Gamma’s Chained Declustering • In the presence of a failure: • Both primary and backup fragments are used for read operations, • Objective: balance the load and avoid bottlenecks! • Write requests update both primary and backup copies. • Note: • The load of R1 (on node 1) is pushed to node 2 in its entirety. • In turn, a fraction of the read requests directed to each remaining node is pushed to its neighbor, spreading the extra load caused by node 1’s failure evenly (roughly a 1/8 load increase per node in this example; see the sketch below).
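The re-balancing arithmetic can be made concrete. The sketch below is illustrative only (the function name and the unit-load model are assumptions): with M nodes and one failure, every surviving node can be given the same total read load of M/(M-1) by letting it keep a fraction of its own primary fragment's reads and shifting the remainder to the backup copy on the next node of the chain.

```python
from fractions import Fraction

def rebalanced_shares(M, failed):
    """Fraction of its own primary each survivor keeps after one failure.

    The k-th node after the failed one keeps k/(M-1) of its primary's
    reads; the rest are served by the backup copy on the next node, so
    every survivor ends up with the same total load, M/(M-1).
    """
    return {(failed + k) % M: Fraction(k, M - 1) for k in range(1, M)}

for node, keep in rebalanced_shares(8, 1).items():
    print(f"node {node}: serves {keep} of its primary, "
          f"offloads {1 - keep} to its neighbor's backup copy")
```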
Gamma’s Chained Declustering • Repairing the failure (MTTR) involves: • Replacing node 1 with a new node, • Reconstructing R1 (from r1 on node 2) on node 1, • Reconstructing the backup copy of R0 (i.e., r0) on node 1. • Note: • Once node 1 becomes operational, primary copies are once again used to process read requests.
Gamma’s Chained Declustering • Not every pair of node failures in a relation cluster results in data unavailability: • Data becomes unavailable only when two adjacent nodes of a relation cluster fail.
Gamma’s Chained Declustering • Re-assignment of active fragments incurs neither disk I/O nor data movement.
Join • Hash-join • A data-flow execution paradigm
Example: Join of Emp and Dept • Emp join Dept (using dno).
Hash-Join: 1 Node • A join of tables A and B using attribute j (A.j = B.j) consists of two phases: • Build phase: build a main-memory hash table on table A using the join attribute j as the key; e.g., build a hash table on the Dept table (the Toy department) using dno as the key. • Probe phase: scan table B one record at a time and use its attribute j to probe the hash table constructed on table A; e.g., probe the hash table using the rows of the Emp table (see the sketch below).
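A minimal single-node sketch of the two phases (Python, illustration only; the tables and rows are made up, and Gamma's actual operator works on disk-resident fragments rather than Python lists):

```python
dept = [(1, "Toy"), (2, "Shoe")]             # inner table: (dno, dname)
emp = [("Kim", 1), ("Lee", 2), ("Ann", 1)]   # outer table: (name, dno)

# Build phase: hash the inner table (Dept) on the join attribute dno.
hash_table = {}
for dno, dname in dept:
    hash_table.setdefault(dno, []).append(dname)   # one bucket per join key

# Probe phase: scan the outer table (Emp) and look up each row's dno.
for name, dno in emp:
    for dname in hash_table.get(dno, []):
        print((name, dno, dname))                  # emit a joined row per match
```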
Hash-Join: Build • Read the rows of the Dept table one at a time and insert each into a main-memory hash table (hash function: dno % 7).
Hash-Join: Probe • Read the rows of the Emp table, probe the hash table, and produce a result row whenever a match is found.
Hash-Join: Probe • The join terminates once all rows of the Emp table have been processed!
Hash-Join • Key challenge: the table used to build the hash table does not fit in main memory! • Solution: a divide-and-conquer approach (see the sketch below): • Use the inner table (Dept) to construct n memory buckets, where each bucket is a hash table. • Every time memory is exhausted, spill a fixed number of buckets to disk. • The build phase terminates with a set of in-memory buckets and a set of disk-resident buckets. • Read the outer relation (Emp) and probe the in-memory buckets for joining records; records that map onto the disk-resident buckets are streamed to disk. • Discard the in-memory buckets to free memory space. • While disk-resident buckets of the inner relation exist: • Read as many (say i) of the disk-resident buckets of the inner relation into memory as possible. • Read the corresponding buckets of the outer relation (Emp) to probe the in-memory buckets for joining records. • Discard the in-memory buckets to free memory space. • Delete the i processed buckets of the inner and outer relations.
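A simplified sketch of this approach (Python, illustration only: "disk" is simulated with in-memory lists, exactly one bucket is kept memory-resident, and the bucket count and table contents are made up):

```python
N_BUCKETS = 2
bucket_of = lambda dno: dno % N_BUCKETS           # bucket a join key maps to

dept = [(1, "Toy"), (2, "Shoe"), (3, "HR")]       # inner table: (dno, dname)
emp = [("Kim", 1), ("Lee", 2), ("Ann", 3)]        # outer table: (name, dno)

def hybrid_hash_join(inner, outer):
    # Build: bucket 0 of the inner table stays in memory as a hash table;
    # the remaining buckets are spilled to (simulated) disk.
    in_memory = {}
    spilled_inner = [[] for _ in range(N_BUCKETS)]
    for dno, dname in inner:
        if bucket_of(dno) == 0:
            in_memory.setdefault(dno, []).append(dname)
        else:
            spilled_inner[bucket_of(dno)].append((dno, dname))

    # Probe: outer rows that map to the memory-resident bucket join now;
    # the rest are streamed to the matching disk-resident outer bucket.
    spilled_outer = [[] for _ in range(N_BUCKETS)]
    for name, dno in outer:
        if bucket_of(dno) == 0:
            for dname in in_memory.get(dno, []):
                yield (name, dno, dname)
        else:
            spilled_outer[bucket_of(dno)].append((name, dno))

    # While loop: bring each disk-resident inner bucket back into memory
    # and probe it with the corresponding outer bucket.
    for b in range(1, N_BUCKETS):
        table = {}
        for dno, dname in spilled_inner[b]:
            table.setdefault(dno, []).append(dname)
        for name, dno in spilled_outer[b]:
            for dname in table.get(dno, []):
                yield (name, dno, dname)

print(list(hybrid_hash_join(dept, emp)))
```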
Hash-Join: Build • Two buckets of the Dept table: one in memory and the second disk-resident.
Hash-Join: Probe • Read the Emp table and probe the hash table for joining records with dno = 1; rows with dno = 2 are streamed to disk.
Hash-Join: Probe • The rows of the Emp table with dno = 1 have probed the hash table and produced 3 joining records.
Hash-Join: While loop • Read the disk-resident bucket of Dept into memory. • Probe it with the disk-resident bucket of the Emp table to produce the remaining two joining records.
Parallelism and Hash-Join • Each node may perform its hash-join independently when: • The join attribute is the declustering attribute of the tables participating in the join operation, and • The participating tables are declustered across the same number of nodes using the same declustering strategy. • Even then, the system may re-partition the tables (see the next bullet) when the join requires more aggregate memory than is available on the nodes across which the tables are declustered. • Otherwise, the data must be re-partitioned to perform the join operation correctly (see the sketch below). • Show an example!
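A minimal sketch of the re-partitioning step (Python, illustration only; per-node "inboxes" stand in for Gamma's network sends, and the node count and rows are made up). Every node routes each of its records by hashing the join attribute, so rows of Emp and Dept with the same dno meet at the same node:

```python
NUM_NODES = 4
route = lambda dno: dno % NUM_NODES      # split function on the join attribute

# Per-node inboxes for the re-partitioned fragments of each table.
dept_at = [[] for _ in range(NUM_NODES)]
emp_at = [[] for _ in range(NUM_NODES)]

for dno, dname in [(1, "Toy"), (2, "Shoe"), (5, "HR")]:
    dept_at[route(dno)].append((dno, dname))
for name, dno in [("Kim", 1), ("Lee", 5)]:
    emp_at[route(dno)].append((name, dno))

# Each node can now run the single-node hash join on its local fragments.
for node in range(NUM_NODES):
    print(f"node {node}: dept={dept_at[node]} emp={emp_at[node]}")
```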
Parallelism and Hash-Join (Cont…) • R join S where R is the inner table.
Data Flow Execution Paradigm • Retrieve all those employees working for the toy department: SELECT * FROM Dept d, Emp e WHERE d.dno = e.dno AND d.dname = 'Toy'
Data Flow Execution Paradigm • Producer/Consumer relationship where consumers are activated in advance of the producers.
Data Flow Execution Paradigm • A “Split Table” contains the routing information for the records. • The consumers must be set up first in order to activate the producers (see the sketch below).
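A minimal sketch of a split table (Python, illustration only; queue-backed consumers stand in for operator processes on different nodes, and all names are made up). The consumers' input queues exist before the producer runs, mirroring the setup order above:

```python
from queue import Queue

NUM_CONSUMERS = 3

# Consumers (e.g., join operators on different nodes) are set up first,
# so their input queues exist before any producer starts sending.
inboxes = [Queue() for _ in range(NUM_CONSUMERS)]

# The split table: routing-attribute value -> destination consumer.
split_table = {dno: dno % NUM_CONSUMERS for dno in range(10)}

def produce(records):
    """A producer (e.g., a scan) routes each record via the split table."""
    for name, dno in records:
        inboxes[split_table[dno]].put((name, dno))

produce([("Kim", 1), ("Lee", 2), ("Ann", 4)])
for i, q in enumerate(inboxes):
    print(f"consumer {i} received: {list(q.queue)}")
```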