110 likes | 254 Views
ICS 421 Spring 2010 Parallel & Distributed Databases 2. Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa. Shared-Nothing Architecture. CPU. CPU. CPU. CPU. Memory. Memory. Memory. Memory. Disk. Disk. Disk. Disk. Network.
E N D
ICS 421 Spring 2010Parallel & Distributed Databases 2 Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa Lipyeow Lim -- University of Hawaii at Manoa
Shared-Nothing Architecture CPU CPU CPU CPU Memory Memory Memory Memory Disk Disk Disk Disk Network Lipyeow Lim -- University of Hawaii at Manoa
Logical Parallel DBMS Architecture • results • query Parallel DB layer Parallel DB layer • Catalog DB • DBMS • DBMS • DBMS • Data Fragments • Data Fragments Network Lipyeow Lim -- University of Hawaii at Manoa
Horizontal Fragmentation: Range Partition Partition 1 Partition 2 • Range Partition on rating column • Partition 1: 0 <= rating < 5 • Partition 2: 5 <= rating <= 10 Lipyeow Lim -- University of Hawaii at Manoa
Range Partition: Query Processing Partition 1 • Which partitions? • Better than non-parallel ? SELECT* FROMSailors S SELECT* FROMSailors S WHERE rating = 2 Partition 2 SELECT* FROMSailors S WHERE age > 30 SELECT* FROMSailors S WHERE rating < 2 and age < 30 Lipyeow Lim -- University of Hawaii at Manoa
Horizontal Fragmentation: Hash Partition Partition 1 • Hash partitioning using hash function • Partition = rating mod 2 Partition 2 Lipyeow Lim -- University of Hawaii at Manoa
Hash Partition: Query Processing Partition 1 • Which partitions? • Better than non-parallel ? SELECT* FROMSailors S SELECT* FROMSailors S WHERE rating = 2 Partition 2 SELECT* FROMSailors S WHERE age > 30 SELECT* FROMSailors S WHERE rating < 2 and age < 30 Lipyeow Lim -- University of Hawaii at Manoa
Vertical Fragmentation/Partition Partition 1 • Vertical partitioning • Use sid as row identifier Partition 2 Lipyeow Lim -- University of Hawaii at Manoa
Vertical Partition: Query Processing Partition 1 • Which partitions? • Better than non-parallel ? SELECT* FROMSailors S Partition 2 SELECTsname FROMSailors S SELECT* FROMSailors S WHERE rating = 2 SELECTsid FROMSailors S WHERE age > 30 SELECTsid FROMSailors S WHERE rating < 2 and age < 30 Lipyeow Lim -- University of Hawaii at Manoa
Fragmentation & Replication • Suppose table is fragmented into 4 partitions on 4 nodes • Replication stores another partition on each node • What happens when 1 node fails ? 2 nodes ? • What happens when a row needs to be updated ? • Node 1 • Node 2 • Node 3 • Node 4 P4 P1 P2 P3 Network P1 P2 P3 P4 Lipyeow Lim -- University of Hawaii at Manoa
What about joins ? Partition 1 SELECTR.sid, R.bid FROMSailors S, Reserves R WHERE S.sid=R.sid AND rating > 8 Sailors Reserves • Sailors: hash • part = rating mod 2 • Reserves: hash • part = sid mod 2 • Where to perform join ? • What data to ship ? Partition 2 Sailors Reserves Lipyeow Lim -- University of Hawaii at Manoa