Parallel Database Systems
Instructor: Dr. Yingshu Li
Student: Chunyu Ai
Main Message
• Technology trends give us many processors and storage units inexpensively.
• To analyze large quantities of data: sequential (regular) access patterns are 100x faster, and parallelism is 1000x faster (trades time for money).
• Relational systems admit many parallel algorithms.
Implications of Hardware Trends
• Today's components: 1k SPECint CPU, 2 GB RAM, 500 GB disc.
• Large disc farms will be inexpensive ($10k/TB).
• Large RAM databases will be inexpensive ($1k/GB).
• Processors will be inexpensive.
• So the building block will be a processor with large RAM, lots of disc, and lots of network bandwidth.
Why Parallel Access To Data?
• At 10 MB/s, scanning 1 TB takes about 1.2 days.
• With 1,000-way parallelism, the same scan takes about 1.5 minutes.
• Bandwidth parallelism: divide a big problem into many smaller ones to be solved in parallel.
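The scan arithmetic above is easy to check. A minimal back-of-envelope sketch (the 1 TB table size and 10 MB/s per-disc rate are taken from the slide; the exact 1,000-way figure lands near the quoted "1.5 minute" scan):

```python
# Back-of-envelope scan times for a 1 TB table read at 10 MB/s per disc.
TB = 10**12          # bytes in a terabyte (decimal, as in the slide)
RATE = 10 * 10**6    # 10 MB/s sequential read rate per disc

serial_seconds = TB / RATE                 # one disc scans the whole table
parallel_seconds = serial_seconds / 1000   # 1,000 discs scan partitions in parallel

print(f"serial:   {serial_seconds / 86400:.1f} days")    # ~1.2 days
print(f"parallel: {parallel_seconds / 60:.1f} minutes")  # ~1.7 minutes
```

Dividing the data across 1,000 discs divides the scan time by 1,000: that is the "bandwidth parallelism" the slide names.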
Implication of Hardware Trends: Clusters
• Building block: CPU, 5 GB RAM, 50 GB disc.
• Future servers are CLUSTERS of processors and discs.
• Thesis: Many Little will Win over Few Big.
Parallelism: Performance is the Goal
Goal is to get 'good' performance.
• Law 1: a parallel system should be faster than the serial system.
• Law 2: a parallel system should give near-linear scaleup, near-linear speedup, or both.
Speed-Up and Scale-Up
• Speedup: a fixed-size problem executing on a small system is given to a system that is N times larger.
  • Measured by: speedup = (small system elapsed time) / (large system elapsed time)
  • Speedup is linear if the ratio equals N.
• Scaleup: increase the size of both the problem and the system.
  • An N-times larger system is used to perform an N-times larger job.
  • Measured by: scaleup = (small system, small problem elapsed time) / (big system, big problem elapsed time)
  • Scaleup is linear if the ratio equals 1.
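The two ratios can be stated as two one-line functions. The timings below are hypothetical, chosen only to illustrate near-linear behavior for N = 8:

```python
def speedup(small_elapsed, large_elapsed):
    """Same problem, system N times larger. Linear if the result equals N."""
    return small_elapsed / large_elapsed

def scaleup(small_elapsed, big_elapsed):
    """Problem and system both grown N times. Linear if the result equals 1.0."""
    return small_elapsed / big_elapsed

# Hypothetical timings: an 8-node run vs a 1-node run.
print(speedup(small_elapsed=800.0, large_elapsed=110.0))  # ~7.3, near-linear for N=8
print(scaleup(small_elapsed=800.0, big_elapsed=820.0))    # ~0.98, near-linear
```

Note the different "linear" targets: speedup should approach N, scaleup should stay near 1 as both job and system grow together.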
Kinds of Parallel Execution
• Pipeline: the output of one sequential program streams into the input of the next.
• Partition: inputs are split N ways, copies of a sequential program run on the partitions, and their outputs are merged M ways.
The Drawbacks of Parallelism
• Startup: creating processes, opening files, optimization.
• Interference: device contention (CPU, disc, bus) and logical contention (locks, hotspots, servers, logs, ...).
• Communication: sending data among nodes.
• Skew: if tasks get very small, variance exceeds service time.
Parallelism: Speedup & Scaleup
• Speedup: same job (100 GB), more hardware, less time.
• Scaleup: bigger job (100 GB to 1 TB), more hardware, same time.
• Transaction scaleup: more clients/servers (1k to 10k clients, 100 GB to 1 TB), same response time.
Database Systems “Hide” Parallelism • Automate system management via tools • data placement • data organization (indexing) • periodic tasks (dump / recover / reorganize) • Automatic fault tolerance • duplex & failover • transactions • Automatic parallelism • among transactions (locking) • within a transaction (parallel execution)
Automatic Data Partitioning
Split a SQL table across a subset of nodes & disks. Partition within the set by:
• Range: good for equi-joins, range queries, group-by.
• Hash: good for equi-joins.
• Round robin: good to spread load.
Shared-disk and shared-memory systems are less sensitive to partitioning; shared-nothing systems benefit from "good" partitioning.
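The three placement strategies can be sketched as tiny routing functions. This is a minimal illustration, not any real system's placement logic; the four-node layout and boundary values are assumptions:

```python
import zlib

NODES = 4

def range_partition(key, boundaries):
    """Range: node i holds keys <= boundaries[i]; keys above all boundaries
    go to the last node. Keeps key ranges clustered, so range queries and
    group-by touch few nodes."""
    for node, upper in enumerate(boundaries):
        if key <= upper:
            return node
    return len(boundaries)

def hash_partition(key):
    """Hash: spreads keys uniformly; rows with equal join keys land on the
    same node, which is what equi-joins need."""
    return zlib.crc32(str(key).encode()) % NODES

class RoundRobin:
    """Round robin: deals rows out in turn, spreading load regardless of
    key values (but giving no help to any particular query)."""
    def __init__(self, nodes):
        self.nodes, self.next = nodes, 0
    def place(self, _row):
        node, self.next = self.next, (self.next + 1) % self.nodes
        return node

print(range_partition(15, boundaries=[9, 19, 29, 39]))  # node 1 holds 10..19
```

Each function maps a row to a node; the trade-off is exactly the one in the bullets above.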
Index Partitioning
• Hash indices partition by hash.
• B-tree indices partition as a forest of trees, one tree per range (e.g., 0...9, 10..19, 20..29, 30..39, 40.., or A..C, D..F, G..M, N..R, S..Z).
• The primary index clusters the data.
Partitioned Execution
Spreads computation and IO among processors. Partitioned data gives NATURAL parallelism.
N x M Way Parallelism
N inputs, M outputs, no bottlenecks: partitioned data, with partitioned and pipelined data flows.
Blocking Operators = Short Pipelines
• An operator is blocking if it produces no output until it has consumed all its input.
• Examples: sort, aggregates, hash-join (reads all of one operand).
• Blocking operators kill pipeline parallelism and make partition parallelism all the more important.
• Example: the database load template has three blocked phases — tape scan feeding sort-run generation, a merge of the runs feeding the table insert into the SQL table file, and, for each index (Index 1, 2, 3), another sort-runs/merge-runs pass feeding the index insert.
Parallel Aggregates
• For an aggregate function, we need a decomposition strategy:
  • count(S) = Σ count(s(i)); ditto for sum().
  • avg(S) = (Σ sum(s(i))) / (Σ count(s(i)))
  • and so on...
• For groups:
  • sub-aggregate groups close to the source;
  • drop the sub-aggregates into a hash river.
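The decomposition for avg is the one that usually trips people up: each node must ship a (sum, count) pair, not a local average. A small sketch with made-up per-node partitions s(i):

```python
# Decomposable aggregates: each node computes a sub-aggregate over its own
# partition s(i); a combiner merges the sub-aggregates.
partitions = [[3, 5, 8], [2, 2], [10, 4, 4, 4]]   # hypothetical per-node data

subs = [(sum(p), len(p)) for p in partitions]     # local (sum, count) per node
total_sum = sum(s for s, _ in subs)
total_cnt = sum(c for _, c in subs)

print(total_sum)              # sum(S)   = 42
print(total_cnt)              # count(S) = 9
print(total_sum / total_cnt)  # avg(S): NOT the average of the local averages
```

Averaging the three local averages would weight the two-row node as heavily as the four-row node; carrying (sum, count) keeps the combine step exact.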
Parallel Sort
• M inputs, N outputs.
• A scan (or other source) feeds a river that is range- or hash-partitioned.
• Sub-sorts generate runs; each output merges its runs.
• Disk and merge are not needed if the sort fits in memory.
• Sort is the benchmark from hell for shared-nothing machines: net traffic = disk bandwidth, with no data filtering at the source.
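The run-generation and merge phases can be sketched in a few lines. This is an in-memory toy (run size and data are made up), standing in for the external-sort structure the slide describes:

```python
import heapq

def sort_runs(data, run_size):
    """Phase 1 (blocking): sort fixed-size chunks into sorted runs."""
    return [sorted(data[i:i + run_size]) for i in range(0, len(data), run_size)]

def merge_runs(runs):
    """Phase 2: a single merge pass over all sorted runs."""
    return list(heapq.merge(*runs))

data = [9, 1, 7, 3, 8, 2, 6, 4, 5]
runs = sort_runs(data, run_size=3)   # [[1, 7, 9], [2, 3, 8], [4, 5, 6]]
print(merge_runs(runs))              # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

In the parallel version each node runs this locally on its river partition; because the river is range-partitioned, concatenating the nodes' outputs in partition order yields the globally sorted result.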
Hash Join: Combining Two Tables
• Hash the smaller table into N buckets (hope N=1).
• If N=1, read the larger table and hash-probe into the smaller one.
• Else, hash the outer to disk, then join bucket by bucket.
• Purely sequential data behavior.
• Always beats sort-merge and nested loops unless the data is clustered.
• Good for equi-, outer-, and exclusion joins.
• Lots of papers; hashing reduces skew.
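The N=1 case above can be sketched directly: build an in-memory hash table on the smaller input, then probe it with one sequential pass over the larger one. The table names and rows are invented for illustration; real systems add the spill-to-disk bucket logic the slide mentions:

```python
from collections import defaultdict

def hash_join(left, right, key):
    """In-memory equi-join sketch: build on the smaller input, probe with
    the larger. One sequential pass over each input."""
    build, probe = (left, right) if len(left) <= len(right) else (right, left)
    table = defaultdict(list)
    for row in build:                          # build phase
        table[row[key]].append(row)
    out = []
    for row in probe:                          # probe phase, purely sequential
        for match in table.get(row[key], []):
            out.append({**match, **row})
    return out

emp = [{"dept": 1, "name": "ann"}, {"dept": 2, "name": "bob"}]
dep = [{"dept": 1, "dname": "eng"}, {"dept": 2, "dname": "ops"}]
print(hash_join(dep, emp, "dept"))
```

In the parallel setting, hash-partitioning both tables on the join key first sends matching rows to the same node, so each node runs exactly this local join.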
Q&A • Thank you!