Status

Status • “Lifetime of a Query” • Query Rewrite • Query Optimization • Query Execution • Optimization • Use cost-estimation to iterate over all possible plans, pick one of minimum cost • Saw how to cost 1 relation ops • Saw how to cost joins • Saw that join ordering is complex • Inner vs. outer (e.g., AB ≠ BA) • Join ordering (e.g., A(BC) ≠ (AB)C) • Join type (e.g., nested loops vs. sort-merge) • We will return to Shapiro at the end of class O(2n-1)! Plans; n = 7 -> > 6 Billion

Selinger Pruning • How does Selinger reduce the search space? • Only considers left-deep plans • Pushes all cross products to the top • Uses a dynamic programming algorithm

(Basic) Selinger Dynamic Prog Alg. findBestPlan(JoinList S) if (bestPlan[S].cost ≠ ∞) ; array lookup ; independent of ; ordering of S return bestPlan[S] if (|S| = 1) bestPlan[S].plan = scan S bestPlan[S].cost = cost(scan S) return bestPlan[S] for each size 1 non-empty subset S1 of S P1 = findBestPlan(S1) P2 = findBestPlan(S - S1) A = best algorithm for joining P1, P2 ;inner v outer? cost = P1.cost + P2.cost + cost(A) if (cost < bestPlan[S].cost) bestPlan[S].plan ={execute P1.plan, execute P2.plan, join P1 and P2 using A } bestPlan[S].cost = cost return bestPlan[S]

Merge-Sort Phase 1: Repeat until S is done Read a run of S Sort Write out (Repeat with R) Phase 2: Read concurrently from each run of S and R Merge runs, then join overlapping regions

Simple Hash i=0 Choose partition of hash range {vi, vi+1} Scan S, hash, if in partition, insert into hash table Otherwise, write back out Scan R, hash, probe into hash table, output matches Otherwise, write back out Repeat with reduced R and S, in round i+1

GRACE Hash Choose sqrt(|R|) partitions, with one page memory per partition Hash R into partitions, flushing pages as they fill Hash S into partitions, flushing pages as they fill For each partition p Build a hash table H on R tuples in p Hash S tuples in p into H, output matches

Comparison GRACE Choose sqrt(|R|) partitions, with one page memory per partition Hash R into partitions, flushing pages as they fill Hash S into partitions, flushing pages as they fill For each partition p Build a hash table H on R tuples in p Hash S tuples in p into H, output matches Sort-Merge Phase 1: Repeat until S is done Read a run of S Sort Write out (Repeat with R) Phase 2: Read concurrently from each run of S and R Merge runs, then join overlapping regions Simple i=0 Choose partition of hash range {vi, vi+1} Scan S, hash, if in partition, insert into hash table Otherwise, write back out Scan R, hash, probe into hash table, output matches Otherwise, write back out Repeat with reduced R and S, in round i+1

Status

Status

Presentation Transcript

Status

Status

STATUS

Status…

Status :

Status

Status

Status

Status

Status

Status

Status

Status

status

STATUS

PHENIX STATUS PHENIX STATUS

Group Status Project Status

Status