120 likes | 235 Views
Choosing an Order for Joins (16.6). Neha Saxena (214) Instructor: T.Y.Lin. Topics Covered. Introduction Significance of Left and Right Join Arguments Join Trees Left-Deep Join Trees Dynamic Programming to Select a Join Order & Grouping Dynamic Programming with More Detailed Cost Functions
E N D
Choosing an Order for Joins (16.6) Neha Saxena (214) Instructor: T.Y.Lin
Topics Covered • Introduction • Significance of Left and Right Join Arguments • Join Trees • Left-Deep Join Trees • Dynamic Programming to Select a Join Order & Grouping • Dynamic Programming with More Detailed Cost Functions • A Greedy Algorithm for Selecting a Join Order
Introduction • This section focuses on critical problem in cost-based optimization: • Selecting order for natural join of three or more relations • Compared to other binary operations, joins take more time and therefore need effective optimization techniques
Significance of Left and Right Join Arguments • The argument relations in joins determine the cost of the join • The left argument of the join is • Called the build relation • Assumed to be smaller • Stored in main-memory
The right argument of the join is • Called the probe relation • Read a block at a time • Its tuples are matched with those of build relation • The join algorithms which distinguish between the arguments are: • One-pass join • Nested-loop join • Index join
Join Trees • Order of arguments is important for joining two relations • Left argument, since stored in main-memory, should be smaller • With two relations only two choices of join tree • With more than two relations, there are n! ways to order the arguments and therefore n! join trees, where n is the no. of relations
Left-Deep Join Trees • Consider 4 relations. Different ways to join them are as follows
In fig (a) all the right children are leaves. This is a left-deep tree • In fig (c) all the left children are leaves. This is a right-deep tree • Fig (b) is a bushy tree • Considering left-deep trees is advantageous for deciding join orders
Dynamic Programming to Select a Join Order and Grouping • Three choices to pick an order for the join of many relations are: • Consider all of the relations • Consider a subset • Use a heuristic o pick one • Dynamic programming is used either to consider all or a subset • Construct a table of costs based on relation size • Remember only the minimum entry which will required to proceed
Dynamic Programming with More Detailed Cost Functions • Disadvantage of dynamic programming is that it does not involve the actual costs of the joins in the calculations • Can be improved by considering • Use disk’s I/O for evaluating cost • When computing cost of R1 join R2, since we sum cost of R1 and R2, we must also compute estimates for there sizes
A Greedy Algorithm for Selecting a Join Order • It is expensive to use an exhaustive method like dynamic programming • Better approach is to use a join-order heuristic for the query optimization • Greedy algorithm is an example of that • Make one decision at a time and never backtrack on the decisions once made