Join Algorithms

7 Join Algorithms

Join Algorithms - 1 • There are 6 join Algorithms • These are the 6 ways that the optimiser can choose to use to solve a join • They all have differing advantages • Some may be better for larger or smaller tables • Some may work better for complex (more than 2 table) joins

Join Algorithms - 2 • You have some limited ability to affect the join algorithms used • This is probably not that useful • Knowledge of the joins and there strengths and weaknesses can help in the decoding of the query plans • This can then allow you to see where the query may be running slowly – and change this

The Big 6 • The 6 Join Algorithms are : • Nested Loop Join NLJ • Nested Loop Pushdown NLPD • Hash Join HJ • Hash Join Pushdown HPDJ • Sort Merge Join SMJ • Sort Merge Pushdown SMPDJ

Nested Loop Join • Take Small Table (or result set) • Store it • For each row in the Big Table, compare with each row in the Small Table Store

Nested Loop Pushdown • Reverse of the Nested Loop • For each Row in the Small Table • Probe a fast index(LF or HG) on the Big Table

Hash Join (Classic Hash) • Create a hash table for the Small Side • For each row in the Big Table probe the Hash Table for Join Conditions • Used when small side < 1,250,000 rows in a two table join • Used when the ratio between the large side and small side <600:1 in a more than two table join • “Classic Hash” - needs to be kept in memory

R1 I.1 I.2 T1 T2 T3 T4 Hash Join Pushdown • If the Distinct Count of T3.X is small • i.e.T3 is a small table or T4 predicate can filter T3xT4 to small size • Then I.2 has a hash table keyed on T3.X and I.2 is complete • Create an Artificial IN Clause on T1 to filter out rows that do not satisfy T1.X=T3.X T1.X=T3.X

Sort Merge Join • Sort on both sides then merge result sets • Used in all other cases • Optimised now (12.4) for 1:M (Primary or UNIQUE Join Key) • There is an exception in the case of M:M join, when there is a Loop against the small side

R1 I.1 I.2 T1 T2 T3 T4 Sort Merge Pushdown - 1 New • Similar to the Hash Pushdown But in this case there are too many rows from I2 to store in a Classic Hash • Instead we compute a hash table (as a bit vector) to report that 1 or more rows in I2 have the hash value • Also there is more than 1 key matching the hash number

R1 I.1 I.2 T1 T2 T3 T4 Sort Merge Pushdown - 2 New • We have a “Probably IN” Hash table • Push Down the Bit Vector to T1 • Now we know that if the hash # is NOT in the hash table then we know that the key value is NOT going to match at R1 • If the Key # IS in the hash table then it MAY match and should go up to I.1 and R1

Join Optimisation • Join_Optimization Def ON • Switching this off will set the optimiser to parse the join tree left to right – as the table were specified in the from clause • This may help an individual query – but generally this should only be used if you have drawn out the query tree and you are certain that the optimiser has misjudged the query

Join Preference • Join_Algorithm_Preference • 0 Let the optimiser choose - default • 1/-1 Prefer/Avoid Sort/Merge • 2/-2 Prefer/Avoid Nested Loop • 3/-3 Prefer/Avoid Nested Loop Pushdown • 4/-4 Prefer/Avoid Hash Join • 5/-5 Prefer/Avoid Hash Join Pushdown • 6/-6 Prefer/Avoid Pre-join Indexes • 7/-7 Prefer/Avoid Sort/Merge Pushdown

Extended_Join • This option controls the allowing of ambiguous syntax joins • With the ANSI join syntax there should be no ambiguity • With the “old fashioned” method of joining there could be ambiguity • If this option is ON (default) then the query will be thrown out with a syntax error • If not the query will be run using the plan that has the lowest cost

Join Algorithms - End

Join Algorithms

Join Algorithms

Presentation Transcript

Selected Topics: External Sorting, Join Algorithms, …

Join Algorithms for the Theory of Uninterpreted Functions

Join algorithms using mapreduce

Randomized Algorithms for optimizing large join queries

Join

Join

Join

Structural Join Algorithms

Improving Similarity Join Algorithms using Vertical Clustering Techniques

Structural Join Algorithms – Examples

MAXIMAL CLIQUES and JOIN TREE Algorithms

INNER JOIN (EQUI-JOIN) Predicate

Join

A Comparison of Join Algorithms for Log Processing in MapReduce

Inner Join vs. Outer Join

Join

A Comparison of Join Algorithms for Log Processing in MapReduce

A Comparison of Join Algorithms for Log Processing in MapReduce

Join Algorithms

Join algorithms using mapreduce

Randomized Algorithms for optimizing large join queries

Structural Join Algorithms – Examples