430 likes | 446 Views
Relaxing Join Selection & Queries Nick Koudas et al. Presented by Pradnya Chavarkar. Motivation . Complex queries in terms of predicates against large databases. Right parameter values not known. Parameter adjustment becomes a trial-and-error method.
E N D
Relaxing Join Selection & Queries Nick Koudas et al Presented by Pradnya Chavarkar
Motivation • Complex queries in terms of predicates against large databases. • Right parameter values not known. • Parameter adjustment becomes a trial-and-error method. • More the predicates, difficult the adjustment Relaxing Join and Selection Queries
Example Relaxing Join and Selection Queries
Example • How many conditions to adjust?? • How much to adjust?? • No. of choices exponential in number of conditions Relaxing Join and Selection Queries
Outline • Relaxation framework • Relaxing select conditions • Relaxing All conditions • Lattice traversal • Experimental results • Conclusion Relaxing Join and Selection Queries
Query Relaxation • Top–K queries Weights given to each condition • Skyline Points which are not dominated by any other points in the given set Relaxing Join and Selection Queries
Relaxation Framework Selection Condition • Join Condition Relaxing Join and Selection Queries
Relaxation Skyline Relaxing Join and Selection Queries
Simple Solution ! • Compute Skyline on Jobs • Compute Skyline on Candidates • Join the skylines Relaxing Join and Selection Queries
Simple Solution ! (Contd) • Incorrect results ! • Computes skyline locally Relaxing Join and Selection Queries
Relaxing Selection Conditions • Many joins are on Identifier attributes • Relaxation skyline should satisfy : Relaxing Join and Selection Queries
Join First (JF) • Join without select conditions • Compute skyline for select conditions on the resulting tuples • May not be efficient for joins which return a large number of pairs Relaxing Join and Selection Queries
Pruning Join (PJ) • Compute skyline during join step • Assuming index on relation S • For each tuple of R, find joining tuples from S • Call ‘update’ for each such pair and current skyline • Discard the pair if it is already dominated else discard the pairs dominated by the current pair Relaxing Join and Selection Queries
Pruning Join (PJ) (Contd.) • Advantage: Produces less pairs after join step as compared to Join First • Disadvantage Need for modification of join methods Relaxing Join and Selection Queries
Pruning Join+ (PJ+) • Modifies pruning Join Algorithm • Computes “local skylines” for records joining with a record of othe relation • Does Dominance checking within local skyline • If a pair is locally dominated then it will not be in global skyline Relaxing Join and Selection Queries
Pruning Join+ (PJ+) (Contd) R (A, B, C)S (C, D, E) Relaxing Join and Selection Queries
Pruning Join+ (PJ+) (Contd) • Does local pruning hoping to eliminate some S tuples beforehand • May not be always beneficial, when many tuples are not eliminated • Experimental results show that the tradeoff depends upon factors like number of conditions Relaxing Join and Selection Queries
Sorted Access Join (SAJ) • Constructs sorted list of tuple IDs for each select condition based on relaxation • Go through the lists in round robin fashion • For each record in Li find records which can join -- (p,q) • Stop if all the current records in list Lj (i≠j), have relaxation greater that (p,q) Relaxing Join and Selection Queries
Sorted Access Join (SAJ) (Contd) Relaxing Join and Selection Queries
Relaxing All Conditions • Relaxing join conditions along with selection conditions • Assumes multidimensional indexed structure on selection and join conditions • R-Tree on both relations Relaxing Join and Selection Queries
MIDIR • Traverses both R-Trees top down to identify potential pairs which can be in relaxation skyline • Push such pairs in queue • In each iteration, pop one pair and perform dominance checking • If object-object pair, call Update() Relaxing Join and Selection Queries
Dominance checking in MIDIR Compute lower and upper bound on relaxation Pair p’ dominates p if for each condition Ci • To compute the lower and upper bound for Ci (selection condition) convert itinto interval Relaxing Join and Selection Queries
Dominance checking in MIDIR (Contd) • B: MRB or tuple • A: attribute used in B • I(B,A): interval of A in B • Selection Condition: • Join Condition Relaxing Join and Selection Queries
Dominance checking in MIDIR (Contd) • Minimum distance between two intervals Then MINDIST = 0 else MINDIST = Relaxing Join and Selection Queries
Variants of Query Relaxation • Top-k answers • User can specify weights on the conditions • K points that have smallest weighted summation • Store the k best skyline answers • Modify Update() in PJ • Priority queues in MIDIR Relaxing Join and Selection Queries
Variants of Query Relaxation (Contd) • Queries with multiple joins • JF – compute all joins beforehand • SAJ – access multiple index structures to find the joining tuples • MIDIR – queue maintains possible vectors • Relaxation of nonnumeric attributes • RELAX(r,c) an be modified • All algorithms can refer to this computed value Relaxing Join and Selection Queries
Variants of Query Relaxation (Contd) Lattice Structure Relaxing Join and Selection Queries
Lattice Traversal • At least one result should be returned • Examine the reasons for the empty result – • Join condition Cj is strict • Relax Cj with the tuples provided by applying selection conditions • Either CR or Cs is strict • Relax the strict selection condition • If join still strict then relax join with either of the selections Relaxing Join and Selection Queries
Lattice Traversal (Contd) • Both CR or Cs are strict • Relax both selection conditions • If join condition still strict, relax join condition • If still no result, relax all togather Relaxing Join and Selection Queries
Experiments • Experimental Settings • Two real and three synthetic dabases • Three types of correlations • Independent • Correlated • Anti-correlated Relaxing Join and Selection Queries
Experiments (different data size) Relaxing Join and Selection Queries
Experiments (different data sizes) Relaxing Join and Selection Queries
Experiments (Different join cardinality) Relaxing Join and Selection Queries
Experiments (Different selection conditions) Relaxing Join and Selection Queries
Summary • JF and PJ • Almost same performance for relaxation of selection conditions • PJ and PJ+ • PJ works faster, but performance gets affected by join cardinality • SAJ • Efficient for correlated data and query conditions Relaxing Join and Selection Queries
Conclusion • Efficient relaxation algorithm for relaxing join and selection queries • Lattice traversal method for minimal relaxation Relaxing Join and Selection Queries
Related Work (ML for query relaxation ) • LOQR algorithm proposed which is for queries in disjunctive normal form • Three steps • Exacting domain knowledge • Finding the most useful rule • Relaxing the failing query Relaxing Join and Selection Queries
Introduction Let the query by the user be • The query fails because • Laptops with large screen weights more than 3 pounds • Fast laptops with large HDD cost more than $2000 Relaxing Join and Selection Queries
Extracting domain knowledge Take subset D of the target database Find “typical values” of the other attributes for each constraints Forms a new dataset Di, with values of the constraint attribute indicated a Boolean Relaxing Join and Selection Queries
Extracting domain knowledge (contd) Relaxing Join and Selection Queries
Finding the most useful rule • For rules formed using all the constraints individually, find the most similar rule by nearest neighbour search • R1 is the most similar as they differ only on the price attribute • It is guaranteed to give a result as it gives a result on the subset of the target database Relaxing Join and Selection Queries
Relaxing the failing query Most similar query is combined with the original query to give the relaxed query Relaxing Join and Selection Queries
References • Relaxing Join and selection Queries Nick Koudas, Chen Li, and Anthony K. H. Tung, Rares Vernica • Machine learning for online query relaxation Ion Muslea Relaxing Join and Selection Queries