RankSQL: Query Algebra and Optimization for Relational Top-k Queries • Chengkai Li, Kevin Chen-Chuan Chang, Ihab Ilyas, Sumin Song • Presented by: Mariam John, CSE 6392, 03/20/2006
Contents • Introduction • RankSQL • Ranking Query Model • Rank-Relational Algebra • Ranking Query Plans: Execution Model • Conclusion
Introduction • Top-k queries return only the top k query results according to a user-specified ranking function. • Most of the available solutions live in middleware or focus on specific operators and queries. • Top-k queries are not treated as a first-class query type in RDBMSs: relational algebra has no notion of ranking.
RankSQL • Provides seamless support and integration of top-k queries with the existing SQL query facility in RDBMS. • Supports ranking as a first-class database construct. • Extends relational algebra and query optimization.
Example of a Top-k Query
SELECT * FROM Hotel h, Restaurant r, Museum m
WHERE c1 AND c2 AND c3
ORDER BY p1 + p2 + p3
LIMIT k
• Boolean predicates: c1: r.cuisine=Italian; c2: h.price+r.price<100; c3: r.area=m.area
• Ranking predicates: p1: cheap(h.price); p2: close(h.addr, r.addr); p3: related(m.collection, "dinosaur")
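As a point of reference, the example query can be evaluated in the straightforward materialize-then-sort way that RankSQL aims to improve on. The following is only a sketch: the toy rows and the bodies of `cheap`, `close`, and `related` are assumptions made up for illustration (the slides do not define them), and each is assumed to return a score in [0, 1].

```python
from itertools import product
import heapq

# Hypothetical toy data; none of these rows come from the slides.
hotels = [
    {"name": "H1", "price": 40, "addr": 1},
    {"name": "H2", "price": 70, "addr": 5},
]
restaurants = [
    {"name": "R1", "cuisine": "Italian", "price": 30, "addr": 2, "area": "A"},
    {"name": "R2", "cuisine": "Italian", "price": 50, "addr": 6, "area": "B"},
    {"name": "R3", "cuisine": "Thai", "price": 20, "addr": 1, "area": "A"},
]
museums = [
    {"name": "M1", "area": "A", "collection": "dinosaur fossils"},
    {"name": "M2", "area": "B", "collection": "modern art"},
]

# Assumed ranking predicates, each returning a score in [0, 1].
def cheap(price):
    return max(0.0, 1.0 - price / 100.0)

def close(addr_a, addr_b):
    return 1.0 / (1.0 + abs(addr_a - addr_b))

def related(collection, keyword):
    return 1.0 if keyword in collection else 0.0

def naive_top_k(k):
    """Materialize every qualifying join result, score it, then keep the best k."""
    results = []
    for h, r, m in product(hotels, restaurants, museums):
        # Boolean filtering (c1, c2, c3) decides membership.
        if (r["cuisine"] == "Italian"
                and h["price"] + r["price"] < 100
                and r["area"] == m["area"]):
            # Ranking (p1 + p2 + p3) decides order.
            score = (cheap(h["price"])
                     + close(h["addr"], r["addr"])
                     + related(m["collection"], "dinosaur"))
            results.append((score, h["name"], r["name"], m["name"]))
    return heapq.nlargest(k, results)
```

Every ranking predicate is evaluated on every qualifying result here, even when k is small; that cost is what motivates treating ranking as more than a final sort.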
Ranking Query Model • A rank-relational query has four types of predicates: • Filtering: Boolean-selection predicates and Boolean-join predicates • Ranking: rank-selection predicates and rank-join predicates • The goal is to support rank-relational queries efficiently.
Rank-Relational Query • Such queries add a ranking dimension to query processing and optimization. • Filtering restricts tuple “membership” by applying a Boolean function of Boolean selection or join predicates. • Ranking restricts “order” by applying a monotonic scoring function of ranking predicates.
Ranking as First-Class Construct • Support for ranking as a first-class construct in RDBMSs is lacking. • Relational algebra models Boolean filtering as a first-class construct in query processing. • For example, c1 is a selection over R, and c2 is a join condition over R × S.
Filtering as a First-Class Construct • The algebra framework supports the following for Boolean filtering: • splitting • interleaving • These enable query optimization to transform the canonical form into efficient query plans.
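Splitting and interleaving for Boolean filtering can be illustrated with a small sketch. The relations `R` and `S` and the concrete conditions `c1` and `c2` below are made up for illustration; only their roles (a selection over R and a join condition over R × S) come from the slides.

```python
from itertools import product

# Hypothetical toy relations.
R = [{"a": 1, "x": 10}, {"a": 2, "x": 20}, {"a": 3, "x": 30}]
S = [{"b": 1, "y": 5}, {"b": 3, "y": 15}]

def c1(r):
    """Selection predicate over R (assumed)."""
    return r["x"] >= 20

def c2(r, s):
    """Join predicate over R x S (assumed)."""
    return r["a"] == s["b"]

def canonical():
    """Canonical form: one monolithic filter on top of the Cartesian product."""
    return [(r, s) for r, s in product(R, S) if c1(r) and c2(r, s)]

def split_and_interleaved():
    """Split c1 from c2 and apply c1 first, below the join."""
    selected = [r for r in R if c1(r)]
    return [(r, s) for r, s in product(selected, S) if c2(r, s)]
```

Both plans return the same tuples, but the second one joins fewer rows. The slides argue that ranking lacks this kind of algebraic freedom.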
Ranking as First-Class Construct • Algebraic support for optimization is lacking for ranking. • The sorting operator is ‘monolithic’. • It may be beneficial to evaluate ranking predicates one by one and interleave them with Boolean filtering.
Challenges • First, we must extend relational algebra to do the following: • Handle ranking • Define algebraic laws to handle equivalence transformation • Second, we need to generalize query optimization techniques to integrate the parallel dimensions of Boolean filtering and ranking.
Rank-Relational Algebra • A rank-relation is a relation whose tuples are scored and ordered accordingly. • How do we rank a relation when only some of the ranking predicates have been evaluated?
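Ranking principle • Given a monotonic scoring function $F(p_1, \ldots, p_n)$ and the set $P$ of ranking predicates evaluated so far, the maximum possible score of a tuple $t$, denoted by $\overline{F}_P[t]$, is defined as: $\overline{F}_P(p_1, \ldots, p_n)[t] = F(p_1, \ldots, p_n)$ with $p_i = p_i[t]$ if $p_i \in P$, and $p_i = 1$ otherwise.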
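The maximum possible score can be computed directly from this definition. The sketch below assumes every predicate score lies in [0, 1] and that the scoring function is monotonic; the function name `max_possible_score` and the tuple layout are hypothetical choices for illustration.

```python
def max_possible_score(F, predicates, evaluated, t):
    """Upper bound on F for tuple t when only some predicates are known.

    F          -- monotonic scoring function taking one score per predicate
    predicates -- list of (name, function) pairs, in F's argument order
    evaluated  -- set of predicate names already evaluated (the set P)
    t          -- the tuple being scored
    """
    # Evaluated predicates contribute their real score;
    # unevaluated ones are replaced by the maximum value 1.
    scores = [p(t) if name in evaluated else 1.0 for name, p in predicates]
    return F(*scores)

# Example: F = p1 + p2 + p3 on a tuple that stores its own scores (assumed layout).
F = lambda *scores: sum(scores)
predicates = [
    ("p1", lambda t: t["p1"]),
    ("p2", lambda t: t["p2"]),
    ("p3", lambda t: t["p3"]),
]
t = {"p1": 0.5, "p2": 0.25, "p3": 0.75}

bound_none = max_possible_score(F, predicates, set(), t)               # 3.0
bound_p1 = max_possible_score(F, predicates, {"p1"}, t)                # 2.5
bound_all = max_possible_score(F, predicates, {"p1", "p2", "p3"}, t)   # 1.5
```

Because F is monotonic, the bound can only decrease as more predicates are evaluated, and it equals the true score once all of them are.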
Operators • Need to extend relational-algebra operators for manipulating rank-relations. • For supporting ranking as a first-class construct, define a new operator ‘μ’. • This new ‘rank’ operator should satisfy the two requirements: splitting and interleaving.
New Operator, μ • Extend relational algebra by adding a new rank operator, μ. What does μ mean? • Extend the original semantics of existing operators with rank-awareness, enabling interaction with the new rank operator. • Extend relational algebra so that it provides several equivalence laws relevant to ranking.
Ranking Query Plans: Execution Model • Extend the common execution model to handle ranking queries. • Operators incrementally output rank-relations. • The query has an explicitly requested result size, k. • The key capability of a rank-aware operator is to decide whether it has obtained enough information from its input tuples to incrementally produce the next ranked output tuple.
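One way to picture this incremental behavior is a generator-based sketch of a rank operator. It is not the paper's implementation: it assumes the scoring function is a plain sum, that every predicate score lies in [0, 1], and that the input arrives in non-increasing order of its upper bound. The names `index_scan` and `mu`, the `drawn` log, and the toy rows are all hypothetical.

```python
import heapq
from itertools import islice

def index_scan(rows, key, n_preds, drawn):
    """Yield (upper_bound, row) ordered by one already-evaluated predicate.

    Each of the other n_preds - 1 predicates counts as 1 in the bound.
    `drawn` records which rows were actually pulled by the consumer.
    """
    for row in sorted(rows, key=lambda r: r[key], reverse=True):
        drawn.append(row["id"])
        yield row[key] + (n_preds - 1), row

def mu(source, pred):
    """Evaluate one more predicate and emit tuples in the refined bound order."""
    heap = []      # max-heap emulated with negated bounds
    counter = 0    # tie-breaker so rows themselves are never compared
    for bound, row in source:
        # No unseen tuple can exceed `bound`, so any buffered tuple whose
        # refined bound is at least `bound` is safe to output now.
        while heap and -heap[0][0] >= bound:
            neg, _, out = heapq.heappop(heap)
            yield -neg, out
        # Replace the placeholder 1 with the real predicate score.
        refined = bound - 1.0 + pred(row)
        heapq.heappush(heap, (-refined, counter, row))
        counter += 1
    # Input exhausted: flush the buffer in bound order.
    while heap:
        neg, _, out = heapq.heappop(heap)
        yield -neg, out

rows = [
    {"id": "a", "p1": 1.0, "p2": 0.25},
    {"id": "b", "p1": 0.75, "p2": 1.0},
    {"id": "c", "p1": 0.5, "p2": 0.5},
    {"id": "d", "p1": 0.25, "p2": 0.75},
]

drawn = []
plan = mu(index_scan(rows, "p1", 2, drawn), lambda r: r["p2"])
top_1 = [(score, row["id"]) for score, row in islice(plan, 1)]
```

With k = 1 the operator outputs `b` after drawing only `a`, `b`, and `c`; row `d` is never read and its second predicate is never evaluated.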
Conclusion • RankSQL is a system that provides a systematic framework to support efficient evaluation of top-k queries in RDBMSs. • It extends relational algebra to make ranking a first-class construct. • The query execution model is extended to handle ranking queries. • Rank-aware operators are selective and context-sensitive.