RankSQL: Supporting Ranking Queries in RDBMS
Chengkai Li (UIUC), Mohamed A. Soliman (Univ. of Waterloo), Kevin Chen-Chuan Chang (UIUC), Ihab F. Ilyas (Univ. of Waterloo)
Overview
• Ranking (top-k) is an important functionality in many real-world database applications:
    • E-commerce, Web sources
    • Multimedia databases
    • Text retrieval, search engines
    • OLAP, decision support
• RankSQL:
    • Supports ranking as a first-class query type in an RDBMS;
    • Integrates ranking with traditional Boolean query constructs.
Demo Query
    SELECT *
    FROM A, B, C
    WHERE A.j1 = B.j1 AND B.j2 = C.j2
    ORDER BY A.p1 + B.p2 + C.p3 DESC
    LIMIT 10;
• Membership dimension (the WHERE clause): Boolean predicates, combined by a Boolean function B
• Order dimension (the ORDER BY clause): ranking predicates, combined by a monotonic scoring function R
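To make the semantics of the demo query concrete, here is a minimal sketch that runs it against toy data. It uses Python's built-in `sqlite3` module rather than the PostgreSQL-based RankSQL prototype, and the table contents, integer scores, and the `run_demo_query` helper are all assumptions made up for illustration. SQLite simply joins and sorts, so this shows what the query returns, not how RankSQL executes it.

```python
import sqlite3

def run_demo_query(k=10):
    """Run the slide's top-k join query on three tiny, made-up tables."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    # Join columns j1, j2 and ranking-predicate columns p1, p2, p3 follow the slide;
    # the rows themselves are hypothetical.
    cur.execute("CREATE TABLE A (j1 INTEGER, p1 INTEGER)")
    cur.execute("CREATE TABLE B (j1 INTEGER, j2 INTEGER, p2 INTEGER)")
    cur.execute("CREATE TABLE C (j2 INTEGER, p3 INTEGER)")
    cur.executemany("INSERT INTO A VALUES (?, ?)", [(1, 90), (2, 50), (3, 10)])
    cur.executemany("INSERT INTO B VALUES (?, ?, ?)",
                    [(1, 10, 20), (2, 20, 80), (3, 10, 70)])
    cur.executemany("INSERT INTO C VALUES (?, ?)", [(10, 60), (20, 30)])
    # Same shape as the demo query; the score is selected explicitly so it is visible.
    rows = cur.execute(
        "SELECT A.j1, B.j2, A.p1 + B.p2 + C.p3 AS score "
        "FROM A, B, C "
        "WHERE A.j1 = B.j1 AND B.j2 = C.j2 "   # membership dimension
        "ORDER BY score DESC "                 # order dimension
        "LIMIT ?",
        (k,),
    ).fetchall()
    conn.close()
    return rows

print(run_demo_query(2))  # [(1, 10, 170), (2, 20, 160)]
```

The WHERE clause decides which joined tuples qualify; the ORDER BY expression, a sum of per-table scores, decides which of them make the top k.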
RankSQL System
• Extends PostgreSQL:
    • Query engine:
        • Ranking algebra
        • Execution engine of ranking operators
        • Ranking query optimizer
    • Front end:
        • Visualizes plan enumeration and execution
• Dataset: 3-table join, 100,000 tuples per table, key-foreign key join
The Differences and Insights
[Figure: two query plans annotated with tuple counts. The traditional plan appears to show sort (10) over join (99720) over two inputs of 100000 tuples each. The new plan appears to show rank-join (10) over two ranking operators (289 and 7422) over two inputs of 100000 tuples each.]
• Differences
    • Traditional plan: ~7 sec, materialize-then-sort.
    • New plan: ~0.8 sec. Ranking is split and interleaved with Boolean query constructs.
• Why? Early ranking enables
    • Reduced Boolean effort: cuts intermediate results for Boolean join/filter;
    • Reduced ranking effort: expensive ranking predicates can be optimized as well.
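The contrast between the two plans can be sketched in a few lines. The code below is not the RankSQL rank-join operator; it is a simplified, threshold-based rank join written for illustration, under these assumptions: two inputs only, each a list of `(join_key, score)` pairs already sorted by score in descending order, and a sum as the scoring function. The names `sort_plan` and `rank_join` are hypothetical. Both functions return the top-k results together with the number of input tuples they consumed.

```python
import heapq

def sort_plan(left, right, k):
    """Materialize-then-sort: join every tuple, then sort the whole result."""
    by_key = {}
    for key, score in right:
        by_key.setdefault(key, []).append(score)
    joined = [(ls + rs, key) for key, ls in left for rs in by_key.get(key, [])]
    joined.sort(key=lambda item: item[0], reverse=True)
    return joined[:k], len(left) + len(right)

def rank_join(left, right, k):
    """Read both sorted inputs alternately and stop once the top k are certain."""
    inputs = (left, right)
    pos = [0, 0]        # tuples consumed from each input
    seen = ({}, {})     # per side: join key -> scores read so far
    heap = []           # candidate results, max-heap via negated score
    out = []

    def bound(i):
        # Best score any result could still reach using an unread tuple of side i.
        j = 1 - i
        if pos[i] >= len(inputs[i]):
            return float("-inf")   # nothing left to read on side i
        if pos[i] == 0 or pos[j] == 0:
            return float("inf")    # no information yet
        return inputs[i][pos[i] - 1][1] + inputs[j][0][1]

    side = 0
    while len(out) < k:
        if pos[side] >= len(inputs[side]):
            side = 1 - side
        if pos[side] >= len(inputs[side]):
            break                  # both inputs exhausted
        key, score = inputs[side][pos[side]]
        pos[side] += 1
        seen[side].setdefault(key, []).append(score)
        for other in seen[1 - side].get(key, []):
            heapq.heappush(heap, (-(score + other), key))
        threshold = max(bound(0), bound(1))
        # A candidate is final once no future result can beat it.
        while heap and len(out) < k and -heap[0][0] >= threshold:
            neg, result_key = heapq.heappop(heap)
            out.append((-neg, result_key))
        side = 1 - side
    return out, pos[0] + pos[1]
```

When high-scoring tuples on both sides tend to join with each other, `rank_join` can stop after reading a small prefix of each input, whereas `sort_plan` always consumes everything. How much is saved depends on the data; in the worst case the rank join also reads both inputs in full.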
RankSQL [SIGMOD 05]
• Rank-Relational Algebra: as the foundation, supports splitting and interleaving at the algebra level.
• Two-Dimensional Enumeration: the optimizer enumerates plans along two dimensions, ranking (ranking predicate scheduling) and filtering (join order selection).
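One way to picture splitting a scoring function at the algebra level is an operator that applies a single ranking predicate to a stream of tuples and keeps the stream ordered by the best score each tuple could still reach. The sketch below is a simplified illustration of that idea, not the operator defined in the paper. It assumes a sum as the scoring function and that every predicate returns an integer between 0 and a known `max_score`; the names `rank`, `top_k`, and `max_score` are hypothetical.

```python
import heapq
from itertools import islice

def rank(stream, predicate, max_score=100):
    """Apply one ranking predicate to a stream ordered by upper-bound score.

    `stream` yields (bound, row) pairs in non-increasing bound order, where the
    bound counts max_score for every predicate not yet evaluated. The output
    has the same shape, with `predicate` now evaluated.
    """
    heap = []
    seq = 0  # tie-breaker so rows themselves are never compared
    for bound, row in stream:
        new_bound = bound - max_score + predicate(row)
        heapq.heappush(heap, (-new_bound, seq, row))
        seq += 1
        # Later input rows cannot exceed `bound`, so anything at or above it is safe.
        while heap and -heap[0][0] >= bound:
            neg, _, ready = heapq.heappop(heap)
            yield -neg, ready
    while heap:  # input exhausted: flush in order
        neg, _, ready = heapq.heappop(heap)
        yield -neg, ready

def top_k(table, predicates, k, max_score=100):
    """Chain one rank operator per predicate and pull only k results."""
    stream = ((max_score * len(predicates), row) for row in table)
    for predicate in predicates:
        stream = rank(stream, predicate, max_score)
    return list(islice(stream, k))
```

Because the operators are generators, a predicate placed later in the chain is evaluated only on the tuples its predecessor has released, which is why the order in which ranking predicates are scheduled matters. In a full plan, such operators could be interleaved with joins and filters instead of being stacked directly on a scan as they are here.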
Welcome to our Demo
Group 8
Wednesday 2pm-3:30pm
Friday 9am-10:30am