RankFP: A Framework for Rank Formulation and Processing
Hwanjo Yu, Seung-won Hwang, Kevin Chen-Chuan Chang
The Context: AIMing to the Top
Enabling ad-hoc ranking in data retrieval
[Figure: query "top-3 houses" → Rank Formulation → select * from houses order by [ranking function F] limit 3 → Rank Processing → ranked results]
Problem: Enabling Ad-hoc Ranking
To enable ad-hoc ranking, we observe two major barriers:
• Usability: Ranking should be “user-friendly”, for ordinary users to easily specify their ranking criteria
• Efficiency: Ranking should be “DB-friendly”, to be amenable to efficient processing
We propose a framework combining user-friendly formulation and DB-friendly processing.
Our Insight: Combining Usability and Efficiency
We combine a qualitative model for usability with a quantitative model for efficiency
• Qualitative model
  • Query condition is represented as a relative ordering of objects
  • User-friendly, by relieving the user of specifying an absolute score for each object
  • Example: [object image] > [object image]
• Quantitative model
  • Query condition is represented as a mapping F of objects into absolute numerical scores
  • DB-friendly, by attaining an absolute score for each object
  • Example: F([object image]) = 0.9, F([object image]) = 0.5
Our Solution: RankFP (RANK Formulation and Processing)
For usability, we propose a qualitative formulation front-end, which enables rank formulation by ordering samples
For efficiency, we learn a quantitative ranking function F, which is readily expressible using the order by clause in SQL
[Figure: the iterative RankFP loop between Rank Formulation and Rank Processing. Labels: Sample Selection: generate new S; sample S (unordered); ranking R* over S; Over S: R_F ≈ R*? (yes / no); Function Learning: learn new F; ranking function F; Q: select * from houses order by F limit k; processing of Q; ranked results]
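To make "readily expressible using order by" concrete, here is a minimal sketch that assumes F is linear over numeric columns. The schema (price, size, bedrooms), the weights, and the helper order_by_expression are hypothetical, and SQLite stands in for the database only so the example is self-contained. It also assumes a higher score means a better rank, hence desc.

```python
import sqlite3

# Hypothetical weights of a linear ranking function F (not from the slides).
weights = {"price": -0.002, "size": 0.01, "bedrooms": 0.5}


def order_by_expression(weights):
    """Render a linear F as a SQL arithmetic expression.

    Column names are interpolated directly, so they must come from
    trusted code, never from user input.
    """
    return " + ".join(f"({w!r} * {col})" for col, w in weights.items())


conn = sqlite3.connect(":memory:")
conn.execute(
    "create table houses (id text, price real, size real, bedrooms integer)"
)
conn.executemany(
    "insert into houses values (?, ?, ?, ?)",
    [
        ("a", 450.0, 80.0, 2),
        ("b", 250.0, 120.0, 3),
        ("c", 200.0, 150.0, 4),
        ("d", 400.0, 110.0, 3),
        ("e", 350.0, 80.0, 2),
    ],
)

# Assumption: a higher F score means a better rank, so sort descending.
f_expr = order_by_expression(weights)
query = f"select id from houses order by {f_expr} desc limit 3"
top3 = [row[0] for row in conn.execute(query)]
print(query)
print(top3)
```

Because F is a plain arithmetic expression, the database can evaluate it per row without any pairwise comparison logic.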
Implementation
[Figure: implementation architecture. Labels: PostgreSQL interface; SVM Learner; order by F; sampled top results; if R_F ≈ R*?; Top-k results]
Task 1: Rank Formulation Front-end (Ranking → Classification)
Challenge: Unlike a conventional learning problem of classifying objects into groups, we need to learn a function inducing a desired ordering of all objects
Solution: Transform ranking into a classification on pairwise differences [Herbrich2000] and adopt learning algorithms (e.g., SVM) to learn a pairwise classification function F
[Figure: ranking view (c > b > d > e > a) → classification view → learning algorithms (a binary classifier) → F]
pairwise diff.   classification
a - b            0
b - c            0
c - d            1
d - e            1
a - c            0
…                …
[Herbrich2000] R. Herbrich, et al. Large margin rank boundaries for ordinal regression. MIT Press, 2000.
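The ranking-to-classification step of Task 1 can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the two-dimensional feature vectors are made up, and a simple perceptron is swapped in for the SVM learner named on the slide so that the example needs only the standard library.

```python
from itertools import combinations


def pairwise_examples(ranked_vectors):
    """Turn feature vectors listed best-first into (difference, label) pairs.

    Label 1 means the first object of the pair ranks higher, label 0 means
    it ranks lower, mirroring the 1/0 table on the slide.
    """
    examples = []
    for u, v in combinations(ranked_vectors, 2):  # u precedes v in the ranking
        diff = [a - b for a, b in zip(u, v)]
        examples.append((diff, 1))                # u - v: u ranks above v
        examples.append(([-d for d in diff], 0))  # v - u: the reversed pair
    return examples


def score(w, x):
    """Linear function F(x) = w . x"""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_perceptron(examples, epochs=100):
    """Learn w so that the sign of w . diff matches each pair's label.

    A perceptron is used here as a stand-in for an SVM (assumption).
    """
    w = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        mistakes = 0
        for diff, label in examples:
            target = 1 if label == 1 else -1
            if target * score(w, diff) <= 0:
                w = [wi + target * di for wi, di in zip(w, diff)]
                mistakes += 1
        if mistakes == 0:  # every pair is classified correctly
            break
    return w


# Hypothetical feature vectors for the five objects on the slide.
houses = {
    "c": (4.0, 2.0),
    "b": (3.0, 2.0),
    "d": (3.0, 1.0),
    "e": (1.0, 2.0),
    "a": (1.0, 0.0),
}
user_ranking = ["c", "b", "d", "e", "a"]  # the ranking view: c > b > d > e > a

examples = pairwise_examples([houses[name] for name in user_ranking])
w = train_perceptron(examples)
learned_order = sorted(houses, key=lambda name: score(w, houses[name]), reverse=True)
print(w, learned_order)
```

The reversed pairs are redundant for a bias-free linear learner, but they keep both class labels present, which most off-the-shelf binary classifiers expect.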
Task 2: Rank Processing Back-end (Classification → Ranking)
Challenge: While the classification is for each pair of objects (F(a-b)? F(a-c)? F(a-d)? …), we need to efficiently rank the entire database.
Solution: We develop a duality under which the pairwise classification function F also serves as a global per-object ranking function.
• Suppose the rank function F is linear, so that F(u_i - u_j) = F(u_i) - F(u_j)
  Classification view: F(u_i - u_j) > 0
  ⇔ F(u_i) - F(u_j) > 0
  ⇔ Ranking view: F(u_i) > F(u_j)
• Rank with F(.), e.g., F(c) > F(b) > F(d) > …
Further: Optimization of Top-k Order-by [SIGMOD’05]
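The duality in Task 2 can be checked numerically with a small sketch. The weight vector w and the feature vectors are hypothetical. The point is only that, for a linear F, the pairwise test and the per-object score comparison agree, so one sort over per-object scores replaces all pairwise classifications.

```python
from itertools import permutations


def F(w, u):
    """Linear function F(u) = w . u"""
    return sum(wi * ui for wi, ui in zip(w, u))


w = [1.0, 1.0]  # hypothetical learned weights
objects = {     # hypothetical feature vectors
    "a": (1.0, 0.0),
    "b": (3.0, 2.0),
    "c": (4.0, 2.0),
    "d": (3.0, 1.0),
    "e": (1.0, 2.0),
}

# Classification view vs. ranking view: the two tests agree on every pair.
# (Exact in real arithmetic; floating-point rounding could differ on near ties.)
for (_, u), (_, v) in permutations(objects.items(), 2):
    diff = [ui - vi for ui, vi in zip(u, v)]
    assert (F(w, diff) > 0) == (F(w, u) > F(w, v))

# Ranking view: one score per object, one sort, no pairwise classification.
ranking = sorted(objects, key=lambda name: F(w, objects[name]), reverse=True)
top_k = ranking[:3]
print(ranking, top_k)
```

With n objects this needs n score evaluations and one sort, instead of classifying on the order of n² pairs.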
Conclusion: Summary
To support ranking for data retrieval, we develop RankFP, an iterative learning and processing framework, combining:
• Usability: Developing a learning front-end, which enables qualitative rank formulation
• Efficiency: Transforming the classification to a global rank function for efficient processing
Thank You!
For more information:
The AIM Project: http://aim.cs.uiuc.edu