View Usability and Safety for the Answering of Top- k Queries via Materialized Views

  1. University of Ioannina Dept. of Computer Science View Usability and Safety for the Answering of Top-k Queries via Materialized Views Eftychia Baikousi Panos Vassiliadis

  2. Forecast • Problem of answering a top-k query through materialized top-n views • Theoretical guarantees for when a top-n materialized view can answer a top-k query • Algorithmic techniques for answering a top-k query from a materialized view • Properties of the safe areas of views DOLAP 2009, Hong Kong, 6 Nov 2009

  3. Contents • Motivation & Problem Definition • Overview of the Method • Theoretical guarantees • Strictness of theorem • Safe area properties • Experiments • Conclusions • Future extensions

  5. Top-k query • Given a relation R(id, x1, x2, x3) and a query Q with scoring function sum(x1, x2, x3) • Find the k tuples with the highest scores according to Q [Figure: relation R and its top-2 tuples]
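
A minimal sketch of the top-k query on this slide. The rows of R below are hypothetical (the slide's table did not survive in the transcript), and `top_k` and `q_sum` are illustrative names, not the authors' code.

```python
import heapq

# Hypothetical rows of R(id, x1, x2, x3); the values are illustrative only.
R = [
    ("t1", 3, 5, 1),
    ("t2", 7, 2, 6),
    ("t3", 4, 4, 4),
    ("t4", 1, 9, 8),
]

def q_sum(row):
    """Scoring function of Q: sum(x1, x2, x3)."""
    _, x1, x2, x3 = row
    return x1 + x2 + x3

def top_k(rows, k, score):
    """Return the k rows with the highest score, best first."""
    return heapq.nlargest(k, rows, key=score)

top2 = top_k(R, 2, q_sum)
print([(row[0], q_sum(row)) for row in top2])
```

With these made-up rows the scores are 9, 15, 12 and 18, so the top-2 tuples are t4 and t2.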

  6. Motivating Example: Telecommunication Company • Executives view sales reports on PDAs • Given a relation Region (id, name, today_traffic, yesterday_traffic, budget, ..) • and a materialized view V of the top-2 regions according to the query Q: 0.6*difftraffic + 0.4*budget [Figure: tables Region and V] • Can a new top-k query (e.g. 0.5*difftraffic + 0.3*budget) be answered from V?

  7. Problem definition • Given • a base relation R(ID, X, Y) • a materialized view V(ID, X, Y, s) that contains the top-n tuples of the form (id, s), where s is defined as s = w(a·x + y) and w, a are positive parameters • a query Q(ID, X, Y, sQ) that requests the top-k (k ≤ n) tuples of the form (id, sQ), where sQ is defined as sQ = wQ(aQ·x + y) and wQ, aQ are positive parameters • Introduce • an algorithm that decides whether V by itself is suitable to answer Q and computes Q's answer
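
The scoring form s = w(a·x + y) covers any linear function with positive coefficients, which can be sketched as follows. The helper names `normalize` and `score` are hypothetical, and the sample points are made up; the two scoring functions x + 2y and 2x + y are the ones used in the example later in the talk.

```python
def normalize(cx, cy):
    """Rewrite cx*x + cy*y as w*(a*x + y); both coefficients must be positive."""
    if cx <= 0 or cy <= 0:
        raise ValueError("both coefficients must be positive")
    return cy, cx / cy  # (w, a)

def score(w, a, x, y):
    """Score of the point (x, y) under s = w*(a*x + y)."""
    return w * (a * x + y)

w_v, a_v = normalize(1, 2)  # x + 2y  ->  w = 2, a = 0.5
w_q, a_q = normalize(2, 1)  # 2x + y  ->  w = 1, a = 2.0

# Because w > 0, the ranking depends only on a (hypothetical sample points).
points = [(1, 9), (8, 5), (4, 6)]
by_full = sorted(points, key=lambda p: score(w_v, a_v, *p), reverse=True)
by_slope = sorted(points, key=lambda p: a_v * p[0] + p[1], reverse=True)
print(by_full == by_slope)
```

Since w only rescales the scores, the geometric arguments in the rest of the talk can work with the single parameter a.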

  8. Related Work • Gautam Das, Dimitrios Gunopulos, Nick Koudas, Dimitris Tsirogiannis: “Answering Top-k Queries Using Views”, VLDB ’06 • Answer a top-k query Q by making use of ranking views V • LPTA in two steps • SelectViews(V, Q) • Selects an efficient subset of views U for answering Q • U contains the sorted lists over each attribute of the relation • Answer Q from U • Linear programming adaptation of the TA algorithm • Stopping condition: solution of linear program ≤ min(top-k)
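
For orientation, here is a sketch of the classic threshold algorithm (TA) that LPTA adapts. It is plain TA over one sorted list per attribute, not LPTA: the linear-programming stopping condition is replaced by the usual TA threshold. The function name, the data layout and the sample rows are assumptions made for illustration.

```python
import heapq

def threshold_algorithm(rows, weights, k):
    """Classic TA over one descending sorted list per attribute.

    rows: dict mapping id -> tuple of attribute values
    weights: non-negative weights of a linear scoring function
    Returns (top-k list of (id, score), number of sorted-access rounds).
    """
    dims = len(weights)
    lists = [sorted(rows, key=lambda i, d=d: rows[i][d], reverse=True)
             for d in range(dims)]

    def score(i):
        return sum(w * v for w, v in zip(weights, rows[i]))

    seen = {}
    for depth in range(len(rows)):
        # Sorted access on every list, then random access for the full score.
        for d in range(dims):
            i = lists[d][depth]
            seen[i] = score(i)
        # Threshold: the best score any still-unseen tuple could have.
        threshold = sum(w * rows[lists[d][depth]][d]
                        for d, w in enumerate(weights))
        top = heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
        if len(top) == k and top[-1][1] >= threshold:
            return top, depth + 1
    return heapq.nlargest(k, seen.items(), key=lambda kv: kv[1]), len(rows)

# Hypothetical relation R(ID, X, Y) and query 2x + y, top-1.
rows = {"t1": (1, 9), "t2": (8, 5), "t3": (4, 6), "t4": (6, 2), "t5": (2, 3)}
result, rounds = threshold_algorithm(rows, (2, 1), 1)
print(result, rounds)
```

On this sample the algorithm stops before scanning the whole relation, which is the behaviour the stopping condition is meant to provide.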

  9. Related Work – Geometric Representation (0) • Assume • Relation R(ID, X, Y) • Two views Vu(id, Score1) and Vd(id, Score2) • Query Q(id, Score) • Scoring functions of the form Score = w(a·x + y) • Depicted as the line y = (1/a)·x

  10. Related Work – Geometric Representation (1) • M: the kth tuple in Q • Stopping condition: the sweeping line crosses position A1B • Any point below line AB has a smaller score than M with respect to Q

  11. Related Work – Geometric Representation (2) • Stopping condition: the intersection point S of the two sweeping lines lies on line AB • Any point below line AB has a smaller score than M with respect to Q

  12. Related Work • SelectViews(V, Q) is data dependent • It is based on an estimation of the last tuple of Q according to the data distribution • No theoretically established guarantees that the set of views will answer Q

  14. Overview of the method • Theoretical guarantees for answering a query Q via a view VU • Theoretical guarantees are too strict • Parallelism of safe areas

  15. Example • V: top-3 with score x + 2y • Q: top-1 with score 2x + y [Figure: relation R]

  16. Construction of safe area • VU(ID, X, Y, sU) • containing the top-n tuples • with score sU = wU(aU·x + y) • tN: the nth tuple in VU • LU: the line xNU yNU, perpendicular to VU, passing through tN and meeting the X and Y axes • LQ: the line xNU yQ, perpendicular to Q, passing through xNU
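
The construction can be sketched numerically. This is an editor's reading of the slide, under two assumptions that the transcript does not state explicitly: attributes are non-negative, and aU ≤ aQ (which is what makes the region above LQ lie above LU). The function name and the sample tuple tN are hypothetical.

```python
def safe_area_lines(a_u, a_q, t_n):
    """Axis intercepts of L_U and L_Q.

    Assumes a_u <= a_q and non-negative attributes (editor's assumption).
    """
    x_n, y_n = t_n
    c = a_u * x_n + y_n  # L_U is the iso-score line a_u*x + y = c through t_N
    x_nu = c / a_u       # L_U meets the X axis at (x_nu, 0)
    y_nu = c             # L_U meets the Y axis at (0, y_nu)
    y_q = a_q * x_nu     # L_Q: a_q*x + y = a_q*x_nu meets the Y axis at (0, y_q)
    return x_nu, y_nu, y_q

# View x + 2y = 2(0.5x + y) and query 2x + y, with a hypothetical t_N = (4, 6).
x_nu, y_nu, y_q = safe_area_lines(0.5, 2.0, (4, 6))
print(x_nu, y_nu, y_q)
```

Here LU is 0.5x + y = 8 and LQ is 2x + y = 32; a tuple lies in the safe area when 2x + y ≥ 32.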

  17. Safe area • The safe area is defined as the area “above” line LQ (shaded area) • Observations • Any tuple in the safe area has a higher score (with respect to Q) than any tuple outside the safe area • Tuples in the safe area belong to both VU and Q
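
Both observations can be checked by brute force on a small synthetic relation. The grid data, the parameters aU = 1.0, aQ = 1.2, n = 10 and all names below are assumptions chosen so that the safe area is not empty; they are not taken from the talk.

```python
def in_safe_area(t, a_q, bound):
    """True if t lies on or above L_Q, i.e. a_q*x + y >= bound."""
    x, y = t
    return a_q * x + y >= bound

a_u, a_q, n = 1.0, 1.2, 10

# Hypothetical relation: all integer points (x, y) with 0 <= x, y <= 10.
R = [(x, y) for x in range(11) for y in range(11)]

# Top-n view by a_u*x + y, best first; t_N is its last tuple.
view = sorted(R, key=lambda t: a_u * t[0] + t[1], reverse=True)[:n]
x_n, y_n = view[-1]
bound = a_q * (a_u * x_n + y_n) / a_u  # Q-score level of L_Q (assumes a_u <= a_q)

safe = [t for t in R if in_safe_area(t, a_q, bound)]
outside = [t for t in R if not in_safe_area(t, a_q, bound)]

def q_score(t):
    return a_q * t[0] + t[1]

# Observation 1: every safe tuple outranks every tuple outside the safe area.
obs1 = min(map(q_score, safe)) > max(map(q_score, outside))
# Observation 2: every safe tuple is stored in the view.
obs2 = all(t in view for t in safe)
print(len(safe), obs1, obs2)
```

The safe tuples are found by scanning R only to verify the observations; in the intended setting they would be read from the view alone.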

  18. Answering Q from VU • THEOREM 1: VU can answer Q if the safe area contains at least k tuples • The converse does not always hold

  20. Answering Q from VU cont. • THEOREM 2: VU may still be able to answer Q when the safe area contains fewer than k tuples • This holds when the area (yellow triangle) defined by • line LU, • the X-axis, and • line L1, producing the lowest possible score for Q from tuples of VU, • is void of tuples
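
A small brute-force illustration of why the safe-area test is sufficient but not necessary. The five tuples are hypothetical, and the comparison against the full relation R is only possible in this toy setting; the sketch does not test the yellow-triangle condition itself. The view and query are the x + 2y and 2x + y functions of the earlier example, written with a = 0.5 and a = 2.

```python
# Hypothetical relation R(ID, X, Y).
R = {"t1": (1, 9), "t2": (8, 5), "t3": (4, 6), "t4": (6, 2), "t5": (2, 3)}
a_u, a_q, n, k = 0.5, 2.0, 3, 1

def s(a, t):
    """Score a*x + y (the positive factor w does not affect the ranking)."""
    return a * t[0] + t[1]

# Materialize the top-n view and locate t_N.
view_ids = sorted(R, key=lambda i: s(a_u, R[i]), reverse=True)[:n]
t_n = R[view_ids[-1]]
bound = a_q * s(a_u, t_n) / a_u  # Q-score level of L_Q

safe_ids = [i for i in view_ids if s(a_q, R[i]) >= bound]

answer_from_view = sorted(view_ids, key=lambda i: s(a_q, R[i]), reverse=True)[:k]
answer_from_r = sorted(R, key=lambda i: s(a_q, R[i]), reverse=True)[:k]
print(safe_ids, answer_from_view, answer_from_r)
```

The safe area is empty here, yet the top-1 answer computed from the view coincides with the one computed from R.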

  21. Algorithm TestViewSuitability • Three main steps • Step 1: Compute the safe area (Q, V) • Step 2: Count the tuples in V that belong to the safe area • Step 3: If there are at least k, then return (true), else return (false)

  23. Combining two views • Lines LQU, LQD ⊥ Q • characterizing the safe areas for VU and VD • LQU ║ LQD • The safe area of one view (VU) is encompassed in the safe area of the other view (VD)
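
Because both bounding lines are iso-score lines of Q, comparing two safe areas reduces to comparing two numbers. In the sketch below, the case a_v ≤ a_q follows the construction on the earlier slide; the case a_v > a_q (line through the Y-axis intercept) is an assumption made by symmetry and is not spelled out in the transcript. The function name and the two sample views are hypothetical.

```python
def safe_bound(a_v, a_q, t_n):
    """Q-score level of the line bounding a view's safe area."""
    c = a_v * t_n[0] + t_n[1]  # score level of the view's n-th tuple
    if a_v <= a_q:
        # L_Q passes through the X-axis intercept (c / a_v, 0) of the view's line.
        return a_q * c / a_v
    # Assumed symmetric case: L_Q passes through the Y-axis intercept (0, c).
    return c

a_q = 1.0
bound_u = safe_bound(0.5, a_q, (4, 6))  # hypothetical view VU, a_u = 0.5
bound_d = safe_bound(2.0, a_q, (3, 2))  # hypothetical view VD, a_d = 2.0

# Parallel lines: the half-plane with the lower level contains the other one.
encompassing = "VU" if bound_u <= bound_d else "VD"
print(bound_u, bound_d, encompassing)
```

With these numbers the safe area of VD (level 8) contains the safe area of VU (level 16), matching the situation drawn on the slide.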

  25. Experimental methodology • Test the following methods • Our algorithm • TA algorithm (it can guarantee the correctness of view usability) • For the following goals • Effectiveness • Number of queries answered by views • Efficiency • Time savings from the usage of queries

  26. Experimental methodology • Experimental parameters: • Synthetic data sets: • Random data sets of different sizes for a relation of the form R (ID, X, Y) • Sequence of queries with random coefficients and result size k

  27. Effectiveness • Percentage of views used for 100 queries

  28. Effectiveness • Percentage of views used for different time spans

  29. Efficiency • Time savings from the usage of queries for different database sizes and requested results • Conflicting case • The number of stored results rises, while the savings drop • Due to the size of used memory • Memory allocation becomes slow • Probably one view is able to answer a lot of queries • Savings increase for reasonable k’s of size 0.1%

  31. Conclusions • We have provided theoretical and algorithmic results for the problem of answering top-k queries via materialized views • Theoretical and algorithmic results: • Theorem 1: Theoretical guarantees for a view to answer a top-k query • Theorem 2: Strictness of Theorem 1 • Parallelism of safe areas

  33. Future Work • Optimization in case of time and storage constraints • View Caching • Hierarchical structures for the set of views • Sorting techniques

  34. Thank you for your attention! … many thanks to our hosts!

  35. Auxiliary Time Savings