Continuous k-dominant Skyline Query Processing
Presented by Prasad Sriram Nilu Thakur
Outline
• Introduction
• Problem Definition
• Key Concepts
• Validation
• Rewrite Today
Example Skyline
Finding the skyline of hotels: lower price and closer to the beach
• Which one is better?
• e or b? (e, because its price and distance dominate those of b)
• c or f?
[Figure: hotels a-f plotted by price (y-axis, 50 to 200) against distance (x-axis, 1 to 4)]
Problem Definition
Input
• A set of points p1, p2, …, pn
Output
• A set of points P (referred to as the skyline points), such that any point p ∈ P is not dominated by any other point in the data set
Objective
• Provide correct and complete results
• Minimize the query response time and memory consumption
• Continuous queries require continuous evaluation
• Scalability in terms of the number of queries
Constraints
• Minimize the number of dominance checks
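The input/output definition above can be sketched in a few lines of Python. This is only a minimal illustration, assuming smaller values are better in every dimension; the function names `dominates` and `skyline` and the hotel coordinates are hypothetical (the values in the original plot are not recoverable).

```python
from typing import Dict, List, Sequence

Point = Sequence[float]


def dominates(p: Point, q: Point) -> bool:
    """True if p is no worse than q in every dimension and strictly
    better in at least one (smaller is better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep each point that no point dominates.
    A point never dominates itself, so no self-exclusion is needed."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical hotels as (price, distance to the beach).
hotels: Dict[str, Point] = {
    "a": (200, 1), "b": (120, 3), "c": (60, 4),
    "d": (150, 2), "e": (100, 2), "f": (80, 3),
}
sky = skyline(list(hotels.values()))
skyline_names = sorted(name for name, h in hotels.items() if h in sky)
print(skyline_names)  # ['a', 'c', 'e', 'f']
```

With these made-up values, e dominates b, while c and f are incomparable, so both stay in the skyline.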
Skyline Properties (1/2)
• Meaningful for incomparable dimensions
  • Browsing laptops
  • Price, weight, size, memory, etc.
• Insensitive to scaling and shifting of the dimensions
• Skyline - Curse of Dimensionality
  • Movie rating
  • Different users may have different rating preferences
  • Movie p is better than q only if p is rated higher than or equal to q by all users
  • One outlier opinion will invalidate the dominance
Skyline Properties (2/2)
• Too many skyline points in high-dimensional spaces
  • Example: NBA data set, 17,000 player season statistics on 17 attributes
  • Over 1,000 skyline points in the full space
  • Some average-skilled players are in the skyline if they are not bad on some attributes
• Possible solutions
  • Dimension reduction techniques - require domain knowledge
  • Subspace skylines - many subspaces need to be explored
  • Relax the notion of d-dominance - k-dominance
k-dominant Skyline
• k-Dominate
  • If A is not worse than B on some k of the d dimensions, and better on at least one of those k dimensions, we say A k-dominates B
• k-Dominant Skyline
  • The k-dominant skyline contains all the points that are not k-dominated by any other point
• k-Dominant Skyline Query
  • Given a data set, find the k-dominant skyline
  • When k = d, we have the conventional skyline
  • For k < d, k-dominance can be cyclic, unlike d-dominance
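The definitions above translate fairly directly into code. The sketch below assumes smaller values are better; `k_dominates`, `k_dominant_skyline`, and the four sample points are hypothetical names and data chosen for illustration, and the O(n^2) scan is not the algorithm from [2].

```python
from typing import List, Sequence

Point = Sequence[float]


def k_dominates(p: Point, q: Point, k: int) -> bool:
    """True if p k-dominates q (smaller is better).

    p must be no worse than q on some k dimensions and strictly better
    on at least one of them. Any strictly-better dimension is also a
    not-worse dimension, so it can always be placed among the chosen k.
    """
    not_worse = sum(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return not_worse >= k and strictly_better


def k_dominant_skyline(points: List[Point], k: int) -> List[Point]:
    """Naive scan: keep each point that no point k-dominates."""
    return [p for p in points if not any(k_dominates(q, p, k) for q in points)]


# Hypothetical 4-dimensional data (d = 4).
x, y, z, w = (1, 1, 1, 5), (2, 2, 2, 1), (3, 3, 0, 3), (4, 4, 4, 4)
points = [x, y, z, w]

for k in (4, 3, 2):
    print(k, k_dominant_skyline(points, k))
# 4 [(1, 1, 1, 5), (2, 2, 2, 1), (3, 3, 0, 3)]   (conventional skyline)
# 3 [(1, 1, 1, 5)]
# 2 []
```

On this sample the result shrinks as k decreases, and for k = 2 it is empty, which is possible because k-dominance is not transitive.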
k-dominant Skyline - Example
[Figure: conventional skyline, 5-dominant skyline, and 4-dominant skyline]
• Smaller k, smaller k-dominant skyline
Slide courtesy of [2]
Cyclic Properties of k-dominance
• k-dominance can be cyclic
• A 3-dominates B
• B 3-dominates C
• C 3-dominates D
• D 3-dominates A
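A concrete instance of such a cycle is easy to construct. The four 4-dimensional points below are hypothetical (the slide's own values are not available), smaller is assumed to be better, and `k_dominates` is an illustrative helper rather than code from the presentation.

```python
from typing import Sequence


def k_dominates(p: Sequence[float], q: Sequence[float], k: int) -> bool:
    """p is no worse than q on at least k dimensions and strictly
    better on at least one dimension (smaller is better)."""
    not_worse = sum(a <= b for a, b in zip(p, q))
    return not_worse >= k and any(a < b for a, b in zip(p, q))


# Hypothetical points: each is a rotation of (1, 2, 3, 4).
A, B, C, D = (1, 2, 3, 4), (2, 3, 4, 1), (3, 4, 1, 2), (4, 1, 2, 3)

cycle = [(A, B), (B, C), (C, D), (D, A)]
cycle_holds = all(k_dominates(p, q, 3) for p, q in cycle)
print(cycle_holds)  # True: A -> B -> C -> D -> A

# Every point is 3-dominated by its predecessor in the cycle,
# so the 3-dominant skyline of {A, B, C, D} is empty.
pts = [A, B, C, D]
three_dominant = [p for p in pts if not any(k_dominates(q, p, 3) for q in pts)]
print(three_dominant)  # []
```

Under conventional 4-dominance none of these points dominates another, so all four belong to the conventional skyline.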
A naïve approach
• Case 1: a new point arrives
  • It is k-dominated by some points
  • It k-dominates some points
• Case 2: a point expires
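One way to read the two cases above is a sliding window that recomputes the k-dominant skyline from scratch whenever a point arrives or expires. The sketch below makes that assumption explicit; the class name `NaiveWindowSkyline`, the time-based window, and the sample points are hypothetical and are not taken from the slides.

```python
from collections import deque
from typing import Deque, List, Tuple

Point = Tuple[float, ...]


def k_dominates(p: Point, q: Point, k: int) -> bool:
    """p is no worse than q on at least k dimensions and strictly
    better on at least one (smaller is better)."""
    not_worse = sum(a <= b for a, b in zip(p, q))
    return not_worse >= k and any(a < b for a, b in zip(p, q))


class NaiveWindowSkyline:
    """Keeps the points of the last `window` time units and rescans
    all of them for every skyline request (assumed naive baseline)."""

    def __init__(self, k: int, window: int) -> None:
        self.k = k
        self.window = window
        self.points: Deque[Tuple[int, Point]] = deque()  # (arrival time, point)

    def advance(self, now: int) -> None:
        """Case 2: drop every point whose lifetime has ended by `now`."""
        while self.points and self.points[0][0] + self.window <= now:
            self.points.popleft()

    def arrive(self, now: int, p: Point) -> None:
        """Case 1: a new point arrives at time `now`."""
        self.advance(now)
        self.points.append((now, p))

    def skyline(self) -> List[Point]:
        """O(n^2) rescan of the whole window."""
        pts = [p for _, p in self.points]
        return [p for p in pts if not any(k_dominates(q, p, self.k) for q in pts)]


w = NaiveWindowSkyline(k=2, window=3)
w.arrive(1, (1, 1, 9))
w.arrive(2, (2, 2, 1))
w.arrive(3, (3, 3, 3))
before = w.skyline()
w.advance(4)          # the point that arrived at t = 1 expires
after = w.skyline()
print(before, after)  # [(1, 1, 9)] [(2, 2, 1)]
```

The rescan is simple to reason about, but every update costs a number of dominance checks quadratic in the window size, which is what the constraint on dominance checks aims to avoid.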
An improved approach
[Figure sequence: contents of the skyline heap and the non-skyline heap as points arrive and expire]
• Initial state: a(1) b(3) c(5) d(7) e(9) f(11) g(13)
• h(15) arrives: a(1) h(26) b(3) c(5) d(7) e(9) f(11) g(13)
• At t = 16: b(3) h(26) d(7) c(5) e(9) f(11) g(13)
• i(17) arrives: b(3) h(26) d(7) c(5) i(20) e(9) f(11) g(13)
• At t = 18: c(5) i(20) d(7) f(11) e(9) g(13)
• j(19) arrives: c(5) i(20) d(7) f(11) e(9) g(13)
• After j is inserted: c(5) i(20) j(32) d(7) f(11) e(9) g(13)
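The slides do not spell out how the two heaps are maintained or what the numbers in parentheses denote, so the following is only a speculative sketch of one way a skyline heap and a non-skyline heap could work together. It assumes both heaps are min-heaps keyed by expiry time (arrival time plus a fixed window), that smaller values are better, and that non-skyline points must be kept because, with non-transitive k-dominance, they can still k-dominate other points. The class `TwoHeapSkyline` and all names in it are hypothetical.

```python
import heapq
from typing import List, Tuple

Point = Tuple[float, ...]
Entry = Tuple[int, Point]  # (expiry time, point); assumed heap key


def k_dominates(p: Point, q: Point, k: int) -> bool:
    not_worse = sum(a <= b for a, b in zip(p, q))
    return not_worse >= k and any(a < b for a, b in zip(p, q))


class TwoHeapSkyline:
    def __init__(self, k: int, window: int) -> None:
        self.k, self.window = k, window
        self.sky: List[Entry] = []      # points no live point k-dominates
        self.non_sky: List[Entry] = []  # live points that are k-dominated

    def _dominated(self, p: Point) -> bool:
        # Check against ALL live points, not only the skyline ones.
        return any(k_dominates(q, p, self.k) for _, q in self.sky + self.non_sky)

    def expire(self, now: int) -> None:
        expired = False
        for heap in (self.sky, self.non_sky):
            while heap and heap[0][0] <= now:
                heapq.heappop(heap)
                expired = True
        if not expired:
            return
        # A removed point may have been the last dominator of some
        # non-skyline points; those are promoted to the skyline heap.
        promoted = [e for e in self.non_sky if not self._dominated(e[1])]
        self.non_sky = [e for e in self.non_sky if e not in promoted]
        heapq.heapify(self.non_sky)
        for e in promoted:
            heapq.heappush(self.sky, e)

    def arrive(self, now: int, p: Point) -> None:
        self.expire(now)
        dominated = self._dominated(p)
        # Only skyline points can change status: demote those p k-dominates.
        for e in [e for e in self.sky if k_dominates(p, e[1], self.k)]:
            heapq.heappush(self.non_sky, e)
        self.sky = [e for e in self.sky if not k_dominates(p, e[1], self.k)]
        heapq.heapify(self.sky)
        heapq.heappush(self.non_sky if dominated else self.sky,
                       (now + self.window, p))

    def skyline(self) -> List[Point]:
        return [p for _, p in sorted(self.sky)]


s = TwoHeapSkyline(k=2, window=3)
s.arrive(1, (1, 1, 9))
s.arrive(2, (2, 2, 1))
s.arrive(3, (3, 3, 3))
before = s.skyline()
s.expire(4)           # the first point's lifetime ends
after = s.skyline()
print(before, after)  # [(1, 1, 9)] [(2, 2, 1)]
```

Compared with a full rescan, an arrival here only tests the new point against the live points, and non-skyline points are re-examined only when something actually expires. Whether this matches the presenters' design cannot be confirmed from the slides.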
Validation
Methodology
• Theorem-based proofs of correctness and completeness
• Experiments to analyze performance
Validation criteria
• Query response time
Rewrite Today
Improvements
• A better technique for k-dominance
• Conduct detailed experiments with network object generators
• Investigate how to find (spatial) skylines in road networks
References
[1] Yufei Tao, Dimitris Papadias: Maintaining Sliding Window Skylines on Data Streams. IEEE Trans. Knowl. Data Eng. 18(2): 377-391 (2006).
[2] Chee Yong Chan, H. V. Jagadish, Kian-Lee Tan, Anthony K. H. Tung, Zhenjie Zhang: Finding k-dominant Skylines in High Dimensional Space. SIGMOD Conference 2006: 503-514.
[3] M. Sharifzadeh, C. Shahabi: The Spatial Skyline Queries. In Proceedings of VLDB'06.
[4] Michael D. Morse, Jignesh M. Patel, William I. Grosky: Efficient Continuous Skyline Computation. ICDE 2006: 108.
[5] Zhiyong Huang, Hua Lu, Beng Chin Ooi, Anthony K. H. Tung: Continuous Skyline Queries for Moving Objects. IEEE Trans. Knowl. Data Eng. 18(12): 1645-1658 (2006).
[6] S. Borzsonyi, D. Kossmann, K. Stocker: The Skyline Operator. In Proceedings of ICDE'01.
[7] D. Kossmann, F. Ramsak, S. Rost: Shooting Stars in the Sky: An Online Algorithm for Skyline Queries. In Proceedings of VLDB'02.