Influence Zone: Efficiently Processing Reverse k Nearest Neighbors Queries
Presented by: Muhammad Aamir Cheema
Joint work with Xuemin Lin, Wenjie Zhang, Ying Zhang
University of New South Wales, Australia
Introduction
• Nearest Neighbor Query: find the user that is closest to the query facility
• Reverse Nearest Neighbor Query (RNN): find every user for which the query facility is the closest facility
• Reverse k Nearest Neighbor Query (RkNN): find every user for which the query facility is one of the k closest facilities
• Monochromatic RkNN queries are also supported
• Example (figure with users u1 to u4, facilities f1 to f3 and query q): u3 is the nearest neighbor of q; u1 and u2 are the RNNs of q; u1, u2 and u3 are the R2NNs of q
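The RkNN definition above can be written as a brute-force check. This is only a sketch for illustration: the function name `rknn` and the coordinates are hypothetical and do not reproduce the layout of the slide figure.

```python
from math import dist

def rknn(q, users, facilities, k):
    """Return the names of users for which q is one of their k closest facilities.

    `facilities` holds the competing facilities and does not include q itself.
    """
    result = []
    for name, u in users.items():
        # Count the facilities that are strictly closer to u than q is.
        closer = sum(1 for f in facilities if dist(u, f) < dist(u, q))
        if closer < k:
            result.append(name)
    return result

# Hypothetical coordinates, chosen only for this example.
q = (0.0, 0.0)
facilities = [(4.0, 0.0), (0.0, 6.0)]
users = {"u1": (1.0, 0.0), "u2": (3.0, 0.0), "u3": (3.0, 5.0)}

print(rknn(q, users, facilities, 1))  # ['u1']
print(rknn(q, users, facilities, 2))  # ['u1', 'u2']
```

The scan costs one pass over all facilities per user, which is the cost the pruning techniques on the following slides try to avoid.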
Preliminaries: Half-space Pruning [VLDB04] (RNN query)
• Half-space pruning: if dist(u,f2) < dist(u,q), u cannot be an RNN of q, so the half-space that contains f2 (bounded by the perpendicular bisector of f2 and q) can be pruned
• Filtering: repeat until there is no facility in the unpruned area: find a nearby facility in the unpruned space and prune by using its half-space
• Containment: the users that are contained in the unpruned space are the candidates
• Verification: each candidate user u is an RNN if no facility lies within distance dist(u,q) of u
Preliminaries: Half-space Pruning [VLDB04] (R2NN query)
• Half-space pruning: the space that is contained in at least k half-spaces can be pruned
• Filtering: repeat until there is no facility in the unpruned area: find a nearby facility in the unpruned space and prune by using its half-space
• Containment: the users that are in the unpruned space are the candidate objects
• Verification: each candidate user u is an RkNN if fewer than k facilities lie within distance dist(u,q) of u
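The pruning rule can be sketched as a point test. The helper names below are hypothetical; the test uses the fact that dist(p,f) < dist(p,q) holds exactly when p lies on f's side of the perpendicular bisector of f and q.

```python
def in_pruning_half_space(p, f, q):
    """True if p lies strictly on f's side of the perpendicular bisector of f and q."""
    mx, my = (f[0] + q[0]) / 2, (f[1] + q[1]) / 2  # midpoint of the segment f-q
    # Sign of the projection of (p - midpoint) onto the direction q -> f.
    return (p[0] - mx) * (f[0] - q[0]) + (p[1] - my) * (f[1] - q[1]) > 0

def is_pruned(p, facilities, q, k):
    """True if p is contained in at least k pruning half-spaces."""
    return sum(in_pruning_half_space(p, f, q) for f in facilities) >= k

# Hypothetical coordinates, chosen only for this example.
q = (0.0, 0.0)
facilities = [(4.0, 0.0), (0.0, 6.0)]
print(is_pruned((3.0, 0.0), facilities, q, 1))  # True: one half-space contains the point
print(is_pruned((3.0, 0.0), facilities, q, 2))  # False: only one half-space contains it
```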
Preliminaries: FINCH [VLDB08] (R2NN query)
• FINCH approximates the unpruned region by a convex polygon
• Filtering: repeat until there is no facility in the unpruned area: find a nearby facility in the unpruned space and prune by using its half-space
• Containment: the users that are in the unpruned space are the candidate objects
• Verification: each candidate user u is an RkNN if fewer than k facilities lie within distance dist(u,q) of u
Preliminaries: Influence Zone
• Notation: for any point p, Cp denotes the circle centered at p with radius r = dist(p,q); |Cp| denotes the number of facilities inside Cp
• Influence zone Zk: an area such that |Cp| ≥ k for every point p not in Zk and |Cp'| < k for every point p' in Zk
• RkNN query: return every user u for which |Cu| < k or, equivalently, every user u in Zk
• The figure illustrates the case k = 2
Advantage
• Pruning: existing algorithms prune the data space; our algorithm computes the influence zone
• Containment: in existing algorithms, candidates = objects in the unpruned space; in our algorithm, result = objects that are inside the influence zone
• Verification: existing algorithms verify for each candidate object whether q is one of its k nearest neighbors; our algorithm has no verification phase
Naïve Approach
• For each facility f, draw the half-space between f and q
• Zk is the space that is pruned by at most k-1 half-spaces
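The naïve construction follows from the definition of Zk. A short derivation, where H_{f:q} is a symbol introduced here for the half-space that contains f, and "inside Cp" is read as strictly inside:

```latex
\begin{align*}
f \in C_p
  &\iff \mathrm{dist}(p, f) < \mathrm{dist}(p, q)
   \iff p \in H_{f:q}, \\
|C_p| &= \bigl|\{\, f \in F : p \in H_{f:q} \,\}\bigr|, \\
Z_k &= \{\, p : |C_p| < k \,\}
     = \{\, p : p \text{ lies in at most } k-1 \text{ half-spaces } H_{f:q} \,\}.
\end{align*}
```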
Observation 1
• A facility f can be ignored if it lies outside Cp for every point p inside the current unpruned polygon
• Intuition: if a facility f lies outside Cp, the half-space between f and q cannot prune p; if f lies outside every Cp, the half-space of f cannot prune any p
Observation 2
• A facility f can be ignored if it lies outside Cp for every point p on the boundary of the current unpruned polygon
• Intuition: for every p' inside the polygon, there exists a p on the boundary such that Cp contains Cp'
Observation 3
• A facility f can be ignored if it lies outside Cv for every vertex v of the current unpruned polygon
• Intuition: CA ∪ CB contains CC for the points A, B and C shown in the figure
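The intuition can be checked with a short argument. It assumes that C is a point on the edge between the vertices A and B (as the figure labels suggest) and treats each Cp as a closed disk; the function g_x is introduced here only for the derivation.

```latex
\begin{align*}
g_x(p) &:= \|x - p\|^2 - \|q - p\|^2
        = \|x\|^2 - \|q\|^2 - 2\, p \cdot (x - q)
        && \text{(affine in } p\text{)}, \\
x \in C_p &\iff g_x(p) \le 0, \\
C = \lambda A + (1 - \lambda) B,\ \lambda \in [0, 1]
  &\implies g_x(C) = \lambda\, g_x(A) + (1 - \lambda)\, g_x(B), \\
g_x(C) \le 0
  &\implies \min\{\, g_x(A),\ g_x(B) \,\} \le 0
   \implies x \in C_A \cup C_B .
\end{align*}
```

Under these assumptions, a facility outside both CA and CB is outside Cp for every p on the edge, so only the two end vertices need to be checked.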
Observation 4
• A facility f can be ignored if it lies outside Cv for every convex vertex v of the current unpruned polygon
• Intuition: any vertex v' that is not a convex vertex lies inside the convex polygon, so it does not need to be checked
• The above pruning condition is tight
Algorithm
• Initialize Zk as the data universe
• Insert the root of the R-tree into the heap
• While the heap is not empty
  • Deheap an entry e
  • If e cannot be pruned
    • If e is an intermediate node, insert the children of e into the heap
    • Else, use the half-space of e to update Zk
• Pruning rule: if e lies outside Cv for every convex vertex v of Zk, then e can be pruned
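The pruning rule of the loop can be sketched as two small tests, one for a facility point and one for an R-tree node. This is an illustration under assumptions: the function names are hypothetical, an R-tree node is represented by an axis-aligned rectangle (xmin, ymin, xmax, ymax), and "outside" is read as strictly outside. Updating Zk and finding its convex vertices are not shown.

```python
from math import dist, hypot

def facility_can_be_ignored(f, q, convex_vertices):
    """True if facility f lies outside C_v for every convex vertex v."""
    return all(dist(v, f) > dist(v, q) for v in convex_vertices)

def min_dist_to_rect(p, rect):
    """Smallest distance from point p to the rectangle (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = rect
    dx = max(xmin - p[0], 0.0, p[0] - xmax)
    dy = max(ymin - p[1], 0.0, p[1] - ymax)
    return hypot(dx, dy)

def node_can_be_pruned(rect, q, convex_vertices):
    """True if the whole rectangle lies outside C_v for every convex vertex v."""
    return all(min_dist_to_rect(v, rect) > dist(v, q) for v in convex_vertices)

# Hypothetical zone: a square around q whose four corners are all convex vertices.
q = (0.0, 0.0)
vertices = [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]
print(facility_can_be_ignored((5.0, 0.0), q, vertices))        # True
print(facility_can_be_ignored((2.0, 1.0), q, vertices))        # False
print(node_can_be_pruned((4.0, -1.0, 6.0, 1.0), q, vertices))  # True
```

Using the minimum distance to the rectangle is conservative: if even the closest point of the node is outside every Cv, every facility stored below that node is too.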
Other highlights of our algorithm
• Observations to efficiently prune certain entries
• Efficient determination of convex vertices
• Proof that the influence zone is always a star-shaped polygon
• Efficient containment checks are possible for star-shaped polygons
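One way such a containment check can work is sketched below. It is not necessarily the method used in the paper. It assumes that q lies strictly inside the polygon and can see every boundary point, so that sorting the vertices by angle around q gives the boundary in counter-clockwise order. The names `build_index` and `contains` are hypothetical.

```python
from bisect import bisect_right
from math import atan2

def build_index(q, vertices):
    """Sort the polygon vertices by their angle around q."""
    ordered = sorted((atan2(y - q[1], x - q[0]), (x, y)) for x, y in vertices)
    angles = [a for a, _ in ordered]
    points = [v for _, v in ordered]
    return angles, points

def contains(q, index, p):
    """True if p lies inside or on the boundary of the star-shaped polygon."""
    angles, points = index
    if p == q:
        return True
    angle = atan2(p[1] - q[1], p[0] - q[0])
    # Binary search for the wedge (q, a, b) whose angular range contains p.
    i = bisect_right(angles, angle) - 1   # i == -1 wraps around to the last vertex
    a = points[i]
    b = points[(i + 1) % len(points)]
    # p is inside if it is on the same side of the edge a -> b as q (the left side).
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return cross >= 0

# Hypothetical four-pointed star around q.
q = (0.0, 0.0)
star = [(2, 0), (0.5, 0.5), (0, 2), (-0.5, 0.5),
        (-2, 0), (-0.5, -0.5), (0, -2), (0.5, -0.5)]
index = build_index(q, star)
print(contains(q, index, (1.0, 0.2)))  # True
print(contains(q, index, (1.2, 0.6)))  # False: inside the convex hull, outside the star
```

After the one-time sort, each check costs a binary search plus one cross product, so it is logarithmic in the number of vertices.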
RkNN Processing
• Static RkNN queries: pruning phase (compute Zk), then containment phase (return the users inside Zk)
• Continuous bichromatic RkNN queries: compute Zk; the users that enter Zk become RkNNs and the users that leave it are no longer RkNNs
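The continuous case can be sketched as a set update after each round of user movements. This is an illustration only: `update_results` and `inside_zone` are hypothetical names, the zone test is passed in as a predicate, and the sketch assumes that Zk itself has not changed (no facility or query movement).

```python
def update_results(current, positions, inside_zone):
    """Return (new_results, entered, left) after the users have moved.

    current     -- set of user ids that were RkNNs before the move
    positions   -- dict mapping user id to its new location
    inside_zone -- predicate that tells whether a location is inside Zk
    """
    now_inside = {u for u, p in positions.items() if inside_zone(p)}
    return now_inside, now_inside - current, current - now_inside

# Stand-in zone for the example: the open unit disk around the origin.
def unit_disk(p):
    return p[0] ** 2 + p[1] ** 2 < 1

results, entered, left = update_results(
    {"u1"}, {"u1": (2.0, 0.0), "u2": (0.5, 0.0)}, unit_disk)
print(results, entered, left)  # {'u2'} {'u2'} {'u1'}
```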
Theoretical Analysis
• Area of influence zone = k/|F|
• Number of RkNNs = |U| · k / |F|
• IO cost of computing Zk = [formula missing], where r = [missing], S = number of facilities (i.e., |F|) and f = fanout of the R-tree
• IO cost of RkNN queries = IO cost of computing Zk + [formula missing], where r = [missing], S = number of users (i.e., |U|) and f = fanout of the R-tree
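The two closed-form estimates can be evaluated directly. The sketch below uses hypothetical function names and assumes that the area formula refers to a data space of unit area. The data set sizes match the uniform data sets used in the experiments; k = 4 is an arbitrary example value.

```python
def expected_zone_area(num_facilities, k):
    """Estimated area of the influence zone, k / |F| (unit-area data space assumed)."""
    return k / num_facilities

def expected_rknn_count(num_users, num_facilities, k):
    """Estimated number of RkNNs, |U| * k / |F|."""
    return num_users * k / num_facilities

print(expected_rknn_count(100_000, 100_000, 4))  # 4.0
print(expected_zone_area(100_000, 4))            # a fraction 4/100000 of the data space
```

With equally many users and facilities, the estimate says a query has about k RkNNs on average.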
Experiments: Snapshot RkNN Queries
• Compared against FINCH [1] (page size 4 KB, 10 buffers)
• Real dataset containing 175,812 locations in North America
• Half of the points, chosen at random, form the set of facilities; the remaining half form the set of users
[1] W. Wu, F. Yang, C. Y. Chan, K. L. Tan. FINCH: Evaluating Reverse k Nearest Neighbor Queries on Location Data. VLDB, 2008.
Experiments: Snapshot RkNN Queries
• The facilities are from the real dataset
• The users follow a Normal distribution
Experiments: Verification of Theoretical Analysis
• 100,000 facilities following a Uniform distribution
• 100,000 users following a Uniform distribution
Experiments: Continuous RkNN Queries
• Moving objects and queries generated using the Brinkhoff generator [1] on the road map of Texas
• Data space is 1000 km × 1000 km
• Our algorithm (InfZone) is compared with LazyUpdates [2]
[1] T. Brinkhoff. A framework for generating network-based moving objects. GeoInformatica, 2002.
[2] M. A. Cheema, X. Lin, Y. Zhang, W. Wang, W. Zhang. Lazy Updates: An Efficient Technique to Continuously Monitoring Reverse kNN. PVLDB, 2009.