Efficient Evaluation of k-Range Nearest Neighbor Queries in Road Networks
Jie Bao, Chi-Yin Chow, Mohamed F. Mokbel (Department of Computer Science and Engineering, University of Minnesota – Twin Cities)
Wei-Shinn Ku (Department of Computer Science and Software Engineering, Auburn University)

What Are Range NN Queries?
• k-Range NN queries in Euclidean space
  • Given a spatial region, find the k nearest objects to every point within the region
  • E.g., find the nearest hotel to a shopping mall
• k-Range NN queries in road networks
  • Given a set of road segments, find the k nearest objects to every point on the road segments

Usages of Range NN Queries
• Uncertain locations
  • Measurement imprecision: due to the limitations of the underlying positioning techniques, e.g., 2G/3G and Wi-Fi (figure: iPhone's 3G positioning)
  • Sampling imprecision: due to continuous motion, network delays, and location update frequency
• Privacy-preserving queries
  • Users do not want to reveal their exact location information to service providers
  • Their locations are blurred into spatial areas (figure: a 5-anonymous area)

Related Work for k-RNN Queries
• k-Nearest Neighbor in road networks
  • Query processing without pre-computed information: Incremental Network Expansion (INE), a best-first expansion over the road network [Papadias et al., VLDB 2003]
  • Query processing with pre-computed information: extra pre-computed quad-tree indexes are used to calculate the distances [Samet et al., SIGMOD 2008]
• k-Range Nearest Neighbor in Euclidean space
  • Pre-computed Voronoi diagrams [Chow et al., SSTD 2009]
• k-Range Nearest Neighbor in road networks
  • Range query + INE for every boundary node [Wang and Liu, PVLDB 2009]
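
The best-first expansion behind INE can be sketched as a Dijkstra-style search that stops once k objects have been reached. This is a minimal sketch rather than the cited implementation: it assumes objects sit exactly on nodes (real road-network objects usually lie along segments), and the names `ine_knn`, `graph`, and `objects_at` are illustrative.

```python
import heapq

def ine_knn(graph, objects_at, source, k):
    """Return up to k (object, distance) pairs nearest to `source`.

    graph:      {node: [(neighbor, segment_length), ...]}  (assumed layout)
    objects_at: {node: [object_id, ...]}                   (assumption: objects on nodes)
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    visited = set()
    result = []
    while heap and len(result) < k:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue  # stale heap entry
        visited.add(u)
        # Nodes are settled in nondecreasing distance, so objects are
        # reported nearest first.
        for obj in objects_at.get(u, []):
            result.append((obj, d))
            if len(result) == k:
                break
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return result

# Toy network (hypothetical data).
graph = {
    "A": [("B", 1), ("D", 5)],
    "B": [("A", 1), ("C", 2)],
    "C": [("B", 2)],
    "D": [("A", 5)],
}
objects_at = {"B": ["p3"], "C": ["p1"], "D": ["p2"]}
nearest_two = ine_knn(graph, objects_at, "A", 2)
```

The search touches only as much of the network as is needed to reach the k-th object, which is why no pre-computed index is required.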

Motivating Example
• Computational redundancy in the existing solution
  • Range query + multiple kNN queries [Wang and Liu, PVLDB 2009]
  • Figure: a range search plus k-NN searches from nodes B, D, and F
  • Total number of road segments searched across the range search and the three k-NN searches: 17
  • Total number of road segments in the map: 6
  • Redundancy ratio: (17 - 6) / 6 ≈ 183% (worse with more boundary points)
• Can we provide the results without the computational redundancy?
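
The redundancy ratio on this slide is the number of extra segment visits relative to the size of the map. A small sketch of that arithmetic, using the slide's totals; the function name `redundancy_ratio` is illustrative.

```python
def redundancy_ratio(segments_searched, segments_in_map):
    """Extra segment visits as a fraction of the distinct segments in the map."""
    return (segments_searched - segments_in_map) / segments_in_map

# Totals taken from the slide: 17 segments searched, 6 segments in the map.
ratio = redundancy_ratio(17, 6)
print(f"{ratio:.0%}")
```

A ratio of 0 would mean every segment was searched exactly once.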

Problem Definition
• Given:
  • An undirected graph G = (V, E) representing the road network
  • A set of objects O
  • A query region R (a set of road segments)
  • A value K
• Find:
  • An answer set A from O such that A contains the K nearest objects of every point in R, based on the network distance in G
• Objective:
  • Provide A without computational redundancy

Efficient k-RNN Query Processing
• Step 1: Inside query step
• Step 2: Outside network expansion step
  • Multiple searching queues
  • Stop after the closest node is searched
  • Switch to the queue with the smallest searched distance
  • Termination condition: the search covers the distance of its k-th object
• Example: a 2-RNN query over a road segment set (range) with nodes A, B, C and objects P1, P2, P3
  • 1st iteration: search from A; answer set {P1, P2}
  • 2nd iteration: search from B; answer set {P1, P2}
  • 3rd iteration: search from C; answer set {P1, P2}
  • 4th iteration: search from C; answer set {P1, P2, P3}
  • 5th iteration: search from B; answer set {P1, P2, P3}
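
The two steps can be sketched as an interleaved multi-queue expansion. This is a simplified illustration under stated assumptions, not the authors' implementation: objects are assumed to sit on nodes, one queue is assumed per boundary node, the boundary nodes are passed in explicitly, and the bookkeeping that lets searches share work is not reproduced. All names (`k_rnn`, `region_nodes`, `boundary_nodes`) and the toy network are hypothetical.

```python
import heapq

def k_rnn(graph, objects_at, region_nodes, boundary_nodes, k):
    """Answer set covering the k nearest objects of the query region (sketch)."""
    # Step 1: inside query step, collect objects located in the region.
    answer = set()
    for n in region_nodes:
        answer.update(objects_at.get(n, []))

    # Step 2: outside network expansion with one queue per boundary node.
    queues = {b: [(0.0, b)] for b in boundary_nodes}
    settled = {b: set() for b in boundary_nodes}
    found = {b: 0 for b in boundary_nodes}
    active = list(boundary_nodes)
    while active:
        # Discard stale heads so each queue's head is its searched distance.
        for b in list(active):
            q = queues[b]
            while q and q[0][1] in settled[b]:
                heapq.heappop(q)
            if not q:
                active.remove(b)
        if not active:
            break
        # Switch to the queue with the smallest searched distance.
        b = min(active, key=lambda x: queues[x][0][0])
        d, u = heapq.heappop(queues[b])  # stop after the closest node
        settled[b].add(u)
        objs = objects_at.get(u, [])
        answer.update(objs)
        found[b] += len(objs)
        # Termination: this search now covers the distance of its k-th object.
        if found[b] >= k:
            active.remove(b)
            continue
        for v, w in graph[u]:
            if v not in settled[b]:
                heapq.heappush(queues[b], (d + w, v))
    return answer

# Toy network loosely modeled on the slide's 2-RNN example (hypothetical lengths).
graph = {
    "A": [("B", 2), ("X", 1)], "B": [("A", 2), ("C", 2), ("W", 10)],
    "C": [("B", 2), ("Z", 1)], "X": [("A", 1), ("Y", 1)],
    "Y": [("X", 1)], "Z": [("C", 1)], "W": [("B", 10)],
}
objects_at = {"X": ["P1"], "Y": ["P2"], "Z": ["P3"]}
answer = k_rnn(graph, objects_at, {"A", "B", "C"}, ["A", "B", "C"], k=2)
```

Always advancing the queue with the smallest searched distance keeps the searches growing at a similar radius, so no single search runs far ahead before the others have had a chance to terminate.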

Distance Calculation
• Case 1: By a pre-computed shortest-path table
  • Fast, but needs more storage
• Case 2: Calculation on the fly
  • Keep the distance information as the search expands
• Tradeoff between storage and speed
• Figure annotation: search collision
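
The two cases can be contrasted with a small sketch. It assumes an adjacency-dict graph and uses plain Dijkstra for both the table build and the on-the-fly path; the function names are illustrative, and the slides do not say how the authors build or store their table.

```python
import heapq

def single_source(graph, source):
    """Dijkstra: network distance from `source` to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def precompute_table(graph):
    """Case 1: all-pairs table, quadratic storage in the number of nodes."""
    return {s: single_source(graph, s) for s in graph}

def network_distance(graph, s, t, table=None):
    """Look up the table when available (Case 1), else compute on the fly (Case 2)."""
    if table is not None:
        return table[s].get(t, float("inf"))
    return single_source(graph, s).get(t, float("inf"))

# Hypothetical three-node network.
graph = {"A": [("B", 1), ("C", 4)], "B": [("A", 1), ("C", 2)], "C": [("A", 4), ("B", 2)]}
table = precompute_table(graph)
```

Both paths return the same distance; the table answers in constant time per lookup at the cost of storing one row per node.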

Experimental Results
• Evaluate our algorithm without pre-computed results (KRNN-E) and with pre-computed results (KRNN-F)
• Baseline algorithm: [Wang and Liu, PVLDB 2009]
• Road network: Hennepin County, Minnesota, US
  • 39,513 nodes and 54,444 road segments
• Table: parameter settings

Comparison with Baseline (1/2)
• Impact of different k values
• Impact of the total number of objects on the map
• Impact of the query region size

Comparison with Baseline (2/2)
• Impact of different distributions of the data objects
  • Uniform distribution
  • Normal distribution
    • SD is the standard deviation, used to simulate hot-spot locations such as downtown areas

Tradeoff between Storage and Performance
• Tuning parameter P
  • The percentage of the shortest-distance table that is stored
  • Warm-up process with 1,000 k-RNN queries
  • Full size of the table is 980 MB
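
One way to picture the parameter P is a distance table that stores only a fraction of its rows and falls back to on-the-fly search for the rest. The sketch below is an assumption about how such a table could behave: the slides do not say which entries are kept or how the warm-up selects them, and the class name `PartialDistanceTable`, the row-per-source layout, and the first-come fill policy are all hypothetical.

```python
import heapq

def single_source(graph, source):
    """Dijkstra: network distance from `source` to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

class PartialDistanceTable:
    """Keeps at most a fraction p of the table rows (one row per source node)."""

    def __init__(self, graph, p):
        self.graph = graph
        self.capacity = int(p * len(graph))  # number of rows allowed
        self.rows = {}
        self.hits = 0
        self.misses = 0

    def distance(self, s, t):
        row = self.rows.get(s)
        if row is None:
            # Not stored: compute on the fly, and keep the row if room remains
            # (this is how warm-up queries would fill the table here).
            self.misses += 1
            row = single_source(self.graph, s)
            if len(self.rows) < self.capacity:
                self.rows[s] = row
        else:
            self.hits += 1
        return row.get(t, float("inf"))

# Hypothetical four-node chain; p = 0.5 allows two stored rows.
graph = {"A": [("B", 1)], "B": [("A", 1), ("C", 1)], "C": [("B", 1), ("D", 1)], "D": [("C", 1)]}
table = PartialDistanceTable(graph, p=0.5)
for s, t in [("A", "D"), ("B", "D"), ("C", "A"), ("A", "C")]:
    table.distance(s, t)
```

With p = 1 every row is eventually stored and lookups become table hits; with p = 0 every query is computed on the fly.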

Conclusion
• An efficient algorithm for k-Range Nearest Neighbor (k-RNN) queries in road networks without computational redundancy
• Experimental evaluation
  • Our solution outperforms the baseline algorithm
  • Tuning parameter P achieves a tradeoff between storage and performance
• Applications: uncertain locations, privacy-preserving applications