TMSG. ROAD: A New Spatial Object Search Framework for Road Networks. Angus Fuming Huang 2012.04.18. Publication. OUTLINE. INTRODUCTION THE ROAD FRAMEWORK SINGLE-SOURCE LDSQ ALGORITHMS MULTISOURCE LDSQ ALGORITHMS PERFORMANCE EVALUATION CONCLUSION Angus Comments. Introduction.
TMSG ROAD: A New Spatial Object Search Framework for Road Networks Angus Fuming Huang 2012.04.18
OUTLINE • INTRODUCTION • THE ROAD FRAMEWORK • SINGLE-SOURCE LDSQ ALGORITHMS • MULTISOURCE LDSQ ALGORITHMS • PERFORMANCE EVALUATION • CONCLUSION • Angus Comments
Introduction • Location-based services are booming • Map and navigation services • Garmin, Google Maps, MapQuest, NAVTEQ, Yahoo Maps • Location-dependent information • spatial objects • Location-dependent spatial queries (LDSQs) • Queries that search for spatial objects with respect to user-specified locations • Q1: find hotels within one mile of the conference venue
Introduction (contd.) • Two basic operations for LDSQ processing • Network traversal • Visit network nodes/edges according to network proximity • Object lookup • Access and check the attributes of objects located at traversed nodes or edges • Network is modeled as a graph • Search space pruning
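The two basic operations can be illustrated with a plain network-expansion kNN search. This is a minimal sketch, not the ROAD algorithm: it assumes objects sit on nodes, edge weights are non-negative, and the names `knn_network_expansion`, `graph`, and `objects_at` are hypothetical.

```python
import heapq


def knn_network_expansion(graph, objects_at, source, k):
    """Return up to k (object, network distance) pairs nearest to `source`.

    graph:      {node: [(neighbor, weight), ...]}  (assumed layout)
    objects_at: {node: [object ids]}               (assumed layout)
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    settled = set()
    result = []
    while heap and len(result) < k:
        d, n = heapq.heappop(heap)
        if n in settled:
            continue
        settled.add(n)
        # Object lookup: check the objects located at the traversed node.
        for obj in sorted(objects_at.get(n, [])):
            result.append((obj, d))
            if len(result) == k:
                break
        # Network traversal: visit neighbors in order of network proximity.
        for nb, w in graph.get(n, []):
            nd = d + w
            if nd < dist.get(nb, float("inf")):
                dist[nb] = nd
                heapq.heappush(heap, (nd, nb))
    return result
```

Every node within the search radius is visited one by one here, which is the cost that search space pruning aims to reduce.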
Introduction (contd.) • A network is formulated as a set of interconnected regional subnets called Rnets • Shortcuts • Selective paths across an Rnet that enable any traversal to bypass the Rnet if it has no object of interest • Object abstract • Summarizes the existence and/or contents of objects inside an Rnet to provide quick traversal guidance • Two novel index structures • Route Overlay • Manages the physical network structure and the shortcuts • Association Directory • Manages the mappings of objects and object abstracts on nodes, edges, and Rnets
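One way to picture shortcuts is as precomputed distances between the border nodes of an Rnet. The sketch below assumes a shortcut stores the shortest distance between two border nodes using only edges inside that Rnet, and that edges are undirected; `build_shortcuts` and its input layout are hypothetical names, not taken from the slides.

```python
import heapq


def shortest_distances(adj, source):
    """Dijkstra over an adjacency map {node: [(neighbor, weight), ...]}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, n = heapq.heappop(heap)
        if d > dist.get(n, float("inf")):
            continue  # stale heap entry
        for nb, w in adj.get(n, []):
            nd = d + w
            if nd < dist.get(nb, float("inf")):
                dist[nb] = nd
                heapq.heappush(heap, (nd, nb))
    return dist


def build_shortcuts(rnet_edges, border_nodes):
    """Map (border, other_border) -> shortest distance inside one Rnet.

    rnet_edges: iterable of (u, v, weight) for the edges of this Rnet only.
    """
    adj = {}
    for u, v, w in rnet_edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))  # assumption: undirected edges
    shortcuts = {}
    for b in border_nodes:
        dist = shortest_distances(adj, b)
        for other in border_nodes:
            if other != b and other in dist:
                shortcuts[(b, other)] = dist[other]
    return shortcuts
```

A traversal that reaches a border node of an Rnet with no object of interest can then jump directly to the other border nodes instead of walking the interior.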
THE ROAD FRAMEWORK • Preliminaries • Rnet, Shortcut and Object Abstract • Rnet Hierarchy • Route Overlay and Association Directory
Preliminaries • All LDSQs are assumed to be initiated at nodes without loss of generality • In general, each LDSQ is specified with a distance condition D and an attribute predicate
Route Overlay • Border nodes in an Rnet are always the border nodes in some of its child Rnets • Naturally flattens a hierarchical network into a plain structure to facilitate search space expansion over a network • B+-tree • All the shortcuts from border node n to other border nodes are captured by non-leaf entries • A leaf entry stores all the physical edges to its neighboring nodes
Association Directory • An efficient object lookup mechanism in ROAD • B+-tree • With unique node IDs or Rnet IDs as the search key • Associated with node n (n’) are objects o in L(n, n’) together with their distances δ(o, n) (δ(o, n’)) • Represent an object abstract with smaller storage overheads • Bloom filter [18], signature [19]
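A Bloom filter is one compact way to represent an object abstract. The following is a minimal sketch under stated assumptions: the class name `BloomAbstract`, the bit width, and the SHA-256 based hashing are illustrative choices, not details from the slides.

```python
import hashlib


class BloomAbstract:
    """Compact summary of the object attributes present inside an Rnet."""

    def __init__(self, num_bits=64, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit vector stored as a Python int

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p

    def might_contain(self, item):
        # False means definitely absent; True may be a false positive.
        return all((self.bits >> p) & 1 for p in self._positions(item))

    def union(self, other):
        """Abstract of a parent Rnet: bitwise OR of the child abstracts."""
        merged = BloomAbstract(self.num_bits, self.num_hashes)
        merged.bits = self.bits | other.bits
        return merged
```

A negative answer lets a search bypass the Rnet safely, while a false positive only costs an unnecessary traversal, so correctness is not affected.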
Association Directory • Object o1 on edge (nf, ng) • Both Rnet R3b and its parent Rnet R3 that contain objects o1 and o2 are associated with {o1, o2}
SINGLE-SOURCE LDSQ ALGORITHMS • NN query: n2 • Objects: o1, o2 • Border nodes: n3, n5, n7, n9, n11 • Since R3b contains objects, a traversal within R3b is needed • The search only takes three jumps from n3 to n11
SINGLE-SOURCE LDSQ ALGORITHMS • Range query, kNN query • kNNSearch [17] • ChoosePath [17] • Quickly identify appropriate shortcuts and edges to expand the search range from a node n • Depth-first traversal order • If n is a border node, the shortcut tree must have multiple levels • RangeSearch • Resembles the kNNSearch algorithm, except that it ends once the portion of the network within the distance bound has been explored
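The interplay of shortcuts and object abstracts in a range search can be sketched as follows. This is a simplified, single-level illustration and not the kNNSearch, ChoosePath, or RangeSearch procedures from the paper: it assumes one flat layer of Rnets, objects located on nodes, a boolean per Rnet in place of a real object abstract, and hypothetical names throughout.

```python
import heapq


def range_search(edges, shortcuts, rnet_has_object, objects_at, source, bound):
    """Return (objects within `bound` of `source`, set of settled nodes).

    edges:           {node: [(neighbor, weight, rnet_id), ...]}
    shortcuts:       {(border_node, rnet_id): [(other_border, distance), ...]}
    rnet_has_object: {rnet_id: bool}, a stand-in for the object abstract
    objects_at:      {node: [object ids]}
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    settled = set()
    found = []
    while heap:
        d, n = heapq.heappop(heap)
        if d > bound:
            break  # everything left in the heap is farther than the bound
        if n in settled:
            continue
        settled.add(n)
        found.extend((obj, d) for obj in sorted(objects_at.get(n, [])))

        # Group the physical edges of n by the Rnet they belong to.
        by_rnet = {}
        for nb, w, r in edges.get(n, []):
            by_rnet.setdefault(r, []).append((nb, w))

        moves = []
        for r, physical in by_rnet.items():
            jumps = shortcuts.get((n, r))
            if jumps and not rnet_has_object.get(r, True):
                moves.extend(jumps)      # bypass the object-free Rnet
            else:
                moves.extend(physical)   # traverse inside the Rnet

        for nb, w in moves:
            nd = d + w
            if nd < dist.get(nb, float("inf")):
                dist[nb] = nd
                heapq.heappush(heap, (nd, nb))
    return found, settled
```

In this sketch an Rnet is bypassed only when the current node has shortcuts across it and the Rnet reports no objects; otherwise its physical edges are followed, so a query issued from inside an object-free Rnet still works.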
MULTISOURCE LDSQ ALGORITHMS • Concurrent Network Expansion • Rnet Visited Set and Border Node Visited Set • Search Algorithm • A multisource LDSQ finds objects with respect to m query nodes • A multisource kNN query finds the k objects whose maximum distance from the query nodes is the smallest
Concurrent Network Expansion • Adopt a concurrent approach that expands a search space from all query nodes through a best-first strategy • According to Lemmas 2 and 3, the first k visited objects are guaranteed to be the answer objects
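The concurrent best-first expansion can be sketched without Rnets as below. This is an illustration under assumptions rather than MultiSourcekNNSearch itself: it treats an object as an answer once every subquery has reached it, places objects on nodes, and uses hypothetical names (`multisource_knn`, `graph`, `objects_at`).

```python
import heapq


def multisource_knn(graph, objects_at, query_nodes, k):
    """Return up to k (object, max distance over all query nodes) pairs.

    graph:      {node: [(neighbor, weight), ...]}
    objects_at: {node: [object ids]}
    """
    m = len(query_nodes)
    # One shared priority queue; entries are (distance, subquery index, node).
    heap = [(0.0, qi, q) for qi, q in enumerate(query_nodes)]
    heapq.heapify(heap)
    best = {(qi, q): 0.0 for qi, q in enumerate(query_nodes)}
    settled = set()   # (subquery index, node) pairs already expanded
    reached_by = {}   # object -> number of subqueries that have reached it
    answers = []
    while heap and len(answers) < k:
        d, qi, n = heapq.heappop(heap)
        if (qi, n) in settled:
            continue
        settled.add((qi, n))
        for obj in sorted(objects_at.get(n, [])):
            reached_by[obj] = reached_by.get(obj, 0) + 1
            # Entries pop in non-decreasing distance, so the last subquery
            # to reach an object does so at the object's maximum distance.
            if reached_by[obj] == m:
                answers.append((obj, d))
                if len(answers) == k:
                    break
        for nb, w in graph.get(n, []):
            nd = d + w
            if nd < best.get((qi, nb), float("inf")):
                best[(qi, nb)] = nd
                heapq.heappush(heap, (nd, qi, nb))
    return answers
```

Because all subqueries share one queue ordered by distance, objects are completed in order of their maximum distance, which is why the search can stop after the first k completed objects.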
Rnet Visited Set and Border Node Visited Set • Two subqueries: q1, q2 • Result object: ob • Note: oa and oc will be traversed first • An Rnet is worth exploring only if it contains objects of interest and it is reached by all subqueries • Rnet visited set (RV), Border node visited set (BV)
Search Algorithm • MultiSourcekNNSearch • Every entry (ε, d, qi) in Priority Queue P records a node or an object (ε), its distance from nqi (d), and the respective subquery (qi) • An entry (R, qi) in RV indicates that Rnet R has been visited by subquery qi • An entry (R, b, d, qi) in BV records that a subquery qi has reached Rnet R via the border node b, and d = ||b, nqi||
• To mark nodes “unvisited by qi” • To repeatedly evaluate the head entry from P • has been visited • a detailed examination begins • associated objects are fetched from AD and enqueued to P for later examination • To check the backtracking by BV • To resume the traversal at the border nodes • To expand the search range
• Visit node n’s shortcut tree in a depth-first order and identify appropriate shortcuts and edges to expand the search range • If R is.. • To bypass R
• RV: (R3a, q2), (R3b, q2), (R3, q2) • RV: (R1a, q1), (R1, q1) • BV: (R3a, n11, 0, q2), (R3b, n11, 0, q2) • BV: (R2b, n9, 2, q2) • BV: (R2a, n5, 3, q1)
PERFORMANCE EVALUATION • Index Construction • Query Performance • Experiments on Single-Source kNN Query • Experiments on Single-Source Range Query • Experiments on Multisource kNN Query • Experiments on Multisource Range Query • Index Update • Evaluation on p and l
PERFORMANCE EVALUATION • Data set • CA, NA: highways in California and North America • SF, PRS: streets and roads in San Francisco and Paris • 100 to 100000 objects • Comparison • NetExp (network expansion [7]) • Euclidean (Euclidean distance bound approach [8]) • DistIdx (distance index [6]) • DistBrws (distance browsing [13]) • Performance metrics • Index construction time • Index size • Query processing time • Index update time • Evaluation parameters
Index Construction • Number of objects vs. index construction time (hours) and index size (MB) • NA highway • NetExp and Euclidean incur the shortest index construction times and the smallest index sizes • DistBrws takes an extremely long time and huge storage • ROAD takes around 1 hour and 20 MB • The ideas of query precomputation and materialization of shortest paths between nodes or toward objects are not appealing
Index Construction • Different networks, 10000 objects, 100 Rnets • NetExp and Euclidean incur the shortest index construction time and the smallest index size • DistIdx, DistBrws and ROAD incur different index construction times and sizes, but ROAD is the best of the three • DistBrws takes over a month to build the index and needs more than 15 GB • ROAD incurs a significantly shorter construction time and a smaller index size
Experiments on Single-Source kNN Query • (a) Euclidean performs the worst because of exhaustive shortest path searches for a possibly large number of candidate objects • (a) DistBrws and DistIdx perform worse due to the excessive accesses to distance signatures and shortest path quad-trees and slow node-by-node network traversals • (b) ROAD only requires 33% of NetExp’s processing time when 100000 objects are evaluated • (c) With 10 clusters, ROAD takes only 1 percent of NetExp’s processing time • (d) When k is increased, ROAD consistently performs the best due to its strong pruning power
Experiments on Single-Source Range Query • ROAD consistently outperforms all the others and it benefits more from a larger network • Euclidean performs the worst as it has to examine a large number of candidate objects • DistBrws and DistIdx do not improve the search performance since they both suffer from the massive access overhead for large networks and large numbers of objects
Experiments on Multisource kNN Query • (a, b, c, d): two-source kNN queries • (e): multisource NN queries • DistIdx and DistBrws do not support multisource LDSQs • NetExp performs worse due to exploring all the subnetworks around query points • Euclidean has to invoke multiple network traversals to determine the network distances of candidate objects
Experiments on Multisource Range Query • As the search range is fixed, the search performance does not change even when the number of objects varies • This is because range queries must explore all the nodes/edges within the search range, which is independent of the number of objects • Euclidean performs the worst due to exhaustive candidate object distance searches
Index Update • The update cost incurred by DistIdx is several orders of magnitude higher than that of the others • An edge change has almost no observable impact on NetExp and Euclidean • For DistIdx, the distance signatures of many nodes need reexamination and update, resulting in large processing times • ROAD only needs to update affected shortcuts of certain border nodes of Rnets
Evaluation on p and l • p: number of child Rnets • l: number of levels • Single-source kNN queries (k=10) • ROAD performs similarly in terms of query processing times under different <p,l> pairs • A smaller l results in a smaller index and a shorter construction time • So a smaller l and a larger p are better
CONCLUSION • The ongoing trend of web-based LBSs… • To accommodate diverse objects • To support different distance metrics • To process various LDSQs efficiently • ROAD • A clear separation between objects and network for better system extensibility • Exploits search space pruning • Supports single-source and multisource LDSQs • In the future… • To support continuous queries, skyline queries and optimal location queries
Angus Comments • Hierarchical road nets concept • Comprehensive experiments