
Distance Indexing on Road Networks

Explore how Distance Signatures enhance query processing efficiency on road networks by categorizing distances, supporting various query types, and reducing maintenance costs. Learn about distance operations, smart distance categories, and their construction for better results.

strattona


Presentation Transcript


  1. Distance Indexing on Road Networks

  2. Modeling Road Networks • Network -> Undirected weighted graph • Road junction -> Vertex (node) • Road segment -> Edge • Distance -> Edge weight • Data object and query point -> On node only • (Figure labels: objects, query point)

  3. Query Processing on Road Networks • Queries: • Window query • kNN, continuous kNN • Processing methods: • Network Expansion [Papadias VLDB03] • Use Euclidean distance for preliminary pruning • Index the objects with a spatial index • Precomputed Index [Kolahdouzan VLDB04] • Voronoi Network Nearest Neighbor (VN3) • NN list: precompute and store the kNNs for some large-degree nodes

  4. Problems and Disadvantages • Distance computation is still expensive • By Dijkstra's single-source shortest path algorithm: • Maintain the nodes whose distances are not finalized • Pick the node with the shortest tentative distance and finalize it • Relax the distances of its not-yet-finalized neighbors • Repeat until all distances are finalized • Limitations: • Must visit nodes in ascending order of distance • Running time O(E lg V) with a binary heap (E edges, V nodes) • Precomputed indexes cannot suit all queries • Return the k nearest neighbors • Return the actual shortest path • Precomputed indexes are costly to store and update
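
The Dijkstra steps listed on this slide can be sketched as follows. This is a minimal illustration, not the presenters' code: the adjacency-list layout and the toy network `ROADS` are assumptions made for the example.

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source to every reachable node.

    graph: dict mapping node -> list of (neighbor, weight) pairs
    (a hypothetical layout chosen for this sketch).
    """
    dist = {source: 0}
    finalized = set()
    heap = [(0, source)]  # tentative distances, smallest first
    while heap:
        d, u = heapq.heappop(heap)
        if u in finalized:
            continue  # stale heap entry
        finalized.add(u)  # nodes are finalized in ascending order of distance
        for v, w in graph[u]:
            nd = d + w
            # Relax the not-yet-finalized neighbors of u.
            if v not in finalized and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Toy undirected network: each edge is listed in both directions.
ROADS = {
    "a": [("b", 4), ("c", 1)],
    "b": [("a", 4), ("c", 2), ("d", 5)],
    "c": [("a", 1), ("b", 2), ("d", 8)],
    "d": [("b", 5), ("c", 8)],
}
```

Every query pays for a full expansion from the source in distance order, which is the cost the slide lists under "Limitations".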

  5. Our Solution at a Glance • Distance signature: the first general-purpose index on road networks that • Categorizes the distances of a node to all objects • Supports both rough and exact distance computation • Accelerates processing of common query types • Reduces the storage and maintenance cost • Is orthogonal to other query optimization techniques

  6. Roadmap • Background • Distance Signature Overview • Operations on Signatures • Query Processing on Signatures • Smart Choice of Distance Categories • Construction and Maintenance • Experimental Results • Conclusion

  7. Distance Signature • Basic Idea: • Precomputing distances is a good trade-off between having no index and indexing the whole solution space • Maintain the approximate distance between objects and nodes • How rough is the approximation? • Apply rough approximation to faraway objects • Queries are mostly interested in local objects • Faraway objects outnumber local objects • We use an exponential sequence of categories • In the form of [0, T), [T, cT), [cT, c^2T), [c^2T, c^3T), ... • T and c are constant parameters • E.g., T = 3, c = 2, then [0, 3), [3, 6), [6, 12), [12, 24), ... • (Figure: distance axis with boundaries 3, 6, 12, 24 separating Cat 0, Cat 1, Cat 2, Cat 3)
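
As a small illustration of the exponential categories, the sketch below maps a distance to its category index and back to its range. The function names are hypothetical. The defaults T = 3 and c = 2 are the slide's own example values.

```python
def category(dist, T=3.0, c=2.0):
    """Index of the exponential category that contains dist.

    Cat 0 is [0, T); Cat i (i >= 1) is [c^(i-1) * T, c^i * T).
    """
    if dist < 0:
        raise ValueError("distance must be non-negative")
    cat, upper = 0, T
    while dist >= upper:
        upper *= c
        cat += 1
    return cat

def category_range(cat, T=3.0, c=2.0):
    """Half-open distance range [lo, hi) covered by a category index."""
    if cat == 0:
        return (0.0, T)
    return (T * c ** (cat - 1), T * c ** cat)
```

With the default parameters, a distance of 5 falls in Cat 1 and a distance of 13 in Cat 3, so nearby objects are resolved more finely than faraway ones.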

  8. Distance Signature (Cont'd) • For each node n, signature component S(n)[i] denotes the category of dist(n,i) • S(n)[i].link denotes the next node from n in the shortest path to i • Signature S(n) is the whole set of components S(n)[i]

  9. Roadmap • Background • Distance Signature Overview • Operations on Signatures • Query Processing on Signatures • Smart Choice of Distance Categories • Construction and Maintenance • Experimental Results • Conclusion

  10. Distance Operations on Signatures • Principle: trace back the link until the distance range is accurate enough • (Figure: nodes n2, n3, n6 with edge weights 11, 4, 11; p1 and p2 mark the possible positions of n4)
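
One way to read the "trace back the link" principle is sketched below. It is an illustration under stated assumptions, not the presenters' implementation: `sig[node][obj]` is assumed to hold a `(category, next_hop)` pair, `weight` is assumed to map an unordered node pair to its edge length, and `max_width` is a hypothetical tolerance on the width of the returned range.

```python
def cat_range(cat, T, c):
    """Half-open distance range [lo, hi) of an exponential category."""
    return (0.0, T) if cat == 0 else (T * c ** (cat - 1), T * c ** cat)

def refine_distance(n, obj, sig, weight, T, c, max_width):
    """Return (lo, hi) bounding dist(n, obj) with hi - lo <= max_width.

    Each hop along the stored link adds an exact edge weight, and the
    remaining distance is bounded by the category stored at the new node.
    """
    travelled = 0.0
    while n != obj:
        cat, nxt = sig[n][obj]
        lo, hi = cat_range(cat, T, c)
        if hi - lo <= max_width:
            return (travelled + lo, travelled + hi)
        travelled += weight[frozenset((n, nxt))]
        n = nxt
    return (travelled, travelled)  # reached the object: exact distance

# Toy path n1 -2- n2 -5- n3 -6- o, with the object located at node "o".
WEIGHT = {
    frozenset(("n1", "n2")): 2,
    frozenset(("n2", "n3")): 5,
    frozenset(("n3", "o")): 6,
}
SIG = {
    "n1": {"o": (3, "n2")},  # dist 13 lies in [12, 24)
    "n2": {"o": (2, "n3")},  # dist 11 lies in [6, 12)
    "n3": {"o": (2, "o")},   # dist 6 lies in [6, 12)
}
```

Setting `max_width` to 0 forces the trace to run all the way to the object and yields the exact distance. Larger tolerances stop earlier and touch fewer signatures.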

  11. Approximate Distance Comparison • What and Why? • Compare the distances of two objects based on one signature • Avoid accessing the signatures of other nodes • Used to get a rough result of distance sorting • How? • Example: compare dist(n4,n2) with dist(n4,n6) • Select an observer n3 • Embed objects n2, n3, n6 into Euclidean space • n3 tells if n2 or n6 is closer to n4 • If n4 is on the perpendicular bisector, is it possible for n3 to find n4 within distance range s(n4)[n3]? • Let multiple observers vote

  12. kNN Search on Signatures • Procedure • Read signature s(q) of query node q • Categories tell the approximate distances between q and other objects • Get the k closest objects according to their category values • If the distances and the order are not needed, return objects based on category ranges • To find the ordering: • Sort objects within each category • To find exact distances: • Perform exact distance retrieval for each of the k nearest neighbors
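
The procedure above can be sketched as follows, assuming the signature of the query node is available as a plain dict from object to category index. The function names and the `exact_dist` callback are hypothetical. Only the category that straddles the k-th position needs exact distances.

```python
def knn_candidates(signature, k):
    """Split objects into sure results and boundary-category candidates.

    signature: dict mapping object -> category index at the query node.
    Returns (sure, candidates, needed): every object in `sure` belongs to
    the kNN set, and `needed` more must be picked from `candidates`.
    """
    by_cat = {}
    for obj, cat in signature.items():
        by_cat.setdefault(cat, []).append(obj)
    sure = []
    for cat in sorted(by_cat):
        group = sorted(by_cat[cat])
        if len(sure) + len(group) <= k:
            sure.extend(group)  # the whole category fits into the result
            if len(sure) == k:
                return sure, [], 0
        else:
            return sure, group, k - len(sure)
    return sure, [], 0  # fewer than k objects exist

def knn(signature, k, exact_dist):
    """kNN set of the query node; exact_dist(obj) breaks boundary ties."""
    sure, candidates, needed = knn_candidates(signature, k)
    candidates.sort(key=exact_dist)
    return sure + candidates[:needed]
```

The result is a set of k objects, not a distance-ordered list. Ordering would additionally require sorting inside each category, as the slide notes.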

  13. Roadmap • Background • Distance Signature Overview • Operations on Signatures • Query Processing on Signatures • Smart Choice of Distance Categories • Construction and Maintenance • Experimental Results • Conclusion

  14. Smart Choice of Distance Categories • Exponential categories [0, T), [T, cT), [cT, c^2T), ... • How to determine c and T? • Factors: • Dataset density, distribution • Query type, load (metric: spreading) • Storage availability • Simplifications • The road network is a uniform grid • Spreading is uniformly distributed in [0, SP] • Unlimited disk storage • Theorem • The optimal c = e, T = (SP/e)^0.5
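
For a quick numeric feel, the sketch below evaluates the theorem's closed form, reading the trailing 0.5 as an exponent so that T = sqrt(SP/e). The function names and the sample value SP = 100 are assumptions made for the example.

```python
import math

def optimal_parameters(sp):
    """Return (c, T) from the slide's theorem: c = e, T = sqrt(SP / e)."""
    return math.e, math.sqrt(sp / math.e)

def boundaries(sp):
    """Category boundaries T, cT, c^2 T, ... up to the first one >= sp."""
    c, T = optimal_parameters(sp)
    bounds = [T]
    while bounds[-1] < sp:
        bounds.append(bounds[-1] * c)
    return bounds
```

For SP = 100 this gives T of roughly 6.07, and four boundaries are enough to cover the whole spreading range.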

  15. Signature Construction • Basic procedures • Allocate storage for signatures • Build shortest path spanning tree for each object (Dijkstra) • Fill in s(n)[i] when the tree of object i is spanned to node n • Variable length encoding • Observation • the number of objects in each category is not even • # of objects 1 unit, 2 units, 3 units, ... away: 4, 8, 12, ... • Use fewer bits for larger categories
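
The basic construction procedure might look like the sketch below: one Dijkstra expansion per object, recording at each reached node the distance category and the next hop back toward the object. The graph layout, the function name, and the toy network are assumptions. Storage allocation and variable length encoding are left out.

```python
import heapq

def build_signatures(graph, objects, T, c):
    """Return sig[node][obj] = (category, next hop from node toward obj).

    graph: dict mapping node -> list of (neighbor, weight) pairs
    (a hypothetical layout); objects: nodes that host a data object.
    """
    sig = {n: {} for n in graph}
    for obj in objects:
        heap = [(0, obj, obj)]  # (distance, node, next hop toward obj)
        done = set()
        while heap:
            d, n, hop = heapq.heappop(heap)
            if n in done:
                continue
            done.add(n)  # the tree of obj has now been spanned to node n
            cat, upper = 0, T
            while d >= upper:  # exponential category of distance d
                upper *= c
                cat += 1
            sig[n][obj] = (cat, hop)
            for v, w in graph[n]:
                if v not in done:
                    heapq.heappush(heap, (d + w, v, n))
    return sig

ROAD_GRAPH = {
    "a": [("b", 4), ("c", 1)],
    "b": [("a", 4), ("c", 2), ("d", 5)],
    "c": [("a", 1), ("b", 2), ("d", 8)],
    "d": [("b", 5), ("c", 8)],
}
```

Because each tree is rooted at the object, the parent of a node in the tree is exactly the link that the signature component stores.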

  16. Variable Length Encoding • Reverse zero coding • Based on Huffman encoding scheme • Under the assumptions "exponential partition", "grid topology", "uniform distance range of queries", and c > 1.5, this coding scheme is optimal • Average code length is approximately : • Code words per category:
      Category        [0, T)  [T, cT)  [cT, c^2T)  [c^2T, c^3T)  [c^3T, ∞)
      Reverse coding  0000    0001     001         01            1
      Fixed coding    000     001      010         011           100
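
A possible encoder and decoder for the reverse zero code words on this slide is sketched below. It assumes the code words are assigned so that the largest category gets the shortest code (1, then 01, 001, and so on, with an all-zero word for the first category). The function names and the parameter `m`, the number of categories, are hypothetical.

```python
def encode(cat, m):
    """Code word for category cat out of m categories (0 .. m-1)."""
    zeros = m - 1 - cat
    # Category 0 is the all-zero word; every other word ends in a single 1.
    return "0" * zeros if cat == 0 else "0" * zeros + "1"

def decode(bits, m):
    """Decode a concatenation of code words back into category indexes."""
    cats, zeros = [], 0
    for b in bits:
        if b == "1":
            cats.append(m - 1 - zeros)
            zeros = 0
        else:
            zeros += 1
            if zeros == m - 1:  # all-zero word: category 0
                cats.append(0)
                zeros = 0
    if zeros:
        raise ValueError("bit string ends in the middle of a code word")
    return cats
```

The code is prefix-free, so signature components can be packed back to back without separators.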

  17. Signature Compression • Idea: • Many objects share the same link • If s(n)[u] + s(u)[v] = s(n)[v], then s(n)[v] can be replaced by a 1-bit flag • Not compressed in memory • (Figure labels: n, u, v)

  18. Signature Update • Requirements • The shortest path spanning trees of all objects • A reverse index from each edge to the trees that contain it • This limits the number of trees affected by a change to the edge • How (suppose edge (a,b) is updated): • Find the affected spanning trees • For each affected tree of object c, check s(a)[c] or s(b)[c] (whichever is smaller) • Propagate to adjacent nodes until there are no more updates

  19. Roadmap • Background • Distance Signature Overview • Operations on Signatures • Query Processing on Signatures • Smart Choice of Distance Categories • Construction and Maintenance • Experimental Results • Conclusion

  20. Experiment Settings • Statistics • 183K nodes • 351K edges • Random edge weights from 1 to 10 • Page size: 4K bytes • kNN Competitors • Signature indexing • Full indexing (NN list for all nodes) • Network Voronoi Diagram (NVD) from VN3 • Tuning parameters • p: object density • T, c, k • Comparison metrics: page access (I/O cost), CPU time

  21. Index Construction Cost Good for medium and sparse datasets

  22. kNN Search Performance Moderate performance over various k

  23. Robustness The choice of parameters does not make a large difference

  24. Conclusion • Our Contributions • The first index for distance computation on road networks • Speed up general query processing • Optimal choice of distance categories and category encoding • Future work • Cross-node signature compression • The signatures of nearby nodes are similar • Derivation of optimal distance categories for a wider range of network topologies and object distributions
