Computer Engineering and Informatics Department, Polytechnic School, University of Patras. Informatics Department, Ionian University. Informatics Department, Aristotle University of Thessaloniki. Indexing Mobile Objects on the plane Revisited. S.Sioutas, K.Tsakalidis, K.Tsihlas, C. Makris
Indexing Mobile Objects on the plane Revisited • S. Sioutas, K. Tsakalidis, K. Tsihlas, C. Makris & Y. Manolopoulos • Computer Engineering and Informatics Department, Polytechnic School, University of Patras • Informatics Department, Ionian University • Informatics Department, Aristotle University of Thessaloniki • Data Engineering Lab • The authors would like to thank the Greek-Bulgarian Bilateral Scientific Protocol for funding the above work.
Definition of Problem - Literature Survey • Problem: Report the mobile objects located inside the rectangle [x1q, x2q] × [y1q, y2q] at the time instants between t1q and t2q (where tnow ≤ t1q ≤ t2q), given the current motion information of all objects • Literature Survey - Methods • Geometric Duality Transformation and B+ Trees / Partition Trees • TPR*-trees • STRIPES
Problem Description • Velocities are bounded by [umin, umax] • Objects update their motion information when their speed or direction changes. The system is dynamic, i.e. objects may be deleted or new objects may be inserted • Let P(t0) = [x0, y0] be the initial position at time t0. Then the object starts moving, and at time t > t0 its position will be P(t) = [x(t), y(t)] = [x0 + ux(t − t0), y0 + uy(t − t0)] • U = [ux, uy] is its velocity vector • The lines in the figure below depict the objects' trajectories on the (t, y) plane
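The linear motion model above can be illustrated with a minimal Python sketch. The function name `position` and the sample numbers are hypothetical and chosen only for illustration; the formula itself is the one stated on the slide.

```python
from typing import Tuple

Point = Tuple[float, float]

def position(p0: Point, u: Point, t0: float, t: float) -> Point:
    """Position at time t of an object that was at p0 at time t0
    and moves with constant velocity vector u = (ux, uy)."""
    if t < t0:
        raise ValueError("the model only predicts positions for t >= t0")
    x0, y0 = p0
    ux, uy = u
    dt = t - t0
    # P(t) = [x0 + ux (t - t0), y0 + uy (t - t0)]
    return (x0 + ux * dt, y0 + uy * dt)

# Object at (10, 20) at time 5, moving with velocity (2, -1); where is it at time 8?
print(position((10.0, 20.0), (2.0, -1.0), 5.0, 8.0))  # (16.0, 17.0)
```

Whenever the speed or direction changes, the object would report a new (p0, u, t0) triple, which is the update the index has to absorb.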
Indexing Mobile Objects in one dimension: Hough-X dual Transformation • It maps the line with equation y(t) = ut + a to the dual point (u, a) in R² • Accordingly, the 1-d query [(t1q, t2q), (y1q, y2q)] becomes a polygon in the dual space (see figure above) • Thus, for positive velocities, the initial query [(t1q, t2q), (y1q, y2q)] in the (t, y) plane can be transformed to the following query in the (u, a) plane: umin ≤ u ≤ umax, a ≥ y1q − t2q·u, a ≤ y2q − t1q·u
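A minimal Python sketch of the Hough-X mapping and the dual-space membership test follows. It assumes a trajectory y(t) = ut + a; the constraints come from requiring that the line crosses the query rectangle, i.e. y(t2q) ≥ y1q and y(t1q) ≤ y2q when u ≥ 0 (the roles of t1q and t2q swap for u < 0). The function names are hypothetical, and the velocity bound umin ≤ u ≤ umax is left out for brevity.

```python
from typing import Tuple

def hough_x(y0: float, u: float, t0: float) -> Tuple[float, float]:
    """Map an object seen at y0 at time t0 with velocity u to the dual
    point (u, a), where a is the intercept of y(t) = u*t + a."""
    return (u, y0 - u * t0)

def in_hough_x_query(u: float, a: float,
                     t1q: float, t2q: float,
                     y1q: float, y2q: float) -> bool:
    """True if the trajectory y(t) = u*t + a enters [y1q, y2q]
    at some time in [t1q, t2q]."""
    if u >= 0:
        # increasing trajectory: y(t2q) >= y1q and y(t1q) <= y2q
        return y1q - u * t2q <= a <= y2q - u * t1q
    # decreasing trajectory: y(t1q) >= y1q and y(t2q) <= y2q
    return y1q - u * t1q <= a <= y2q - u * t2q

u, a = hough_x(y0=4.0, u=2.0, t0=2.0)                # y(t) = 2t, so a = 0
print(in_hough_x_query(u, a, 10, 20, 30, 50))        # True: y(20) = 40
print(in_hough_x_query(1.0, 0.0, 10, 20, 30, 50))    # False: y never exceeds 20
```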
Indexing Mobile Objects in one dimension: Hough-Y dual Transformation • By rewriting the equation y = ut + a as t = (1/u)·y − a/u • The point in the dual plane has coordinates (b, n), where n = 1/u and b = −a/u • Thus, for positive velocities, the initial query [(t1q, t2q), (y1q, y2q)] in the (t, y) plane can be transformed to the following query in the (b, n) plane: 1/umax ≤ n ≤ 1/umin, t1q − n·y2q ≤ b ≤ t2q − n·y1q
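The Hough-Y mapping can be sketched the same way. The sketch below assumes strictly positive velocities, uses n = 1/u and b = −a/u (b is the time at which the trajectory crosses y = 0), and obtains the dual constraints by substituting these into the conditions y(t2q) ≥ y1q and y(t1q) ≤ y2q. The function names are hypothetical.

```python
from typing import Tuple

def hough_y(u: float, a: float) -> Tuple[float, float]:
    """Map the trajectory y(t) = u*t + a (u > 0) to the dual point (b, n)."""
    if u <= 0:
        raise ValueError("this sketch only covers positive velocities")
    n = 1.0 / u        # slope of t as a function of y
    b = -a / u         # time at which the object crosses y = 0
    return (b, n)

def in_hough_y_query(b: float, n: float,
                     t1q: float, t2q: float,
                     y1q: float, y2q: float) -> bool:
    """True if the trajectory enters [y1q, y2q] during [t1q, t2q]."""
    return t1q - n * y2q <= b <= t2q - n * y1q

b, n = hough_y(u=2.0, a=-10.0)                    # y(t) = 2t - 10
print((b, n))                                     # (5.0, 0.5)
print(in_hough_y_query(b, n, 10, 20, 20, 50))     # True: y(20) = 30
print(hough_y(u=0.25, a=0.0)[1])                  # 4.0: slow objects get a large n
```

The last call shows why slow objects are awkward in this dual plane: as u approaches 0, the n coordinate grows without bound.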
CRITERION: Hough Dual Transformations • Motions with small velocities in the Hough-Y approach are mapped into dual points (b, n) having large n coordinates (n = 1/u) • By storing the Hough-Y dual points in an index structure such as an R*-tree, MBRs with large extents are introduced, and the performance is severely affected • By using Hough-X for the partition of small velocities, this effect is eliminated • The query area in the Hough-X plane is enlarged by the area E Hough-X = E1 Hough-X + E2 Hough-X, and in the Hough-Y plane by E Hough-Y = E1 Hough-Y + E2 Hough-Y • Q Hough-X = actual area of the simplex query in the Hough-X plane • Q Hough-Y = actual area of the simplex query in the Hough-Y plane • Thus, the overall solution proposes the choice of that transformation which minimizes the following criterion:
The procedure for building the index • Decompose the 2-d motion into two 1-d motions on the (t,x) and (t,y) planes. • For each projection, build the corresponding index structure. Partition the objects according to their velocity: • Objects with small velocity are stored using the Hough-X dual transform, while the rest are stored using the Hough-Y dual transform. • Motion information about the other projection is also included.
Algorithm for answering the exact 2-d query • (1) Decompose the query into two 1-d queries, for the (t, x) and (t, y) projections (2) For each projection get the dual simplex query (3) For each projection calculate the criterion c and choose the one (say p) that minimizes it (4) Search in projection p the Hough-X or Hough-Y partition (5) Perform a refinement or filtering step "on the fly", by using the whole motion information. Thus, the result set contains only the objects that satisfy the query
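The filter-and-refine idea behind these steps can be sketched in Python as follows. This is a brute-force illustration under stated assumptions, not the indexed method: it scans all objects instead of searching a dual-space index, it always filters on the (t, x) projection rather than choosing the projection through the criterion c, and each object is stored as a hypothetical tuple (x0, y0, ux, uy) with positions given at time 0.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[float, float]

def time_window(p0: float, u: float, lo: float, hi: float,
                t1: float, t2: float) -> Optional[Interval]:
    """Sub-interval of [t1, t2] during which lo <= p0 + u*t <= hi, or None."""
    if u == 0:
        return (t1, t2) if lo <= p0 <= hi else None
    ta, tb = (lo - p0) / u, (hi - p0) / u
    start, end = max(t1, min(ta, tb)), min(t2, max(ta, tb))
    return (start, end) if start <= end else None

def query_2d(objects: Dict[str, Tuple[float, float, float, float]],
             x_rng: Interval, y_rng: Interval, t_rng: Interval) -> List[str]:
    result = []
    for oid, (x0, y0, ux, uy) in objects.items():
        # 1-d query on the (t, x) projection acts as the filter
        wx = time_window(x0, ux, x_rng[0], x_rng[1], t_rng[0], t_rng[1])
        if wx is None:
            continue
        # refinement "on the fly" with the motion in the other projection
        if time_window(y0, uy, y_rng[0], y_rng[1], wx[0], wx[1]) is not None:
            result.append(oid)
    return result

objects = {
    "a": (0.0, 0.0, 1.0, 1.0),      # inside the rectangle for t in [10, 15]
    "b": (0.0, 100.0, 1.0, -1.0),   # passes the x filter, fails the refinement
    "c": (50.0, 0.0, 1.0, 1.0),     # fails the x filter
}
print(query_2d(objects, (10, 20), (10, 20), (5, 15)))  # ['a']
```

Object "b" shows why the refinement step is needed: its (t, x) projection satisfies the 1-d query, but it is never inside the 2-d rectangle during the query interval.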
INNOVATION • Q Hough-X is computed by querying a 2-d partition tree • Q Hough-Y is computed by querying a B+ tree that indexes the b parameters • Our construction is instead based: • (a) on the use of the Lazy B-tree [ISAAC 05] instead of the B+ tree when handling queries with the Hough-Y transform, and • (b) on the employment of a new index that outperforms partition trees in handling polygon queries with the Hough-X transform.
1st solution: Handling polygon queries when using the Hough-Y transform with the method of LBTs • Theorem: The Lazy B-tree [Sioutas et al., ISAAC 05] supports the search operation in O(log_B n) worst-case block transfers and update operations in O(1) worst-case block transfers, provided that the update position is given • 1st level = B-tree • 2nd level = buckets of size O(log² n). Each bucket consists of two list layers, L and Li respectively, where 1 ≤ i ≤ O(log n), each of which has O(log n) size • Each bucket is assigned a criticality indicating how close this bucket is to being fused or split • Every O(log_B n) updates we choose the bucket with the largest criticality and perform a rebalancing operation (fusion or split) • The update of the Lazy B-tree is performed incrementally (i.e., in a step-by-step manner) during the next O(log_B n) update operations and until the next rebalancing operation • The global rebalancing lemma ensures that the size of the buckets will never be larger than O(log² n).
1st Solution: Method of LBTs: "Two Lazy B-trees for indexing the b parameters of each dimension" • Optimal update performance • Indexing of the b parameters in O(log_B n) I/Os in each dimension • Combination of the results produced in each dimension, followed by filtering • Indexing performance depends on the area of the spatial query rectangle • For sensibly realistic levels of query rectangles: very good time performance
2nd solution: Handling polygon queries when using the Hough-X transform • Crucial observation: The query polygon has the nice property of being divided into orthogonal objects, i.e. orthogonal triangles or rectangles, since the lines X=Umin and X=Umax are parallel. • Case I:
2nd solution: Handling polygon queries when using the Hough-X transform • Case III • Case II
2nd solution: Handling polygon queries when using the Hough-X transform • The problem of handling orthogonal range search queries was addressed in PODS 99 [Arge, Samoladas, Vitter 99], where an optimal solution was presented that handles general (4-sided) range queries using O((N/B)(log(N/B))/log log_B N) disk blocks and answers queries in O(log_B N + T/B) I/Os; the structure also supports updates in O((log_B N)(log(N/B))/log log_B N) I/Os. • Let us now consider the problem of devising an access method for handling orthogonal triangle range queries; in this problem we have to determine all the points from a set S of n points on the plane lying inside an orthogonal triangle • Let T be an orthogonal triangle defined by the point (xq, yq) and the line Lq that is not axis-parallel
A new 3-layered Access Method for Triangle Range Queries • (1st layer): We sort the n points according to their x-coordinates and store the ordered sequence in a leaf-oriented balanced binary search tree of depth O(log n). This structure answers the query "determine the points having x-coordinates in the range [x1, x2]" by traversing the two paths to the leaves corresponding to x1 and x2. The points stored as leaves in the subtrees of the nodes that lie between the two paths are exactly the points in the range [x1, x2]. • (2nd layer): For each subtree, the points stored at its leaves are organized further into a second-level structure according to their y-coordinates in the same way. • (3rd layer): For each subtree of the second level, the points stored at its leaves are organized further into a third-level structure (Chazelle et al. [CGL83] in main memory or Arge et al. [AAEFV00] in external memory) for half-plane range queries.
Algorithm for Orthogonal Triangle Range Query • (1) In the tree storing the point set S according to x-coordinates, traverse the path to xq. All the points having x-coordinate in the range [xq, ∞) are stored in the subtrees rooted at nodes that are right children of a node of the search path and do not belong to the path. There are at most O(log n) such disjoint subtrees. • (2) For every such subtree, traverse the path to yq. By a similar argument as in the previous step, at most O(log n) disjoint subtrees are located, storing points that have y-coordinate in the range [yq, ∞). • (3) For each subtree found in Step 2, apply the half-plane range query of Chazelle et al. [CGL83] or Arge et al. [AAEFV00] to retrieve the points that lie on the side of line Lq towards the triangle. • The correctness of the above algorithm follows from the structure used. • In each of the first two steps we have to visit O(log n) subtrees. If in Step 3 we apply the main-memory solution of [CGL83], then the query time becomes O(log³ n + A), whereas the required space is O(n log² n). • Otherwise, if we apply the external-memory solution of [AAEFV00], then our method requires O(log² n · log_B n + A) I/Os and O(n log² n) disk blocks. • Although the space becomes superlinear, the O(log² n · log_B n + A) worst-case I/O complexity of our method is better than the O((n/B)^(1/2+ε) + A/B) worst-case I/O complexity of a partition tree.
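The three successive conditions tested by the algorithm (x ≥ xq, y ≥ yq, correct side of Lq) can be written down as a small reference implementation in Python. This is only a sketch for checking answers: it uses a sorted list with binary search for the x condition and a linear scan for the other two, so it does not achieve the bounds of the 3-layered structure. The representation of Lq as coefficients (A, B, C) of the half-plane A·x + B·y ≤ C is an assumption made for illustration.

```python
from bisect import bisect_left
from typing import List, Tuple

Point = Tuple[float, float]

def triangle_query(points: List[Point], xq: float, yq: float,
                   line: Tuple[float, float, float]) -> List[Point]:
    """Report the points with x >= xq, y >= yq and A*x + B*y <= C."""
    A, B, C = line
    pts = sorted(points)                                # order by x (1st layer)
    start = bisect_left(pts, (xq, float("-inf")))       # first point with x >= xq
    return [(x, y) for x, y in pts[start:]
            if y >= yq                                  # 2nd-layer condition
            and A * x + B * y <= C]                     # half-plane test (3rd layer)

# Triangle with right angle at (0, 0) and hypotenuse on the line x + y = 4
pts = [(1, 1), (3, 2), (-1, 1), (2, -1), (0, 4), (2, 2)]
print(triangle_query(pts, 0, 0, (1, 1, 4)))  # [(0, 4), (1, 1), (2, 2)]
```

A structure built along the lines of the slides would replace the linear scan with the O(log n) canonical subtrees of each layer, which is where the O(log³ n + A) query time comes from.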
Experimental Evaluation of LBT’s method vs B+ trees method and TPR* tree: Query Cost Comparison • Figure panels: (qVlen = 5, qTlen = 50, qRlen = 100) and (qVlen = 5, qTlen = 50, qRlen = 1000) • LA [TIGER] real spatial dataset • For simplicity, all objects are stored using the Hough-Y dual transform. This assumption is also realistic, since in practice the number of mobile objects moving with very small velocities is negligible. • Each query q has 3 parameters: qRlen, qVlen, and qTlen, such that (a) its MBR qR is a square with side length qRlen, uniformly generated in the data space, (b) its VBR is qV = {-qVlen/2, qVlen/2, -qVlen/2, qVlen/2}, and (c) its query interval is qT = [0, qTlen]
Experimental Evaluation of LBT’s method vs B+ trees method and TPR* tree: Query Cost Comparison • Figure panel: qVlen = 5, qTlen = 50, qRlen = 2000 • When the length of the query rectangle becomes extremely large, e.g. 2000, meaning a query surface of 400 hectares, our method degrades. • As the surface of the query rectangle grows, the answer size in each projection may grow too; thus the performance of the LBT method, which combines and filters the two answers, may degrade. • In real GIS applications, for a vast spatial terrain of 10^6 hectares, e.g. the road network of a big town where each road square covers no more than 1 hectare (or 10,000 m²), the most frequent queries cover a query surface of no more than 100 road squares (or 100 hectares) and a future time interval no longer than 100 seconds. This is what we refer to as sensibly realistic levels.
Experimental Evaluation of LBT’s method vs B+ trees method and TPR* tree: Query Cost Comparison • Figure panels: (qVlen = 10, qTlen = 50, qRlen = 400) and (qVlen = 10, qTlen = 50, qRlen = 1000) • The figures depict the efficiency of our solution as the velocity vector grows • The velocity factor is very important for TPR-like solutions, but not for the other methods, especially the LBT method, which depends exclusively on the surface of the query.
Experimental Evaluation of LBT’s method vs B+ trees method and TPR* tree: Query Cost Comparison • Figure panels: (qVlen = 5, qTlen = 1, qRlen = 400) and (qVlen = 5, qTlen = 1, qRlen = 1000) • The figures depict the efficiency of our solution when the length of the time interval shrinks to the value 1
Experimental Evaluation of LBT’s method vs B+ trees method and TPR* tree: Query Cost Comparison • Figure panel: qVlen = 5, qTlen = 100, qRlen = 400 • The figure depicts the efficiency of our solution when the length of the time interval grows to the value 100
Experimental Evaluation of LBT’s method vs B+ trees method and TPR* tree: Update Cost Comparison • LBTs require a constant number of 6 block transfers (3 block transfers for each projection; for details see Sioutas et al. [ISAAC 05]), and this update performance is independent of the size of the dataset. • In the other 2 solutions the update performance is not constant and depends on the size of the dataset, even if in the experiment of the figure above B+ trees seem to approach the optimal performance of LBTs, requiring 8 block transfers (the TPR* tree requires 35 block transfers on average).
Experimental Evaluation of LBT’s method vs B+ trees method, TPR* tree: Update Cost Comparison • According to theory, the LBT solution outperforms the update performance of B+ trees by a logarithmic factor, but this is not depicted clearly in the previous figures due to the small datasets. • For this reason we performed another experiment with gigantic synthetic data sets of size n ∈ [10^6, 10^12] (see the figure above)
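Why the logarithmic factor only becomes visible for very large n can be illustrated with a rough Python model. The model below is an assumption made for illustration, not a measurement from the experiments: it charges a B+ tree update one block transfer per tree level in each of the two projections, uses a hypothetical block capacity B = 100, and compares this with the constant 6 block transfers quoted for LBTs.

```python
def btree_height(n: int, fanout: int) -> int:
    """Smallest h with fanout**h >= n, computed with integers
    to avoid rounding problems of floating-point logarithms."""
    height, capacity = 0, 1
    while capacity < n:
        capacity *= fanout
        height += 1
    return height

FANOUT = 100          # hypothetical block capacity B (assumption)
LBT_UPDATE_COST = 6   # constant from the slides: 3 transfers per projection

for exponent in (6, 8, 10, 12):
    n = 10 ** exponent
    modelled = 2 * btree_height(n, FANOUT)   # two projections, one transfer per level
    print(f"n = 10^{exponent}: about {modelled} transfers in the log_B n model, "
          f"{LBT_UPDATE_COST} for LBTs")
```

Under this model the two costs coincide at n = 10^6 and separate slowly afterwards, which is consistent with the remark that small datasets hide the difference.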
CONCLUSIONS • We presented access methods for indexing mobile objects that move on the plane, in order to efficiently answer range queries about their future locations • Concerning the update performance evaluation, our 1st solution is the most efficient (optimal) • The query performance evaluation illustrates the applicability of our 1st solution when the length of the query rectangle remains at sensibly realistic levels • Finally, the 2nd solution is very efficient but somewhat complicated, and thus it is only of theoretical interest • Future plans: (1) Experimental comparison with STRIPES (already done in the journal version, with very promising results) (2) Simplification of the 2nd solution to make it more applicable in practice.