TPR-trees. Indexing the Positions of Continuously Moving Objects. CMSC5705 Paper Presentation.
TPR-trees: Indexing the Positions of Continuously Moving Objects
March Cheung, Phoenix Wan, Alan Yuen, James Wong, Keith Tseung
CMSC5705 Paper Presentation
In the lectures, we learned that the R-tree is a heuristic data structure that answers multi-dimensional range search queries efficiently in practice. However, the ordinary R-tree has limitations. One is that it supports only static data, so it is inapplicable to scenarios where data points (objects) move continuously. This presentation discusses an extension of the R-tree called the time-parameterized R-tree (TPR-tree). The TPR-tree supports efficient querying of the current and projected future positions of moving data points (objects) in one-, two-, and three-dimensional space.
GPS & Wireless Communication technologies
Stock Price Movement
Outline
Problem Definition
Example: 2-d Continuously Moving Data Set
Example: 2-d Continuously Moving Data Set (cont.), shown at t = 0, t = 1, t = 2
Problem Definition
Problem: Multi-dimensional Time Slice Query
Let P be a set of d-dimensional continuously moving points in R^d, where d = 1, 2, or 3. Given a rectangle r ⊆ R^d and a time point t, a time slice query returns all the points of P that fall in r at time t.
Problem: Multi-dimensional Moving Query
Let P be a set of d-dimensional continuously moving points in R^d, where d = 1, 2, or 3. Given rectangles r1, r2 ⊆ R^d and a time span [t1, t2], a (d+1)-dimensional trapezoid is obtained by connecting r1 at time t1 to r2 at time t2. A moving query returns all points of P that fall in that trapezoid.
Problem: Multi-dimensional Window Query
A window query is a special type of moving query with r1 = r2.
Problem Definition
Example: Time Slice Query
Qtimeslice: r = {(2,6), (2,10), (6,6), (6,10)}, t = 1 (figure panels: t = 0, t = 1, t = 2)
The result is {p1, p2, p5}
Problem Definition
Example: Window Query
Qwindow: r = {(2,6), (2,10), (6,6), (6,10)}, t1 = 1, t2 = 2 (figure panels: t = 0, t = 1, t = 2)
The result is {p1, p2, p5} ∪ {p1, p2, p4, p5} = {p1, p2, p4, p5}
Problem Definition
Example: Moving Query
Qmoving: r1 = {(0,0), (0,4), (4,0), (4,4)}, r2 = {(2,2), (2,6), (6,2), (6,6)}, t1 = 0, t2 = 2 (figure panels: t = 0, t = 1, t = 2)
The result is {} ∪ {p3} ∪ {p3, p4} = {p3, p4}
Problem Definition
Example: 1-d data space
Problem Definition
Definition: d-dimensional continuously moving data point/object
• An object's position at some time t is given by x(t) = (x1(t), x2(t), …, xd(t)), where it is assumed that the times t are not before the current time. This position is modeled as a linear function of time: x(t) = x(tref) + v(t – tref)
• x(tref) is a reference position observed at time tref
• v is the velocity vector of the object
• The linear function of a moving object may change from time to time
To explain the TPR-tree, let us make an (unrealistic) assumption, which we will come back to later: assume that all data points/objects move "statically" in R^d, i.e., the linear function of an object never changes.
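The linear motion model above can be sketched in a few lines of Python. This is only an illustration: the class name `MovingPoint` and its field names are assumptions made for this sketch, not names from the presentation.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MovingPoint:
    """Hypothetical container for one moving object (names are illustrative)."""
    ref_pos: Tuple[float, ...]   # x(tref), one coordinate per dimension
    velocity: Tuple[float, ...]  # v, one component per dimension
    t_ref: float = 0.0           # time at which ref_pos was observed

    def position(self, t: float) -> Tuple[float, ...]:
        """Evaluate x(t) = x(tref) + v * (t - tref) in every dimension."""
        dt = t - self.t_ref
        return tuple(x + v * dt for x, v in zip(self.ref_pos, self.velocity))


# The point x = 2 + t, y = 10 - t from the 2-d example.
p = MovingPoint(ref_pos=(2.0, 10.0), velocity=(1.0, -1.0))
print(p.position(2.0))
```

When the object changes direction or speed, a new `MovingPoint` with a fresh reference position and velocity would replace the old one.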
TPR-tree
• The TPR-tree is a balanced, multi-way tree with the structure of an R-tree. It follows all the conventions of an R-tree. For example:
• Each leaf node has between 0.4B and B points, where B is a parameter of at least 3. If the leaf is the root, it can have any number of points
• Each internal node has between 0.4B and B child nodes. If the node is the root, it must have at least 2 child nodes
• A leaf node stores the position of a moving point in the form of its linear function. In practice, the parameters of the linear function in each dimension, (xd(tref), vd), are stored
• An internal node stores a rectangle that bounds the positions of all moving points or other bounding rectangles below it
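TPR-tree
Example: Modeling moving points as leaf nodes of TPR-tree
Each leaf entry stores the pair (xd(tref), vd) for each dimension, here with tref = 0:
• x = 2 + t, y = 10 – t is stored as (2, 1), (10, -1)
• x = 4 + t, y = 8 is stored as (4, 1), (8, 0)
• x = 1 + t, y = 9 – t is stored as (1, 1), (9, -1)
• x = 6 + t, y = 7 is stored as (6, 1), (7, 0)
• x = 3, y = 3 + t is stored as (3, 0), (3, 1)
• x = 8 – t, y = 2 is stored as (8, -1), (2, 0)
• x = 10 – 2t, y = 1 is stored as (10, -2), (1, 0)
• x = 5, y = 2 + 2t is stored as (5, 0), (2, 2)
How to model the internal nodes?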
TPR-tree
Example: Modeling the internal nodes of TPR-tree
What if we create MBRs at t = 0, as in an ordinary R-tree? (figure panels: t = 0, t = 1, t = 2)
TPR-tree: Time-parameterized bounding rectangle
Definition: Time-parameterized bounding rectangles in d-dimensional space
• A time-parameterized bounding rectangle bounds the enclosed points or rectangles at all times not earlier than the current time. It is time-parameterized so that it can bound continuously moving points.
• For each dimension d, the bounding interval is specified by 2 linear functions of time, [xd├(tref) + vd├(t – tref), xd┤(tref) + vd┤(t – tref)], where ├ marks a lower bound, ┤ marks an upper bound, and oi ranges over the enclosed points or rectangles:
• xd├ = xd├(tref) = mini{oi.xd├(tref)}
• xd┤ = xd┤(tref) = maxi{oi.xd┤(tref)}
• vd├ = mini{oi.vd├}
• vd┤ = maxi{oi.vd┤}
• tref is the time when the bounding rectangle is created
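The definition above can be sketched as follows for the simplest case, a leaf node that encloses points only. The function names and the tuple layout `(x_lo, x_hi, v_lo, v_hi)` are assumptions made for this illustration, and all points are assumed to share the same tref.

```python
from typing import List, Sequence, Tuple

# One moving point: (reference position per dimension, velocity per dimension).
Point = Tuple[Sequence[float], Sequence[float]]
# One bounding interval: (x_lo, x_hi, v_lo, v_hi), all taken at tref.
Interval = Tuple[float, float, float, float]


def bounding_rectangle(points: List[Point]) -> List[Interval]:
    """Per dimension: min/max reference position and min/max velocity."""
    dims = len(points[0][0])
    rect = []
    for d in range(dims):
        xs = [pos[d] for pos, _ in points]
        vs = [vel[d] for _, vel in points]
        rect.append((min(xs), max(xs), min(vs), max(vs)))
    return rect


def interval_at(interval: Interval, t: float, t_ref: float = 0.0) -> Tuple[float, float]:
    """Evaluate the two linear bound functions at time t."""
    x_lo, x_hi, v_lo, v_hi = interval
    return (x_lo + v_lo * (t - t_ref), x_hi + v_hi * (t - t_ref))


# Two points from the leaf-node example: (2 + t, 10 - t) and (1 + t, 9 - t).
rect = bounding_rectangle([((2, 10), (1, -1)), ((1, 9), (1, -1))])
print(rect)
print([interval_at(iv, 2.0) for iv in rect])
```

For an internal node whose children are rectangles, the same minima and maxima would be taken over the children's lower and upper bounds instead of over point coordinates.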
TPR-tree: Time-parameterized bounding rectangle
Example: Modeling moving points as internal nodes of TPR-tree
Each leaf node is bounded by a time-parameterized rectangle, written per dimension as (lower bound, its velocity), (upper bound, its velocity):
• Points (2 + t, 10 – t) and (1 + t, 9 – t): x├ = 1, vx├ = 1, x┤ = 2, vx┤ = 1, y├ = 9, vy├ = -1, y┤ = 10, vy┤ = -1
• Points (4 + t, 8) and (6 + t, 7): x from (4, 1) to (6, 1), y from (7, 0) to (8, 0)
• Points (3, 3 + t) and (5, 2 + 2t): x from (3, 0) to (5, 0), y from (2, 1) to (3, 2)
• Points (8 – t, 2) and (10 – 2t, 1): x from (8, -2) to (10, -1), y from (1, 0) to (2, 0)
The entries one level up bound these rectangles in the same way. For example, the rectangle over the first two leaf nodes has x├ = 1, vx├ = 1, x┤ = 6, vx┤ = 1, y├ = 7, vy├ = -1, y┤ = 10, vy┤ = 0.
Example: Modeling moving points as internal nodes of TPR-tree (cont.), shown at t = 0, t = 1, t = 2
TPR-tree: Time-parameterized bounding rectangle
Definition: Updating a time-parameterized bounding rectangle
• As a time-parameterized bounding rectangle never shrinks, query performance deteriorates with time. One can update the bounding rectangle from time to time to improve the situation.
• For each dimension d, the end points of a bounding interval can be updated as follows:
• xd├ = mini{oi.xd├(tupd)} – vd├(tupd – tref)
• xd┤ = maxi{oi.xd┤(tupd)} – vd┤(tupd – tref)
• tref is the time when the bounding rectangle is created
• tupd is the time when the update operation is performed
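A one-dimensional sketch of this update rule is given below, assuming the enclosed objects are points stored as `(x_ref, v)` pairs observed at tref. The function name `tighten_interval` is hypothetical.

```python
from typing import List, Tuple


def tighten_interval(points: List[Tuple[float, float]],
                     v_lo: float, v_hi: float,
                     t_ref: float, t_upd: float) -> Tuple[float, float]:
    """Recompute the reference end points (x_lo, x_hi) of a bounding interval.

    The positions are evaluated at t_upd, then projected back to t_ref with the
    interval's own bound velocities, so the stored velocities stay unchanged.
    """
    dt = t_upd - t_ref
    positions = [x + v * dt for x, v in points]
    x_lo = min(positions) - v_lo * dt
    x_hi = max(positions) - v_hi * dt
    return x_lo, x_hi


# Two points moving towards each other: x = 0 + t and x = 4 - t.
pts = [(0.0, 1.0), (4.0, -1.0)]
# Creation-time interval at tref = 0 is [0 - t, 4 + t], i.e. [-2, 6] at t = 2.
x_lo, x_hi = tighten_interval(pts, v_lo=-1.0, v_hi=1.0, t_ref=0.0, t_upd=2.0)
# Updated interval evaluated at t = 2: both points are at x = 2.
print(x_lo + (-1.0) * 2.0, x_hi + 1.0 * 2.0)
```

In this example the creation-time interval has grown to [-2, 6] by t = 2, while the updated interval evaluates to [2, 2] at the same time and still bounds both points afterwards.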
TPR-tree: Time-parameterized bounding rectangle
Example: Updating a time-parameterized bounding interval
Figure: creation-time (bold) and update-time (dashed) bounding intervals for four moving points
TPR-tree: Solving a time slice query in TPR-tree
• Answering a time slice query in a TPR-tree is in fact no different from answering a range search in a traditional R-tree. The TPR-tree stores linear functions for both the moving points and the time-parameterized bounding rectangles. Therefore, a snapshot of the TPR-tree can be obtained by supplying a time parameter tq. Issuing a range search query on the snapshot is exactly the time slice query. How?
TPR-tree: Solving a time slice query in TPR-tree
Definition: Determining whether a bounding rectangle intersects a query at tq
• For each dimension d, a bounding interval (xd├, xd┤, vd├, vd┤) intersects a query interval [ad├, ad┤] (the extent of r in dimension d) at time tq if and only if
ad├ ⩽ xd┤ + vd┤(tq – tref) ∧ ad┤ ⩾ xd├ + vd├(tq – tref)
• tref is the time when the bounding rectangle is created
• tq is the time specified in the query
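The intersection condition can be sketched as a small predicate. The tuple layouts and the name `intersects_at` are assumptions for this illustration: a rectangle is a list of `(x_lo, x_hi, v_lo, v_hi)` per dimension, and a query is a list of `(a_lo, a_hi)` per dimension.

```python
from typing import List, Tuple


def intersects_at(rect: List[Tuple[float, float, float, float]],
                  query: List[Tuple[float, float]],
                  t_q: float, t_ref: float = 0.0) -> bool:
    """True if the bounding rectangle overlaps the query rectangle at time t_q."""
    dt = t_q - t_ref
    for (x_lo, x_hi, v_lo, v_hi), (a_lo, a_hi) in zip(rect, query):
        lo = x_lo + v_lo * dt  # lower bound of the interval at t_q
        hi = x_hi + v_hi * dt  # upper bound of the interval at t_q
        if not (a_lo <= hi and a_hi >= lo):
            return False       # disjoint in this dimension
    return True


# Bounding rectangle of the points (2 + t, 10 - t) and (1 + t, 9 - t).
rect = [(1.0, 2.0, 1.0, 1.0), (9.0, 10.0, -1.0, -1.0)]
# Query rectangle with x in [2, 6] and y in [6, 10], as in the time slice example.
query = [(2.0, 6.0), (6.0, 10.0)]
print(intersects_at(rect, query, t_q=1.0))
print(intersects_at(rect, query, t_q=6.0))
```

A time slice search would apply this test at every internal node and descend only into children for which it returns true, exactly as an R-tree range search does with static rectangles.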
TPR-tree: Solving a time slice query in TPR-tree
Example: Solving a time slice query in TPR-tree
Nodes u1, u2, u3, u5, u6 are accessed to answer the time slice query at t = 1, with the shaded region as the query range. (Figure: tree with nodes u1 to u7, entries e2 to e7, and points p1 to p8.)
TPR-tree: Solving a moving query in TPR-tree
• To answer a moving query on a d-dimensional data set, we need to consider (d+1)-dimensional space, with time as one of the axes. Besides reporting whether a bounding rectangle intersects the query range, the algorithm should be able to tell the time range during which they intersect.
TPR-tree: Solving a moving query in TPR-tree
Example: Intersection of a bounding interval and a query
For each dimension d, we derive the query interval from r├ and r┤ and model it as a function of time: [ad├(t├) + wd├(t – t├), ad┤(t├) + wd┤(t – t├)]. The figure compares it with the bounding interval [xd├, xd┤] of bounding-rectangle(v). How do we obtain the values [tsub├, tsub┤]?
TPR-tree: Solving a moving query in TPR-tree
• The steps to obtain [tsub├, tsub┤] are illustrated here; see the appendix for detailed formulas
• A bounding interval and a query are disjoint in the following scenarios, i.e., [tsub├, tsub┤] = Ø
TPR-tree: Solving a moving query in TPR-tree
• In the following 2 scenarios, tsub├ can be obtained where the boundaries intersect:
• The lower bound of the query intersects the upper bound of the bounding interval
• The upper bound of the query intersects the lower bound of the bounding interval
For the first scenario, solve for t such that ad├(t├) + wd├(t – t├) = xd┤(t├) + vd┤(t – t├)
TPR-tree: Solving a moving query in TPR-tree
• Similarly, in the following 2 scenarios, tsub┤ can be obtained where the boundaries intersect:
• The lower bound of the query intersects the upper bound of the bounding interval
• The upper bound of the query intersects the lower bound of the bounding interval
TPR-tree: Solving a moving query in TPR-tree
• The above 4 scenarios can be combined, as in the earlier example "Intersection of a bounding interval and a query".
• For all other scenarios, a bounding interval and a query intersect for the whole time span, i.e., [tsub├, tsub┤] = [t├, t┤]. For example
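The scenarios above can be handled uniformly by treating each boundary comparison as a linear inequality in t. The sketch below does this for one dimension; it is an illustration under assumed names and tuple layouts (`overlap_time_1d`, `(x_lo, x_hi, v_lo, v_hi)` for the bounding interval at t_ref, `(a_lo, a_hi, w_lo, w_hi)` for the query interval at t_start), not the formulas from the appendix.

```python
from typing import Optional, Tuple

Span = Optional[Tuple[float, float]]


def _clip(lo: float, hi: float, c: float, m: float) -> Span:
    """Restrict [lo, hi] to the times t where c + m * t >= 0 (None if empty)."""
    if m == 0:
        return (lo, hi) if c >= 0 else None
    root = -c / m
    if m > 0:
        lo = max(lo, root)   # inequality holds from the crossing onwards
    else:
        hi = min(hi, root)   # inequality holds up to the crossing
    return (lo, hi) if lo <= hi else None


def overlap_time_1d(bound: Tuple[float, float, float, float],
                    query: Tuple[float, float, float, float],
                    t_ref: float, t_start: float, t_end: float) -> Span:
    """Time span within [t_start, t_end] during which the two intervals overlap."""
    x_lo, x_hi, v_lo, v_hi = bound
    a_lo, a_hi, w_lo, w_hi = query
    # Query lower bound must not exceed the bounding interval's upper bound.
    span = _clip(t_start, t_end,
                 (x_hi - v_hi * t_ref) - (a_lo - w_lo * t_start), v_hi - w_lo)
    if span is None:
        return None
    # Query upper bound must not fall below the bounding interval's lower bound.
    return _clip(span[0], span[1],
                 (a_hi - w_hi * t_start) - (x_lo - v_lo * t_ref), w_hi - v_lo)


# Static bounding interval [0, 1]; query interval [3, 4] moving down at speed 1.
print(overlap_time_1d((0.0, 1.0, 0.0, 0.0), (3.0, 4.0, -1.0, -1.0),
                      t_ref=0.0, t_start=0.0, t_end=5.0))
```

For a d-dimensional query, the per-dimension spans would be intersected, and the rectangle is reported only if the combined span is non-empty.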
Insertion, Updating and Deletion
Example: Data points/objects that move dynamically
• Recall that we assumed all data points/objects move statically. However, this is obviously unrealistic. Consider the earlier 1-d data space example
Insertion, Updating and Deletion
• The linear function of a data point is a prediction of its movement. Once the moving path changes, we need to update the corresponding leaf node that stores the data point.
• Updating the linear function in a leaf node may invalidate the bounding rectangle. In that situation, the index must be rebuilt.
• The TPR-tree data structure has a life span, called the time horizon (H). The data structure has to be rebuilt after every time period of H.
• This introduces the need for an insertion algorithm to build the tree.
• When a new data point comes in, we can use the same algorithm to insert it into the existing tree dynamically.
Insertion, Updating and Deletion: Insertion Algorithm
• As with range queries, we can reuse the insertion algorithm of an ordinary R-tree, with 2 exceptions:
• Even if there is no overflow, the parent node of the target leaf node may still need to be updated
• The choose-subtree/split algorithm used to select the target bounding rectangle differs
Insertion, Updating and Deletion: Insertion Algorithm (figure)
Insertion, Updating and Deletion: Insertion Algorithm
• In an ordinary R-tree, we choose the MBR that requires the minimum increase in perimeter. In a TPR-tree, a time-parameterized bounding rectangle grows with time.
• It is desirable that the bounding rectangles are as small as possible at all times in [tref, tref + H], i.e., that growth rates are kept minimal.
• To take time and growth rate into consideration, we integrate the objective function A(t) over [tref, tref + H]
Insertion, Updating and Deletion: Insertion Algorithm
• The objective function A(t) could be
• the area of the bounding rectangle
• the perimeter of the bounding rectangle
• the overlap among bounding rectangles
• For example, if A(t) is the area, the integral computes the area (volume) of the trapezoid in (x, t)-space
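As a rough illustration of an integrated objective, the sketch below integrates the area of a time-parameterized rectangle over [tref, tref + H] in closed form and uses the increase of that integral to pick a subtree. The helper names (`integrated_area`, `enlarge`, `choose_subtree`) are hypothetical, rectangles are lists of `(x_lo, x_hi, v_lo, v_hi)` per dimension sharing one tref, and choosing by least increase of integrated area is one plausible reading of the slides rather than the exact algorithm.

```python
from typing import List, Sequence, Tuple

Rect = List[Tuple[float, float, float, float]]
Point = Tuple[Sequence[float], Sequence[float]]  # (positions, velocities)


def integrated_area(rect: Rect, horizon: float) -> float:
    """Integral of the rectangle's area over s in [0, horizon], s = t - tref."""
    # area(s) = product over dimensions of (extent + growth * s); keep it as
    # polynomial coefficients in s, lowest degree first.
    coeffs = [1.0]
    for x_lo, x_hi, v_lo, v_hi in rect:
        extent, growth = x_hi - x_lo, v_hi - v_lo
        nxt = [0.0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            nxt[k] += c * extent
            nxt[k + 1] += c * growth
        coeffs = nxt
    # Integrate each term c * s^k from 0 to horizon.
    return sum(c * horizon ** (k + 1) / (k + 1) for k, c in enumerate(coeffs))


def enlarge(rect: Rect, point: Point) -> Rect:
    """Smallest conservative rectangle covering rect and the new point."""
    pos, vel = point
    return [(min(x_lo, p), max(x_hi, p), min(v_lo, v), max(v_hi, v))
            for (x_lo, x_hi, v_lo, v_hi), p, v in zip(rect, pos, vel)]


def choose_subtree(children: List[Rect], point: Point, horizon: float) -> int:
    """Index of the child whose integrated area grows least when adding point."""
    def cost(rect: Rect) -> float:
        return integrated_area(enlarge(rect, point), horizon) - integrated_area(rect, horizon)
    return min(range(len(children)), key=lambda i: cost(children[i]))


# Area (1 + s) * 2 integrated over s in [0, 2].
print(integrated_area([(0.0, 1.0, 0.0, 1.0), (0.0, 2.0, 0.0, 0.0)], 2.0))
```

Because velocities enter the cost, a point is steered towards a rectangle that moves the same way, even when another rectangle is closer at tref.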
Insertion, Updating and Deletion: Final words on splitting nodes and deletion
• Splitting nodes
• R-tree approach: within some time limit, split the node according to the best partitioning found, i.e., the one that minimizes the sum of the objective functions
• R*-tree approach: sort the data points and use the sorted order as the reference for splitting
• Sort the data points based on position:
• Create-time position
• Update-time position
• Sort the data points based on velocity vector
• Points with similar velocities will be grouped together
• Deleting nodes
• The R*-tree approach is applied. When a node underflows, it is removed and its remaining points are reinserted into the tree
• The bounding rectangle that results from merging nodes (the R-tree approach) is unlikely to perform well for continuously moving points.
Quality and Performance Study
Example: Data points/objects that move dynamically
• Consider again the example of data points/objects that move dynamically. It is important to introduce the concept of iss(Q), the time when a query Q is issued.
Consider Q1: if iss(Q1) < 1, result = {o1}; if iss(Q1) ⩾ 1, result = {}
The index re-building rate (i.e., the time horizon) affects the quality (accuracy) of the data structure
Quality and Performance Study
• Quality and performance parameters
• Querying window (W): how far queries can "look" into the future. A quality parameter.
• Index usage time (U): the time interval during which an index will be used for querying. A quality and performance parameter.
• Time horizon (H): the life span of the data structure, which equals the index usage time plus the querying window, H = U + W.
Quality and Performance Study: Empirical Study
• Let UI denote the update interval of an object's movement in the experiment, and ND the number of movement destinations distributed in the space. Below is the setting of the experiment:
Quality and Performance Study: Empirical Study
• Search performance for UI = 60 and varying settings of H
Findings
• The best values of H lie between UI/2 + W and UI + W.
Quality and Performance Study: Empirical Study
• Decreasing the number of destinations adds skew to the distribution of the object positions and their velocity vectors
Findings
• Increased skew leads to a decrease in the number of I/Os for all three approaches, especially for the TPR-tree
• Not good for uniformly distributed destinations
Quality and Performance Study: Empirical Study
• Search performance for varying W
Findings
• Given skewed data, the performance of the TPR-tree stays relatively constant across different values of W
Quality and Performance Study: Empirical Study
• Search performance for varying query sizes and 3-dimensional data
Findings
• The increased dimensionality of the data adversely affects performance; however, the TPR-tree still performs much better than the R-tree
Quality and Performance Study: Empirical Study
• Search performance for a varying number of objects
Findings
• Shows the scalability of the TPR-tree: the number of objects is increased while the density is kept approximately the same
Thank You! Question and Answer Session