Complex Spatio-Temporal Pattern Queries Cahide Sen University of Minnesota
Outline Motivation Introduction Contributions STP Query Algorithms Related Work Evaluation Future Work
Motivation Given a large collection of spatio-temporal trajectories. Environment: 2-dimensional space, with time as the third dimension. Problem: how to evaluate Spatio-Temporal Pattern (STP) queries? Objective: reduce computation and I/O cost.
What is a Spatio-Temporal Pattern Query? An STP query returns trajectories that follow movement patterns defined as spatio-temporal predicates. Some examples: “find objects that crossed through region A at time T1, came as close as possible to point B at a later time T2 and then stopped inside circle C sometime during interval (T3, T4)” “find objects that first crossed through region A, then passed as close as possible to point B and finally stopped inside circle C”
What is a trajectory? An imagined trace of positions followed by an object moving through space; in other words, the position of the object as a function of time. A trajectory allows you to compute the location of an object at any given time. The model presented is independent of the trajectory representation.
More on ST Pattern Queries An ST pattern query Q is expressed as Q = {(Q1,T1), (Q2,T2), …, (Qm,Tm)}, where each pair is a spatio-temporal predicate. Each Qk can be a range or an NN (nearest-neighbor) query. Each Tk is a time constraint, which can be a time instant, a time interval, or empty. If the time constraints are empty, the query is classified as an STP query with Order; otherwise, as an STP query with Time.
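The query definition above can be written down as a small data structure. This is a minimal sketch, not the author's implementation: the class names, the `(start, end)` encoding of a time constraint (an instant has `start == end`, `None` means empty), and the decision to reject queries that mix empty and non-empty constraints are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass(frozen=True)
class RangePredicate:
    """Hypothetical rectangular region (xmin, ymin, xmax, ymax)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


@dataclass(frozen=True)
class NNPredicate:
    """Hypothetical nearest-neighbor predicate around a query point."""
    x: float
    y: float


# Assumed encoding: (start, end); an instant has start == end; None = empty.
TimeConstraint = Optional[Tuple[float, float]]


@dataclass(frozen=True)
class STPredicate:
    spatial: Union[RangePredicate, NNPredicate]
    time: TimeConstraint = None


def classify(query: List[STPredicate]) -> str:
    """Return 'order' if every T is empty, 'time' if every T is given."""
    if all(p.time is None for p in query):
        return "order"
    if all(p.time is not None for p in query):
        return "time"
    # The slides do not discuss mixed queries; rejecting them is an assumption.
    raise ValueError("mixed empty and non-empty time constraints")
```

For example, `classify([STPredicate(RangePredicate(0, 0, 1, 1)), STPredicate(NNPredicate(2, 2))])` yields `"order"`, matching the second example query.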
More on ST Pattern Queries STP Query with Time: “find objects that crossed through region A at time T1, came as close as possible to point B at a later time T2 and then stopped inside circle C sometime during interval (T3, T4)” STP Query with Order: “find objects that first crossed through region A, then passed as close as possible to point B and finally stopped inside circle C”
Contributions Introduces and formalizes Spatio-Temporal Pattern queries. Proposes query evaluation algorithms for STP queries with time and STP queries with order. Presents a novel index structure for evaluation of STP queries with order.
STP Query Algorithms STP Queries with Time: Range Predicate Evaluation, NN Predicate Evaluation, Combinations. STP Queries with Order: Range Predicate Evaluation, NN Predicate Evaluation, Combinations.
STP Queries with Time A traditional spatio-temporal index structure (R-tree, MVR-tree) is used to index trajectories. Trajectories are approximated using Minimum Bounding Rectangles (MBRs). An MBR bounds the movement of an object over a small time interval.
Leaf nodes of the ST index are populated with the MBRs associated with each object. Data entries in this index point to raw trajectory data on disk. Raw trajectory data is stored on sequential pages per trajectory. Thus, MBRs are indexed using a secondary spatio-temporal index structure.
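As an illustration of the MBR approximation described above, the following sketch cuts a sampled trajectory into short pieces and bounds each piece in time and space. The `(t, x, y)` sample format, the `(tmin, tmax, xmin, xmax, ymin, ymax)` tuple layout, and the `points_per_mbr` parameter are assumptions for this example; the slides do not specify how trajectories are split.

```python
def trajectory_mbrs(samples, points_per_mbr=3):
    """Approximate a trajectory by a list of spatio-temporal MBRs.

    samples: list of (t, x, y) tuples ordered by time (assumed format).
    Consecutive pieces share their boundary sample so that the movement
    between pieces is still bounded. points_per_mbr must be at least 2.
    """
    if points_per_mbr < 2:
        raise ValueError("points_per_mbr must be at least 2")
    mbrs = []
    step = points_per_mbr - 1
    for i in range(0, len(samples) - 1, step):
        chunk = samples[i:i + points_per_mbr]
        ts, xs, ys = zip(*chunk)
        mbrs.append((min(ts), max(ts), min(xs), max(xs), min(ys), max(ys)))
    return mbrs
```

In a real system each of these boxes would become a leaf entry of the R-tree or MVR-tree, carrying a pointer back to the raw trajectory pages.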
STP Query with Time, Range only Given Q = {(Q1,T1), (Q2,T2), …, (Qm,Tm)} where all Qk are range predicates: evaluate the predicates concurrently (outcome: one set of candidate trajectories per predicate), then take the intersection of these candidate sets, and retrieve only the trajectories belonging to the intersection. Visualize …
STP Query with Time, NN only Fact: an NN predicate is satisfied only with respect to the other NN predicates. This is not as straightforward as evaluating range queries. Two approaches: Lazy and Eager. Lazy: the main motivation is prepruning. Eliminate trajectories Pj whose LBD(Q,Pj) is larger than any known UBD(Q,Pk), so as to decrease the number of trajectories that must be retrieved from disk.
Lazy Strategy Q = { (NN(0), 1), (NN(0), 2), (NN(0), 3), (NN(0), 4), (NN(0), 5) } In other words, locate the object that stays closest to the origin during time interval [1,5]. First locate the 1-NN, 2-NN, etc. MBRs for each query predicate.
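The range-only evaluation can be sketched as follows. To keep the example self-contained, a linear scan over `(trajectory_id, mbr)` pairs stands in for the R-tree or MVR-tree lookup; the tuple layouts and function names are assumptions, and the result is still only a candidate set that would need checking against the raw trajectories.

```python
def overlaps(mbr, region, t_start, t_end):
    """True if an MBR intersects a query rectangle during [t_start, t_end].

    mbr:    (tmin, tmax, xmin, xmax, ymin, ymax)  (assumed layout)
    region: (xmin, ymin, xmax, ymax)              (assumed layout)
    """
    tmin, tmax, xmin, xmax, ymin, ymax = mbr
    qxmin, qymin, qxmax, qymax = region
    return (tmin <= t_end and t_start <= tmax
            and xmin <= qxmax and qxmin <= xmax
            and ymin <= qymax and qymin <= ymax)


def range_candidates(index, region, t_start, t_end):
    """Candidate trajectory ids for one range predicate (scan, not a tree)."""
    return {tid for tid, mbr in index if overlaps(mbr, region, t_start, t_end)}


def evaluate_range_query(index, predicates):
    """Intersect the candidate sets of all (region, (t_start, t_end)) pairs."""
    candidate_sets = [range_candidates(index, region, t0, t1)
                      for region, (t0, t1) in predicates]
    if not candidate_sets:
        return set()
    return set.intersection(*candidate_sets)
```

Only the ids that survive the intersection need their raw trajectory data loaded, which is where the I/O saving comes from.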
More on Lazy Strategy
t = ∞ (best distance found so far)
While true
    stop = true
    concurrentBestFirstSearch()   (output: set S of Pj discovered during NN searches)
    For each Pj in S
        if LBD(Q,Pj) > t then remove Pj from S
        else if Pj covers Q then
            PD = getRawTrajectory(Pj)
            compute D(PD, Q)
            if D(PD, Q) < t then t = D(PD, Q) and PB = Pj
            else remove Pj from S
    For each Pj left in S
        if LBD(Q,Pj) < t then stop = false
        else remove Pj from S
    If stop then break
End while
Return PB
More on Lazy Strategy A hash table indexed by trajectory. Each entry stores the current LBD(Q,Pj) and a bit vector, one bit per predicate, indicating whether predicate Qi is covered by Pj.
More on Lazy Strategy Each time a new MBR is located, the corresponding hash-table entry for Pj is retrieved, the corresponding bit is set, and LBD(Q,Pj) is updated. If the AND over all bits of the vector is true, Pj covers the query; D(Q,Pj) is then computed and stored as LBD(Q,Pj). For the other trajectories, a pessimistic approximation is used.
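A possible shape for this hash table is sketched below. It is an illustration under assumptions rather than the structure used in the talk: the class and method names are invented, a Python integer plays the role of the bit vector, and the lower bound is taken to be the sum of per-predicate distances with uncovered predicates contributing 0 (a valid but loose bound; the slides do not spell out the pessimistic approximation).

```python
class CoverageTable:
    """Hash table keyed by trajectory id (hypothetical helper)."""

    def __init__(self, num_predicates):
        self.num_predicates = num_predicates
        self.full_mask = (1 << num_predicates) - 1  # all predicate bits set
        self.bits = {}    # trajectory id -> int used as a bit vector
        self.bounds = {}  # trajectory id -> per-predicate distance bounds

    def record(self, tid, predicate_idx, distance):
        """Register an MBR of trajectory tid found for one predicate."""
        per_pred = self.bounds.setdefault(tid, [0.0] * self.num_predicates)
        mask = 1 << predicate_idx
        if self.bits.get(tid, 0) & mask:
            # Predicate already covered: keep the smaller distance.
            per_pred[predicate_idx] = min(per_pred[predicate_idx], distance)
        else:
            per_pred[predicate_idx] = distance
            self.bits[tid] = self.bits.get(tid, 0) | mask

    def covers_query(self, tid):
        """True once every predicate bit is set (the AND over the vector)."""
        return self.bits.get(tid, 0) == self.full_mask

    def lower_bound(self, tid):
        """Assumed aggregate: sum of known per-predicate distances."""
        return sum(self.bounds.get(tid, []))
```

Comparing the integer against `full_mask` is equivalent to AND-ing all bits of the vector, and costs one comparison regardless of the number of predicates.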
More on Lazy Strategy Benefits: incremental discovery of trajectory MBRs; prepruning before loading raw trajectory data; thus, fewer accesses to raw data on disk.
Eager Strategy Load raw trajectory data from disk for each predicate Qi, then compute D(Qi, Pj). Lazy or Eager? It depends on the query properties and on how many pages each trajectory occupies.
Combinations Satisfiability of NN predicates depends on satisfiability of range predicates. First evaluate the range predicates, then evaluate the NN predicates.
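The prepruning benefit can be seen in a deliberately simplified sketch. It assumes all lower bounds are already known and visits trajectories in increasing bound order, whereas the lazy strategy discovers MBRs incrementally through concurrent best-first searches; `lazy_best` and `load_exact_distance` are hypothetical names.

```python
def lazy_best(lower_bounds, load_exact_distance):
    """Find the closest trajectory while loading as few as possible.

    lower_bounds: dict of trajectory id -> lower bound on its distance.
    load_exact_distance: callable standing in for the disk access that
        fetches raw trajectory data and computes the exact distance.
    Returns (best_id, best_distance, number_of_loads).
    """
    best_id, threshold, loads = None, float("inf"), 0
    for tid, lbd in sorted(lower_bounds.items(), key=lambda kv: kv[1]):
        if lbd >= threshold:
            break  # every remaining bound is at least as large: prune all
        distance = load_exact_distance(tid)
        loads += 1
        if distance < threshold:
            best_id, threshold = tid, distance
    return best_id, threshold, loads
```

With bounds `{"a": 1, "b": 2, "c": 5}` and exact distances 4, 3 and 9, trajectory `c` is never loaded because its bound already exceeds the best exact distance found.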
STP Queries with Order Expressed as: Q = {Q1, Q2, Q3, …, Qm}. Problem: traditional ST indexes cannot be used, since the time constraint is missing and the relative order of the predicates matters. A special structure that stores order inside the index is needed. Solution: an index structure that maintains the order in which each Pj satisfies the query predicates. For each Qi, associate an ordered list SQi of the trajectories Pj satisfying Qi. Trajectories in SQi are ordered by their ids. Given an STP query Q, all predicates are evaluated concurrently by a merge-join among all the SQi lists. Even this solution can be improved. How?
More on Queries with Order Use a space-partitioning grid. Each cell acts as a range predicate; thus, an SQi list is created for each cell. Predicates are evaluated by merge-join as before, followed by a verification step to prune false positives.
More on Queries with Order Benefits: space savings due to the symbolic representation of trajectories; space is discretized, so only a fixed number of lists are kept (the cell lists can be stored on secondary storage).
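The merge-join over id-ordered lists can be sketched as below. It assumes each list holds unique trajectory ids in increasing order, and it only finds trajectories that satisfy every predicate; checking that they do so in the required order is not shown. The function name is illustrative.

```python
def merge_join(lists):
    """Return ids present in every list, scanning all lists in one pass.

    lists: list of lists of trajectory ids, each strictly increasing.
    """
    if not lists:
        return []
    pos = [0] * len(lists)
    result = []
    while all(p < len(lst) for p, lst in zip(pos, lists)):
        current = [lst[p] for p, lst in zip(pos, lists)]
        largest = max(current)
        if all(c == largest for c in current):
            result.append(largest)          # id appears in every list
            pos = [p + 1 for p in pos]
        else:
            # Advance only the cursors that lag behind the largest id.
            pos = [p + (1 if c < largest else 0) for p, c in zip(pos, current)]
    return result
```

Because every cursor only moves forward, the join touches each list entry at most once, which is why keeping the lists sorted by id matters.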
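A small sketch of the grid idea follows. Several points are assumptions made to keep it short: trajectories are `(t, x, y)` samples, a trajectory is registered only in cells that contain one of its samples (no interpolation between samples), the join is done with set intersection rather than a streaming merge, and the verification step shown checks visiting order against the raw samples.

```python
from collections import defaultdict


def cell_of(x, y, cell_size):
    """Grid cell containing point (x, y)."""
    return (int(x // cell_size), int(y // cell_size))


def build_cell_lists(trajectories, cell_size):
    """Map each cell to the sorted list of trajectory ids that visit it."""
    cells = defaultdict(set)
    for tid, samples in trajectories.items():
        for _, x, y in samples:
            cells[cell_of(x, y, cell_size)].add(tid)
    return {cell: sorted(ids) for cell, ids in cells.items()}


def visits_in_order(samples, query_cells, cell_size):
    """Verification: does the trajectory hit the query cells in sequence?"""
    i = 0
    for _, x, y in sorted(samples):
        if i < len(query_cells) and cell_of(x, y, cell_size) == query_cells[i]:
            i += 1
    return i == len(query_cells)


def query_with_order(trajectories, cell_lists, query_cells, cell_size):
    """Join the per-cell lists, then drop candidates with the wrong order."""
    if not query_cells:
        return []
    candidates = set.intersection(
        *(set(cell_lists.get(cell, [])) for cell in query_cells))
    return sorted(tid for tid in candidates
                  if visits_in_order(trajectories[tid], query_cells, cell_size))
```

The number of lists is bounded by the number of grid cells, independent of how many distinct query regions are asked for later.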
Related Work No existing work on evaluating combinations of spatial predicates with time and order. No existing work on efficient evaluation of combinations of NN predicates. “Mobility Patterns” by du Mouza and Rigaux: patterns are expressed as regular expressions and cannot handle distance-based queries or explicit time constraints. In addition, this model assumes regions are predefined and divided into named zones.
Related Work – cont. The map is partitioned into several zones identified by labels (a, b, ..). Ex: “Give all the objects that traveled from a to f, stayed at least 2 minutes in f then traveled from f to c.” Q = (a.f{2,}.c,0), a mobility pattern written as a regular expression. To evaluate the pattern query, build an automaton.
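To make the regular-expression view concrete, here is a rough analogue using Python's `re` module in place of a purpose-built automaton. It assumes each trajectory has been converted to a string with one zone label per minute, and it translates the pattern a.f{2,}.c into the Python expression `af{2,}c`; both the encoding and the translation are assumptions, not the notation or machinery of the cited work.

```python
import re

# One zone label per minute (assumed encoding of a trajectory).
PATTERN = re.compile(r"af{2,}c")


def matches_mobility_pattern(zone_string):
    """True if the object went from a to f, stayed >= 2 minutes, then to c."""
    return PATTERN.search(zone_string) is not None
```

The sketch also shows the limitation noted above: the pattern can only talk about predefined zone labels, not about distances or absolute timestamps.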
Experiments and Evaluations Synthetic dataset of moving object trajectories. R-tree and MVR-tree index structures are used to index MBRs, with 20 MBRs per trajectory on average. RelevantPattern set: 100 queries formed from partial segments of trajectories already in the dataset, slightly skewed in time and space. RandomPattern set: 100 queries whose predicates lie on consecutive nodes of the network. All predicates in STP queries with Time are NN predicates, and all predicates in STP queries with Order are range predicates. Top-20 results are returned.
Performance vs. # of Predicates, time STP queries with Time • On the RandomPattern set, Lazy deteriorates since many index nodes need to be accessed before a tight threshold can be computed. • On the RelevantPattern set, both Lazy and Eager remain largely unaffected since NN discovery is fast and searches terminate sooner.
Performance vs. # of Predicates, time • On the RandomPattern set, the # of trajectories to load increases as the # of predicates increases. • Eager loads more trajectories. • On the RelevantPattern set, both Lazy and Eager load the same number of trajectories, since there are many trajectories similar to the given queries. Very fast. • See the differences in scale.
Performance vs. # of Predicates, order STP queries with Order • RelevantPattern set. • CellList reduces I/O since trajectories are pruned efficiently while the lists are joined. • CellList decreases the # of trajectories to load. • R-tree performs very poorly: there is no way to prune trajectories, because the time constraint is missing and order information is not stored in the index.
Future Work How to integrate k-NN queries into this model? How to decrease space requirements? Can we make the STP queries continuous? How to extend the techniques for STP queries with relative temporal constraints? (10 minutes before or after, etc.) How to develop solutions for environments with uncertain moving object trajectories?
Conclusion Introduced Spatio-Temporal Pattern queries for trajectories Developed algorithms for evaluation of STP queries Developed special index structure for STP queries with order Achieved its goals: reduced I/O and computation cost
Queries with Order, kNN only For each predicate Qi, there is a list SQi. SQi is populated with all entries contained in the cells adjacent to Qi. In each phase, the number of adjacent cells is increased by moving one cell further out (first the cell containing Qi, then the 8 adjacent cells, etc.). For each addition to SQi, LBD(Qi,Pj) is computed. For every new cell added to SQi, first join SQi with the other SQj lists.
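The phase-by-phase growth of SQi can be sketched as follows. The helper names and the dictionary-based cell lists are assumptions for illustration; the lower-bound computation and the join with the other lists after each phase are left out.

```python
def ring(cell, r):
    """Cells at grid (Chebyshev) distance exactly r from the given cell."""
    cx, cy = cell
    if r == 0:
        return [cell]
    return [(cx + dx, cy + dy)
            for dx in range(-r, r + 1)
            for dy in range(-r, r + 1)
            if max(abs(dx), abs(dy)) == r]


def expand(cell_lists, query_cell, max_radius):
    """Yield (radius, newly discovered trajectory ids) for each phase.

    cell_lists: dict of cell -> list of trajectory ids (assumed layout).
    """
    seen = set()
    for r in range(max_radius + 1):
        new = set()
        for c in ring(query_cell, r):
            new.update(cell_lists.get(c, []))
        new -= seen
        seen |= new
        yield r, sorted(new)
```

Phase 0 visits the single cell containing the query point and phase 1 the 8 surrounding cells, as described above; phase 2 adds the next 16.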