
Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases

Jimeng Sun, Dimitris Papadias, Yufei Tao, Bin Liu


Presentation Transcript


  1. Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases • Jimeng Sun, Dimitris Papadias, Yufei Tao, Bin Liu

  2. Motivation • Spatio-temporal databases vs. data streams • Monitoring applications • Traffic supervision • Mobile user monitoring • Weather forecasting • Example: • Find the number of vehicles in the city center now • The challenge is to provide fast query responses in a highly update-intensive environment

  3. Problems and methods • Problems: • How to efficiently store and summarize the spatio-temporal information? • How to approximately answer queries about the past, the present, and the future? • Methods: • Adaptive multi-dimensional histogram (AMH) • Historical synopsis • Stochastic prediction method

  4. Related work • Histograms • Static multi-dimensional histograms • Equi-depth, Mhist, Minskew, Genhist, SQ • Query-adaptive multi-dimensional histograms • STGrid, STHoles, SASH • Other approximation methods • DCT, Wavelet, Sketch • Spatio-temporal databases • Historical retrieval • Future prediction

  5. Outline • Introduction • Problem and proposed methods • Adaptive multi-dimensional histogram • Historical synopsis • Prediction model • Experiment • Conclusion

  6. Query types • Present Time (PT) • Historical Time (HT) • Future Time (FT) [Figure: queries plotted by location against a time axis running from past through current to future]

  7. System Overview [Figure: the components are the Historical Synopsis (AMH and Past Index) and the Prediction Model; the inputs are spatio-temporal updates and PT, HT, and FT queries]

  8. Histogram • Partition the space into buckets • Data within a bucket is summarized by its mean • Properties of a good histogram: • Uniformity within each bucket • Incrementally updatable [Figure: a bad and a good partitioning]
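
To illustrate why uniformity within each bucket matters, here is a minimal sketch of how a range count could be estimated from bucket summaries alone. The `Bucket` class, the function names, and the sample numbers are hypothetical and not taken from the slides; the estimate assumes objects are spread uniformly inside every bucket.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """Hypothetical bucket summary: a rectangle plus an object count."""
    x_lo: float
    x_hi: float
    y_lo: float
    y_hi: float
    count: float

    def area(self) -> float:
        return (self.x_hi - self.x_lo) * (self.y_hi - self.y_lo)


def overlap_area(b: Bucket, qx_lo: float, qx_hi: float,
                 qy_lo: float, qy_hi: float) -> float:
    """Area of the intersection between bucket b and the query rectangle."""
    w = min(b.x_hi, qx_hi) - max(b.x_lo, qx_lo)
    h = min(b.y_hi, qy_hi) - max(b.y_lo, qy_lo)
    return max(w, 0.0) * max(h, 0.0)


def estimate_count(buckets, qx_lo, qx_hi, qy_lo, qy_hi) -> float:
    """Each bucket contributes its count scaled by the covered fraction
    of its area (uniformity assumption)."""
    total = 0.0
    for b in buckets:
        total += b.count * overlap_area(b, qx_lo, qx_hi, qy_lo, qy_hi) / b.area()
    return total


# Two side-by-side buckets; the query covers the right half of the first
# and the left half of the second.
buckets = [Bucket(0, 4, 0, 4, 80), Bucket(4, 8, 0, 4, 8)]
estimate = estimate_count(buckets, 2, 6, 0, 4)
```

The more uniform the data inside a bucket, the closer this proportional estimate is to the true count, which is why the partitioning tries to keep buckets uniform.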

  9. Adaptive Multi-dimensional Histogram (AMH) • Objective: minimize WVS = Σ_i (area_i ∙ var_i) (Minskew [Acharya, Poosala, Ramaswamy 99]) [Figure: regular cells with per-cell counts, grouped into buckets b1 to b6 and indexed by a BPT with internal nodes n1 to n5]
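
The WVS objective can be sketched on a small grid of regular cells. This is an illustrative reading rather than the authors' code: it assumes that area_i is the number of cells in bucket i and that var_i is the population variance of the cell counts inside it. The grid and both partitionings are made up.

```python
from statistics import pvariance


def wvs(grid, buckets):
    """Sum of area * variance over all buckets.

    grid    : list of rows of per-cell counts
    buckets : list of (row_lo, row_hi, col_lo, col_hi), half-open ranges
    """
    total = 0.0
    for r0, r1, c0, c1 in buckets:
        cells = [grid[r][c] for r in range(r0, r1) for c in range(c0, c1)]
        total += len(cells) * pvariance(cells)
    return total


grid = [
    [1, 1, 9, 9],
    [1, 1, 9, 9],
]

# "Good" partitioning: each bucket holds identical cell counts.
vertical_split = [(0, 2, 0, 2), (0, 2, 2, 4)]
# "Bad" partitioning: each bucket mixes sparse and dense cells.
horizontal_split = [(0, 1, 0, 4), (1, 2, 0, 4)]

wvs_good = wvs(grid, vertical_split)
wvs_bad = wvs(grid, horizontal_split)
```

With these numbers the vertical split has zero WVS, while the horizontal split does not, so the objective prefers the partitioning whose buckets are internally uniform.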

  10. Dynamic Maintenance of AMH • Our scheme: record the information during construction and modify the structure as needed. • 1. Information update • Update the bucket count • 2. Bucket reorganization • Merge: to reclaim buckets • Split: to reduce WVS

  11. Information update of AMH [Figure: an update is mapped through the BPT (nodes n1 to n5) to the bucket that contains it (b1 to b6)]
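
A minimal sketch of the information-update step, assuming a fixed grid of cells and a plain dictionary from cell to bucket id in place of the BPT lookup shown in the figure. The class and method names are hypothetical. Each bucket keeps a running count and a running sum of squared cell counts, so its variance can later be derived without rescanning the cells.

```python
class CellHistogram:
    """Hypothetical incremental bucket statistics over a regular grid."""

    def __init__(self, n_rows, n_cols, cell_to_bucket):
        self.cells = [[0] * n_cols for _ in range(n_rows)]
        self.cell_to_bucket = cell_to_bucket  # (row, col) -> bucket id
        ids = set(cell_to_bucket.values())
        self.count = {b: 0 for b in ids}      # objects per bucket
        self.sum_sq = {b: 0 for b in ids}     # sum of squared cell counts

    def _adjust(self, cell, delta):
        r, c = cell
        old = self.cells[r][c]
        new = old + delta
        self.cells[r][c] = new
        b = self.cell_to_bucket[cell]
        self.count[b] += delta
        self.sum_sq[b] += new * new - old * old

    def move(self, old_cell, new_cell):
        """Apply one location update; None means the object appears or leaves."""
        if old_cell is not None:
            self._adjust(old_cell, -1)
        if new_cell is not None:
            self._adjust(new_cell, +1)


# A 1 x 4 grid covered by two buckets.
mapping = {(0, 0): "b1", (0, 1): "b1", (0, 2): "b2", (0, 3): "b2"}
hist = CellHistogram(1, 4, mapping)
hist.move(None, (0, 0))    # a new object appears
hist.move(None, (0, 0))    # another one in the same cell
hist.move((0, 0), (0, 2))  # one object moves from b1 to b2
```

Each update touches at most two buckets and costs constant time once the containing bucket is known.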

  12. Bucket reorganization - Merge • Merge the subtree that leads to the minimal WVS increase • Bucket info: 1. region [x-, x+][y-, y+] 2. frequency: count/area 3. 2nd moment (for variance calculation) [Figure: BPT and bucket layout before and after a subtree is merged into a single bucket b*]
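
The bucket information listed on this slide is enough to evaluate a merge without revisiting the data. The sketch below is one possible reading, not the authors' implementation: it assumes that area counts cells, that the 2nd moment is the sum of squared cell counts, and that the merge candidates are supplied as a list of pairs rather than discovered by walking BPT subtrees. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BucketInfo:
    area: int       # number of cells covered
    count: float    # total objects in the bucket
    sum_sq: float   # 2nd moment: sum of squared cell counts

    def weighted_var(self) -> float:
        # area * variance, where variance = sum_sq/area - (count/area)^2
        return self.sum_sq - self.count ** 2 / self.area


def merged(a: BucketInfo, b: BucketInfo) -> BucketInfo:
    """All three statistics are additive under a merge."""
    return BucketInfo(a.area + b.area, a.count + b.count, a.sum_sq + b.sum_sq)


def merge_cost(a: BucketInfo, b: BucketInfo) -> float:
    """WVS increase caused by replacing a and b with their union."""
    return merged(a, b).weighted_var() - a.weighted_var() - b.weighted_var()


def best_merge(buckets, candidate_pairs):
    """Pick the candidate pair whose merge increases WVS the least."""
    return min(candidate_pairs,
               key=lambda p: merge_cost(buckets[p[0]], buckets[p[1]]))


buckets = {
    "b1": BucketInfo(area=2, count=6, sum_sq=18),    # cells [3, 3]
    "b2": BucketInfo(area=2, count=8, sum_sq=32),    # cells [4, 4]
    "b5": BucketInfo(area=2, count=20, sum_sq=200),  # cells [10, 10]
}
choice = best_merge(buckets, [("b1", "b2"), ("b2", "b5")])
```

Here merging b1 with b2 raises WVS by 1, whereas merging b2 with b5 raises it by 36, so the first pair is chosen.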

  13. Bucket reorganization - Split • Split the bucket that leads to the maximal WVS decrease [Figure: BPT before and after bucket b* is split into b*1, b*2, b*3, and b*4]
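
A hedged sketch of how a split could be chosen inside one bucket: try every axis-aligned cut and keep the one with the largest WVS decrease. This simplification uses a single binary cut, whereas the figure shows b* divided into four parts; the function names and the sample block of cell counts are hypothetical.

```python
def weighted_var(cells):
    """area * variance for a flat list of cell counts."""
    n = len(cells)
    s = sum(cells)
    ss = sum(c * c for c in cells)
    return ss - s * s / n


def best_cut(block):
    """Return (gain, cut) for the axis-aligned cut that reduces WVS the most.

    block : list of rows of cell counts belonging to one bucket
    cut   : ("row", k) or ("col", k), or None if no cut helps
    """
    rows, cols = len(block), len(block[0])
    whole = weighted_var([c for row in block for c in row])
    best_gain, best = 0.0, None
    for k in range(1, rows):
        top = [c for row in block[:k] for c in row]
        bottom = [c for row in block[k:] for c in row]
        gain = whole - weighted_var(top) - weighted_var(bottom)
        if gain > best_gain:
            best_gain, best = gain, ("row", k)
    for k in range(1, cols):
        left = [c for row in block for c in row[:k]]
        right = [c for row in block for c in row[k:]]
        gain = whole - weighted_var(left) - weighted_var(right)
        if gain > best_gain:
            best_gain, best = gain, ("col", k)
    return best_gain, best


block = [
    [1, 1, 9, 9],
    [1, 1, 9, 9],
]
gain, cut = best_cut(block)
```

Running `best_cut` on every bucket and splitting the one with the largest gain would match the rule stated on the slide.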

  14. Features of AMH • Bucket information is updated as new data arrive • Bucket extents continuously adapt to changes in the data distribution • The maintenance does not affect normal query processing • It is interruptible at any moment • It is performed during CPU idle time

  15. Outline • Introduction • Problem and proposed methods • Adaptive multi-dimensional histogram • Historical synopsis • Prediction model • Experiment • Conclusion

  16. Historical Synopsis • AMH maintains the current buckets. • Past index stores the obsolete buckets. • Past index: • Packed B-tree • 3D R-tree

  17. Prediction Model • Prediction based on velocity doesn’t work! • It is not realistic to assume that velocity remains constant between the current time and the query time • Velocity is highly dynamic • We suggest using only past and present location information for prediction.

  18. Prediction Model (cont.) • Forecast the future using any time series prediction method: we use AR [Figure: data flow among the Historical Synopsis (PT, HT), a Parse step, and the Prediction Model (FT), ending in results]
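
The slides say only that an AR (autoregressive) model is used. As a rough illustration, the sketch below fits a first-order model x_t = c + phi * x_(t-1) by ordinary least squares and iterates it forward. The model order, the fitting method, the function names, and the sample series are all assumptions made for this example.

```python
def fit_ar1(series):
    """Least-squares fit of x_t = c + phi * x_(t-1); returns (c, phi)."""
    x = series[:-1]
    y = series[1:]
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    phi = sxy / sxx
    c = my - phi * mx
    return c, phi


def forecast(series, steps):
    """Iterate the fitted model forward from the last observed value."""
    c, phi = fit_ar1(series)
    value = series[-1]
    out = []
    for _ in range(steps):
        value = c + phi * value
        out.append(value)
    return out


# Hypothetical history of one bucket's count at successive timestamps.
history = [100.0, 90.0, 82.0, 75.6, 70.48]
future = forecast(history, steps=2)
```

The input is a sequence of past and present values only, in line with the suggestion to avoid velocity-based prediction.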

  19. Outline • Introduction • Related work • Problem and proposed methods • Adaptive multi-dimensional histogram • Historical synopsis • Prediction model • Experiment • Conclusion

  20. Experiment settings • Datasets • 2.5M updates for each dataset • spatial: 50K mobile objects from 2 spatial datasets • road: from a spatio-temporal generator (described in [Brinkhoff 2002]) [Figure: road network; data distribution (initial, median, final)]

  21. Robustness with time • Query: qlength = 6% of the data space; 25K queries uniformly distributed over space and time [Figure: panels for the spatial and road datasets]

  22. Comparison with conventional histogram • Minskew (a static spatial histogram) is rebuilt every 50K location updates • tp is the ratio between the cost of AMH and that of Minskew • The reorganization operations of AMH are uniformly distributed among the 50K location updates [Figure: Minskew vs. AMH on the spatial and road datasets]

  23. The effect of update intensity • The B-tree performs better at high update rates • The R-tree provides much faster query response • In general, when the query/update ratio is large (>30%), the R-tree performs better [Figure: B-tree vs. 3D R-tree by query type on the road and spatial datasets]

  24. Conclusion • We present a comprehensive approach for processing queries that refer to any time in history. • The proposed architecture maintains • an incremental multi-dimensional histogram; • a past index structure for storing the outdated buckets. • Future queries are answered by a stochastic method that uses the recent history to predict the future.

  25. Q+A

  26. Summary • Historical Synopsis • AMH: 0. goal: min(WVS) 1. Info update 2. Reorganization happens when the CPU is idle • Past Index (receives old buckets): 1. Recent buckets in memory 2. Old buckets dumped to disk • Prediction Model: forecast based on the present and past

  27. Related work • Static multi-dimensional histograms • Query-adaptive multi-dimensional histograms • Other multi-dimensional approximation methods • Spatio-temporal prediction methods • Spatio-temporal aggregation methods

  28. Evaluation over different query types [Figure: panels for the spatial and road datasets]

  29. Motivation (cont.) • Spatio-temporal database (STDB) research: • historical retrieval • future prediction

  30. Bucket reorganization - Split [Figure: BPT and bucket layout before and after bucket b* is split into b*1, b*2, b*3, and b*4]
