A Dynamic Mobility Histogram Construction Method Based on Markov Chains. Yoshiharu Ishikawa (Nagoya University), Yoji Machida (University of Tsukuba), Hiroyuki Kitagawa (University of Tsukuba).
Outline • Background and Objectives • Modeling Movement Patterns • Mobility Histogram: Logical Structure • Mobility Histogram: Physical Structure • Experimental Results • Conclusions
Background • Advances in GPS and communication technology have enabled the tracking of moving objects • Example: a taxi company in Tokyo continually monitors more than 200 taxi cabs • Movement data is delivered as a data stream [Figure: moving objects send movement data as a data stream to a moving object database]
Objectives • Construction and maintenance of a mobility histogram • Compact summary of movement data for a specific time period • Used for mobility analysis and estimation • Problems • Concrete definition of a mobility histogram • How to model movement patterns • Compact representation • Tradeoff with accuracy • Efficient construction and maintenance • Incremental processing for streamed data
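Basic Idea [Figure: movement data (as a data stream) enters the histogram maintenance module, which applies incremental updates to the mobility histogram; the mobility analysis/estimation module receives requests for analysis or estimation, issues queries for estimation against the mobility histogram, and returns results]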
Approach • 2-D movement area • Uniform cell decomposition • Multiple spatial granularities are allowed (e.g., 4 x 4, 16 x 16) • A movement pattern is represented as a sequence of cell numbers • Based on the Markov chain model • Treats a movement pattern as a Markov chain sequence • Well-known model in traffic modeling
Movement Patterns: Example (1) [Figure: objects A, B, and C moving across cells numbered 0 to 3, with the movement pattern of each object given as a sequence of cell numbers]
Movement Patterns: Example (2) • Cell partitioning with different granularities • Movement pattern of A: 11 9 3 1
[Figure: object A moving over the 4 x 4 grid below]
 0  1  4  5
 2  3  6  7
 8  9 12 13
10 11 14 15
Cell Numbering Scheme (1) • Based on the Z-ordering method • Simple encoding method • Assigns similar values to neighboring cells • Translation to different granularities is easy
 0  1  4  5
 2  3  6  7
 8  9 12 13
10 11 14 15
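Cell Numbering Scheme (2) • Level-1 (2^1 x 2^1) decomposition • Level-2 (2^2 x 2^2) decomposition • Level-2 cell numbers in binary, where x(l) denotes cell x at partitioning level l: 0(2) = 0000, 1(2) = 0001, 2(2) = 0010, 3(2) = 0011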
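The Z-ordering numbering and the translation between granularities can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `z_encode` and `to_coarser` are hypothetical, and the bit order (row bit before column bit in each 2-bit pair) is inferred from the 4 x 4 grid shown on the slides.

```python
def z_encode(row, col, level):
    """Return the Z-order cell number of (row, col) in a 2^level x 2^level grid.

    Assumption: each 2-bit pair holds the row bit first, then the column bit,
    which reproduces the 4 x 4 numbering shown on the slides.
    """
    code = 0
    for bit in range(level - 1, -1, -1):
        row_bit = (row >> bit) & 1
        col_bit = (col >> bit) & 1
        code = (code << 2) | (row_bit << 1) | col_bit
    return code


def to_coarser(code, level, target_level):
    """Translate a level-`level` cell number to the coarser `target_level`."""
    if target_level > level:
        raise ValueError("target_level must not be finer than level")
    # Each level contributes one 2-bit pair, so dropping pairs coarsens the cell.
    return code >> (2 * (level - target_level))


grid = [[z_encode(r, c, 2) for c in range(4)] for r in range(4)]
coarse = [to_coarser(cell, 2, 1) for cell in (9, 12, 6)]
```

Because a coarser cell number is a bit prefix of the finer one, translation is a single shift, which is presumably what makes the scheme convenient for multiple granularities.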
Markov Chain Model (example: order = 2)
         Step 0  Step 1  Step 2
Level 1: 2(1)    3(1)    1(1)
Level 2: 9(2)    12(2)   6(2)
Mobility Histogram as a Data Cube • Order-n Markov chain statistics are represented as an (n+1)-dimensional data cube • Example: 1(1) 1(1) 0(1)
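As a rough illustration of the logical structure, the data cube can be modeled as a sparse counter keyed by (n+1)-step cell sequences. The sketch below is an assumption-laden toy: the names `build_cube` and `dense_cube_cells` are hypothetical, and it assumes every input sequence has exactly n+1 steps.

```python
from collections import Counter


def dense_cube_cells(level, order):
    """Number of cells in a dense (order+1)-dimensional cube at one level.

    A level-l decomposition has 2^l x 2^l = 4^l cells per dimension.
    """
    return (4 ** level) ** (order + 1)


def build_cube(sequences, order):
    """Count each (order+1)-step sequence; missing keys mean a count of 0."""
    cube = Counter()
    for seq in sequences:
        if len(seq) != order + 1:
            raise ValueError("each sequence must have order + 1 steps")
        cube[tuple(seq)] += 1
    return cube


# Order-2 sequences of level-2 cell numbers (illustrative input).
cube = build_cube([[4, 12, 6], [5, 12, 9], [4, 12, 6]], order=2)
```

Here `cube[(4, 12, 6)]` is 2, while a dense cube for the same setting would already need `dense_cube_cells(2, 2)` = 4096 cells, which hints at why the dense form grows quickly with level and order.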
Histogram Maintenance • Periodic reconstruction • Copes with non-stationary movement patterns • Eases maintenance • Old histograms are written to disk [Figure: movement data enters the histogram maintenance module, which applies incremental updates to the current mobility histogram; the mobility analysis/estimation module issues queries for analysis against the histograms]
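Periodic reconstruction could look roughly like the following sketch, which starts a fresh histogram for each time period and writes the finished one to disk. Everything here is an assumption made for illustration: the class name `PeriodicHistogram`, the fixed period length, the JSON file layout, and the use of a flat counter in place of the actual histogram structure.

```python
import json
import os
from collections import Counter


class PeriodicHistogram:
    """Keeps one in-memory histogram per time period (hypothetical design)."""

    def __init__(self, period, archive_dir):
        self.period = period            # period length, in timestamp units
        self.archive_dir = archive_dir  # where finished histograms are written
        self.period_index = None
        self.counts = Counter()

    def add(self, timestamp, seq):
        index = int(timestamp // self.period)
        if self.period_index is None:
            self.period_index = index
        elif index > self.period_index:
            # A new period has started: archive the old histogram, start fresh.
            self._archive()
            self.period_index = index
            self.counts = Counter()
        # Simplification: late data from an older period is counted as current.
        self.counts[tuple(seq)] += 1

    def _archive(self):
        name = f"histogram_{self.period_index}.json"
        rows = [[list(key), count] for key, count in self.counts.items()]
        with open(os.path.join(self.archive_dir, name), "w") as f:
            json.dump(rows, f)
```

Only the current histogram receives incremental updates; archived ones are read-only files that an analysis module could load when a query targets an earlier period.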
Mobility Histogram: Physical Structure • Problem in the logical structure: huge space • 2GB (!) for a typical parameter setting • Multiple cubes are needed for multiple spatial granularities • Data cubes are sparse: most mobility patterns rarely occur • Solution: tree-based representation • Unification of quad-tree, k-d tree, and trie • Integration of cubes at multiple granularities • Selective allocation of nodes • Saves memory space
Insertion of 3(2) 6(2) 12(2): BASE method • Binary representation • Step 0: 00 11 (= 3) • Step 1: 01 10 (= 6) • Step 2: 11 00 (= 12) [Figure: tree from the root through level 1 and level 2, each divided into steps 0, 1, and 2; legend: visited edge, non-visited edge, counter; counters along the visited path are incremented (+1)]
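One way to read the insertion example is as a trie whose edges are 2-bit pairs, visited level by level and, within each level, step by step. The sketch below follows that reading. It is not the authors' implementation: the names `Node`, `path_edges`, `insert`, and `lookup` are hypothetical, and keeping a counter on every visited node is an assumption.

```python
class Node:
    def __init__(self):
        self.count = 0      # number of inserted sequences passing through here
        self.children = {}  # 2-bit edge label (0..3) -> Node, allocated on demand


def path_edges(seq, level):
    """Edge labels for a sequence of level-`level` cell numbers.

    Order is level-major: all steps at level 1, then all steps at level 2, ...
    """
    edges = []
    for current in range(1, level + 1):
        shift = 2 * (level - current)
        for cell in seq:
            edges.append((cell >> shift) & 0b11)
    return edges


def insert(root, seq, max_level):
    node = root
    node.count += 1
    for edge in path_edges(seq, max_level):
        node = node.children.setdefault(edge, Node())
        node.count += 1


def lookup(root, seq, level):
    """Count of inserted sequences matching `seq`, given at level `level`."""
    node = root
    for edge in path_edges(seq, level):
        node = node.children.get(edge)
        if node is None:
            return 0
    return node.count


root = Node()
insert(root, [3, 6, 12], max_level=2)
insert(root, [3, 6, 13], max_level=2)
```

With this layout the counts for a coarse level sit on the same path as the finer ones, so `lookup(root, [0, 1, 3], 1)` and `lookup(root, [3, 6, 12], 2)` are answered by the same tree.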
Approximated Histogram (APR) • Problem of the BASE method • Memory requirement is still high • Approximated method (APR) • Compact histogram construction by adaptive tree expansion • A buffer is allocated for each leaf node • If skew is observed, the leaf node is expanded • The χ² statistic is used to check non-uniformity • The idea is inherited from decision tree construction over streamed data (e.g., VFDT)
Node Expansion • Each leaf node has a buffer • When skew is detected in a leaf node's buffer, the leaf node is expanded • Expansion stops when the number of nodes has reached a given constant [Figure: tree before and after expansion, showing internal nodes, leaf nodes with buffers, and the levels trans_seq[0] and trans_seq[1]]
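The expansion rule can be sketched as follows. This is a simplified toy under several stated assumptions, not the APR method as published: sequences arrive as lists of 2-bit edge labels, the skew test is passed in as a function `is_skewed`, a leaf always expands into four children, the buffer is cleared after each check, and estimates below an unexpanded leaf assume a uniform spread. All names are hypothetical.

```python
class Node:
    def __init__(self):
        self.count = 0        # sequences that passed through this node
        self.children = None  # list of 4 nodes once expanded, else None
        self.buffer = []      # remaining edge labels of sequences stopped here


class AdaptiveTree:
    def __init__(self, is_skewed, buffer_size, max_nodes):
        self.is_skewed = is_skewed      # function: list of 4 counts -> bool
        self.buffer_size = buffer_size
        self.max_nodes = max_nodes      # expansion stops at this node budget
        self.root = Node()
        self.num_nodes = 1

    def insert(self, edges):
        node, i = self.root, 0
        node.count += 1
        while node.children is not None and i < len(edges):
            node = node.children[edges[i]]
            node.count += 1
            i += 1
        if i < len(edges):
            node.buffer.append(list(edges[i:]))
            if len(node.buffer) >= self.buffer_size:
                self._check_leaf(node)

    def _check_leaf(self, leaf):
        dist = [0, 0, 0, 0]
        for rest in leaf.buffer:
            dist[rest[0]] += 1
        if self.num_nodes + 4 <= self.max_nodes and self.is_skewed(dist):
            leaf.children = [Node() for _ in range(4)]
            self.num_nodes += 4
            for rest in leaf.buffer:  # redistribute buffered sequences
                child = leaf.children[rest[0]]
                child.count += 1
                if len(rest) > 1:
                    child.buffer.append(rest[1:])
        leaf.buffer = []

    def estimate(self, edges):
        node, i = self.root, 0
        while node.children is not None and i < len(edges):
            node = node.children[edges[i]]
            i += 1
        # Below an unexpanded leaf, assume counts are spread uniformly.
        return node.count / 4 ** (len(edges) - i)
```

The node budget trades memory for accuracy: regions with skewed movement get deeper subtrees, while uniform regions stay as single leaves whose counts are divided evenly at query time.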
Non-uniformity Check • Use of the χ² test for goodness of fit • Null hypothesis: the distribution is uniform • If the χ² value > 7.815, the distribution is non-uniform at the 5% significance level • Buffer: 4(2) 12(2) 6(2), 5(2) 12(2) 9(2), …, 7(2) 13(2) 15(2) • Distribution of next steps: x00, x01, x10, x11 • Example: 100 sequences in the buffer • Uniform: 22, 23, 27, 28 • Non-uniform: 10, 20, 50, 20
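The check on the two example distributions can be reproduced with a few lines of Python. This is a minimal sketch of the standard χ² goodness-of-fit statistic against a uniform expectation; the function names are hypothetical, and 7.815 is the threshold quoted on the slide (three degrees of freedom for four categories).

```python
CHI2_THRESHOLD = 7.815  # 5% significance level, 3 degrees of freedom


def chi_square_uniform(counts):
    """Chi-square statistic of `counts` against a uniform distribution."""
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must not be all zero")
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def is_non_uniform(counts, threshold=CHI2_THRESHOLD):
    """True when the uniformity hypothesis is rejected."""
    return chi_square_uniform(counts) > threshold


uniform_value = chi_square_uniform([22, 23, 27, 28])
skewed_value = chi_square_uniform([10, 20, 50, 20])
```

With an expected count of 25 per category, the first distribution gives 26/25 = 1.04 and the second gives 900/25 = 36, so only the second exceeds 7.815.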
Problems in Statistical Test • Problem: the χ² value is not reliable • when the total number is small (example: 1, 2, 1, 4; total number = 1 + 2 + 1 + 4 = 8) • when some value(s) are close to 0 (example: 0, 10, 20, 25) • These situations are common in our case • Solution: use non-parametric statistics while the χ² value is not reliable • Details are given in the paper
Use of Bitmap Cube (APR-BM) • Minor improvement to the APR method • Uses a small bitmap cube in addition to the tree-structured histogram • Represents a “correct” summary at some coarse level • Improves precision • Tree-based histogram (APR method) + small bitmap cube at a coarse level: accurate estimation for some queries • Example: when partition level = 3 and Markov order = 2, the bitmap size is 32KB (64^3 = 262,144 cells at one bit each) [Figure: tree-based histogram with counters on its nodes at levels 1 and 2]
Dataset and Environments • Experimental data • Generated with the moving objects simulator by Brinkhoff • 1024 × 1024 cells at the finest granularity • 1,000 moving objects are on the map at every time instant • Environment • CPU: Pentium 4 3.2GHz • Memory: 1GB RAM • OS: Cygwin
Histogram Size • Settings • Data size: 1K, 10K, 50K • Order-2 Markov transition • Results • The BASE method requires huge storage [Figure: histogram size (MB) versus data size]
Construction Time • Comparison of BASE and APR • M: maximal partitioning level (granularity of input sequences) • Results • BASE has a small construction cost • APR has nearly O(n^2) cost due to the non-uniformity check, but its processing cost is still small (less than 0.15 ms per input sequence) [Figures: construction time and construction time per sequence, for BASE and APR at M = 5 and M = 10]
Query Processing Time • Two types of queries • Fine-level: queries are issued at the finest partitioning level (M = 10) • Mixed-level: queries are issued at randomly mixed partitioning levels • Results: comparison of BASE and APR • No difference • Quite fast [Figure: query processing time of BASE and APR for fine-level and mixed-level queries]
Accuracy: Histogram Plot (1) • Order-1 Markov chain histograms • Partition level = 2 [Figures: BASE (“true” count) and APR]
Accuracy: Histogram Plot (2) Histogram Difference Diff Count = |Base count – APR count|
Precision: Evaluation Measures • Distance • Relative Error • ACTi: Actual cell value (BASE method) • ESTi: Estimated cell value (APR and APR-BM methods)
Evaluation of Precision • Comparison of APR and APR-BM using “Distance” and “Relative Error” • Results • Similar results for Distance • APR-BM is better in terms of Relative Error • APR-BM can estimate small cell values accurately [Figures: Distance and Relative Error]
Conclusions • Mobility histogram construction method • Based on Markov chain model • Handling streamed trajectory sequences • Logical histogram: data cube • Physical histogram: tree structure (quad tree + k-d tree) • Adaptive tree growth • Approximated representation method • Use of nonparametric statistics for exceptional cases • Use of a bitmap cube to enhance precision