A Dynamic Mobility Histogram Construction Method Based on Markov Chains

Yoshiharu Ishikawa (Nagoya University), Yoji Machida (University of Tsukuba), Hiroyuki Kitagawa (University of Tsukuba)

Outline • Background and Objectives • Modeling Movement Patterns • Mobility Histogram: Logical Structure • Mobility Histogram: Physical Structure • Experimental Results • Conclusions

Background • Advances in GPS and communication technology have enabled the tracking of moving objects • Example: a taxi company in Tokyo monitors more than 200 taxi cabs continuously • Movement data is delivered as a data stream • [Figure: moving objects send movement data as a data stream to a moving object database]

Objectives • Construction and maintenance of a mobility histogram • Compact summary of movement data for a specific time period • Used for mobility analysis and estimation • Problems • Concrete definition of a mobility histogram • How to model movement patterns • Compact representation • Tradeoff with accuracy • Efficient construction and maintenance • Incremental processing for streamed data

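Basic Idea • [Figure: movement data arrives as a data stream at the histogram maintenance module, which applies incremental updates to the mobility histogram; the mobility analysis / estimation module receives requests for analysis or estimation, issues estimation queries against the mobility histogram, and returns the results]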

Approach • 2-D movement area • Uniform cell decomposition • Multiple spatial granularities are allowed (e.g., 4 x 4, 16 x 16) • A movement pattern is represented as a sequence of cell numbers • Based on the Markov chain model • Treats a movement pattern as a Markov chain sequence • A well-known model in traffic modeling

Movement Patterns: Example (1) • Moving objects A, B, C on a 2 x 2 grid; cell numbers, rows top to bottom: 0 1 / 2 3 • Movement pattern of A: 2 → 2 → 0 → 0 • Movement pattern of B: 3 → 3 → 1 → 1 • Movement pattern of C: 0 → 2 → 2 → 3

Movement Patterns: Example (2) • Cell partitioning with different granularities • 4 x 4 grid; cell numbers, rows top to bottom: 0 1 4 5 / 2 3 6 7 / 8 9 12 13 / 10 11 14 15 • Movement pattern of A: 11 → 9 → 3 → 1

Cell Numbering Scheme (1) • Based on the Z-ordering method • Simple encoding method • Assigns similar values to neighboring cells • Translation to different granularities is easy • Example (4 x 4 grid), rows top to bottom: 0 1 4 5 / 2 3 6 7 / 8 9 12 13 / 10 11 14 15

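Cell Numbering Scheme (2) • Notation: c(l) denotes cell number c at partition level l • Binary representation: 0(2) = 0000, 1(2) = 0001, 2(2) = 0010, 3(2) = 0011 • Level-1 decomposition: 2^1 x 2^1 cells • Level-2 decomposition: 2^2 x 2^2 cells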
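The numbering scheme on these two slides can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the names `z_encode` and `coarsen` are made up here, and the bit order (row bit above column bit within each 2-bit pair, row 0 at the top) is inferred from the 4 x 4 grid on the slide.

```python
def z_encode(row: int, col: int, level: int) -> int:
    """Interleave the bits of (row, col) into a Z-order cell number.

    Assumption: row 0 is the top row and, in each 2-bit pair, the row bit is
    the high bit. This reproduces the 4 x 4 grid shown on the slide.
    """
    cell = 0
    for bit in range(level - 1, -1, -1):  # most significant bit first
        cell = (cell << 2) | (((row >> bit) & 1) << 1) | ((col >> bit) & 1)
    return cell


def coarsen(cell: int, from_level: int, to_level: int) -> int:
    """Translate a cell number to a coarser level by dropping 2 bits per level."""
    if to_level > from_level:
        raise ValueError("can only translate to a coarser (or equal) level")
    return cell >> (2 * (from_level - to_level))


# The level-2 grid from the slide, and pattern A translated to level 1.
grid = [[z_encode(r, c, 2) for c in range(4)] for r in range(4)]
pattern_a_level2 = [11, 9, 3, 1]
pattern_a_level1 = [coarsen(c, 2, 1) for c in pattern_a_level2]
```

Because a coarser cell number is just a bit prefix of the finer one, translating a whole movement pattern between granularities is a shift per cell, which is what makes the scheme convenient for multi-granularity histograms.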

Markov Chain Model (example: order = 2) • Step 0 → Step 1 → Step 2 • Level 1: 2(1) → 3(1) → 1(1) • Level 2: 9(2) → 12(2) → 6(2)

Mobility Histogram as a Data Cube • Represents order-n Markov chain statistics as an (n + 1)-dimensional data cube • Example: 1(1) → 1(1) → 0(1)
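As a rough illustration of the logical structure, the sketch below stores the cube sparsely as a counter keyed by (n + 1)-tuples of cell numbers. The function names are hypothetical, and cutting each trajectory into overlapping windows of n + 1 cells is an assumption made here; the slides do not say how input sequences are formed. The sample trajectories are the three level-1 example patterns, read as A: 2, 2, 0, 0; B: 3, 3, 1, 1; C: 0, 2, 2, 3.

```python
from collections import Counter


def build_cube(trajectories, order):
    """Count every window of (order + 1) consecutive cells.

    Each key is one cell of the (order + 1)-dimensional data cube.
    Assumption: windows overlap (slide by one step).
    """
    cube = Counter()
    width = order + 1
    for cells in trajectories:
        for i in range(len(cells) - width + 1):
            cube[tuple(cells[i:i + width])] += 1
    return cube


def transition_probability(cube, prefix, next_cell):
    """Estimate P(next_cell | prefix) from the cube counts."""
    prefix = tuple(prefix)
    total = sum(n for key, n in cube.items() if key[:-1] == prefix)
    return cube[prefix + (next_cell,)] / total if total else 0.0


patterns = [[2, 2, 0, 0], [3, 3, 1, 1], [0, 2, 2, 3]]
cube = build_cube(patterns, order=1)
p_stay_in_2 = transition_probability(cube, [2], 2)
```

A dense array would hold one counter for every possible tuple; the sparse counter only stores the tuples that were observed.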

Histogram Maintenance • Periodic reconstruction • To cope with non-stationary movement patterns • Ease of maintenance • Old histograms are written to disk • [Figure: the histogram maintenance module applies incremental updates from the movement data to the current mobility histogram; the mobility analysis / estimation module issues queries for analysis against the histograms]

Mobility Histogram: Physical Structure • Problems in the logical structure: huge space • 2 GB (!) for a typical parameter setting • Multiple cubes are needed for multiple spatial granularities • Data cubes are sparse: most mobility patterns rarely occur • Solution: tree-based representation • Unification of quad-tree, k-d tree, and trie • Integration of cubes at multiple granularities • Selective allocation of nodes • Saves memory space
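To see why a dense cube grows so quickly, the following back-of-the-envelope sketch counts its cells. The parameters used in the example (level 5, order 2, 2-byte counters) are purely illustrative assumptions; the slides do not state which setting produces the 2 GB figure.

```python
def dense_cube_cells(level: int, order: int) -> int:
    """Number of cells in a dense cube at one partition level."""
    cells_per_step = 4 ** level              # 2^level x 2^level grid
    return cells_per_step ** (order + 1)     # one cube dimension per step


def dense_cube_bytes(level: int, order: int, bytes_per_counter: int) -> int:
    """Storage for a dense cube with fixed-width counters (assumed layout)."""
    return dense_cube_cells(level, order) * bytes_per_counter


# Illustrative only: 32 x 32 grid, order-2 chain, 2-byte counters.
cells = dense_cube_cells(level=5, order=2)
size_gib = dense_cube_bytes(level=5, order=2, bytes_per_counter=2) / 2 ** 30
```

Each additional partition level multiplies the cell count by 4^(n + 1) for an order-n chain, so even modest settings reach gigabytes when every cell is allocated.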

Insertion of 3(2) → 6(2) → 12(2): BASE method • Binary representation • Step 0: 00 11 (= 3) • Step 1: 01 10 (= 6) • Step 2: 11 00 (= 12) • [Figure: tree descending from the root through level 1 and then level 2, following step 0 → step 1 → step 2 at each level; the counter of each node on the insertion path is incremented (+1); legend: visited edge, non-visited edge, counter]
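The insertion shown on this slide can be approximated with a small trie. The sketch below is one reading of the figure, not the authors' implementation: it assumes the path visits the level-1 bit pairs of step 0, step 1, step 2, and then the level-2 bit pairs of the same steps, and that every node on the path carries a counter. The names `Node`, `path_symbols`, `insert`, and `lookup` are hypothetical.

```python
class Node:
    """Tree node with a counter and up to four children (edges 00, 01, 10, 11)."""
    __slots__ = ("count", "children")

    def __init__(self):
        self.count = 0
        self.children = {}


def path_symbols(seq, level):
    """2-bit edge labels: for each level (coarse to fine), one label per step."""
    symbols = []
    for lv in range(1, level + 1):
        shift = 2 * (level - lv)
        for cell in seq:
            symbols.append((cell >> shift) & 0b11)
    return symbols


def insert(root, seq, level):
    """Increment the counter of every node on the path of the sequence."""
    node = root
    node.count += 1
    for sym in path_symbols(seq, level):
        node = node.children.setdefault(sym, Node())  # allocate on demand
        node.count += 1


def lookup(root, seq, level):
    """Count of a sequence given at the stated partition level."""
    node = root
    for sym in path_symbols(seq, level):
        node = node.children.get(sym)
        if node is None:
            return 0
    return node.count


root = Node()
insert(root, [3, 6, 12], level=2)
```

Under this layout a level-1 query such as 0 → 1 → 3 stops after the first three edges, so a single tree answers queries at several granularities, and nodes exist only for patterns that actually occurred.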

Approximated Histogram (APR) • Problem of the BASE method • Memory size requirement is still high • Approximated method (APR) • Compact histogram construction by adaptive tree expansion • Allocate a buffer for each leaf node • If skew is observed, the leaf node is expanded • The χ² statistic is used to check for non-uniformity • The idea is inherited from decision tree construction over streamed data (e.g., VFDT)

Node Expansion • Each leaf node holds a buffer of sequences (trans_seq[0], trans_seq[1], …) • When skew is detected, the leaf node is expanded: it becomes an internal node with children on edges 00, 01, 10, 11, and each new leaf gets its own buffer • Expansion stops when the number of nodes has reached a given constant • [Figure: tree before and after expansion]
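The expansion step can be sketched as follows. This is a simplified illustration under several assumptions that are not in the slides: sequences arrive already converted to fixed-length lists of 2-bit edge labels, a leaf is tested only once its buffer holds `min_samples` entries (a plain threshold standing in for the non-parametric test the authors use when χ² is unreliable), and counts below a leaf are assumed to spread uniformly when estimating. All class and method names are hypothetical.

```python
CHI2_CRITICAL = 7.815  # 5% level, 3 degrees of freedom (value from the slides)


def chi_square(counts):
    """Goodness-of-fit statistic against a uniform distribution."""
    expected = sum(counts) / len(counts)
    if expected == 0:
        return 0.0
    return sum((c - expected) ** 2 / expected for c in counts)


class Leaf:
    def __init__(self):
        self.count = 0
        self.buffer = []  # unconsumed edge labels of each buffered sequence


class Internal:
    def __init__(self, count):
        self.count = count
        self.children = [Leaf() for _ in range(4)]  # edges 00, 01, 10, 11


class AprTree:
    def __init__(self, max_nodes, min_samples=20):
        self.root = Leaf()
        self.nodes = 1
        self.max_nodes = max_nodes      # expansion stops at this node budget
        self.min_samples = min_samples  # hypothetical guard for small buffers

    def insert(self, symbols):
        """Insert one sequence (all sequences must have the same length)."""
        parent, slot, node, depth = None, None, self.root, 0
        while isinstance(node, Internal):
            node.count += 1
            parent, slot = node, symbols[depth]
            node = node.children[slot]
            depth += 1
        node.count += 1
        if depth < len(symbols):
            node.buffer.append(symbols[depth:])
            self._maybe_expand(parent, slot, node)

    def _maybe_expand(self, parent, slot, leaf):
        if len(leaf.buffer) < self.min_samples or self.nodes + 4 > self.max_nodes:
            return
        counts = [0, 0, 0, 0]
        for rest in leaf.buffer:
            counts[rest[0]] += 1
        if chi_square(counts) <= CHI2_CRITICAL:
            return  # next steps look uniform: keep the leaf as it is
        node = Internal(leaf.count)
        for rest in leaf.buffer:  # hand buffered sequences to the new leaves
            child = node.children[rest[0]]
            child.count += 1
            if len(rest) > 1:
                child.buffer.append(rest[1:])
        self.nodes += 4
        if parent is None:
            self.root = node
        else:
            parent.children[slot] = node

    def estimate(self, symbols):
        """Estimated count; below a leaf the count is spread uniformly."""
        node, depth = self.root, 0
        while isinstance(node, Internal) and depth < len(symbols):
            node = node.children[symbols[depth]]
            depth += 1
        return node.count / 4 ** (len(symbols) - depth)


skewed = AprTree(max_nodes=100)
for i in range(20):
    skewed.insert([3, i % 4])   # every sequence starts on edge 11

uniform = AprTree(max_nodes=100)
for i in range(20):
    uniform.insert([i % 4, 0])  # first edge is spread evenly
```

In the skewed case the root is expanded after the twentieth insertion, while the uniform tree stays a single leaf and answers queries from its total count alone.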

Non-uniformity Check • Use of the χ² test for goodness of fit • Null hypothesis: the distribution is uniform • If the χ² value > 7.815, the distribution is non-uniform at the 5% significance level • Buffer: 4(2) → 12(2) → 6(2), 5(2) → 12(2) → 9(2), …, 7(2) → 13(2) → 15(2) • Distribution of next steps over x00, x01, x10, x11 • Example: 100 sequences in the buffer • Uniform: 22, 23, 27, 28 • Non-uniform: 10, 20, 50, 20
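The test on this slide is easy to reproduce. The sketch below computes the χ² goodness-of-fit statistic against a uniform distribution for the two example distributions, read from the slide as (22, 23, 27, 28) and (10, 20, 50, 20). The function names are made up for illustration; 7.815 is the 5% critical value for 3 degrees of freedom quoted on the slide.

```python
CHI2_CRITICAL_DF3_5PCT = 7.815  # four categories give 3 degrees of freedom


def chi_square_uniform(counts):
    """Sum of (observed - expected)^2 / expected with equal expected counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def is_non_uniform(counts, critical=CHI2_CRITICAL_DF3_5PCT):
    """Reject the uniform null hypothesis when the statistic exceeds critical."""
    return chi_square_uniform(counts) > critical


uniform_example = [22, 23, 27, 28]  # 100 sequences, expected 25 per category
skewed_example = [10, 20, 50, 20]   # 100 sequences, expected 25 per category

uniform_stat = chi_square_uniform(uniform_example)
skewed_stat = chi_square_uniform(skewed_example)
```

With an expected count of 25 per category, the first distribution gives (9 + 4 + 4 + 9) / 25 = 1.04 and the second gives (225 + 25 + 625 + 25) / 25 = 36, so only the second exceeds 7.815.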

Problems in Statistical Test • Problem: the χ² value is not reliable • when the total number is small (example counts 1, 2, 1, 4: total number = 1 + 2 + 1 + 4 = 8) • when some value is close to 0 (example counts 0, 10, 20, 25) • These situations are common in our case • Solution: use non-parametric statistics while the χ² value is not reliable • Details are given in the paper

Use of Bitmap Cube (APR-BM) • Minor improvement to the APR method • Uses a small bitmap cube in addition to the tree-structured histogram • The bitmap cube represents a “correct” summary at some coarse level • Improves precision: accurate estimation for some queries • Example: when partition level = 3 and Markov order = 2, bitmap size = 32 KB • [Figure: tree-based histogram (APR method) with node counters at level 1 and level 2, plus a small bitmap cube at a coarse level]

Dataset and Environments • Experimental data • Generated with the moving objects simulator by Brinkhoff • 1024 × 1024 cells at the finest granularity • 1,000 moving objects are on the map at every time instant • Environment • CPU: Pentium 4 3.2 GHz • Memory: 1 GB RAM • OS: Cygwin

Histogram Size • Settings • Data size: 1K, 10K, 50K • Order-2 Markov transitions • Results • The BASE method requires huge storage • [Figure: histogram size (MB) versus data size]

Construction Time • Comparison of BASE and APR • M: maximal partitioning level (granularity of input sequences) • Results • BASE has a small construction cost • APR has nearly O(n²) cost due to the non-uniformity check, but its processing cost is still small (less than 0.15 ms per input sequence) • [Figures: construction time and construction time per sequence; series: M = 5 BASE, M = 5 APR, M = 10 BASE, M = 10 APR]

Query Processing Time • Two types of queries • Fine-level: queries issued on the finest partitioning level (M = 10) • Mixed-level: queries issued on randomly mixed partitioning levels • Results • Comparison of BASE and APR: no difference • Quite fast • [Figure: query processing time of BASE and APR for fine-level and mixed-level queries]

Accuracy: Histogram Plot (1) • Order-1 Markov chain histograms • Partition level = 2 • [Figures: BASE (“true” count) and APR]

Accuracy: Histogram Plot (2) • Histogram difference • Diff count = |BASE count – APR count|

Precision: Evaluation Measures • Distance • Relative Error • ACTi: Actual cell value (BASE method) • ESTi: Estimated cell value (APR and APR-BM methods)

Evaluation of Precision • Comparison of APR and APR-BM using “Distance” and “Relative Error” • Results • Similar results for Distance • APR-BM is better in terms of Relative Error • APR-BM can estimate small cell values accurately • [Figures: Distance and Relative Error]

Conclusions • Mobility histogram construction method • Based on Markov chain model • Handling streamed trajectory sequences • Logical histogram: data cube • Physical histogram: tree structure (quad tree + k-d tree) • Adaptive tree growth • Approximated representation method • Use of nonparametric statistics for exceptional cases • Use of a bitmap cube to enhance precision
