People Forecasting: Where are people going? • Vibhav Gogate, Computer Science • Rina Dechter, Computer Science • Other collaborators: Bozhena Bidyuk, Computer Science; James Marca, Transportation Science; Craig Rindt, Transportation Science • University of California, Irvine, CA 92967
Motivation • Origin/Destination (O-D) matrix • A necessary input to most microscopic simulation models in the transportation literature. • The old method (Peeta et al. 2002) uses paper-and-pencil surveys to generate O-D matrices. • Can IT help? • Our proposed “Activity Model” learns and predicts a user’s origins, destinations, and routes from their GPS log. • Our hope is that even if only a small sample of the population agrees to share GPS data, we can compute the aggregate O-D matrix for a large area such as a city.
Architecture • [Figure: system diagram with components Learning Engine, Probabilistic Model, GIS Database, and Inference Engine; output: O-D matrix for a region]
Probabilistic model: Hybrid Dynamic Mixed Networks • Extends Hybrid Dynamic Bayesian Networks (Lerner 2002) to include discrete constraints • Able to model all of the following: • Discrete and continuous Gaussian variables • Markov processes • Deterministic information (constraint networks) combined with probabilistic information (Bayesian networks)
Building the activity model • D: Time of day (discrete) • W: Day of week (discrete) • Goal: Ranges over the collection of locations where the person spends a significant amount of time (discrete) • F: Counter that controls goal switching • Route: A hidden variable that predicts which path the person takes (discrete) • Location: A pair (e, d), where e is the edge the person is on and d is the person's distance from one of the end-points of that edge (continuous) • Velocity: Continuous • GPS reading: (lat, lon, spd, utc) • [Figure: two-slice network over the variables d, w, g, F, r, v, l, y at times t-1 and t]
Constraints in the model • Counter update: • If distance(lt-1, gt-1) <= threshold and Ft-1 = 0, then Ft = D • If distance(lt-1, gt-1) <= threshold and Ft-1 > 0, then Ft = (Ft-1) - 1 • If distance(lt-1, gt-1) > threshold and Ft-1 = 0, then Ft = 0 • If distance(lt-1, gt-1) > threshold and Ft-1 > 0, then Ft = 0 • Goal update: • If Ft-1 > 0 and Ft = 0, then gt is given by P(gt|gt-1) • If Ft-1 = 0 and Ft = 0, then gt is the same as gt-1 • If Ft-1 > 0 and Ft > 0, then gt is the same as gt-1 • If Ft-1 = 0 and Ft > 0, then gt is given by P(gt|gt-1) • [Figure: network fragment over gt-1, gt, Ft-1, Ft, lt-1]
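The eight rules on this slide can be read as two small deterministic functions. The sketch below is only an illustration of that reading, not the authors' implementation: the function names, the argument `dwell` (standing in for the value written as D in the first rule), and the use of a precomputed distance are all assumptions.

```python
def next_counter(dist_to_goal, f_prev, threshold, dwell):
    """Counter update F_t from the four counter rules (names are hypothetical).

    dist_to_goal stands for distance(l_{t-1}, g_{t-1}); dwell stands for the
    value written as D on the slide.
    """
    if dist_to_goal <= threshold:
        # Near the goal: start the counter, or count it down by one.
        return dwell if f_prev == 0 else f_prev - 1
    # Away from the goal: the counter is 0 whatever its previous value was.
    return 0


def goal_may_switch(f_prev, f_curr):
    """True when g_t is drawn from P(g_t | g_{t-1}); False when g_t = g_{t-1}.

    Per the four goal rules, the goal is resampled exactly when one of
    F_{t-1}, F_t is zero and the other is positive.
    """
    return (f_prev > 0) != (f_curr > 0)
```

Under this reading, the goal stays fixed while the counter is counting down and while it stays at zero, and it is resampled only on the steps where the counter starts or expires.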
Example Queries • Where will the person be 10 minutes from now? • P(lT | d1:t, w1:t, y1:t), where T = t + 10 minutes • What is the person’s next goal? • P(gT | d1:t, w1:t, y1:t)
Example of Route • [Figure: map with the labels "Grocery store", "Route Seen", and "Route Predicted"]
Contributions • A new modeling framework of “Hybrid Dynamic Mixed Networks” • A “Hybrid Dynamic Mixed Network model” for transportation routines • Predicts origins/destinations and routes taken by an individual. • Novel inference algorithms for reasoning in Hybrid Dynamic Mixed Networks • An Expectation Propagation based algorithm • A new algorithm that combines Particle Filtering and Generalized Belief Propagation in a systematic way.
Inference in Hybrid Dynamic Mixed Networks (HDMN) • Filtering problem: • The Belief state at time t given evidence until time t : P(Xt|e1:t) • Complexity of exact inference: NP-hard • Exponential in treewidth • Discrete Dynamic Mixed Networks • Treewidth = number of variables in each time slice • Hybrid Dynamic Mixed Networks • Treewidth = O(T) where T is the number of time-slices. • Approximation is a must in most cases!
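To make the filtering problem concrete, here is a minimal sketch of one exact filtering step for a tiny, purely discrete state space. It is the standard predict-then-update recursion for a first-order Markov model, not the HDMN algorithms from this talk; the function name and the list-based representation are assumptions made for illustration. Enumerating states like this is exactly what becomes infeasible as the number of variables per time slice grows.

```python
def filter_step(belief, transition, likelihood):
    """One exact filtering step over a small discrete state space.

    belief[i]        = P(X_{t-1} = i | e_{1:t-1})
    transition[i][j] = P(X_t = j | X_{t-1} = i)
    likelihood[j]    = P(e_t | X_t = j)
    Returns the list P(X_t = j | e_{1:t}).
    """
    n = len(belief)
    # Predict: push the previous belief through the transition model.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # Update: weight by the evidence likelihood, then normalize.
    unnormalized = [predicted[j] * likelihood[j] for j in range(n)]
    z = sum(unnormalized)
    return [u / z for u in unnormalized]
```

Calling `filter_step` once per time slice, feeding each result back in as `belief`, yields the belief state at every step.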
Approximate Inference • Two popular approximate inference algorithms for Dynamic Networks • Generalized Belief Propagation (Heskes et al. ’02) • Rao-Blackwellised Particle Filtering (Doucet et al. ’02) • Our contribution: Extend these two algorithms to allow discrete constraints • Iterative Join Graph Propagation-Sequential (IJGP-S) • A new Rao-Blackwellised Particle Filtering (RBPF) algorithm called IJGP-RBPF. • Uses the output of IJGP to compute an importance function • Parameterized by two complexity parameters, “i” and “w”, which provide a range of algorithms to choose from.
Experimental Results: Data Collection • GPS data was collected by one of the authors over a period of 6 months. • Latitude and longitude pairs • 3 months of data were used for training and 3 months for testing. • Data was divided into segments • A segment is a series of GPS readings in which any two consecutive readings are less than 15 minutes apart.
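The segment definition above can be sketched as a single pass over the reading timestamps. This is an illustrative reading of the rule, assuming timestamps are given in seconds; the function name and the `max_gap_s` parameter are hypothetical.

```python
def split_into_segments(timestamps, max_gap_s=15 * 60):
    """Group reading timestamps (in seconds) into segments.

    A reading joins the current segment only if it is strictly less than
    max_gap_s after the previous reading; otherwise it starts a new segment.
    """
    segments = []
    for ts in sorted(timestamps):
        if segments and ts - segments[-1][-1] < max_gap_s:
            segments[-1].append(ts)
        else:
            segments.append([ts])
    return segments
```

For example, readings at 0, 600, and 1200 seconds form one segment, while a reading arriving exactly 900 seconds after the previous one starts a new segment, because the gap is not less than 15 minutes.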
Experimental Results: Models and Algorithms • Test whether adding new variables improves prediction accuracy. • Model-1: Model as described before • Model-2: Remove variables dt and wt • Model-3: Remove variables dt, wt, Ft, rt, gt from each time slice. • Algorithms: • IJGP-RBPF(1,2), IJGP-RBPF(2,1), IJGP-S(1) and IJGP-S(2)
Various Activity models • [Figure: the two-slice network over d, w, g, F, r, v, l, y at times t-1 and t, with regions marked Model-1, Model-2, and Model-3]
Learning the models from data • The EM algorithm is used for learning the models • Learning from 3 months of data takes about 3 to 5 days. • Since EM uses inference as a sub-step, we have 4 EM algorithms corresponding to the 4 algorithms used for inference • IJGP-RBPF(1,2), IJGP-RBPF(2,1), IJGP-S(1) and IJGP-S(2)
Predicting Goals (Model-1) • Compute P(gt|e1:t) and compare it with the actual goal. • Accuracy = percentage of goals predicted correctly. • N = number of particles • Column: learning algorithm • Row: inference algorithm
Predicting Goals (Model-2) • Compute P(gt|e1:t) and compare it with the actual goal. • Accuracy = percentage of goals predicted correctly. • N = number of particles • Column: learning algorithm • Row: inference algorithm
Predicting Goals (Model-3) • Compute P(gt|e1:t) and compare it with the actual goal. • Accuracy = percentage of goals predicted correctly. • N = number of particles • Column: learning algorithm • Row: inference algorithm
Predicting Routes • Compare the path predicted by the model with the person's actual path. • False positives (FP), which determine precision • Count the number of roads that were not taken by the person but were in the predicted path. • False negatives (FN), which determine recall • Count the number of roads that were taken by the person but were not in the predicted path.
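The two counts can be computed with set differences over road identifiers. The sketch below assumes each path is given as a collection of road (edge) identifiers and that repeated traversals of the same road are counted once; the function name is hypothetical, and the precision and recall formulas are the standard ones rather than anything stated on the slide.

```python
def route_errors(predicted_edges, actual_edges):
    """Return (fp, fn, precision, recall) for a predicted path.

    fp: roads in the predicted path that the person did not take.
    fn: roads the person took that are missing from the predicted path.
    """
    predicted, actual = set(predicted_edges), set(actual_edges)
    fp = len(predicted - actual)
    fn = len(actual - predicted)
    tp = len(predicted & actual)
    # Guard against empty paths to avoid dividing by zero.
    precision = tp / (tp + fp) if predicted else 0.0
    recall = tp / (tp + fn) if actual else 0.0
    return fp, fn, precision, recall
```

Fewer false positives raise precision and fewer false negatives raise recall, which is why the slide pairs each count with one of the two measures.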
False Positives and False Negatives for Route Prediction • Model-1 shows the highest route prediction accuracy, as indicated by its low false positives and false negatives.
Future Work: O-D estimation through Simulation • Randomly generate regions and a population • Land-use structures through Microsoft MapPoint • Assume an activity model per individual • Simulation gives an aggregate O-D matrix (called actual O-D) • Take a random sample of the population and use their GPS data • Our proposed system would predict an O-D matrix (predicted O-D) • Success measure: distance between the predicted and actual O-D matrices.
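As a rough illustration of the success measure, the sketch below aggregates trips into an O-D matrix and compares two such matrices. The slide does not say which distance is intended, so the Euclidean (Frobenius) distance used here is purely an assumption, as are the function names and the representation of trips as (origin, destination) pairs.

```python
from collections import Counter
import math


def od_matrix(trips):
    """Count trips per (origin, destination) pair; a sparse O-D matrix."""
    return Counter(trips)


def od_distance(od_a, od_b):
    """Euclidean distance between two sparse O-D matrices (assumed metric)."""
    keys = set(od_a) | set(od_b)
    # Pairs missing from one matrix count as zero trips.
    return math.sqrt(sum((od_a.get(k, 0) - od_b.get(k, 0)) ** 2 for k in keys))
```

A predicted matrix built from a small sample would first need to be scaled up to the full population before such a comparison is meaningful; how to do that scaling is part of the open aggregation question.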
Challenges • Scalable algorithms • Our algorithms take about 3-5 days per individual! • Does the proposed simulation represent the real world? • A model for data aggregation • Inference and learning for other continuous frameworks • Poisson distribution • Discrete children of continuous parents