Activity-based Modelling: An overview (and some things we have been doing to advance state-of-the-art) E. Zwerts (With the cooperation of E. Moons and D. Janssens) Transportation Research Institute, Data Analysis and Modelling Group, Faculty of Applied Economic Sciences, Limburgs Universitair Centrum, Diepenbeek, Belgium, E-mail: enid.zwerts@luc.ac.be
Outline • Why transportation modelling? • Which kinds of transportation modelling? • Why activity-based transportation modelling? • Which activity-based transportation model? • Model Selection: Albatross • What is Albatross? • Things we have been doing and are still going to do with respect to Albatross • Introduction of an alternative modelling approach based on sequential dependencies in data (short version)
Why transportation modelling? • The transportation problem is multi-dimensional: • Traffic jams • CO2 emissions • Impact on the economy • Traffic accidents with a significant number of casualties in Belgium • The need for transportation infrastructure is high, due to: • Globalization • Urbanization • Governments cannot afford transportation constraints to have a negative impact on future competitiveness, foreign investments,… • However, changing the existing infrastructure is: • Expensive, with significant long-term effects • Not guaranteed to succeed • Not trivial (existing spatial zones, restricted by local and federal regulations, legislation, etc.)
Why transportation modelling? • Therefore transportation models are often used. They can: • Support management decision making • Make predictions in uncertain circumstances: • Changing infrastructure, environment • Changing behaviour of people • Changing socio-demographic circumstances • ... • The aim of these models is to portray reality as accurately as possible • They are frequently used in different countries
Transportation modelling: Trip-based Approach • Trips are modelled as independent and isolated, with no connections between the different trips • No time component • No direction • No sequential information
[Figure, Reality: At home → Work (7.30h, PT) → Play squash (12h, by foot) → Work (12.50h, by foot) → At home (16.40h, PT) → Family visit (19h, car) → At home (22h, car)]
[Figure, Trip-based model: At home / Work: PT, 2X • Work / Squash: by foot, 2X • At home / Family visit: car, 2X]
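The information loss in the trip-based view can be illustrated with a small sketch. It encodes the example day from the slide as a list of trips and collapses it into counts per location pair and mode. The tuple layout, the function name `trip_based_view` and the location labels are assumptions made for this illustration, not part of any trip-based software.

```python
from collections import Counter

# The example day from the slide: (departure time, origin, destination, mode).
day = [
    ("07:30", "home", "work", "PT"),
    ("12:00", "work", "squash", "foot"),
    ("12:50", "squash", "work", "foot"),
    ("16:40", "work", "home", "PT"),
    ("19:00", "home", "family", "car"),
    ("22:00", "family", "home", "car"),
]


def trip_based_view(trips):
    """Count trips per unordered location pair and mode.

    Departure time, direction and the order of the trips are discarded,
    which mirrors the limitations listed on the slide.
    """
    counts = Counter()
    for _time, origin, destination, mode in trips:
        pair = frozenset((origin, destination))  # unordered: direction is lost
        counts[(pair, mode)] += 1
    return counts


view = trip_based_view(day)
for (pair, mode), n in view.items():
    print(sorted(pair), mode, f"{n}X")
```

Reversing or shuffling `day` yields the same counts, which is exactly why such a representation cannot say anything about sequence or timing.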
Transportation modelling: Tour-based Approach • Trips that start and end at home or at the same work location are modelled as tours, and the tours are modelled independently • Direction + (spatial) limitations • No temporal dimension • Independent tours: the model is not capable of integrating them • Uses nested logit techniques
[Figure, Reality: At home → Work (7.30h, PT) → Play squash (12h, by foot) → Work (12.50h, by foot) → At home (16.40h, PT) → Family visit (19h, car) → At home (22h, car)]
[Figure, Tour-based model: At home → Work → At home (PT, PT) • Work → Play squash → Work (by foot, by foot) • At home → Family visit → At home (car, car)]
Transport modelling: An activity-based approach • Travel demand is derived from the activities that individuals need/wish to perform • Sequences or patterns of behaviour, and not individual trips are the unit of analysis • Household and other social structures influence travel and activity behaviour • Spatial, temporal, transportation and interpersonal interdependencies constrain activity/travel behaviour • Activity-based approaches reflect the scheduling of activities in time and space. Activity-based approaches aim at predicting which activities are conducted where, when, for how long, with whom, the transport mode involved and ideally also the implied route decisions.
Which Activity-based transportation model? • Utility maximizing models • Sequential models (computational process models) Legend of the model comparison: p = predicted by the model; n = not treated in the model; g = assumed given in the model
ALBATROSS • Albatross: A learning based transportation oriented simulation system = activity-based model of activity-travel behavior, derived from theories of choice heuristics • Developed in the Netherlands (Arentze and Timmermans, 2000) • The model predicts which activities are conducted when, where, for how long, with whom, and also the transport mode • A decision tree is proposed as a formalism to model the heuristic choice. Obviously, this is a crucial component of the model. The better the learning algorithm, the better the prediction…
Constraints that have been taken into account in Albatross • Situational constraints: can’t be in two places at the same time • Institutional constraints: such as opening hours • Household constraints: such as bringing children to school • Spatial constraints: e.g. particular activities cannot be performed at particular locations • Time constraints: activities require some minimum duration • Spatial-temporal constraints: an individual cannot be at a particular location at the right time to conduct a particular activity
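As a rough illustration of how some of these constraint types could be checked against a candidate schedule, consider the sketch below. It is not the Albatross implementation: the opening hours, minimum durations, episode layout and the function name `violations` are all hypothetical, and only the situational (no overlap), institutional (opening hours) and time (minimum duration) constraints are covered.

```python
# Hypothetical values, in minutes since midnight.
OPENING_HOURS = {"shop": (9 * 60, 18 * 60)}   # institutional constraint
MIN_DURATION = {"shop": 10, "work": 60}       # time constraint


def violations(schedule):
    """List constraint violations for episodes (activity, start, end) sorted by start."""
    problems = []
    for i, (activity, start, end) in enumerate(schedule):
        # Time constraint: the activity needs some minimum duration.
        if end - start < MIN_DURATION.get(activity, 0):
            problems.append(f"{activity}: shorter than minimum duration")
        # Institutional constraint: the activity must fit in the opening hours.
        if activity in OPENING_HOURS:
            opens, closes = OPENING_HOURS[activity]
            if start < opens or end > closes:
                problems.append(f"{activity}: outside opening hours")
        # Situational constraint: one cannot be in two places at the same time.
        if i > 0 and start < schedule[i - 1][2]:
            problems.append(f"{activity}: overlaps previous episode")
    return problems


feasible = violations([("work", 480, 1020), ("shop", 1050, 1070)])
infeasible = violations([("work", 480, 1020), ("shop", 1000, 1100)])
print(feasible)
print(infeasible)
```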
Modelling Choice behavior • Models used to rely on utility-maximization • Albatross assumes that choice behavior is based on rules that are formed and continuously adapted through learning while the individual is interacting with the environment (reinforcement learning) or communicating with others (social learning). • As said, rules are currently derived from decision trees • Other rule-based learning algorithms can also be used
The scheduling model Components: • a model of the sequential decision making process (a-priori defined) • models to compute dynamic constraints on choice options • a set of decision trees representing choice behavior of individuals related to each step in the process model (derived from observed choice behavior) Aim: Determine the schedule (= agenda) of activity-travel behaviour • Skeleton: the fixed and given part of the schedule • Flexible activities: optional activities added to the skeleton
The sequential decision process (process model) [Figure: flow diagram of the process model; each oval represents a decision tree (DT)]
The inference system in Albatross • For each decision, the model evaluates dynamic constraints • The implementation of situational, household and temporal constraints is straightforward • We will look at space–time constraints and choice heuristics determining location choices
Albatross derives its DTs with the CHAID learning algorithm and uses a probabilistic assignment rule. The probability of selecting the q-th response for each new case assigned to the k-th node is: $p_{kq} = f_{kq} / N_k$, where $f_{kq}$ is the number of training cases of category q at leaf node k and $N_k$ is the total number of training cases at that node
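The assignment rule can be sketched in a few lines. The example below is only an illustration under stated assumptions: the leaf counts are invented, and the function names `response_probabilities` and `assign_response` are hypothetical. It computes the ratio of the number of training cases of category q at leaf k to the total number of training cases at that leaf, then draws a response at random in proportion to those frequencies.

```python
import random


def response_probabilities(leaf_counts):
    """Return f_kq / N_k for every response q observed at leaf node k."""
    n_k = sum(leaf_counts.values())  # N_k: total training cases at the leaf
    return {q: f_kq / n_k for q, f_kq in leaf_counts.items()}


def assign_response(leaf_counts, rng):
    """Draw one response, with probability proportional to its training frequency."""
    responses = list(leaf_counts)
    weights = [leaf_counts[q] for q in responses]
    return rng.choices(responses, weights=weights, k=1)[0]


# Hypothetical training frequencies f_kq at one leaf of a mode-choice tree.
leaf = {"car": 30, "bike": 15, "public transport": 5}

probs = response_probabilities(leaf)
rng = random.Random(0)  # fixed seed so repeated runs give the same draw
draw = assign_response(leaf, rng)
print(probs, draw)
```

A deterministic rule would always pick the most frequent response at the leaf; drawing proportionally instead keeps the less frequent responses present in the simulated schedules.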
Performance of Albatross • The eventual goodness-of-fit of the model can be assessed only by a comparison at the level of complete activity patterns • The eventual output of Albatross is OD trip matrices • Conclusions so far: • The use of decision trees for choice heuristics results in a considerable, but varying, improvement over a null model • A sample size of 2000 household-days suffices to develop a stable model • Transferability of the model to a context other than the one in which it was developed remains to be studied
Advance the state of the art Some things we have been doing in our research group with respect to Albatross: • Two other rule-based techniques applied in the context of the Albatross model: • Integrate decision tree techniques and feature selection: identify irrelevant attributes and build simple models • Build advanced complex models by means of Bayesian networks and try to improve accuracy • Use (and adapt) Albatross towards the application area of Flanders • Evaluate the performance of activity-based models versus trip-based models
Application 1: Build simple models by means of DT and feature selection • General idea: Occam’s razor: “Entities are not to be multiplied beyond necessity” • A large set of attributes, likely to be correlated, gives larger trees, but not necessarily better ones! • Use feature selection techniques to identify irrelevant attributes that do not significantly improve accuracy and can thus be omitted in the final model
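The slide does not name the feature selection technique that was used, so the sketch below swaps in a simple filter based on information gain purely for illustration. The toy records, the threshold and the function names are assumptions; the point is only to show how an attribute that carries no information about the choice can be detected and dropped before a tree is built.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of the target after splitting on the attribute."""
    labels = [r[target] for r in rows]
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


def select_features(rows, attributes, target, threshold=0.01):
    """Keep only the attributes whose information gain exceeds the threshold."""
    return [a for a in attributes if information_gain(rows, a, target) > threshold]


# Invented records: location choice depends on car ownership, not on the weekday.
rows = [
    {"has_car": "yes", "weekday": "mon", "location": "far"},
    {"has_car": "yes", "weekday": "tue", "location": "far"},
    {"has_car": "no", "weekday": "mon", "location": "near"},
    {"has_car": "no", "weekday": "tue", "location": "near"},
]

selected = select_features(rows, ["has_car", "weekday"], "location")
print(selected)
```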
Application 1: Empirical results • Build a DT for every decision facet in the Albatross model • Example: “location”-facet
Application 1: Empirical results [Figure: results for the full approach compared with the feature selection (FS) approach]
Application 1: Empirical results Model performance at activity pattern level Conclusion: There is no evidence of substantial loss in predictive power when trimmed decision trees are used to predict activity-travel patterns.
Application 2: Build complex models by means of BN and try to improve accuracy • General idea: Modelling travelling behaviour is non-trivial as it is multidimensional and complex in nature. Hidden, unknown relationships might have an impact on the final outcome • Need for a technique that is able to deal with this: Bayesian networks • Able to capture (complex) relationships between variables • Able to be learned from data • Visualize interdependences between variables • Prior and posterior probability distributions per variable • Well suited to conduct what-if scenarios and sensitivity analysis • White box
Case study on mode choice facet Steps to follow: (1) Build the network (Structural Learning), (2) Choose a target variable and prune the network, (3) Calculate probability distributions (Parameter Learning), (4) Perform what-if scenarios by entering evidences in the network
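Steps (3) and (4) can be illustrated on a deliberately tiny network. The sketch below assumes a hypothetical two-node network, CarAvailable → Mode, with invented probability tables; it is not the network learned in the case study. It shows the prior distribution of the mode, a what-if scenario in which evidence is entered on car availability, and the posterior of car availability once a mode has been observed.

```python
# Hypothetical two-node network: CarAvailable -> Mode. All numbers are invented.
P_CAR = {"yes": 0.7, "no": 0.3}
P_MODE_GIVEN_CAR = {
    "yes": {"car": 0.8, "other": 0.2},
    "no": {"car": 0.1, "other": 0.9},
}


def prior_mode():
    """Marginal distribution of Mode before any evidence is entered."""
    modes = {"car": 0.0, "other": 0.0}
    for c, p_c in P_CAR.items():
        for m, p_m in P_MODE_GIVEN_CAR[c].items():
            modes[m] += p_c * p_m
    return modes


def what_if_car(available):
    """Distribution of Mode after entering evidence on CarAvailable."""
    return dict(P_MODE_GIVEN_CAR[available])


def posterior_car(observed_mode):
    """Posterior of CarAvailable given an observed Mode (Bayes' rule)."""
    joint = {c: P_CAR[c] * P_MODE_GIVEN_CAR[c][observed_mode] for c in P_CAR}
    total = sum(joint.values())
    return {c: v / total for c, v in joint.items()}


prior = prior_mode()
scenario = what_if_car("no")
posterior = posterior_car("car")
print(prior, scenario, posterior)
```

In a realistic network with many variables, the same queries would be answered by a general inference algorithm instead of direct enumeration.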
Application 2: Empirical results • Conclusions: • Better predictions. Reason: unlike decision trees (CHAID), variables are selected simultaneously, with no hierarchy of importance among the selected variables • The selection of variables is roughly the same in both approaches (the difference in performance is due more to the different nature of the techniques than to additional insights) • Much larger number of decision rules in Albatross compared with CHAID; however, performance is also OK on the test data (additional research on other datasets is warranted) • Interpretation is an issue: BNs link several variables in sometimes complex direct and indirect ways.
Application 3: Activity-based versus trip-based • Use (and adapt) Albatross towards the application area of Flanders • Evaluate the performance of activity-based models versus trip-based models • Transportation models: trip based • Mobility Plan Flanders (2003) • They predict, in a static way, reliable results for distribution, substitution and route effects • They cannot handle generative and temporal shifts • Need for a more dynamic and more complete model
Activity based models: • Travel demand is derived from activities • 24 hour schedule with activities • Household interaction • Time and space constraints Trip based models: • Just consider one-way trips • Only during peak hour • Individual trips • Calibration is needed to fit the data to the real situation
But ... • Trip based models take the outcomes (traffic flows, passenger numbers, ...) as input in the calibration • As expected, the outcomes are robust and fit the actual situation perfectly • The influence of the calibration is much stronger than the influence of the input data
Aim: the application of an activity based model in Flanders • Albatross ► developed for the Dutch situation • First stage: use of the Dutch decision tables • Comparison of the results of the two model types and their performance on the same input data
Data • Travel behaviour study: urban region of Leuven (2001) + trip-based model • Trip schedules (no information on in-home activities) • Locations: zip code ≠ statistical sector • Assumptions: • Overestimation of car and bike availability per household • Standard values for work time • Transport mode: longest distance in the trip • Facility data: not yet available
Assumption: trip based models predict the actual situation almost perfectly • ALBATROSS: • Mean length of the schedules is shorter than in the Dutch example (reason: conversion of a trip schedule to an activity schedule) • SAM values (parameters for goodness-of-fit) are very high ► predictions are not good
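Why a high SAM value signals a poor prediction is easier to see with a distance computation. The sketch below assumes that SAM is a sequence-alignment distance between observed and predicted activity patterns and approximates it with a plain one-dimensional edit distance (insertions, deletions and substitutions all cost 1). The measure used with Albatross may differ in its cost settings and in the dimensions it covers, so this is only a simplified stand-in, and the example sequences are invented.

```python
def edit_distance(observed, predicted):
    """Minimum number of insertions, deletions and substitutions
    needed to turn the predicted sequence into the observed one."""
    previous = list(range(len(predicted) + 1))
    for i, a in enumerate(observed, start=1):
        current = [i]
        for j, b in enumerate(predicted, start=1):
            current.append(min(
                previous[j] + 1,             # delete a from observed
                current[j - 1] + 1,          # insert b into observed
                previous[j - 1] + (a != b),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


observed = ["home", "work", "squash", "work", "home"]
predicted = ["home", "work", "home"]

distance = edit_distance(observed, predicted)
print(distance)
```

A distance of zero means the predicted pattern matches the observed one exactly; the more operations are needed, the worse the fit.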
OD matrices: match reasonably well • Activity type: • Good predictions for work and bring and get • Grocery and non-grocery are a problem • Length of tours • Too many short tours (< 2 km) are predicted • Transport mode • Too many public transport users and car passengers • Too few car drivers
Predictions are not good: fortunately! • Refinement of input data / facility data • Adaptation of the decision tables to the Flemish situation • Run the trip based model without traffic flows and passenger numbers for a truly fair comparison • Run the model on other Flemish regions
Some words on what else we have been doing… • An alternative approach to model activity-travel decisions is also under development at our research group • This model assumes that each diary consists of correlated successive activities. • For instance, in the morning: Sleep, Having Breakfast, Transportation to work • Markov chains are often used to model this type of dependency: • Transition matrix with entries $P(X_t = j \mid X_{t-1} = i)$ = first-order Markov chain • Transition matrix with entries $P(X_t = j \mid X_{t-1} = i, X_{t-2} = h)$ = second-order Markov chain • Etc.
Artificial Example
Diary 1: TcFFFFFFFFFFFFFFFFE
Diary 2: TcEEFREREERFTcFTcFFTcFETcF
Diary 3: RREFEFEETcTcR
Diary 4: EEFFTcFTcFRRTcTcRTcRR
Diary 5: FFTcFFRE
Diary 6: EETcFRRE
With Tc = Transportation (car as transport mode), F = visit Family, E = Eat, R = Read

Transition matrix (rows: current activity; columns: next activity; each row sums to 1):
      Tc    E     R     F
Tc    0.11  0.03  0.16  0.70
E     0.23  0.40  0.08  0.29
R     0.10  0.53  0.30  0.07
F     0.21  0.20  0.28  0.31

These probabilities can be computed by means of Markov chains
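One common way to obtain a first-order transition matrix is to count how often each activity follows each other activity and to normalise per row. The sketch below applies that plain count-based estimate to the six diaries. It is an assumption that this matches the general idea; the estimate is not claimed to reproduce the figures in the table above, which may have been obtained with a different estimator or different data. The function names are hypothetical.

```python
import re
from collections import Counter, defaultdict

DIARIES = [
    "TcFFFFFFFFFFFFFFFFE",
    "TcEEFREREERFTcFTcFFTcFETcF",
    "RREFEFEETcTcR",
    "EEFFTcFTcFRRTcTcRTcRR",
    "FFTcFFRE",
    "EETcFRRE",
]
STATES = ["Tc", "E", "R", "F"]


def tokenize(diary):
    """Split a diary string into activity codes ('Tc' is a two-letter code)."""
    return re.findall(r"Tc|[ERF]", diary)


def first_order_matrix(diaries):
    """Estimate P(next activity | current activity) from transition counts."""
    counts = defaultdict(Counter)
    for diary in diaries:
        activities = tokenize(diary)
        for current, following in zip(activities, activities[1:]):
            counts[current][following] += 1
    matrix = {}
    for state in STATES:
        total = sum(counts[state].values())
        matrix[state] = {nxt: counts[state][nxt] / total for nxt in STATES}
    return matrix


matrix = first_order_matrix(DIARIES)
for state in STATES:
    print(state, [round(matrix[state][nxt], 2) for nxt in STATES])
```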
Example derived from data • Simulation procedure: simulate $X_t$ as a function of the values taken by $X_{t-1}$ and $X_{t-2}$ • The procedure is repeated for each subsequent step
Let’s recap… • Why transportation modelling? • Which kinds of transportation modelling? • Why activity-based transportation modelling? • Which activity-based transportation model? • Model Selection: Albatross • What is Albatross? • Things we have been doing and are still going to do with respect to Albatross • Introduction of an alternative modelling approach based on sequential dependencies in data (short version)
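The simulation procedure can be sketched as follows, assuming a count-based second-order Markov chain: the next activity is drawn conditionally on the two preceding ones, and the draw is repeated until the diary has the desired length. The toy training diaries, the activity names and the function names are invented for the illustration and are not the data used in the study.

```python
import random
from collections import Counter, defaultdict


def fit_second_order(sequences):
    """Count how often activity c follows the pair (a, b)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for a, b, c in zip(seq, seq[1:], seq[2:]):
            counts[(a, b)][c] += 1
    return counts


def simulate(counts, start, length, rng):
    """Simulate X_t from (X_{t-2}, X_{t-1}) repeatedly, starting from two given activities."""
    seq = list(start)
    while len(seq) < length:
        options = counts.get((seq[-2], seq[-1]))
        if not options:  # pair never observed in the training diaries: stop
            break
        states = list(options)
        weights = [options[s] for s in states]
        seq.append(rng.choices(states, weights=weights, k=1)[0])
    return seq


# Invented training diaries.
training = [
    ["sleep", "breakfast", "travel", "work"],
    ["sleep", "breakfast", "read", "travel", "work"],
]

counts = fit_second_order(training)
rng = random.Random(42)  # fixed seed for repeatable runs
simulated = simulate(counts, ("sleep", "breakfast"), 4, rng)
print(simulated)
```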