
Synthesizing Agents and Relationships for Land Use / Transportation Modelling



Presentation Transcript


  1. Synthesizing Agents and Relationships for Land Use / Transportation Modelling

  2. Lecture Outline • Introduction • Previous Work • Data • New Methods • Results

  3. Introduction • How would land use, transportation patterns and emissions react to... • High congestion charge? • Greenbelt policy? • “Do nothing” while population grows • Major transportation projects • Major extrapolations from current behaviour • Too hard to predict conventionally

  4. Introduction Traditional 4-stage

  5. Introduction Integrated Land Use/Transportation Environment (ILUTE) model

  6. Introduction • We can’t build such a complicated model using conventional methods • Instead, preferred approach is microsimulation model • What is microsimulation?

  7. Introduction Conventional Model Simulation Model

  8. Introduction • Microsimulation = Simulation + Agents • Models the state of agents • Combined behaviour of agents yields system state • 1. Begin with initial population in start year • 2. Update population, year by year • age persons, change family structures • change jobs, move homes • use this to predict annual travel patterns • 3. Obtain travel patterns in forecast year
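
The three numbered steps on this slide can be sketched as a small loop. This is a minimal Python illustration, not the author's implementation: the `Person` attributes, the job-change probability and the 1990 forecast year are all hypothetical, and a real model would also update family structures and home locations.

```python
import random
from dataclasses import dataclass


@dataclass
class Person:
    age: int
    employed: bool


def step_year(population, rng, job_change_prob=0.1):
    """Advance every agent by one year (hypothetical, highly simplified rules)."""
    for person in population:
        person.age += 1
        if rng.random() < job_change_prob:
            person.employed = not person.employed


def system_state(population):
    """The system state is an aggregate over the individual agents."""
    return {
        "persons": len(population),
        "employed": sum(p.employed for p in population),
        "mean_age": sum(p.age for p in population) / len(population),
    }


def simulate(population, start_year, forecast_year, seed=0):
    rng = random.Random(seed)
    # Step 1: record the initial population in the start year.
    history = {start_year: system_state(population)}
    # Step 2: update the population year by year.
    for year in range(start_year + 1, forecast_year + 1):
        step_year(population, rng)
        history[year] = system_state(population)
    # Step 3: the last entry is the forecast-year state.
    return history


population = [Person(age=30, employed=True), Person(age=45, employed=False)]
history = simulate(population, start_year=1986, forecast_year=1990)
```

The only input the loop needs is the list of agents in the start year, which is why an initial population has to be synthesized first.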

  9. Introduction • Need an initial population in the start year • List of agents and their attributes - e.g., • Number of persons, and their ages • Number of vehicles • Type of dwelling • etc. • But - complete list is unknown • “Population Synthesis” used instead • Use known data to create initial agents • Result has known statistical properties • Best estimate from limited data

  10. Introduction • My results: • Improved method for population synthesis • Allows more attributes for each agent • New method for relationship synthesis • Allows correct set of agents and correct set of relationships • Created a synthetic population for ILUTE • Persons, families, households and dwellings • Complete 1986 population for GTHA

  11. Previous Work • Two representations of a set of agents • List of agents and their attributes (as categories) • Contingency table • One cell for each combination of attributes • Cell contains count of number of agents
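
The two representations hold the same information, which a few lines of Python can show. The attribute names and categories below are hypothetical and chosen only for illustration.

```python
from collections import Counter

# List representation: one tuple per agent, (age group, dwelling type).
agents = [
    ("0-17", "house"),
    ("18-64", "apartment"),
    ("18-64", "house"),
    ("18-64", "house"),
    ("65+", "apartment"),
]

# Contingency table: one cell per attribute combination, holding a count.
table = Counter(agents)


def table_to_list(table):
    """Expand a table of counts back into a list of agents."""
    return [cell for cell, count in table.items() for _ in range(count)]
```

Converting in either direction loses nothing except the order of the agents in the list.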

  12. Previous Work • Data Limitations • Patchwork of partial data • Mostly, we have one-way margins • Breakdown of a single attribute into a few categories • Example: look at how we can use one-way margins

  13. Previous Work

  14. Previous Work • Iterative Proportional Fitting

  15. Previous Work • Iterative Proportional Fitting

  16. Previous Work • Iterative Proportional Fitting • e.g., “Biproportional Updating” of O/D tables • Exactly satisfies target margins • Also minimizes discrimination information relative to source population • Information theory: maximum entropy • Resulting PDF satisfies the constraints without assuming any information we do not possess
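
A minimal two-dimensional sketch of Iterative Proportional Fitting, assuming a dense table stored as nested lists and target margins with equal totals. The source table and targets are made-up numbers, and the fixed iteration count stands in for a proper convergence test.

```python
def ipf(source, row_targets, col_targets, iterations=50):
    """Fit a 2D table to target row/column margins, starting from `source`."""
    table = [row[:] for row in source]
    for _ in range(iterations):
        # Scale each row so that its total matches the row target.
        for i, row in enumerate(table):
            total = sum(row)
            if total > 0:
                table[i] = [cell * row_targets[i] / total for cell in row]
        # Scale each column so that its total matches the column target.
        for j in range(len(col_targets)):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= col_targets[j] / total
    return table


# Hypothetical source sample and target margins (both margins sum to 100).
source = [[1.0, 2.0], [3.0, 4.0]]
fitted = ipf(source, row_targets=[40.0, 60.0], col_targets=[30.0, 70.0])
row_sums = [sum(row) for row in fitted]
col_sums = [sum(row[j] for row in fitted) for j in range(2)]
```

Because each step only multiplies whole rows or whole columns, the cross-product ratio of the source table is carried through to the fitted table. This is one way to see that the association structure of the source sample is retained while the margins are matched.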

  17. Previous Work • Many options for margins in 3D

  18. Previous Work • Beckman, Baggerly & McKay (1996) • State-of-the-art application of IPF for census • Geography attribute gets special treatment • Due to nature of data in PUMS and census tables • Two approaches: zone-by-zone, or all zones at once • Treats final table as a PMF • Monte Carlo draws used to integerize • Hurts fit to target margins • Limited number of attributes
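
The integerization step can be sketched as follows: normalise the fitted table to a PMF, then draw whole agents from it. This is an illustrative Python sketch with hypothetical cell labels and counts, not the procedure from the cited paper.

```python
import random
from collections import Counter


def monte_carlo_integerize(pmf, n_agents, seed=0):
    """Draw `n_agents` whole agents from a table treated as a PMF."""
    rng = random.Random(seed)
    cells = list(pmf)
    weights = [pmf[cell] for cell in cells]
    draws = rng.choices(cells, weights=weights, k=n_agents)
    return Counter(draws)


# Hypothetical fitted table with fractional counts, normalised to a PMF.
fitted = {
    ("young", "house"): 12.4,
    ("young", "apt"): 7.6,
    ("old", "house"): 17.6,
    ("old", "apt"): 2.4,
}
total = sum(fitted.values())
pmf = {cell: count / total for cell, count in fitted.items()}

synthetic = monte_carlo_integerize(pmf, n_agents=40)
```

The drawn counts are integers but generally differ from the fractional fitted counts, so the margins of `synthetic` no longer match the targets exactly. That sampling noise is the loss of fit noted on the slide.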

  19. Previous Work • Williamson, Birkin and Rees (1998) • Not IPF: “Combinatorial Optimisation” • List-based, instead of tables • Pros: • good fit to target margins • may handle more attributes • Cons: • no guarantees about relationship with source sample • not entropy maximizing • slow

  20. Data • Summary Tables • Usually one attribute, by zone (2D margin) • Contingency table • Large sample: 20% or 100% • Sometimes 2-3 attributes by zone • Used as Target Margins • Public Use Microdata Sample (PUMS) • List; almost all attributes, except zones • Small sample (1-2%) • Canada: defined for each large Census Metropolitan Area (CMA) • Used as Source Sample

  21. Data

  22. Data

  23. Data

  24. Data • Canadian Census includes three PUMS • Persons • Census families • Households & Dwellings • Also summary tables related to each

  25. New Methods: Sparsity • Beckman et al.’s approach doesn’t work well with many attributes • Computation becomes hard • Huge memory requirement • Slow • Thirteen attributes on family agent: • Beckman Zone-by-Zone needs 1.4 GB memory • Beckman Multizone needs 1,036 GB memory

  26. New Methods: Sparsity • Number of cells in multiway table grows exponentially with number of attributes (dimensions)
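
The growth is easy to check numerically: a complete table has one cell per combination of categories, so its size is the product of the category counts. The sketch below assumes, purely for illustration, that every attribute has four categories. The slides do not give the actual category counts.

```python
import math


def cell_count(categories_per_attribute):
    """Number of cells in a complete multiway table."""
    return math.prod(categories_per_attribute)


# Hypothetical: every attribute has 4 categories.
for n_attributes in (2, 5, 13):
    print(n_attributes, cell_count([4] * n_attributes))
```

Each added attribute multiplies the table size by its number of categories, whereas the number of records in the source sample stays fixed.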

  27. New Methods: Sparsity

  28. New Methods: Sparsity • Large number of bins • Most bins are zero • Number of bins is larger than sample!

  29. New Methods: Sparsity • Is it meaningful to use many attributes? • Tentatively, yes • Not a meaningful 13-way distribution • But, a link between many statistically valid low-order distributions (e.g., 3-way) • If acceptable, can we do better than standard IPF? • Yes - use a sparse data structure instead of a complete array to represent table • Store only non-zero cells in table

  30. New Methods: Sparsity • Same representation as Williamson’s “Combinatorial Optimisation” • But, uses IPF algorithm • Maximum entropy guarantee; fast • Can implement either zone-by-zone or multizone IPF using sparse data structure
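
One way to picture IPF on a sparse structure is to keep the non-zero cells in a dictionary and rescale only those. This is a minimal Python sketch under that assumption, not the author's data structure. The attributes, weights and targets are hypothetical, and only one-way target margins are handled.

```python
from collections import defaultdict


def sparse_ipf(cells, targets, iterations=50):
    """IPF over a sparse table.

    cells:   dict mapping attribute tuples to weights (non-zero cells only)
    targets: dict mapping an attribute index to {category: target total}
    """
    weights = dict(cells)
    for _ in range(iterations):
        for axis, target in targets.items():
            # Current one-way margin, computed from stored cells only.
            totals = defaultdict(float)
            for cell, w in weights.items():
                totals[cell[axis]] += w
            # Rescale each stored cell towards the target margin.
            for cell in weights:
                weights[cell] *= target[cell[axis]] / totals[cell[axis]]
    return weights


# Hypothetical 3-attribute source: 4 non-zero cells out of 2 * 2 * 2 = 8.
source = {
    ("young", "house", "car"): 3.0,
    ("young", "apt", "no car"): 2.0,
    ("old", "house", "car"): 4.0,
    ("old", "apt", "car"): 1.0,
}
targets = {
    0: {"young": 50.0, "old": 50.0},
    1: {"house": 60.0, "apt": 40.0},
}
fitted = sparse_ipf(source, targets)
```

Memory and work per iteration scale with the number of stored cells rather than with the size of the complete table, and cells that are zero in the source stay zero, as they would under dense IPF.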

  31. New Methods: Relationships • Land use/transportation models have more types of agents • Agents: Persons, families, households, business establishments • Objects: Vehicles, dwellings

  32. New Methods: Relationships • Need to synthesize correct relationships • Examples: • Which persons are married? • Opposite sex, similar ages - usually • Which household owns/rents a given dwelling? • Number of rooms and number of persons should be correlated • Earlier methods could guarantee correct PDF for one agent type, but not all simultaneously

  33. New Methods: Relationships • Family PUMS contains information about persons in family • husband/wife ages; child ages • Can synthesize “family” agent • Include some “person” attributes in family

  34. New Methods: Relationships • Then, conditionally synthesize persons on family attributes • IPF result is a joint probability mass function P(AGE, EDU, INCOME, OCCUP, SEX, ...) • Can convert to a conditional PMF P(EDU, INCOME, OCCUP, ... | AGE, SEX) • Synthesize, repeating for husband, wife, children
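
The conversion from a joint PMF to a conditional PMF, followed by a conditional draw, can be sketched as below. The three-attribute joint table, its categories and its probabilities are hypothetical. In the approach described here, AGE and SEX would come from the already-synthesized family agent.

```python
import random
from collections import defaultdict


def conditional_pmf(joint, given_axes):
    """Convert a joint PMF into P(other attributes | attributes at given_axes)."""
    grouped = defaultdict(dict)
    for cell, p in joint.items():
        key = tuple(cell[a] for a in given_axes)
        rest = tuple(v for a, v in enumerate(cell) if a not in given_axes)
        grouped[key][rest] = grouped[key].get(rest, 0.0) + p
    # Normalise within each conditioning group.
    return {
        key: {rest: p / sum(dist.values()) for rest, p in dist.items()}
        for key, dist in grouped.items()
    }


# Hypothetical joint PMF over (AGE, SEX, EDU).
joint = {
    ("30-39", "F", "university"): 0.20,
    ("30-39", "F", "secondary"): 0.10,
    ("30-39", "M", "university"): 0.15,
    ("30-39", "M", "secondary"): 0.15,
    ("40-49", "F", "university"): 0.10,
    ("40-49", "F", "secondary"): 0.30,
}
cond = conditional_pmf(joint, given_axes=(0, 1))


def draw_person(age, sex, rng):
    """Draw the remaining attributes for a person whose AGE and SEX are fixed."""
    dist = cond[(age, sex)]
    (rest,) = rng.choices(list(dist), weights=list(dist.values()), k=1)
    return (age, sex) + rest


wife = draw_person("30-39", "F", random.Random(0))
```

Repeating the draw for each family member yields persons whose shared attributes agree with the family record by construction.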

  35. New Methods: Relationships • Guarantees good fit for both agent types • Correct Family PDF • Correct Person PDF • Simple, data-driven • No rules • No special data sources, models • Provided that attributes can be aligned between agents

  36. Results

  37. Results

  38. Results • Programmed in R • A statistical programming platform • Dynamic language, fast prototyping • Good support for categorical data, contingency tables • Toronto CMA: 1.1 million households, 1.0 million families, 3.3 million persons • Run time: 2 hours, 7 minutes on older 1.5 GHz computer • Repeated for Hamilton and Oshawa CMAs

  39. Results

  40. Results • Experiment • Is there value in using really rich input data? • Or does PUMS + 1D tables give enough? • Calculated fit against all available data • SRMSE and G² information-theoretic statistics
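
The slides do not give formulas for the two statistics, so the sketch below uses common textbook definitions as an assumption: SRMSE as the root mean squared cell error divided by the mean target cell value, and G² as the likelihood-ratio statistic 2 Σ o ln(o / e). The argument order (which table counts as "observed") is also an assumption.

```python
import math


def srmse(fitted, target):
    """Standardized root mean squared error over matching cells."""
    n = len(target)
    rmse = math.sqrt(sum((f - t) ** 2 for f, t in zip(fitted, target)) / n)
    return rmse / (sum(target) / n)


def g_squared(observed, expected):
    """Likelihood-ratio statistic; cells with observed == 0 contribute 0.

    Requires expected > 0 wherever observed > 0.
    """
    return 2.0 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )


# Hypothetical flattened cell counts for a synthetic and a target table.
synthetic_cells = [12.0, 8.0]
target_cells = [10.0, 10.0]
fit_srmse = srmse(synthetic_cells, target_cells)
fit_g2 = g_squared(synthetic_cells, target_cells)
```

Both statistics are zero for a perfect fit and grow as the synthetic table departs from the validation table.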

  41. Results

  42. Results • Improvement of result with additional data evident • However, no statistical tests possible • Monte Carlo stage causes some error • My conditional synthesis introduces small amount of additional error • Little difference between zone-by-zone and multizone methods

  43. Questions?
