Pressing issues in social simulation: From building models to building science

Dawn Cassandra Parker Associate Professor, School of Planning, Faculty of Environment (July 1) Associate Director, Waterloo Centre for Complexity and Innovation University of Waterloo Waterloo, Ontario, Canada MABS session, AAMAS 2010, Toronto, CA 11 May, 2010

Some opening assertions • We have gotten better at designing and building agent-based social simulation models • More empirical models are being built, in many domains • We very much lack methods to analyze and understand the outputs of our models • This makes experimental design difficult • We remain unable to fully participate in closed-loop science

Looking back three long years--What do researchers seek from spatial ABM? • Improved understanding of process, not just identification of drivers • Development of integrated disciplinary models: land-change science and spatial social science • Understanding of how local, case-specific conditions shape local outcomes • Ability to project change and future trajectories

Theoretical and Empirical Modeling: Competing or complementary goals? Potential tensions between two goals: • Building theoretical frameworks that contribute to integrated theories; • Building better empirical models for scenario analysis and projection • A traditional scientific method perspective says that these two activities should be complementary • My optimistic view is that the potential for complementarity is greater in the simulation modeling realm • But, we have still not learned to effectively link the two approaches

Shifting the debate: • From “Should models be complex or simple?” to “How can each type of modeling be connected to the scientific method?” • From “Can we recreate macro-scale outcomes using micro-scale behaviors?” to “How do particular micro-scale behaviors and interactions produce multiple macro-scale outcomes?” • From “Can we reproduce macro-scale outcomes?” to “Can we reproduce relationships between micro-scale drivers and macro-scale outcomes?”

We undertake ABM to fill in process … • But often we only have real-world data on drivers and outcomes … • And we don’t know what to do with our generated data • Diagram: Drivers → Agent Behaviors → Agent Interactions → Emergent/Pattern-based Outcomes

What would a skeptic say about our scientific status? • Others don’t know what our models are doing • We are not sure what questions we are asking, and how to answer them • We don’t know what our models are doing, either …

Three challenges for 2010: • Improving model communication • Improving experimental design • Improving our ability to understand the relationship between drivers and outcomes How can these improve science? • Enhance transparency, evaluation, and replication • Clarify research questions and their assessment • Let us more clearly answer our research questions

Question 1: Improving model communication Agent-based models: • Combine qualitative and quantitative rules that cannot be expressed using standard mathematics • Draw on a wide variety of theoretical paradigms • Are implemented in a wide variety of programming languages • Are built and used by researchers with a wide variety of programming skills • UML is a partial, but not complete, solution • Open-source code helps, but who wants to read code?

Resulting problems: • Difficulty in reviewing models/assessing scientific viability • Difficulties in model comparison • Difficulty communicating with non-programmer/non-modeler audiences • Barriers to entry to new researchers • Barriers to replicability

Some proposed solutions: • Text-based communication protocols • Modeling ontologies • Standardized model meta-data • Meta-modeling front ends

Text-based communication protocols: ODD (Volker Grimm, LANDMOD 2010 presentation) • Provides a standardized model description template • Facilitates communication, and perhaps model design • Suggested for several journals

Modeling ontologies: MR POTATOHEAD • MR POTATOHEAD (Model Representing Potential Objects That Appear in The Ontology of Human-Environmental Actions & Decisions) • Conceptual framework—building blocks and instantiations • An object-oriented description framework for OOP models

The MR POTATOHEAD conceptual design pattern • Expressed as a conceptual, object-oriented classification of elements that are essential for an ABM/LUCC model, with alternative instances • Eight ABM/LUCC models described via MR POTATOHEAD as special cases of the general meta-model • Model instantiated in Web Ontology Language with the goal of model comparison and meta-model creation • Used in part for development of the new SLUCE2 models

MP structure examples

MP model comparison: IMSHED and LUCITA land exchange mechanisms

MR POTATOHEAD: longer-term goals • Create a simple code base that nests the 7 models (perhaps more) as special cases • Allows for formal comparison of model structures and results • Facilitates comparative experiments to explore alternative parameter spaces • Provides code base for new model creation • Build a graphical modeling front-end • Educational/Participatory modeling uses • Reduces barriers to entry to the field • Create a web-based implementation with a searchable archive of experiments and results

Work in progress • Prototype model in development by Mike Livermore (Mason CSS MS student), see IEMSS 2010, forthcoming • GUI front end calls Repast Simphony code • Single model in implementation stages

More meta/communication tools in development • Open ABM consortium/COMSES network: on-line model archive supported by to-be-developed model metadata standards and searchable model metadata • Miles Parker’s “metascape” ABM tools strive to create visual modeling tools that can translate into multiple programming languages/libraries (see http://www.metascapeabm.com)

From Parker et al. 2008: • “A set of such libraries may allow ABM/LUCC modellers to achieve their own version of the future envisioned by pioneers in computational biology: ‘a future in which not just … models but all the pieces of models should be sharable. In this utopia, models should be able to swap computer code … as easily as Mr. Potato Head swaps noses.’ (Krieger 2006, p. 189)”

Question 2: Improving experimental design • Challenge: How to create stylized model versions that shed light on the behavior of empirical models • Why bother? To facilitate sensitivity analysis, understand the generalizability of empirical results, and build a theoretical base • An example: Generating hypothetical landscapes with known but realistic properties, to answer questions about causes and consequences of landscape structure

Challenge: how to create subdividable urban spatial representations to appropriately represent developer behavior? • Morgan and O’Sullivan: “Using binary space partitioning to generate urban spatial patterns” • Uses quadtrees to represent and partition urban space • Simple and elegant solution to generate fractal city patterns found in real-world cities
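
To make the quadtree idea concrete, here is a minimal sketch of recursive quadrant subdivision with a random stopping rule. It is not Morgan and O’Sullivan’s algorithm: the function name `partition` and the parameters `split_prob` and `max_depth` are hypothetical, chosen only to illustrate how a square region can be divided into parcels of mixed sizes.

```python
import random

def partition(x, y, size, depth, rng, split_prob=0.7, max_depth=5):
    """Recursively split a square into four quadrants.

    Returns the leaf squares as (x, y, size) tuples. The random stopping
    rule is an illustrative assumption, not taken from the cited paper.
    """
    if depth >= max_depth or size < 2 or rng.random() > split_prob:
        return [(x, y, size)]  # stop here: this square becomes one parcel
    half = size // 2
    leaves = []
    for dx in (0, half):
        for dy in (0, half):
            leaves.extend(
                partition(x + dx, y + dy, half, depth + 1, rng,
                          split_prob, max_depth)
            )
    return leaves

rng = random.Random(42)          # fixed seed so the pattern is repeatable
parcels = partition(0, 0, 64, 0, rng)
total_area = sum(s * s for _, _, s in parcels)
print(len(parcels), "parcels covering", total_area, "cells")
```

Because every split replaces one square with four, the parcels always tile the original 64 × 64 region exactly. Varying `split_prob` by location (for example, higher near a centre) would be one way to produce denser subdivision in some areas.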

Challenge: How to develop neutral agricultural landscapes to test effects of spatial structure on process-based LUCC models? • Le Ber et al., “Modelling neutral agricultural landscapes with tessellation methods: the GENEXP-LANDSITES software” • Can create simulated landscapes that have particular structure (tessellations) and land-use distributions
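
As a rough illustration of a tessellation-based neutral landscape, the sketch below rasterizes a Voronoi tessellation (each cell joins its nearest random seed point) and assigns each patch a land use drawn from target shares. It is not the GENEXP-LANDSITES method; `neutral_landscape`, the land-use labels, and the share values are all hypothetical.

```python
import random

def neutral_landscape(width, height, n_patches, use_shares, rng):
    """Build a grid of land uses from a rasterized Voronoi tessellation.

    use_shares maps land-use label -> sampling weight per patch. Shares
    apply to patch counts, not areas (a simplifying assumption).
    """
    seeds = [(rng.uniform(0, width), rng.uniform(0, height))
             for _ in range(n_patches)]
    uses = list(use_shares)
    patch_use = rng.choices(uses, weights=[use_shares[u] for u in uses],
                            k=n_patches)
    grid = []
    for r in range(height):
        row = []
        for c in range(width):
            # Cell centre joins the patch of its nearest seed point
            nearest = min(
                range(n_patches),
                key=lambda i: (seeds[i][0] - (c + 0.5)) ** 2
                              + (seeds[i][1] - (r + 0.5)) ** 2,
            )
            row.append(patch_use[nearest])
        grid.append(row)
    return grid

shares = {"crop": 0.5, "pasture": 0.3, "forest": 0.2}  # illustrative values
landscape = neutral_landscape(30, 20, 12, shares, random.Random(7))
for row in landscape[:5]:
    print("".join(use[0] for use in row))  # first letter of each land use
```

Holding the seed points fixed while redrawing land uses (or the reverse) gives a simple way to separate the effect of patch geometry from the effect of land-use composition.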

Question 3: Improving our ability to understand the relationship between drivers and outcomes • What do we do with all those model outcome data? • How can we use them to answer research questions? • My assertion: we have the ability to build a more closely integrated scientific loop, but it will be challenging

Modeling in the traditional scientific method: Mathematically expressed behavioral model Hypotheses derived via deductive mathematics or logic Empirical testing via inductive data analysis

The “third way of science”: Agent-based behavioral model Simulated data generated through multiple model runs Hypotheses derived via inductive analysis of simulated data Empirical testing via inductive data analysis

“Pseudo-inductive” analysis for theoretical models • Build hypothetical causal rules/mechanisms into ABM • Generate database of outcomes by sweeping model parameter space • Analyze generated data using inductive methods that use emergent macro outcomes as dependent variables, and parameter values as independent variables • Hypotheses (equivalent of comparative statics or dynamics) are contained in the estimated functional relationships: how do macro outcomes change as parameters change?
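
The steps above can be sketched with a toy model. Everything here is an assumption made for illustration: `run_model` is a hypothetical diffusion-style ABM (agents adopt through innovation `p_innovate` or imitation `q_imitate`), the parameter grid is arbitrary, and the "inductive method" is plain least squares. With a balanced full-factorial sweep the two parameters are uncorrelated in the design, so each one-variable slope equals the corresponding multiple-regression coefficient.

```python
import random
from itertools import product

def run_model(p_innovate, q_imitate, n_agents=200, steps=10, seed=0):
    """Toy ABM: return the share of agents that have adopted after `steps`."""
    rng = random.Random(seed)
    adopted = [False] * n_agents
    for _ in range(steps):
        share = sum(adopted) / n_agents
        prob = min(1.0, p_innovate + q_imitate * share)
        # Adopters stay adopted; others adopt with probability `prob`
        adopted = [a or rng.random() < prob for a in adopted]
    return sum(adopted) / n_agents

def ols_slope(xs, ys):
    """Least-squares slope of ys on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Step 2: sweep the parameter space (full factorial, 5 replicates per cell)
p_values = [0.01, 0.02, 0.03, 0.04]
q_values = [0.1, 0.2, 0.3, 0.4]
design = list(product(p_values, q_values, range(5)))
rows = [(p, q, run_model(p, q, seed=i)) for i, (p, q, _) in enumerate(design)]

# Step 3: macro outcome as dependent variable, parameters as independent
outcomes = [y for _, _, y in rows]
slope_p = ols_slope([p for p, _, _ in rows], outcomes)
slope_q = ols_slope([q for _, q, _ in rows], outcomes)
print(f"d(adoption)/dp ~ {slope_p:.2f}, d(adoption)/dq ~ {slope_q:.2f}")
```

The estimated slopes play the role of comparative statics: they state how the macro outcome is expected to move as each parameter changes. A linear fit is the simplest possible choice and would miss the thresholds and non-monotonic responses that motivate richer methods.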

Acute need for better tools to analyze data from complex systems in order to conduct such analysis • Data visualization: thresholds; non-linearities; non-monotonicity • Data analysis: non-linear relationships between parameters and outcomes; synergies and feedbacks between parameters; endogeneity between micro and macro

Challenge: How to analyze how alternative landscape-pattern based incentive schemes affect biodiversity provision? • Polhill, Gimona, and Gotts, IEMSS 2010: 20,000 model runs from experiments testing 6 policy regimes under a variety of economic conditions • Applied generalised additive models to examine species richness at a landscape level to reveal critical (and varying) incentive effectiveness thresholds • Also reveals variations due to changing economic parameters

Figure 1: Estimated relationships between species richness and incentive levels, 600 runs each graph

Challenge: How to analyze changes in model output due to simultaneous variations in uncertain input parameters? • Ligmann-Zielinska and Sun, forthcoming, IJGIS, “Applying Time Dependent Variance-Based Global Sensitivity Analysis to Represent the Dynamics of an Agent-Based Model of Land Use Change” • Separates first-order (independent) effects from total effects (which include a parameter’s interaction effects) • Time-series analysis to examine thresholds and regime shifts • Application to residential development examining effects of heterogeneous landscape preferences, information, and opportunities on land-use morphology (fragmentation)
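
To show what separating first-order and total effects means in practice, here is a minimal sketch of variance-based (Sobol) indices using the standard Saltelli-style first-order estimator and the Jansen total-effect estimator. It is not the authors’ implementation: `model` is a hypothetical stand-in for an ABM output, and inputs are assumed independent and uniform on [-1, 1]. A time-dependent analysis would repeat this calculation on the model output at each time step.

```python
import random

def model(x):
    """Hypothetical stand-in for an ABM output: x[1] and x[2] act only jointly."""
    return x[0] + x[1] * x[2]

def sobol_indices(f, k, n, rng):
    """Estimate first-order (S_i) and total-effect (S_Ti) indices for k inputs."""
    A = [[rng.uniform(-1, 1) for _ in range(k)] for _ in range(n)]
    B = [[rng.uniform(-1, 1) for _ in range(k)] for _ in range(n)]
    fA = [f(row) for row in A]
    fB = [f(row) for row in B]
    all_y = fA + fB
    mean = sum(all_y) / len(all_y)
    var = sum((y - mean) ** 2 for y in all_y) / (len(all_y) - 1)
    first, total = [], []
    for i in range(k):
        # A with column i taken from B: only input i differs from A
        fABi = [f(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        v_i = sum(yb * (yab - ya) for ya, yb, yab in zip(fA, fB, fABi)) / n
        v_ti = sum((ya - yab) ** 2 for ya, yab in zip(fA, fABi)) / (2 * n)
        first.append(v_i / var)
        total.append(v_ti / var)
    return first, total

S, ST = sobol_indices(model, k=3, n=20000, rng=random.Random(1))
for i, (s, st) in enumerate(zip(S, ST)):
    print(f"x{i}: first-order ~ {s:.2f}, total ~ {st:.2f}")
```

For this toy function the analytic values are S = (0.75, 0, 0) and ST = (0.75, 0.25, 0.25), which the estimates should approximate: the second and third inputs have no independent effect yet still account for a quarter of the output variance through their interaction. That gap between first-order and total indices is exactly what a one-at-a-time sensitivity analysis would miss.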

Results: Sensitivity Plots (Si)

Results: Sensitivity Plots (STi)

SLUCE2 Project plans: Addressing these challenges • MR POTATOHEAD used for model planning and documentation • Qingxu Huang (UW PhD) working to create hypothetical landscapes and create temporal measures of path dependence • We plan to use UM GRIDSWEEPER tools for sensitivity/model analysis • Will build on the recent exciting new work of other projects described here • Combination of theoretical and empirical models developed should provide opportunity for closed-loop science

How can theory and empirics work together? • Theoretical frameworks may be used to develop conceptual “meta-models” • Formal, structured analysis of simpler theoretical models can aid understanding of process dynamics driving empirical applications • Empirical case studies can identify outcomes for parameters/dynamics active in a particular place and time • Coupling the two may allow us to better project out of sample/range of data used to develop models—and increase our scientific credibility

Acknowledgements • SLUCE2: Dan Brown, Bill Currie, Tatiana Filatova, Joan Nassauer, Scott Page, Rick Riolo, Derek Robinson, Shipeng Sun, Qingxu Huang, and additional members of the Project SLUCE team (http://www.cscs.umich.edu/sluce/); funding for grant development from NSF BCS-0119804, new funding from NSF CNH-0813799 • LANDMOD 2010 conference organizers • AAG 2010 complexity session organizers

References • Grimm V, Berger U, Bastiansen F, Eliassen S, Ginot V, Giske J, Goss-Custard J, Grand T, Heinz S K, Huse G, Huth A, Jepsen J U, Jørgensen C, Mooij W M, Müller B, Pe'er G, Piou C, Railsback S F, Robbins A M, Robbins M M, Rossmanith E, Rüger N, Strand E, Souissi S, Stillman R A, Vabø R, Visser U and DeAngelis D L (2006). A standard protocol for describing individual-based and agent-based models. Ecological Modelling 198 (1-2), 115-126. • Polhill, J. G., D. Parker, D. Brown, and V. Grimm. 2008. Using the ODD Protocol for Describing Three Agent-Based Social Simulation Models of Land-Use Change. Journal of Artificial Societies and Social Simulation 11 (2) 3. http://jasss.soc.surrey.ac.uk/11/2/3.html • Parker, D., D. Brown, J. G. Polhill, S. M. Manson, and P. Deadman. 2008. Illustrating a new ‘conceptual design pattern’ for agent-based models and land use via five case studies: the MR POTATOHEAD framework. Pages 29-62 in A. L. Paredes and C. H. Iglesias, eds. Agent-based Modelling in Natural Resource Management. Universidad de Valladolid, Valladolid, Spain.

More references • Morgan, F., and D. O'Sullivan. 2009. Using binary space partitioning to generate urban spatial patterns. Paper read at 4th International Conference on Computers in Urban Planning and Urban Management, at University of Hong Kong, Hong Kong. • http://www.umr-lisah.fr/rtra-projects/pr%C3%A9sentations%20LANDMOD/Florence_LE%20BER.pdf • Ligmann-Zielinska and Sun, In Press, Applying Time Dependent Variance-Based Global Sensitivity Analysis to Represent the Dynamics of an Agent-Based Model of Land Use Change, International Journal of GIS • Polhill, Gimona, and Gotts, forthcoming, Analysis of Incentive Schemes for Biodiversity Using a Coupled Agent-Based Model of Land Use Change and Species Metacommunity Model, IEMSS 2010 • GRIDSWEEPER: http://www.cscs.umich.edu/PmWiki/Farms/CSCSSoftware/field.php/GridSweeper/GridSweeper • OpenABM: www.openabm.org
