Pilot Validation Methodology for Agent-Based Simulations Workshop: WHERE ARE WE? Dr. Michael Bailey, Operations Analysis Division, Marine Corps Combat Development Command, 01 October 2007
WHO ARE WE? • Analysts • Developers • Accreditors of Simulations • Planners, Trainers, Experimenters • Academics
BRIEF RECAP • Can I use this ABS to support the Scientific Method? • Trips around the “O-Course” • What’s an Agent-based Simulation? • produces surprises, emergent behavior • focus on Irregular Warfare applications • What is Validation? • Provide support for the question: is the simulation useful in answering the analytical question?
FINDINGS • Overall validation framework • Decomposition of the process of simulation development • Basis for declaring a simulation inappropriate • Framework for analysis • Matching analysis and simulation
THIS WORKSHOP • Present the framework • Attempt to apply it • Learn in the process Your contributions are critical to our success
Pilot Validation Methodology for Agent-Based Simulations Workshop: Introduction to the Pilot ABS Validation Methodology. Mr. Edmund Bitinas, Northrop Grumman, 01 October 2007
Agenda • Goals of Framework & Desired Result • What Is Missing From Current V&V Process • Theory of Validation • Lunch • Framework for Validation • Sample Methodology Approach • Break • Open Discussion
Focus of Pilot Framework • Applicable to all models/simulations • Specifically developed for agent-based simulations (ABSs) and irregular warfare (IW) applications
Types of Model Validation • Expected-value, physics-based simulations • Verifiable through experimentation • Random effects introduce predictable error • Stochastic, probability-based models • Distribution of model outcomes matches the distribution of observed outcomes • Student's t-test, among others • Model-generated and observed distributions are treated as identical unless they can be shown statistically to differ • Probability of being correct
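The stochastic comparison above can be sketched in a few lines. This is a minimal illustration with invented data; Welch's two-sample t statistic stands in for the Student's t-test named on the slide, and the critical value is approximate:

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Welch's two-sample t statistic: compares the means of the
    model-generated and observed outcome distributions."""
    ma, mb = statistics.mean(sample_a), statistics.mean(sample_b)
    va, vb = statistics.variance(sample_a), statistics.variance(sample_b)
    return (ma - mb) / math.sqrt(va / len(sample_a) + vb / len(sample_b))

# Hypothetical data: six model replications vs. six field observations.
model_runs = [10.2, 9.8, 10.5, 10.1, 9.9, 10.3]
observed   = [10.0, 10.4, 9.7, 10.2, 10.1, 9.8]

t = welch_t(model_runs, observed)
# With |t| well under the ~2.2 critical value for these sample sizes,
# we cannot statistically distinguish the two distributions.
print(abs(t) < 2.2)  # True
```

Failing to reject the null hypothesis is exactly the "identical unless proven otherwise" posture the slide describes: it yields a probability of being correct, not a proof of validity.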
ABS Validation Limitations • Agents attempt to replicate, at least in part, the human decision-making process • Humans may have more information than the agents • Humans may include emotions and experience • Humans may think/plan ahead • Humans may anticipate the actions of others • Two humans, given the same information, may make different decisions • Thus, traditional validation may not be meaningful
Additional Complications • Traditional models can be validated for a class of problems • e.g., Campaign models • Some ABSs are not models of anything in particular • Agent behaviors and capabilities are assigned by the user, via input, for a specific application • The software merely executes the input behaviors • Examples: Pythagoras, MANA, others • Question: How much behavior can be/needs to be reproduced?
What Constitutes ABS Validity? • An ABS may be valid: • For a specific application • Over a limited range of inputs • If the decisions it makes could be made in real life • If the emerging complex behavior can be traced to a realistic root cause(s) • But, an ABS is NOT valid if one can prove that it is invalid • Trying to invalidate an ABS for an application (and failing to do so) may result in lower risk in using the ABS for the application
Framework Goals • Determine the required accuracy • What is sufficient accuracy for the intended application? • Find techniques for uncovering invalid models • Validation may not be possible • Establish the boundaries of validity • May limit applicability to only a portion of the intended use • Ensure the process is not resource intensive • Accomplish the process with a small fraction of total resources available for the application
Desired Result • Develop a framework process that is: • Transparent • Traceable • Reproducible • Communicable
Pilot Validation Methodology for Agent-Based Simulations Workshop: Theory of Validation. Dr. Eric Weisel, WernerAnderson, Inc., 01 October 2007
Basic Questions in Simulation Science • What is simulation? • Basic structures • Properties of those structures • How is a simulation related to other … ? • Abstraction • Validity • Fidelity … • Objective: Useful theorems about simulation
Objectives of Simulation Science • Useful theorems about simulation • Properties of structures • Capabilities and limitations of simulation • Are there systems which cannot be simulated • Time complexity • Interoperability and composability
Objectives of Simulation Science • Foundational sciences • Mathematics • Computability theory • Logic • Model theory • Systems theory • Not reinventing basic structures of foundational sciences
Objectives of Simulation Science • A common approach to feasibility is to try to build it • A better way: build useful theorems about simulation • Properties of structures • Capabilities and limitations of simulation • Time complexity • Interoperability and composability
Survey of Theoretical Framework • Model: a physical, mathematical, or otherwise logical representation of a system, entity, phenomenon, or process (Department of Defense 1998)
Survey of Theoretical Framework • Simulation: a method for implementing a model over time. Simulation is also a technique for testing, analysis, or training in which real-world systems are used, or where a model reproduces real-world and conceptual systems (Department of Defense 1998)
Survey of Theoretical Framework • Labeled Transition System
Survey of Theoretical Framework • Simulation
Comparison of Transition Systems • The model behaves in a similar way to a natural system
Comparison of Transition Systems • Simulation is a one-way bisimulation: a simulation relation R means T_M simulates T_I (i.e., T_M is valid)
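The one-way condition above can be made concrete. The sketch below is an illustrative assumption, not from the slides: transition systems are encoded as sets of (state, label, state) triples, and `is_simulation` checks the standard condition that every labeled step of T_I from a related state can be matched by T_M, landing in a related pair:

```python
def is_simulation(relation, trans_i, trans_m):
    """True if `relation` (pairs (i, m)) is a simulation relation:
    each labeled transition of T_I from i must be matched by a
    same-labeled transition of T_M from m into a related pair.
    This is the sense in which T_M simulates T_I (T_M is valid)."""
    for (i, m) in relation:
        for (src, a, i2) in trans_i:
            if src != i:
                continue
            if not any(ms == m and am == a and (i2, m2) in relation
                       for (ms, am, m2) in trans_m):
                return False
    return True

# Invented example: T_I is the system of interest, T_M the model.
T_I = {("s0", "go", "s1"), ("s1", "stop", "s0")}
T_M = {("q0", "go", "q1"), ("q1", "stop", "q0"), ("q1", "go", "q1")}
R = {("s0", "q0"), ("s1", "q1")}

print(is_simulation(R, T_I, T_M))  # True: T_M matches every step of T_I
```

Note the asymmetry: T_M may have extra behavior (the self-loop on q1) and still simulate T_I; a two-way check of the same condition would be a bisimulation (equivalence).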
Simulation Relations • Equivalence
Simulation Relations • Metric
Composition of Models • Validity of the composition of models • Show that a simulation relation exists for the composition of valid models
Composition of Models • There may be surprises in the underlying theory at the foundation of simulation
Verification and Validation Summary • Validation: there are three key elements embedded within the U.S. DoD validation definition: (1) accurate, (2) real world, (3) intended use • Validation is the process of determining the degree to which a model and its associated data are an accurate representation of the real world from the perspective of the intended uses of the model (DoDD 5000.1 and DoDI 5000.61)
The V&V Continuum • We want to have confidence (or a lack thereof) that our model represents the “real world” • Conjecture: demonstrating validity mathematically is intractable at best, undecidable at worst • Two paths to get there • Since we can’t prove validity, we must rely on the scientific method to build confidence/assess risk
Theory Supports Framework • Validation question: assessment of the risk of Type II error in applying the scientific method • Null hypothesis: the abstraction (ideal simulation) and the simulation relation (R) capture a formal representation of the intended/specific use
Matching Tool to Application The Road Ahead: Build classes of models and simulation relations such that:
Pilot Validation Methodology for Agent-Based Simulations Workshop: Ten-Minute BREAK, 1100-1110. Please return to the auditorium
Pilot Validation Methodology for Agent-Based Simulations Workshop: Framework for Validation. Dr. Eric Weisel and Ms. Lisa Jean Moya, WernerAnderson, Inc., 01 October 2007
Typical Agent DMSO, VV&A Recommended Practices Guide – Human Behavioral Representation (HBR) Special Topic Moya & Tolk, Toward a Taxonomy of Agents & MAS
Typical Definitions • U.S. DoD: Validation is the process of determining the degree to which a model and its associated data are an accurate representation of the real world from the perspective of the intended uses of the model. • U.K. Ministry of Defence: Validation – to establish that the model/process is fit for purpose. • ASME/AIAA: The process of determining the degree to which a model is an accurate representation of the real world from the perspective of the intended use of the model. • DOE: The process of determining the degree to which a computer model is an accurate representation of the real world from the perspective of the intended model applications. • IEEE: The process of evaluating a system or component during or at the end of the development process to determine whether it satisfies specified requirements.
Three Main Elements • The model • The thing being simulated • “Real world” • Empirical data • Referent • Abstraction • Set of bounding principles • Accuracy requirements • Intended use
Three main elements U.S. DoD: Validation is the process of determining the degree to which a model and its associated data are an accurate representation of the real world from the perspective of the intended uses of the model. • The model • The thing being simulated • “Real world” • Empirical data • Referent • Abstraction • Set of bounding principles • Accuracy requirements • Intended use
Physics-Based Modeling • Conceptual model validation • Mathematical equations • Difference-equation solution algorithm • Results validation • Empirical data • Experimental testing • Predictive capabilities • Acceptable error tolerance
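For the physics-based case, results validation reduces to a concrete check. A minimal sketch under invented data and a hypothetical 5% tolerance (the slide does not prescribe specific values): every prediction must fall within the acceptable error tolerance of its empirical observation:

```python
# Results validation for a physics-based model: accept the model only
# if each prediction is within a stated relative error tolerance of
# the corresponding empirical measurement. All numbers are invented.

def within_tolerance(predicted, observed, rel_tol):
    """Relative-error check of each prediction against its observation."""
    return all(abs(p - o) <= rel_tol * abs(o)
               for p, o in zip(predicted, observed))

predicted = [98.5, 105.2, 111.0]   # model output (e.g. range in meters)
observed  = [100.0, 104.0, 110.0]  # experimental test data
print(within_tolerance(predicted, observed, 0.05))  # True at 5% tolerance
```

For agent-based simulations, as the next slide notes, the comparison is harder: little empirical data exists, so evaluation shifts from output error to the conceptual model, knowledge base, and integration.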
Agent Validation • DMSO, VV&A Recommended Practices Guide – Human Behavioral Representation (HBR) Special Topic • Moya & Tolk, Toward a Taxonomy of Agents & MAS • Little empirical data • Evaluate: • Conceptual model design • Knowledge base • Engine and knowledge-base implementation • Integration with the simulation environment
Spiral Methodology for Invalidating an ABS • Assess the risk of using the ABS for the specific/intended use • Specific use = applying the ABS for a specific purpose (user-centric) • Intended use = developing the ABS for a specific reason (developer-centric) • Communicate that risk to the consumer of the results of the ABSVal process • Apply the scientific method using invalidation techniques • Ideally performed at each step in the process • Realistically, given resource constraints, conduct a cost-benefit tradeoff to determine techniques that: 1. will invalidate the ABS quickly, or 2. will provide a significant reduction in risk in using the ABS for the specific/intended use
Application of Scientific Method • Apply the invalidation techniques with the highest cost-benefit tradeoffs • Apply additional invalidation techniques as resources allow • Each technique will add to or subtract from the level of risk • Communicate the reasoning behind the techniques chosen and the areas of the process chosen for application of the scientific method • If the null hypothesis is rejected at any point, the ABSVal process is done • Assuming the reason for rejection cannot be easily fixed, or the use modified • If the null hypothesis is not rejected, some decreased degree of risk can be conveyed to the consumer • If the null hypothesis is not rejected, but the ABSVal performer does not have a high degree of confidence in the validity of a given piece: • The ABSVal performer can attempt to use another technique to invalidate that particular piece, and/or • Can convey a higher level of perceived risk to the consumer
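The selection loop described above can be sketched as pseudocode made runnable. The technique names, costs, and risk-reduction numbers below are invented for illustration; the structure is the point: order techniques by cost-benefit, stop on invalidation or budget exhaustion, and report residual risk when invalidation fails:

```python
# Sketch of the spiral invalidation process: apply techniques in
# cost-benefit order until one rejects the null hypothesis (ABS
# invalid for the use) or resources run out. Data are hypothetical.

def run_absval(techniques, budget):
    """techniques: list of (name, cost, risk_reduction, invalidates_fn).
    Returns (verdict, residual_risk, applied_technique_names)."""
    residual_risk = 1.0
    applied = []
    # Highest risk reduction per unit cost first (cost-benefit tradeoff).
    for name, cost, reduction, invalidates in sorted(
            techniques, key=lambda t: t[2] / t[1], reverse=True):
        if cost > budget:
            continue
        budget -= cost
        applied.append(name)
        if invalidates():                    # null hypothesis rejected
            return "invalid", 1.0, applied
        residual_risk *= (1.0 - reduction)   # failed to invalidate
    return "not proven invalid", residual_risk, applied

techniques = [
    ("SME review",        2.0, 0.30, lambda: False),
    ("boundary analysis", 1.0, 0.20, lambda: False),
    ("Turing test",       5.0, 0.40, lambda: False),
]
verdict, risk, applied = run_absval(techniques, budget=4.0)
print(verdict, round(risk, 2))  # not proven invalid 0.56
```

Note the asymmetric outcomes: a single rejection ends the process, while each failure to invalidate only lowers (never eliminates) the risk conveyed to the consumer.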
The V&V Continuum • We want to have confidence (or a lack thereof) that our model represents the “real world” • Conjecture: demonstrating validity mathematically is intractable at best, undecidable at worst • Two paths to get there • Since we can’t prove validity, we must rely on the scientific method to build confidence/assess risk
Assessment of Risk Based on Utility Theory • T_i = a validation technique • For each {T_i}, there is a risk of Type II failure R({T_i}) and a cost C({T_i}) • For each T_i: • Impact of Type II failure: I(T_i) ~ V_I(I) • Likelihood of Type II failure: L(T_i) ~ V_L(L) • Risk of Type II failure: R(T_i) = f(I, L) • R(T_i) = w_I V_I(I) + w_L V_L(L)
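The weighted-utility formula on this slide is directly computable. A minimal sketch, assuming weights w_I = 0.6 and w_L = 0.4 and utility values on [0, 1], all of which are illustrative assumptions rather than values from the workshop:

```python
# R(T_i) = w_I * V_I(I(T_i)) + w_L * V_L(L(T_i))
# Weights and utility scores below are invented for illustration.

W_I, W_L = 0.6, 0.4  # assumed weights on impact and likelihood

def risk(impact_utility, likelihood_utility):
    """Utility-theoretic risk of a Type II failure for technique T_i."""
    return W_I * impact_utility + W_L * likelihood_utility

# Two hypothetical techniques: high impact / low likelihood vs.
# low impact / high likelihood.
print(risk(0.8, 0.3))  # 0.6*0.8 + 0.4*0.3 = 0.60
print(risk(0.4, 0.7))  # 0.6*0.4 + 0.4*0.7 = 0.52
```

Paired with each technique's cost C(T_i), these risk scores drive the cost-benefit ordering used in the spiral methodology.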
Communicating Risk • By communicating the level of perceived risk in an ABS after failing to invalidate it, the consumer is provided with a means for assessing the ABS’s applicability to hard-to-quantify, non-traditional areas or activities, such as Irregular Warfare (IW) • The consumer can make an informed decision on whether to use the ABS for a given specific purpose/intended use, given: • The fact that the ABS was not proven invalid • The number and type of techniques that were applied at each step in the process, and • The degree of risk that the ABSVal performer perceives • It is important to note that the ABSVal performer is not communicating that the ABS is valid, but rather that the ABS was not proven to be invalid • The ABSVal performer is providing sufficient evidence that supports that the ABS is adequate for the specific/intended use
Invalidation Techniques • Comparison to other models • Turing test • Intuition • Functionality assessment • Completeness assessment • Formal methods • Algorithm review • Input range validation • Control parameter review • Component testing • SME validation • Executable compared to concept/referent • Results validation • Mini-analysis • Accuracy • Theoretical model compared to concept/referent • Assumption testing • SME review • Mathematical model compared to theoretical model • Boundary analysis • Algorithm review • Spreadsheet modeling • Software code compared to mathematical model • Existence of required outputs • Symbolic debugger • Code walk-through • Data compared to executable • Existence of required inputs