Introduction to Simulation Andy Wang CIS 5930-03 Computer Systems Performance Analysis
Simulations • Useful when the system is not available • Good for exploring a large parameter space • However, simulations often fail • Need both statistical and programming skills • Can take a long time
Common Mistakes • Inappropriate level of detail • More detail means more development time, more bugs, and longer run times • More detail requires more knowledge of parameters, which may not be available • E.g., requested disk sector • Better to start with a less detailed model • Refine as needed
Common Mistakes • Improper language • Simulation languages • Less time for development and statistical analysis • General-purpose languages • More portable • Potentially more efficient
Common Mistakes • Invalid models • Need to be confirmed by analytical models, measurements, or intuition • Improperly handled initial conditions • Should discard the initial part of a run • It is not representative of the system's steady-state behavior • Simulations that are too short • Results depend heavily on initial conditions
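Discarding the initial part of a run can be sketched as simple truncation. This is a minimal illustration, assuming a single-server FIFO queue that starts empty; the function names, the rates, and the warm-up length of 2000 jobs are hypothetical choices, not values from the slides.

```python
import random

def simulate_waits(n_jobs, arrival_rate, service_rate, seed):
    """Waiting times in a single-server FIFO queue that starts empty."""
    rng = random.Random(seed)
    waits = []
    wait = 0.0  # the first job finds an empty system: the initial condition
    for _ in range(n_jobs):
        waits.append(wait)
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        # The next job waits for whatever work is still left when it arrives.
        wait = max(0.0, wait + service - interarrival)
    return waits

def mean_after_warmup(samples, warmup):
    """Average the samples after discarding the first `warmup` observations."""
    kept = samples[warmup:]
    return sum(kept) / len(kept)

waits = simulate_waits(n_jobs=20000, arrival_rate=0.9, service_rate=1.0, seed=1)
print("mean including start of run:", mean_after_warmup(waits, 0))
print("mean after discarding 2000: ", mean_after_warmup(waits, 2000))
```

A fixed warm-up length is the crudest option; how much to discard depends on the model and is usually judged from the data.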
Common Mistakes • Poor random number generators • Safer to use well-known ones • Even well-known ones have problems • Improper selection of seeds • Need to maintain independence among random number streams • Bad idea to initialize all streams with the same seed (e.g., zeros)
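The seed problem is easy to reproduce. The sketch below uses Python's standard `random.Random` as a stand-in generator; the stream names and seed values are hypothetical. Note that distinct seeds are only a first step and do not by themselves prove that two streams are statistically independent.

```python
import random

def make_streams(seeds):
    """Create one generator object per seed."""
    return [random.Random(s) for s in seeds]

# Bad: every stream starts from the same seed, so the streams are identical.
same = make_streams([0, 0])
arrivals_bad = [same[0].random() for _ in range(5)]
services_bad = [same[1].random() for _ in range(5)]
print(arrivals_bad == services_bad)  # True

# Better: a distinct seed per stream gives each stream its own sequence.
distinct = make_streams([12345, 67890])
arrivals = [distinct[0].random() for _ in range(5)]
services = [distinct[1].random() for _ in range(5)]
print(arrivals == services)
```

With identical seeds, interarrival and service times would be perfectly correlated, which the modeled system almost certainly is not.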
Other Causes of Simulation Analysis Failure • Inadequate time estimate • Underestimate the time and effort • Simulation generally takes the longest time compared to modeling and measurement • Due to debugging and verification • No achievable goal • Needs to be quantifiable
Other Causes of Simulation Analysis Failure • Incomplete mix of essential skills • Project leadership • Modeling and statistics • Programming • Knowledge of modeled system • Inadequate level of user participation • Need periodic meetings with end users
Other Causes of Simulation Analysis Failure • Obsolete or nonexistent documentation • Inability to manage the development of a large, complex computer program • Needs to keep track of objectives, requirements, data structures, and program estimates • Mysterious results • May need more detailed models
Terminology • State variables: the variables whose values define the state of the system • E.g., length of a job queue for a CPU scheduler • Event: a change in the system state
Static and Dynamic Models • Static model: time is not a variable • E.g., E = mc² • Dynamic model: system state changes with time • E.g., CPU scheduling
Continuous and Discrete-time Model • Continuous-time model • System state is defined at all times • E.g., time spent executing a job, plotted against time • Discrete-time model • System state is defined only at instants in time • E.g., number of students attending this class, plotted for Tuesdays and Thursdays
Continuous and Discrete-state Model • Continuous-state model • Uses continuous state variables • E.g., time spent executing a job • Discrete-state model • Uses discrete state variables • E.g., number of jobs • Possible to have all four combinations of continuous/discrete time/state models
Deterministic and Probabilistic Model • Deterministic model • Output of a model can be predicted with certainty • Probabilistic model • Gives a different result for the same input parameters • Figures: output plotted against input for each model
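The distinction can be seen in a few lines of code. This is a minimal sketch with hypothetical function names; the exponential service-time distribution is an assumption chosen only for illustration.

```python
import random

def deterministic_total_time(n_jobs, service_time):
    """Same inputs always give the same output."""
    return n_jobs * service_time

def probabilistic_total_time(n_jobs, mean_service_time, rng):
    """Service times are drawn at random, so repeated runs differ."""
    return sum(rng.expovariate(1.0 / mean_service_time) for _ in range(n_jobs))

print(deterministic_total_time(10, 2.0))  # 20.0

rng = random.Random()  # unseeded, so each run of this script differs
print(probabilistic_total_time(10, 2.0, rng))
print(probabilistic_total_time(10, 2.0, rng))
```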
Linear and Nonlinear Models • Linear model • Output parameters are linearly correlated with input parameters • Nonlinear model • Otherwise
Stable and Unstable Models • Stable model • Settles down to a steady state • Unstable model • Otherwise
Open and Closed Models • Open model • Input is external to the model and is independent of the model • Closed model • No external input
Computer System Models • Generally • Continuous time • Discrete state • Probabilistic • Dynamic • Nonlinear
Selecting a Language for Simulation • Simulation language • General-purpose language • Extension of general-purpose language • Simulation package
Simulation Languages • Have built-in facilities • Time advancing • Event scheduling • Entity manipulation • Random-variate generation • Statistical data collection • Report generation • Examples: SIMULA, Maisie, ParSEC
General-purpose Languages • C++, Java • No need to learn a new language • Simulation languages may not be available • More portable • Can be optimized
Extensions of General-Purpose Languages • Provide routines commonly required in simulation • Examples: CSim, NS-2 (OTcl + C++)
Simulation Packages • Provide a library of data structures, routines, algorithms • Significant time savings • A simulation can be set up in as little as one day • However, not flexible for unforeseen scenarios
Types of Simulations • Emulation • Hybrid simulation • Monte Carlo simulation • Trace-driven simulation • Discrete-event simulation
Emulation and Hybrid Simulation • Emulation • A simulation using hardware/firmware • Hybrid simulation • A simulation that combines simulation and hardware • E.g., a 5-disk RAID • One simulated disk • Four real disks
Monte Carlo Simulation • A type of static simulation • Models probabilistic phenomena • Can be used to evaluate nonprobabilistic expressions • E.g., use the average of estimates to evaluate difficult integrals
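The integral example can be sketched as follows. The integrand exp(-x^2) on [0, 2], the sample count, and the name `mc_integral` are assumptions chosen for illustration; the integrand has a closed form through the error function, which makes the estimate easy to check.

```python
import math
import random

def mc_integral(f, a, b, n, rng):
    """Estimate the integral of f over [a, b] as (b - a) times the mean of f at n uniform points."""
    total = 0.0
    for _ in range(n):
        total += f(rng.uniform(a, b))
    return (b - a) * total / n

rng = random.Random(42)
estimate = mc_integral(lambda x: math.exp(-x * x), 0.0, 2.0, 100_000, rng)
exact = math.sqrt(math.pi) / 2 * math.erf(2.0)  # closed form via the error function
print(estimate, exact)
```

The error shrinks roughly as one over the square root of the number of samples, so each extra digit of accuracy costs about 100 times more samples.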
Trace-Driven Simulation • Trace: a time-ordered record of events on a real system • Needs to be as independent of the underlying system as possible • Storage-level trace may be specific to the cache replacement mechanisms above, the working set, the memory size, etc.
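A trace-driven simulation replays recorded events through a model. The sketch below assumes a trace of block IDs and an LRU cache as the modeled component; the trace contents and the name `simulate_lru` are hypothetical, and a real trace would be read from a recorded log.

```python
from collections import OrderedDict

def simulate_lru(trace, capacity):
    """Replay a trace of block IDs through an LRU cache and return the hit ratio."""
    cache = OrderedDict()
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = True
    return hits / len(trace)

# Hypothetical trace, hard-coded here to keep the example self-contained.
trace = [1, 2, 3, 1, 2, 4, 1, 2, 5, 1]
for capacity in (2, 3):
    print(capacity, simulate_lru(trace, capacity))
```

Because the input is fixed, both cache sizes see exactly the same workload, so the difference in hit ratio comes only from the design alternative being compared.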
Advantages of Trace-Driven Simulation • Credibility • Easy validation • Just compare measured vs. simulated numbers • Accurate workload • Preserves correlation and interference effects
Advantages of Trace-Driven Simulation • Less randomness • Deterministic input • Less variance • Fewer runs needed to get good confidence • Fairer comparison of different alternatives • Same deterministic input for each • Similarity to the actual implementation
Disadvantages of Trace-Driven Simulation • Complexity • More detailed simulation to take realistic trace inputs • Representativeness • Trace from one system may not be representative of the workload on another system • Can become obsolete quickly
Disadvantages of Trace-Driven Simulation • Finiteness • A trace of a few minutes may not capture enough activity • Single point of validation • Algorithms optimized for one trace may not work for other traces • Trade-off • Difficult to change workload characteristics
Discrete-Event Simulation • Uses discrete-state model • May use continuous or discrete time values
Common Components • Event scheduler • E.g., schedule event X at time T • Simulation clock • A time-advancing mechanism • Unit-time approach: advances time in small fixed increments • Event-driven approach: advances time automatically to the time of the next earliest event
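The event scheduler and an event-driven clock can be sketched together. This is a minimal illustration, not a full simulator; the `Simulator` class, the handler names, and the fixed 2.5 time-unit service time are all hypothetical.

```python
import heapq

class Simulator:
    """Minimal event scheduler with an event-driven clock."""

    def __init__(self):
        self.clock = 0.0
        self._events = []  # heap of (time, sequence number, handler)
        self._seq = 0      # tie-breaker so equal-time events run in schedule order

    def schedule(self, time, handler):
        """Schedule `handler` to run at simulated time `time`."""
        heapq.heappush(self._events, (time, self._seq, handler))
        self._seq += 1

    def run(self):
        while self._events:
            time, _, handler = heapq.heappop(self._events)
            self.clock = time  # jump straight to the next earliest event
            handler(self)

log = []

def arrival(sim):
    log.append(("arrival", sim.clock))
    sim.schedule(sim.clock + 2.5, departure)  # hypothetical fixed service time

def departure(sim):
    log.append(("departure", sim.clock))

sim = Simulator()
sim.schedule(4.0, arrival)
sim.schedule(1.0, arrival)
sim.run()
print(log)
```

The clock never ticks through idle time: it moves from 1.0 to 3.5 to 4.0 to 6.5, which is what distinguishes the event-driven approach from unit-time stepping.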
Common Components • System state variables • Event routines (handlers) • Input routines • E.g., number of repetitions • Report generator • Initialization routines • Beginning of a simulation, iteration, repetition
Common Components • Trace routines (for debugging) • Should have an on/off feature • Snapshot/continue from a snapshot • Dynamic memory management • Main program
Event-Set Algorithms • How to track events • Ordered linked list (< 20 events) • Indexed linked list (20 – 120 events) • Calendar queue • Tree structure (> 120 events) • E.g., heap
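Two of these structures can be compared directly. In the sketch below a sorted Python list stands in for the ordered linked list (an assumption, since Python has no built-in linked list), and `heapq` provides the heap; the event tuples are hypothetical.

```python
import bisect
import heapq

events = [(7.0, "C"), (2.0, "A"), (5.0, "B"), (9.0, "D")]

# Ordered list: each insertion is O(n), and the earliest event sits at the front.
ordered = []
for ev in events:
    bisect.insort(ordered, ev)
from_list = [ordered.pop(0) for _ in range(len(ordered))]

# Heap: insertion and removal of the earliest event are both O(log n).
heap = []
for ev in events:
    heapq.heappush(heap, ev)
from_heap = [heapq.heappop(heap) for _ in range(len(heap))]

print(from_list == from_heap)  # True
```

Both structures hand back events in time order; they differ only in cost, which is why the size of the event set guides the choice.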