
So you think that you want to model something?



  1. So you think that you want to model something? Blaine Gaither, blaine@acm.org

  2. Outline Goal: • Understand the trade-offs involved in benchmarking, simulation and analytical modeling Outline • Problem definition • It takes two models to tango • Model level of detail • Workload Characterization • Benchmarking • Model Validation • Queueing-based (analytic) models • Simulation models

  3. Problem definition • What are the questions that you really want answered? • Refine • What specific information is needed to influence the decision? • What level of confidence is needed to influence others to adopt your recommendations? • When must the decision be made? • What are the time and money constraints for this study?

  4. Problem Definition Performance Evaluation is a Decision-Making Process • Recognition of need • Problem formulation/identification • Model building • Data collection, model parameterization • Model solution • Benchmarking • Analytic • Simulation • Model validation and analysis • Results interpretation • Iteration • Decision making

  5. It takes two models to tango • System • The most accurate model of a system is the system itself • Working with the real system is often impractical • Reasons why? • Models abstract the real system: analytic, simulation, hybrid • Workload • A benchmark is a model of a workload • The most accurate model of the workload is the workload itself • Working with the actual workload is often impractical • Reasons why? • Workload characterization helps abstract the important workload characteristics • Benchmarks are sometimes used to model the workload

  6. Level of Detail Risk of “going Rainman” (Jay Veazey) • Do you really need to model every detail? • Avoid model parameters that cannot be accurately measured • We need to find the right level of abstraction • Identify the key characteristics of: • The workload • For OLTP it might include IO rate, instructions executed per transaction, and lock contention rate, … • The system • For OLTP it might be service rates, latencies and the ability to process lock contention • The rest is just a distraction
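
A minimal sketch of what the chosen level of abstraction can look like as model inputs: the structs below simply collect the OLTP workload and system characteristics named on this slide as parameters. Field names and the example values are illustrative, not taken from the original material.

// Illustrative parameter sets for an OLTP model (hypothetical names and values)
struct OltpWorkload {
    double io_rate;               // IOs per second
    double io_block_size;         // bytes per IO
    double instr_per_txn;         // instructions executed per transaction
    double lock_contention_rate;  // lock conflicts per transaction
};

struct OltpSystem {
    double cpu_ghz;               // core clock frequency
    double io_service_time_s;     // seconds per IO
    double memory_latency_ns;     // loaded memory latency
    double lock_service_time_s;   // time to resolve one lock conflict
};

int main() {
    OltpWorkload w{1200.0, 8192.0, 1.0e6, 0.02};   // example values only
    OltpSystem   s{3.0, 0.004, 180.0, 0.0005};
    (void)w; (void)s;   // everything not captured here is deliberately abstracted away
    return 0;
}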

  7. Benchmarking, Simulation and Analytical Modeling

  8. Benchmarking types and pitfalls • Real application? • Includes all I/O? • Real inputs? • Repeatability? • Can you scale the inputs? • Real hardware? • Kernel program? • May exclude I/O, loops, subroutines, … • E.g., SPEC CPU • Benchmark program? • Scaled-down application? • Does it still exercise scaling bottlenecks? • Synthetic behavior? • E.g., TPC-C uses: • Real database code (Oracle, SQL Server) • Synthetic schema and data (models a hypothetical warehouse) • Synthetic workload (models users)

  9. Workload characterization Measure real systems to collect: • Workload parameters for your model • Critical aspects of the workload for making the decision • Examples: • Transaction types and rates • Number of users • IO rate • IO block size • Instructions executed per transaction, … • Remember we may need to scale the workload up or down for specific model scenarios • Measurement of operational variables is preferred • Variables that can be established beyond doubt by measurement are called operationally testable • GIGO (garbage in, garbage out) • Data to help validate your model • Throughput • Response times • Utilizations • Queue lengths, …
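
Because the slide stresses operationally testable variables, here is a small sketch of the operational-analysis arithmetic involved (in the spirit of Denning and Buzen): utilization, throughput and per-transaction service demand are derived from directly measurable quantities. The measured numbers are made up for illustration.

#include <cstdio>

int main() {
    double T = 60.0;     // observation interval (s), measured
    double B = 36.0;     // busy time of a device over the interval (s), measured
    double C = 1200.0;   // transactions completed in the interval, measured

    double U = B / T;    // device utilization
    double X = C / T;    // system throughput (transactions/s)
    double D = U / X;    // service demand per transaction at that device (s)

    std::printf("U = %.2f, X = %.1f trans/s, D = %.4f s/trans\n", U, X, D);
    return 0;
}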

  10. Validation • Don’t just look at the predicted performance metric • Compare known (validation) cases for: • proper queue lengths, • number of visits and • utilizations on as many components as you can. Understand deviations.

  11. Validation • Never trust model results until validated • Are the results reasonable? • Sources of error • Wrong workload • Poor workload characterization • Missed a key aspect of the workload • Measurement error • Improperly scaled the workload for the new situation • Benchmarking • Instrumentation can perturb the system (measurement artifact) • Not really the system we want to measure! • Analytical model • (Symptom) Improper queue lengths on validation cases • Not enough detail, or there are software blockages • Simulation • Programming errors? • Too much detail • A detailed model requires more workload assumptions, which are subject to error • Are the random numbers really random? • Untested corner cases? • High-value decisions may merit cross-checking between more than one approach
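
One way to act on “compare known (validation) cases”: a hypothetical check that compares predicted and measured values component by component and flags deviations that need to be understood. Metric names, numbers and the 10% threshold are illustrative only.

#include <cmath>
#include <cstdio>

int main() {
    struct Check { const char* name; double predicted; double measured; };
    Check checks[] = {
        {"CPU utilization",   0.62, 0.60},
        {"Disk utilization",  0.45, 0.38},
        {"Mean queue length", 1.70, 1.95},
    };
    for (const Check& c : checks) {
        double err = std::fabs(c.predicted - c.measured) / c.measured;
        std::printf("%-18s pred=%.2f meas=%.2f err=%4.1f%% %s\n",
                    c.name, c.predicted, c.measured, 100.0 * err,
                    err > 0.10 ? "<-- understand this deviation" : "");
    }
    return 0;
}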

  12. Queueing-based models • Open queueing networks • Acyclic network of queues • Uses Markovian models, M/M/n, M/G/1, … • Closed queueing networks • Mean Value Analysis (MVA)* *P. J. Denning and J. P. Buzen, Computing Surveys, Vol. 10, No. 3, September 1978, http://cs.gmu.edu/cne/pjd/PUBS/csurv-opanal.pdf
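
As a concrete illustration of the closed-network case, below is a minimal sketch of exact Mean Value Analysis for a single customer class with queueing stations and think time. The service demands, think time and population are made up; the recurrence follows the standard MVA equations.

#include <cstdio>
#include <vector>

int main() {
    std::vector<double> D = {0.010, 0.030, 0.025};  // service demands (s) per station, illustrative
    double Z = 1.0;                                 // think time (s)
    int N = 20;                                     // number of users

    std::vector<double> Q(D.size(), 0.0);           // mean queue lengths
    double X = 0.0;
    for (int n = 1; n <= N; ++n) {
        double R = 0.0;
        for (size_t i = 0; i < D.size(); ++i)
            R += D[i] * (1.0 + Q[i]);               // residence time at station i
        X = n / (R + Z);                            // system throughput at population n
        for (size_t i = 0; i < D.size(); ++i)
            Q[i] = X * D[i] * (1.0 + Q[i]);         // Little's law applied per station
    }
    std::printf("X = %.2f trans/s\n", X);
    for (size_t i = 0; i < D.size(); ++i)
        std::printf("station %zu: U = %.2f, Q = %.2f\n", i, X * D[i], Q[i]);
    return 0;
}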

  13. Example of an Open Queueing Network Approach Environment: • Limited resources and short time-frames • External chip-sets and CPUs • Never know enough detail, soon enough • No time to make decisions based upon detailed simulation • Concentrate on an accurate understanding of workloads: • TPC-C, TPC-E, TPC-H, … Three components: • Characterize memory/interconnect workloads and path-length • Parameters include: • Memory size, cache sizes, coherency/cache/TLB behavior, instructions/trans, memory accesses per transaction, … • Use CPI models to model CPU core throughput • Modeled at the Xbar interface • Parameters include: • CPU GHz • Database size, path lengths, kernel vs. user, … • Use queueing models to model the northbridge and chip-sets • Parameters include: • Memory organization and speed • Link speed and configuration, ... • Coherency protocol design

  14. CPU Core Throughput = F(Memory Latency, Cache Size, Memory Size, Path Length, …) • Calculate cycles per instruction (CPI) at the given memory latency • Determine throughput as a function of CPI, clock frequency and instruction path length per transaction
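
A minimal sketch of that calculation with made-up constants: CPI is assumed to grow linearly with memory latency through a miss rate, and transaction throughput follows from clock frequency, CPI and path length per transaction.

#include <cstdio>

double cpi(double mem_latency_cycles) {
    double cpi_core        = 0.8;    // CPI with a perfect memory system (assumed)
    double misses_per_inst = 0.005;  // off-chip misses per instruction (assumed)
    return cpi_core + misses_per_inst * mem_latency_cycles;
}

double throughput(double ghz, double mem_latency_cycles, double path_length) {
    double ips = ghz * 1e9 / cpi(mem_latency_cycles);  // instructions per second
    return ips / path_length;                          // transactions per second
}

int main() {
    // e.g., a 3 GHz core, 300-cycle memory latency, 1e6 instructions per transaction
    std::printf("X = %.0f trans/s\n", throughput(3.0, 300.0, 1.0e6));
    return 0;
}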

  15. Chip-set Latency = F(Demand, …) • Determine the number of visits to each resource, and resource utilizations for the load • Then sum the service times and queueing delays that impact memory latency
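
A sketch of that step under simple assumptions: each chip-set/memory resource is treated as an M/M/1-like queue, its utilization comes from the offered access rate, and visit-weighted residence times are summed into the loaded memory latency. Resource names, visit counts and service times are illustrative.

#include <cstdio>

int main() {
    struct Resource { const char* name; double visits; double service_ns; };
    Resource res[] = {
        {"northbridge", 1.0, 20.0},
        {"memory bus",  1.0, 35.0},
        {"DRAM",        1.0, 45.0},
    };
    double access_rate = 10.0e6;   // offered memory accesses per second (assumed)

    double latency_ns = 0.0;
    for (const Resource& r : res) {
        double U = access_rate * r.visits * r.service_ns * 1e-9;  // utilization
        latency_ns += r.visits * r.service_ns / (1.0 - U);        // service + queueing delay
    }
    std::printf("loaded memory latency = %.1f ns\n", latency_ns);
    return 0;
}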

  16. Solve Balance Equations • Solve the CPU-throughput and chip-set-latency relations together; the solution point is where the two agree • Typical accuracy: ±2-3%
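
A sketch of how such a balance can be solved numerically, assuming the two relations from the previous slides: iterate between the CPU model (throughput as a function of memory latency) and the chip-set model (latency as a function of the demand that throughput creates) until they agree at the solution point. All constants are illustrative.

#include <cmath>
#include <cstdio>

int main() {
    const double ghz = 3.0, path_length = 1.0e6;   // CPU side (assumed)
    const double accesses_per_txn = 5000.0;        // memory accesses per transaction (assumed)
    const double base_service_ns = 100.0;          // unloaded chip-set latency (assumed)

    double latency_ns = base_service_ns;           // initial guess
    double X = 0.0;
    for (int iter = 0; iter < 100; ++iter) {
        // CPU model: CPI rises with latency (converted from ns to cycles), throughput follows
        double cpi = 0.8 + 0.005 * latency_ns * ghz;
        X = ghz * 1e9 / cpi / path_length;         // transactions per second

        // Chip-set model: latency rises with the offered access rate (M/M/1-like)
        double U = X * accesses_per_txn * base_service_ns * 1e-9;
        double new_latency = base_service_ns / (1.0 - U);

        if (std::fabs(new_latency - latency_ns) < 0.01) break;
        latency_ns = new_latency;
    }
    std::printf("solution point: X = %.0f trans/s, latency = %.1f ns\n", X, latency_ns);
    return 0;
}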

  17. Simulation models • Simulation • Types of simulations for use in capacity planning • Transaction oriented • Modeled from the point of view of the transaction visiting services • Process oriented • Modeled from the point of view of either transactions or servers or both • Workload source • Trace-driven – perhaps traces of real system activity • Stochastic – use of random number generators • Statistical tools can be used to: • Reduce the simulation time • Compute confidence intervals • Determine whether a change made to a system has a statistically significant impact on performance
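
As one example of the statistical tools mentioned above, the sketch below computes a 95% confidence interval for a simulated response time from independent replications; the replication values are made up, and 2.365 is the t-value for 95% confidence with 7 degrees of freedom.

#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    std::vector<double> resp_time = {4.1, 3.8, 4.4, 4.0, 4.3, 3.9, 4.2, 4.1};  // one mean per replication

    double n = resp_time.size(), mean = 0.0, var = 0.0;
    for (double x : resp_time) mean += x;
    mean /= n;
    for (double x : resp_time) var += (x - mean) * (x - mean);
    var /= (n - 1.0);

    double t = 2.365;                          // Student's t, 95%, 7 degrees of freedom
    double half = t * std::sqrt(var / n);
    std::printf("mean = %.2f, 95%% CI = [%.2f, %.2f]\n", mean, mean - half, mean + half);
    return 0;
}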

  18. CSIM example: M/M/1 queue

#include "cpp.h"                         // CSIM C++ header (name may differ by CSIM version)

facility *q;                             // the single-server facility
box *resp_time;                          // response-time statistics box
FILE *fp;
double simTime = 10000.0;                // illustrative values, not from the slide
double interArrival = 1.0;
double serviceTime = 0.8;

void customer();                         // forward declaration

extern "C" void sim()                    // CSIM model entry point (declaration assumed)
{
    create("sim");
    fp = fopen("mm1.out", "w");
    set_output_file(fp);
    q = new facility("q");               // construct facility
    resp_time = new box("resp time");    // construct stat collection object
    while (clock < simTime) {
        customer();                      // invoke customer process
        hold(exponential(interArrival)); // wait for the next arrival
    }
    report();                            // create report
}

void customer()                          // customer process
{
    TIME t1;
    create("customer");                  // create customer
    t1 = resp_time->enter();             // stat gather
    q->use(exponential(serviceTime));    // use facility for service time
    resp_time->exit(t1);                 // stat gather
}

Average waiting time = s·r/(1 − r), where r is utilization and s is service time; mean response time = s + s·r/(1 − r) = s/(1 − r)
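
A small analytic cross-check of the kind the validation slides recommend: using the M/M/1 relationship quoted above, compute the expected waiting and response times so the simulation output can be compared against them. The rates below are illustrative and carry the same meaning as interArrival and serviceTime in the sketch above.

#include <cstdio>

int main() {
    double interArrival = 1.0, serviceTime = 0.8;
    double rho  = serviceTime / interArrival;        // server utilization
    double wait = serviceTime * rho / (1.0 - rho);   // mean waiting (queueing) time
    double resp = serviceTime / (1.0 - rho);         // mean response time = service + waiting
    std::printf("rho = %.2f, wait = %.2f, response = %.2f\n", rho, wait, resp);
    return 0;
}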

  19. Texts by these authors are great • The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, … by R. K. Jain • The Practical Performance Analyst by Neil Gunther • Performance by Design: Computer Capacity Planning by Example by Daniel A. Menasce, Lawrence W. Dowdy and Virgilio A. F. Almeida • Fundamentals of Performance Modeling by Michael K. Molloy • Getting Started: CSIM Simulation Engine (C++ Version) • Herb Schwetman, “CSIM19: A Powerful Tool for Building System Models,” Proceedings of the 2001 Winter Simulation Conference, B. A. Peters, J. S. Smith, D. J. Medeiros, and M. W. Rohrer, eds.
