In the Name of the Most High Performance Evaluation of Computer Systems: Introduction By Behzad Akbari, Tarbiat Modares University, Spring 2009
Outline • Introduction to performance evaluation • Objectives of performance evaluation • Techniques of performance evaluation • Metrics in performance evaluation
Introduction • Computer system users, administrators, and designers are all interested in performance evaluation. • The goal of performance evaluation is to provide the highest performance at the lowest cost. • Performance evaluation plays an important role in the selection of computer systems, the design of systems and applications, and the analysis of existing systems.
Objectives of Performance Study • Evaluating design alternatives (system design) • Comparing two or more systems (system selection) • Determining the optimal value of a parameter (system tuning) • Finding the performance bottleneck (bottleneck identification) • Characterizing the load on the system (workload characterization) • Determining the number and sizes of components (capacity planning) • Predicting the performance at future loads (forecasting).
Basic Terms • System: any collection of hardware, software, and networks. • Metrics: the criteria used to analyze the performance of the system or its components. • Workloads: the requests made by the users of the system.
Performance Evaluation Activities • Performance evaluation of a system can be done at different stages of system development • System in the planning and design stage • Use high-level models to obtain performance estimates for alternative system configurations and alternative designs. • System is operational • Measure the system behavior with a view to improving performance • Develop a validated model that can be used for performance prediction and capacity planning.
Techniques for Performance Evaluation • Performance measurement • Obtain measurement data by observing the events and activities on an existing system • Performance modeling • Represent the system by a model and manipulate the model to obtain information about system performance
Performance Measurement • Measure the performance directly on a system • Need to characterize the workload placed on the system during measurement • Generally provide the most valid results • Nevertheless, not very flexible • May be difficult (or even impossible) to vary some workload parameters
Performance Modeling • Model • An abstraction of the system obtained by making a set of assumptions about how the system works • Captures the essential characteristics of the system • Reasons for using models • Experimenting with the real system may be • too costly • too risky, or • too disruptive to system operation • The system may only be in the design stage
Performance Modeling • Workload characterization • Capture the resource demands and intensity of the load brought to the system • Performance metrics • The measure of interest, such as mean response time, the number of transactions completed per second, the ratio of blocked connection requests, etc.
Performance Modeling • Solution methods • Analytic modeling • Simulation modeling
Analytic Modeling • Mathematical methods are used to obtain solutions to the performance measures of interest • Numerical results are easy to compute if a simple analytic solution is available • Useful approach when one only needs rough estimates of performance measures • Solutions to complex models may be difficult to obtain
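As a concrete illustration of the analytic approach, here is a minimal sketch that computes closed-form measures for a single-server M/M/1 queue (the queuing model discussed a few slides later); the arrival and service rates are assumed values chosen only for illustration.

```python
# Analytic M/M/1 example: closed-form performance measures from the
# arrival rate (lam) and service rate (mu). Values are illustrative.
lam = 8.0    # arrivals per second (assumed)
mu = 10.0    # service completions per second (assumed)

rho = lam / mu                  # server utilization
mean_jobs = rho / (1 - rho)     # mean number of jobs in the system
mean_resp = 1 / (mu - lam)      # mean response time

print(f"utilization = {rho:.2f}")                 # 0.80
print(f"mean jobs in system = {mean_jobs:.2f}")   # 4.00
print(f"mean response time = {mean_resp:.2f} s")  # 0.50
```

Little's law (mean number in system = arrival rate × mean response time) ties the last two numbers together: 4.0 = 8.0 × 0.5.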
Simulation Modeling • Develop a simulation program that implements the model • Run the simulation program and use the data collected to estimate the performance measurement of interest • A system can be studied at an arbitrary level of detail • It may be costly to develop and run the simulation program
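For comparison with the analytic sketch above, here is a minimal simulation sketch of the same single-server queue; exponential interarrival and service times and the specific rates are assumptions, and the estimated mean response time should land near the analytic value of 0.5 s.

```python
import random

# Minimal single-server FIFO queue simulation (illustrative sketch).
# Assumes exponential interarrival and service times (an M/M/1 queue).
random.seed(1)
lam, mu, n_jobs = 8.0, 10.0, 100_000

arrival = 0.0        # arrival time of the current job
server_free = 0.0    # time at which the server next becomes free
total_resp = 0.0     # accumulated response time over all jobs

for _ in range(n_jobs):
    arrival += random.expovariate(lam)        # next arrival instant
    start = max(arrival, server_free)         # wait if the server is busy
    finish = start + random.expovariate(mu)   # service completion instant
    total_resp += finish - arrival            # response time of this job
    server_free = finish

print(f"simulated mean response time = {total_resp / n_jobs:.3f} s")
# Should be close to the analytic value 1 / (mu - lam) = 0.5 s.
```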
Stochastic Model • The model contains random input components characterized by probability distributions, e.g., the time between arrivals to a system modeled by an exponential distribution • The output is also random, so the model yields probability distributions of the performance measures of interest
Queuing Model • The most commonly used model to analyze the performance of computer systems and networks. • Single queue: models a component of overall system, such as CPU, disk, communication channel • Network of queues: models system components and their interaction.
Commonly Used Performance Metrics • Response Time • Turnaround time • Reaction time • Stretch factor • Throughput • Operations per second • Jobs per second • Requests per second • Millions of Instructions Per Second (MIPS) • Millions of Floating Point Operations Per Second (MFLOPS) • Packets Per Second (PPS) • Bits per second (bps) • Transactions Per Second (TPS) • Efficiency • Utilization
Commonly Used Performance Metrics (Cont…) • Reliability • R(t) • Mean Time to Failure (MTTF) • Availability • Mean Time to Failure (MTTF) • Mean Time to Repair (MTTR) • Availability = MTTF/(MTTF+MTTR)
Response Time • The interval between the user's request and the system's response (timeline figure: user's request followed by the system's response)
Response Time (cont…) • Two measures of response time can be used: response time 1, from the end of the user's request to the start of the system's response, and response time 2, from the end of the user's request to the end of the system's response • Both are acceptable, but response time 2 is preferred if execution of the response is long • (Timeline figure: user starts request, user finishes request, system starts execution, system starts response, system finishes response; reaction time and the two response times are marked on this timeline.)
Response Time (cont…) • Turnaround time: the time between submission of a job and completion of its output • Used for batch job systems • Reaction time: the time between submission of a request and the beginning of its execution • Usually must be measured inside the system, since nothing is externally visible • Stretch factor: the ratio of the response time at a given load to the response time at minimal load • Most systems have higher response times as the load increases
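A tiny worked example of the stretch factor defined above; the two response-time values are assumed, not measured from any real system.

```python
# Stretch factor: response time under the load of interest divided by
# the response time at minimal load. Values are purely illustrative.
resp_min_load = 0.20   # seconds, almost idle system (assumed)
resp_at_load = 0.65    # seconds, under the load of interest (assumed)

stretch = resp_at_load / resp_min_load
print(f"stretch factor = {stretch:.2f}")   # 3.25: about 3x slower under load
```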
Throughput • Rate at which requests can be serviced by system (requests per unit time)
Efficiency • Ratio of maximum achievable throughput (ex: 9.8 Mbps) to nominal capacity (ex: 10 Mbps), i.e., an efficiency of 98% • For multiprocessor systems, the ratio of the performance of an n-processor system to that of a one-processor system (in MIPS or MFLOPS) • (Figure: efficiency plotted against the number of processors.)
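A minimal sketch of the two efficiency calculations above; the throughput and MIPS figures are assumed, and dividing the multiprocessor speedup by the number of processors is an interpretation consistent with the efficiency-versus-processors figure rather than something stated explicitly on the slide.

```python
# Efficiency as usable capacity over nominal capacity (values assumed).
max_throughput_mbps = 9.8
nominal_capacity_mbps = 10.0
print(f"link efficiency = {max_throughput_mbps / nominal_capacity_mbps:.0%}")  # 98%

# Multiprocessor efficiency: n-processor rate relative to n times the
# one-processor rate (MIPS figures assumed for illustration).
one_proc_mips = 100.0
n = 8
n_proc_mips = 620.0                      # aggregate rate with 8 processors (assumed)
speedup = n_proc_mips / one_proc_mips    # 6.2
print(f"speedup = {speedup:.1f}, efficiency = {speedup / n:.0%}")  # ~78%
```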
Utilization • Typically, the fraction of time a resource is busy serving requests • Time not being used is idle time • System managers often want to balance resources so they have the same utilization • Ex: equal load on CPUs • But this may not be possible. Ex: the CPU when I/O is the bottleneck • May not be time-based for all resources • Processors: busy time / total time • Memory: fraction of memory used / total memory
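A small sketch applying the utilization law from operational analysis (utilization = throughput × mean service time per request), which is consistent with the busy-time definition above; the numbers are assumptions.

```python
# Utilization law: U = X * S, where X is the throughput of a resource
# and S is its mean service time per request. Values are assumed.
throughput = 120.0     # requests per second completed by the disk (assumed)
service_time = 0.005   # seconds of disk time per request (assumed)

utilization = throughput * service_time
print(f"disk utilization = {utilization:.0%}")   # 60%
```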
Miscellaneous Metrics • Reliability • Probability of errors, or mean time between errors (error-free seconds) • Availability • Fraction of time the system is available to service requests (the fraction not available is downtime) • Mean Time To Failure (MTTF) is the mean uptime • MTTF is useful because availability can be high (total downtime small) even when failures are frequent, which is no good for long requests
Definition of Reliability • Recommendation E.800 of the International Telecommunication Union (ITU-T) defines reliability as follows: • “The ability of an item to perform a required function under given conditions for a given time interval.” • In this definition, an item may be a circuit board, a component on a circuit board, a module consisting of several circuit boards, a base transceiver station with several modules, a fiber-optic transport system, or a mobile switching center (MSC) and all its subtending network elements. The definition includes systems with software.
Basic Definitions of Reliability • Reliability R(t): with X the time to failure of the system and F(t) the distribution function of the system lifetime, R(t) = P(X > t) = 1 − F(t) • Mean Time To system Failure (MTTF): with f(t) the density function of the system lifetime, MTTF = E[X] = ∫₀^∞ t f(t) dt = ∫₀^∞ R(t) dt
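A minimal numeric sketch of these definitions for an exponentially distributed lifetime; the exponential form and the failure rate are assumptions used only to make the formulas concrete.

```python
import math

# Reliability and MTTF for an exponentially distributed lifetime
# (assumed for illustration): F(t) = 1 - exp(-lam * t).
lam = 1.0 / 1000.0   # failure rate: one failure per 1000 hours (assumed)

def reliability(t):
    """R(t) = P(X > t) = 1 - F(t) = exp(-lam * t)."""
    return math.exp(-lam * t)

mttf = 1.0 / lam     # for the exponential case the MTTF integral gives 1/lam
print(f"R(100 h) = {reliability(100):.3f}")   # ~0.905
print(f"MTTF = {mttf:.0f} hours")             # 1000 hours
```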
Definition of Availability • Availability is closely related to reliability, and is also defined in ITU-T Recommendation E.800 as follows: "The ability of an item to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval, assuming that the external resources, if required, are provided." • An important difference between reliability and availability is that reliability refers to failure-free operation during an interval, while availability refers to failure-free operation at a given instant of time, usually the time when a device or system is first accessed to provide a required function or service.
Availability (Cont…) • Instantaneous (point) availability A(t): A(t) = P(system working at time t) • g(t): density function of the system repair time, with distribution G(t) • Let H(t) be the convolution of F and G, i.e., the distribution of one failure-plus-repair cycle • Then the instantaneous availability and the reliability are related by A(t) = R(t) + ∫₀^t A(t − x) dH(x)
Availability (Cont…) • The system is working at time t if either it has never failed in (0, t), with probability R(t), or it first failed and completed its first repair at some time x < t and is up at the end of the interval (x, t) • Summing over the possible repair-completion times x gives equation (1): A(t) = R(t) + ∫₀^t A(t − x) dH(x) • (Timeline figure: the first repair is completed at some x between 0 and t, within an interval (x, x + dx).)
Availability (Cont…) • MTTR: Mean Time to Repair • With Y the repair period of the system and g(t) its density, MTTR = E[Y] = ∫₀^∞ t g(t) dt • Availability and reliability are related but different!
Availability (Cont…) • We can show from equation (1) that the steady-state availability is lim t→∞ A(t) = MTTF / (MTTF + MTTR) • Also: A(t) ≥ R(t) for all t, since a repaired system counts as available but not as failure-free
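A minimal numeric sketch that discretizes equation (1) and checks the steady-state limit; exponential failure and repair times and the particular rates are assumptions made for illustration.

```python
import math

# Numerically iterate the renewal equation (1):
#   A(t) = R(t) + integral_0^t A(t - x) dH(x),
# assuming exponential failure (rate lam) and repair (rate mu) times.
lam, mu = 1.0 / 1000.0, 1.0 / 2.0   # MTTF = 1000 h, MTTR = 2 h (assumed)
dt, horizon = 0.05, 50.0            # step size and horizon in hours
n = int(horizon / dt)

def R(t):
    return math.exp(-lam * t)       # reliability of the exponential lifetime

def h(x):
    # Density of one failure-plus-repair cycle X + Y: the convolution of
    # the two exponential densities (valid when lam != mu).
    return lam * mu / (mu - lam) * (math.exp(-lam * x) - math.exp(-mu * x))

A = [1.0] * (n + 1)                 # A(0) = 1: the system starts in the up state
for i in range(1, n + 1):
    conv = sum(A[i - j] * h(j * dt) for j in range(1, i + 1)) * dt
    A[i] = R(i * dt) + conv

mttf, mttr = 1.0 / lam, 1.0 / mu
print(f"A({horizon:.0f} h) ~ {A[n]:.5f}")                    # approaches the limit
print(f"MTTF / (MTTF + MTTR) = {mttf / (mttf + mttr):.5f}")  # ~0.99800
```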
Three Rules of Validation • Do not trust the results of a simulation model until they have been validated by analytical modeling or measurements. • Do not trust the results of an analytical model until they have been validated by a simulation model or measurements. • Do not trust the results of a measurement until they have been validated by simulation or analytical modeling.