EGRE 426 Computer Organization and Design Chapter 4
Performance
• Performance is important!
• Often determines the viability of the hardware/software system.
• Consider running Windows XP on a PC with the performance of the original IBM PC (4.8 MHz clock).
• Determining performance can be difficult.
• Response time (latency)
  - How long does it take for my job to run?
  - How long does it take to execute a job?
  - How long must I wait for the database query?
• Throughput
  - How many jobs can the machine run at once?
  - What is the average execution rate?
  - How much work is getting done?
Determining performance can be difficult
• Instruction execution times.
• When a salesman quotes a MIPS (millions of instructions per second) value, he is guaranteeing that the machine will not run faster than that value.
• Benchmark programs are useful but can produce misleading results.
• Benchmarks may depend on small sections of repetitive code.
• There have been many instances of compilers being optimized to do well on popular benchmarks.
• Real programs provide the best indication of performance.
• They should be chosen based on user needs.
• Scientific applications have different requirements than large database applications.
SPEC95 Benchmarks
• The System Performance Evaluation Cooperative (SPEC) group was formed in 1988 by representatives of many computer companies.
• Most popular and comprehensive set of CPU benchmarks.
• 8 integer and 10 floating-point programs (see Fig 2.6 page 72).
Amdahl's Law
Execution Time After Improvement = Execution Time Unaffected + (Execution Time Affected / Amount of Improvement)
• Example: "Suppose a program runs in 100 seconds on a machine, with multiply responsible for 80 seconds of this time. How much do we have to improve the speed of multiplication if we want the program to run 4 times faster?" How about making it 5 times faster?
• Principle: Make the common case fast.
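The multiply example can be worked through with a short Python sketch. It only rearranges the Amdahl's Law formula above to solve for the amount of improvement; the function names are illustrative and do not come from the slides.

```python
def time_after_improvement(unaffected, affected, improvement):
    """Amdahl's Law: execution time once the affected part is sped up."""
    return unaffected + affected / improvement


def required_improvement(total, affected, target_speedup):
    """Improvement factor needed on the affected part to reach the target
    overall speedup, or None if no finite factor can reach it."""
    unaffected = total - affected
    target_time = total / target_speedup
    # Time budget left for the affected part after the unaffected part runs.
    remaining = target_time - unaffected
    if remaining <= 0:
        return None
    return affected / remaining


# 100 s program, 80 s spent in multiply (numbers from the example above).
print(required_improvement(100, 80, 4))  # 16.0
print(required_improvement(100, 80, 5))  # None
```

Running 4 times faster means a 25 second target, so multiply must shrink from 80 to 5 seconds, a factor of 16. Running 5 times faster means a 20 second target, which the 20 seconds of unaffected time already uses up, so no improvement to multiply alone is enough.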
Execution Time
• Elapsed time
  - counts everything (disk and memory accesses, I/O, etc.)
  - a useful number, but often not good for comparison purposes
• CPU time
  - doesn't count I/O or time spent running other programs
  - can be broken up into system time and user time
• Our focus: user CPU time
  - time spent executing the lines of code that are "in" our program
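One way to see the difference between elapsed time and user/system CPU time is to measure all three around the same call. The sketch below uses Python's standard `time.perf_counter()` and `os.times()`; the helper name `measure` is an assumption made for illustration, and the resolution of the CPU-time counters depends on the operating system.

```python
import os
import time


def measure(fn):
    """Return (elapsed, user, system) seconds for a single call of fn."""
    cpu_start = os.times()
    wall_start = time.perf_counter()
    fn()
    wall_end = time.perf_counter()
    cpu_end = os.times()
    return (
        wall_end - wall_start,             # elapsed time: counts everything
        cpu_end.user - cpu_start.user,     # user CPU time: our own code
        cpu_end.system - cpu_start.system  # system CPU time: OS work for us
    )


# Sleeping adds to elapsed time but uses almost no CPU time.
elapsed, user, system = measure(lambda: time.sleep(0.05))
print(f"elapsed={elapsed:.3f}s user={user:.3f}s system={system:.3f}s")
```

A call that mostly waits, as here, shows a large gap between elapsed and CPU time, which is why elapsed time alone is often a poor basis for comparing processors.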
For a given instruction set architecture, increases in CPU performance can come from three sources
• Increase in clock rate or reduction in clock cycles per instruction.
• Better compilers.
• Improvements in processor architecture.
Terms
• Cycle time or clock cycle time.
  - If clock frequency f = 400 MHz, then cycle time T = 1/f = 2.5 ns.
• CPI: cycles per instruction or clocks per instruction.
  - Different instructions may require different numbers of clock cycles to execute.
• MIPS: millions of instructions per second.
  - Varies depending on the instruction stream.
  - Peak MIPS: best-case instruction stream.
  - Native MIPS: typical instruction stream.
  - MIPS = (instruction count) / (execution time × 10^6) = average number of instructions executed in one microsecond.
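These definitions can be checked numerically with a minimal Python sketch. The 400 MHz clock comes from the slide; the 5 second execution time and the average CPI of 2 are hypothetical values chosen only to exercise the formulas, and the function names are illustrative.

```python
def cycle_time_ns(clock_hz):
    """Clock cycle time T = 1/f, in nanoseconds."""
    return 1e9 / clock_hz


def mips(instruction_count, execution_time_s):
    """MIPS = instruction count / (execution time x 10^6)."""
    return instruction_count / (execution_time_s * 1e6)


def mips_from_cpi(clock_hz, average_cpi):
    """Equivalent form: MIPS = clock rate / (average CPI x 10^6)."""
    return clock_hz / (average_cpi * 1e6)


print(cycle_time_ns(400e6))       # 2.5 (ns), matching the slide
print(mips(1000e6, 5.0))          # 200.0, assuming a hypothetical 5 s run
print(mips_from_cpi(400e6, 2.0))  # 200.0, assuming a hypothetical CPI of 2
```

The two MIPS forms agree because execution time equals instruction count × CPI / clock rate, which is also why the MIPS rating changes with the instruction stream: a different mix of instructions gives a different average CPI.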
An example
• Assume we only need to consider CPU time.
• Let clock rate = 400 MHz = 400 million cycles/sec.
• Three types of instructions: A, B, and C.
• Assume we run a program that executes 1000 million instructions.