Explore ISA, memory models, instruction formats, and registers for improved computing performance. Learn about CPU performance metrics like Clock Cycle and CPI for effective evaluation.
CSCI206 - Computer Organization & Programming: Performance
Revised by Alexander Fuchsberger and Xiannong Meng in spring 2019, based on the notes by other instructors. zyBook: 4.1, 4.2, 4.3, 4.5
ISA -- a few more words
ISA (Instruction Set Architecture) is a set of protocols (conventions) that defines how a computing machine appears to a machine language programmer or compiler. The ISA has three components:
• Memory model (how to compute memory addresses …)
• Instruction formats, types, and modes
• Registers on which the instructions can operate
We'll discuss the ISA in greater detail throughout the semester.
At the end of the day, what matters is PERFORMANCE. (and cost, too, obviously!) (and power consumption!)
Defining performance
Which airplane has the best performance?
A) $300M, low operating costs, 375 passengers, 610 mph.
B) $350M, medium operating costs, 470 passengers, 610 mph.
C) $34M, high operating costs, 146 passengers, 544 mph.
D) $??M, very high operating costs, 132 passengers, 1,350 mph.
Measuring computing performance
The common metric used to measure computing performance is throughput, i.e., how much work can be completed in a certain amount of time (e.g., one second). Examples: bytes/sec, FLOPS, FPS.
Performance Defined
$\text{Performance} = \dfrac{1}{\text{Execution time}}$
1 / time is the throughput of the system: if the instruction execution time is 0.5 seconds, the throughput is 2 instructions per second.
Comparing performance
X is n times faster than Y:
$n = \dfrac{\text{Performance}_X}{\text{Performance}_Y} = \dfrac{\text{Execution time}_Y}{\text{Execution time}_X}$
If X is in fact faster than Y, n > 1. If n < 1, X is slower than Y. By convention we say that the faster machine is n times faster than the slower machine, so for n > 1 the execution time in the numerator has to be the larger number, i.e., that of the SLOWER computer (Y).
Practice A task on blue takes 5 seconds. The same task on red takes 7.5 seconds. How much faster is blue than red?
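The practice question can be checked with a few lines of Python. This is only a minimal sketch of the ratio "execution time of the slower machine divided by execution time of the faster machine"; the function name `speedup` and the variable names are illustrative assumptions, not part of the course material.

```python
def speedup(time_slow: float, time_fast: float) -> float:
    """Return n such that the faster machine is n times faster than the slower one.

    Hypothetical helper: n = (execution time of slower) / (execution time of faster).
    """
    if time_slow <= 0 or time_fast <= 0:
        raise ValueError("execution times must be positive")
    return time_slow / time_fast


blue_seconds = 5.0   # task time on blue (from the practice slide)
red_seconds = 7.5    # task time on red (from the practice slide)

# Blue is the faster machine, so red's time goes in the numerator.
n = speedup(time_slow=red_seconds, time_fast=blue_seconds)
print(f"blue is {n:.1f} times faster than red")  # prints "blue is 1.5 times faster than red"
```

Swapping the two arguments would give n < 1, which matches the earlier remark that n < 1 means X is slower than Y.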
Refining CPU performance Execution time is a good measure. It is broadly affected by the CPU architecture (ISA), operating system/ABI, and clock rate. How do we measure the performance of an ISA?
Clock Cycle The system clock synchronizes when the logic circuits change state within the CPU
Metric to quantify performance of an ISA implementation
Cycles Per Instruction (CPI): the average number of clock cycles per instruction for a program or program fragment. Example: if a certain CPU takes 2,000 clock cycles to execute 1,000 machine language instructions, the CPI is 2.0.
CPI in more detail Early processors had CPI = 1 for all instructions. However, some operations are inherently more difficult (e.g., integer add vs. floating point divide). In modern processors the CPI will depend on the operation being performed.
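Iron Law of CPU Performance
Decompose performance into 3 key parts:
$\dfrac{\text{Time}}{\text{Program}} = \dfrac{\text{Instructions}}{\text{Program}} \times \dfrac{\text{Cycles}}{\text{Instruction}} \times \dfrac{\text{Time}}{\text{Cycle}}$
• Cycles / Instruction (CPI): ISA implementation performance
• Time / Cycle (clock cycle time): chip physical performance
• Instructions / Program (instruction count): program (workload) specific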
Performance question What is the execution time for a program with 5 B instructions and a clock rate of 2.5 GHz?
Performance question What is the execution time for a program with 5 B instructions and a clock rate of 2.5 GHz and CPI = 1?
Performance question What is the execution time for a program with 5 B instructions and a clock rate of 2.5 GHz and CPI = 2?
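The questions above can be worked through with the relation execution time = instruction count × CPI / clock rate. The first version of the question omits the CPI, so it cannot be answered without assuming one; the sketch below evaluates the two versions that do state it. The function name `execution_time` is an illustrative assumption.

```python
def execution_time(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """Execution time in seconds: instructions * cycles/instruction / cycles/second."""
    if clock_rate_hz <= 0:
        raise ValueError("clock rate must be positive")
    return instruction_count * cpi / clock_rate_hz


instructions = 5e9      # 5 B instructions (from the slides)
clock_rate = 2.5e9      # 2.5 GHz (from the slides)

time_cpi_1 = execution_time(instructions, cpi=1, clock_rate_hz=clock_rate)
time_cpi_2 = execution_time(instructions, cpi=2, clock_rate_hz=clock_rate)

print(f"CPI = 1: {time_cpi_1:.1f} s")  # prints "CPI = 1: 2.0 s"
print(f"CPI = 2: {time_cpi_2:.1f} s")  # prints "CPI = 2: 4.0 s"
```

Doubling the CPI doubles the execution time even though the instruction count and clock rate are unchanged.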
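Average CPI
On a given program, the CPI varies across classes of instruction. We can compute the average CPI from the specific instruction mix:
$\text{CPI}_{\text{avg}} = \dfrac{\sum_i \text{CPI}_i \times \text{IC}_i}{\sum_i \text{IC}_i}$
where $\text{CPI}_i$ is the CPI of instruction class $i$ and $\text{IC}_i$ is the number of instructions of that class in the program.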
Example
A CPU has 3 instruction classes (A, B, C), and the compiler chooses which instructions to use in a program. With the default compiler, program P1 is generated using 5 instructions. An optimizing compiler avoids instructions with high CPI to improve performance, but it needs 6 instructions for P2. What is the average CPI in each case?
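The per-class CPI values and the per-class instruction counts for P1 and P2 are not preserved in this transcript, so the numbers in the sketch below are purely illustrative assumptions chosen to match the stated totals (5 and 6 instructions). It shows how an average CPI could be computed as total cycles divided by total instructions; `average_cpi` is a hypothetical helper.

```python
def average_cpi(cpi_by_class: dict, count_by_class: dict) -> float:
    """Weighted average CPI: total clock cycles divided by total instructions."""
    total_instructions = sum(count_by_class.values())
    if total_instructions == 0:
        raise ValueError("program has no instructions")
    total_cycles = sum(cpi_by_class[cls] * n for cls, n in count_by_class.items())
    return total_cycles / total_instructions


# ASSUMED values for illustration only; not taken from the slides.
cpi_by_class = {"A": 1, "B": 2, "C": 3}
p1_counts = {"A": 2, "B": 1, "C": 2}   # 5 instructions in total
p2_counts = {"A": 4, "B": 1, "C": 1}   # 6 instructions in total

p1_cpi = average_cpi(cpi_by_class, p1_counts)
p2_cpi = average_cpi(cpi_by_class, p2_counts)

print(f"P1: {sum(p1_counts.values())} instructions, average CPI {p1_cpi:.2f}")
print(f"P2: {sum(p2_counts.values())} instructions, average CPI {p2_cpi:.2f}")
```

Under these assumed numbers, P2 executes more instructions but needs fewer total cycles (9 versus 10), which is the kind of trade-off the example describes.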
Evaluating performance
Instruction mix has a big impact on performance. To compare various CPUs (fairly!) we use a variety of programs that consist of different instruction mixes (representing different real workloads). These programs are called benchmarks.
Pitfall “Expecting the improvement of one aspect of a computer to increase overall performance by an amount proportional to the size of the improvement.”
Example A cyclist improves their aerodynamic performance by a factor of 3. This yields an overall performance improvement of only about 1.05x (not 3x!). Why? Aerodynamics contributes only a small part of the overall performance (speed). [Figure: traditional setup vs. 3x more aerodynamic setup]
Computing Example Prof. Xavier got a new quad-core, 2.6 GHz computer to replace his 5-year-old single-core, 2.4 GHz computer. When he runs applications on the new computer (such as a word processor, photo browser, media player, etc.), he notices almost no improvement, although he expected a 4x speedup.
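Amdahl’s Law
$T_{\text{improved}} = \dfrac{T_{\text{affected}}}{\text{improvement factor}} + T_{\text{unaffected}}$
Example: a certain application takes 10 s to load, of which 10% can be executed in parallel. The new execution time on a quad-core computer is 1 s / 4 + 9 s = 9.25 s.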
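The quad-core example can be reproduced with a short sketch. It assumes the parallel portion scales ideally across all 4 cores (no overhead), which is a simplification; `amdahl_time` is a hypothetical helper name.

```python
def amdahl_time(total_time: float, affected_fraction: float, improvement_factor: float) -> float:
    """New execution time when only a fraction of the work is sped up.

    Assumes the affected part scales ideally by improvement_factor.
    """
    if not 0.0 <= affected_fraction <= 1.0:
        raise ValueError("affected_fraction must be between 0 and 1")
    affected = total_time * affected_fraction
    unaffected = total_time - affected
    return affected / improvement_factor + unaffected


old_time = 10.0  # seconds to load (from the slide)
new_time = amdahl_time(old_time, affected_fraction=0.10, improvement_factor=4)
overall_speedup = old_time / new_time

print(f"new time: {new_time:.2f} s")              # prints "new time: 9.25 s"
print(f"overall speedup: {overall_speedup:.3f}x")
```

Four cores yield an overall speedup of well under 1.1x here, because 90% of the load time is unaffected by the extra cores.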
Pitfall
Using instructions per second or clock rate as a measure of performance. In modern CPUs, CPI varies widely, so instruction mix and CPI must be taken into account!