
Performance and Quantitative Principles

Vincent H. Berk, September 26th, 2008. Reading for today: Chapter 1.1 - 1.4, Amdahl article. Reading for Monday: Chapter 1.5 – 1.11, Mazor article. Homework for Wednesday: 1.1, 1.3, 1.6, 1.7, 1.13.


Presentation Transcript


  1. ENGS 116 Lecture 2: Performance and Quantitative Principles
     Vincent H. Berk, September 26th, 2008
     • Reading for today: Chapter 1.1 - 1.4, Amdahl article
     • Reading for Monday: Chapter 1.5 – 1.11, Mazor article
     • Homework for Wednesday: 1.1, 1.3, 1.6, 1.7, 1.13

  2. Review
     • Task of Computer Designers
       • Determine which attributes are important for a new machine
       • Design a machine to maximize performance without violating cost/power/functionality constraints
     • 3 Components of “Architecture”
       • Instruction set design
       • Organization
       • Hardware

  3. Benchmarking Games
     • Different configurations used to run the same workload on two systems.
     • Compiler customized to optimize the workload.
     • Workload arbitrarily picked to skew results.
     • Test specification written to be biased toward one machine.

  4. Design benchmarks for:
     • Industrial and design
     • Consumer Electronics
     • Networking, routers
     • Office applications
     • Telecommunications
     • Weapon systems

  5. Execution time
     • Weighted arithmetic mean: the sum, over all programs run, of each program's execution time multiplied by its relative frequency
     • Normalized execution time: take a reference machine, set its execution time to 1, then express the execution times of the other machines relative to it
     • Geometric mean of normalized execution times: the choice of reference computer becomes irrelevant, so ratios between machines can be compared whichever machine is used as the reference
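The three summaries above can be sketched in a few lines of Python. The execution times, the reference machine, and the equal weights below are hypothetical values chosen for illustration; they do not come from the lecture.

```python
import math

def weighted_arithmetic_mean(times, weights):
    """Sum of execution times, each multiplied by its relative frequency."""
    return sum(t * w for t, w in zip(times, weights))

def normalized(times, reference):
    """Per-program execution times relative to a reference machine."""
    return [t / r for t, r in zip(times, reference)]

def geometric_mean(values):
    """n-th root of the product, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical execution times in seconds for two programs (illustration only).
machine_a = [1.0, 1000.0]
machine_b = [10.0, 100.0]
reference = [2.0, 50.0]
weights = [0.5, 0.5]  # assumed: both programs run equally often

wam_a = weighted_arithmetic_mean(machine_a, weights)
wam_b = weighted_arithmetic_mean(machine_b, weights)

# Ratio of geometric means, each normalized to an arbitrary reference machine...
ratio_via_reference = (geometric_mean(normalized(machine_b, reference))
                       / geometric_mean(normalized(machine_a, reference)))
# ...agrees with the geometric mean of B normalized directly to A.
ratio_direct = geometric_mean(normalized(machine_b, machine_a))

print(wam_a, wam_b)
print(ratio_via_reference, ratio_direct)
```

With these made-up numbers the weighted arithmetic means differ widely (500.5 s versus 55.0 s), while the geometric-mean ratio comes out at approximately 1 whichever reference is used. This is why the choice of summary statistic matters when comparing machines.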

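  6. Amdahl’s Law

     $$\text{Execution time after improvement} = \frac{\text{Execution time affected by improvement}}{\text{Amount of improvement}} + \text{Execution time unaffected}$$

     Make the common case fast.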

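  7. Amdahl’s Law: speedup due to enhancement E

     $$\text{Speedup}(E) = \frac{\text{ExTime w/o } E}{\text{ExTime w/ } E} = \frac{\text{Performance w/ } E}{\text{Performance w/o } E}$$

     Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected:

     $$\text{ExTime}(E) = \text{ExTime w/o } E \times \left[(1 - F) + \frac{F}{S}\right]$$

     $$\text{Speedup}(E) = \frac{1}{(1 - F) + \frac{F}{S}}$$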

  8. Amdahl’s Law

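  9. Amdahl’s Law example: floating point instructions are improved to run 2X faster, but only 10% of actual instructions are FP.

     With F = 0.1 and S = 2:

     $$\text{ExTime}_{new} = \text{ExTime}_{old} \times \left(0.9 + \frac{0.1}{2}\right) = 0.95 \times \text{ExTime}_{old}$$

     $$\text{Speedup} = \frac{1}{0.95} \approx 1.053$$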
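Amdahl's Law is easy to experiment with in code. The sketch below assumes the usual form of the law, speedup = 1 / ((1 - F) + F/S), and applies it to the floating point example; the function name `amdahl_speedup` is made up for this illustration.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when a fraction F of execution time is sped up by a factor S."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must lie in [0, 1]")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

# The FP example: 10% of the work runs 2X faster.
fp_speedup = amdahl_speedup(0.10, 2.0)

# Even an (almost) infinitely fast FP unit cannot beat 1 / (1 - F).
fp_upper_bound = amdahl_speedup(0.10, 1e12)

print(round(fp_speedup, 3))
print(round(fp_upper_bound, 3))
```

The second call shows why the common case matters: with only 10% of the work affected, no FP improvement can push the overall speedup past 1/0.9, about 1.11.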

  10. Corollary: Make The Common Case Fast
      • All instructions require an instruction fetch, only a fraction require a data fetch/store.
        • Optimize instruction access over data access.
      • Programs exhibit locality.
        • Spatial Locality
        • Temporal Locality
      • Access to small memories is faster.
        • Provide a storage hierarchy such that the most frequent accesses are to the smallest (closest) memories.
        • Registers → Cache → Memory → Disk/Tape

  11. Metrics of Performance
      • Levels of the system, from top to bottom: Application; Programming Language; Compiler; ISA; Datapath and Control; Function Units; Transistors, Wires, Pins
      • Metrics, from the application level down to the hardware level:
        • Answers per month
        • Operations per second
        • Millions of instructions per second: MIPS
        • Millions of FP operations per second: MFLOPS
        • Megabytes per second
        • Cycles per second (clock rate)

  12. Marketing Metrics
      • Machines with different instruction sets?
      • Programs with different instruction mixes?
        • Dynamic frequency of instructions
      • Uncorrelated with performance
      • Machine dependent
      • Often not where time is spent
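A small numeric sketch shows why a rate such as MIPS can mislead across instruction sets. It assumes the standard definition MIPS = instruction count / (execution time × 10^6), which the transcript itself does not spell out. The two machines and their numbers are hypothetical.

```python
def mips(instruction_count, exec_time_s):
    """Millions of instructions per second (standard definition, assumed here)."""
    return instruction_count / (exec_time_s * 1e6)

# Hypothetical machines running the same program with different instruction sets:
# A needs many simple instructions, B needs fewer, more complex ones.
machine_a = {"instructions": 10e9, "seconds": 10.0}
machine_b = {"instructions": 4e9, "seconds": 8.0}

mips_a = mips(machine_a["instructions"], machine_a["seconds"])
mips_b = mips(machine_b["instructions"], machine_b["seconds"])

# Which machine finishes first, and which has the higher MIPS rating?
faster = "A" if machine_a["seconds"] < machine_b["seconds"] else "B"
higher_mips = "A" if mips_a > mips_b else "B"

print(mips_a, mips_b, faster, higher_mips)
```

In this made-up case machine A has twice the MIPS rating of machine B, yet B finishes the program sooner. Execution time is the measure that reflects what the user sees.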

  13. Aspects of CPU Performance

                        Instr. Count   CPI    Clock Rate
      Program
      Compiler
      Instruction Set
      Organization
      Technology

  14. Aspects of CPU Performance

                        Instr. Count   CPI    Clock Rate
      Program           X
      Compiler          X              (X)
      Instruction Set   X              X
      Organization                     X      X
      Technology                              X

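  15. Cycles Per Instruction

      Average cycles per instruction:

      $$\text{CPI} = \frac{\text{CPU Time} \times \text{Clock Rate}}{\text{Instruction Count}} = \frac{\text{Cycles}}{\text{Instruction Count}}$$

      With $I_i$ the number of executed instructions of class $i$ and $\text{CPI}_i$ the cycles each one takes:

      $$\text{CPU time} = \text{Cycle Time} \times \sum_{i=1}^{n} \text{CPI}_i \times I_i$$

      “Instruction Frequency”:

      $$\text{CPI} = \sum_{i=1}^{n} \text{CPI}_i \times F_i, \qquad F_i = \frac{I_i}{\text{Instruction Count}}$$

      Invest resources where time is spent!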
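The relations on this slide can be checked numerically. The sketch below assumes per-class instruction counts and cycle costs for a made-up program on a made-up 1 GHz machine; none of these numbers come from the lecture.

```python
def total_cycles(classes):
    """Sum of CPI_i * I_i over all instruction classes."""
    return sum(count * cycles for count, cycles in classes)

def instruction_count(classes):
    return sum(count for count, _ in classes)

def average_cpi(classes):
    """Cycles / Instruction Count."""
    return total_cycles(classes) / instruction_count(classes)

def cpu_time_seconds(classes, clock_rate_hz):
    """Cycle Time * sum(CPI_i * I_i), with Cycle Time = 1 / Clock Rate."""
    return total_cycles(classes) / clock_rate_hz

# Hypothetical program: (instruction count, cycles per instruction) per class.
classes = [(6_000_000, 1), (4_000_000, 3)]
clock_rate_hz = 1e9  # assumed 1 GHz clock

cpi = average_cpi(classes)
cpu_time = cpu_time_seconds(classes, clock_rate_hz)

# CPI recovered from CPU time, as in the first formula on the slide.
cpi_from_time = cpu_time * clock_rate_hz / instruction_count(classes)

print(cpi, cpu_time, cpi_from_time)
```

Both routes to CPI agree, which is the point of the identity: measuring any two of CPU time, clock rate, and instruction count fixes the third quantity.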

  16. Example: Calculating CPI

      Base Machine (Reg / Reg), typical mix:

      Op       Freq   Cycles   CPI(i)   (% Time)
      ALU      50%    1        .5       (33%)
      Load     20%    2        .4       (27%)
      Store    10%    2        .2       (13%)
      Branch   20%    2        .4       (27%)
      Total                    1.5
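The table's arithmetic can be reproduced directly from the instruction mix. This is a minimal sketch using the frequencies and cycle counts of the base machine; the variable names are illustrative only.

```python
# Base machine mix: operation -> (frequency, cycles per instruction)
base_mix = {
    "ALU": (0.50, 1),
    "Load": (0.20, 2),
    "Store": (0.10, 2),
    "Branch": (0.20, 2),
}

# CPI(i): each class's contribution to the average CPI.
cpi_contribution = {op: freq * cycles for op, (freq, cycles) in base_mix.items()}

# Average CPI is the sum of the contributions.
cpi = sum(cpi_contribution.values())

# Share of execution time spent in each class, as a rounded percentage.
percent_time = {op: round(100 * c / cpi) for op, c in cpi_contribution.items()}

print(round(cpi, 2))
print(percent_time)
```

The percentages show where the time goes: ALU operations are half of all instructions but only a third of the cycles, while loads and branches together account for more than half.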

  17. Example

      Want to add register / memory operations:
      • One source operand in memory
      • One source operand in register
      • Cycle count of 2

      Side effect: Branch cycle count will increase to 3.

      What fraction of the loads must be eliminated for this to pay off?

      Base Machine (Reg / Reg):

      Op       Freq   Cycles
      ALU      50%    1
      Load     20%    2
      Store    10%    2
      Branch   20%    2

  18. Example Solution

      Exec Time = Instruction Count × CPI × Clock

                 Old                      New
      Op         Freq   Cycles   CPI      Freq   Cycles   CPI
      ALU        .50    1        .5
      Load       .20    2        .4
      Store      .10    2        .2
      Branch     .20    2        .4
      Reg/Mem
      Total      1.00            1.5

  19. Example Solution

      Exec Time = Instruction Count × CPI × Clock

                 Old                      New
      Op         Freq   Cycles   CPI      Freq     Cycles   CPI
      ALU        .50    1        .5       .5 - X   1        .5 - X
      Load       .20    2        .4       .2 - X   2        .4 - 2X
      Store      .10    2        .2       .1       2        .2
      Branch     .20    2        .4       .2       3        .6
      Reg/Mem                             X        2        2X
      Total      1.00            1.5      1 - X             (1.7 - X) / (1 - X)

      CPI_New must be normalized to the new instruction frequency.

  20. Example Solution

      Exec Time = Instruction Count × CPI × Clock

                 Old                      New
      Op         Freq   Cycles   CPI      Freq     Cycles   CPI
      ALU        .50    1        .5       .5 - X   1        .5 - X
      Load       .20    2        .4       .2 - X   2        .4 - 2X
      Store      .10    2        .2       .1       2        .2
      Branch     .20    2        .4       .2       3        .6
      Reg/Mem                             X        2        2X
      Total      1.00            1.5      1 - X             (1.7 - X) / (1 - X)

      $$\text{Instr Cnt}_{Old} \times \text{CPI}_{Old} \times \text{Clock}_{Old} = \text{Instr Cnt}_{New} \times \text{CPI}_{New} \times \text{Clock}_{New}$$

  21. Example Solution

      Exec Time = Instruction Count × CPI × Clock

                 Old                      New
      Op         Freq   Cycles   CPI      Freq     Cycles   CPI
      ALU        .50    1        .5       .5 - X   1        .5 - X
      Load       .20    2        .4       .2 - X   2        .4 - 2X
      Store      .10    2        .2       .1       2        .2
      Branch     .20    2        .4       .2       3        .6
      Reg/Mem                             X        2        2X
      Total      1.00            1.5      1 - X             (1.7 - X) / (1 - X)

      $$\text{Instr Cnt}_{Old} \times \text{CPI}_{Old} \times \text{Clock}_{Old} = \text{Instr Cnt}_{New} \times \text{CPI}_{New} \times \text{Clock}_{New}$$

      $$1.00 \times 1.5 = (1 - X) \times \frac{1.7 - X}{1 - X}$$

      $$1.5 = 1.7 - X$$

      $$X = 0.2$$

      ALL loads must be eliminated for this to be a win!
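The break-even point can be double-checked numerically. The sketch below assumes, as the algebra above implicitly does, that the clock is unchanged by the enhancement. It normalizes the old instruction count to 1, so instruction count × CPI is simply cycles per original instruction. The function names are made up for this illustration.

```python
OLD_CPI = 1.5  # base machine CPI from the typical mix

def new_cycles_per_old_instruction(x):
    """Cycles on the enhanced machine per original instruction.

    x is the fraction of original instructions that were loads and are now
    folded, together with one ALU op each, into a 2-cycle reg/mem operation.
    """
    return ((0.5 - x) * 1    # ALU ops that remain
            + (0.2 - x) * 2  # loads that remain
            + 0.1 * 2        # stores, unchanged
            + 0.2 * 3        # branches, now 3 cycles
            + x * 2)         # new reg/mem ops

def relative_exec_time(x):
    """New execution time divided by old execution time (same clock assumed)."""
    return new_cycles_per_old_instruction(x) / OLD_CPI

# The cycle total is 1.7 - x, so break-even is where it equals OLD_CPI.
break_even_x = new_cycles_per_old_instruction(0.0) - OLD_CPI

for x in (0.0, 0.1, 0.2):
    print(x, round(relative_exec_time(x), 3))
print(round(break_even_x, 3))
```

Because loads are only 20% of the original mix, x cannot exceed 0.2. The enhanced machine is slower for every smaller x and merely ties at x = 0.2, which matches the slide's conclusion.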
