Review of Technology Trends and Cost/Performance

Presentation Transcript


  1. Review of Technology Trends and Cost/Performance Ali Azarpeyvand Advanced Computer Architecture

  2. Outline
     • Cost / Price
     • IC cost
     • Performance?
     • Amdahl's law
     • CPI
     • Benchmarks

  3. Cost

  4. Integrated Circuits Costs

     $$\text{Die cost} = \frac{\text{Wafer cost}}{\text{Dies per wafer} \times \text{Die yield}}$$

     $$\text{IC cost} = \frac{\text{Die cost} + \text{Testing cost} + \text{Packaging cost}}{\text{Final test yield}}$$

     Die cost goes roughly with $(\text{die area})^5$.

  5. Wafer

  6. Real World Examples

     Chip          Metal    Line width   Wafer    Defects   Area     Dies/   Yield   Die
                   layers   (µm)         cost     /cm²      (mm²)    wafer           cost
     386DX         2        0.90         $900     1.0       43       360     71%     $4
     486DX2        3        0.80         $1200    1.0       81       181     54%     $12
     PowerPC 601   4        0.80         $1700    1.3       121      115     28%     $53
     HP PA 7100    3        0.80         $1300    1.0       196      66      27%     $73
     DEC Alpha     3        0.70         $1500    1.2       234      53      19%     $149
     SuperSPARC    3        0.70         $1700    1.6       256      48      13%     $272
     Pentium       3        0.80         $1500    1.5       296      40      9%      $417

     • From "Estimating IC Manufacturing Costs," by Linley Gwennap, Microprocessor Report, August 2, 1993, p. 15
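
The die-cost relation from the cost slide (wafer cost divided by dies per wafer times die yield) can be checked against the table above. The following is a minimal Python sketch, not part of the original deck: the function names `die_cost`, `ic_cost`, and `check_table` are illustrative, and the testing cost, packaging cost, and final test yield in the last line are hypothetical placeholders, since the slides give no such figures.

```python
def die_cost(wafer_cost, dies_per_wafer, die_yield):
    """Die cost = wafer cost / (dies per wafer * die yield)."""
    return wafer_cost / (dies_per_wafer * die_yield)


def ic_cost(die, testing, packaging, final_test_yield):
    """IC cost = (die + testing + packaging cost) / final test yield."""
    return (die + testing + packaging) / final_test_yield


# (chip, wafer cost in $, dies per wafer, die yield, die cost listed in the table)
CHIPS = [
    ("386DX", 900, 360, 0.71, 4),
    ("486DX2", 1200, 181, 0.54, 12),
    ("PowerPC 601", 1700, 115, 0.28, 53),
    ("HP PA 7100", 1300, 66, 0.27, 73),
    ("DEC Alpha", 1500, 53, 0.19, 149),
    ("SuperSPARC", 1700, 48, 0.13, 272),
    ("Pentium", 1500, 40, 0.09, 417),
]


def check_table():
    """Recompute the die cost of every chip from the other table columns."""
    return [(name, die_cost(wafer, dies, yld), listed)
            for name, wafer, dies, yld, listed in CHIPS]


if __name__ == "__main__":
    for name, computed, listed in check_table():
        print(f"{name:12s} computed ${computed:7.2f}   listed ${listed}")
    # Hypothetical testing cost ($1), packaging cost ($2), and final test
    # yield (95%), chosen only to show the IC cost formula in use.
    print(f"IC cost (386DX, assumed extras): "
          f"${ic_cost(die_cost(900, 360, 0.71), 1.0, 2.0, 0.95):.2f}")
```

Rounded to the nearest dollar, each recomputed die cost agrees with the value listed in the table.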

  7. DRAM Prices (close to Costs)

  8. Design for What?
     • For performance
       • Supercomputer
     • For cost
       • Cellular phones
     • For cost / performance
       • Workstations
     • Now back to performance

  9. The Bottom Line: Performance (and Cost)
     • "X is n times faster than Y" means

       $$n = \frac{\text{ExTime}(Y)}{\text{ExTime}(X)} = \frac{\text{Performance}(X)}{\text{Performance}(Y)}$$

     • Speed of Concorde vs. Boeing 747
     • Throughput of Boeing 747 vs. Concorde
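
The speed-versus-throughput contrast on this slide can be sketched numerically. This is only an illustration under stated assumptions: the flight times and passenger counts below are hypothetical round numbers, not figures from the deck, and `times_faster` and `throughput` are names made up for this sketch.

```python
def times_faster(extime_y, extime_x):
    """Return n such that 'X is n times faster than Y' (ExTime(Y) / ExTime(X))."""
    return extime_y / extime_x


def throughput(plane):
    """Passengers delivered per hour of flight."""
    return plane["passengers"] / plane["hours"]


# Hypothetical placeholder figures, chosen only to make the arithmetic easy.
concorde = {"hours": 3.0, "passengers": 100}
boeing_747 = {"hours": 6.0, "passengers": 400}

# Latency view: how many times faster is the Concorde for a single trip?
speed_ratio = times_faster(boeing_747["hours"], concorde["hours"])

# Throughput view: how many times more passengers per hour does the 747 move?
throughput_ratio = throughput(boeing_747) / throughput(concorde)

if __name__ == "__main__":
    print(f"Concorde is {speed_ratio:.1f}x faster per trip")
    print(f"Boeing 747 has {throughput_ratio:.1f}x the throughput")
```

With these assumed numbers, each aircraft "wins" by a factor of two, depending on whether execution time or throughput is the metric.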

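  10. Cycles Per Instruction
      • "Average Cycles per Instruction"

        $$\text{CPI} = \frac{\text{CPU Time} \times \text{Clock Rate}}{\text{Instruction Count}} = \frac{\text{Cycles}}{\text{Instruction Count}}$$

        $$\text{CPU time} = \text{CycleTime} \times \sum_{i=1}^{n} \text{CPI}_i \times I_i$$

      • "Instruction Frequency"

        $$\text{CPI} = \sum_{i=1}^{n} \text{CPI}_i \times F_i, \qquad F_i = \frac{I_i}{\text{Instruction Count}}$$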

  11. Example: Calculating CPI
      Base Machine (Reg / Reg), typical mix:

      Op       Freq   Cycles   CPI(i)   (% Time)
      ALU      50%    1        0.5      (33%)
      Load     20%    2        0.4      (27%)
      Store    10%    2        0.2      (13%)
      Branch   20%    2        0.4      (27%)
      Total                    1.5
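
The CPI example above is a frequency-weighted sum of per-class cycle counts. Here is a minimal Python sketch of that calculation using the instruction mix from this slide; the names `MIX`, `cpi`, and `time_share` are illustrative and not taken from the deck.

```python
# Instruction mix from the slide: op -> (frequency F_i, cycles CPI_i)
MIX = {
    "ALU": (0.50, 1),
    "Load": (0.20, 2),
    "Store": (0.10, 2),
    "Branch": (0.20, 2),
}


def cpi(mix):
    """Average CPI = sum over classes of CPI_i * F_i."""
    return sum(freq * cycles for freq, cycles in mix.values())


def time_share(mix):
    """Fraction of execution time spent in each instruction class."""
    total = cpi(mix)
    return {op: freq * cycles / total for op, (freq, cycles) in mix.items()}


if __name__ == "__main__":
    print(f"CPI = {cpi(MIX):.2f}")
    for op, share in time_share(MIX).items():
        print(f"{op:6s} {share:.0%} of time")
```

The time shares show why the "% Time" column differs from the "Freq" column: ALU operations are half of all instructions but only about a third of the cycles.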

  12. Measurement Tools
      • Benchmarks, Traces, Mixes
      • Hardware: Cost, delay, area, power estimation
      • Simulation (many levels)
        • ISA, RT, Gate, Circuit
      • Rules of Thumb
      • Fundamental "Laws"/Principles

  13. Applications for Measuring Performance
      • Real applications
        • gcc, MS Word, Photoshop
      • Modified (or scripted) applications
        • Enhance portability
        • Emphasize the required criteria (such as using scripts instead of I/O when CPU power is being measured)
      • Kernels
        • Small, key pieces from real programs
      • Toy benchmarks
        • No particular use, just small code such as quicksort
      • Synthetic benchmarks

  14. Performance: What to measure
      • Usually rely on benchmarks vs. real workloads
      • To increase predictability, collections of benchmark applications (benchmark suites) are popular
      • SPECCPU: popular desktop benchmark suite
        • CPU only, split between integer and floating point programs
        • SPECint2000 has 12 integer programs, SPECfp2000 has 14 floating point programs
        • SPECCPU2006 (12 integer, 17 FP)
        • SPECSFS (NFS file server) and SPECWeb (web server) added as server benchmarks
      • Embedded (EEMBC)
        • 34 kernels
      • Transaction Processing Performance Council (TPC) measures server performance and cost-performance for databases
        • TPC-C: complex query for Online Transaction Processing
        • TPC-H: models ad hoc decision support
        • TPC-W: a transactional web benchmark
        • TPC-App: application server and web services benchmark

  15. SPEC: System Performance Evaluation Cooperative
      • First Round 1989
        • 10 programs yielding a single number ("SPECmarks")
      • Second Round 1992
        • SPECint92 (6 integer programs) and SPECfp92 (14 floating point programs)
      • Third Round 1995
        • New set of programs: SPECint95 (8 integer programs) and SPECfp95 (10 floating point programs)
        • "Benchmarks useful for 3 years"
      • SPEC CPU 2000
      • SPEC CPU 2006

  16. SPEC CPU2000

  17. CINT 2006

      Benchmark        Language   Application area
      400.perlbench    C          PERL Programming Language
      401.bzip2        C          Compression
      403.gcc          C          C Compiler
      429.mcf          C          Combinatorial Optimization
      445.gobmk        C          Artificial Intelligence: go
      456.hmmer        C          Search Gene Sequence
      458.sjeng        C          Artificial Intelligence: chess
      462.libquantum   C          Physics: Quantum Computing
      464.h264ref      C          Video Compression
      471.omnetpp      C++        Discrete Event Simulation
      473.astar        C++        Path-finding Algorithms
      483.xalancbmk    C++        XML Processing

  18. CFP 2006

      Benchmark       Language    Application area
      410.bwaves      Fortran     Fluid Dynamics
      416.gamess      Fortran     Quantum Chemistry
      433.milc        C           Physics: Quantum Chromodynamics
      434.zeusmp      Fortran     Physics/CFD
      435.gromacs     C/Fortran   Biochemistry/Molecular Dynamics
      436.cactusADM   C/Fortran   Physics/General Relativity
      437.leslie3d    Fortran     Fluid Dynamics
      444.namd        C++         Biology/Molecular Dynamics
      447.dealII      C++         Finite Element Analysis
      450.soplex      C++         Linear Programming, Optimization
      453.povray      C++         Image Ray-tracing
      454.calculix    C/Fortran   Structural Mechanics
      459.GemsFDTD    Fortran     Computational Electromagnetics
      465.tonto       Fortran     Quantum Chemistry
      470.lbm         C           Fluid Dynamics
      481.wrf         C/Fortran   Weather Prediction
      482.sphinx3     C           Speech Recognition

  19. Summarizing Performance

  20. Means

  21. Weighted Means

  22. Relations among Means
      Equality holds if and only if all the elements are identical.
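
The inequality this slide refers to did not survive in the transcript. Assuming it is the standard relation for positive values (harmonic mean ≤ geometric mean ≤ arithmetic mean), the sketch below checks it with Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later). The helper name `three_means` and the sample values are illustrative.

```python
from statistics import mean, geometric_mean, harmonic_mean


def three_means(values):
    """Return (arithmetic, geometric, harmonic) means of positive values."""
    return mean(values), geometric_mean(values), harmonic_mean(values)


if __name__ == "__main__":
    # Distinct values: the three means are strictly ordered.
    am, gm, hm = three_means([10, 20])
    print(f"AM={am:.3f}  GM={gm:.3f}  HM={hm:.3f}")

    # Identical values: the three means coincide (the equality case).
    am, gm, hm = three_means([5, 5, 5])
    print(f"AM={am:.3f}  GM={gm:.3f}  HM={hm:.3f}")
```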

  23. Summarizing Performance

      System   Rate (Task 1)   Rate (Task 2)
      A        10              20
      B        20              10

      Which system is faster?

  24. … depends who's selling

      Average throughput:
      System   Rate (Task 1)   Rate (Task 2)   Average
      A        10              20              15
      B        20              10              15

      Throughput relative to B:
      System   Rate (Task 1)   Rate (Task 2)   Average
      A        0.50            2.00            1.25
      B        1.00            1.00            1.00

      Throughput relative to A:
      System   Rate (Task 1)   Rate (Task 2)   Average
      A        1.00            1.00            1.00
      B        2.00            0.50            1.25
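
The three summaries on this slide can be reproduced in a few lines. The sketch below normalizes the task rates of systems A and B to a chosen reference system and averages them; the names `RATES`, `normalized`, and `summarize` are illustrative. The geometric-mean comparison at the end is a standard observation added here for contrast, not something stated on this slide.

```python
from statistics import mean, geometric_mean

# Rates from the slide: system -> [rate on task 1, rate on task 2]
RATES = {"A": [10, 20], "B": [20, 10]}


def normalized(rates, reference):
    """Divide every system's rates by the reference system's rates, task by task."""
    ref = rates[reference]
    return {system: [r / b for r, b in zip(values, ref)]
            for system, values in rates.items()}


def summarize(rates, reference, average):
    """Average each system's normalized rates with the given averaging function."""
    return {system: average(values)
            for system, values in normalized(rates, reference).items()}


if __name__ == "__main__":
    print("Raw arithmetic mean:  ", {s: mean(v) for s, v in RATES.items()})
    print("Arithmetic, rel. to B:", summarize(RATES, "B", mean))
    print("Arithmetic, rel. to A:", summarize(RATES, "A", mean))
    # The geometric mean of normalized rates does not depend on the reference.
    print("Geometric, rel. to B: ", summarize(RATES, "B", geometric_mean))
    print("Geometric, rel. to A: ", summarize(RATES, "A", geometric_mean))
```

With the arithmetic mean of normalized rates, whichever system is not the reference looks 25% better, which is the point of "depends who's selling".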

  25. Power and Energy
      • Energy to complete operation (Joules)
        • Corresponds approximately to battery life
        • (Battery energy capacity actually depends on rate of discharge)
      • Peak power dissipation (Watts = Joules/second)
        • Affects packaging (power and ground pins, thermal design)
      • di/dt, peak change in supply current (Amps/second)
        • Affects power supply noise (power and ground pins, decoupling capacitors)

  26. Peak Power versus Lower Energy
      • System A has higher peak power, but lower total energy
      • System B has lower peak power, but higher total energy
      • Integrate power curve to get energy
      [Figure: power vs. time curves for systems A and B, with Peak A and Peak B marked]
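
"Integrate power curve to get energy" can be illustrated with a simple trapezoidal rule over sampled power readings. This is a sketch only: the two power traces below are hypothetical, invented to reproduce the qualitative situation on the slide (A peaks higher, B consumes more energy), and `energy_joules` is an illustrative name.

```python
def energy_joules(times_s, power_w):
    """Trapezoidal integration of a sampled power curve (Watts over seconds)."""
    if len(times_s) != len(power_w):
        raise ValueError("times_s and power_w must have the same length")
    total = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


# Hypothetical traces sampled once per second (not data from the slides).
TIMES = [0, 1, 2, 3, 4, 5, 6]
POWER_A = [0, 10, 10, 0, 0, 0, 0]  # short burst with a high peak
POWER_B = [0, 5, 5, 5, 5, 5, 0]    # lower peak, but runs longer

if __name__ == "__main__":
    for name, trace in (("A", POWER_A), ("B", POWER_B)):
        print(f"System {name}: peak {max(trace)} W, "
              f"energy {energy_joules(TIMES, trace):.1f} J")
```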

  27. Amdahl's Law
      Speedup due to enhancement E:

      $$\text{Speedup}(E) = \frac{\text{ExTime w/o } E}{\text{ExTime w/ } E} = \frac{\text{Performance w/ } E}{\text{Performance w/o } E}$$

      Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected.

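  28. Amdahl's Law

      $$\text{ExTime}_{new} = \text{ExTime}_{old} \times \left[ (1 - \text{Fraction}_{enhanced}) + \frac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}} \right]$$

      $$\text{Speedup}_{overall} = \frac{\text{ExTime}_{old}}{\text{ExTime}_{new}} = \frac{1}{(1 - \text{Fraction}_{enhanced}) + \dfrac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}}$$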

  29. Amdahl's Law
      • Floating point instructions improved to run 2X; but only 10% of actual instructions are FP

      $$\text{ExTime}_{new} = \;?$$

      $$\text{Speedup}_{overall} = \;?$$

  30. Amdahl's Law
      • Floating point instructions improved to run 2X; but only 10% of actual instructions are FP

      $$\text{Speedup}_{overall} = \frac{\text{ExTime}_{old}}{\text{ExTime}_{new}} = \frac{1}{(1 - \text{Fraction}_{enhanced}) + \dfrac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}}$$

      $$\text{ExTime}_{new} = \text{ExTime}_{old} \times \left(0.9 + \frac{0.1}{2}\right) = 0.95 \times \text{ExTime}_{old}$$

      $$\text{Speedup}_{overall} = \frac{1}{0.95} = 1.053$$
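
The floating-point example above can be reproduced with a small helper. This is a minimal sketch of Amdahl's Law as stated on these slides; the function name `amdahl_speedup` and its argument checks are illustrative additions.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup = 1 / ((1 - F) + F / S)."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


if __name__ == "__main__":
    # Slide example: 10% of instructions are FP, and FP runs 2x faster.
    print(f"FP 2x faster:    {amdahl_speedup(0.10, 2):.3f}")
    # Even an enormous FP speedup is capped near 1 / (1 - 0.10).
    print(f"FP 1e9x faster:  {amdahl_speedup(0.10, 1e9):.3f}")
```

The second call shows the practical lesson: when only 10% of the work is enhanced, the overall speedup cannot exceed about 1.11 no matter how large the local speedup is.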

  31. "Make The Common Case Fast"
      • All instructions require an instruction fetch, only a fraction require a data fetch/store
        • Optimize instruction access over data access
      • Programs exhibit locality
        • Spatial Locality: items with addresses near one another tend to be referenced close together in time
        • Temporal Locality: recently accessed items are likely to be accessed in the near future
      • Access to small memories is faster
        • Provide a storage hierarchy such that the most frequent accesses are to the smallest (closest) memories
      [Figure: storage hierarchy: Reg's, Cache, Memory, Disk / Tape]

  32. Metrics of Performance

      Level (top to bottom)                        Metric
      Application                                  Answers per month; operations per second
      Programming Language, Compiler, ISA          (Millions) of instructions per second: MIPS;
                                                   (millions) of (FP) operations per second: MFLOP/s
      Datapath, Control                            Megabytes per second
      Function Units; Transistors, Wires, Pins     Cycles per second (clock rate)

  33. Basics of Performance

  34. Details of CPI

  35. Aspects of CPU Performance

      $$\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$$

                     Inst Count   CPI   Clock Rate
      Program        X
      Compiler       X            (X)
      Inst. Set      X            X
      Organization                X     X
      Technology                        X
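
The CPU time relation (instructions per program, times cycles per instruction, times seconds per cycle) is easy to evaluate directly. In the sketch below, the instruction count and clock rate are hypothetical values picked for round arithmetic; only the CPI of 1.5 comes from the earlier CPI example. `cpu_time_seconds` is an illustrative name.

```python
def cpu_time_seconds(instruction_count, cpi, clock_rate_hz):
    """CPU time = instructions * (cycles / instruction) * (seconds / cycle)."""
    if clock_rate_hz <= 0:
        raise ValueError("clock_rate_hz must be positive")
    return instruction_count * cpi / clock_rate_hz


if __name__ == "__main__":
    # Hypothetical program: 1e9 instructions on a 500 MHz clock, CPI of 1.5.
    base = cpu_time_seconds(1e9, 1.5, 500e6)
    # A compiler change that removes 10% of the instructions (assumed figure).
    fewer_insts = cpu_time_seconds(0.9e9, 1.5, 500e6)
    # A technology change that doubles the clock rate (assumed figure).
    faster_clock = cpu_time_seconds(1e9, 1.5, 1000e6)
    print(f"base: {base:.2f} s, fewer instructions: {fewer_insts:.2f} s, "
          f"faster clock: {faster_clock:.2f} s")
```

Varying one argument at a time mirrors the table: each design level changes CPU time only through the factors marked with an X in its row.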

  36. Summary
      • Cost / Price
      • Integrated Circuits Costs
      • Measurements
      • SPEC: System Performance Evaluation Cooperative
      • Amdahl's Law: Make common case fast
      • Aspects of CPU Performance
