Dezső Sima Fall 2007



  1. Multicore Processors (5). Dezső Sima, Fall 2007 (Ver. 2.1). © Dezső Sima, 2007

  2. 10.3 IBM’s MC processors • 10.3.1 POWER line • 10.3.2 Cell BE

  3. 10.3 IBM’s MC processors 10.3.1 POWER line • POWER4 (180 nm, 10/2001) • POWER4+ (130 nm, 11/2002) • POWER5 (130 nm, 5/2004) • POWER5+ (90 nm, 10/2005) • POWER6 (65 nm, 5/2007)

  4. 10.3.1 Evolution of IBM’s major RISC lines Figure: The evolution of IBM’s major RISC lines

  5. 10.3.1 POWER4 (1) Figure: POWER4 chip logical view [3.6]. Figure labels: Service Processor; Core Interface Unit (crossbar); Power-On Reset; Built-In Self-Test; Non-Cacheable Unit; Multi-Chip Module

  6. 10.3.1 POWER4 (2) Figure: Logical view of the L3 controller [3.5]

  7. 10.3.1 POWER4 (3) Figure: The memory controller of the POWER4 [3.5]

  8. 10.3.1 POWER4 (4) Fabric Controller Figure: I/O controller of the POWER4 [3.5]

  9. 10.3.1 POWER4 (5) Figure: POWER4 chip [3.11]

  10. 10.3.1 POWER4 (6) Table: Main features of IBM’s dual-core POWER line
     POWER line            POWER4
     Dual/Quad-Core        DC
     Introduced            10/2001
     Technology            180 nm
     Die size              412 mm²
     Nr. of transistors    174 million
     fc [GHz]              1.3
     L2 size/allocation    1.44 MB/shared
     L2 implementation     On-chip
     L3 size               32 MB
     L3 implementation     Tags on-chip, data off-chip
     Mem. contr.           Off-chip
     TDP [W]               115/125
     Packaging             SCM¹/MCM²
     Dual threaded
     Power management
     ¹ SCM: Single Chip Module  ² MCM: Multi Chip Module  ³ DCM: Dual Chip Module  ⁴ DCM: Dual Core Module  ⁵ QCM: Quad Core Module  ⁶ DPM: Dynamic Power Management

  11. 10.3.1 POWER4+ (1) Figure: New features of the POWER4+ [3.3]

  12. 10.3.1 POWER4+ (2) Table: Main features of IBM’s dual-core POWER line
     POWER line            POWER4          POWER4+
     Dual/Quad-Core        DC              DC
     Introduced            10/2001         11/2002
     Technology            180 nm          130 nm
     Die size              412 mm²         380 mm²
     Nr. of transistors    174 million     184 million
     fc [GHz]              1.3             1.7
     L2 size/allocation    1.44 MB/shared  1.5 MB/shared
     L2 implementation     On-chip         On-chip
     L3 size               32 MB           32 MB
     L3 implementation     Tags on-chip, data off-chip
     Mem. contr.           Off-chip        On-chip
     TDP [W]               115/125         70
     Packaging             SCM¹/MCM²       SCM¹/MCM²
     Dual threaded
     Power management
     ¹ SCM: Single Chip Module  ² MCM: Multi Chip Module  ³ DCM: Dual Chip Module  ⁴ DCM: Dual Core Module  ⁵ QCM: Quad Core Module  ⁶ DPM: Dynamic Power Management

  13. 10.3.1 POWER5 (1) (Exclusive L3) Figure 5.14: Contrasting POWER4 and POWER5 system structures [3.1]

  14. 10.3.1 POWER5 (2) Figure: Block diagram of the POWER5 (1) [3.1]

  15. 10.3.1 POWER5 (3) Figure: Block diagram of the POWER5 (2) [3.12]

  16. 10.3.1 POWER5 (4) Figure: Floorplan of the POWER5 [3.13]

  17. 10.3.1 POWER5 (6) POWER4: 180 nm, 412 mm². POWER5: 130 nm, 389 mm² (~3 % enlarged). Figure: Contrasting the floor plans of the POWER4 and POWER5 dies [3.11], [3.13]

  18. 10.3.1 POWER5 (7) POWER5+ Dual-Core Module Figure: Packaging alternatives of the POWER4/5 processors. Source: Partridge R. and Ghatpande S., “IBM Introduces POWER5+ and Quad-Core Modules in System p5,” Tech Trends Monthly, Nov./Dec. 2005

  19. 10.3.1 POWER5 (8) Figure: Quad-Chip POWER4 module (MCM) and a 32-way POWER4 system [3.7]

  20. 10.3.1 POWER5 (10) Figure: Photos of Dual-Chip Modules (DCMs) and Multi-Chip Modules (MCM) of the POWER5 [3.7]

  21. 10.3.1 POWER5 (11) Figure: The Multi-chip module of the POWER5 [3.10]

  22. 10.3.1 POWER5 (12) Table: Main features of IBM’s dual-core POWER line
     POWER line            POWER4          POWER4+         POWER5
     Dual/Quad-Core        DC              DC              DC
     Introduced            10/2001         11/2002         5/2004
     Technology            180 nm          130 nm          130 nm
     Die size              412 mm²         380 mm²         389 mm²
     Nr. of transistors    174 million     184 million     276 million
     fc [GHz]              1.3             1.7             1.65/1.9
     L2 size/allocation    1.44 MB/shared  1.5 MB/shared   1.9 MB/shared
     L2 implementation     On-chip         On-chip         On-chip
     L3 size               32 MB           32 MB           36 MB
     L3 implementation     Tags on-chip, data off-chip
     Mem. contr.           Off-chip        On-chip         On-chip
     TDP [W]               115/125         70              80 (est)
     Packaging             SCM¹/MCM²       SCM¹/MCM²       DCM³/MCM²
     Dual threaded
     Power management                                      DPM⁶
     ¹ SCM: Single Chip Module  ² MCM: Multi Chip Module  ³ DCM: Dual Chip Module  ⁴ DCM: Dual Core Module  ⁵ QCM: Quad Core Module  ⁶ DPM: Dynamic Power Management

  23. 10.3.1 POWER5+ (1) Figure: Block diagram of the POWER5+. Source: Vetter S. et al., “IBM System p5 Quad-Core Module Based on POWER5+ Technology,” Redbooks paper, IBM Corp. 2006, http://www.redbooks.ibm.com/redpapers/pdfs/redp4150.pdf

  24. 10.3.1 POWER5 (9) Figure: Interpretation of Dual-Chip Modules (DCMs) and Multi-Chip Modules (MCM) of the POWER5 [3.7]

  25. 10.3.1 POWER5+ (2) Figure: Dual-Core Modules (DCMs) and Quad-Core Modules (QCM) of the POWER5+ [3.14]

  26. 10.3.1 POWER5+ (3) Table: Main features of IBM’s dual-core POWER line
     POWER line            POWER4          POWER4+         POWER5          POWER5+
     Dual/Quad-Core        DC              DC              DC              DC
     Introduced            10/2001         11/2002         5/2004          10/2005
     Technology            180 nm          130 nm          130 nm          90 nm
     Die size              412 mm²         380 mm²         389 mm²         230 mm²
     Nr. of transistors    174 million     184 million     276 million     276 million
     fc [GHz]              1.3             1.7             1.65/1.9        1.92
     L2 size/allocation    1.44 MB/shared  1.5 MB/shared   1.9 MB/shared   1.9 MB/shared
     L2 implementation     On-chip         On-chip         On-chip         On-chip
     L3 size               32 MB           32 MB           36 MB           36 MB
     L3 implementation     Tags on-chip, data off-chip
     Mem. contr.           Off-chip        On-chip         On-chip         On-chip
     TDP [W]               115/125         70              80 (est)        70
     Packaging             SCM¹/MCM²       SCM¹/MCM²       DCM³/MCM²       DCM⁴/QCM⁵
     Dual threaded
     Power management                                      DPM⁶            DPM⁶
     ¹ SCM: Single Chip Module  ² MCM: Multi Chip Module  ³ DCM: Dual Chip Module  ⁴ DCM: Dual Core Module  ⁵ QCM: Quad Core Module  ⁶ DPM: Dynamic Power Management

  27. 10.3.1 POWER6 (1) POWER6’s main features [3.15b]
     • ultra-high frequency (4.7 GHz), dual-core, dual-threaded SMT
     • 13 FO4 design
     • private 4 MB L2 caches
     • partially integrated 32 MB L3 victim cache
     • minimization of excessive circuitry to reduce dissipation (modest speculation and out-of-order execution, no renaming)
     • many functions of decoding and instruction grouping are pushed into predecoding (4 stages) (each added stage of L2 latency causes about 0.5 % loss, whereas each added stage after the I-cache access results in about 1 % loss)
     • increased dispatch and completion bandwidth (to 7 instructions per thread)
     • the L2 cache, the SMP interconnect, and parts of the memory and I/O subsystem operate at 0.5 fc, the L3 operates at one-quarter of fc, and the memory controller runs at up to 3.2 GHz (in the POWER5 the L2 operates at fc, the remaining components at 0.5 fc)
     • since the L2 operates at 0.5 fc, the width of the load and store interfaces was doubled
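The clock ratios and per-stage losses listed for the POWER6 can be turned into concrete numbers. The following is a minimal sketch, not taken from the slides: the function names are hypothetical, and the loss estimate assumes that the per-stage penalties simply add up, which is a first-order simplification.

```python
# Sketch: POWER6 clock domains and a first-order pipeline-loss estimate.
# The ratios (0.5 fc, 0.25 fc) and the per-stage losses (0.5 %, 1 %) come
# from the slide; the function names and the additive model are assumptions.

CORE_GHZ = 4.7  # fc of the POWER6, as given on the slide


def power6_clock_domains(fc_ghz: float) -> dict:
    """Return the clock of each domain in GHz for a given core clock fc."""
    return {
        "core": fc_ghz,
        "l2_and_smp_interconnect": 0.5 * fc_ghz,  # L2, SMP interconnect: 0.5 fc
        "l3": 0.25 * fc_ghz,                      # L3: one-quarter of fc
    }


def pipeline_loss_percent(extra_l2_stages: int, extra_post_icache_stages: int) -> float:
    """Estimate the performance loss in percent, assuming losses add linearly."""
    return 0.5 * extra_l2_stages + 1.0 * extra_post_icache_stages


domains = power6_clock_domains(CORE_GHZ)
for name, ghz in domains.items():
    print(f"{name}: {ghz:.3f} GHz")

# Example: two extra stages of L2 latency plus one extra stage after the I-cache.
print(f"estimated loss: {pipeline_loss_percent(2, 1):.1f} %")
```

Under these assumptions the L2 runs at 2.35 GHz and the L3 at 1.175 GHz for fc = 4.7 GHz, which shows why the half-speed L2 needed load and store interfaces of twice the width to keep up the same bandwidth.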

  28. 10.3.1 POWER6 (2) At its introduction, the POWER6 (in the IBM System p570) had the highest figures for SPECint2006, SPECfp2006, SPECjbb2005 (Java performance) and TPC-C (transaction performance).

  29. 10.3.1 POWER6 (3) POWER6 POWER5+ Hardware support of decimal arithmetic Figure: Contrasting the block diagrams of the POWER5 and POWER6 processors [3.15a]

  30. 10.3.1 POWER6 (4) Figure: Comparing the POWER5 and POWER6 processors [3.15b]

  31. 10.3.1 POWER6 (5) Table: Throughput comparison POWER6 vs POWER5 [3.15b]

  32. 10.3.1 POWER6 (6) [3.15b]

  33. 10.3.1 POWER6 (7) Figure: The internal pipelines of the POWER6 and the POWER5 [3.15b]

  34. 10.3.1 POWER6 (8) Figure: First level nodal topology of the POWER6 vs POWER5 [3.15b]

  35. 10.3.1 POWER6 (9) Figure: Second level topology of the POWER5 vs POWER6 [3.15b]

  36. 10.3.1 POWER6 (10) Table: POWER6 processor functional signal I/O-pin comparison for various system types [3.15b]

  37. 10.3.1 POWER6 (11) Figure: Micrograph of the POWER6 [3.15b]

  38. 10.3.1 POWER6 (12) Table: Main features of IBM’s dual-core POWER line
     POWER line            POWER4          POWER4+         POWER5          POWER5+         POWER6
     Dual/Quad-Core        DC              DC              DC              DC              DC
     Introduced            10/2001         11/2002         5/2004          10/2005         5/2007
     Technology            180 nm          130 nm          130 nm          90 nm           65 nm
     Die size              412 mm²         380 mm²         389 mm²         230 mm²         341 mm²
     Nr. of transistors    174 million     184 million     276 million     276 million     790 million
     fc [GHz]              1.3             1.7             1.65/1.9        1.92            4.7
     L2 size/allocation    1.44 MB/shared  1.5 MB/shared   1.9 MB/shared   1.9 MB/shared   2*4 MB/private
     L2 implementation     On-chip         On-chip         On-chip         On-chip         On-chip
     L3 size               32 MB           32 MB           36 MB           36 MB           32 MB
     L3 implementation     Tags on-chip, data off-chip
     L3 impl. (per model)  Tags on-chip    Tags on-chip    Tags on-chip    Tags on-chip    On-chip
     Mem. contr.           Off-chip        On-chip         On-chip         On-chip
     TDP [W]               115/125         70              80 (est)        70              ~100
     Packaging             SCM¹/MCM²       SCM¹/MCM²       DCM³/MCM²       DCM⁴/QCM⁵       n.a.
     Dual threaded
     Power management                                      DPM⁶            DPM⁶            n.a.
     ¹ SCM: Single Chip Module  ² MCM: Multi Chip Module  ³ DCM: Dual Chip Module  ⁴ DCM: Dual Core Module  ⁵ QCM: Quad Core Module  ⁶ DPM: Dynamic Power Management
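One way to read the table is to derive the transistor density of each generation from its die size and transistor count. The sketch below does only that arithmetic; the figures are copied from the table, while the dictionary layout and the function name are illustrative assumptions.

```python
# Sketch: transistor density per POWER generation, derived from the table.
# Each entry is (die size in mm^2, transistor count in millions).
POWER_LINE = {
    "POWER4":  (412, 174),
    "POWER4+": (380, 184),
    "POWER5":  (389, 276),
    "POWER5+": (230, 276),
    "POWER6":  (341, 790),
}


def density_m_per_mm2(die_mm2: float, transistors_m: float) -> float:
    """Return the density in million transistors per mm^2."""
    return transistors_m / die_mm2


densities = {name: density_m_per_mm2(*row) for name, row in POWER_LINE.items()}

for name, density in densities.items():
    print(f"{name:8s} {density:.2f} M transistors/mm^2")
```

The POWER5 to POWER5+ step is a pure shrink in this view: the transistor count stays at 276 million while the die drops from 389 mm² to 230 mm².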

  39. 10.3 IBM’s MC processors 10.3.2 Cell BE • Cell BE (90 nm, 2/2006)

  40. 10.3.2 Cell BE (1) Figure: The history and development cost of the Cell BE [3.17], [3.22]

  41. 10.3.2 Cell BE (2) Figure: Block diagram of the Cell BE [3.19]. Legend: AUC: Atomic Update Cache; BIC: Bus Interface Controller; EIB: Element Interface Bus; LS: Local Store (256 KB); MFC: Memory Flow Controller; MIC: Memory Interface Controller; PPE: Power Processing Element; PXU: POWER Execution Unit; SMF: Synergistic Memory Flow Unit; SPU: Synergistic Processor Unit; SXU: Synergistic Execution Unit; XDR: Rambus DRAM

  42. 10.3.2 Cell BE (3) Design parameters of the Cell BE: • PPE: dual-threaded • > 200 GFLOPS (SP) • > 20 GFLOPS (DP) • > 25 GB/s memory BW • > 75 GB/s I/O BW • > 300 GB/s EIB BW • fc > 4 GHz (lab) Figure: Main design parameters of the Cell BE [3.28]
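The "> 200 GFLOPS (SP)" figure can be reproduced with a short peak-rate calculation. This is a sketch under assumptions that are not stated on the slide: 8 SPEs, 4 single-precision lanes per 128-bit SIMD operation, one fused multiply-add (counted as 2 FLOPs) per lane per cycle, and a 3.2 GHz product clock. Only the SPEs are counted; the PPE's contribution is ignored.

```python
# Sketch: peak single-precision throughput of the Cell BE's SPEs.
# All default parameters are assumptions, not values from the slide.

def peak_sp_gflops(clock_ghz: float,
                   spes: int = 8,            # assumed number of SPEs
                   simd_lanes: int = 4,      # 128-bit SIMD / 32-bit floats
                   flops_per_lane: int = 2   # one multiply-add per cycle
                   ) -> float:
    """Return the peak SP rate in GFLOPS, assuming one SIMD FMA per cycle per SPE."""
    return spes * simd_lanes * flops_per_lane * clock_ghz


shipping = peak_sp_gflops(3.2)  # assumed product clock
lab = peak_sp_gflops(4.0)       # lower bound of the lab clock on the slide

print(f"peak SP at 3.2 GHz: {shipping:.1f} GFLOPS")
print(f"peak SP at 4.0 GHz: {lab:.1f} GFLOPS")
```

With these assumptions the SPEs alone reach 204.8 GFLOPS at 3.2 GHz, consistent with the slide's lower bound of 200 GFLOPS.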

  43. 10.3.2 Cell BE (4) Figure : Cell SPE architecture [3.16]

  44. 10.3.2 Cell BE (5) Figure: Block diagram of the SPE [3.19]

  45. 10.3.2 Cell BE (6) Figure: Pipeline stages of the Cell BE [3.19]

  46. 10.3.2 Cell BE (7) Figure: Floor plan of a single SPE [3.19]

  47. 10.3.2 Cell BE (8) Principle of operation of the Element Interface Bus (EIB) [3.23]

  48. 10.3.2 Cell BE (9) Figure: The Element Interface Bus (EIB) [3.19]

  49. 10.3.2 Cell BE (10) Figure: The Synergistic Memory Flow unit (SMF) [3.19]

  50. Figure: PPE block diagram [3.28]
