
Exascale Evolution


  1. Exascale Evolution Brad Benton, IBM March 15, 2010 www.openfabrics.org

  2. Agenda • Exascale Challenges • On the Path to Exascale: A Look at Blue Waters www.openfabrics.org 2

  3. www.openfabrics.org Exascale Challenges 3

  4. Exascale Challenges www.openfabrics.org • Challenges at every level of system design • Managing 500M to 1B (most likely heterogeneous) cores • Programming models to exploit multi-core + accelerators • Interconnect • How will IB/RC scale to exascale? • How do we “get off the bus”? • How can we put more capability in the interconnect? • Power Management • Power vs. Performance tradeoffs 4
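One way to see the IB/RC scaling concern above: a fully connected reliable-connection (RC) model keeps per-peer queue-pair state, so memory grows with process count. The per-QP byte figure in this sketch is an assumed illustrative number, not a measured value for any particular HCA:

```python
# Back-of-the-envelope sketch: memory cost of all-to-all InfiniBand RC
# connections at exascale core counts. The 1 KB/QP figure is assumed
# purely for illustration.

def rc_state_per_process(n_processes, qp_state_bytes=1024):
    """Bytes of queue-pair state one process needs to reach all others."""
    return (n_processes - 1) * qp_state_bytes

for n in (10_000, 1_000_000, 1_000_000_000):
    gib = rc_state_per_process(n) / 2**30
    print(f"{n:>13,} processes -> {gib:,.1f} GiB of QP state per process")
```

At a billion cores even a kilobyte per connection approaches a terabyte of connection state per process, which is why shared or dynamically created transports are of interest.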

  5. Exascale Challenges www.openfabrics.org • Challenges at every level of system design • Resilience/Fault-Tolerance • At this scale, something will always be broken or in the process of breaking • Development Environment/Performance Tuning • Workflow Management/Process Steering • Data Management/Storage/Visualization 5

  6. Exascale Challenges • Resiliency/Fault-Tolerance • F/T Model • Fault Detection • Fault Isolation • Fault Containment • Fault Recovery • Re-integration • Software Resiliency • More than just checkpoint/restart • Containers/virtualization • suspend/migrate/resume www.openfabrics.org 6
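A rough illustration of why checkpoint/restart alone strains at scale: Young's first-order formula gives the optimal interval between checkpoints as a function of checkpoint cost and system MTBF. The checkpoint cost and MTBF values below are assumed for illustration only:

```python
import math

def young_interval(checkpoint_cost_s, mtbf_s):
    """First-order optimal checkpoint interval (Young, 1974):
    T_opt = sqrt(2 * C * MTBF)."""
    return math.sqrt(2 * checkpoint_cost_s * mtbf_s)

# Assumed numbers: a 10-minute checkpoint, and a system MTBF that
# shrinks as component count grows.
ckpt = 600.0                        # seconds to write one checkpoint
for mtbf_h in (24.0, 1.0):          # system MTBF in hours
    t = young_interval(ckpt, mtbf_h * 3600)
    overhead = ckpt / t             # rough fraction of time checkpointing
    print(f"MTBF {mtbf_h:4.1f} h -> checkpoint every {t/60:5.1f} min "
          f"(~{overhead:.0%} overhead)")
```

As MTBF drops toward the checkpoint cost itself, the machine spends a large fraction of its time saving state, motivating the containment/re-integration and migration approaches on the slide.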

  7. Programming Models • MPI • Will it survive in an exascale world? (its demise was predicted at petascale, but it seems to be doing okay) • Evolve hybrid language models: MPI + “What?” • OpenMP • GPU Accelerators (CUDA, OpenCL) • PGAS languages • Greater exploitation of autotuning, i.e., programs that write programs • ATLAS • FFTW • IBM HPC Toolkit has some of this www.openfabrics.org 7
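The autotuning idea behind ATLAS and FFTW can be sketched as an empirical search over a kernel's tuning parameters, picking whatever runs fastest on the machine at hand. The toy kernel and candidate block sizes here are illustrative only:

```python
import timeit

def blocked_sum(data, block):
    """Toy kernel: sum a list in fixed-size blocks."""
    total = 0
    for i in range(0, len(data), block):
        total += sum(data[i:i + block])
    return total

def autotune(data, candidates=(16, 64, 256, 1024)):
    """Time each candidate block size and keep the fastest, the way
    ATLAS/FFTW search a parameter space empirically at install time."""
    return min(candidates,
               key=lambda b: timeit.timeit(lambda: blocked_sum(data, b),
                                           number=20))

data = list(range(10_000))
print("fastest block size here:", autotune(data))
```

Real autotuners generate and compile many kernel variants rather than timing one Python loop, but the search-and-select structure is the same.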

  8. www.openfabrics.org On the Path to Exascale: A Look at Blue Waters 8

  9. NCSA Blue Waters • Joint effort between NCSA and the University of Illinois: http://www.ncsa.illinois.edu/BlueWaters/ • First deliverable of a system based on PERCS technology (2011) • Will be the world’s first sustained-petascale system for open scientific research • See http://www.ncsa.illinois.edu/BlueWaters/pdfs/snir-power7.pdf for more detailed information www.openfabrics.org 9

  10. Blue Waters Overview www.openfabrics.org • Approximately 10 PF/s peak • More than 300,000 cores (homogeneous) • More than 1 petabyte of memory • More than 10 petabytes of disk storage • More than 0.5 exabyte of archival storage • More than 1 PF/s sustained on scientific applications 10

  11. Building Blue Waters Blue Waters will be the most powerful computer in the world for scientific research when it comes online in the summer of 2011. • Blue Waters: ~1 PF sustained, >300,000 cores, >1 PB of memory, >10 PB of disk storage, ~500 PB of archival storage, >100 Gbps connectivity • Blue Waters Building Block: 32 IH server nodes, 32 TB memory, 256 TF (peak), 4 storage systems, 10 tape-drive connections • IH Server Node: 8 MCMs (256 cores), 1 TB memory, 8 TF (peak), fully water cooled • Multi-chip Module: 4 Power7 chips, 128 GB memory, 512 GB/s memory bandwidth, 1 TF (peak) • Router: 1,128 GB/s bandwidth • Power7 Chip: 8 cores, 32 threads; L1, L2, L3 cache (32 MB); up to 256 GF (peak); 45 nm technology • Blue Waters is built from components that can also be used to build systems with a wide range of capabilities, from deskside to beyond Blue Waters. CI Days • 22 February 2010 • University of Kentucky

  12. Power7 Chip: Computational Heart of Blue Waters Power7 Chip Quad-chip MCM www.openfabrics.org • Base Technology • 45 nm, 576 mm² • 1.2 B transistors • Chip • 8 cores • 12 execution units/core • 1-, 2-, 4-way SMT/core • Up to 4 FMAs/cycle • Caches • 32 KB I-, D-cache, 256 KB L2/core • 32 MB L3 (private/shared) • Dual DDR3 memory controllers • 128 GB/s peak memory bandwidth (1/2 byte/flop) • Clock range of 3.5 – 4 GHz 12
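The 256 GF peak figure follows directly from the chip parameters above, counting each FMA as two flops (the usual convention):

```python
def peak_gflops(cores, fmas_per_cycle, ghz, flops_per_fma=2):
    """Peak GF/s = cores * FMAs/cycle * flops/FMA * clock (GHz)."""
    return cores * fmas_per_cycle * flops_per_fma * ghz

chip = peak_gflops(cores=8, fmas_per_cycle=4, ghz=4.0)
print(f"Power7 chip peak: {chip:.0f} GF/s")         # matches the 256 GF figure
print(f"bytes/flop at 128 GB/s: {128 / chip:.2f}")  # the 1/2 byte/flop ratio
```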

  13. High-End Server Resilience 13

  14. Feeds and Speeds per MCM • 32 cores • 8 flops/cycle per core • 4 threads per core max • 3.5 – 4 GHz • 1 TF/s • 32 MB L3 • 512 GB/s memory BW (0.5 byte/flop) • 800 W (0.8 W per Gflop/s) 14

  15. One Drawer 8 MCMs, 32 chips, 256 cores • First Level Interconnect • L-Local • HUB to HUB Copper Wiring • 256 Cores www.openfabrics.org 15

  16. Interconnect: 1.1 TB/s HUB www.openfabrics.org 192 GB/s Host Connection 336 GB/s to 7 other local nodes in the same drawer 240 GB/s to local-remote nodes in the same supernode (4 drawers) 320 GB/s to remote nodes 40 GB/s to general purpose I/O 16
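The per-link numbers on this slide sum exactly to the HUB's aggregate bandwidth and to the router figure quoted earlier; a quick check:

```python
# Per-HUB link bandwidths from the slide, in GB/s.
links = {
    "host connection":                     192,
    "local nodes, same drawer (7)":        336,
    "local-remote nodes, same supernode":  240,
    "remote nodes":                        320,
    "general-purpose I/O":                  40,
}
total = sum(links.values())
print(f"aggregate HUB bandwidth: {total} GB/s (~{total/1000:.1f} TB/s)")
```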

  17. www.openfabrics.org 17

  18. One Supernode 4 drawers, 32 MCMs, 128 chips, 1024 cores • Second Level Interconnect • Optical ‘L-Remote’ Links from HUB • Construct Super Node (4 CECs) • 1,024 Cores www.openfabrics.org 18
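The supernode core count follows directly from the packaging hierarchy on the preceding slides:

```python
# Hierarchy from the slides: chip -> MCM -> drawer -> supernode.
CORES_PER_CHIP  = 8
CHIPS_PER_MCM   = 4
MCMS_PER_DRAWER = 8
DRAWERS_PER_SN  = 4

cores_per_mcm    = CORES_PER_CHIP * CHIPS_PER_MCM     # 32 cores/MCM
cores_per_drawer = cores_per_mcm * MCMS_PER_DRAWER    # 256 cores/drawer
cores_per_sn     = cores_per_drawer * DRAWERS_PER_SN  # 1024 cores/supernode
print(cores_per_mcm, cores_per_drawer, cores_per_sn)
```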

  19. BPA: 200 to 480 VAC, 370 to 575 VDC, redundant power, direct site power feed, PDU elimination • Rack: 990.6 mm w × 1828.8 mm d × 2108.2 mm h (39”w × 72”d × 83”h), ~2948 kg (~6500 lbs) • Storage Unit: 4U, 0–6 per rack, up to 384 SFF DASD per unit, file system • Rack Components: compute, storage, switch; 100% cooling, PDU eliminated; input: 8 water lines, 4 power cords; output: ~100 TFLOPs / 24.6 TB / 153.5 TB; 192 PCIe x16 / 12 PCIe x8 • CECs: 2U, 1–12 CECs per rack, 256 cores, 128 SN DIMM slots per CEC, 8, 16, (32) GB DIMMs, 17 PCIe slots, embedded switch, redundant DCA, NW fabric; up to 3,072 cores, 24.6 TB (49.2 TB) • WCU: facility water input, 100% heat to water, redundant cooling, CRAH eliminated www.openfabrics.org 19

  20. How does this affect OFA? • Blue Waters can connect externally via PCIe devices (e.g., InfiniBand) as needed • Blue Waters interconnect • Is RDMA based • Is not InfiniBand (or iWARP or RoCEE) • Hardware support for Global Shared Memory • Pendulum is swinging back to proprietary interconnects (at least at IBM) • Is there a path to OFA compatibility? • how can/should OFA accept/support new/different RDMA interconnects? • how can/should IBM work w/OFA for embracing new interconnect technologies? www.openfabrics.org 20

  21. Exascale Evolution • Technical evolution is not always in a straight line • Different technologies evolve at different times and rates • e.g., Blue Waters is not a direct descendant of RoadRunner/Cell, but rather of POWER/Federation/SP • Reaching exascale levels will require the consolidation and continued evolution of multiple technologies www.openfabrics.org 21
