Runnemede : Disruptive Technologies for UHPC

Runnemede: Disruptive Technologies for UHPC
John Gustafson, Intel Labs
HPC User Forum – Houston 2011

The battle lines are drawn… “We’re going to try to make the entire exascale machine cache-coherent.” —Bill Dally, Nvidia “Caches are for morons.” —Shekhar Borkar, Intel

Intel’s UHPC Approach • Design test chips with the idea of maximizing learning. • Very different from producing product roadmap processor designs. • Going from Peta to Exa is nothing like the last few 1000x increases…

Building with Today’s Technology
TFLOP Machine today:
• Compute: 200 pJ per FLOP, 200 W
• Memory: 0.1 B/FLOP @ 1.5 nJ per Byte, 150 W
• Com: 100 pJ com per FLOP, 100 W
• Disk: 10 TB disk @ 1 TB/disk @ 10 W, 100 W
• Decode and control, translations, etc.; power supply losses, cooling, etc.: 4450 W
• Total: 5 KW
KW Tera, MW Peta, GW Exa?
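The arithmetic behind this slide can be checked with a short sketch. It assumes a sustained rate of 1e12 FLOP/s and assumes that power grows linearly with machine size, which is the premise behind the "KW Tera, MW Peta, GW Exa?" question; the function and variable names are illustrative and not from the talk.

```python
import math

FLOPS = 1e12  # one TFLOP/s, the machine size used on the slide


def watts(joules_per_flop, flops=FLOPS):
    """Power drawn when every FLOP costs the given energy."""
    return joules_per_flop * flops


# Per-component power, derived from the per-FLOP figures on the slide.
budget_w = {
    "compute": watts(200e-12),       # 200 pJ per FLOP
    "memory": watts(0.1 * 1.5e-9),   # 0.1 B/FLOP at 1.5 nJ per byte
    "com": watts(100e-12),           # 100 pJ of communication per FLOP
    "disk": 10 * 10.0,               # 10 disks of 1 TB at 10 W each
    "overhead": 4450.0,              # decode/control, power supply, cooling, etc.
}
total_w = sum(budget_w.values())


def scaled_power_w(tflop_power_w, tflops):
    """Naive linear scaling of the TFLOP budget (an assumption, not a prediction)."""
    return tflop_power_w * tflops


if __name__ == "__main__":
    for name, value in budget_w.items():
        print(f"{name:9s} {value:8.1f} W")
    print(f"total     {total_w:8.1f} W")
    print(f"peta: {scaled_power_w(total_w, 1e3) / 1e6:.1f} MW")
    print(f"exa:  {scaled_power_w(total_w, 1e6) / 1e9:.1f} GW")
```

Under these assumptions the useful work (compute, memory, communication, disk) accounts for 550 W of the 5 KW, and the rest is overhead.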

The Power & Energy Challenge
• TFLOP Machine today: Compute 200 W, Memory 150 W, Com 100 W, Disk 100 W, other 4550 W; total 5 KW
• TFLOP Machine then, with Exa technology: Compute 5 W, Memory 2 W, Com ~5 W, Disk ~3 W, other 5 W; total ~20 W
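A small sketch of what the "then" column implies at exascale. The mapping of the five "then" figures to components is an assumption taken from their order on the slide, the totals are the slide's stated 5 KW and ~20 W, and linear scaling from one TFLOP to an exaflop is also an assumption.

```python
# Stated per-TFLOP totals from the slide, in watts.
TODAY_TOTAL_W = 5000.0

# Assumed mapping of the "then" figures to components (order on the slide).
then_w = {
    "compute": 5.0,
    "memory": 2.0,
    "com": 5.0,    # given as ~5 W
    "disk": 3.0,   # given as ~3 W
    "other": 5.0,
}
then_total_w = sum(then_w.values())

# How much less power a TFLOP would need with the projected technology.
reduction_factor = TODAY_TOTAL_W / then_total_w

# An exaflop is 1e6 TFLOP; scale linearly and convert watts to megawatts.
exa_power_mw = then_total_w * 1e6 / 1e6

if __name__ == "__main__":
    print(f"then total: {then_total_w:.0f} W per TFLOP")
    print(f"reduction:  {reduction_factor:.0f}x")
    print(f"exaflop:    {exa_power_mw:.0f} MW")
```

The component figures sum to the slide's ~20 W, which is what makes an exaflop machine land in the tens of megawatts rather than gigawatts.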

Scaling Assumptions
• 65 nm Core + Local Memory: DP FP Add, Multiply, Integer Core, RF, Router: 5 mm² (50%); Memory 0.35 MB: 5 mm² (50%). Total: 10 mm², 3 GHz, 6 GF, 1.8 W
• 8 nm Core + Local Memory: DP FP Add, Multiply, Integer Core, RF, Router: 0.17 mm² (50%); Memory 0.35 MB: 0.17 mm² (50%). Total: 0.34 mm² (~0.6 mm on a side), 4.6 GHz, 9.2 GF, 0.24 to 0.46 W
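The two design points above can be compared with a few lines of arithmetic. This is a minimal sketch: the `CoreNode` class and `cores_for` helper are hypothetical names, and the exaflop estimate counts only the cores themselves (no interconnect, external memory, or cooling), which the slide does not cover.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreNode:
    """One core plus local memory at a given process node (figures from the slide)."""
    name: str
    area_mm2: float
    freq_ghz: float
    gflops: float
    power_w: float

    @property
    def gflops_per_watt(self):
        return self.gflops / self.power_w

    @property
    def flops_per_cycle(self):
        return self.gflops / self.freq_ghz


core_65nm = CoreNode("65 nm", 10.0, 3.0, 6.0, 1.8)
core_8nm_low = CoreNode("8 nm, 0.24 W", 0.34, 4.6, 9.2, 0.24)
core_8nm_high = CoreNode("8 nm, 0.46 W", 0.34, 4.6, 9.2, 0.46)


def cores_for(target_flops, core):
    """Number of cores needed to reach target_flops at peak rate."""
    return math.ceil(target_flops / (core.gflops * 1e9))


if __name__ == "__main__":
    for core in (core_65nm, core_8nm_low, core_8nm_high):
        print(f"{core.name}: {core.gflops_per_watt:.1f} GF/W, "
              f"{core.flops_per_cycle:.1f} FLOP/cycle")
    n = cores_for(1e18, core_8nm_low)
    print(f"cores for 1 exaflop at 8 nm: {n:,}")
    print(f"core power: {n * 0.24 / 1e6:.1f} to {n * 0.46 / 1e6:.1f} MW")
```

Both design points work out to two FLOP per cycle, so the gain in GF/W comes from frequency and power rather than a wider datapath.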

Near Threshold Logic
[Figure: Energy Efficiency (GOPS/Watt) and Active Leakage Power (mW) versus Supply Voltage (V), 0.2 to 1.4 V, for 65nm CMOS at 50°C. Annotations: Subthreshold Region, 320mV, 9.6X.]
H. Kaul et al, 16.6: ISSCC08

Revise DRAM Architecture
Energy cost today: ~150 pJ/bit
• Traditional DRAM: activates many pages; lots of reads and writes (refresh); small amount of read data is used; requires small number of pins
• New DRAM architecture: activates few pages; read and write (refresh) what is needed; all read data is used; requires large number of I/Os (3D)
[Diagram: RAS Addr and CAS Addr selecting among pages]

Data Locality
• Core-to-core communication on the chip: ~10 pJ per Byte
• Chip-to-chip communication: ~100 pJ per Byte
• Chip-to-memory communication: ~1.5 nJ per Byte, ~150 pJ per Byte
Data movement is expensive, so keep it local: (1) Core to core, (2) Chip-to-chip, (3) Memory
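To make the locality argument concrete, the sketch below turns the per-byte figures into the energy needed to move a fixed amount of data at each level. It assumes that ~1.5 nJ per byte is today's chip-to-memory cost and ~150 pJ per byte is the improved figure; the slide lists both without saying so explicitly. The dictionary keys and function name are illustrative.

```python
# Energy per byte moved, in picojoules, from the slide.
PJ_PER_BYTE = {
    "core_to_core": 10.0,
    "chip_to_chip": 100.0,
    "chip_to_memory": 150.0,        # assumed to be the improved figure
    "chip_to_memory_today": 1500.0, # assumed to be today's figure (~1.5 nJ)
}


def movement_energy_j(bytes_moved, level):
    """Energy in joules to move bytes_moved bytes at the given level."""
    return bytes_moved * PJ_PER_BYTE[level] * 1e-12


if __name__ == "__main__":
    one_gb = 1e9
    for level in PJ_PER_BYTE:
        print(f"{level:22s} {movement_energy_j(one_gb, level):.3f} J per GB")
```

With these numbers, the same gigabyte costs 15 times more to fetch from memory than to pass between neighbouring cores, and 150 times more at today's memory cost.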

Disruptive Approach to Faults • We tend to assume that execution faults (soft errors, hard errors) are rare. And it’s a valid speculation. Currently. • Soon, we will need much more paranoia in hardware designs.

Road to Unreliability? Resiliency will be the cornerstone

Resiliency
Minimal overhead for resiliency
• Functions: Error detection, Fault isolation, Fault confinement, Reconfiguration, Recovery & Adapt
• Layers: Applications, System Software, Programming system, Microcode, Platform, Microarchitecture, Circuit & Design

Execution Model and Codelets
• Codelet - Code that can be executed non-preemptively with an “event-driven” model
• Shared memory model based on LC (Location Consistency – a generalized single-assignment model [GaoSarkar1980])
[Diagram: layered stack with Programming Models/Systems (Rich), Sea of Codelets, Run Time System, Hardware Abstraction, and Cores, Advanced Hardware Monitoring, Net, Peripherals/Devices]
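The codelet idea on this slide can be illustrated with a toy, single-threaded scheduler. This is only a sketch under stated assumptions, not the Runnemede runtime: the `Codelet` and `Runtime` classes are hypothetical, a codelet becomes ready when all of its input events have arrived, and it then runs to completion without preemption. Each input slot may be written once, a loose nod to single assignment; Location Consistency itself is not modeled.

```python
from collections import deque


class Codelet:
    """A unit of work that fires once all of its named inputs have arrived."""

    def __init__(self, name, fn, input_names):
        self.name = name
        self.fn = fn
        self.waiting_for = set(input_names)
        self.inputs = {}
        self.successors = []  # list of (codelet, input_name) pairs


class Runtime:
    """Toy event-driven scheduler: no threads, no preemption."""

    def __init__(self):
        self.ready = deque()
        self.log = []

    def signal(self, codelet, input_name, value):
        """Deliver one input event; enqueue the codelet when it is satisfied."""
        if input_name in codelet.inputs:
            raise ValueError(f"{codelet.name}.{input_name} already written")
        codelet.inputs[input_name] = value
        codelet.waiting_for.discard(input_name)
        if not codelet.waiting_for:
            self.ready.append(codelet)

    def run(self):
        """Run ready codelets to completion, forwarding results as events."""
        while self.ready:
            codelet = self.ready.popleft()
            result = codelet.fn(**codelet.inputs)
            self.log.append((codelet.name, result))
            for successor, input_name in codelet.successors:
                self.signal(successor, input_name, result)


# Usage: "add" fires once both a and b arrive, then triggers "square".
add = Codelet("add", lambda a, b: a + b, ["a", "b"])
square = Codelet("square", lambda x: x * x, ["x"])
add.successors.append((square, "x"))

rt = Runtime()
rt.signal(add, "a", 3)
rt.signal(add, "b", 4)
rt.run()
print(rt.log)  # [('add', 7), ('square', 49)]
```

Because a codelet never blocks once started, the scheduler needs no context switching; all waiting happens before a codelet is enqueued.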

Summary • Voltage scaling to reduce power and energy • Explodes parallelism • Cost of communication vs computation—critical balance • Resiliency to combat side-effects and unreliability • Programming system for extreme parallelism • Application driven, HW/SW co-design approach • Self-awareness & execution model to harmonize
