SNIC/KTH Proposal
• Objective
  • Improved energy efficiency over common IA32-based nodes by a factor of at least 5
  • High compute density, possibly 10 times that of an IA32-based system in single precision (SP)
  • Modest increase in programming complexity
  • High-volume component technologies
• Technology
  • Embedded processor technology
    • ARM Cortex-A9 4-core CPU (0.8 – 2 GHz, 0.4 – 2 W)
    • TI DSP (designed in Nice)
  • Hybrid programming: OpenMP + MPI
• Industry Partners (tentative)
  • TI (4th largest IC company by revenue in 2009, after Intel, Samsung and Toshiba)
  • Supermicro
  • Smooth Stone (start-up with funding from ARM), targeting energy-efficient servers for Internet and web applications; CPU chip by TI
High Performance Compute Node
• HPC Compute Node Performance
  • 1024 GMAC (1 TMAC) (MAC = Multiply-Accumulate, 32-bit)
  • 512 GFLOPS single precision (SP) @ 1 GHz (= 614.4 GF SP @ 1.2 GHz)
  • Support for double precision floating point announced
  • Approximately 50 to 60 W
  • DDR3 (number of DIMMs not yet fixed)
• Interconnect
  • DSP to DSP: SRIO
  • CPU to DSP: PCIe Gen2 x2 (5 GT/s per lane)
  • Node to Node: 10G Ethernet
• Board
  • 4 – 8 nodes
  • 2.5 – 5 TF SP per board/blade
  • 3.5 – 7 TF/U SP
  • ~ 400 – 800 W/U
• Programming Model
  • MPI across compute nodes
  • OpenMP within a node
  • DSPs and ARM processor both programmed in a high-level language
  • OpenMP-style directives define accelerated regions that are executed on the DSPs

[Figure: HPC Compute Node block diagram. Four Texas Instruments 8-core DSPs @ 1.2 GHz with acceleration memory (DDR3-1333); PCIe/SRIO/Ethernet connectivity; CPU (ARM/x86) with external memory; 10G Ethernet.]
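The node and board figures above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the constants are the slide's numbers, the count of four DSPs per node is read off the block diagram (an assumption), and the function names are hypothetical. The GFLOPS-per-watt estimate further assumes that the 50 to 60 W figure covers the whole node, which the slide does not state explicitly.

```python
# Figures quoted on the slide.
NODE_GFLOPS_SP_AT_1GHZ = 512.0   # single-precision GFLOPS per node at 1 GHz
NODE_GMAC_AT_1GHZ = 1024.0       # 32-bit multiply-accumulates per node at 1 GHz
DSP_CLOCK_GHZ = 1.2              # DSP clock shown in the block diagram
NODE_POWER_W = (50.0, 60.0)      # approximate node power range
NODES_PER_BOARD = (4, 8)         # nodes per board/blade

# Assumption: four 8-core DSPs per node, as drawn in the block diagram.
DSPS_PER_NODE = 4
CORES_PER_DSP = 8


def node_gflops_sp(clock_ghz: float) -> float:
    """Peak SP GFLOPS of one node, scaled linearly with DSP clock."""
    return NODE_GFLOPS_SP_AT_1GHZ * clock_ghz


def board_tflops_sp(nodes: int, clock_ghz: float) -> float:
    """Peak SP TFLOPS of a board carrying `nodes` compute nodes."""
    return nodes * node_gflops_sp(clock_ghz) / 1000.0


def per_core_per_cycle(node_rate_at_1ghz: float) -> float:
    """Operations per DSP core per cycle implied by a per-node rate at 1 GHz."""
    return node_rate_at_1ghz / (DSPS_PER_NODE * CORES_PER_DSP)


def node_gflops_per_watt(power_w: float, clock_ghz: float) -> float:
    """Peak SP GFLOPS per watt, assuming power_w is the whole-node power."""
    return node_gflops_sp(clock_ghz) / power_w


if __name__ == "__main__":
    print(f"Node peak: {node_gflops_sp(DSP_CLOCK_GHZ):.1f} GF SP")
    for n in NODES_PER_BOARD:
        print(f"Board with {n} nodes: {board_tflops_sp(n, DSP_CLOCK_GHZ):.2f} TF SP")
    print(f"SP flops per core per cycle: {per_core_per_cycle(NODE_GFLOPS_SP_AT_1GHZ):.0f}")
    print(f"MACs per core per cycle: {per_core_per_cycle(NODE_GMAC_AT_1GHZ):.0f}")
    for p in NODE_POWER_W:
        print(f"At {p:.0f} W: {node_gflops_per_watt(p, DSP_CLOCK_GHZ):.1f} GF/W")
```

Under these assumptions, 4 and 8 nodes per board give roughly 2.46 and 4.92 TF SP, consistent with the rounded 2.5 – 5 TF per board quoted on the slide.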