Configuring a Large-Scale GALS System • M.M. Khan*, J. Navaridas†, L.A. Plana*, M. Luján*, J.V. Woods*, J. Miguel-Alonso† and S.B. Furber* • *School of Computer Science, The University of Manchester, UK • †University of The Basque Country, Spain
SpiNNaker • Objectives • High-performance • Robust • Low-power
SpiNNaker CMP • System RAM • Boot ROM • MC Router • Sys. Controller • Ethernet • SDRAM • 20 Proc. Nodes
Processing Node • ARM968E-S • Comm. Ctlr. • Interrupt Ctlr. • DMA Ctlr. • Timer • TCM (100K)
Communication Network • MC Router • Packets: MC, P2P, NN • 1 Gb/s inter-chip • 6 Gb/s per Node • Six two-way inter-chip links
*L.A. Plana et al. An On-Chip and Inter-Chip Communications Network for the Spinnaker Massively-Parallel Neural Net Simulator. In Proc. Second ACM/IEEE International Symposium on Networks-on-Chip (NoCS 2008), pages 215-216, 2008.
Performance • 64K CMPs • > 1M ARM968 • 256 tera IPS computing power • > 8 TB memory • 6 Gb/s/Node Comm. NoC (spike channel) • 1 Gb/s System NoC (synaptic channel) • 10^9 neurons in real-time
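The headline figures on the Performance slide can be cross-checked with a little arithmetic. The sketch below is only a back-of-envelope calculation: it assumes "64K" means 64 × 1024 chips, that all 20 processing nodes per chip (from the SpiNNaker CMP slide) count towards the totals, and that "8 TB" means 8 × 2^40 bytes. The per-core and per-chip values it derives are implied by the slide figures, not quoted from the slides.

```python
# Back-of-envelope check of the Performance slide.
# Assumptions (not stated on the slides): 64K = 64 * 1024, 8 TB = 8 * 2**40 bytes,
# and every one of the 20 cores per chip is counted.
CHIPS = 64 * 1024
CORES_PER_CHIP = 20
TOTAL_IPS = 256e12        # "256 tera IPS"
TOTAL_MEMORY = 8 * 2**40  # "> 8 TB", taken as a lower bound
NEURONS = 10**9

total_cores = CHIPS * CORES_PER_CHIP          # matches "> 1M ARM968"
ips_per_core = TOTAL_IPS / total_cores        # implied instruction rate per core
memory_per_chip = TOTAL_MEMORY // CHIPS       # implied memory per chip, in bytes
neurons_per_core = NEURONS / total_cores      # implied average neurons per core

print(f"cores:            {total_cores:,}")
print(f"MIPS per core:    {ips_per_core / 1e6:.1f}")
print(f"MiB per chip:     {memory_per_chip // 2**20}")
print(f"neurons per core: {neurons_per_core:.0f}")
```

The neurons-per-core figure is an average over all cores; it would be higher if some cores are reserved for monitoring rather than neural simulation.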
Fault-tolerance • Redundancy • Fault-detection and Isolation • Fault-recovery • Min. single-point-of-failure • Run-time configuration • Run-time recovery • Run-time application loading
Low-power • Hardware • Asynchronous Communication • Low-power ARM968 • Software • Asynchronous Event-Driven Model
Standard Application Model • Sleepy processors • Event-driven application • No scheduler • No software threads • Only ISRs • Driven by Interrupts
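A minimal sketch of the application model described on this slide, written as a toy Python simulation rather than ARM code. The class name `SleepyProcessor`, the event names `"timer"` and `"packet"`, and the queue standing in for the interrupt controller are all hypothetical; the real system would use hardware interrupts and a low-power wait state.

```python
from collections import deque


class SleepyProcessor:
    """Toy model of a core that only does work inside interrupt handlers."""

    def __init__(self):
        self.handlers = {}      # event name -> ISR-like callback
        self.pending = deque()  # stands in for the interrupt controller
        self.log = []

    def on(self, event, handler):
        """Register the handler (ISR) for one event type."""
        self.handlers[event] = handler

    def raise_interrupt(self, event, payload=None):
        """Queue an event, as a timer or incoming packet would."""
        self.pending.append((event, payload))

    def run(self):
        # Each handler runs to completion: no scheduler, no software threads.
        while self.pending:
            event, payload = self.pending.popleft()
            self.handlers[event](self, payload)
        # Nothing left to service, so the core goes back to sleep.
        self.log.append("sleep")


proc = SleepyProcessor()
proc.on("timer", lambda p, _: p.log.append("timer tick"))
proc.on("packet", lambda p, data: p.log.append(f"packet {data}"))
proc.raise_interrupt("packet", 42)
proc.raise_interrupt("timer")
proc.run()
print(proc.log)  # ['packet 42', 'timer tick', 'sleep']
```

The point of the design is that the processor consumes cycles only while a handler is running, which is what ties the event-driven software model to the low-power objective.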
Configuration Process-I • Min Boot-ROM code • POST + chip components initialization • Batch mode
[Flowchart boxes: POST; Load Boot code in TCM; Select Monitor Proc.; Configure Interrupts; Monitor? (yes/no); Configure Chip; Go to Sleep]
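The boot steps on the Configuration Process-I slide can be sketched as a small simulation. The slide does not say how the monitor processor is chosen or what happens to a core that fails POST, so the policy here (the lowest-numbered core that passes POST becomes monitor, failed cores are marked and skipped) is purely an assumption for illustration, as are the function and state names.

```python
def boot_chip(post_results):
    """Simulate one chip's boot.

    post_results: one bool per core, True if that core passed POST.
    Returns (monitor_id, roles) where roles[i] is the outcome for core i.
    """
    monitor = None
    roles = []
    for core_id, passed in enumerate(post_results):
        if not passed:
            roles.append("failed")  # assumed handling: isolate the core
            continue
        # Loading boot code into TCM and configuring interrupts are not modelled.
        if monitor is None:
            monitor = core_id       # assumed policy: first healthy core wins
            roles.append("monitor") # this core goes on to configure the chip
        else:
            roles.append("sleep")   # all other cores wait for events
    return monitor, roles


# Example: 20 cores, core 0 fails POST, so core 1 becomes the monitor.
monitor, roles = boot_chip([False] + [True] * 19)
print(monitor, roles[:3])  # 1 ['failed', 'monitor', 'sleep']
```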
Configuration Process-II • Event-driven Model • Real-time Configuration • Processors on Sleep
[Flowchart boxes: Recovery; Host System Comm.; Host Chip? (yes/no). Host-chip branch: Frame + Packet Comm.; Assign (0, 0); Conf. Router; Acc. Status to Host. Other-chip branch: Packet Comm.; Assign (x, y); Conf. Router; Status to Host Chip]
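One way the coordinate assignment in Configuration Process-II could work is sketched below: the host chip takes (0, 0) and every other chip derives its (x, y) from the neighbour that first reaches it. The slides do not describe the derivation, so this is an assumption, as are the six link offsets in `LINKS`, the wrap-around (torus) arrangement, and the function name.

```python
from collections import deque

# Assumed (dx, dy) offset for each of the six inter-chip links.
LINKS = [(1, 0), (1, 1), (0, 1), (-1, 0), (-1, -1), (0, -1)]


def assign_coordinates(width, height):
    """Propagate coordinates outwards from the host chip.

    Keys are physical positions, used only by the simulator to find
    neighbours; values are the coordinates each chip has been told.
    The host chip is assumed to sit at physical position (0, 0).
    """
    assigned = {(0, 0): (0, 0)}           # host chip: Assign (0, 0)
    frontier = deque([(0, 0)])
    while frontier:
        px, py = frontier.popleft()
        ax, ay = assigned[(px, py)]
        for dx, dy in LINKS:
            neighbour = ((px + dx) % width, (py + dy) % height)
            if neighbour not in assigned:
                # Other chips: Assign (x, y) = sender's coordinate + link offset.
                assigned[neighbour] = ((ax + dx) % width, (ay + dy) % height)
                frontier.append(neighbour)
    return assigned


coords = assign_coordinates(4, 3)
print(len(coords), coords[(3, 2)])
```

In this sketch every chip ends up with a coordinate equal to its physical position, which is the property a router configuration step would rely on.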
Flood-fill Mechanism • Event-driven model • Droplets of data block to origin chip(s) • A pipelined wave of data from origin(s) to other chips
[Animation panels: 1 Ethernet Connection; 2 Ethernet Connections. Animations from http://physics-animations.com/Physics/English/int_ref.htm#Wlb]
Flood-fill Mechanism • Various Mechs.: Broadcast, 5 Chips fwd, 3 Chips fwd, 2 Chips fwd • Performance vs. robustness
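The performance versus robustness trade-off between the flood-fill variants can be illustrated with a small simulation. This is a sketch under several assumptions not stated on the slides: chips form a wrap-around grid, each chip forwards a block once (on first receipt) over the first `fanout` links in `LINKS`, "Broadcast" is modelled as a fan-out of 6, and every hop takes one round. The link ordering and function name are hypothetical.

```python
from collections import deque

# Assumed link offsets; a fan-out of k uses the first k entries.
LINKS = [(1, 0), (0, 1), (1, 1), (-1, 0), (0, -1), (-1, -1)]


def flood_fill(width, height, fanout, origin=(0, 0)):
    """Flood one data block from the origin chip over a width x height torus."""
    links = LINKS[:fanout]
    arrival = {origin: 0}   # chip -> round in which it first received the block
    packets = 0             # total packets sent, including redundant copies
    frontier = deque([origin])
    while frontier:
        x, y = frontier.popleft()
        for dx, dy in links:
            neighbour = ((x + dx) % width, (y + dy) % height)
            packets += 1
            if neighbour not in arrival:
                arrival[neighbour] = arrival[(x, y)] + 1
                frontier.append(neighbour)
    return {
        "covered": len(arrival),
        "rounds": max(arrival.values()),
        "packets": packets,
    }


results = {k: flood_fill(8, 8, k) for k in (2, 3, 5, 6)}
for k, r in results.items():
    print(f"fanout {k}: {r}")
```

A larger fan-out sends more packets, so each chip receives more redundant copies and can tolerate more lost packets or dead links, while a smaller fan-out generates less traffic but takes at least as many rounds to reach every chip.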
Evaluation • SystemC system-level model • Cycle-accurate • Instruction-accurate • 129,706 cycles for Configuration Process-I