Stabilizing Path Modification of Power-Aware On/Off Interconnection Networks
Jose Miguel Montanana (NII, Japan), Michihiro Koibuchi (NII, Japan), Hiroki Matsutani (U of Tokyo, Japan), Hideharu Amano (Keio U / NII, Japan)
Outline
• HPC networks (InfiniBand, GbE)
• On/off link activation method
  • Reducing power consumption of HPC networks
  • Paths are updated to avoid deactivated links
• Applying network reconfiguration to switches
• Evaluations
  • Cycle-accurate network simulator
  • Behavior of the network during the path change
Networks of high-performance computing
[Figure: number of supercomputers on the Top500 list; vertical axis: percentage on the Top500 list, 0% to 60%]
Examples (2008)
• RoadRunner (LANL): IBA, 122,400 cores, 1st on Top500
• BlueGene/L (LLNL): proprietary, 212,992 processors, 2nd on Top500
• TACC (Univ. Texas): IBA, 251,904 cores, 5th on Top500
• Virginia Tech's X: IBA, 2,200 cores, 280th on Top500
• ABE (NCSA): IBA, 9,600 cores, 23rd on Top500
• ASCI-Q (LANL): Quadrics, 8,192 cores
HPC Networks
[Figure: switches 0 to 15 and hosts connected by four trees (TREE 1 to TREE 4), giving 4 paths; link aggregation using 3 links]
• Small switches (24/48-port) provide the lowest cost per port
• When 100,000 cores are connected, a large number of small switches is needed, drastically increasing the number of links
• Unused and rarely used links should be deactivated for power-aware HPCs
Power consumption of HPC switches
[Figure: power consumption of GbE and IB switches; unit: W]
• Power consumption is almost constant regardless of traffic load
• The number of activated ports dominates the power consumption of a switch
• The power consumption of a port is reduced to zero by the port-shutdown operation
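The observations on this slide suggest a simple linear model in which switch power depends on the number of activated ports and not on the traffic load. The sketch below is only an illustration under that assumption; the function name and the wattage figures are hypothetical and are not measurements from the talk.

```python
def switch_power_w(base_w, per_port_w, active_ports):
    """Estimate switch power as a fixed base plus a cost per activated port.

    Assumption (not from the slides): power is linear in the port count,
    and a port that is shut down consumes nothing.
    """
    if active_ports < 0:
        raise ValueError("active_ports must be non-negative")
    return base_w + per_port_w * active_ports


# Hypothetical 24-port switch: 40 W base, 2.5 W per activated port.
all_on = switch_power_w(40.0, 2.5, 24)    # every port activated
half_on = switch_power_w(40.0, 2.5, 12)   # half of the ports shut down
saving = 1.0 - half_on / all_on           # fraction of power saved
```

With these made-up numbers, shutting down half of the ports saves 30% of the switch power, independent of how much traffic the remaining ports carry.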
Overview of the on/off link method
• Switch ports consume 40-60% of the total power of a switch
• Network load is not always high (e.g. during computation time)
[Figure: the same network of switches and hosts (TREE 1 to TREE 4) before and after the traffic load becomes low and a part of the links is turned off]
A runtime on/off link method
• Traffic monitoring (e.g. port monitor, IPTraf, pilot execution)
• Do low- or high-load links appear? If no, continue monitoring; if yes, go on
• Selection of on/off links and paths
• Update of link status and paths (very crucial factor)
[Figure: network of switches 0 to 15 with TREE 1 to TREE 4. Low traffic load is detected; paths before and after: the "before" path is deactivated]
How is the network stabilized during the path update?
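One monitoring period of the loop above can be sketched as follows. This is a minimal illustration, not the authors' selection algorithm: the function name, the utilization thresholds, and the "one reactivated link per congested link" policy are all assumptions, and a real implementation would also have to check that the network stays connected.

```python
def plan_link_changes(utilization, inactive, low=0.10, high=0.70):
    """Decide which links to switch off or on after one monitoring period.

    utilization: dict mapping each active link to its load (0.0 to 1.0)
    inactive:    set of links that are currently switched off
    low, high:   hypothetical thresholds for "low-load" and "high-load"
    Returns (links_to_turn_off, links_to_turn_on).
    """
    idle = sorted(link for link, load in utilization.items() if load < low)
    hot = [link for link, load in utilization.items() if load > high]
    if hot:
        # Congestion takes priority: reactivate links, turn nothing off.
        return [], sorted(inactive)[:len(hot)]
    return idle, []


# Low load: link "a" is nearly idle and becomes a candidate for shutdown.
off, on = plan_link_changes({"a": 0.05, "b": 0.30}, inactive=set())
```

Whatever this step returns, the paths still have to be updated so that no route uses a deactivated link, which is the reconfiguration problem the following slides address.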
Stabilizing the network during the path update: network reconfiguration (deadlock avoidance)
[Figure: a network of switches 0 to 6 and links, shown once with routing Rold and once with routing Rnew]
• Rold = routing table before the update
• Rnew = routing table after the update
• Rold is deadlock free
• Rnew is deadlock free
• Rold + Rnew may deadlock
Network Reconfiguration
[Figure: the network of switches 0 to 6 under Rold and, after reconfiguration, under Rnew]
• Rold is deadlock free
• Rnew is deadlock free
• Rold + Rnew may cause deadlock: old behind new, new behind old
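Why two deadlock-free routing tables can deadlock together can be illustrated with channel dependency graphs: a routing function is commonly shown to be deadlock free by checking that its graph has no cycle. The sketch below uses a made-up four-channel example (the channel names `c0` to `c3` and both dependency sets are hypothetical, not taken from the figure). Merging the two graphs is a simplification, since mixed old/new traffic can add further dependencies, but it is enough to show the effect.

```python
def has_cycle(deps):
    """Return True if the dependency graph contains a cycle.

    deps: dict mapping a channel to the set of channels it may wait on.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    nodes = set(deps) | {d for targets in deps.values() for d in targets}
    color = {n: WHITE for n in nodes}

    def visit(n):
        color[n] = GRAY
        for m in deps.get(n, ()):
            if color[m] == GRAY:          # back edge closes a cycle
                return True
            if color[m] == WHITE and visit(m):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in nodes)


def merge(a, b):
    """Union of two dependency graphs (old and new routing coexisting)."""
    out = {}
    for graph in (a, b):
        for channel, targets in graph.items():
            out.setdefault(channel, set()).update(targets)
    return out


r_old = {"c0": {"c1"}, "c1": {"c2"}}   # hypothetical dependencies of Rold
r_new = {"c2": {"c3"}, "c3": {"c0"}}   # hypothetical dependencies of Rnew
mixed = merge(r_old, r_new)            # c0 -> c1 -> c2 -> c3 -> c0
```

Each graph is acyclic on its own, but the merged graph contains the cycle c0, c1, c2, c3, c0, which corresponds to the "old behind new, new behind old" situation.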
Existing network reconfiguration techniques on fault-tolerant networks
• Static reconfiguration, e.g. Static Reconfiguration (ST)
  • Traffic is stopped
  • New routing is applied
  • Traffic is resumed
  • High latencies
• Dynamic reconfiguration, e.g. Double Scheme, Simple Reconfiguration
  • Traffic is not stopped
  • Old and new routing coexist
  • Difficult to avoid deadlock
Current NW Reconfigurations
• SR-PDA: Simple Reconfiguration, Packet Dropping Aware [Lysne08, IEEE TC]
  • Tokens are sent before the routing is updated
  • Packets are sent after the routing tables are updated
• SR-LA: Simple Reconfiguration, Latency Aware [Lysne08, IEEE TC]
  • All new tables are distributed before the new one is used
  • Latency due to the tokens is reduced
• DS: Double Scheme [Pinkston03, TPDS]
  • Requires 2 virtual channels
  • One channel has to be drained
• ST: Static Reconfiguration
  • Traffic injection is completely stopped
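The two Double Scheme requirements listed above (two virtual channels, one of which must be drained) can be pictured with a small bookkeeping sketch. This is a loose illustration of the idea only, not the protocol from [Pinkston03, TPDS]; the class and method names are hypothetical, and a single counter per virtual channel stands in for the real per-switch state.

```python
from dataclasses import dataclass, field


@dataclass
class TwoChannelUpdate:
    """Track which virtual channel (VC) carries old and new routing."""
    active_vc: int = 0                                  # VC for new packets
    in_flight: list = field(default_factory=lambda: [0, 0])

    def inject(self):
        """Inject a packet on the active VC and return that VC."""
        self.in_flight[self.active_vc] += 1
        return self.active_vc

    def deliver(self, vc):
        """Record that a packet travelling on `vc` reached its host."""
        if self.in_flight[vc] == 0:
            raise ValueError("no packet in flight on this VC")
        self.in_flight[vc] -= 1

    def start_update(self):
        """Route new packets with the new table on the spare VC."""
        spare = 1 - self.active_vc
        if self.in_flight[spare] != 0:
            raise RuntimeError("spare VC must be drained first")
        self.active_vc = spare

    def update_finished(self):
        """The update ends once the VC with old-routed packets is empty."""
        return self.in_flight[1 - self.active_vc] == 0


state = TwoChannelUpdate()
old_vc = state.inject()        # packet routed with the old table on VC 0
state.start_update()           # injection continues, now on VC 1
new_vc = state.inject()
pending = state.update_finished()   # old packet still in the network
state.deliver(old_vc)
done = state.update_finished()
```

Because old-routed and new-routed packets never share a channel in this picture, injection does not have to stop while the old channel drains.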
Simulation Environment
• Switch model (InfiniBand)
  • Buffered input (1 KB per VL) and output (1 KB per VL) ports
  • Non-multiplexed crossbar with separate ports per VL
  • FIFO-based crossbar arbiter per output crossbar port
  • Round-robin arbiter per output port
  • 100 ns routing time
• Link model
  • Link speed = 2.5 Gbps (1X links)
• Topologies
  • 2D mesh networks
• Traffic model
  • Packet lengths are 58 bytes
  • Uniform
  • Full range of traffic, from low load to saturation
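To get a feel for the time scale of these parameters, the sketch below computes how long one 58-byte packet occupies a 2.5 Gbps link and adds the 100 ns routing time. It is back-of-the-envelope arithmetic, not part of the simulator: it assumes the full 2.5 Gbps is available to packet bits (encoding overhead is ignored), and the function names are hypothetical.

```python
def serialization_ns(packet_bytes, link_gbps):
    """Time to put one packet on the wire, in nanoseconds.

    Bits divided by Gbit/s gives nanoseconds directly.
    """
    return packet_bytes * 8 / link_gbps


def routing_plus_serialization_ns(packet_bytes, link_gbps, routing_ns):
    """Routing time plus serialization time at one switch (no queueing)."""
    return routing_ns + serialization_ns(packet_bytes, link_gbps)


wire_ns = serialization_ns(58, 2.5)                      # 464 bits at 2.5 Gbps
hop_ns = routing_plus_serialization_ns(58, 2.5, 100.0)   # adds 100 ns routing
```

Under these assumptions a packet takes about 185.6 ns to serialize, so routing and serialization together cost roughly 285.6 ns at a switch before any queueing or reconfiguration delay.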
Evaluation Results
We apply the network reconfiguration process twice in each execution:
• Deactivating links, after the traffic injection decreases
• Re-activating links, after the traffic injection increases
We evaluated the full range of initial traffic injection, from low traffic to near congestion.
Static Reconfiguration (ST)
[Figure: (a) low traffic load, (b) high traffic load. Annotations: traffic load decreases and a link is deactivated; traffic load increases and a link is reactivated; latency is high]
At each on/off link operation, traffic is not stabilized in ST.
SR-LA (dynamic reconfiguration)
[Figure: (a) low traffic load, (b) high traffic load]
At each on/off link operation, traffic is also not stabilized in SR-LA.
SR-PDA (dynamic reconfiguration)
[Figure: (a) low traffic load, (b) high traffic load]
At each on/off link operation, traffic is also not stabilized in SR-PDA.
Double Scheme (dynamic reconfiguration)
[Figure: (a) low traffic load, (b) high traffic load. Annotations: traffic load decreases; traffic load increases; latency is constant]
The path update is stabilized only in Double Scheme.
Larger Network (8x8 Mesh)
[Figure: panels ST, SRL, DS]
Similar behavior: only Double Scheme stabilizes the network during the path update.
Conclusions
• We apply network reconfiguration techniques to power-aware on/off networks for HPC
  • Links consume ~63% of switch power
  • On/off link activation reduces power
  • It must accept the topology change
• Network reconfiguration smoothly supports the path update
  • Stabilizing the update of new/old paths
  • Avoiding deadlocks of new/old paths
• Cycle-accurate simulation shows its impact on the power-aware on/off networks
  • Double Scheme (dynamic NW reconfiguration) maintains performance, stabilizes the network, and avoids deadlock
• Network reconfiguration is essential for realizing power-aware on/off networks for HPC systems
Acknowledgment
This work was partially supported by JST CREST (ULP-HPC: Ultra Low-Power, High-Performance Computing via Modelling and Optimization of Next Generation HPC Technologies).