
Jose Miguel Montanana (NII, Japan) Michihiro Koibuchi (NII, Japan )

Stabilizing Path Modification of Power-Aware On/Off Interconnection Networks. Jose Miguel Montanana (NII, Japan), Michihiro Koibuchi (NII, Japan), Hiroki Matsutani (U of Tokyo, Japan), Hideharu Amano (Keio U/NII, Japan). HPC networks (InfiniBand, GbE). On/Off link activation method.


Presentation Transcript


  1. Stabilizing Path Modification of Power-Aware On/Off Interconnection Networks
     Jose Miguel Montanana (NII, Japan), Michihiro Koibuchi (NII, Japan), Hiroki Matsutani (U of Tokyo, Japan), Hideharu Amano (Keio U/NII, Japan)

  2. Outline
     • HPC networks (InfiniBand, GbE)
     • On/Off link activation method
       - Reducing power consumption of HPC networks
       - Paths are updated to avoid deactivated links
       - Applying network reconfiguration to switches
     • Evaluations
       - Cycle-accurate network simulator
       - Behavior of network during the path change

  3. Network of High-performance computing
     [Chart: number of supercomputers on the Top500 list and percentage on the Top500 list; percentage axis from 0% to 60%]

  4. Examples RoadRunner (LANL) BlueGene/L (LLNL) TACC (Univ Texas) Proprietary 251,904 cores 5th on Top500 IBA 212,992 processors 2nd on Top500 list 122,400 cores 1st on Top500 IBA Virginia Tech's X ABE (NCSA) ASCI-Q (LANL) IBA IBA 2,200 cores 280th on Top500 9,600 cores 23rd on Top500 Quadrics 8,192 cores 2008

  5. HPC Networks
     • Small switches (24/48-port) provide the lowest cost per port
     • When 100,000 cores are connected, a large number of small switches are needed, drastically increasing the number of links
     • Unused and rarely used links should be deactivated for power-aware HPCs
     [Figure: nodes labeled 0 to 15 connected by TREE 1 to TREE 4 (4 paths); link aggregation using 3 links; legend: switch, host]

  6. Power consumption of HPC switches
     • Power consumption is almost constant regardless of traffic load
     • The number of activated ports dominates the power consumption of switches
     • The power consumption of a port is reduced to zero by the port-shutdown operation
     [Chart: switch power in W for GbE and IB]

  7. Outline
     • HPC networks (InfiniBand, GbE)
     • On/Off link activation method
       - Reducing power consumption of HPC networks
       - Paths are updated to avoid deactivated links
       - Applying network reconfiguration to switches
     • Evaluations
       - Cycle-accurate network simulator
       - Behavior of network during the path change

  8. Overview of the on/off link method
     • Switch ports consume 40-60% of the total power of a switch
     • Network load is not always high (e.g. during computation time)
     • When the traffic load becomes low, a part of the links is turned off
     [Figure: TREE 1 to TREE 4 over nodes 0 to 15, shown before and after turning off a part of the links; legend: switch, host]

  9. A runtime on/off link method
     • Traffic monitoring: a very crucial factor (e.g. port monitor, IPTraf, pilot execution)
     • Do low- or high-load links appear? If no, keep monitoring; if yes, continue
     • Selection of on/off links and paths
     • Update of link status and paths
     • Example: low traffic load is detected; paths before and after (the "before" path is deactivated)
     • How is the network stabilized during the path update?
     [Figure: TREE 1 to TREE 4 over nodes 0 to 15]
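
The monitor, select, update loop on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Link` class, the two thresholds, and the "first candidate" selection policy are hypothetical, since the slides give no threshold values or selection rule.

```python
from dataclasses import dataclass

# Hypothetical thresholds (fraction of link capacity); not from the slides.
LOW_LOAD = 0.1
HIGH_LOAD = 0.8

@dataclass
class Link:
    name: str
    active: bool
    utilization: float  # measured load, 0.0 to 1.0

def select_changes(links):
    """Selection step: return (names to turn off, names to turn on)."""
    active = [l for l in links if l.active]
    inactive = [l for l in links if not l.active]
    # A high-load link appears: re-activate one sleeping link, if any.
    if inactive and any(l.utilization > HIGH_LOAD for l in active):
        return [], [inactive[0].name]
    # A low-load link appears: turn it off, but never the last active link.
    low = [l for l in active if l.utilization < LOW_LOAD]
    if low and len(active) > 1:
        return [low[0].name], []
    return [], []

def step(links):
    """One pass of the loop: select, then update link status.
    Paths would have to be recomputed here to avoid the links turned off."""
    off, on = select_changes(links)
    for l in links:
        if l.name in off:
            l.active, l.utilization = False, 0.0
        elif l.name in on:
            l.active = True
    return off, on

links = [Link("a", True, 0.05), Link("b", True, 0.30)]
first = step(links)          # low load on "a": it is turned off
links[1].utilization = 0.90  # traffic rises on the remaining link
second = step(links)         # high load on "b": "a" is turned back on
```

The sketch leaves out the part the slide asks about, namely how the network stays stable while the paths change.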

  10. Stabilizing network during the path update: Network Reconfiguration (deadlock avoidance)
     • Rold = routing table before the update
     • Rnew = routing table after the update
     • Rold is deadlock free
     • Rnew is deadlock free
     • Rold + Rnew may deadlock
     [Figure: switches 0 to 6 and their links under Rold and under Rnew, with the NW reconfiguration between them; legend: switch, link]

  11. Network Reconfiguration
     • Rold is deadlock free
     • Rnew is deadlock free
     • Rold + Rnew may cause deadlock
     • Deadlock: old behind new, new behind old
     [Figure: switches 0 to 6 under Rold and under Rnew, with the reconfiguration between them]
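
A small worked example may help show why two deadlock-free routing tables can deadlock when mixed. The sketch below assumes the usual channel-dependency-graph criterion (an acyclic graph is sufficient for deadlock freedom under deterministic routing); the slides do not name the criterion, and the channel names `c0` to `c3` and both dependency sets are made up for illustration.

```python
def has_cycle(deps):
    """Return True if the directed graph given as (src, dst) pairs has a cycle."""
    graph = {}
    for src, dst in deps:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:  # back edge: a cycle exists
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)

# Hypothetical dependencies: a packet holding the first channel may wait for the second.
r_old = {("c0", "c1"), ("c1", "c2")}
r_new = {("c2", "c3"), ("c3", "c0")}

old_ok = not has_cycle(r_old)            # Rold alone is acyclic
new_ok = not has_cycle(r_new)            # Rnew alone is acyclic
mixed_cycle = has_cycle(r_old | r_new)   # c0 -> c1 -> c2 -> c3 -> c0
```

In this toy case the cycle only closes when old packets wait behind new ones and new packets wait behind old ones, which is the situation the slide labels "old behind new, new behind old".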

  12. Existing NW reconfiguration techniques on fault-tolerant networks
     • Static reconfiguration
       - Traffic is stopped
       - New routing is applied
       - Traffic is resumed
     • Dynamic reconfiguration
       - Traffic is not stopped
       - Old and new routing coexist
       - Difficulty to avoid deadlock
     • High latencies
     • Techniques: Static Reconfiguration (ST), Double Scheme, Simple Reconfiguration

  13. Current NW Reconfigurations
     • SR-PDA: Simple Reconfiguration, Packet Dropping Aware [Lysne08, IEEE TC]
       - Tokens are sent before the update of routing
       - Packets are sent after updating the routing tables
     • SR-LA: Simple Reconfiguration, Latency Aware [Lysne08, IEEE TC]
       - All new tables are distributed before the new ones are used
       - Latency due to the tokens is reduced
     • DS: Double Scheme [Pinkston03, TPDS]
       - Requires 2 virtual channels
       - One channel has to be drained
     • ST: Static Reconfiguration
       - Traffic injection is completely stopped
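
The two Double Scheme bullets (two virtual channels, one of which is drained) can be read as keeping old and new routing on separate virtual channels so that their dependencies never combine. The sketch below illustrates that reading only; it is a simplification under assumptions, not the published algorithm, and the channel names, dependency sets, and the helper `on_vc` are hypothetical. It uses `graphlib` from the Python standard library (3.9 or later).

```python
from graphlib import TopologicalSorter, CycleError

def is_acyclic(deps):
    """True if the (src, dst) dependency pairs form no directed cycle."""
    sorter = TopologicalSorter()
    for src, dst in deps:
        sorter.add(dst, src)  # record src as a predecessor of dst
    try:
        sorter.prepare()      # raises CycleError if a cycle exists
        return True
    except CycleError:
        return False

def on_vc(deps, vc):
    """Tag every channel in a dependency set with a virtual channel index."""
    return {((a, vc), (b, vc)) for a, b in deps}

# Hypothetical dependencies of the old and the new routing tables.
r_old = {("c0", "c1"), ("c1", "c2")}
r_new = {("c2", "c3"), ("c3", "c0")}

# Old and new packets sharing the same channels: the dependencies close a cycle.
shared_ok = is_acyclic(r_old | r_new)

# Old packets stay on VC 0 until it drains, new packets use VC 1:
# no dependency crosses between the two sets, so no cycle can form.
separated_ok = is_acyclic(on_vc(r_old, 0) | on_vc(r_new, 1))
```

Under this reading, draining matters because the old virtual channel can only be reused once no old packet still holds it.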

  14. Outline
     • HPC Interconnects (InfiniBand, GbE)
     • On/Off link activation method
       - Reducing power consumption of HPC networks
       - Paths are updated to avoid deactivated links
       - Applying network reconfiguration to switches
     • Evaluations
       - Cycle-accurate network simulator
       - Behavior of network during the path change

  15. Simulation Environment
     • Switch model (InfiniBand)
       - Buffered input (1 KB per VL) and output (1 KB per VL) ports
       - Non-multiplexed crossbar with separate ports per VL
       - FIFO-based crossbar arbiter per output crossbar port
       - Round-robin arbiter per output port
       - 100 ns routing time
     • Link model
       - Link speed = 2.5 Gbps (1X links)
     • Topologies
       - 2D mesh networks
     • Traffic model
       - Packet lengths are 58 bytes
       - Uniform
       - Full range of traffic, from low load to saturation
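
To get a feel for these parameters, the following sketch computes the time a 58-byte packet occupies a link and how many such packets fit in a 1 KB buffer. It rests on assumptions the slide does not settle: whether the simulator treats 2.5 Gbps as the data rate or as the signaling rate (InfiniBand 1X SDR signals at 2.5 Gbps with 8b/10b encoding, leaving 2 Gbps for data), and that 1 KB means 1024 bytes. The function name is hypothetical.

```python
LINK_RATE_BPS = 2.5e9   # from the slide: 2.5 Gbps (1X links)
PACKET_BYTES = 58       # from the slide
BUFFER_BYTES = 1024     # assumption: 1 KB per VL taken as 1024 bytes
ROUTING_TIME_NS = 100   # from the slide

def serialization_ns(packet_bytes, rate_bps, efficiency=1.0):
    """Time in ns to put one packet on the wire.
    efficiency is the fraction of the rate that carries data."""
    return packet_bytes * 8 / (rate_bps * efficiency) * 1e9

# Case 1: 2.5 Gbps is taken as the data rate.
raw_ns = serialization_ns(PACKET_BYTES, LINK_RATE_BPS)

# Case 2: 2.5 Gbps is the signaling rate with 8b/10b encoding (80% efficiency).
encoded_ns = serialization_ns(PACKET_BYTES, LINK_RATE_BPS, efficiency=0.8)

# Whole packets that fit in one virtual-lane buffer.
packets_per_buffer = BUFFER_BYTES // PACKET_BYTES

print(f"{raw_ns:.1f} ns, {encoded_ns:.1f} ns, {packets_per_buffer} packets")
```

In both cases the serialization time is of the same order as the 100 ns routing time, so neither one dominates the per-hop delay for packets this small.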

  16. Evaluation Results
     We apply the NW reconfiguration process twice in each execution:
     • Deactivating links, after the traffic injection decreases
     • Re-activating links, after the traffic injection increases
     We evaluated the full range of initial traffic injection (from low traffic to near congestion)

  17. Static Reconfiguration (ST)
     [Plots: (a) Low Traffic Load, (b) High Traffic Load. Annotations: traffic load decreases, a link is deactivated, latency is high; traffic load increases, a link is reactivated, latency is high]
     At each on/off link operation, traffic is not stabilized in ST!!

  18. SR-LA (dynamic reconfiguration)
     [Plots: (a) Low Traffic Load, (b) High Traffic Load]
     Also, at each on/off link operation, traffic is not stabilized in SR-LA!!

  19. SR-PDA (dynamic reconfiguration)
     [Plots: (a) Low Traffic Load, (b) High Traffic Load]
     Also, at each on/off link operation, traffic is not stabilized in SR-PDA!!

  20. Double Scheme (dynamic reconfiguration)
     [Plots: (a) Low Traffic Load, (b) High Traffic Load. Annotations: traffic load decreases, traffic load increases, latency is constant in both panels]
     The path update is stabilized only in Double Scheme!!

  21. Larger Network (8x8 Mesh)
     [Plots: ST, SRL, DS]
     Similar behavior!! Only Double Scheme stabilizes networks during the path update!!

  22. Conclusions
     • We apply network reconfiguration techniques to power-aware on/off networks for HPC
     • Links consume ~63% of switch power
     • On/off link activation reduces power
     • It must accept the topology change
     • Network reconfiguration smoothly supports the path update
     • Stabilizing the update of new/old paths
     • Avoiding deadlocks of new/old paths
     • Cycle-accurate simulation shows its impact on the power-aware on/off networks
     • Double Scheme (dynamic NW reconfiguration) maintains performance, stabilizes networks, and avoids deadlock
     • Network reconfiguration is essential for realizing the power-aware on/off networks for HPC systems

  23. Acknowledgment This work was partially supported by JST CREST (ULP-HPC: Ultra Low-Power, High-Performance Computing via Modelling and Optimization of Next Generation HPC Technologies)
