
A Deterministic Single-path Routing Scheme for 2-Level Generalized Fat-trees

Wickus Nienaber, Santosh Mahapatra, Xin Yuan. Department of Computer Science, Florida State University. Motivation: Fat-tree topologies are widely deployed in HPC environments.


Presentation Transcript


  1. A Deterministic Single-path Routing Scheme for 2-Level Generalized Fat-trees • Wickus Nienaber, Santosh Mahapatra, Xin Yuan • Department of Computer Science, Florida State University

  2. Motivation • Fat-tree topologies are widely deployed in HPC environments. • Deterministic single path routing is often used, in particular in InfiniBand networks. • In this work, we propose a new deterministic single path routing scheme for 2-level generalized fat-trees that optimizes for permutation communications.

  3. Generalized 2-level fat-tree topology • 2-level fat-trees can be characterized by three parameters • n: number of machines connected to each switch • m: number of upper level switches • r: number of lower level switches

  4. Generalized 2-level fat-tree topology • Cross-bisection bandwidth (CBB) ratio • Full bisection bandwidth fat-trees • Slimmed fat-trees • Fatted fat-trees
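
The three categories on this slide can be illustrated with a small sketch. It assumes, which the transcript does not state, that each lower level switch has n links down to machines and one link up to each of the m upper level switches, so that the CBB ratio is m/n. The function name `classify_fat_tree` is hypothetical.

```python
from fractions import Fraction

def classify_fat_tree(n, m):
    """Classify a 2-level fat-tree by its cross-bisection bandwidth ratio.

    Assumption (not from the slides): the CBB ratio is m/n, i.e. uplinks
    per lower level switch divided by machines per lower level switch.
    """
    ratio = Fraction(m, n)
    if ratio == 1:
        kind = "full bisection bandwidth"
    elif ratio < 1:
        kind = "slimmed"
    else:
        kind = "fatted"
    return ratio, kind

# Example: n = 4 machines per switch with 2, 4, and 8 upper level switches.
examples = {m: classify_fat_tree(4, m) for m in (2, 4, 8)}
```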

  5. Routing issues • How to route traffic to maximize performance for permutation patterns? • Permutation pattern: each input port can connect to an arbitrary output port. Each port can be used only once in the pattern. • Non-blocking interconnects provide non-blocking communication for any permutation pattern. • Single path deterministic routing: one path for each source-destination pair. • Fat-trees with deterministic routing are not nonblocking, although the topology may be nonblocking.

  6. Performance metrics • Worst-case permutation load (WORSTR): worst case maximum link load across all permutation patterns • Average case permutation load: average of maximum link load for all permutation patterns • Corresponding to Torsten’s effective bisection bandwidth • We develop a single path routing scheme that achieves the optimal worst-case permutation load. • This routing scheme also provides high average permutation performance in comparison to existing routing schemes.
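
The two metrics above can be computed exactly for a very small tree by enumerating every permutation. The sketch below is illustrative only. It assumes machine x is attached to lower level switch x // n, that traffic between machines on the same lower level switch never reaches an upper level switch, and that fixed points are allowed in a permutation. The tree T(2+2, 3) and the destination-mod-m routing function are chosen only to keep the enumeration short (720 permutations).

```python
from collections import Counter
from itertools import permutations

def max_link_load(dests, n, top_switch):
    """Maximum load over all up/down links for one permutation.

    dests[s] is the destination of machine s; top_switch(s, d) returns the
    upper level switch used by the single path from s to d.
    """
    load = Counter()
    for s, d in enumerate(dests):
        src_sw, dst_sw = s // n, d // n
        if src_sw == dst_sw:
            continue  # stays inside one lower level switch
        t = top_switch(s, d)
        load[("up", src_sw, t)] += 1    # lower switch -> upper switch
        load[("down", t, dst_sw)] += 1  # upper switch -> lower switch
    return max(load.values(), default=0)

n, m, r = 2, 2, 3                # T(2+2, 3): 6 machines
route = lambda s, d: d % m       # destination-mod-m, for illustration

loads = [max_link_load(p, n, route) for p in permutations(range(n * r))]
worst_case = max(loads)                   # worst-case permutation load
average_case = sum(loads) / len(loads)    # average-case permutation load
```

Exhaustive enumeration grows factorially with the number of machines, so it only works for toy sizes; larger trees need random sampling.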

  7. The lower bound of the worst-case permutation load • The worst-case permutation load of any single path routing scheme for any T(n+m, r) is at most n.

  8. Worst-case permutation load for existing routing schemes • Destination-mod-k (D-mod-k) and source-mod-k routing • In D-mod-k, traffic for (s, d) is routed through top level switch d mod m. • For many fat-trees, the worst-case permutation load is n. • Example: in T(4+4, 8), the pairs (0, 4), (1, 8), (2, 12), (3, 16) all go through top level switch 0.
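
The T(4+4, 8) example on this slide can be checked with a few lines. The sketch assumes machine x is attached to lower level switch x // n, which matches the numbering in the example; `dmodk_top_switch` is a hypothetical helper name.

```python
from collections import Counter

def dmodk_top_switch(d, m):
    """D-mod-k: traffic to destination d goes through top switch d mod m."""
    return d % m

n, m, r = 4, 4, 8
pairs = [(0, 4), (1, 8), (2, 12), (3, 16)]

# Count how many pairs share each uplink (lower switch, top switch).
uplink_load = Counter()
for s, d in pairs:
    uplink_load[(s // n, dmodk_top_switch(d, m))] += 1

worst_uplink = max(uplink_load.values())
```

All four sources sit on lower level switch 0 and all four destinations map to top level switch 0, so the single uplink between those two switches carries n = 4 flows.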

  9. A routing scheme with optimal worst-case permutation load • Basic idea • If a link carries traffic from X sources (or to X destinations), its load is at most X for any permutation pattern. • If each link carries traffic either from at most n/√m sources or to at most n/√m destinations, the maximum link load is at most n/√m. • With destination-mod-k, some links carry traffic from n sources to more than n destinations, e.g., (0, 100), (0, 200), (0, 300), … (1, 101), (1, 201), (1, 301), …

  10. A routing scheme with optimal worst-case permutation load • Algorithm OPT: • Partition the n ports in each lower level switch into √m groups. • Each group has n/√m members. • Communication from group i to group j goes through top switch i·√m + j. • OPT has a worst-case permutation load of n/√m. • Example: in T(4+4, 8) with two groups per switch (Group 0 and Group 1), (0, 4) and (1, 8) go through switch 0, while (2, 12) and (3, 16) go through switch 2.
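
A minimal sketch of the group-based routing described on this slide follows. The formulas did not survive in the transcript, so the sketch assumes floor(sqrt(m)) groups of consecutive ports per lower level switch and top switch index i * floor(sqrt(m)) + j for traffic from group i to group j. These choices are inferred from the slide's T(4+4, 8) example and are not confirmed by the source; `opt_top_switch` is a hypothetical name.

```python
import math
from collections import defaultdict

def opt_top_switch(s, d, n, m):
    """Top switch for (s, d) under the assumed group-based scheme."""
    k = math.isqrt(m)            # assumed number of groups per switch
    size = -(-n // k)            # ceil(n / k) ports per group
    gi = (s % n) // size         # group of the source port
    gj = (d % n) // size         # group of the destination port
    return gi * k + gj

n, m, r = 4, 4, 8

# Reproduce the slide's example pairs.
example = {p: opt_top_switch(*p, n, m)
           for p in [(0, 4), (1, 8), (2, 12), (3, 16)]}

# For every link, collect the distinct sources (uplinks) or distinct
# destinations (downlinks) whose traffic can ever cross it.
srcs, dsts = defaultdict(set), defaultdict(set)
for s in range(n * r):
    for d in range(n * r):
        if s // n == d // n:
            continue  # same lower level switch, no top switch involved
        t = opt_top_switch(s, d, n, m)
        srcs[(s // n, t)].add(s)
        dsts[(t, d // n)].add(d)

max_sources = max(len(v) for v in srcs.values())
max_dests = max(len(v) for v in dsts.values())
```

Because a permutation uses each source and each destination once, the number of distinct sources on an uplink (or distinct destinations on a downlink) bounds that link's load for every permutation pattern.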

  11. A routing scheme with optimal worst-case permutation load • What if √m is not an integer? • The algorithm will only use ⌊√m⌋² top level switches (e.g., for m = 8, only 4 top level switches will be used). • We use a heuristic to re-balance SD pairs to unloaded links to improve the average case performance.
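
The slide's example (m = 8, only 4 top level switches used) is consistent with the basic algorithm using floor(sqrt(m))^2 top level switches. That formula is an inference, since the expression itself is missing from the transcript. Under that assumption, the number of switches left idle before the re-balancing heuristic runs can be tabulated as follows; `used_top_switches` is a hypothetical helper.

```python
import math

def used_top_switches(m):
    """Top level switches used by the basic scheme, assuming floor(sqrt(m))^2."""
    k = math.isqrt(m)
    return k * k

# Map m -> (switches used, switches left idle) for a few sizes.
usage = {m: (used_top_switches(m), m - used_top_switches(m))
         for m in (4, 8, 9, 12, 16)}
```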

  12. Performance: Worst case permutation load

  13. Average case performance • Three types of permutations: • Bisection patterns • Full permutation patterns • Dissemination patterns (Bruck’s all-to-all pattern) • Use the average of a large number of random patterns to approximate the average case performance. • Starting from 1000 random patterns • Double the number of random samples until the 99% confidence interval is no more than 1% of the average.
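
The sampling procedure above can be sketched as follows. Several details are assumptions, not taken from the slides: the confidence interval uses a normal approximation (z = 2.576 for 99%), "no more than 1% of the average" is read as a bound on the interval's half-width, and the sampled quantity is the maximum link load of a random full permutation under D-mod-k in T(4+4, 8). The names `estimate_mean` and `dmodk_max_load` are hypothetical.

```python
import math
import random
import statistics
from collections import Counter

def dmodk_max_load(rng, n=4, m=4, r=8):
    """Maximum link load of one random full permutation under D-mod-k."""
    dests = list(range(n * r))
    rng.shuffle(dests)
    load = Counter()
    for s, d in enumerate(dests):
        if s // n != d // n:
            t = d % m
            load[("up", s // n, t)] += 1
            load[("down", t, d // n)] += 1
    return max(load.values(), default=0)

def estimate_mean(sample_fn, start=1000, z=2.576, rel_tol=0.01,
                  max_samples=1_000_000, seed=0):
    """Double the sample count until the CI half-width is within rel_tol."""
    rng = random.Random(seed)
    values = [sample_fn(rng) for _ in range(start)]
    while True:
        mean = statistics.fmean(values)
        half = z * statistics.stdev(values) / math.sqrt(len(values))
        if half <= rel_tol * mean or len(values) >= max_samples:
            return mean, half, len(values)
        # Double the number of samples, keeping those already drawn.
        values.extend(sample_fn(rng) for _ in range(len(values)))

mean, half_width, count = estimate_mean(dmodk_max_load)
```

The `max_samples` cap is a safety guard added for this sketch so the loop always terminates.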

  14. Performance: average bandwidth

  15. Conclusion • Existing single path routing schemes are not ideal for permutation communications in 2-level fat-trees. • Our proposed routing scheme achieves optimal worst-case permutation performance. • Our scheme also achieves high average performance for permutation communications.
