Weighted Random Oblivious Routing on Torus Networks. Rohit Sunkam Ramanujam, Bill Lin. Electrical and Computer Engineering, University of California, San Diego.
Networks-On-Chip • Chip-multiprocessors (CMPs) are increasingly popular • Torus, mesh, and flattened butterfly are candidate architectures for on-chip networks [Figures: Intel Larrabee, Tilera Tile64; 2D torus, folded torus]
Outline • Motivation • Related Work • Optimal routing for rings • Optimal routing for 2D torus
Optimal Oblivious Routing • Cast as a multi-commodity flow problem: maximize worst-case throughput, minimize hop count • Solve using linear programming • Impractical for large networks: the number of paths is too large (exponential), it is hard to make the routing deadlock-free, and the LP is not scalable
Optimal 2TURN • Optimum oblivious routing with only 2TURN paths [Figure: 4x4 torus, nodes labeled 0,0 through 3,3]
Valiant Load Balancing (VAL) • 2 phases of X-Y routing [Figure: 4x4 torus, nodes labeled 0,0 through 3,3]
Improved Valiant Routing (IVAL) • Phase 1: X-Y, Phase 2: Y-X [Figure: 4x4 torus, nodes labeled 0,0 through 3,3]
Latency Comparison [Figure: latency comparison plot, annotated with 13.5%]
Evolution of W2TURN • Step 1: Started with the simple case of 1D rings; developed Weighted Random Direction (WRD) • Step 2: Described the 2TURN paths in IVAL in terms of routing on 1D segments (I2TURN); I2TURN has an analytical expression for hop count • Step 3: Combined the intuition gained from WRD, I2TURN, and optimal 2TURN; developed Weighted Random 2TURN routing (W2TURN); analytically showed that the latency of W2TURN is strictly better than that of I2TURN
Outline • Motivation • Related Work • Optimal routing for rings • Optimal routing for 2D torus
Routing on Rings • Randomized Load Balancing (RLB) – Optimal worst-case throughput for rings • Same routing strategy for both odd and even radix networks
Some Facts … • Worst-case throughput is determined by the maximum channel load under the most adversarial traffic • For a torus network with radix k, the maximum channel load for worst-case throughput optimality is k/4 for even k and k/4 – 1/(4k) for odd k
Rings – The Difference Between Odd and Even • RLB: Route minimally with probability (k-∆)/k • Why can’t we route minimally more often? • Odd k, tornado traffic: ∆ = (k-1)/2 • Total channel load = ((k-1)/2) * ((k+1)/(2k)) = k/4 – 1/(4k) = maximum load for worst-case throughput optimality
Rings – The Difference Between Odd and Even • RLB: Route minimally with probability (k-∆)/k • Can we route minimally more often? • Even k, tornado traffic: ∆ = k/2 – 1 • Total channel load with RLB = (k/2 – 1) * ((k+2)/(2k)) = k/4 – 1/k < maximum load for worst-case throughput optimality • So route minimally with probability (k-∆-1)/(k-2) > (k-∆)/k
WRD Algorithm • Odd radix: • Route minimally with probability (k-∆)/k • Route non-minimally with probability ∆/k • Even radix: • Route minimally with probability (k-∆-1)/(k-2) when k > 2 and ∆ > 0 • Route non-minimally with probability (∆-1)/(k-2) when k > 2 and ∆ > 0
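The WRD probabilities above can be checked numerically. The following is a minimal sketch, assuming ∆ is the minimal hop distance on the ring and that, under tornado traffic, each channel in the minimal direction carries ∆ flows (as in the channel-load products on the preceding slides). The function names are illustrative and not from the slides.

```python
from fractions import Fraction


def rlb_minimal_probability(k: int, delta: int) -> Fraction:
    """RLB: probability of routing in the minimal direction on a k-node ring."""
    return Fraction(k - delta, k)


def wrd_minimal_probability(k: int, delta: int) -> Fraction:
    """WRD: probability of routing in the minimal direction on a k-node ring.

    delta is assumed to be the minimal hop distance, 0 < delta <= k // 2.
    Radix k < 3 is not covered by the slide, so it is rejected here.
    """
    if k < 3 or not 0 < delta <= k // 2:
        raise ValueError("need k >= 3 and 0 < delta <= k // 2")
    if k % 2 == 1:
        return Fraction(k - delta, k)        # odd radix: same as RLB
    return Fraction(k - delta - 1, k - 2)    # even radix


def tornado_minimal_load(k: int, prob_fn) -> Fraction:
    """Load on a minimal-direction channel under tornado traffic.

    Tornado distance: (k-1)/2 for odd k, k/2 - 1 for even k.
    """
    delta = (k - 1) // 2
    return delta * prob_fn(k, delta)


if __name__ == "__main__":
    for k in (7, 8):
        print(k,
              tornado_minimal_load(k, rlb_minimal_probability),
              tornado_minimal_load(k, wrd_minimal_probability))
```

For even k, the WRD load on the minimal-direction channel comes out to exactly k/4, whereas RLB leaves a slack of 1/k; for odd k the two coincide.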
Outline • Motivation • Related Work • Optimal routing for rings • Optimal routing for 2D torus
I2TURN • Describe 2TURN paths in terms of 1D segments • 2TURN paths: X-Y-X or Y-X-Y • X-Y-X routing: select an intermediate X position x* uniformly at random; route minimally to x*; route using RLB on the Y ring at X = x*; route minimally to the destination [Figure: 4x4 torus example, path segments labeled with probabilities 3/4 and 1/4]
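The X-Y-X steps above can be sketched as a path generator. This is an illustrative sketch, not the authors' implementation: nodes are assumed to be (x, y) tuples on a k x k torus, ties between the two directions are broken toward increasing coordinates (an assumption), and all function names are hypothetical.

```python
import random


def minimal_direction(src, dst, k):
    """+1 or -1 for the shorter way around a k-node ring (ties go to +1)."""
    forward = (dst - src) % k
    return 1 if forward <= k - forward else -1


def ring_walk(src, dst, k, direction):
    """Positions visited from src to dst in the given direction, excluding src."""
    pos, visited = src, []
    while pos != dst:
        pos = (pos + direction) % k
        visited.append(pos)
    return visited


def rlb_direction(src, dst, k, rng):
    """RLB: take the minimal direction with probability (k - delta) / k."""
    forward = (dst - src) % k
    delta = min(forward, k - forward)
    minimal = minimal_direction(src, dst, k)
    return minimal if rng.random() < (k - delta) / k else -minimal


def i2turn_xyx_path(src, dst, k, rng=random):
    """One random X-Y-X path from src to dst as a list of (x, y) nodes."""
    (sx, sy), (dx, dy) = src, dst
    x_star = rng.randrange(k)                      # intermediate X position
    path = [src]
    # Segment 1: minimal routing along X to x*.
    for x in ring_walk(sx, x_star, k, minimal_direction(sx, x_star, k)):
        path.append((x, sy))
    # Segment 2: RLB on the Y ring at X = x*.
    for y in ring_walk(sy, dy, k, rlb_direction(sy, dy, k, rng)):
        path.append((x_star, y))
    # Segment 3: minimal routing along X to the destination.
    for x in ring_walk(x_star, dx, k, minimal_direction(x_star, dx, k)):
        path.append((x, dy))
    return path


if __name__ == "__main__":
    print(i2turn_xyx_path((0, 1), (2, 3), 4, random.Random(0)))
```

Y-X-Y routing would follow by swapping the roles of the two coordinates.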
I2TURN – Main Idea • For XYX routing, load balance across the Y-rings to make traffic along every Y-ring admissible • Use worst-case throughput optimal routing (RLB) on the Y-ring • Can easily derive analytical expression for average packet latency • Can be proved to be equivalent to IVAL. Hence, it is worst-case throughput optimal • Can define YXY routing by swapping dimensions
W2TURN – Even Radix • Reduces latency over I2TURN • Use WRD instead of RLB • Interpolate X-Y-X and Y-X-Y 2TURN routing with minimal X-Y and Y-X routing • Selection probabilities: XYX: k/(2(k+1)) • YXY: k/(2(k+1)) • XY: 1/(2(k+1)) • YX: 1/(2(k+1))
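As a quick check, the four selection probabilities sum to 1: 2 * k/(2(k+1)) + 2 * 1/(2(k+1)) = (k+1)/(k+1). A minimal sketch of the weighted choice follows; the function names and the string labels for the route classes are illustrative assumptions.

```python
import random
from fractions import Fraction


def w2turn_even_class_weights(k: int) -> dict:
    """Selection probability of each route class for even radix k."""
    if k < 2 or k % 2 != 0:
        raise ValueError("this sketch covers even radix only")
    two_turn = Fraction(k, 2 * (k + 1))   # XYX and YXY
    minimal = Fraction(1, 2 * (k + 1))    # XY and YX
    return {"XYX": two_turn, "YXY": two_turn, "XY": minimal, "YX": minimal}


def pick_route_class(k: int, rng=random) -> str:
    """Draw one route class according to the weights above."""
    weights = w2turn_even_class_weights(k)
    names = list(weights)
    return rng.choices(names, weights=[float(w) for w in weights.values()], k=1)[0]


if __name__ == "__main__":
    print(w2turn_even_class_weights(4))
    print(pick_route_class(4, random.Random(1)))
```

Exact fractions are used for the weights so the sum-to-one property can be verified without floating-point error.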
X-Y-X W2TURN • X-Y-X routing: select an intermediate X position x* uniformly at random; route minimally to x*; route using WRD on the Y ring at X = x*; route minimally to the destination • When the number of hops in both directions is equal, avoid using links used by minimal X-Y or Y-X routing [Figure: 4x4 torus example, path segment labeled with probability 1]
W2TURN – Odd Radix • W2TURN = Optimal 2TURN for odd radix • More elaborate description but easy to implement • Uses X-Y-X and Y-X-Y 2TURN routing with equal probability • Most of the intuition gained by observing optimal 2TURN paths
Latency Evaluation [Figure: latency evaluation plot, annotated with 13.5%]
W2TURN ≈ Optimal-2TURN • W2TURN = Optimal-2TURN for odd radix • W2TURN within 0.72% of Optimal-2TURN for even radix
Summary of Contributions • WRD: Optimal routing algorithm for rings • Worst-case throughput optimal • Minimum hop count • W2TURN-Odd: Optimal 2TURN routing with a closed form description for 2D torus with odd radix • W2TURN-Even: Latency within 0.072% of optimal 2TURN routing for 2D torus with even radix • WRD and W2TURN are best performing closed-form algorithms for 1D and 2D torus!!
Proof of worst-case throughput optimality • Optimal worst-case channel load = 2 * (channel load for uniform traffic) • To prove a routing is worst-case throughput optimal, it is sufficient to prove that its maximum channel load is k/4 when k is even and k/4 – 1/(4k) when k is odd
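The factor-of-two relation can be checked on a single ring. The sketch below assumes uniform traffic means each node sends 1/k of its traffic to each of the k nodes, that uniform traffic is routed minimally with ties split evenly between the two directions, and that the load is spread evenly over the 2k unidirectional channels by symmetry. The function names are illustrative.

```python
from fractions import Fraction


def uniform_ring_channel_load(k: int) -> Fraction:
    """Per-channel load on a k-node ring under uniform traffic, minimal routing."""
    # Average minimal hop count per unit of traffic injected by one node.
    avg_hops = sum(Fraction(min(d, k - d), k) for d in range(k))
    # k nodes inject, 2k unidirectional channels share the hops evenly.
    return avg_hops * k / (2 * k)


def optimal_worst_case_load(k: int) -> Fraction:
    """Bound stated on the slide: k/4 for even k, k/4 - 1/(4k) for odd k."""
    if k % 2 == 0:
        return Fraction(k, 4)
    return Fraction(k, 4) - Fraction(1, 4 * k)


if __name__ == "__main__":
    for k in range(3, 10):
        print(k, 2 * uniform_ring_channel_load(k), optimal_worst_case_load(k))
```

Under these assumptions, twice the uniform load matches the stated bound for both odd and even radix.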