Flyways in Data Centers • Srikanth Kandula, Jitendra Padhye and Victor Bahl, Microsoft Research
Data Center Networking • Networking is a major cost of building large data centers • Switches, routers, cabling complexity, management … • Expensive equipment: aggregation switches cost > $250K • Tradeoff: provide “good” connectivity at “low” cost • Lots of recent interest from industry and academia
Traditional Data Center Networks • 20-40 machines per rack • 1Gbps links to the top-of-rack (ToR) switch • 160 ToRs per aggregation switch, connected by 10Gbps links • [Figure: two-level tree; 20 x 1Gbps server links per ToR, 160 x 10Gbps ToR uplinks per aggregate switch, rest of the DC network above]
The oversubscription problem • As one goes up the hierarchy, link capacity does not scale with the number of servers • 20 servers w/ 1Gbps links to the ToR switch (20Gbps of server capacity) • 10Gbps uplink from ToR to aggregation switch • 1:2 oversubscription • [Figure: one ToR with 20 x 1Gbps server links and a single 10Gbps uplink to the aggregate switch]
The oversubscription problem • As one goes up the hierarchy, link capacity does not scale with number of servers • 20 servers w/ 1 Gbps link to ToR switch • 10Gbps uplink from ToR to Aggregation switch • 1:2 oversubscription • Implications: • Potential for congestion when communicating between racks • So, applications minimize such communication
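For concreteness, a minimal back-of-the-envelope sketch of the ratio above; the parameters are taken from these slides, and the snippet is illustrative rather than anything from the authors.

```python
# Back-of-the-envelope oversubscription check for the topology above.
servers_per_rack = 20
server_link_gbps = 1      # 1Gbps from each server to the ToR switch
tor_uplink_gbps = 10      # 10Gbps from the ToR up to the aggregation switch

rack_demand_gbps = servers_per_rack * server_link_gbps  # worst case: every server sends off-rack
ratio = rack_demand_gbps / tor_uplink_gbps               # 20 / 10 = 2
print(f"oversubscription = 1:{ratio:g}")                 # -> oversubscription = 1:2
```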
Possible solutions • Fewer servers per ToR • More switches, higher cost • Use higher bandwidth links for ToR uplinks • Technological limitations • Today, largest practical link is 40Gbps (4x10Gbps) • Expensive • Either use more aggregation switches • Or build ones with sufficient backplane bandwidth • Clos networks
Non-traditional networks • Lots of inexpensive hardware • Create multiple paths between ToRs • Carefully managed routing • FatTree, VL2, BCube etc. • [Figures: FatTree and VL2 topologies]
Note that …. • Key goal of all proposed solutions is to eliminate oversubscription • There are various other advantages as well • Why? • Need to move VMs anywhere in the network • Network is no longer a bottleneck • Needed for all-pairs-shuffle workload • Is there an alternative to eliminating oversubscription?
A non-oversubscribed (1:1) network may not always be necessary • Studied application demands from a production cluster • Short-lived, localized congestion • If we can add capacity to “hotspots” as they form, we may not need to eliminate oversubscription
Data set • Production cluster of 1500 servers • Data-mining workload • 1:2 oversubscribed tree • 20 servers per rack (75 total racks) • 1Gbps links from server to ToR • 10Gbps uplink from ToR to aggregation switch • Socket-level traces over several weeks • Demands computed by averaging traffic over 5 minute windows
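A minimal sketch of how ToR-to-ToR demands could be computed from such socket-level traces over 5-minute windows; the flow-record format, field names, and server_to_tor mapping are assumptions for illustration, not the authors' tooling.

```python
from collections import defaultdict

WINDOW = 300  # seconds; demands are averaged over 5-minute windows


def demand_matrix(flows, server_to_tor):
    """flows: iterable of (timestamp, src_server, dst_server, bytes) records from a
    socket-level trace; server_to_tor maps a server to its rack's ToR switch.
    Returns {window_index: {(src_tor, dst_tor): average rate in bits/sec}}."""
    totals = defaultdict(lambda: defaultdict(int))
    for ts, src, dst, nbytes in flows:
        window = int(ts // WINDOW)
        pair = (server_to_tor[src], server_to_tor[dst])
        if pair[0] != pair[1]:           # only inter-rack traffic crosses the ToR uplink
            totals[window][pair] += nbytes
    return {w: {p: 8 * b / WINDOW for p, b in pairs.items()}
            for w, pairs in totals.items()}
```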
Only a few ToRs are hot, and most of their traffic goes to a few other ToRs
Our idea • Build a slightly oversubscribed base network • Significant cost savings • Add links between ToRs as and when needed: “flyways” • [Figure: base tree with an extra flyway link directly connecting two ToRs]
Questions … • How to realize flyways? • Wireless: radio (802.11n, 60GHz), free-space optics • Wired: randomized links, optical switches • Which flyways to enable? • Flyway between which ToRs? • What capacity do flyways need? • …
60GHz technology • 57-64 GHz • 7GHz of bandwidth (802.11b/g has only 80MHz) • Available worldwide • High bandwidth: 1-4 Gbps links are already available • Low range (1-10 meters) • Advantageous in data center environments: improves spatial reuse • Line of sight easy to achieve in data centers: antennas on top of the racks • Small form-factor antennas • Steerable directional antennas are feasible • Recent advances in CMOS technology bringing cost down • Sayana, SiBeam, Wilocity, MediaTek • [Photos: IBM/MediaTek 60GHz chip, SiBeam WirelessHD reference kit]
Which flyways to enable? We propose an algorithm for this …..
Need modest bandwidth Flyways need to carry only a small fraction of a ToR's uplink traffic
Evaluation • Trace-driven numerical simulations • 1500 servers, 75 racks, 1:2 oversubscribed tree • Flyways do not carry transit traffic • Real data center layout • Wireless: ignore interference; vary range, capacity etc. • Metric: completion time of demands (CTD), normalized by the completion time in a non-oversubscribed network • CTD == 2: no improvement • CTD == 1: equivalent to a non-oversubscribed network
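A minimal sketch of the CTD normalization, assuming per-demand completion times are available from the simulator for both networks; how per-demand values are aggregated into a single reported number is not specified here.

```python
def ctd(finish_times, finish_times_nonoversub):
    """Completion time of demands (CTD): each demand's completion time, normalized by
    its completion time in a non-oversubscribed network.  CTD == 2 means no improvement
    over the 1:2 oversubscribed tree; CTD == 1 matches the non-oversubscribed network."""
    return {d: finish_times[d] / finish_times_nonoversub[d] for d in finish_times}
```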
Algorithm for placing flyways (see the sketch below) • 1. Start with no flyways • 2. Solve the demand matrix • 3. Find the worst laggard in the demand matrix • 4. Add a flyway for the worst pair (modulo constraints) • 5. Go to step 2
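A rough sketch of this greedy loop. The topology.solve_rates() call is a hypothetical stand-in for whatever rate-allocation model the simulation uses, and feasible() is a placeholder for constraints such as wireless range or antenna availability; the simplifying assumption here is that a demand's completion time is its size divided by its allocated rate.

```python
def place_flyways(demands, topology, num_flyways, flyway_gbps=1, feasible=lambda pair: True):
    """Greedy flyway placement: repeatedly add ToR-to-ToR capacity for the ToR pair
    whose demand would finish last.
    demands: {(src_tor, dst_tor): bytes}.  Returns the list of flyways added."""
    flyways = []
    for _ in range(num_flyways):
        rates = topology.solve_rates(demands, flyways)            # step 2: solve demand matrix
        finish = {pair: demands[pair] / rates[pair] for pair in demands}
        laggards = sorted(finish, key=finish.get, reverse=True)   # step 3: worst laggard first
        pair = next((p for p in laggards if feasible(p)), None)   # step 4: modulo constraints
        if pair is None:
            break
        flyways.append((pair, flyway_gbps))                       # step 5: repeat with new flyway
    return flyways
```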
1Gbps flyways, no range restriction With 50 flyways, performance is comparable to non-oversubscribed network
Impact of Flyway capacity 1Gbps flyway capacity appears to be sufficient
Flyway range = 10m Flyways are beneficial even with limited range
Conclusions • A new paradigm for data center networks • Slightly oversubscribed base network with dynamic capacity addition using flyways • Today, 60GHz wireless appears to be a good choice for flyways
Bandwidth needs • Flyways carry far less traffic than the uplink • In our model, flyways carry only ToR-to-ToR (1 hop) traffic • The 60GHz band is roughly 90x wider than the 802.11b/g band (7GHz vs. 80MHz) • With better encodings, significantly higher capacity is possible