Proteus: A Topology Malleable Data Center Network



  1. Proteus: A Topology Malleable Data Center Network
  Ankit Singla (University of Illinois Urbana-Champaign), Atul Singh, Kishore Ramachandran, Lei Xu, Yueping Zhang (NEC Labs, Princeton)

  2. Data Centers
  • Data centers: the foundation of Internet services and enterprise operations
  • Need good bandwidth connectivity between servers

  3. “Good” Bandwidth Connectivity
  • Connect all servers at full bandwidth? Fat-trees [SIGCOMM 2008], VL2 [SIGCOMM 2009]
  • But: power consumption? Cabling complexity? Upgrade to 40/100-GigE?

  4. Oversubscribed Networks
  • Is all-to-all full-bandwidth connectivity always necessary?
  • Small number of ‘hot’ ToR-ToR connections (Flyways [HotNets 2009])
  • >90% of bytes flow in ‘elephant flows’ (VL2 [SIGCOMM 2009])
  • ~60% of ToRs see <20% change in traffic over 1.6-2.2 s intervals (The Case for Fine-grained TE in Data Centers [WREN 2010])
  • Flyways [HotNets 2009], c-Through and Helios [SIGCOMM 2010]: supplement the electrical network with wireless/optics
  • Wireless/optical connections are set up between hot ToRs
  • Some flexibility to adjust to changes in the traffic matrix

  5. Proteus, A New Design Point: All-optics
  [Figure: servers attach to ToRs; an optical interconnect sits above the ToR layer]
  • Proteus is an oversubscribed network with topology malleability
  • Proteus is a novel interconnect above the ToR layer
  • Topology adjusts to traffic demands
  • Low cabling complexity
  • Easier migration to 40/100-GigE
  • Low power consumption

  6. Malleability
  [Figure: an eight-ToR example (A-H) responds to a traffic change in three ways: pick different routes, change link capacities, or change the topology itself]

  7. Optics: Perfect Fit
  [Figure: MEMS circuit switching handles topology management (circuit setup time is the concern); a WSS apportions the limited wavelengths]
  • A single fiber can carry 64 Terabits*, rather than 64,000 x 1 Gigabit links (* achieved by NEC Labs and AT&T)
  • Low complexity, reconfigurability, low power consumption
  • MEMS = Micro-Electro Mechanical Switch; WSS = Wavelength Selective Switch

  8. Problem Setting: Container-sized DCN
  • Proteus-2560: connect 80 ToRs, each with 32 servers
  • Typical container size in containerized data center architectures
  [Image adapted from www.sun.com/blackbox]

  9. ToR Perspective
  [Figure: a non-blocking ToR with 32 ports for servers and 32 ports towards the optical interconnect]

  10. ToR Perspective
  [Figure: transceivers with unique wavelengths carry intra-rack traffic, cross-rack traffic, and hop-by-hop transit traffic through the non-blocking ToR; transit is limited by ToR port capacity]
  • (O-E-O conversions add sub-nanosecond latency at each hop)

  11. Change Topology, Change Capacity
  [Figure: ToR1's optical components terminate incoming and outgoing links of both low and high capacity, reaching ToRs 11, 13, 21, 29, 45, 55, 67, and 73]

  12. Optical Components
  [Figure: topology is set by a 320-port MEMS switch; capacity by a 1x4 WSS that splits each ToR's 32 multiplexed wavelengths into four groups; circulators provide bi-directionality, with couplers, MUX, and DEMUX completing the path to ToRs such as ToR26 and ToR59]
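To make the capacity knob concrete, here is a minimal model assuming the organization sketched above (32 wavelengths of 10 Gbps each behind a 1x4 WSS); the function and data layout are illustrative, not the paper's.

```python
# Minimal model of Proteus' capacity knob: a ToR's 32 x 10 Gbps wavelengths
# are multiplexed onto one fiber, and a 1x4 WSS assigns each wavelength to
# one of 4 output groups; each group becomes one MEMS-switched link whose
# capacity is 10 Gbps times the number of wavelengths in the group.

WAVELENGTHS = 32          # unique wavelengths per ToR
GBPS_PER_WAVELENGTH = 10  # line rate of one transceiver
WSS_PORTS = 4             # 1x4 WSS -> at most 4 outgoing links per ToR

def link_capacities(groups):
    """groups: list of 4 disjoint sets of wavelength indices (0..31).
    Returns the resulting capacity (Gbps) of each outgoing link."""
    assigned = set()
    for g in groups:
        if assigned & g:
            raise ValueError("a wavelength can feed only one WSS port")
        assigned |= g
    if len(groups) != WSS_PORTS or len(assigned) > WAVELENGTHS:
        raise ValueError("invalid WSS configuration")
    return [GBPS_PER_WAVELENGTH * len(g) for g in groups]

# Example: shift capacity toward a hot link without any rewiring.
even = [set(range(i * 8, (i + 1) * 8)) for i in range(4)]
hot = [set(range(20)), set(range(20, 24)), set(range(24, 28)), set(range(28, 32))]
print(link_capacities(even))  # [80, 80, 80, 80]
print(link_capacities(hot))   # [200, 40, 40, 40]
```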

  13. Proteus-2560 Properties
  • Build any 4-regular ToR topology
  • Each link’s capacity varies in each direction
  • Capacity ∈ {10, 20, 30, …, 320} Gbps
  • Provided the sum of the capacities of the 4 links is ≤ 320 Gbps (see the validator sketch below)
  • (Also avoid wavelength contention)
  • Use hop-by-hop connections to other ToRs
  • Transit traffic doesn’t interfere with intra-ToR traffic
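These properties can be checked mechanically. Below is a hedged validator sketch; the `topo`/`waves` data layout and the receiver-side disjointness reading of "wavelength contention" are my assumptions, not the paper's.

```python
# Sketch of a configuration validator for the properties above. `topo` maps
# each ToR to its 4 neighbors; `waves` maps each directed link (u, v) to the
# set of wavelength indices it carries. Layout is illustrative only.

def validate(topo, waves, degree=4, budget_gbps=320, gbps_per_wave=10):
    for u, nbrs in topo.items():
        # Any 4-regular ToR topology can be built; verify the degree.
        assert len(nbrs) == degree, f"ToR {u} needs exactly {degree} links"
        out = [waves[(u, v)] for v in nbrs]
        # Per-direction capacities are multiples of 10 Gbps, each link
        # carries >= 1 wavelength, and the 4 links share a 320 Gbps budget.
        assert all(len(w) >= 1 for w in out), "each link needs a wavelength"
        assert sum(len(w) for w in out) * gbps_per_wave <= budget_gbps
        # Wavelength contention (assumed meaning): links arriving at one
        # ToR share a receiver, so their wavelength sets must be disjoint.
        seen = set()
        for w in (waves[(v, u)] for v in nbrs):
            assert not (seen & w), f"wavelength reused at receiver {u}"
            seen |= w
```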

  14. Topology Management
  [Figure: MEMS, WSS, and hop-by-hop routing decisions for an example four-ToR network (A-D)]
  • Complex problem: all the configurations are interdependent
  • We formulate the problem as a mixed-integer linear program (a simplified sketch follows)
  • Describe a heuristic approach backed by graph-theoretic insights
  • Likely to take under a couple of hundred milliseconds
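The paper's MILP jointly covers MEMS, WSS, and routing; as a rough illustration only, here is a much-simplified PuLP sketch of just the topology-selection piece (the variable names, the objective, and dropping the wavelength/routing coupling are my simplifications).

```python
# Much-simplified MILP sketch of the topology-selection piece only (PuLP).
# The real formulation is a joint MEMS/WSS/routing problem; here we just
# pick a 4-regular ToR graph that maximizes directly-connected demand.
import itertools
import pulp

def pick_topology(traffic, degree=4):
    """traffic: dict mapping frozenset({u, v}) -> demand. Returns chosen edges."""
    tors = sorted({t for e in traffic for t in e})
    pairs = [tuple(sorted(p)) for p in itertools.combinations(tors, 2)]
    x = pulp.LpVariable.dicts("edge", pairs, cat="Binary")
    prob = pulp.LpProblem("proteus_topology", pulp.LpMaximize)
    prob += pulp.lpSum(traffic.get(frozenset(p), 0) * x[p] for p in pairs)
    for t in tors:  # each ToR gets exactly `degree` MEMS-switched links
        prob += pulp.lpSum(x[p] for p in pairs if t in p) == degree
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    # (Connectivity is not enforced here; the heuristic on the next slide
    # checks and corrects for it separately.)
    return [p for p in pairs if x[p].value() == 1]
```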

  15. Heuristic Approach – Key Ideas
  • Topology: weighted 4-matching over hot ToR-ToR connections; check and correct for connectivity
  • Routing: can use shortest paths; ideally, need low-congestion routing schemes
  • Capacities: graph edge-coloring over wavelengths; ensure each link carries at least one wavelength
  • (A sketch of all three steps follows this list)
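A compact sketch of the three steps using networkx. Approximating the weighted 4-matching by a union of four successive max-weight matchings, and edge-coloring via the line graph, are standard stand-ins rather than necessarily the paper's exact algorithms.

```python
# Hedged sketch of the heuristic's three steps over a traffic graph.
import networkx as nx

def heuristic(traffic, degree=4, wavelengths=32):
    """traffic: nx.Graph with 'weight' = demand between ToR pairs."""
    # Step 1 (topology): approximate a max-weight 4-matching by taking
    # the union of 4 successive max-weight matchings.
    topo = nx.Graph()
    topo.add_nodes_from(traffic)
    rest = traffic.copy()
    for _ in range(degree):
        m = nx.max_weight_matching(rest)
        topo.add_edges_from(m)
        rest.remove_edges_from(m)
    # (Connectivity check/correction would go here: if `topo` is
    # disconnected, swap low-weight edges between components.)
    # Step 2 (routing): shortest paths over the chosen topology.
    routes = dict(nx.all_pairs_shortest_path(topo))
    # Step 3 (capacities): edge-color by vertex-coloring the line graph,
    # so links sharing a ToR get distinct wavelengths; max degree 4 means
    # few colors are needed, and every link gets at least one wavelength.
    coloring = nx.coloring.greedy_color(nx.line_graph(topo))
    waves = {edge: color % wavelengths for edge, color in coloring.items()}
    return topo, routes, waves
```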

  16. Preliminary Analysis
  • Cabling: #fibers ≈ 1/5th the #cables in a fat-tree
  • Ease of upgrade: when ToRs move to 40/100-GigE, nothing else changes!
  • Cost: similar to a fat-tree, and optics is yet to benefit from commoditization
  • To some extent, this dispels the “optics is expensive” myth
  • Power: 50% of fat-tree power consumption
  • (Though the fat-tree is also fault tolerant)

  17. Conclusion, Ongoing Work
  • A novel data center architecture
  • Unprecedented topology flexibility
  • Reduced cabling complexity
  • Easier migration to 40/100-GigE
  • Reduced power consumption
  • Explores a new design point: all-optics
  • Ongoing work: transient behavior, routing, and synchronization; experimental evaluation; incremental update heuristics; mega-data-center scale; fault tolerance

  18. Thank You! Questions?

  19. Extras / Backup

  20. Hop-by-hop Through ToRs
  • The MEMS provides only a limited number of end-to-end circuits
  • Need hop-by-hop routes over these circuits
  • Feasibility assessment: works fine! (a quick check follows)
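A quick sanity check in the spirit of this slide (my construction, not the paper's evaluation): on a random 4-regular graph over the 80 ToRs, every pair is reachable within a few hops.

```python
# Any Proteus topology is some 4-regular graph over 80 ToRs; even a random
# one already gives short hop-by-hop routes over the MEMS circuits.
import networkx as nx

g = nx.random_regular_graph(d=4, n=80, seed=1)
if nx.is_connected(g):  # random regular graphs are connected w.h.p.
    print("diameter:", nx.diameter(g))  # small, roughly logarithmic in n
    print("avg hops:", round(nx.average_shortest_path_length(g), 2))
```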

  21. Helios [SIGCOMM ’10]
  • Pods are still fat-trees
  • Requires a design-time decision on stable vs. unstable traffic
  • Does not exploit multi-hop optical routes
  • Does not leverage WSS technology for variable capacity
  [Image from “Helios: A Hybrid Electrical/Optical Switch Architecture for Modular Data Centers”, Farrington et al.]
