Symbiotic Routing in Future Data Centers
Hussam Abu-Libdeh, Paolo Costa, Antony Rowstron, Greg O'Shea, Austin Donnelly
Cornell University / Microsoft Research Cambridge
Data center networking
• Network principles evolved from Internet systems
  • Multiple administrative domains
  • Heterogeneous environment
• But data centers are different
  • A single administrative domain
  • Total control over all operational aspects
• Re-examine the network in this new setting
Rethinking DC networks
• New proposals for data center network architectures
  • DCell, BCube, Fat-tree, VL2, PortLand …
• The network interface has not changed!
[Figure: the unchanged network interface sits between applications and the properties they care about: bandwidth, performance isolation, fault tolerance, graceful degradation, scalability, TCO, commodity components, modular design, …]
Challenge
• The network is a black box to applications
  • Must infer network properties: locality, congestion, failure, etc.
  • Little or no control over routing
• Applications are a black box to the network
  • Must infer flow properties, e.g. for traffic engineering (Hedera)
• In consequence
  • Today's data centers and proposals use a single routing protocol
  • Routing trade-offs are made in an application-agnostic way, e.g. latency vs. throughput
CamCube
• A new data center design
  • Nodes are commodity x86 servers with local storage
  • Container-based model: 1,500-2,500 servers
• Direct-connect 3D torus topology
  • Six Ethernet ports per server
  • Servers have (x,y,z) coordinates
    • Defines a coordinate space
• Simple 1-hop API
  • Send/receive packets to/from 1-hop neighbours
  • Not using TCP/IP
• Everything is a service
  • Runs on all servers
  • Multi-hop routing is a service
    • Simple link-state protocol
    • Routes packets along shortest paths from source to destination (see the sketch below)
[Figure: 3D torus with x, y, z axes; example server coordinate (0,2,0)]
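To make the coordinate space concrete, here is a minimal sketch of shortest-path next-hop selection on a k-ary 3D torus, in the spirit of the base routing service. Coord, TorusRouting, NextHops, and Wrap are illustrative names, not the CamCube API:

using System;
using System.Collections.Generic;

// Illustrative sketch only; these names are assumptions, not CamCube APIs.
struct Coord
{
    public int X, Y, Z;
    public Coord(int x, int y, int z) { X = x; Y = y; Z = z; }
}

static class TorusRouting
{
    public const int K = 20; // ring length per axis, e.g. the 20x20x20 simulator

    // Signed shortest offset along one wrap-around ring of length K.
    // (When K is even and both directions tie, this picks the positive one.)
    public static int Offset(int from, int to)
    {
        int d = ((to - from) % K + K) % K;
        return d <= K / 2 ? d : d - K;
    }

    // All 1-hop neighbours that lie on some shortest path to dst.
    public static List<Coord> NextHops(Coord cur, Coord dst)
    {
        var hops = new List<Coord>();
        int dx = Offset(cur.X, dst.X), dy = Offset(cur.Y, dst.Y), dz = Offset(cur.Z, dst.Z);
        if (dx != 0) hops.Add(new Coord(Wrap(cur.X + Math.Sign(dx)), cur.Y, cur.Z));
        if (dy != 0) hops.Add(new Coord(cur.X, Wrap(cur.Y + Math.Sign(dy)), cur.Z));
        if (dz != 0) hops.Add(new Coord(cur.X, cur.Y, Wrap(cur.Z + Math.Sign(dz))));
        return hops; // several candidates whenever multiple axes still differ
    }

    internal static int Wrap(int v) => (v % K + K) % K;
}

That NextHops usually returns several candidates is exactly the path diversity the per-service protocols on the following slides exploit.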
Development experience
• Built many data center services on CamCube, e.g.:
  • High-throughput transport service
    • Desired property: high throughput
  • Large-file multicast service
    • Desired property: low link load
  • Aggregation service
    • Desired property: distribute computation load over servers
  • Distributed object cache service
    • Desired property: per-key caches, low path stretch
Per-service routing protocols
• Higher flexibility
• Services optimize for different objectives
  • High-throughput transport → disjoint paths (see the sketch below)
    • Increases throughput
  • File multicast → non-disjoint paths
    • Decreases network load
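On a torus, one simple way to realize disjoint paths is to traverse the axes in different orders: the paths for the orders x→y→z and z→y→x share no links whenever source and destination differ on all three axes. A hedged sketch extending the TorusRouting class above (AxisOrderedPath, Get, and Set are hypothetical helpers):

// Build a dimension-ordered shortest path for a given axis order, e.g. "xyz".
// Reversed orders ("xyz" vs "zyx") yield link-disjoint paths when the
// endpoints differ on every axis.
public static List<Coord> AxisOrderedPath(Coord src, Coord dst, string axisOrder)
{
    var path = new List<Coord> { src };
    Coord cur = src;
    foreach (char axis in axisOrder)
    {
        while (Get(cur, axis) != Get(dst, axis)) // finish this ring, then move on
        {
            int step = Math.Sign(Offset(Get(cur, axis), Get(dst, axis)));
            cur = Set(cur, axis, Wrap(Get(cur, axis) + step));
            path.Add(cur);
        }
    }
    return path;
}

static int Get(Coord c, char a) => a == 'x' ? c.X : a == 'y' ? c.Y : c.Z;

static Coord Set(Coord c, char a, int v) =>
    a == 'x' ? new Coord(v, c.Y, c.Z) :
    a == 'y' ? new Coord(c.X, v, c.Z) : new Coord(c.X, c.Y, v);

A multicast service would do the opposite: deliberately steer packets for different receivers onto shared path prefixes so they coalesce into a tree with fewer links.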
What is the benefit?
• Prototype testbed
  • 27 servers, 3x3x3 CamCube
  • Quad-core, 4 GB RAM, six 1 Gbps Ethernet ports
• Large-scale packet-level discrete event simulator
  • 8,000 servers, 20x20x20 CamCube
  • 1 Gbps links
• Service code runs unmodified on cluster and simulator
Service-level benefits
• High-throughput transport service
  • 1 sender to 2,000 receivers, iterated sequentially
  • 10,000 packets/flow, 1,500 bytes/packet
• Metric: throughput
• Shown: custom/base ratio
Service-level benefits
• Large-file multicast service
  • 8,000-server network
  • 1 multicast group
  • Group size varied from 0% to 100% of servers
• Metric: number of links in the multicast tree
• Shown: custom/base ratio
Service-level benefits
• Distributed object cache service
  • 8,000-server network
  • 8,000,000 key-object pairs, evenly distributed among servers
  • 800,000 lookups (100 per server), keys picked from a Zipf distribution
  • 1 primary + 8 replicas per key; replicas unpopulated initially
• Metric: path length to nearest hit
Network impact
• Ran all services simultaneously
  • No correlation in link usage
  • Reduction in link utilization
• Take-away: custom routing reduced network load and increased service-level performance
Symbiotic routing relations
• Multiple routing protocols running concurrently
  • Routing state shared with the base routing protocol
• Services
  • Use one or more routing protocols
  • Use the base protocol to simplify their custom protocols
• Network failures
  • Handled by the base protocol
  • Services route for the common case
[Figure: services A, B, C each drive their own routing protocols 1-3, which sit alongside the base routing protocol on top of the network]
Building a routing framework
• Simplify building custom routing protocols
• Routing:
  • Build routes from a set of intermediate points
    • Coordinates in the coordinate space
  • Services provide a forwarding function F (see the sketch below)
  • Framework routes between intermediate points
    • Uses the base routing service
    • Consistently remaps the coordinate space on node failure
• Queuing:
  • Services manage packet queues per link
  • Fair queuing between services per link
[Figure: F maps (packet, local coordinate) → next coordinate]
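The F signature below matches the cache-service code shown later in the deck; everything else (RoutingService, Packet, the framework hooks) is an assumed shape, not the actual framework API:

using System.Collections.Generic;

class Packet { /* opaque payload; assumed type */ }

abstract class RoutingService
{
    // Service-provided forwarding function: given a packet and the coordinate
    // (key) it has reached, return the next intermediate point(s).
    protected abstract List<ulong> F(int neighborIndex, ulong currentDestinationKey, Packet packet);

    // Framework side (sketch): on reaching an intermediate point, ask the
    // service for the next point(s), then hand the packet back to the base
    // routing service, which remaps coordinates consistently under failures.
    internal void OnReachedIntermediatePoint(Packet p, ulong hereKey, int inPort)
    {
        List<ulong> next = F(inPort, hereKey, p);
        if (next.Count == 0)
            DeliverLocally(p);                // final point: hand to the service
        else
            RouteViaBaseProtocol(p, next[0]); // or pick among next, e.g. least loaded
    }

    // Hypothetical hooks the framework would supply.
    protected virtual void DeliverLocally(Packet p) { }
    protected virtual void RouteViaBaseProtocol(Packet p, ulong nextKey) { }
}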
Example: cache service
• Distributed key-object caching
• Key space mapped onto the CamCube coordinate space
• Per-key caches
  • Evenly distributed across the coordinate space
  • Cache coordinates easily computable from the key (one plausible mapping is sketched below)
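The deck does not spell the mapping out, so the following is only one plausible scheme: fold the 64-bit key into a cell of the K x K x K cube for the primary, and place the per-key caches at evenly spaced diagonal shifts. KeyToCoord and CacheCoords are hypothetical names; Coord and K come from the earlier sketch:

// Hypothetical mapping: primary coordinate derived from the key, caches at
// fixed diagonal offsets so each key's caches are spread across the cube.
static Coord KeyToCoord(ulong key)
{
    ulong k = (ulong)TorusRouting.K;
    ulong cell = key % (k * k * k);  // fold the 64-bit key space into the cube
    return new Coord((int)(cell % k), (int)(cell / k % k), (int)(cell / (k * k)));
}

static IEnumerable<Coord> CacheCoords(ulong key, int cacheCount)
{
    Coord p = KeyToCoord(key);
    for (int i = 1; i <= cacheCount; i++)
    {
        int shift = i * TorusRouting.K / (cacheCount + 1); // evenly spaced shifts
        yield return new Coord((p.X + shift) % TorusRouting.K,
                               (p.Y + shift) % TorusRouting.K,
                               (p.Z + shift) % TorusRouting.K);
    }
}

Any deterministic mapping works, as long as every server can compute the cache coordinates from the key alone.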
Cache service routing
• Routing
  • Source → nearest cache or primary
  • On cache miss: cache → primary
  • Populate cache: primary → cache
• F function computed at
  • Source
  • Cache
  • Primary
• Different packets can use different links
  • Accommodates network conditions, e.g. congestion
[Figure: lookup path from the source/querier via the nearest cache to the primary server; F is evaluated at each intermediate point]
Handling failures
• On link failure
  • Base protocol routes around the failure
• On replica server failure
  • Key space consistently remapped by the framework (a minimal sketch follows)
  • F function does not change
• Developer only targets the common case
  • Framework handles the corner cases
[Figure: same lookup path from source/querier via nearest cache to primary server, rerouted transparently after a failure]
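A minimal sketch of what consistent remapping could mean, assuming a coordinate is owned by the nearest live server, with ties broken deterministically so every node computes the same owner (OwnerOf, TorusDistance, and Less are hypothetical; Offset comes from the earlier sketch):

// Hypothetical remapping: a failed server's coordinates are served by the
// nearest live server, ties broken by (X, Y, Z) order. F keeps returning the
// original coordinates; only the delivery point moves.
static Coord OwnerOf(Coord target, IEnumerable<Coord> liveServers)
{
    Coord best = default; int bestDist = int.MaxValue;
    foreach (Coord s in liveServers)
    {
        int d = TorusDistance(s, target);
        if (d < bestDist || (d == bestDist && Less(s, best)))
        {
            best = s;
            bestDist = d;
        }
    }
    return best;
}

static int TorusDistance(Coord a, Coord b) =>
    Math.Abs(TorusRouting.Offset(a.X, b.X)) +
    Math.Abs(TorusRouting.Offset(a.Y, b.Y)) +
    Math.Abs(TorusRouting.Offset(a.Z, b.Z));

static bool Less(Coord a, Coord b) =>
    a.X != b.X ? a.X < b.X : a.Y != b.Y ? a.Y < b.Y : a.Z < b.Z;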
Cache service F function

protected override List<ulong> F(int neighborIndex, ulong currentDestinationKey, Packet packet)
{
    List<ulong> nextKeys = new List<ulong>();

    // extract packet details
    ulong itemKey = LookupPacket.GetItemKey(packet);
    ulong sourceKey = LookupPacket.GetSourceKey(packet);

    if (currentDestinationKey == sourceKey) // am I the source?
    {
        // if at source, route to the nearest cache or the primary
        // get the list of caches (using KeyValueStore static method)
        ulong[] cachesKey = ServiceKeyValueStore.GetCaches(itemKey);

        // iterate over all cache nodes and keep the closest ones
        int minDistance = int.MaxValue;
        foreach (ulong cacheKey in cachesKey)
        {
            int distance = node.nodeid.DistanceTo(LongKeyToKeyCoord(cacheKey));
            if (distance < minDistance)
            {
                nextKeys.Clear();
                nextKeys.Add(cacheKey);
                minDistance = distance;
            }
            else if (distance == minDistance)
                nextKeys.Add(cacheKey);
        }
    }
    else if (currentDestinationKey != itemKey) // am I the cache?
    {
        // if cache miss, route to the primary
        nextKeys.Add(itemKey);
    }

    return nextKeys;
}
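Note that F returns a list of keys rather than a single key: when several caches are equidistant, the framework is free to pick among them packet by packet, which is what lets different packets of the same lookup take different links under congestion, as described two slides earlier.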
Framework overhead
• Benchmark performance
  • Single server in the testbed
  • Communicate with all six 1-hop neighbors (Tx + Rx)
• Sustained 11.8 Gbps throughput
  • Out of the 12 Gbps upper bound (six 1 Gbps ports, full duplex)
  • Remaining gap is user-space routing overhead
What have we done
• Services only specify a routing "skeleton"
  • Framework fills in the details
  • Control messages and failures handled by the framework
  • Reduces routing complexity for services
• Opt-in basis
  • Services define custom protocols only if they need to
Network requirements
• Per-service routing is not limited to CamCube
• The network need only provide:
  • Path diversity
    • Provides routing options
  • Topology awareness
    • Exposes server locality and connectivity
  • Programmable components
    • Allow per-service routing logic
Conclusions
• Data center networking from the developer's perspective
  • Custom routing protocols to optimize for application-level performance requirements
• Presented a framework for custom routing protocols
  • Applications specify a forwarding function (F) and queuing hints
  • Framework manages network state, control messages, and remapping on failure
• Multiple routing protocols running concurrently
  • Increase application-level performance
  • Decrease network load
Thank You! Questions? hussam@cs.cornell.edu
Cache service: insert throughput
• Ingress bandwidth bounded (3 front-ends)
• Disk I/O bounded
Cache service: lookup requests/second
• Ingress bandwidth bounded
Cache service: CPU utilization on FEs
• 3 front-ends
• 27 front-ends