Exploring Efficient and Scalable Multicast Routing in Future Data Center Networks
Dan Li, Jiangwei Yu, Junbiao Yu, Jianping Wu
Tsinghua University
Presented by DENG Xiang
Outline
I Introduction and background
II Build an efficient Multicast tree
III Make Multicast routing scalable
IV Evaluation
V Conclusion
Introduction and background
Data centers
• the core of cloud services
• online cloud applications
• back-end infrastructural computations
• servers and switches
• popularity of group communication
Multicast
• saves network traffic
• improves application throughput
When Multicast meets data center networks...
Problem A: Data center topologies usually have high link density, and traditional Multicast technologies can waste links severely.
Problem B: Most data center designs rely largely on low-end commodity switches, for economic and scalability reasons.
Build an efficient Multicast tree
Data center network architectures
• BCube
• PortLand
• VL2 (similar to PortLand)
BCube
• constructed recursively: BCube(n,0), BCube(n,1), ..., BCube(n,k)
• each server has k+1 ports
• each switch has n ports
• number of servers: $n^{k+1}$
PortLand
• three levels and n pods
• aggregation level and edge level: n/2 switches with n ports each, per pod
• core level: $(n/2)^2$ switches with n ports each
• number of servers: $n^3/4$
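The sizing rules on the two slides above can be checked with a short sketch. The function names are illustrative, and the BCube switch count (k+1 levels of $n^k$ switches) is a standard property of the topology that the slides do not state. The two calls use the configurations evaluated later in the talk.

```python
def bcube_size(n: int, k: int) -> dict:
    """Server and switch counts for BCube(n, k) built from n-port switches."""
    return {
        "servers": n ** (k + 1),
        # Assumption not on the slide: k+1 switch levels, n^k switches per level.
        "switches": (k + 1) * n ** k,
        "ports_per_server": k + 1,
    }


def portland_size(n: int) -> dict:
    """Counts for a three-level PortLand topology with n pods of n-port switches."""
    if n <= 0 or n % 2:
        raise ValueError("n must be a positive even number")
    half = n // 2
    return {
        "servers": n ** 3 // 4,
        "edge_switches": n * half,         # n/2 per pod, n pods
        "aggregation_switches": n * half,  # n/2 per pod, n pods
        "core_switches": half ** 2,
    }


if __name__ == "__main__":
    print(bcube_size(8, 3))
    print(portland_size(48))
```

For BCube(8,3) this gives 4096 servers, and for PortLand with 48-port switches it gives 27648 servers.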
Common themes across these architectures
• low-end switches are used to keep cost down
• link density is high
• the data center structure is built in a hierarchical and regular way
How can an efficient Multicast tree be built to save network traffic?
• traditional receiver-driven Multicast routing protocols, originally designed for the Internet, such as PIM
• approximation algorithms for the Steiner tree problem (building a Multicast tree of lowest cost that covers the given nodes)
• source-driven tree building algorithm (the proposed algorithm)
Group spanning graph
• each hop is a stage
• stage 0 includes the sender only
• stage d is composed of receivers
• d is the diameter of the data center topology
Build the Multicast tree by expanding from source to receivers over the group spanning graph, with the tree node set chosen at each stage strictly covering the downstream receivers.
Definition of cover: A covers B if and only if, for each node in B, there exists a directed path to it from some node in A.
A strictly covers B when A covers B and no proper subset of A covers B.
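The two definitions above can be made concrete with a minimal sketch. It assumes the group spanning graph is given as a plain adjacency dictionary of directed edges. The node names and helper functions are hypothetical and are not taken from the authors' implementation.

```python
def reachable(graph, sources):
    """Return every node reachable from any node in `sources` along directed edges."""
    seen = set(sources)
    stack = list(sources)
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def covers(graph, a, b):
    """A covers B: every node in B has a directed path from some node in A."""
    return set(b) <= reachable(graph, a)


def strictly_covers(graph, a, b):
    """A strictly covers B: A covers B and no proper subset of A does."""
    a = set(a)
    if not covers(graph, a, b):
        return False
    # Coverage is monotone (adding nodes never loses coverage), so it is
    # enough to check that dropping any single node breaks coverage.
    return all(not covers(graph, a - {x}, b) for x in a)


if __name__ == "__main__":
    # Hypothetical stage: three switches feeding three receivers.
    g = {"w1": ["r1", "r2"], "w2": ["r2", "r3"], "w3": ["r3"]}
    receivers = {"r1", "r2", "r3"}
    print(covers(g, {"w1", "w2"}, receivers))
    print(strictly_covers(g, {"w1", "w2"}, receivers))
    print(strictly_covers(g, {"w1", "w2", "w3"}, receivers))
```

In the example, {w1, w2} strictly covers the receivers, while {w1, w2, w3} covers them but not strictly, because w3 is redundant.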
Algorithm details in BCube:
• select the set of servers E from stage 2 that are covered by both the sender s and a single switch W in stage 1
• the problem then splits into |E|+1 sub-problems, one per BCube(n,k-1)
• |E| of the BCube(n,k-1)s each take a server p in E as the source, with the receivers in stage 2*(k+1) covered by p as the receiver set
• the other BCube(n,k-1) takes s as the source, with the receivers in stage 2*k that are covered by s but not by W as the receiver set
Algorithm details in PortLand:
• From the first stage up to the stage of core-level switches, any single path can be chosen, because any single core-level switch covers all downstream receivers.
• From the stage of core-level switches down to the final stage of receivers, the paths are fixed by the interconnection rule in PortLand.
Make Multicast routing scalable
A packet forwarding mechanism that supports a massive number of Multicast groups is necessary:
• in-packet Bloom Filter: used alone, it wastes significant bandwidth for large groups.
• in-switch forwarding table: used alone, it needs very large memory space.
The bandwidth waste of the in-packet Bloom Filter comes from:
• the Bloom Filter field in the packet brings network bandwidth cost.
• false-positive forwarding by the Bloom Filter causes traffic leakage.
• switches receiving packets by false-positive forwarding may further forward packets to other switches, incurring not only additional traffic leakage but also possible loops.
We define the Bandwidth Overhead Ratio r to describe the in-packet Bloom Filter:
• p: the packet length (including the Bloom Filter field)
• f: the length of the in-packet Bloom Filter field
• t: the number of links in the Multicast tree
• c: the number of actual links covered by Bloom Filter based forwarding
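One reading consistent with these four definitions, assuming r measures the additional traffic relative to the payload traffic the tree must carry, is $r = \frac{p \cdot c - (p - f) \cdot t}{(p - f) \cdot t}$. The sketch below computes r under that assumption. The function name and the sample numbers are illustrative only.

```python
def bandwidth_overhead_ratio(p: int, f: int, t: int, c: int) -> float:
    """Assumed form: (traffic actually sent - payload traffic) / payload traffic.

    p: packet length in bytes, including the Bloom Filter field
    f: length of the Bloom Filter field in bytes
    t: number of links in the Multicast tree
    c: number of links actually covered by Bloom Filter based forwarding
    """
    if not 0 <= f < p:
        raise ValueError("need 0 <= f < p")
    if t <= 0:
        raise ValueError("the tree must contain at least one link")
    if c < t:
        # A Bloom Filter has no false negatives, so every tree link is covered.
        raise ValueError("c cannot be smaller than t")
    payload_traffic = (p - f) * t
    actual_traffic = p * c
    return (actual_traffic - payload_traffic) / payload_traffic


if __name__ == "__main__":
    # 1500-byte packets with a 32-byte filter; tree sizes are made up.
    print(bandwidth_overhead_ratio(p=1500, f=32, t=100, c=100))  # field cost only
    print(bandwidth_overhead_ratio(p=1500, f=32, t=100, c=110))  # plus leakage
```

With no false positives (c = t) the ratio reduces to f / (p - f), the cost of carrying the filter field alone.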
With a packet size of 1500 bytes, the relation among r, f and group size:
[Figure: two panels, BCube(8,3) and PortLand with 48-port switches]
The in-packet Bloom Filter does not accommodate large groups, so a combination routing scheme is proposed.
a) In-packet Bloom Filters are used for small groups to save routing space in switches, while routing entries are installed into switches for large groups to reduce bandwidth overhead.
b) Intermediate switches/servers receiving a Multicast packet check a special TAG in the packet to determine whether to forward it via the in-packet Bloom Filter or by looking up the in-switch forwarding table.
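The dispatch in (a) and (b) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the class names, the TAG values, and the group-size threshold are hypothetical, and an exact set stands in for the Bloom Filter field so the example stays deterministic.

```python
from dataclasses import dataclass, field

TAG_BLOOM = "bloom"  # hypothetical TAG value: forward via in-packet Bloom Filter
TAG_TABLE = "table"  # hypothetical TAG value: forward via in-switch table


@dataclass
class Packet:
    group: str
    tag: str
    # Stand-in for the Bloom Filter field (an exact set, so no false positives).
    filter_members: frozenset = frozenset()


@dataclass
class Switch:
    name: str
    neighbors: list
    table: dict = field(default_factory=dict)  # group -> list of next hops

    def next_hops(self, pkt: Packet) -> list:
        """Check the TAG, then forward by filter membership or by table lookup."""
        if pkt.tag == TAG_BLOOM:
            return [n for n in self.neighbors if n in pkt.filter_members]
        return list(self.table.get(pkt.group, []))


def choose_tag(group_size: int, threshold: int) -> str:
    """Small groups use the in-packet filter; large groups get switch entries."""
    return TAG_BLOOM if group_size <= threshold else TAG_TABLE


if __name__ == "__main__":
    sw = Switch("e1", ["a1", "a2", "h1"], table={"g2": ["a2"]})
    small = Packet("g1", choose_tag(5, threshold=10), frozenset({"a1", "h1"}))
    large = Packet("g2", choose_tag(500, threshold=10))
    print(sw.next_hops(small))
    print(sw.next_hops(large))
```

Only groups above the threshold consume table entries in switches, so switch memory is reserved for the groups whose filter-based forwarding would waste the most bandwidth.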
Two ways of encoding the in-packet Bloom Filter
• node-based encoding: the elements are the tree nodes, including switches and servers (this is the encoding chosen)
• link-based encoding: the elements are the directed physical links
False-positive forwarding caused by the in-packet Bloom Filter may result in loops.
The solution: a node forwards the packet only to those neighboring nodes (within the Bloom Filter) whose distance to the source is larger than its own.
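A minimal sketch of node-based encoding combined with this distance rule is shown below. The hash construction, the number of hash functions, and the node names are assumptions made for illustration. The 256-bit size corresponds to a 32-byte filter field.

```python
import hashlib


class BloomFilter:
    """Small Bloom Filter over node identifiers (node-based encoding)."""

    def __init__(self, num_bits: int = 256, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes  # assumed value, not from the slides
        self.bits = 0

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def forward_targets(node, neighbors, dist_to_source, bloom):
    """Forward only to neighbors in the filter that are farther from the source."""
    return [
        n for n in neighbors
        if n in bloom and dist_to_source[n] > dist_to_source[node]
    ]


if __name__ == "__main__":
    bloom = BloomFilter()
    for tree_node in ["s", "w1", "h1", "h2"]:  # hypothetical tree nodes
        bloom.add(tree_node)
    dist = {"s": 0, "w1": 1, "h1": 2, "h2": 2, "h3": 2}
    print(forward_targets("w1", ["s", "h1", "h2", "h3"], dist, bloom))
```

The sender s is in the filter but is never a forwarding target of w1, because its distance to the source is smaller. A packet can therefore only move away from the source, which rules out loops even when a false positive occurs.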
Evaluation
Evaluation of the source-driven tree building algorithm:
• topologies: BCube(8,3) and PortLand with 48-port switches
• 1 Gbps link speed
• 200 random-sized groups
• metrics: number of links in the tree, and computation time
[Figure: BCube and PortLand panels]
[Figure: BCube and PortLand panels]
Evaluation of the combination forwarding scheme with a 32-byte Bloom Filter:
Conclusion
Efficient and Scalable Multicast Routing in Future Data Center Networks
• an efficient Multicast tree building algorithm
• a combination forwarding scheme for scalable Multicast routing