220 likes | 232 Views
Explore a scalable and efficient reliable group data delivery service for data centers. Learn about the challenges, existing solutions, and new opportunities in data center networks. Discover the architecture, transport protocol, simulations, and experiments of this innovative data delivery system.
E N D
Datacast: A Scalable and Efficient Reliable Group Data Delivery Service for Data Centers Jiaxin Cao, Chuanxiong Guo, Guohan Lu, Yongqiang Xiong, YixinZheng, Yongguang Zhang, Yibo Zhu, Chen Chen University of Science and Technology of China Microsoft Research Asia Tsinghua University University of California, Santa Barbara University of Pennsylvania
Reliable Group Data Delivery The problem of RGDD is: In a data center network, given a data source, Src, and a set of receivers, R1, R2, …, Rn, how to reliably transmit bulk data from Src to all the receivers? <1,0> <1,1> <1,2> <1,3> <0,0> <0,1> <0,2> <0,3> 00 01 02 03 10 11 12 13 20 21 22 23 30 31 32 33 Data Data Data Data Data Data
Reliable Group Data Delivery • RGDD is important in DCNs: • Bootstrapping or OS upgrading. • Distributed file systems, e.g., GFS. • VM setup. • And more...
Reliable Group Data Delivery A good RGDD design should have the following properties: • Scalable (large group numbers and large group sizes) • High bandwidth efficiency
Existing solutions to RGDD Existing solutions can be classified into two categories: • Reliable IP multicast. Not scalable, e.g., ACK implosion. • End-host based overlays. Low bandwidth efficiency. None of the existing systems can perfectly achieve RGDD.
New opportunities in DCN Recently, there are two clear trends in DCN: • Multiple edge-disjoint Steiner trees for RGDD. • Practical packet caching abilities in network devices. We can cache packet! 00 00 <0,0> <1,0> 01 02 03 10 20 30 <1,1> <1,2> <1,3> <0,1> <0,2> <0,3> 11 21 31 12 22 32 13 23 33 11 12 13 21 22 23 31 32 33 <0,1> <0,2> <0,3> <1,1> <1,2> <1,3> 10 20 30 01 02 03
The architecture of Datacast How to efficiently transmit data in each Steiner tree? Master j Src R1 R2 Fabric Manager RGDD Group in Src RGDD Group i2 RGDD Group i1 Network Topology Master i IMD How to calculate multiple Steiner trees? R3 R4 R1 R2
Multiple edge-disjoint Steiner trees in DCN • Our multiple Steiner trees algorithm takes three steps: • Use specific algorithms to construct spanning trees. • Prune the spanning trees. • Use Breath First Search(BFS) to repair the trees broken by network failures. This algorithm is fast (O(k|V|) + O(|E|) + O(k|E|)) and efficient.
Datacast transport protocol • Datacast is built on top of Content Centric Network (CCN): Data Data Data Data Data 00 01 02 03 Data Data Inst Inst Data 10 11 12 13 Inst 20 21 22 23 Inst Data 30 31 32 33
Datacast transport protocol • Datacast uses a rate based congestion control algorithm at the source side:
Datacast transport protocol • We built a fluid model for Datacast congestion control algorithm. • and are the current data positions of the data source and the slowest receiver. • R is the “desired” rate of the slowest receiver.
Datacast transport protocol • Theorem 1 (Cache Usage). Datacast works at the full rate, i.e., the rate of the slowest receiver, R, if the cache size, C, satisfies When R is 100Mbps, is 5Mbps, T is 1ms, C only needs to be greater than 125KB.
Datacast transport protocol • Theorem 2 (Duplicate Data Ratio). When Datacast works at the full rate, the duplicate data ratio of Datacast is When R is 100Mbps, is 5Mbps, T is 1ms, and MTU is 1.5KB, the duplicate data ratio is 1.19%.
Simulation: multiple Steiner trees algorithm • We tested our algorithm in Fattree(24,3), BCube(8, 3), Torus(16, 3) under the link failure rates (LFR) of 1%, 3% and 5%. Running times. Steiner tree numbers.
Simulation: Datacast congestion control • Simulation Setup: • BCube(4, 1) with 1Gbps links. • The link from <0,0> to 02 is slowed down to 100Mbps. • , T = 1ms • MTU = 1.5KB 00 00 <1,0> <0,0> 10 20 30 01 02 03 <0,1> <0,2> <0,3> <1,1> <1,2> <1,3> 11 12 13 21 22 23 31 32 33 11 21 31 12 22 32 13 23 33 <1,1> <1,2> <1,3> <0,1> <0,2> <0,3> 01 02 03 10 20 30 Steiner Tree 1. Steiner Tree 2.
Simulation: Datacast congestion control • Based on Theorem 1, Datacast needs 125KB caches to work at full rate. • Based on Theorem 2, the duplicate data ratios is 1.19%.
Simulation: Datacast congestion control • Compare with BitTorrent. Fattree. BCube. Torus.
Experiment: Datacast congestion control • We build Datacast with ServerSwitch platform, and evaluate it in a BCube(4, 1). • = 5Mbps, T = 1ms, MTU = 8KB. • Based on Theorem 1, Datacast works at the full rate when C > 256KB. • When C = 64KB, the average throughput is 91.998Mbps. • When C = 256KB, the average throughput is 99.595Mbps. • Based on Theorem 2, the duplicate data ratio should be 3.48%. • When Datacast works at the full rate, the measured duplicate data ratio is 3.10%.
Experiment: Datacast congestion control • We compare Datacast with BitTorrent. We use both of them to transmit 4GB data.
Related work • Reliable IP multicast • Pgm congestion control (pgmcc) • Active Reliable Multicast (ARM) • End-host based overlays • SplitStream • End System Multicast • Cornet
Conclusion • In this paper, we propose Datacast which • Calculates multiple edge-disjoint Steiner trees in DCNs • Uses CCN to turn hard group states to soft packet caching • Uses a simple rate-based AIMD congestion control algorithm to achieve high efficiency Datacast is scalable and achieves high bandwidth efficiency