A Virtual Circuit Multicast Transport Protocol (VCMTP) for Scientific Data Distribution. Jie Li and Malathi Veeraraghavan University of Virginia Steve Emmerson University Corporation for Atmospheric Research Robert D. Russell University of New Hampshire April 23, 2013.
A Virtual Circuit Multicast Transport Protocol (VCMTP) for Scientific Data Distribution Jie Li and Malathi Veeraraghavan University of Virginia Steve Emmerson University Corporation for Atmospheric Research Robert D. Russell University of New Hampshire April 23, 2013 • This work was supported by NSF grants OCI-1038058 and OCI-1127340, and DOE grants DE-SC002350 and DE-SC0007341
Background • Internet Data Distribution (IDD) Project • Developed by University Corporation for Atmospheric Research (UCAR) • Distributes real-time meteorology data • 10 GB/hour data generation rate • Subscriber base: 170 institutions • Software used for distribution: Local Data Manager (LDM)
Question • Which of these network services is best suited for IDD data? • IP routed service + unicast TCP (current mechanism) • Static circuits (leased lines) • if the data flow is continuous, is this an option? • Scheduled dynamic circuit service (DCS) • if the data flow is long-lived, is this an option? • P2P • Multicast
To answer this question • Per-flow data characteristics insufficient • typical classification: • loss-sensitive, high throughput • delay-sensitive, low latency • Instead, need distribution topology • consider whole network view
CONDUIT data • Installed and configured the LDM to receive CONDUIT data from UCAR • Parsed and analyzed the log files for received data (9 sample days) • Peak throughput: 250 MB/minute (SD: 28.8 MB/minute) • Total size of generated data: ~60 GB/day (SD: 0.3 GB/day)
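The per-minute peak and daily total above can be derived from parsed log records roughly as follows. This is a minimal sketch, not the authors' analysis script: the record layout `(epoch_seconds, size_bytes)` and both function names are assumptions, and MB and GB are taken as 10^6 and 10^9 bytes.

```python
from collections import defaultdict


def peak_mb_per_minute(records):
    """Largest number of megabytes received in any one-minute bucket.

    `records` is a hypothetical iterable of (epoch_seconds, size_bytes)
    tuples, one per received data product.
    """
    per_minute = defaultdict(int)
    for timestamp, size_bytes in records:
        per_minute[int(timestamp) // 60] += size_bytes
    return max(per_minute.values()) / 1e6


def total_gb(records):
    """Total volume of all records, in gigabytes."""
    return sum(size_bytes for _, size_bytes in records) / 1e9


# Made-up records: two products in the first minute, one in the second.
sample = [(0, 100_000_000), (30, 150_000_000), (60, 50_000_000)]
print(peak_mb_per_minute(sample), total_gb(sample))
```

Running this once per sample day and taking the standard deviation of the results would give figures comparable to the SD values quoted on the slide.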
Distribution structure • Downloaded and parsed real-time statistics of the CONDUIT feed tree • Data Distribution Topology of the CONDUIT feedtype • For the max fan-out of 104 receivers, the peak bandwidth requirement is 104 * 250 MB/minute ≈ 3.5 Gbps • This is just for a single feedtype of a single application CONDUIT Feed Tree Topology Information * This maximum fan-out number is for the UCAR site (idd.unidata.ucar.edu)
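The peak-bandwidth figure can be checked with a short unit conversion. The sketch below assumes 1 MB = 10^6 bytes; the function name is illustrative only.

```python
def peak_bandwidth_gbps(fan_out, peak_mb_per_minute):
    """Aggregate sender bandwidth for unicast fan-out, in Gbps.

    Assumes 1 MB = 10**6 bytes and that every receiver is served at the
    same peak rate at the same time.
    """
    bits_per_second = fan_out * peak_mb_per_minute * 1e6 * 8 / 60
    return bits_per_second / 1e9


print(round(peak_bandwidth_gbps(104, 250), 2))  # 3.47
```

With unicast delivery the requirement grows linearly with fan-out, while a single multicast stream at the same peak rate needs about 0.033 Gbps on the sender's access link.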
CONDUIT distribution topology http://www.unidata.ucar.edu/cgi-bin/rtstats/rtstats_topogif?CONDUIT
Answer to question • Different network service types • Static unicast VCs: unsuitable • Divide NCAR access link bandwidth between 104 subscribers: if 10 Gbps, then ~10 Mbps per subscriber • Subscribers would like to receive the data ASAP (a low-rate VC will increase latency) • Dynamic unicast VCs: unsuitable • For the worst-case fan-out of 104, the total delay will be greater than with IP service, because a new circuit must be set up for each receiver, and this can only be done after the transfer to the previous receiver is complete and the circuit to that receiver is released. • Multicast: can save bandwidth and computing resources
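The serialization argument for dynamic unicast VCs can be made concrete with a simple delay model. This is a sketch under stated assumptions: circuits are used strictly one after another, and the setup, transfer, and release times below are hypothetical values, not measurements from the talk.

```python
def last_receiver_delay(n_receivers, setup_s, transfer_s, release_s):
    """Time until the last receiver has the data when circuits are
    set up, used, and released strictly one at a time.

    Every receiver before the last costs a full setup + transfer +
    release cycle; the last one only needs setup + transfer.
    """
    full_cycle = setup_s + transfer_s + release_s
    return (n_receivers - 1) * full_cycle + setup_s + transfer_s


# Hypothetical timings: 5 s setup, 1 s transfer, 1 s release.
print(last_receiver_delay(104, setup_s=5, transfer_s=1, release_s=1))
```

Even when the transfer itself is short, the per-circuit setup and release overhead is paid once per receiver, so the delay seen by the last receiver grows linearly with the fan-out.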
New options: multicast and P2P • Multicast • Pros: total delay for distributing the data to the receivers will be lower for a given computing capacity of the upstream servers, or conversely, the same transfer delay can be achieved as with IP-routed service or P2P but with smaller upstream server computing capacity. • Cons: one or more slow receivers can slow down everyone • P2P • Pros: scales better with the number of receivers; suitable when files are obtained by different participants at different times • Cons: not suitable for real-time or near real-time delivery (which is a key requirement of IDD)
VCMTP Requirements • Goal: Design and implement a reliable and scalable transport protocol for data distribution over high-speed multipoint virtual circuits • Requirements • Reliability: error control, flow control • Scalability: support at least hundreds of receivers • High-speed multicast: support Gbps transfers
VCMTP Operational Overview • A Negative Acknowledgment (NACK) based reliable transport protocol • Data blocks transmitted over a multicast network service (can be unreliable) • Retransmissions carried over a reliable unicast service (e.g., TCP)
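The NACK-based scheme can be illustrated with a small single-process simulation. This is a minimal sketch, not the VCMTP implementation: the function names are hypothetical, packet loss is modeled as independent random drops, the receiver is assumed to know the total block count, and the reliable unicast retransmission path is stood in for by a direct copy.

```python
import random


def multicast(blocks, loss_rate, rng):
    """Unreliable multicast: each block is dropped independently."""
    return {seq: data for seq, data in blocks.items()
            if rng.random() >= loss_rate}


def missing_blocks(received, total_blocks):
    """Sequence numbers the receiver must NACK."""
    return [seq for seq in range(total_blocks) if seq not in received]


def transfer(data, block_size, loss_rate, seed=0):
    """Send `data` once over lossy multicast, then repair the gaps."""
    rng = random.Random(seed)
    blocks = {i: data[off:off + block_size]
              for i, off in enumerate(range(0, len(data), block_size))}
    received = multicast(blocks, loss_rate, rng)
    nacks = missing_blocks(received, len(blocks))
    for seq in nacks:
        # Stand-in for a retransmission over a reliable unicast
        # connection (e.g., TCP), which cannot lose the block.
        received[seq] = blocks[seq]
    rebuilt = b"".join(received[i] for i in range(len(blocks)))
    return rebuilt, nacks


payload = bytes(range(256)) * 40
rebuilt, nacks = transfer(payload, block_size=1024, loss_rate=0.3)
print(rebuilt == payload, len(nacks))
```

Because receivers report only what they missed, a loss-free receiver sends no feedback at all, which is what lets a NACK-based design scale to many receivers.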
VCMTP Prototyping • A user-level library implemented in C++ for the Linux OS environment • Asynchronous programming model • Simultaneous data multicast and retransmission • [Figure: VCMTP Sender Process (Coordinator Thread, Sending Thread, Retransmission Threads 1 … N) and VCMTP Receiver Processes 1 … N (each with a Receiving Thread and a Retransmission Request Thread)]
Evaluation Metrics for Continuous File Transfers • Metric for fast receivers: Throughput • n_f: number of fast receivers • m: number of continuously sent files • F_i: size of file i • T_i,vcmtp: transfer time for file i • Metric for slow receivers: Robustness • n_s: number of slow receivers • m: number of continuously sent files • S_ij: an indicator variable that is set to 1 if file i was successfully received at receiver j, or 0 otherwise
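The slide's formulas are not present in this text, so the sketch below shows only one plausible reading of the two metrics from the symbol definitions above. It assumes throughput is total bytes over total transfer time per fast receiver, averaged over the fast receivers, and robustness is the percentage of (file, slow receiver) pairs delivered successfully. Both function names are hypothetical.

```python
def fast_receiver_throughput(file_sizes, times_per_receiver):
    """Assumed throughput metric, in bytes per second.

    file_sizes[i] is F_i; times_per_receiver[j][i] is the transfer
    time of file i at fast receiver j.
    """
    total_bytes = sum(file_sizes)
    per_receiver = [total_bytes / sum(times) for times in times_per_receiver]
    return sum(per_receiver) / len(per_receiver)


def robustness_percent(S):
    """Assumed robustness metric: share of successful deliveries.

    S[i][j] is 1 if file i was received at slow receiver j, else 0.
    """
    m = len(S)
    n_s = len(S[0])
    return 100.0 * sum(sum(row) for row in S) / (m * n_s)


print(fast_receiver_throughput([100, 300], [[1, 3], [2, 2]]))
print(robustness_percent([[1, 0], [1, 1]]))
```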
Experimental Evaluation: Throughput • Experiments conducted in the Emulab testbed (hosted by Univ. of Utah) • 40% of the receivers were slow receivers, which experienced random packet drops at different rates • Rho is the traffic intensity calculated from the average file size (Pareto distribution) and inter-arrival time (exponential distribution); link rate = 100 Mbps • Experiment: 500 files; repeat 5 times
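A workload of this kind can be sketched as follows, assuming the usual offered-load definition Rho = (mean file size in bits) / (mean inter-arrival time × link rate). The Pareto shape, minimum file size, and inter-arrival mean are hypothetical values chosen for illustration; the slides do not state the ones used in the experiments.

```python
import random


def traffic_intensity(mean_file_bytes, mean_interarrival_s, link_bps=100e6):
    """Offered load relative to link capacity (assumed definition)."""
    return mean_file_bytes * 8 / (mean_interarrival_s * link_bps)


def generate_workload(n_files, alpha, min_bytes, mean_interarrival_s, seed=0):
    """Pareto file sizes and exponential inter-arrival gaps.

    random.paretovariate(alpha) returns values >= 1, so scaling by
    min_bytes gives a Pareto distribution with that minimum size.
    """
    rng = random.Random(seed)
    sizes = [min_bytes * rng.paretovariate(alpha) for _ in range(n_files)]
    gaps = [rng.expovariate(1.0 / mean_interarrival_s) for _ in range(n_files)]
    return sizes, gaps


# Hypothetical parameters: shape 2 and minimum 500 kB give a
# theoretical mean file size of alpha * min / (alpha - 1) = 1 MB.
alpha, min_bytes, mean_gap = 2.0, 500_000, 0.2
mean_size = alpha * min_bytes / (alpha - 1)
print(traffic_intensity(mean_size, mean_gap))
sizes, gaps = generate_workload(500, alpha, min_bytes, mean_gap)
```

Sweeping Rho then amounts to holding the file-size distribution fixed and varying the mean inter-arrival time.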
Key Evaluation Observations • Increase in total number of receivers (and hence number of slow receivers) has adverse impact on both robustness and throughput because of resource contention • Both robustness and throughput decrease as traffic intensity (Rho) or loss rate increases • The sending-side retransmission timeout factor offers a knob for trading off robustness against throughput
Summary • Multicast VCs are suitable for scientific data distribution applications • VCMTP, a reliable multicast transport protocol, was designed, prototyped, and evaluated • Tradeoff between robustness and throughput for continuous file delivery
Thank You! & Questions?