Research presented by Chris Miller & Pramita Mitra on distributing large datasets efficiently over distributed networks. The work uses the CCL storage pool as a model for sequential and parallel file distribution, with the aim of making better use of network resources, and evaluates a Best Neighbor Approximation method for improving distribution speed.
Large Scale File Distribution: Sequential Branching Distribution
Final Presentation, Grad Operating Systems
Presented by Chris Miller & Pramita Mitra
Dec 13, 2006
Problem Statement
• Research requires distribution of large datasets on distributed networks
• Methods such as multicast are too complicated to implement reliably
• Tools available for file distribution
  • Chirp
  • Parrot
• Algorithm needed to efficiently schedule the distribution of files
Solution
• Using the CCL storage pool as a model of a distributed network
• Using small, measured steps to find which aspects of distribution work best in implementation
• Sequential distribution
• Parallel distribution
[Figure: Distributor sending in Stage 1, Stage 2, …, Stage n. Inefficient use of network resources. Total time for distribution: O(n).]
[Figure: Distributor sending to Node 1, Node 2, …, Node n. Total time for distribution: O(n).]
Sequential Branching Distribution
[Figure: Nodeset arranged as a tree rooted at the Distributor. Thirdput transfers: one in Stage 1, two in Stage 2, four in Stage 3.]
Total time for distribution: O(log₂ n)
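The doubling pattern in the diagram above (one transfer in Stage 1, two in Stage 2, four in Stage 3) can be sketched as a small scheduling routine. This is a minimal illustration, not the presenters' implementation: the function name `branching_schedule`, the host names, and the assumption that every transfer in a stage takes the same time are all hypothetical. In a real run each (source, target) pair would presumably correspond to one third-party transfer between storage nodes.

```python
def branching_schedule(nodes, distributor="distributor"):
    """Return a list of stages; each stage is a list of (source, target) pairs.

    Assumption (not from the slides): in every stage, each host that already
    holds the file sends it to at most one host that does not.
    """
    holders = [distributor]   # hosts that already hold the file
    pending = list(nodes)     # hosts still waiting for the file
    stages = []
    while pending:
        # One target per current holder, so the holder count can double.
        batch = pending[:len(holders)]
        pending = pending[len(batch):]
        stages.append(list(zip(holders, batch)))
        holders.extend(batch)
    return stages


if __name__ == "__main__":
    # Hypothetical node names, used only for illustration.
    nodes = [f"node{i}" for i in range(1, 8)]
    for number, stage in enumerate(branching_schedule(nodes), start=1):
        print(f"Stage {number}: {stage}")
```

Because the number of holders can double each stage, n nodes are covered in about log₂(n + 1) stages under these assumptions, compared with n stages when the distributor sends to one node at a time.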
Conclusions
• A fast and reliable distribution method is possible with simple file transfer methods
• Distribution system is fault tolerant for all nodes except the distributor node
• Latency measurement
  • moderate indicator of transfer rate
  • low overhead
• Small file transfer approximation
  • strong indicator of transfer rate
  • high overhead
• Performance is near O(log₂ n)
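The two indicators compared above (latency measurement and small file transfer approximation) both amount to probing candidate nodes and preferring the one with the lowest measured cost. The slides do not give the selection procedure, so the following is only one plausible sketch: `pick_best_neighbor`, the `probe` callback, and the sample values are hypothetical placeholders, not measured results.

```python
def pick_best_neighbor(candidates, probe):
    """Return (best_node, costs), where best_node has the lowest probe cost.

    `probe` is a hypothetical callback: it could return a measured round-trip
    latency (cheap, rougher estimate) or the time taken to push a small test
    file (more expensive, closer to the real transfer rate).
    """
    costs = {node: probe(node) for node in candidates}
    best = min(costs, key=costs.get)
    return best, costs


if __name__ == "__main__":
    # Placeholder costs in seconds, made up purely to exercise the function.
    sample_costs = {"node1": 0.030, "node2": 0.010, "node3": 0.020}
    best, costs = pick_best_neighbor(sample_costs, sample_costs.get)
    print(best, costs)
```

Swapping the `probe` callback is the only change needed to move between the two indicators, which makes the overhead trade-off noted in the conclusions easy to compare in practice.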