
Large Scale File Distribution Troy Raeder & Tanya Peters





Presentation Transcript


  1. Large Scale File Distribution Troy Raeder & Tanya Peters

  2. The Problem • Distribute a large file to some number of machines • Useful for deploying new programs and distributing data • Chirp_distribute, implemented last year, distributes files using a spanning tree • We want to improve upon the existing method to transfer files more efficiently • Choke points exist: multiple machines may all transfer files through a single router/switch • We also want to minimize failures, including permissions errors

  3. The Solution • Take advantage of network topology: transfer across routers and switches as soon as possible, then let machines in the same cluster transfer to each other • Using traceroute, we build a graph that represents the network; this is done as needed and saved in a file that is loaded at run time • Access control lists: if we know a source machine doesn't have permission to transfer to some target, we don't even try
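The topology step above could be sketched as follows. This is an illustrative sketch, not the actual chirp_distribute code: it assumes `traceroute -n`-style output (one hop per line, address in the second column) and clusters hosts by the last router hop before each target. All function names are hypothetical.

```python
def parse_last_router(traceroute_output):
    """Return the router hop closest to the target host, parsed from
    `traceroute -n`-style output (hypothetical helper; assumes each hop
    line looks like "<hop>  <address>  <rtt> ms")."""
    hops = [line.split()[1]
            for line in traceroute_output.splitlines()[1:]
            if line.split()]
    # The next-to-last hop is the router/switch nearest the target.
    return hops[-2] if len(hops) >= 2 else None

def build_clusters(router_by_host):
    """Group hosts into clusters keyed by their nearest router,
    forming the network graph used to schedule transfers."""
    clusters = {}
    for host, router in router_by_host.items():
        clusters.setdefault(router, []).append(host)
    return clusters
```

In practice the graph would be built once, saved to a file, and reloaded at run time, as the slide describes.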

  4. Network Topology

  5. Picking a Target: • Check if all clusters in the graph contain a copy of the file. • If some cluster does not, we copy to it. • Next, if some node within your cluster doesn't have the file, transfer to it. • Otherwise, pick some other node that doesn't have the file. • If a node is unable to transfer to nodes that don't have the file yet, it is removed from the list of possible sources.
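The target-picking priorities above can be sketched in a few lines. This is a simplified illustration under assumed data structures (`clusters` maps a cluster id to its hosts, `has_file` is the set of hosts holding a copy, `sources` is an ordered list of hosts able to serve the file); it is not the actual implementation.

```python
def pick_transfer(clusters, has_file, sources):
    """Choose a (source, target) pair following the slide's priorities:
    seed empty clusters first, then fill the source's own cluster,
    then any remaining host without the file."""
    if not sources:
        return None
    # 1. Seed any cluster that has no copy of the file yet.
    for members in clusters.values():
        if not any(h in has_file for h in members):
            return sources[0], members[0]
    # 2. Prefer a target in the same cluster as some source.
    for src in sources:
        cluster = next((m for m in clusters.values() if src in m), [])
        for h in cluster:
            if h not in has_file:
                return src, h
    # 3. Otherwise, any remaining host without the file.
    for members in clusters.values():
        for h in members:
            if h not in has_file:
                return sources[0], h
    return None  # every host already has a copy
```

A failing source would be dropped from `sources` by the caller, matching the slide's last bullet.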

  6. Initial Results • The current version of the algorithm doesn't always outperform the original • As expected, for smaller files and/or smaller numbers of hosts, the overhead costs us • For larger files and/or larger numbers of hosts, effects like timeouts can wash out the relative gains

  7. What's Next... • Pick source & target more intelligently • If an initial attempt to copy from some cluster A to cluster B fails, don't try transferring between these two clusters again unless no other possibilities exist • Try to manage straggler transfers • Dynamically set the timeout for transferring a single copy: set it to some multiple of the maximum or average transfer time seen so far • Hopefully the end result will be a significant improvement over the existing algorithm
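The dynamic timeout idea above could be sketched as below. The multiplier and floor values are illustrative assumptions, not values from chirp_distribute.

```python
def dynamic_timeout(transfer_times, multiplier=3.0, floor=30.0):
    """Timeout for the next single-copy transfer: a multiple of the
    slowest transfer observed so far, so stragglers are cut off
    relative to measured performance rather than a fixed constant.
    `multiplier` and `floor` (seconds) are hypothetical defaults."""
    if not transfer_times:
        return floor
    return max(floor, multiplier * max(transfer_times))
```

Using the mean instead of the max would tighten the cutoff at the risk of killing legitimately slow transfers.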
