
Performance Analysis of a Parallel Downloading Scheme from Mirror Sites Throughout the Internet


  1. Performance Analysis of a Parallel Downloading Scheme from Mirror Sites Throughout the Internet Allen Miu, Eugene Shih 6.892 Class Project December 3, 1999

  2. Overview • Problem Statement • Advantages/Disadvantages • Operation of Paraloading • Goals of Experiment • Setup of Experiment • Current Results • Summary • Questions

  3. Problem Statement: Is “Paraloading” Good? Paraloading is downloading a file from multiple mirror sites in parallel. [Diagram: a paraloader client downloading simultaneously from Mirrors A, B, and C]

  4. Advantages of Paraloading • Performance is proportional to the realized aggregate bandwidth of the parallel connections • Less prone to complete download failure than a single-connection download • Facilitates dynamic load balancing among the parallel connections • Facilitates reliable, out-of-order delivery (similar to Netscape)

  5. Disadvantages of Paraloading • Can be overly aggressive • Consumes more server resources • Overhead costs for scheduling, maintaining buffers, and sending block request messages • Only effective when mirror servers are available

  6. Step 1: Obtain Mirror List • Hard-coded • DNS? [Diagram: the paraloader obtains a mirror list naming Mirrors A, B, and C]
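The slide leaves the source of the mirror list open (hard-coded vs. DNS). As a minimal illustration of the hard-coded option, the list could simply be a fixed set of URLs naming the same file on different servers; the hostnames below are placeholders, not the mirrors used in the experiment.

```java
import java.net.URL;
import java.util.Arrays;
import java.util.List;

// Hypothetical hard-coded mirror list: each URL points at the same file on a
// different mirror server (placeholder hostnames, not the experiment's servers).
public class MirrorList {
    static List<URL> mirrors() throws Exception {
        return Arrays.asList(
            new URL("http://mirror-a.example.org/pub/file.tar.gz"),
            new URL("http://mirror-b.example.org/pub/file.tar.gz"),
            new URL("http://mirror-c.example.org/pub/file.tar.gz"));
    }
}
```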

  7. Step 2: Obtain File Length [Diagram: the paraloader asks one mirror for the length of the target file]
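One common way to learn the file length before dividing it into blocks is an HTTP HEAD request and the Content-Length header. The sketch below, using java.net.HttpURLConnection, assumes the mirror reports a valid Content-Length; it is illustrative, not the project's code.

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class FileLength {
    // Ask a mirror for the file size without downloading the body.
    static long fileLength(URL mirror) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) mirror.openConnection();
        conn.setRequestMethod("HEAD");           // headers only, no body
        long len = conn.getContentLengthLong();  // -1 if Content-Length is missing
        conn.disconnect();
        return len;
    }
}
```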

  8. Step 3: Send Block Requests [Diagram: the paraloader sends a block request to each of Mirrors A, B, and C]
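Block requests map directly onto HTTP/1.1 byte-range requests. A sketch of fetching one block from a mirror is shown below; it assumes the server honors Range headers (responding with 206 Partial Content), and the helper name is illustrative rather than taken from the project.

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class BlockFetcher {
    // Fetch bytes [offset, offset + size - 1] of the file from one mirror
    // using an HTTP/1.1 Range request.
    static byte[] fetchBlock(URL mirror, long offset, int size) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) mirror.openConnection();
        conn.setRequestProperty("Range",
                "bytes=" + offset + "-" + (offset + size - 1));
        byte[] block = new byte[size];
        try (InputStream in = conn.getInputStream()) {
            int read = 0;
            while (read < size) {
                int n = in.read(block, read, size - read);
                if (n < 0) break;          // server sent a short final block
                read += n;
            }
        }
        return block;
    }
}
```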

  9. Step 4: Re-order [Diagram: the paraloader re-orders the blocks returned by Mirrors A, B, and C]
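Because blocks from different mirrors complete out of order, one way to handle re-ordering is to write each block at its absolute byte offset in the output file instead of appending. A minimal sketch using java.io.RandomAccessFile (illustrative, not the project's code):

```java
import java.io.RandomAccessFile;

public class BlockStore {
    // Write a completed block at its offset in the output file, so blocks
    // may arrive in any order from the parallel connections.
    static void storeBlock(RandomAccessFile out, long offset, byte[] block)
            throws Exception {
        synchronized (out) {   // serialize writes from concurrent connections
            out.seek(offset);
            out.write(block);
        }
    }
}
```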

  10. Step 5: Send Next Request [Diagram: the paraloader sends the next block request to the first mirror that finishes its block]

  11. Goals of Experiment • Main goal: To compare the performance of serial and parallel downloading • To verify the results of Rodriguez et al. • To examine whether varying the degree of parallelism (the number of mirror servers used) affects performance • To gain experience with paraloading and to identify the issues involved in designing efficient paraloading systems

  12. Experiment Setup • Implemented a paraloader application in Java, using HTTP/1.1 (range requests and persistent connections) • Files are downloaded at MIT from 3 different sets (kernel, mars, tucows) of 7 mirror servers each • Degrees of parallelism examined: M = 1, 3, 5, 7 • Downloaded a 1MB and a 300KB file (S = 1MB, 300KB) at 1-hour intervals for 7 days • Block size = 32KB

  13. Results • Paraloading decreases download time relative to the average single-connection case • Speedup is far from the optimal case (the full aggregate bandwidth of the connections) • Gaps between block requests result in wasted bandwidth • The gaps are proportional to RTT • Congestion at the client? Possible, but unlikely.

  14. [Plot: S = 1MB]

  15. [Plot: S = 1MB]

  16. [Plot: S = 763KB, B = 30, M = 4]

  17. Acknowledgements • Dave Anderson • Dorothy Curtis • Wendi Heinzelmann • WIND Group

  18. Questions

  19. Summary of Contributions • Implemented a paraloader • Verified that paraloading indeed provides a performance gain… sometimes • Increasing the degree of parallelism improves overall performance • Performance gains are not as good as those reported by Rodriguez et al.

  20. Future Work • Examine how block size affects performance gain • Examine cost of paraloading • Implement and test various optimization techniques • Perform measurements at different client sites

  21. Paraloading Will Not Be Effective In All Situations • It is effective only when: • Clients have enough “slack” bandwidth capacity to open more than one connection • The parallel connections are bottleneck-disjoint • The target data on the mirror servers is consistent and static • Security and authentication services are installed where appropriate • Data transport is reliable • Mirror locations can be obtained quickly and easily

  22. Step-by-step Process of the Block Scheduling Paraloading Scheme 1. Obtain a list of mirror sites 2. Open a connection to a mirror server and obtain file length 3. Divide file length into blocks 4. Send a block request to each open connection 5. Wait for a response 6. Send a new block request to the first connection that finished downloading a block 7. Loop back to 5 until all blocks are retrieved
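Putting the steps together, a minimal sketch of the block-scheduling loop is shown below: one worker per mirror repeatedly takes the next pending block offset from a shared queue and fetches it with a Range request, so faster mirrors naturally serve more blocks (the dynamic load balancing noted earlier). Class and method names are illustrative, and this is a sketch of the scheme as described on this slide, not the project's actual implementation; fileLength and fetchBlock are compact versions of the helpers sketched under Steps 2 and 3.

```java
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class Paraloader {
    static final int BLOCK = 32 * 1024;                    // 32KB blocks, as in the experiment

    public static void download(List<URL> mirrors, String dest) throws Exception {
        long length = fileLength(mirrors.get(0));          // step 2: obtain file length
        Queue<Long> pending = new ConcurrentLinkedQueue<>();
        for (long off = 0; off < length; off += BLOCK)     // step 3: divide into blocks
            pending.add(off);

        RandomAccessFile out = new RandomAccessFile(dest, "rw");
        out.setLength(length);

        Thread[] workers = new Thread[mirrors.size()];     // one connection per mirror
        for (int i = 0; i < workers.length; i++) {
            URL mirror = mirrors.get(i);
            workers[i] = new Thread(() -> {
                Long off;
                while ((off = pending.poll()) != null) {   // steps 4-7: fetch next block
                    int size = (int) Math.min(BLOCK, length - off);
                    try {
                        byte[] block = fetchBlock(mirror, off, size);
                        synchronized (out) {               // write at offset handles re-ordering
                            out.seek(off);
                            out.write(block, 0, size);
                        }
                    } catch (Exception e) {
                        pending.add(off);                  // let another mirror retry this block
                        return;                            // drop the failed connection
                    }
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) w.join();                 // done when all blocks are retrieved
        out.close();
    }

    static long fileLength(URL mirror) throws Exception {
        HttpURLConnection c = (HttpURLConnection) mirror.openConnection();
        c.setRequestMethod("HEAD");
        return c.getContentLengthLong();
    }

    static byte[] fetchBlock(URL mirror, long off, int size) throws Exception {
        HttpURLConnection c = (HttpURLConnection) mirror.openConnection();
        c.setRequestProperty("Range", "bytes=" + off + "-" + (off + size - 1));
        byte[] buf = new byte[size];
        try (InputStream in = c.getInputStream()) {
            int r = 0;
            while (r < size) {
                int n = in.read(buf, r, size - r);
                if (n < 0) break;
                r += n;
            }
        }
        return buf;
    }
}
```

Note that java.net.HttpURLConnection transparently reuses HTTP/1.1 keep-alive connections where it can, which approximates the persistent connections used in the experiment.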

  23. Paraloading is Not a Well-studied Concept • Byers et al. proposed using Tornado codes to facilitate paraloading. • Rodriguez et al. proposed the block scheduling paraloading scheme that is used in our project
