Peer-Assisted Content Distribution Pablo Rodriguez Christos Gkantsidis
Traditional Content Distribution [Figure: server farm] Often, large content needs to be distributed to millions of clients. • Currently: • Huge server farms • Infrastructure-based solutions (e.g. Akamai) • Drawbacks: slow, expensive, non-scalable
Content Distribution Evolution [Figure: hype-cycle chart over 1999 to 2004; phase labels: Hype, Disappointment, Realism, Growth; technologies plotted: Layer-7 switches, satellite CDNs, CDNs (Akamai), P2P, caching, IP multicast, enterprise CDNs]
Peer-Assisted Content Distribution [Figure: server farm and clients; example: 4 MB file, server at 100 Mbps, client at 1 Mbps] Desktop PCs can help each other! • Clients become new servers • Capacity increases with the number of clients • Limitless scalability and fast speeds at extremely low cost!!
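The capacity argument can be checked with a back-of-the-envelope calculation. The sketch below uses the standard fluid-model lower bounds on distribution time (these formulas are not taken from the slides) together with the slide's example numbers. It assumes the client's 1 Mbps applies to both download and upload, which the slide does not state; the function names are illustrative.

```python
def client_server_time(file_mbit, n_clients, server_mbps, client_down_mbps):
    """Lower bound (seconds) when only the server uploads the file."""
    return max(n_clients * file_mbit / server_mbps,  # server must push N copies
               file_mbit / client_down_mbps)         # slowest client download


def peer_assisted_time(file_mbit, n_clients, server_mbps,
                       client_down_mbps, client_up_mbps):
    """Lower bound (seconds) when every client also uploads to others."""
    total_upload = server_mbps + n_clients * client_up_mbps
    return max(file_mbit / server_mbps,              # server sends one copy
               file_mbit / client_down_mbps,         # slowest client download
               n_clients * file_mbit / total_upload) # aggregate upload limit


FILE_MBIT = 4 * 8  # 4 MB file expressed in megabits

for n in (100, 10_000, 1_000_000):
    cs = client_server_time(FILE_MBIT, n, 100, 1)
    pa = peer_assisted_time(FILE_MBIT, n, 100, 1, 1)  # 1 Mbps upload is an assumption
    print(f"{n:>9} clients: server only {cs:>9.0f} s, peer-assisted {pa:.0f} s")
```

Under these assumptions the server-only bound grows linearly with the number of clients, while the peer-assisted bound stays close to the time a single client needs to download the file.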
Examples • Updates/Critical Patches: adding large servers and egress capacity to absorb peak load is quite expensive; the alternative is to delay clients, so patches do not arrive on time • Software Distribution • TV On-Demand. Movie/Music downloads • Podcasting • Enterprise content distribution
P2P Content Distribution • Benefits: • Dramatically improves speed • Limitless scalability • Minimum server requirements • Very cheap • Challenges: • Requires incentives for cooperation • Hard to ensure full end-to-end connectivity • Security • Manageability • Lack of locality increases transit costs for ISPs • Asymmetric links (traffic engineering) • Variable bandwidth, peers come and go • Need for more sophisticated distribution algorithms
P2P Swarming [Figure: a server and clients holding pieces 1 to 6 of a file in different orders] • File is divided into many small pieces for distribution • Clients request different pieces from the server or from other clients • Clients become servers for the pieces they have downloaded • When all pieces are downloaded, clients can reconstruct the whole file [Rodriguez, Biersack, Infocom’00]
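The piece-based workflow can be illustrated with a minimal sketch. It only shows splitting and reassembly; piece selection, networking, and integrity checks are left out, and the function names and piece size are hypothetical rather than taken from any particular swarming system.

```python
def split_into_pieces(data, piece_size):
    """Map piece index -> piece bytes; the last piece may be shorter."""
    return {offset // piece_size: data[offset:offset + piece_size]
            for offset in range(0, len(data), piece_size)}


def reassemble(pieces, n_pieces):
    """Rebuild the file once every piece index 0..n_pieces-1 is present."""
    missing = [i for i in range(n_pieces) if i not in pieces]
    if missing:
        raise ValueError(f"missing pieces: {missing}")
    return b"".join(pieces[i] for i in range(n_pieces))


data = b"example file contents for swarming"
pieces = split_into_pieces(data, piece_size=6)

# A client may get some pieces from the server and others from a peer,
# in any order; only the piece index matters for reconstruction.
from_server = {i: p for i, p in pieces.items() if i % 2 == 0}
from_peer = {i: p for i, p in pieces.items() if i % 2 == 1}
received = {**from_peer, **from_server}

assert reassemble(received, len(pieces)) == data
```

Because each piece carries its index, a client can immediately serve any piece it already holds, which is what lets clients act as servers before their own download completes.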
The Challenge [Figure: a server and clients holding pieces 1 to 6 in different orders] • If there are many users, deciding which piece is best to download can be very hard!! • Incorrect decisions result in low throughput, nodes unable to finish, wasted bandwidth, etc. • Solutions that require full knowledge of who has what are non-scalable
Goal • Provide a very fast and robust Peer-Assisted solution for the distribution of legal content • Current problems in existing File Swarming solutions: • Rare blocks are hard to obtain • Tit-for-tat incentive mechanisms decrease speeds • Arrival of new users slows down old users • Heterogeneous nodes do not interact well • The same information travels repeatedly over bottleneck links • Too much dependence on seeds • Sudden departures can prevent peers from finishing
The Problem of Efficient Scheduling of Information [Figure: a source and Nodes A, B, and C exchanging Block 1 and Block 2] Which should a node download: Block 1, or 2, or 1⊕2?
The Avalanche Magic • To solve problems of existing P2P file distribution solutions, Avalanche uses special encoding algorithms • Each encoded piece has the “DNA” of all pieces in the file. => A given encoded piece can be used by any peer in place of any piece • Encoded pieces are created using linear equations that involve all pieces in the file • Reconstructing the file requires collecting enough encoded pieces and solving the set of mathematical equations
Coding in general • Assume file: $F = [x_1 \; x_2]$, where $x_i$ is a block. • Define code $E_i(a_{i,1}, a_{i,2}) = a_{i,1} x_1 + a_{i,2} x_2$, where $a_{i,1}, a_{i,2}$ are numbers. • “Infinite” number of $E_i$’s. • Any two linearly independent $E_i(a_{i,1}, a_{i,2})$ can recover $[x_1 \; x_2]$. • Similar to solving a system of linear equations. • Operations in finite fields [such as $GF(2^{16})$].
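The two-block case can be worked through in code. The sketch below is an illustration, not the Avalanche implementation: it uses the smaller field GF(2^8) with the AES reduction polynomial 0x11B instead of the larger field the slide mentions, so that one symbol is one byte. All function names are hypothetical.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= 0x11B       # reduce when the degree reaches 8
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse via a^254, since a^255 = 1 for nonzero a."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def encode(coeffs, blocks):
    """E = a_1*x_1 + a_2*x_2, applied byte by byte."""
    out = bytearray(len(blocks[0]))
    for c, block in zip(coeffs, blocks):
        for i, byte in enumerate(block):
            out[i] ^= gf_mul(c, byte)
    return bytes(out)


def decode_two(c1, e1, c2, e2):
    """Solve the 2x2 system by Cramer's rule (minus equals plus in GF(2^8))."""
    (a11, a12), (a21, a22) = c1, c2
    det = gf_mul(a11, a22) ^ gf_mul(a12, a21)
    if det == 0:
        raise ValueError("coefficient vectors are linearly dependent")
    inv = gf_inv(det)
    x1 = bytes(gf_mul(inv, gf_mul(a22, p) ^ gf_mul(a12, q)) for p, q in zip(e1, e2))
    x2 = bytes(gf_mul(inv, gf_mul(a21, p) ^ gf_mul(a11, q)) for p, q in zip(e1, e2))
    return x1, x2


x1, x2 = b"peer", b"data"               # two equal-length blocks
e1 = encode((3, 7), (x1, x2))           # coefficients chosen arbitrarily
e2 = encode((5, 2), (x1, x2))
assert decode_two((3, 7), e1, (5, 2), e2) == (x1, x2)
```

Any pair of encoded blocks whose coefficient vectors are linearly independent works equally well, which is why a peer does not need to care which particular encoded blocks it receives.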
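Avalanche Coding [Figure: the server holds file blocks B1, B2, ..., Bn and combines them with coefficients a1, a2, ..., an and b1, b2, ..., bn into encoded blocks E1 and E2 for Client A; Client A combines E1 and E2 with weights w1 and w2 into E3 for Client B] • Content is encoded at the server • Clients can produce new encoded packets out of partial files [Chou et al., ’03]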
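Re-encoding from a partial file can be sketched as follows. This is an illustrative example under assumptions, not the Avalanche code: it uses GF(2^8) with the AES polynomial 0x11B, picks weights at random, and represents a packet as a hypothetical (coefficient vector, payload) pair.

```python
import random


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def combine(weights, vectors):
    """Return the sum of weights[i] * vectors[i], element by element."""
    out = [0] * len(vectors[0])
    for w, vec in zip(weights, vectors):
        for i, v in enumerate(vec):
            out[i] ^= gf_mul(w, v)
    return out


def server_encode(blocks, rng):
    """Server: random combination of all original blocks."""
    coeffs = [rng.randrange(1, 256) for _ in blocks]
    return coeffs, bytes(combine(coeffs, blocks))


def recode(packets, rng):
    """Client: random combination of the encoded packets it already holds."""
    weights = [rng.randrange(1, 256) for _ in packets]
    new_coeffs = combine(weights, [c for c, _ in packets])
    new_payload = bytes(combine(weights, [p for _, p in packets]))
    return new_coeffs, new_payload


rng = random.Random(0)                       # seeded only for repeatability
blocks = [b"AAAA", b"BBBB", b"CCCC"]         # B1, B2, B3 at the server

e1 = server_encode(blocks, rng)              # Client A receives E1 and E2,
e2 = server_encode(blocks, rng)              # fewer packets than blocks
e3_coeffs, e3_payload = recode([e1, e2], rng)

# E3 is a valid encoding of the original blocks under its new coefficients,
# even though Client A never decoded the file.
assert e3_payload == bytes(combine(e3_coeffs, blocks))
```

The packet's coefficient vector travels with the payload, so a downstream client can treat E3 exactly like a packet that came directly from the server.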
Avalanche Robustness [Figure: Avalanche vs. typical file-swarming systems] If the server suddenly goes down (after serving the full file once), all Avalanche users are able to complete the download. Only 10% of users using typical file-swarming techniques are able to complete.
Avalanche Download Time [Figure: finish times vs. nodes (sorted by order of arrival), for Avalanche and typical swarming; peers using typical file-swarming techniques that did not finish are marked] => Much lower and more predictable download times
No need for nodes to stay around… [Figure: finish times vs. nodes (sorted by order of arrival), comparing nodes that stay forever with nodes that leave immediately] • With Avalanche, there is no need for nodes to stay after they finish the download to help other nodes (the performance remains unchanged)
Minimum Server Requirements Less than half the server requirements compared to systems based on current file-swarming techniques.
Decoding Performance Avalanche achieves better speeds and lower server load at the cost of more processing power at each node. Note: Pentium III, 650 MHz, 512 MB RAM. Decoding time is less than 4% of the total download time
Summary • Adding resources in an arbitrary fashion is neither efficient nor cost-effective • We are witnessing a new Revolution • Peer-Assisted solutions can be used by content providers to provide hugely scalable and very fast distribution of legal content at low cost