Informed Content Delivery Across Adaptive Overlay Networks
John Byers
Dept. of Computer Science, Boston University
www.cs.bu.edu/~byers
Joint work with Jeffrey Considine, Michael Mitzenmacher and Stanislav Rost
Overlays for Content Delivery
• Build distribution topology out of unicast connections (tunnels).
• Requires active participation of end-systems.
• Native IP multicast unnecessary.
• Saves considerable bandwidth over the N * unicast solution.
• Basic paradigm easy to build and deploy.
• Bonus: Overlay topology can adapt to network conditions by self-reconfiguration.
[Figure: overlay topology with a node labeled SOURCE]
Use of Overlays
• Killer apps:
  • Millions of users want to download a new movie or watch the SIGCOMM technical sessions.
  • CDNs want to populate thousands of servers with new movies for those users.
• Research directions to date:
  • Considerable effort on optimizing overlay layout (Narada, Overcast, RON, etc.).
  • Scalable solutions for indexing/locating content using overlays (CAN, Chord, etc.).
• Our focus:
  • Maximize throughput of large transfers across overlays.
Limitations of Existing Schemes
• Tree-like topologies
  • Rooted in history (IP Multicast)
  • Limitations:
    • bandwidth decreases monotonically from the source
    • losses increase monotonically along a path
• Does this matter in practice?
  • Anecdotal and experimental evidence says yes:
    • Downloads from multiple mirror sites in parallel [BLM ’99, RKB ’00].
    • Availability of better routes [SCHSA ’99, ABKM ’01].
    • Peer-to-peer: Morpheus, Kazaa and Grokster.
An Illustrative Example
1. A basic tree topology.
2. Harnessing the power of parallel downloads.
3. Incorporating collaborative transfers.
Our Philosophy
• Go beyond trees.
• Use additional links and bandwidth by:
  • downloading from multiple peers in parallel
  • taking advantage of “perpendicular” bandwidth
• Has potential to significantly speed up downloads…
• But only effective if:
  • collaboration is carefully orchestrated
  • methods are amenable to frequent adaptation of the overlay topology
Suitable Applications
• Prerequisite conditions:
  • Available bandwidth between peers.
  • Differences in content received by peers.
  • Rich overlay topology.
• Applications:
  • Downloads of large, popular files.
  • Video-on-demand or nearly real-time streams.
  • Shared virtual environments.
Erasure Codes
• We typically think of data as an ordered stream: “I need packets 1-1,000.”
• Using erasure codes, data is like water:
  • Can generate a pool of redundant data from the full original content.
  • You don’t care what droplets you get.
  • You don’t care if some spills.
  • You just want enough to get through the pipe: “I need any 1,000 packets.”
• The digital fountain model [BLMR ’98] is ideal for use in a fluid overlay environment.
Erasure Codes Offer Freedom
• Intrinsic resilience to packet loss, reordering.
• Better support for transient connections via stateless migration, suspension.
• Peers with full content can always generate useful symbols.
• Peers with partial content are more likely to have content to share.
• But using erasure codes comes at a price:
  • Content is no longer an ordered stream.
  • Therefore, collaboration is more difficult.
Informed Content Delivery: Definitions and Problem Statement
• Peers A and B have working sets of symbols S_A, S_B drawn from a large universe U and want to collaborate effectively.
• Key components:
  • Summarize: Furnish a concise and useful sample of a working set to a peer.
  • Approximately Reconcile: Compute as many elements of S_A − S_B as possible and transmit them.
  • Do so with minimal control messaging overhead.
Min-Wise Summaries
Problem: Neighboring peers may have similar content.
Solution: Give peers a “calling card” (fits in 1 packet) that summarizes the content they have, and use it to check similarity.
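A min-wise summary of this kind can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of salted SHA-256 in place of random permutations, and the choice of 128 eight-byte minima (about 1 KB, so the summary plausibly fits in one packet) are all assumptions made for the example.

```python
import hashlib


def _salted_hash(symbol_id: int, salt: int) -> int:
    """Hypothetical stand-in for a random permutation of the universe U."""
    data = f"{salt}:{symbol_id}".encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def minwise_summary(working_set, num_hashes=128):
    """Return, for each salted hash, the minimum value over the working set.

    The working set must be non-empty.
    """
    return [
        min(_salted_hash(s, salt) for s in working_set)
        for salt in range(num_hashes)
    ]


def estimate_resemblance(summary_a, summary_b):
    """Fraction of matching minima, an estimate of |A n B| / |A u B|."""
    matches = sum(1 for a, b in zip(summary_a, summary_b) if a == b)
    return matches / len(summary_a)


if __name__ == "__main__":
    s_a = set(range(0, 1000))
    s_b = set(range(500, 1500))  # true resemblance is 500 / 1500
    print(estimate_resemblance(minwise_summary(s_a), minwise_summary(s_b)))
```

Each peer only ever ships its list of minima, so the cost of a similarity check is independent of the size of the working set.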
Recoding
Problem: What to transmit when peers have similar content?
Solution: Allow peers to probabilistically “hedge their bets,” minimizing the chance of transmitting useless content.
Example: Suppose the resemblance between S_A and S_B is 0.9. If A sends a symbol at random, the probability of it being useful to B is 0.1. A better strategy is to XOR 10 random symbols together. B can extract one useful symbol (when exactly one of the 10 is new to B) with probability:
10 × (1/10) × (9/10)^9 ≈ 0.39 > 1/e ≈ 0.37
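The arithmetic in the example, and the XOR step itself, can be sketched as below. This is an illustrative model only: it assumes the recoded symbols are drawn independently, that a recoded packet carries the identifiers of the symbols it combines, and that payloads are small integers. The helper names `recode` and `try_extract` are hypothetical.

```python
import math


def prob_exactly_one_new(degree, fraction_new):
    """Probability that exactly one of `degree` independent symbols is new to B."""
    return degree * fraction_new * (1 - fraction_new) ** (degree - 1)


def recode(symbol_ids, store):
    """XOR the payloads of the chosen symbols into one recoded packet."""
    payload = 0
    for sid in symbol_ids:
        payload ^= store[sid]
    return list(symbol_ids), payload


def try_extract(packet, known):
    """Recover the single unknown symbol, or return None if not exactly one."""
    ids, payload = packet
    missing = [sid for sid in ids if sid not in known]
    if len(missing) != 1:
        return None
    for sid in ids:
        if sid in known:
            payload ^= known[sid]  # cancel every symbol B already holds
    return missing[0], payload


if __name__ == "__main__":
    print(prob_exactly_one_new(1, 0.1))   # a single random symbol
    print(prob_exactly_one_new(10, 0.1))  # ten symbols XORed together
    print(1 / math.e)
```

If none of the combined symbols is new, or more than one is, B cannot extract a symbol from this packet alone in this model, which is why the probability peaks at a moderate degree rather than growing without bound.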
Approximate Reconciliation Trees
Problem: Collaborating peers have overlapping content.
Solution: Efficient data structures for reconciliation.
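To give a feel for what approximate reconciliation means, the sketch below uses a plain Bloom filter as the digest. This is a deliberately simpler, swapped-in structure and not the approximate reconciliation tree named on the slide. The class, the filter size, and the hash construction are assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Compact set digest with false positives but no false negatives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def approximate_difference(set_a, digest_of_b):
    """Symbols of A that the digest says B lacks (a subset of S_A - S_B)."""
    return {s for s in set_a if s not in digest_of_b}


if __name__ == "__main__":
    s_a, s_b = set(range(500, 1500)), set(range(0, 1000))
    digest = BloomFilter(num_bits=8000, num_hashes=5)
    for symbol in s_b:
        digest.add(symbol)
    print(len(approximate_difference(s_a, digest)), "of", len(s_a - s_b))
```

Because a Bloom filter never reports a stored item as absent, everything A sends is genuinely new to B; the price of the compact digest is that a few useful symbols are occasionally withheld.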
Experimental Scenarios
• Three methods for collaboration:
  • Uninformed: A transmits symbols at random to B.
  • Speculative: B transmits a min-wise summary to A; A then sends recoded symbols to B.
  • Reconciled: B transmits a digest of its set to A; A then sends packets from the set difference.
• Overhead = (symbols received − symbols needed) / symbols needed:
  • Decoding overhead: with erasure codes, fixed 2.5%.
  • Reception overhead: useless duplicate packets.
  • Recoding overhead: useless recoding packets.
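The overhead metric and the contrast between the uninformed and reconciled methods can be illustrated with a toy simulation. This is a sketch under simplifying assumptions (uniform random choice with replacement, a perfect digest for the reconciled case, no decoding overhead, no packet loss); it does not reproduce the experiments reported in the talk, and all function names are hypothetical.

```python
import random


def overhead(received, needed):
    """(symbols received - symbols needed) / symbols needed."""
    return (received - needed) / needed


def simulate_uninformed(set_a, set_b, needed, rng):
    """A sends uniformly random symbols until B has gained `needed` new ones."""
    if len(set_a - set_b) < needed:
        raise ValueError("A does not hold enough symbols that are new to B")
    have = set(set_b)
    pool = sorted(set_a)
    target = len(set_b) + needed
    sent = 0
    while len(have) < target:
        have.add(rng.choice(pool))  # duplicates are wasted transmissions
        sent += 1
    return sent


def simulate_reconciled(set_a, set_b, needed):
    """With an exact digest, every packet A sends is useful to B."""
    if len(set_a - set_b) < needed:
        raise ValueError("A does not hold enough symbols that are new to B")
    return needed


if __name__ == "__main__":
    s_a, s_b = set(range(0, 1000)), set(range(0, 500))
    rng = random.Random(0)
    sent = simulate_uninformed(s_a, s_b, 100, rng)
    print("uninformed overhead:", overhead(sent, 100))
    print("reconciled overhead:", overhead(simulate_reconciled(s_a, s_b, 100), 100))
```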
Pairwise Reconciliation
Containment of B in A: |S_A ∩ S_B| / |S_B|
Setup: 128 MB file, 96K input symbols, 115K distinct symbols in system initially.
Four peers in parallel
Containment of B in A: |S_A ∩ S_B| / |S_B|
Setup: 128 MB file, 96K input symbols, 105K distinct symbols in system initially.
Four peers, periodic updates
Containment of B in A: |S_A ∩ S_B| / |S_B|
Setup: 128 MB file, 96K input symbols, 105K distinct symbols in system initially. Digests updated at every 10%.
Conclusions
• Even with ultimate routing topology optimization, the choice of what to send is paramount to content delivery.
• Digital fountain model ideal for fluid and ephemeral network environments.
• Richly connected topologies are key to harnessing perpendicular bandwidth.
• Wanted: more algorithms for intelligent collaboration.