
Informed Content Delivery Across Adaptive Overlay Networks

Presented by Kelly Whitacre. Written by John W. Byers, Jeffrey Considine, Michael Mitzenmacher, Member, IEEE, and Stanislav Rost.


Presentation Transcript


  1. Informed Content Delivery Across Adaptive Overlay Networks. Presented by Kelly Whitacre. Written by John W. Byers, Jeffrey Considine, Michael Mitzenmacher, Member, IEEE, and Stanislav Rost

  2. Problem Distributing a large new file across the Internet to millions of users simultaneously has proven to be challenging

  3. Possible Solution: Point-to-Point? Wasted bandwidth: • Having individual point-to-point connections from a single source wastes bandwidth • Server must handle the load of possibly many clients • Bandwidth costs money • Server should utilize available bandwidth. Limited transfer rates: • Transfer rates are limited by the characteristics of the end-to-end paths

  4. Possible Solution: IP Multicast? Pros: • Solves bandwidth problems of point-to-point • Server sends one copy • Network handles the rest. Cons: • No flow control • No retransmission of lost packets • Limited deployment

  5. Reliable Multicast • Digital fountain approach • Erasure codes: parity information is sent with the packets so that lost packets can be recovered (no feedback channels are needed to ensure reliable delivery) • Recirculation: information is re-circulated (fountain) for asynchronous client arrivals • Parallel transfer rates: heterogeneous client transfer rates are supported so as not to flood the network

  6. Digital Fountain Approach [Figure: a source of k symbols is encoded instantaneously into an encoding stream and transmitted; once k symbols are received, the k-symbol message is decoded instantaneously] Can recover file from any set of k encoding packets. Source: http://www.sigcomm.org/sigcomm98/tp/abs_05.html

  7. Digital Fountain Approach [Figure: timeline from 0 to 5 hours; labels: Transmission, File, User 1, User 2] Source: http://www.sigcomm.org/sigcomm98/tp/abs_05.html

  8. Cyclic Interleaving [Figure labels: File, Tornado Encoding, Encoded Blocks (Encoding Copy 1, Encoding Copy 2), Interleaved Encoding Blocks, Transmission] Source: http://www.sigcomm.org/sigcomm98/tp/abs_05.html

  9. Solution: Adaptive Overlay Networks Source: http://www.cs.virginia.edu/~mngroup/hypercast/designdoc/Chp1-Overview/Chp1-Overview.html

  10. How Adaptive Overlay Networks Differ from IP Multicast • Do not use a multicast tree • Flexibly adapt to changing network conditions • End systems are explicitly required to collaborate! • Can improve performance through additional cross-connections and active collaboration

  11. Addressing Limitations: Content Delivery Scenario [Figure: initial delivery tree. S = source; the shaded area at each node is its working set of packets, the subset of packets it has received]

  12. Addressing Limitations: Improving Transfer Rates Harnessing the Power of Parallel Downloads Tree Directed Acyclic Graph Establishing concurrent connections to multiple servers or peers with complete copies of the file

  13. Addressing Limitations: Improving Transfer Rates Harnessing the Power of Collaborative Transfer Establishing concurrent connections to multiple peers

  14. Addressing Limitations: Improving Transfer Rates Power of Cross-Connections & Collaboration (d) depicts the portions of content which can be beneficially exchanged via pair-wise transfers

  15. Considerations • (a) & (b) impede the full flow of content to downstream receivers • Opportunistic connections of (c) & (d) allow for higher transfer rates • Yet they demand more careful orchestration between end systems • Peers must determine the set difference of their working sets • Reconciliation is simple when working sets are limited to small contiguous blocks, but that restriction limits the flexibility needed for the frequent changes that arise in adaptive overlay networks

  16. Content Delivery Across Adaptive Overlay Networks Challenges Stateful vs. Non-Stateful Solutions

  17. Adaptive Overlay Networks in a Fluid Internet. Challenges: • Asynchrony: receivers may open and close connections or leave and rejoin the infrastructure at arbitrary times • Heterogeneity: connections vary in speed and loss rates • Transience: routers, links, and end systems may fail and their performance may fluctuate over time • Scalability: the service must scale to large receiver populations and large content. Need to: • Adaptively detect and avoid congested or temporarily unstable areas of the network • Dynamically establish paths with the most desirable end-to-end characteristics • Deliver useful content, often in parallel, with a minimum of setup overhead and message complexity

  18. Limitations of Stateful Solutions • A stateful solution addresses issues of connection: connections that vary in speed and loss rates, and clients coming and going at arbitrary times • But it requires significant per-connection state, which is highly unscalable and may impact performance • State must be maintained in the face of reconfiguration and reconnection • Maintaining it with parallel downloading is problematic

  19. Alternative: Encoded Content through Digital Fountain Approach • Digital Fountain Approach • Resilience to packet loss through an erasure-correcting code • Guarantee • Claim: the original source file can be recovered from any subset of distinct symbols in the encoding stream equal in size to the original file • In practice: a file is recovered from a few percent more symbols than the original file contains
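
The recovery property on this slide can be illustrated with a toy fountain code. This is a minimal sketch, not the Tornado-style codes the presentation refers to: it assumes a random linear fountain over GF(2), where each encoding symbol is the XOR of a random subset of source blocks chosen by a seed, and decoding is Gaussian elimination. The names `encode_symbol`, `decode`, and `receive_until_decoded` are hypothetical, and blocks are modelled as Python integers.

```python
import random

def encode_symbol(blocks, seed):
    """One encoding symbol: XOR of a seed-determined random subset of blocks.
    Returns (mask, value); bit i of mask is set if block i was included."""
    rng = random.Random(seed)
    mask = rng.getrandbits(len(blocks)) or 1   # avoid the empty subset
    value = 0
    for i, block in enumerate(blocks):
        if (mask >> i) & 1:
            value ^= block
    return mask, value

def decode(symbols, k):
    """Gaussian elimination over GF(2). Returns the k blocks, or None if the
    received symbols do not yet determine the file."""
    basis = {}                                  # highest set bit -> (mask, value)
    for mask, value in symbols:
        while mask:
            top = mask.bit_length() - 1
            if top not in basis:
                basis[top] = (mask, value)
                break
            bmask, bvalue = basis[top]
            mask ^= bmask                       # eliminate the leading bit
            value ^= bvalue
    if len(basis) < k:
        return None
    blocks = [0] * k
    for top in range(k):                        # back-substitute, low bits first
        mask, value = basis[top]
        for i in range(top):
            if (mask >> i) & 1:
                value ^= blocks[i]
        blocks[top] = value
    return blocks

def receive_until_decoded(blocks, seeds):
    """Collect symbols for the given seeds until decoding succeeds.
    Returns (decoded blocks or None, number of symbols consumed)."""
    k = len(blocks)
    received = []
    for seed in seeds:
        received.append(encode_symbol(blocks, seed))
        if len(received) >= k:
            decoded = decode(received, k)
            if decoded is not None:
                return decoded, len(received)
    return None, len(received)
```

Any range of seeds works, which mirrors the point that a receiver may join the stream at an arbitrary time; in this toy code a receiver typically needs a few symbols beyond k before the file is determined.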

  20. Encoded Content through Digital Fountain Approach. Pros: • Continuous encoding: senders with a complete copy of a file may continuously produce fresh encoding symbols • Time invariance: new encoding symbols are produced independently from symbols produced in the past • Tolerance: digital fountain streams are useful to all receivers regardless of the times of their connections or disconnections and their rates of sampling the stream • Additivity: parallel downloads from multiple servers with complete copies of the content require no orchestration. Stateless!

  21. Encoded Content through Digital Fountain Approach. Cons: • Encoding/decoding overhead • Reconciliation methods are needed for collaborating end systems that have only a portion of the content

  22. Reconciliation and Informed Delivery Coarse-grained reconciliation Speculative transfers Fine-grained reconciliation

  23. Note: Approaches proposed are local in scope and typically involve a pair or a small number of end systems Goal is to provide the most cost-effective reconciliation mechanisms measuring cost both in computation and message complexity

  24. Coarse-Grained Reconciliation • Estimate the resemblance of the working sets of pairs of nodes prior to establishing connections • Quick estimates of the fraction of symbols common to the working sets of both peers • Approach 1: employs random sampling • Approach 2: employs sketches of each peer’s working set • High-level information • Lightweight, computed efficiently • Incrementally updated • Fit into a single 1-kB packet

  25. Notation & Framework • Let peers A and B have working sets S_A and S_B containing symbols from an encoding of the file • Containment • The containment of B in A is the quantity |S_A ∩ S_B| / |S_B| • Resemblance • The resemblance of A and B is the quantity |S_A ∩ S_B| / |S_A ∪ S_B|
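
Both quantities can be computed directly from Python sets. This is a minimal sketch, assuming the standard definitions (containment of B in A is |S_A ∩ S_B| / |S_B|, resemblance is |S_A ∩ S_B| / |S_A ∪ S_B|) and non-empty working sets; the function names are illustrative.

```python
def containment(s_a, s_b):
    """Containment of B in A: the fraction of B's symbols that A already holds."""
    return len(s_a & s_b) / len(s_b)

def resemblance(s_a, s_b):
    """Resemblance of A and B: shared symbols over all symbols held by either."""
    return len(s_a & s_b) / len(s_a | s_b)

# Example working sets identified by integer keys.
s_a = {1, 2, 3, 4}
s_b = {3, 4, 5, 6, 7, 8}
print(containment(s_a, s_b))   # fraction of S_B already in S_A
print(resemblance(s_a, s_b))
```

Containment is asymmetric (it depends on which peer is the reference), while resemblance is symmetric in A and B.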

  26. Notation & Framework • Each element of a working set is identified by an integer key (sending an element entails sending its key) • Keys are distributed over the key space uniformly at random • With 64-bit keys, a 1-kB packet can hold roughly 128 keys • Can be the same • If the elements are determined by a hash function seeded by the key, two keys may generate the same element with small probability • Minimal impact

  27. Random Sampling Select elements of the working set at random and transport those to the peer.

  28. Random Sampling. Pros: • Unbiased estimate of containment • Can be incrementally updated using reservoir sampling. Cons: • A peer must search its own working set for each element in the random sample • Does not easily allow one peer to check the resemblance between prospective peers • A cannot check resemblance between B & C
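
The sampling approach and its incremental update can be sketched as follows. This is an illustration under assumptions, not the authors' code: the sender draws keys from its own working set with replacement, the receiver counts how many of them it already holds, and the resulting fraction estimates the containment of the sender's set in the receiver's set. `reservoir_update` is the classic reservoir-sampling step (Algorithm R), which maintains a fixed-size sample without replacement as symbols arrive. All function names are hypothetical.

```python
import random

def sample_keys(working_set, k, rng):
    """Sender side: draw k keys uniformly at random, with replacement."""
    keys = list(working_set)
    return [rng.choice(keys) for _ in range(k)]

def estimate_containment(sample, own_set):
    """Receiver side: fraction of sampled keys already present locally."""
    return sum(key in own_set for key in sample) / len(sample)

def reservoir_update(reservoir, k, key, seen, rng):
    """Keep a uniform sample of size k over a growing stream of keys.
    `seen` is the number of keys observed before this one."""
    if len(reservoir) < k:
        reservoir.append(key)
    else:
        j = rng.randrange(seen + 1)
        if j < k:
            reservoir[j] = key

# Example: B samples its working set and A estimates how much of it A holds.
rng = random.Random(0)
s_a = set(range(500, 1500))
s_b = set(range(0, 1000))
sample = sample_keys(s_b, 128, rng)        # 128 64-bit keys fit in a 1-kB packet
print(estimate_containment(sample, s_a))   # estimate of |S_A ∩ S_B| / |S_B|
```

The receiver performs one membership lookup per sampled key, which is the search cost listed among the cons above.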

  29. Min-Wise Sketches Calculates working set resemblance based on min-wise sketches

  30. Min-Wise Sketches • π_i represents a random permutation on the key universe • A sends B a vector of A’s minima, one minimum of π_i(S_A) per permutation • B counts the number of positions where its own minima equal A’s (a match means the minimum element lies in both sets) • B divides by the total number of permutations • The result is an unbiased estimate of the resemblance
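
The sketch exchange can be illustrated as follows. This is a minimal sketch under a stated substitution: instead of truly random permutations it uses linear maps x -> (a*x + b) mod p with a fixed prime p, which are only approximately min-wise independent. The constant `PRIME`, the function names, and the choice of permutation family are assumptions made for illustration. Keys are assumed to be integers smaller than `PRIME`.

```python
import random

PRIME = (1 << 61) - 1   # Mersenne prime; keys are assumed to be below this

def make_permutations(count, seed=0):
    """Shared by all peers: `count` linear permutations (a, b) of [0, PRIME)."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(PRIME)) for _ in range(count)]

def sketch(working_set, perms):
    """One minimum per permutation: min over the set of (a*key + b) mod PRIME."""
    return [min((a * key + b) % PRIME for key in working_set) for a, b in perms]

def estimate_resemblance(sketch_a, sketch_b):
    """Fraction of positions where the two sketches agree."""
    matches = sum(x == y for x, y in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)

# Example with 180 permutations and random 60-bit keys.
rng = random.Random(1)
keys = [rng.getrandbits(60) for _ in range(1500)]
s_a, s_b = set(keys[:1000]), set(keys[500:])
perms = make_permutations(180)
print(estimate_resemblance(sketch(s_a, perms), sketch(s_b, perms)))
```

Because the estimate needs only the two sketches, any peer holding the sketches of B and C can compare them without access to either working set.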

  31. Min-Wise Sketches. Pros: • Unbiased estimate of resemblance • Allows similarity comparisons given any two sketches for any two peers • A can check resemblance between B and C. Cons: • Truly random permutations cannot be used because their storage requirements are impractical • Possibility of false positives when π_i values are hashed to fewer bits to allow for more sketch elements in a packet • (Details not discussed)

  32. Speculative Transfers • Involve a sender performing “educated guesses” as to which symbols to generate and transfer • Send symbols which are probably useful to the other • This process can be fine-tuned using the results of coarse-grained reconciliation

  33. Speculative Transfers • When the containment of B in A is low, speculative transfer is trivial, since most of B’s symbols are useful to A • When the containment of B in A is high, this strategy is inefficient, so recoding is used instead

  34. Recoding • A recoding symbol is simply the bitwise XOR of a set of encoding symbols • Must be accompanied by a specification of the encoding symbols blended to create it • Must explicitly list the random seeds of the encoding symbols from which it was produced

  35. Encoding/Decoding Recoding Symbols • Similar to the substitution rule • Example: a peer with y5, y8, y13 generates the recoding symbols: • z1 = y13 • z2 = y5 XOR y8 • z3 = y5 XOR y13 • A peer that receives z1, z2, z3 recovers y13 directly from z1 • By substitution, y13 and z3 yield y5, and then y5 and z2 yield y8
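
The substitution example on this slide can be traced in code. This is a minimal sketch in which symbols are modelled as small integers and each recoding symbol carries the list of encoding-symbol identifiers that were XORed to produce it; the function names and the sample byte values are hypothetical.

```python
def recode(symbols, ids):
    """Blend the encoding symbols named by `ids` into one recoding symbol.
    Returns (ids, value), since the recipient must know what was blended."""
    value = 0
    for i in ids:
        value ^= symbols[i]
    return frozenset(ids), value

def decode_by_substitution(recoded):
    """Repeatedly substitute known symbols into the remaining recoding symbols.
    Returns a dict of every encoding symbol that could be recovered."""
    known = {}
    pending = [(set(ids), value) for ids, value in recoded]
    progress = True
    while progress and pending:
        progress = False
        remaining = []
        for ids, value in pending:
            for i in list(ids):
                if i in known:              # substitute an already-known symbol
                    ids.remove(i)
                    value ^= known[i]
            if len(ids) == 1:               # exactly one unknown left: solved
                known[ids.pop()] = value
                progress = True
            elif ids:
                remaining.append((ids, value))
        pending = remaining
    return known

# The slide's example: the sender holds y5, y8, y13.
y = {5: 0x55, 8: 0x88, 13: 0xD3}
z1 = recode(y, [13])
z2 = recode(y, [5, 8])
z3 = recode(y, [5, 13])
print(decode_by_substitution([z1, z2, z3]))
```

The decoder stops when a full pass recovers nothing new, so recoding symbols that cannot yet be resolved are simply left pending.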

  36. Fine-Grained Reconciliation • Is a set-difference problem • Tries to determine the exact difference S_A − S_B • Many approaches • Polynomial-based • Enumeration-based • Bloom filter • Search-based • Approximate Reconciliation Trees (ART), which combine the compact representation of Bloom filters with the speed of a search-based approach

  37. Bloom Filter • A bit array that represents the n elements of a working set; each element sets the bits chosen by a number of independent random hash functions • Flow • Peer A sends B a Bloom filter F_A of S_A • Peer B then checks each element of S_B against F_A • Peer B has then determined S_B − S_A, the symbols it holds that A lacks (up to the filter’s false positives) • This solution is effective particularly when the number of differences is a large fraction of the set size
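
The Bloom filter flow can be sketched as follows. This is an illustrative implementation under assumptions: hash functions are derived from SHA-256 with an index prefix, and the class and function names are hypothetical. Peer A builds the filter over its working set; peer B keeps every element of its own set that the filter reports as absent, which gives the symbols B holds that A lacks. False positives cause B to withhold a few useful symbols but never to send a duplicate.

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        """Bit positions for a key, one per hash function."""
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all((self.bits[pos // 8] >> (pos % 8)) & 1
                   for pos in self._positions(key))

def build_filter(working_set, bits_per_element=8, num_hashes=6):
    """Peer A: summarize a working set in a Bloom filter."""
    bloom = BloomFilter(bits_per_element * len(working_set), num_hashes)
    for key in working_set:
        bloom.add(key)
    return bloom

def symbols_to_send(filter_a, s_b):
    """Peer B: elements of S_B that the filter says A does not have."""
    return {key for key in s_b if key not in filter_a}

# Example: A holds keys 0..999, B holds keys 500..1499.
s_a = set(range(0, 1000))
s_b = set(range(500, 1500))
useful = symbols_to_send(build_filter(s_a), s_b)
print(len(useful), "of", len(s_b - s_a), "useful symbols identified")
```

The defaults of 8 bits per element and 6 hash functions follow the simulation parameters quoted later in the deck.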

  38. Experimental Results Demonstrate the benefits and costs of using reconciliation in peer-to-peer transfers and in parallel downloads

  39. Simulation Parameters • All experiments consider the transfer of a 128-MB file • Origin server • Divides this file into input symbols of 1400 bytes each (so that each fits in an Ethernet packet with headers) • Encodes this file into a large set of encoding symbols • Associates each encoding symbol with a 64-bit identifier representing the set of input symbols used to produce it • Min-wise sketches used 180 permutations, yielding 180 entries of 64 bits each for a total of 1440 bytes per summary • Bloom filters used 6 hash functions and 8(1 + 0.0025)L bits for a total of 96 kB per filter
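
The sizes quoted on this slide can be checked with a few lines of arithmetic. This is a rough check under assumptions the slide does not state: 1 MB is taken as 2^20 bytes, 1 kB as 1000 bytes, and L is taken to be the number of input symbols.

```python
import math

FILE_BYTES = 128 * 2**20        # assumption: 1 MB = 2**20 bytes
SYMBOL_BYTES = 1400

# Number of 1400-byte input symbols (assumed to be the L in the Bloom filter size).
num_input_symbols = math.ceil(FILE_BYTES / SYMBOL_BYTES)

# Min-wise sketch: 180 entries of 64 bits each.
sketch_bytes = 180 * 64 // 8

# Bloom filter: 8(1 + 0.0025)L bits, converted to bytes.
bloom_bits = 8 * (1 + 0.0025) * num_input_symbols
bloom_bytes = bloom_bits / 8

print(num_input_symbols, "input symbols")
print(sketch_bytes, "bytes per sketch")
print(round(bloom_bytes / 1000, 1), "kB per Bloom filter")
```

Under these assumptions the sketch size matches the 1440 bytes on the slide, and the Bloom filter comes out at roughly 96 kB.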

  40. Collaboration Methods • Uninformed • The sending peer picks a symbol to send at random • Speculative • The sending peer uses a min-wise sketch from the receiving peer to estimate the containment • Reconciled • The sending peer uses either a Bloom filter or an ART from the receiving peer to filter out duplicate symbols and sends a random permutation of the differences.

  41. Scenarios and Evaluation • Varying 3 experimental factors: • Set of connections in the overlay formed between sources and peers • Distribution of content among collaborating peers • Slack of the scenario (1.1 & 1.3) • When smaller than (1+ decoding overhead), the set of peers will be unable to recover the file • When larger than (1+decoding overhead), the set of peers will most likely recover the file • Methods provide the most significant benefits over naive methods when there is only a small amount of slack

  42. Scenario 1: Two Peers with Partial Content • One peer sends symbols to the other [Plot axis: % of Shared Encoding Symbols] • Uninformed collaboration performs poorly and degrades significantly as the containment increases • Speculative collaboration is more efficient, but the overhead still increases slowly with containment • Overhead of reconciliation comes purely from the cost of transmitting a Bloom filter or ART (less than 1%)

  43. Scenario 2: Download from a Server with Complete Content • With concurrent transfer from a peer [Plot axis: % of Shared Encoding Symbols] • Uninformed collaboration overhead is considerably lower than in Scenario 1 (a larger fraction of the content is sent directly via fresh symbols from the server) • Speculative collaboration performs similarly to Scenario 1 • Reconciled collaboration has overhead slightly higher than receiving symbols directly from the server

  44. Scenario 3: Parallel Download from Peers with Partial Content • Collaborating with multiple peers in parallel [Plot axis: % of Shared Encoding Symbols] • Can leverage bandwidth from peers with partial content with only a slight increase in overhead • Uninformed collaboration performs extremely poorly • Speculative collaboration dramatically improves as containment increases • Reconciled collaboration has much higher overhead than before

  45. Conclusions • Adaptive overlay networks offer a powerful alternative to traditional mechanisms for content delivery • Flexibility, scalability, and deployability • Informed and effective collaboration between end systems can be achieved through the digital fountain approach • Care is needed to provide methods for representing and transmitting the content in a manner that is as flexible and scalable as the underlying capabilities of the delivery model

  46. Questions?

  47. Supplemental Reading and Resources • A Digital Fountain Approach to Reliable Distribution of Bulk Data: http://www.ecse.rpi.edu/Homepages/shivkuma/teaching/sp2001/readings/digital-fountain.pdf • ACM SIGCOMM ’98, A Digital Fountain Approach to Reliable Distribution of Bulk Data: http://www.sigcomm.org/sigcomm98/tp/abs_05.html
