Bullet: High Bandwidth Data Dissemination Using an Overlay Mesh
by Dejan Kostic, Adolfo Rodriguez, Jeannie Albrecht and Amin Vahdat
presented by Jon Turner
Introduction
• Problem: large-scale data dissemination
  • focus on high bandwidth streaming media
• Current solutions
  • IP multicast – not a complete solution and not widely deployed
  • overlay multicast – too dependent on the quality of multicast trees
• Proposed approach
  • partial distribution using a multicast tree
    • all data distributed, but each node gets only a fraction from its parent
  • peer-to-peer distribution of the remainder
    • periodic distribution of information about data present at random subsets of the entire group
    • nodes request missing data from peers
• System combines a number of elements developed earlier
  • erasure encoding (redundant Tornado codes)
  • random subset distribution
  • informed content delivery
  • TCP-friendly rate control
Overview of Bullet Operation
• Distribute data over a tree
  • limited replication
  • uses bandwidth feedback
• Nodes retrieve missing data from peers
• Periodic distribution of content availability info
  • random subset of nodes
  • summary of their content
• Limited set of peering relationships
  • each node receives from a limited set of senders
  • each node limits its receivers
  • sets evolve over time to improve the utility of peers
• Data sent using TCP-friendly rate control
Data Distribution
• System operates in a series of epochs (typ. 5 seconds)
  • at the start of an epoch, nodes learn the number of descendants their children have
  • child i is assigned a sending factor sf_i equal to its share of descendants (a child with 20% of descendants has sf_i = 0.2)
• Nodes forward data received from parent to children
  • packets are "assigned" to children according to their sf values
  • additional copies forwarded to children with spare bandwidth
    • limiting factors lf_i determine which children have spare bandwidth
    • lf values are adjusted dynamically in (0,1] based on observed bandwidth
• Algorithm sketch – executed for each input packet p
  • Find a child t that has been assigned fewer than its share of packets
    • attempt to send p to t (transport protocol blocks the send if the sending rate is too high)
    • if the attempt succeeds, assign p to t
  • For each child c
    • attempt to send to c if there is no successful attempt yet or if c has spare bandwidth
    • if the attempt succeeds and this is the first successful attempt, assign p to c
    • if the attempt succeeds but is not the first success, increase lf_c
    • if the attempt fails, decrease lf_c
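The per-packet forwarding loop can be sketched in Python. Everything here is illustrative: the `Child` class, the 0.5 "spare bandwidth" threshold, and the multiplicative lf adjustments are assumptions; in Bullet the send attempt is gated by the TFRC transport, which `try_send` only stands in for.

```python
import random

class Child:
    """Hypothetical per-child state; sf and lf follow the slide's notation."""
    def __init__(self, name, sf):
        self.name = name
        self.sf = sf          # sending factor: share of descendants (sums to 1)
        self.lf = 1.0         # limiting factor in (0, 1], tracks spare bandwidth
        self.assigned = 0     # packets assigned to this child so far

def try_send(child, packet):
    """Stand-in for the transport send; in Bullet, TFRC blocks the send when
    the rate to this child is too high. Simulated here via lf."""
    return random.random() < child.lf

def forward_packet(packet, children, packets_seen):
    """Sketch of the per-packet algorithm from the slide."""
    owner = None
    # 1. Try first a child that is behind its assigned share of packets.
    for t in children:
        if t.assigned < t.sf * packets_seen:
            if try_send(t, packet):
                owner = t
                t.assigned += 1
            break
    # 2. Offer the packet to the remaining children.
    for c in children:
        if c is owner:
            continue
        if owner is None or c.lf > 0.5:        # "spare bw" threshold is illustrative
            if try_send(c, packet):
                if owner is None:              # first success: assign p to c
                    owner = c
                    c.assigned += 1
                else:                          # duplicate copy accepted: raise lf
                    c.lf = min(1.0, c.lf * 1.1)
            else:                              # send blocked: lower lf
                c.lf = max(0.05, c.lf * 0.9)
    return owner
```

Note the asymmetry: exactly one child is "assigned" each packet (keeping shares proportional to subtree sizes), while extra copies ride on whatever spare bandwidth the lf values reveal.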
Finding Prospective Data Sources
• At the start of each epoch, each node receives information about data present at a random subset of peers
  • a summary ticket describes the data each peer holds from the current "working set"
  • by comparing peer summary tickets to its own, a node determines the similarity of peers' data to its own
  • select new peers with dissimilar data sets
  • limit on the number of concurrent senders
  • discard senders that have been supplying too few new packets
    • creates space in the sender list for a new sender
• Computing the summary ticket
  • if a node has packets i_1, i_2, ..., i_n
  • let t_j = min{f_j(i_1), f_j(i_2), ...} for 1 ≤ j ≤ k, where f_j(i) = (a_j·i + b_j) mod U
  • the summary ticket is (t_1, t_2, ..., t_k)
  • each t_j depends on the entire sequence
  • similarity of two tickets is the fraction of values in common
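The summary ticket is a min-wise hash sketch. A minimal Python version, with illustrative values for k and U (the slide does not give Bullet's actual parameters):

```python
import random

U = 2**31 - 1  # hash range (illustrative choice)
K = 8          # number of hash functions per ticket (illustrative)

# Random linear hash functions f_j(i) = (a_j * i + b_j) mod U, as on the slide.
random.seed(42)
HASHES = [(random.randrange(1, U), random.randrange(U)) for _ in range(K)]

def summary_ticket(packet_ids):
    """Ticket (t_1, ..., t_k): each t_j is the minimum of f_j over every
    packet id the node holds, so each entry depends on the whole set."""
    return tuple(min((a * i + b) % U for i in packet_ids) for a, b in HASHES)

def similarity(t1, t2):
    """Fraction of ticket entries in common -- estimates set resemblance."""
    return sum(x == y for x, y in zip(t1, t2)) / len(t1)
```

A node preferring *dissimilar* peers simply selects candidates whose `similarity` to its own ticket is lowest, since those are most likely to hold packets it is missing.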
Recovering Data From Peers
• Node periodically supplies its senders with a representation of the set of packets it currently has
  • limited to the current "working set" (a range of sequence numbers)
  • the set is represented by a Bloom filter
• Each sender is also assigned a fraction of the working set
  • sender i is responsible for packets with sequence numbers equal to i mod s, where s is the number of senders
• Senders transmit missing packets using available bandwidth
  • packets checked against the Bloom filter
    • don't send if the packet's sequence number is in the Bloom filter
  • because senders are selected for dissimilarity, most packets should pass the check
• Example numerical parameters
  • 5 second epoch, 30 second working set, 500 Kb/s = 50 packets/s, 10 senders
  • so asking each sender for at most 150 packets
  • simpler and possibly more efficient to use a bit vector
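A rough sketch of the sender-side check, using an assumed minimal Bloom filter (the filter size, hash count, and SHA-256 hashing below are illustrative choices, not Bullet's):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter over a working set of sequence numbers."""
    def __init__(self, m=4096, k=4):
        self.m, self.k = m, k
        self.bits = bytearray(m // 8)

    def _positions(self, seqno):
        h = hashlib.sha256(str(seqno).encode()).digest()
        return [int.from_bytes(h[4*i:4*i+4], "big") % self.m for i in range(self.k)]

    def add(self, seqno):
        for p in self._positions(seqno):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, seqno):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(seqno))

def assigned_to(seqno, sender_index, num_senders):
    """Receiver asks sender i only for packets with seqno mod s == i."""
    return seqno % num_senders == sender_index

def packets_to_send(receiver_has, working_set, sender_index, num_senders):
    """Sender side: forward only packets this sender is responsible for
    and that do not appear in the receiver's Bloom filter."""
    bf = BloomFilter()
    for s in receiver_has:
        bf.add(s)
    return [s for s in working_set
            if assigned_to(s, sender_index, num_senders) and s not in bf]
```

Bloom filter false positives only suppress sends (a missing packet may be skipped), never cause duplicates, which matches the protocol's tolerance for loss.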
Distributing Random Subsets
• Objective – give each node a random sample of other nodes in the tree, excluding its descendants
• Two phases
  • collect phase – propagate random subsets up the tree
    • node u with children rooted at subtrees S_1, ..., S_k (of sizes n_1, ..., n_k) sends its parent S = subset of {u} ∪ S_1 ∪ ... ∪ S_k and n = 1 + n_1 + ... + n_k
  • distribute phase – propagate random subsets back down the tree
    • mix the subset received from parent with child subsets
    • given (D, N) from its parent, u sends child 1 the pair D_1 = subset of {u} ∪ D ∪ S_2 ∪ ... ∪ S_k and N_1 = 1 + N + n_2 + ... + n_k
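The two phases can be sketched as follows. The `Node` class, `SUBSET_SIZE`, and the per-group weighting in `weighted_sample` are assumptions; the slide's figure only specifies which sets get sampled and how the counts add up.

```python
import random

SUBSET_SIZE = 5  # illustrative sample size

class Node:
    def __init__(self, name, children=()):
        self.name, self.children = name, list(children)
        self.report = ([], 0)   # (sample sent up, descendant count), set by collect_phase
        self.sample = []        # final random sample, set by distribute_phase

def weighted_sample(groups, size):
    """Draw up to `size` names from (subset, count) groups, weighting each
    group by the population it summarizes (the mixing step is illustrative)."""
    pool, weights = [], []
    for subset, count in groups:
        for name in subset:
            pool.append(name)
            weights.append(count / len(subset))
    if not pool:
        return []
    picks = random.choices(pool, weights=weights, k=min(size, len(pool)))
    return list(dict.fromkeys(picks))  # dedupe, keep order

def collect_phase(node):
    """Collect: node u sends up (S, n) with S sampling {u} and its
    children's samples, and n = 1 + n_1 + ... + n_k descendants."""
    child_reports = [collect_phase(c) for c in node.children]
    n = 1 + sum(cn for _, cn in child_reports)
    S = weighted_sample([([node.name], 1)] + child_reports, SUBSET_SIZE)
    node.report = (S, n)
    return node.report

def distribute_phase(node, D, N):
    """Distribute: child i receives a mix of the parent's sample D and its
    siblings' samples, so its sample never includes its own descendants."""
    node.sample = D
    for i, child in enumerate(node.children):
        siblings = [c.report for j, c in enumerate(node.children) if j != i]
        D_i = weighted_sample([([node.name], 1), (D, N)] + siblings, SUBSET_SIZE)
        N_i = 1 + N + sum(sn for _, sn in siblings)
        distribute_phase(child, D_i, N_i)
```

Weighting groups by their descendant counts keeps the mixed sample roughly uniform over the eligible population, even though each group contributes only a small fixed-size subset.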
Other Features
• Formation of the overlay tree
  • not a focus of this work
  • paper cites a variety of previous tree construction algorithms
  • argues that use of the mesh makes tree quality less important
  • most results use a random tree
• Data encoding
  • not a focus of this work
  • suggests use of erasure codes or multiple-description codes
  • reported results neglect encoding
• Transport protocol
  • unreliable version of TFRC
  • adjusts sending rate to match fair-share bandwidth based on the detected loss probability
  • feedback from the transport sender used by Bullet to adjust the rates at which it attempts to send to children
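The fair-share rate TFRC targets comes from the standard TCP throughput equation (as specified in RFC 5348); a sketch, with t_RTO approximated as 4·RTT per that spec's simplification:

```python
from math import sqrt

def tfrc_rate(s, rtt, p, b=1):
    """TCP-friendly sending rate in bytes/sec from the TFRC throughput
    equation: the rate a conformant TCP flow would achieve.
    s: packet size (bytes), rtt: round-trip time (sec),
    p: loss event rate, b: packets acked per ACK."""
    if p <= 0:
        return float("inf")   # no observed loss: the equation places no limit
    t_rto = 4 * rtt           # RFC 5348's simple t_RTO approximation
    denom = (rtt * sqrt(2 * b * p / 3)
             + t_rto * 3 * sqrt(3 * b * p / 8) * p * (1 + 32 * p**2))
    return s / denom
```

The rate falls steeply as the loss event rate rises, which is what lets Bullet's per-child send attempts back off on congested paths without explicit coordination.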
Performance Evaluation
• Most results emulated in ModelNet (Duke network emulation system, similar to Emulab)
  • uses 50 machines (2 GHz P4s running Linux) to emulate a 1,000 node overlay
  • 20,000 node network topology generated using INET
    • link delays determined by geographic separation
  • random subset of network "client" nodes selected for the overlay
  • random node in the overlay selected as root
• Link bandwidths
  • four classes of links
  • three scenarios
Tree Quality for Streaming Data
[figure: achieved bandwidth for bottleneck bandwidth tree vs. random tree]
• bottleneck bandwidth tree constructed heuristically to have no small-capacity links (off-line construction)
• point is to show that the bottleneck tree is much better than the random tree used by Bullet
Baseline Bullet Performance
[figure: bandwidth over time – raw total, useful total (± std. deviation), and from parent]
• average better than the bottleneck tree for the streaming case
• not too much excess traffic
• 3σ point suggests a few percent of nodes get less than half the average
• most data comes from the mesh
Cumulative Distribution of Received Bandwidth
[figure: CDF of per-node received bandwidth]
• median about 525 Kb/s
• about 100 nodes get <80% of the median
• about 50 nodes get <70% of the median
Comparing Bullet to Streaming
[figure: Bullet vs. bottleneck tree streaming at high, medium, and low bandwidth]
• when there is plenty of bandwidth, both do well
• Bullet does better when bandwidth is limited
• much better when bandwidth is scarce
Effect of Oblivious Data Distribution
[figure: bandwidth over time – raw total, useful total, and from parent]
• tree nodes attempt to forward all received packets to all children
• average throughput drops by 25%
Bullet vs. "Epidemic Routing"
• Push gossiping
  • nodes forward non-duplicate packets to a randomly chosen set of peers in their local view
  • packets forwarded as they arrive
  • no tree required
• Streaming with anti-entropy
  • data streamed over a multicast tree
  • gossip with random peers to retrieve missing data
  • anti-entropy (?) used to locate missing data
    • apparently, this means a node periodically selects a random peer and sends it a list of its missing packets; the peer supplies the missing packets it has
• Experiments done on a 5,000 node topology
  • no physical losses
  • link bandwidths chosen from the medium range
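Under the slide's reading of anti-entropy, one round might look like this sketch (the node/peer dictionaries and the explicit missing-list exchange are assumptions about a mechanism the paper leaves vague):

```python
import random

def anti_entropy_round(node, peers, working_set):
    """One anti-entropy round: pick a random peer, send it the list of
    packets this node is missing, and absorb the ones the peer has."""
    peer = random.choice(peers)
    missing = [s for s in working_set if s not in node["have"]]
    supplied = [s for s in missing if s in peer["have"]]
    node["have"].update(supplied)
    return supplied
```

Unlike Bullet's Bloom-filter digests, this pull exchange ships an explicit missing-packet list every round, which is part of why gossiping shows higher overhead in the comparison that follows.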
Bullet vs. "Epidemic Routing"
[figures: raw and useful bandwidth for push gossiping, streaming w/AE, and Bullet]
• gossiping has high overhead; Bullet and streaming w/AE are comparable in raw bandwidth
• Bullet outperforms epidemic routing in useful bandwidth
Performance on Lossy Links
[figure: Bullet vs. bottleneck tree at high, medium, and low bandwidth]
• non-transit links lose 0 to 0.3% of packets; transit links lose 0 to 0.1%; 5% of links lose 5 to 10%
• Bullet dramatically better than streaming over the bottleneck tree topology
Performance with Failing Node
• Child of root with 110 descendants fails at time 250
• Assume no underlying tree recovery
• Evaluate effect of failure detection and establishment of new peer relationships
[figures: bandwidth received with new peers enabled vs. disabled – useful total, total, and from parent]
• adding new peers eliminates the degradation
• using only existing peers, throughput remains degraded
Bullet Performance on PlanetLab
• 47 nodes with 10 in Europe, including the root
• Compare to streaming over hand-crafted trees
  • good tree: European nodes all near the tree root
  • worst tree: select children of root to have poor bandwidth from the root, and recurse
[figure: Bullet vs. good tree vs. worst tree]
Related Work
• Peer-to-peer data dissemination
  • Snoeren et al. – 2001
  • FastReplica – Cherkasova & Lee – 2003
  • Kazaa and BitTorrent
• Epidemic data propagation
  • pbcast – Birman et al. – 1999
  • lpbcast – Eugster et al. – 2001
• Multicast
  • Scalable Reliable Multicast – Floyd et al. – 1997
  • Narada – Chu et al. – 2000
  • Overcast – Jannotti et al. – 2000
• Content streaming
  • SplitStream – Castro et al. – 2003
  • CoopNet – Padmanabhan et al. – 2003
Conclusions
• Overlay mesh is superior to streaming over an overlay multicast tree
  • Bullet demonstrates this and includes some novel features
    • method of distributing data to subtrees to equalize availability
    • scalable method of finding peers who can supply missing data
• Large-scale performance evaluation
  • supports claims for Bullet's performance advantages
  • explores performance under a variety of conditions
Discussion Questions
• What are the contributions?
  • what's novel? what's borrowed?
  • does it represent an improvement? how much?
  • are the authors' claims justified?
  • for what applications might the system be useful?
• Are the authors' design choices adequately justified?
  • method for distributing data over the tree?
  • acquiring information about peer data?
  • use of summary tickets? use of Bloom filters? use of TFRC?
• Is the performance evaluation satisfactory?
  • what about other network topologies, link bandwidths?
  • what about different group sizes? tree characteristics?
  • why no detailed examination of specific design choices?
  • why isn't cross traffic emulated?
  • what about processing requirements? what's the average number of cycles executed per packet received?
  • is it repeatable by others?