
SplitStream: High-Bandwidth Multicast in Cooperative Environments

SplitStream is a peer-to-peer system for high-bandwidth multicast in cooperative environments that does not assume dedicated infrastructure. It balances load across peers, accommodates each peer's differing bandwidth limits, and is robust to failures. Built on Pastry and Scribe, SplitStream divides data into stripes and multicasts each stripe over its own tree to spread load evenly. The presentation covers the approach, the spare capacity group, and the algorithm's correctness and complexity, and closes with experimental results.

Presentation Transcript


  1. SplitStream: High-Bandwidth Multicast in Cooperative Environments • Marco Barreno • Peer-to-peer systems • 9/22/2003

  2. Background • Tree-based multicast • High demand on few internal nodes • Cooperative environments • Peers contribute resources • We don't assume dedicated infrastructure • Different peers may have different limitations

  3. Goals of SplitStream • Balance load over peers • Accommodate different limitations • Each node has a desired indegree and a forwarding capacity (max outdegree) • Be robust to failures

  4. The SplitStream approach • Split data into stripes, each over its own tree • Each node is internal to only one tree • Built on Pastry and Scribe • Recall that Pastry uses prefix routing

  5. Scribe background • Built on top of Pastry • Any Scribe node may create a group • Other nodes may join group or send multicast • Node with nodeId numerically closest to groupId is the rendezvous point • Root of multicast tree for the group • Joins handled locally • But it's only a single tree
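
The rendezvous-point rule on this slide can be illustrated with a minimal sketch. The 16-bit id space, the circular distance measure, and the names `circular_distance` and `rendezvous_point` are assumptions made for illustration (Pastry itself uses 128-bit ids); this is not Scribe's actual API.

```python
# Hypothetical sketch: pick the Scribe rendezvous point for a group.
ID_BITS = 16               # assumption: tiny id space; Pastry uses 128-bit ids
ID_SPACE = 1 << ID_BITS

def circular_distance(a: int, b: int) -> int:
    """Distance between two ids, assuming the id space wraps around."""
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)

def rendezvous_point(group_id: int, node_ids: list) -> int:
    """Return the nodeId numerically closest to group_id.

    Ties are broken by the smaller nodeId, an arbitrary choice for this sketch.
    """
    return min(node_ids, key=lambda n: (circular_distance(n, group_id), n))

nodes = [0x1A2B, 0x7F00, 0x8001, 0xF0F0]
root = rendezvous_point(0x8000, nodes)   # 0x8001 is one id away from 0x8000
```

The chosen node becomes the root of the group's single multicast tree, which is why plain Scribe concentrates forwarding load on a few interior nodes.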

  6. Stripes • SplitStream divides data into stripes • Each stripe uses one Scribe multicast tree • Prefix routing ensures the property that each node is internal to only one tree • Inbound bandwidth: a node can achieve its desired indegree while this property holds • Outbound bandwidth: this is harder; we'll have to look at the node join algorithm to see how this works
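
A rough sketch of why prefix routing gives the "internal to only one tree" property, assuming (as in the SplitStream design, to my understanding) that the stripe ids all differ in their first digit, so a node can only be an interior node of the stripe whose id shares its first digit. The 4-digit hexadecimal ids and the function names are hypothetical.

```python
# Hypothetical sketch: one stripe per leading hex digit.
HEX_DIGITS = "0123456789ABCDEF"

def stripe_ids(suffix: str = "000") -> list:
    """Build 16 stripe ids that differ only in the first digit (assumed scheme)."""
    return [digit + suffix for digit in HEX_DIGITS]

def interior_stripe(node_id: str, stripes: list) -> str:
    """Return the single stripe in which node_id may act as an interior node.

    Pastry routes toward ids sharing longer prefixes with the key, so only
    nodes whose id starts with the stripe's first digit forward that stripe.
    """
    matches = [s for s in stripes if s[0] == node_id[0]]
    assert len(matches) == 1, "stripe ids must differ in the first digit"
    return matches[0]

stripes = stripe_ids()
assigned = interior_stripe("A3F1", stripes)   # the stripe starting with "A"
```

In every other stripe the node appears only as a leaf, so it receives those stripes without forwarding them.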

  7. Respecting forwarding capacity • The tree structure described may not respect maximum capacities • Scribe's push-down fails to resolve the problem because a leaf node in one tree may have children in another tree

  8. Compare this to Overcast • Overcast also creates an overlay to spread multicast work around, but... • Overcast is single-source, while SplitStream is multi-source • Overcast uses a single tree, while SplitStream uses multiple trees • Overcast is designed to maximize bandwidth between root and leaves, while SplitStream is designed to spread load evenly to all nodes (including leaves)

  9. Parent location algorithm • Node adopts prospective child • If too many children, choose one to reject: • First, look for one in stripe without shared prefix • Otherwise, select node with shortest prefix match • Orphan locates new parent in up to two steps: • Tries former siblings with stripe prefix match • Adopts or rejects using same criteria; continue push-down • Use the spare capacity group
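
The rejection rule on this slide can be sketched as below. The `Child` record, short hexadecimal ids, and the reading of "shortest prefix match" as the match between a child's nodeId and its stripeId are assumptions for illustration, not the authors' code.

```python
# Hypothetical sketch: which child does an overloaded node reject?
from dataclasses import dataclass

@dataclass(frozen=True)
class Child:
    node_id: str     # id of the child node
    stripe_id: str   # stripe the child receives from this parent

def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def choose_reject(local_id: str, children: list, capacity: int):
    """Return the child to orphan, or None if the node is within capacity."""
    if len(children) <= capacity:
        return None
    # First choice: a child in a stripe sharing no prefix with the local node.
    for child in children:
        if shared_prefix_len(local_id, child.stripe_id) == 0:
            return child
    # Otherwise: the child whose nodeId matches its stripeId the least.
    return min(children, key=lambda c: shared_prefix_len(c.node_id, c.stripe_id))

kids = [Child("A7", "A0"), Child("3C", "A0"), Child("B2", "B0")]
rejected = choose_reject("A1", kids, capacity=2)   # the child in stripe "B0"
```

The rejected child becomes an orphan and starts the two-step search for a new parent described on the slide.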

  10. The spare capacity group • If orphan hasn't found parent yet, anycasts to spare capacity group • Group contains all SplitStream nodes with fewer children than their forwarding capacity • Anycast returns nearby node, which starts a DFS of the spare capacity group tree, sending first to a child...

  11. Spare capacity group (cont.) • At each node in the search: • If node has no children left to search, check whether it receives a stripe the orphan seeks • If so, verifies that the orphan is not an ancestor (which would create a cycle) • If both tests succeed, the node adopts the orphan • May leave spare capacity group • If either test fails, back up to parent (more DFS...)
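
A simplified sketch of the search described on the last two slides. It explores only the subtree below the node where the anycast lands, whereas the slides describe backing up to the parent as well; the `Node` class and its fields are hypothetical.

```python
# Hypothetical sketch: depth-first search of the spare capacity group tree.
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    stripes: set                                        # stripes this node receives
    group_children: list = field(default_factory=list)  # children in the group tree
    stripe_parent: dict = field(default_factory=dict)   # stripe -> parent Node

def is_ancestor(candidate: Node, node: Node, stripe: str) -> bool:
    """True if candidate lies on node's path to the root of the stripe tree."""
    cur = node.stripe_parent.get(stripe)
    while cur is not None:
        if cur is candidate:
            return True
        cur = cur.stripe_parent.get(stripe)
    return False

def find_parent(node: Node, orphan: Node, wanted: set):
    """Return (adopter, stripe), or None if this subtree cannot help."""
    # Children are searched first; a node tests itself only afterwards.
    for child in node.group_children:
        found = find_parent(child, orphan, wanted)
        if found:
            return found
    for stripe in sorted(wanted & node.stripes):
        # Adopting the orphan must not create a cycle in the stripe tree.
        if not is_ancestor(orphan, node, stripe):
            return node, stripe
    return None   # back up to the caller, i.e. the parent in the group tree

a, b = Node("a", {"s1"}), Node("b", {"s2"})
root = Node("root", {"s0"}, group_children=[a, b])
orphan = Node("orphan", set())
result = find_parent(root, orphan, {"s2"})   # b receives s2 and is not below orphan
```

If the search returns `None` for the whole group tree, the anycast has failed, which is the case in which the application is notified.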

  12. A spare capacity example

  13. Consequences • Parent is likely to be physically near orphan due to locality of Pastry and Scribe • However, it is possible for the parent already to be an internal node for another stripe • If this parent fails it will bring down two stripes • Anycast can still fail • Adding the orphan may cause a cycle (fixable) • No node with spare capacity provides stripe sought • Declare failure and notify the application

  14. Correctness and complexity • Big assumptions: • All nodes join at the same time and communication is reliable • Nodes do not leave the system either voluntarily or due to failures • SplitStream can deal with violations of either, but problems may arise that prevent the forest from being constructed • Simulation shows this isn't problematic in practice

  15. Correctness and complexity (2) • A fairly lengthy analysis yields a rough upper bound on the probability that the algorithm fails to build a feasible forest (the formula itself is not preserved in this transcript) • But when the desired indegree of all nodes equals the total number of stripes, the algorithm never fails

  16. Correctness and complexity (3) • Expected state maintained by each node is O(log |N|) • Expected number of messages to build the forest is O(|N| log |N|) if trees are well balanced and O(|N|^2) in the worst case • Trees should be well balanced if each node forwards its own stripe to two other nodes

  17. Experiments

  18. Experiments (2)

  19. Experiments (3)

  20. Experiments (4)

  21. Conclusions • So what are the major points? =)
