Structure Management for Scalable Overlay Service Construction
Kai Shen
Department of Computer Science, University of Rochester
NSDI'04
Motivations
• Structure: the set of overlay links that data flows through
  • link selection is important for performance
    • low latency, high bandwidth, …
  • link selection can be costly
    • large selection base
    • high cost of link property probing
• Existing link selection schemes are mostly service-specific
  • unicast overlay path selection (e.g., RON)
  • end-system multicast (e.g., Narada, Overcast, and NICE)
  • substrate-aware DHT (e.g., CAN, Chord, and Pastry)
A Common Structure Layer
• A service-independent structure layer: Saxons
  • Substrate-Aware Connectivity Support for Overlay Network Services
• Potential benefits:
  • simplify service design & implementation
  • modularity
  • allow runtime overhead sharing across multiple services (not yet addressed in this paper)
• Questions:
  • Performance?
  • How can services utilize a common structure layer?
Design Objectives
• A common structure layer must meet the quality requirements of a wide range of services
  • overlay latency
  • hop-count distance
  • overlay bandwidth: on the shortest path, or on the widest path
• Best effort: no guarantee on structure quality
• Other design objectives:
  • scalability
  • extremely simple API
  • stability
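The "widest path" bandwidth objective above can be made concrete with a small sketch. It treats the bandwidth of a path as its bottleneck (minimum) link bandwidth and finds the path that maximizes that bottleneck, using a Dijkstra-style search. The function name `widest_path_bandwidth`, the adjacency-list layout, and the sample bandwidth values are illustrative assumptions, not part of Saxons.

```python
import heapq


def widest_path_bandwidth(links, src, dst):
    """Return the largest bottleneck bandwidth over all src -> dst overlay paths.

    links maps node -> list of (neighbor, bandwidth_mbps); hypothetical layout.
    Returns 0.0 when dst is unreachable.
    """
    best = {src: float("inf")}           # best known bottleneck per node
    heap = [(-float("inf"), src)]        # max-heap emulated with negated widths
    while heap:
        neg_width, node = heapq.heappop(heap)
        width = -neg_width
        if node == dst:
            return width
        if width < best.get(node, 0.0):
            continue                     # stale heap entry
        for neighbor, bandwidth in links.get(node, []):
            candidate = min(width, bandwidth)   # bottleneck along this path
            if candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                heapq.heappush(heap, (-candidate, neighbor))
    return 0.0


# Illustrative overlay: a-b-d has bottleneck 4 Mbps, a-c-d has bottleneck 2 Mbps.
example_links = {
    "a": [("b", 10), ("c", 2)],
    "b": [("a", 10), ("d", 4)],
    "c": [("a", 2), ("d", 8)],
    "d": [("b", 4), ("c", 8)],
}
```

The shortest-path variant of the objective would instead fix the path by latency first and then report that path's bottleneck bandwidth.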
Saxons Design Overview
• Components:
  • structure quality maintenance
  • node bootstrap
  • partition detection & repair
  • link property probing
  • membership management
• Scalability
  • functionally symmetric architecture
  • per-node management overhead depends only on the number of attached links, not on the overlay size
  • no single node maintains a complete system view
Structure Quality Maintenance
• Configurable node degree range <d_active, d_total>
• High-level description
  • periodically check random links; replace existing links if the checked ones are better
  • employ an adjustment threshold to avoid oscillation
• Three quality maintenance approaches
  • AllShort: maintain all short links
    • tends to create a grid-like structure ⇒ high hop-count distance (vs. O(log n) produced by a random structure)
  • ShortLong [Ratnasamy et al 2002]: half short, half random links
  • ShortWide: half short, half wide links (high adjustment threshold)
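The high-level description above (check a random link, replace an existing one only if the gain exceeds a threshold) can be sketched as a single maintenance round. This is a minimal illustration under stated assumptions: the metric is latency, the link being replaced is the current worst neighbor, and `maintenance_round`, `threshold_ms`, and the 5 ms default are hypothetical names and values rather than the Saxons protocol itself.

```python
import random


def maintenance_round(neighbors, candidates, latency, threshold_ms=5.0, rng=random):
    """Probe one random non-neighbor and swap it in if it is clearly better.

    neighbors:  list of current neighbor ids
    candidates: list of known node ids (e.g., from a membership subset)
    latency:    callable node id -> measured latency in ms (assumed available)
    """
    pool = [c for c in candidates if c not in neighbors]
    if not pool or not neighbors:
        return list(neighbors)
    probe = rng.choice(pool)                 # random link to check
    worst = max(neighbors, key=latency)      # assumption: replace the worst link
    # The threshold suppresses swaps whose gain is within measurement noise,
    # which is what keeps the structure from oscillating.
    if latency(worst) - latency(probe) > threshold_ms:
        return [n for n in neighbors if n != worst] + [probe]
    return list(neighbors)
```

A larger threshold trades structure quality for stability, which matches the slide's note that wide links use a high adjustment threshold.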
Random Membership Subsets
• Membership subset service
  • dynamically changing subsets with uniform randomness
  • previously proposed for tree-like overlay structures [Kostić et al 2003]
• Each node maintains a member-subset
• Periodically, each node sends its neighbors a randomly selected update-set
• To ensure equal representation
  • the node itself is selected into each update-set with probability (update-set size) / (overlay size)
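The update-set rule above can be sketched as follows. The sketch assumes the node has some estimate of the overlay size (how that estimate is obtained is not stated on the slide), and the function name `build_update_set` and its parameters are hypothetical.

```python
import random


def build_update_set(self_id, member_subset, update_size, overlay_size, rng=random):
    """Pick a random update-set to send to a neighbor.

    The sending node inserts itself with probability update_size / overlay_size,
    so every node is equally likely to appear in an update-set.
    overlay_size is assumed to be an estimate available to the node.
    """
    others = [m for m in member_subset if m != self_id]
    include_self = rng.random() < update_size / overlay_size
    slots = update_size - 1 if include_self else update_size
    update_set = rng.sample(others, min(slots, len(others)))
    if include_self:
        update_set.append(self_id)
    return update_set
```

For example, with an update-set size of 2 in an overlay of 8 nodes, the sender includes itself in roughly one out of four update-sets.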
Implementation
• Saxons runtime prototype
  • stand-alone daemon communicating with local overlay application instances through IPC; or
  • linked and run inside the application process space
• Basic API for overlay applications:
  • directly query the Saxons runtime for directly attached links
  • provide a callback function to the Saxons runtime, invoked by Saxons whenever neighbor links change
• Advanced API:
  • control protocol parameters
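The two basic API operations above (query the attached links, register a change callback) can be illustrated with a toy in-process stand-in. The slides do not give the real Saxons function names or signatures, so `StructureLayer`, `attached_links`, `on_links_changed`, and `_set_links` are all hypothetical.

```python
class StructureLayer:
    """Toy stand-in for a structure-layer runtime linked into the application."""

    def __init__(self):
        self._links = set()
        self._callbacks = []

    def attached_links(self):
        """Query: return the directly attached overlay links (neighbor ids)."""
        return sorted(self._links)

    def on_links_changed(self, callback):
        """Register a callback invoked with the new link list on every change."""
        self._callbacks.append(callback)

    def _set_links(self, links):
        """Internal hook: the maintenance protocol would call this after an adjustment."""
        links = set(links)
        if links != self._links:
            self._links = links
            for callback in self._callbacks:
                callback(self.attached_links())


# Usage: an overlay service records each neighbor-set change it is told about.
layer = StructureLayer()
seen = []
layer.on_links_changed(seen.append)
layer._set_links({"nodeB", "nodeA"})
layer._set_links({"nodeA", "nodeB"})   # no change, so no callback
layer._set_links({"nodeA", "nodeC"})
```

In the stand-alone daemon configuration described on the slide, the same two operations would travel over IPC instead of direct calls.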
Link Bandwidth Measurement
• Requirements: robustness, overhead, accuracy
• Many techniques were proposed in the past
• Our goal: a simple scheme that works
  • based on the packet bunch technique [Carter&Crovella 1996]
[Figure: all-to-all measurement results on 61 PlanetLab sites; y-axis: bandwidth (in Mbps); series: 4.8MB/measurement and 480KB/measurement]
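As a rough illustration of how a bunch of back-to-back packets can yield a bandwidth estimate, the sketch below divides the bits received after the first packet by the spread of arrival times. This is a common simplification and only an assumption about the general idea; the slide does not spell out the exact Saxons computation, and `bunch_bandwidth_mbps` and the sample timings are hypothetical.

```python
def bunch_bandwidth_mbps(arrival_times_s, packet_bytes):
    """Estimate bandwidth from the receiver-side arrival times of one packet bunch.

    arrival_times_s: arrival timestamps in seconds, in arrival order
    packet_bytes:    size of each packet in the bunch (assumed equal)
    """
    if len(arrival_times_s) < 2:
        raise ValueError("need at least two packets in a bunch")
    spread = arrival_times_s[-1] - arrival_times_s[0]
    if spread <= 0:
        raise ValueError("arrival times must span a positive interval")
    # Only packets after the first contribute to the measured interval.
    bits = (len(arrival_times_s) - 1) * packet_bytes * 8
    return bits / spread / 1e6


# Four 1250-byte packets arriving 1 ms apart: 30,000 bits over 3 ms.
estimate = bunch_bandwidth_mbps([0.0, 0.001, 0.002, 0.003], 1250)
```

Larger bunches cost more traffic per measurement but average over more inter-packet gaps, which is the overhead/accuracy trade-off named in the requirements.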
Evaluation
• Simulation
  • evaluation on large-scale overlays (up to 12,800 nodes)
  • use 3 kinds of Internet backbones
    • BGP routing dumps from NLANR and RouteViews
    • synthetic backbones generated using Inet and GT-ITM
    • based on all-to-all measurement results from NLANR AMP
• PlanetLab experimentation
  • performance assessment in a particular real-world environment
  • most nodes are on Internet2
  • most nodes have a 10 Mbps bandwidth limit
Overall Structure Quality (55 PlanetLab sites)
[Figure, two panels comparing Random, AllShort, ShortLong, and ShortWide (Saxons): CDF of overlay path latency (x-axis: latency in milliseconds); CDF of widest path bandwidth (x-axis: bandwidth in Mbps)]
• All three schemes outperform Random by over 18% on latency
• ShortWide provides >10 Mbps bandwidth for over 3 times more site pairs
Structure Stability During Node Churn (55 PlanetLab sites)
[Figure: overlay link adjustment during node join/departure; y-axis: adjustments per hour per node; x-axis: time after all sites have joined (in minutes); marked events: all sites complete bootstrap, 5 sites fail, sites #1 through #5 rejoin]
• Five nodes fail at the 60th minute and rejoin one by one at three-minute intervals
• Small disturbances as the result of node join/departure
Saxons-based Overlay Multicast (52 PlanetLab sites)
[Figure, two panels comparing multicast over Random, multicast over Saxons, and independent direct unicast: bandwidth for a 1.2 Mbps stream; bandwidth for a 2.4 Mbps stream; y-axis: bandwidth (in Mbps); x-axis: rank]
• Compared with Random, Saxons-based multicast provides small-loss (<5%) data delivery to over 4 times more receivers
• Performance close to independent direct unicast
Related Work
• Structure-first overlay multicast: Narada [Chu et al 2000]
• Utilities/infrastructures for overlay service construction:
  • Topology probing [Nakao et al 2003], MACEDON [Rodriguez et al 2004]
• Service-specific link selection:
  • Unicast routing: RON [Andersen et al 2001], [Savage et al 1999]
  • Multicast routing: Narada [Chu et al 2001], Overcast [Jannotti et al 2000], NICE [Banerjee et al 2002]
  • Substrate-aware DHT: Binning [Ratnasamy et al 2002], Brocade [Zhao et al 2002], Pastry [Castro et al 2002]
• Related work for various Saxons components
  • Membership management: [Kostić et al 2003], lpbcast [Eugster et al 2003]
  • Bandwidth measurement: [Carter&Crovella 1996], [Paxson 1997], [Lai&Baker 2000]
  • Scalable latency estimation: [Hotz 1994], IDMaps [Francis et al 1999], GNP [Ng&Zhang 2002]
Conclusion and Future Work
• Saxons: a common structure management layer supporting scalable overlay service construction
  • simplifies the construction of many overlay services
  • still allows many services (e.g., overlay multicast, Gnutella-style query flooding, DHT) to achieve high performance
• Future work:
  • support runtime overhead sharing when overlay nodes host multiple services
  • best-effort structure quality → soft structure quality guarantee