Topologically-Aware Overlay Construction and Server Selection Sylvia Ratnasamy, Mark Handley, Richard Karp, Scott Shenker
Motivation • Constructing the overlay by incorporating the physical topology into the logical topology • Selecting a good server for content distribution and P2P file sharing by considering the physical topology
Topology-aware Overlay • The logical structure of the overlay should take into account the physical structure of the underlying network! • (Images downloaded from http://www.mapresources.com/photoshop_maps/)
Outline • Motivation • Binning Scheme • Applying Binning Scheme
Design Consideration • Desirable Properties: Practicality and Scalability • Simple • Converges quickly to a good state • Distributed – no central point of failure or bottleneck • Scalable – to millions of nodes • Priorities: (Scalability + Practicality) > Accuracy
Method • Network Measurement used: Network Latency • Non-intrusive • Light-weight • End-to-end • Binning Scheme
Distributed Binning • Nodes are clustered using a set of landmark machines spread across the Internet • Each node measures its RTT to each landmark and orders the landmarks in increasing RTT • The range of possible latency values is divided into levels that annotate the ordering (see the sketch below)
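To make this concrete, here is a minimal Python sketch of the bin computation described above. The level boundaries (100 ms and 200 ms) and the landmark names are illustrative assumptions, not values fixed by the slides.

```python
# Minimal sketch of the distributed binning computation.
# The level boundaries below are assumed for illustration.

def latency_level(rtt_ms, boundaries=(100, 200)):
    """Map an RTT to a level: 0 for [0,100) ms, 1 for [100,200) ms, 2 otherwise."""
    for level, bound in enumerate(boundaries):
        if rtt_ms < bound:
            return level
    return len(boundaries)

def compute_bin(rtts):
    """rtts: dict mapping landmark name -> measured RTT in ms.
    A node's bin is its landmark ordering (by increasing RTT)
    augmented with a per-landmark level annotation."""
    ordering = sorted(rtts, key=rtts.get)
    levels = tuple(latency_level(rtts[lm]) for lm in ordering)
    return ordering, levels

# A node 40 ms from L2, 130 ms from L1, and 250 ms from L3
# lands in bin (['L2', 'L1', 'L3'], (0, 1, 2)).
print(compute_bin({'L1': 130, 'L2': 40, 'L3': 250}))
```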
Discussion • Will the binning scheme affect the distributed, scalable properties? • Given that each node computes its own annotation, who does the clustering? • Where is the clustering result (the approximate physical topology) stored?
Scalability • Every node pings all landmarks to refresh the topology • With a million nodes on the network, refreshing every hour, each landmark would handle approximately 2700 pings/sec (a back-of-envelope check follows below) • How to guarantee balanced load on the landmarks? • Better scalability by having multiple nodes at a location act as a single logical landmark
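As a back-of-envelope check on the figure above: one million nodes refreshing hourly works out to roughly 280 pings/sec per landmark if each refresh is a single ping, and reaches the ~2700 pings/sec quoted only if each RTT estimate averages several probes. The probe count below is an assumption used purely for illustration.

```python
# Hypothetical load calculation; probes_per_estimate is an assumption,
# not a number given in the slides.
nodes = 1_000_000          # network size
refresh_s = 3600           # one refresh per node per hour
probes_per_estimate = 10   # assumed probes averaged per RTT estimate

pings_per_sec = nodes * probes_per_estimate / refresh_s
print(f"~{pings_per_sec:.0f} pings/sec per landmark")  # ~2778
```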
Performance Experiment Set • Measurement • For each node in a bin, compute gain ratio = inter-bin latency / intra-bin latency (see the sketch below) • Higher ratio = greater reduction in latency = desirable • Data Sets • Transit-Stub (1,000 and 10,000 nodes) • Power-Law Random Graphs (1,166 and 1,779 nodes) • NLANR (103 nodes) • Assumption • The landmark machines are separated from each other by 4 hops
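A small sketch of how the gain-ratio measurement could be computed, assuming a symmetric pairwise latency matrix and a bin label per node; both the matrix and the labels below are toy values for illustration, and the pairs are aggregated globally rather than per node for brevity.

```python
import itertools

def gain_ratio(lat, bins):
    """lat: symmetric matrix of pairwise latencies (ms); bins: bin label per node.
    Returns mean inter-bin latency divided by mean intra-bin latency;
    values above 1 mean bin-mates really are closer to each other."""
    intra, inter = [], []
    for i, j in itertools.combinations(range(len(lat)), 2):
        (intra if bins[i] == bins[j] else inter).append(lat[i][j])
    return (sum(inter) / len(inter)) / (sum(intra) / len(intra))

# Toy topology: nodes 0,1 are close to each other; so are nodes 2,3.
lat = [[0, 10, 90, 80],
       [10, 0, 85, 95],
       [90, 85, 0, 12],
       [80, 95, 12, 0]]
print(gain_ratio(lat, bins=['A', 'A', 'B', 'B']))  # ~8, well above 1
```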
Increasing Number of Levels • Gain ratio improves as the number of levels increases • The improvement saturates rapidly
Increasing Number of Landmarks • Gain ratio improves as the number of landmarks increases • The improvement saturates rapidly, except for TS-10k
Binning vs. Random and Nearest-Neighbor • Random binning: each node selects a bin at random • Nearest-neighbor clustering: at each iteration, the two closest clusters are merged into a single cluster
Discussion • Is gain ratio a reasonable way to measure the performance of the binning scheme? • What is the effect of increasing the number of nodes? • Is the assumption that the landmark machines are separated from each other by 4 hops too strong for the experimental data?
Outline • Motivation • Binning Scheme • Applying Binning Scheme
Applying Binning Scheme • Construction of Overlays • Structured: nodes are interconnected (at the application level) in a well-defined manner – Content-Addressable Network, Chord, Pastry, Tapestry • Unstructured: less structured networks – End-System Multicast, Scattercast • Server Selection
Construction of Overlays • Measurement • Latency stretch: ratio of average inter-node latency on the overlay network to average inter-node latency on the underlying IP-level network • Lower latency stretch = better!
Construction of CAN • Only the ordering of landmarks is used for binning, so there are m! orderings for m landmarks • Build an m-dimensional space: the first dimension has m portions, the second m-1, …, down to 1, so each cell corresponds to one ordering • A new node joins the CAN at the portion associated with its landmark ordering (see the sketch below)
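A sketch of how a landmark ordering could be mapped to its portion of the coordinate space, following the m, m-1, …, 1 subdivision described above; the landmark names and the unit coordinate space are assumptions for illustration.

```python
import random

def portion_for_ordering(ordering, landmarks):
    """Return, per dimension, the (lo, hi) slice of a unit coordinate
    space corresponding to this landmark ordering. Dimension i is cut
    into m-i slices; the ordering's rank among the remaining landmarks
    picks one slice per dimension, giving m! distinct portions."""
    remaining = list(landmarks)
    box = []
    for lm in ordering:
        slices = len(remaining)            # m, m-1, ..., 1 slices
        idx = remaining.index(lm)          # which slice this ordering selects
        box.append((idx / slices, (idx + 1) / slices))
        remaining.remove(lm)
    return box

def join_point(ordering, landmarks):
    """A new node joins at a random point inside its portion."""
    return [random.uniform(lo, hi) for lo, hi in portion_for_ordering(ordering, landmarks)]

landmarks = ['L1', 'L2', 'L3']
print(portion_for_ordering(['L2', 'L1', 'L3'], landmarks))  # [(1/3, 2/3), (0, 0.5), (0, 1)]
print(join_point(['L2', 'L1', 'L3'], landmarks))
```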
Side Effect • The co-ordinate space is not uniformly populated • Does the average number of hops on the path between two points decrease?
Discussion • Overlay nodes >> physical nodes? • Given that each zone in the CAN can hold multiple network nodes, how is the CAN topology stored and changed?
Construction of Unstructured Overlays • Given a set of n nodes on the Internet, each node picks k neighbors so that the average routing latency is low • Short-Long: k/2 closest nodes + k/2 random nodes • BinShort-Long: k/2 nodes from its own bin + k/2 others • BinShort-Long with Sampling: k/2 closest nodes from a sample of its own bin + k/2 others (see the sketch below)
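A minimal sketch of the three heuristics, assuming each candidate neighbor is a (node_id, rtt_ms, bin_id) tuple and that there are more than k candidates; the tuple layout, sample size, and function names are illustrative assumptions.

```python
import random

def short_long(candidates, k):
    """k/2 closest nodes plus k/2 random nodes from the rest."""
    closest = sorted(candidates, key=lambda c: c[1])[:k // 2]
    rest = [c for c in candidates if c not in closest]
    return closest + random.sample(rest, k - len(closest))

def bin_short_long(candidates, k, my_bin):
    """k/2 random nodes from the node's own bin plus k/2 random others."""
    same_bin = [c for c in candidates if c[2] == my_bin]
    picked = random.sample(same_bin, min(k // 2, len(same_bin)))
    rest = [c for c in candidates if c not in picked]
    return picked + random.sample(rest, k - len(picked))

def bin_short_long_sampled(candidates, k, my_bin, sample_size=10):
    """k/2 closest nodes from a random sample of the node's own bin
    plus k/2 random others."""
    same_bin = [c for c in candidates if c[2] == my_bin]
    sample = random.sample(same_bin, min(sample_size, len(same_bin)))
    closest = sorted(sample, key=lambda c: c[1])[:k // 2]
    rest = [c for c in candidates if c not in closest]
    return closest + random.sample(rest, k - len(closest))

# Example: 20 candidates with random RTTs split over two bins.
cands = [(i, random.uniform(5, 200), 'A' if i % 2 else 'B') for i in range(20)]
print(bin_short_long(cands, k=6, my_bin='A'))
```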
Discussion • How to select nodes at random in a distributed environment?
Server Selection • Select a server in the same bin as the client • If there is no such server, select the server whose bin is most similar to the client's (see the sketch below)
Stretch = (latency to selected server) / (latency to optimal server)
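A sketch of binning-based server selection together with the stretch metric above. The similarity measure (length of the common prefix of two landmark orderings) is one plausible choice assumed here for illustration, not prescribed by the slides.

```python
import random

def prefix_similarity(bin_a, bin_b):
    """Number of leading positions on which two landmark orderings agree."""
    n = 0
    for a, b in zip(bin_a, bin_b):
        if a != b:
            break
        n += 1
    return n

def select_server(client_bin, servers):
    """servers: list of (server_id, bin) pairs. Prefer a random server
    in the client's own bin; otherwise pick one from the most similar bin."""
    same = [sid for sid, b in servers if b == client_bin]
    if same:
        return random.choice(same)
    return max(servers, key=lambda s: prefix_similarity(client_bin, s[1]))[0]

def stretch(latency_to_selected, latency_to_optimal):
    """Stretch metric from the slide: 1.0 means the optimal server was chosen."""
    return latency_to_selected / latency_to_optimal

servers = [('s1', ('L1', 'L2', 'L3')), ('s2', ('L2', 'L3', 'L1'))]
print(select_server(('L2', 'L1', 'L3'), servers))  # no exact match -> s2
```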
Performance improves as the number of landmarks increases • The improvement saturates rapidly
Discussion • Load imbalance • Selecting 1 server out of 1,000 servers in a topology of 10k nodes
Conclusion • A simple, scalable, binning scheme to infer network proximity information • Applying this scheme to overlay construction and server selection can significantly improve application performance.
Thank you! Any questions?
Distributed Binning • A set of nodes independently partitions itself into disjoint “bins” • Nodes within a single bin are relatively closer to one another than to nodes not in their bin • A small set of landmark machines geographically distributed over the Internet is used to “measure” latency • Check average inter-bin and intra-bin latencies to ensure the binning does its job
Distributed Binning Example • TS-10K and TS-1K: Transit-Stub topologies with 10,000 and 1,000 nodes • PLRG1 and PLRG2: Power-Law Random Graphs with 1,166 and 1,779 nodes • NLANR: National Laboratory for Applied Network Research Active Measurement Project • Consisting of 100 active monitors that exchange information
Binning-based Server Selection • If there exist one or more servers within the same bin as the client, the client is redirected to a random server from its own bin • If no server exists within the same bin as the client, the client is redirected to a server from the most similar bin
Scalability • Each node only needs to measure its RTT to a small set of landmarks • With a million nodes on the network, refreshing every hour, each landmark would handle approximately 2700 pings/sec • Better scalability by having multiple nodes at a location act as a single logical landmark
Construction of CAN Topologies Using Binning • Ordering of landmarks is used for binning • m landmarks, m! orderings • The co-ordinate space is divided along the first dimension into m portions, each portion subdivided along the second dimension into m-1 portions, and so on • A new node joins the CAN at a random point within the portion associated with its landmark ordering • Result • Co-ordinate space not uniformly populated • Uneven distribution of zone sizes (future work!)