Detecting and Reducing Partition Nodes in Limited-routing-hop Overlay Networks Zhenhua Li and Guihai Chen State Key Laboratory for Novel Software Technology Nanjing University, Nanjing, P. R. China lizhenhua@dislab.nju.edu.cn, gchen@nju.edu.cn
Background • Overlay networks - base infrastructures of many Internet applications • Limited routing hops - routing one hop in the overlay network is much more expensive than routing one hop in the underlying network - routing uses flooding or flooding-based mechanisms - so every message carries a hop limit called TTL
Motivation • Overlay partition - seriously degrades system performance • Existence of topologically-critical nodes - the failure of some nodes is much more likely to cause overlay partition than the failure of others
Related work (1) • Proactive avoidance, event-driven - a centralized server directs nodes’ joins and leaves - but the server becomes a single point of failure • Proactive avoidance, periodic detection - CAM: actively detects cut nodes and then neutralizes them into normal nodes - but the cut node concept is not applicable to limited-routing-hop overlay networks
Related work (2) • Reactive recovery, event-driven - ring partition detection and repair on Pastry and SkipNet - but these methods only work on ring topologies • Reactive recovery, periodic detection - cross-check method: a node asks other nodes to run random queries and compares their results with its own - but detection involves much randomness and uncertainty
Our proposed ideas • The concept of partition node - topologically-critical nodes of limited-routing-hop overlay networks • Partition node detection and reduction - a distributed proactive method to detect partition nodes - reduce partition nodes by changing them to normal nodes - greatly enhance the connectivity and fault tolerance of overlay networks
Outline • Partition node concept • Partition node detection • Partition node reduction • Performance evaluation
Partition node concept (1) • Cut node vs. partition node. - In (a) and (b), C is a cut node because when C fails, the overlay network is partitioned; - In (c) and (d), C is a partition node because when C fails, the overlay network is not partitioned, but C’s neighbors 1, 3, 5, 7 can no longer find each other.
Partition node concept (2) Definition 1 (Locatability) In a limited-routing-hop overlay network, node A can locate node B only if A can find B by sending routing messages. It is denoted by A→B. Definition 2 (Reachability) In a limited-routing-hop overlay network, node A can reach node C if A can locate C, or A can locate some node B and B can locate C. It is denoted by A→→C.
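Definitions 1 and 2 can be illustrated with a minimal sketch. It assumes that routing is plain flooding over an undirected overlay graph and that "A can find B" means B lies within TTL overlay hops of A. The function names `locatable` and `reachable`, the adjacency-dict representation, and the six-node path graph are illustrative choices, not part of the slides.

```python
from collections import deque

def locatable(graph, src, ttl):
    """Nodes src can locate (A→B): found by flooding at most ttl hops.

    Assumption: flooding is modelled as breadth-first search over an
    adjacency dict {node: [neighbors]}.
    """
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == ttl:          # message has used up its TTL
            continue
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return seen - {src}

def reachable(graph, src, ttl):
    """Nodes src can reach (A→→C), read literally from Definition 2:
    src locates C, or src locates some B and B locates C."""
    direct = locatable(graph, src, ttl)
    found = set(direct)
    for mid in direct:
        found |= locatable(graph, mid, ttl)
    return found - {src}

# Hypothetical overlay: a path 1-2-3-4-5-6, with TTL = 2.
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
print(sorted(locatable(path, 1, 2)))   # [2, 3]
print(sorted(reachable(path, 1, 2)))   # [2, 3, 4, 5]
```

In this toy graph node 1 locates nodes 2 and 3, reaches nodes 4 and 5 through them, and cannot reach node 6.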
Partition node concept (3) • Example: Node 1 can only locate nodes 2, 3, 4, and can reach nodes 5, 6, 7, but cannot reach node 8.
Partition node concept (4) • Definition 3 (Partition Node) Node C is a partition node if C’s neighbor set would be partitioned into two or more unreachable subsets S1, S2, . . . , Sn (n≥2) when C fails. • Example:
Partition node detection (1) 4 steps: • Initialize detection (0)(1) • Probe reachability (2a)(2b) • Partition subsets (3) • Make decision (4)
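The four steps can be mimicked by a centralized simulation. This is only a sketch under stated assumptions: the method in the slides is distributed, whereas here one process sees the whole graph; flooding is modelled as breadth-first search limited to TTL hops; reachability follows Definition 2 literally; and neighbors are grouped into subsets by linking any two that can reach each other with C excluded. All function names and both test graphs are hypothetical.

```python
from collections import deque

def locatable(graph, src, ttl, failed):
    """Nodes src can locate within ttl hops while `failed` is down."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == ttl:
            continue
        for nxt in graph[node]:
            if nxt != failed and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return seen - {src}

def reachable(graph, src, ttl, failed):
    """Definition 2: locate directly, or via one located intermediate."""
    direct = locatable(graph, src, ttl, failed)
    found = set(direct)
    for mid in direct:
        found |= locatable(graph, mid, ttl, failed)
    return found - {src}

def partition_subsets(graph, c, ttl):
    neighbors = list(graph[c])                       # initialize detection
    reach = {n: reachable(graph, n, ttl, failed=c)   # probe reachability
             for n in neighbors}
    subsets, unassigned = [], set(neighbors)         # partition subsets
    while unassigned:
        seed = unassigned.pop()
        group, stack = {seed}, [seed]
        while stack:
            cur = stack.pop()
            linked = {n for n in unassigned if n in reach[cur]}
            unassigned -= linked
            group |= linked
            stack.extend(linked)
        subsets.append(group)
    return subsets

def is_partition_node(graph, c, ttl):
    # make decision: two or more mutually unreachable subsets
    return len(partition_subsets(graph, c, ttl)) >= 2

# Hypothetical overlay: a 24-node ring plus hub 0 linked to 1, 7, 13, 19.
# Without the hub the ring stays connected, but the hub's neighbors are
# 6 ring hops apart, beyond the 2 * TTL = 4 hops that reachability covers.
ring = {i: [(i - 2) % 24 + 1, i % 24 + 1] for i in range(1, 25)}
ring[0] = [1, 7, 13, 19]
for n in ring[0]:
    ring[n].append(0)

# Contrast case: in a complete graph every neighbor of 0 is adjacent.
k4 = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}

print(is_partition_node(ring, 0, ttl=2))  # True
print(is_partition_node(k4, 0, ttl=2))    # False
```

The ring example mirrors the situation in which the overlay is not partitioned when C fails, yet C's neighbors can no longer find each other.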
Partition node reduction (1) • Add edges to reduce partition nodes - choose an appropriate delegate node Ni from each subset Si, - and then connect all the delegate nodes in some way. - To improve the system’s fault tolerance, we try to keep every node’s degree above a constant lower bound whenever possible.
Partition node reduction (2) • Linear chain connection vs. Chordal ring connection - the chordal ring needs more edges, but is much more resilient
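The trade-off between the two connection schemes can be seen in a small sketch. The slides do not spell out the exact chordal ring construction, so this version assumes a ring over the delegate nodes plus chords of a fixed stride (2 by default). The helper names and the six delegates `d1` to `d6` are hypothetical.

```python
def linear_chain(delegates):
    """Connect delegates as d1 - d2 - ... - dn (n - 1 edges)."""
    return {frozenset(pair) for pair in zip(delegates, delegates[1:])}

def chordal_ring(delegates, stride=2):
    """Ring over the delegates plus chords that skip `stride` positions.

    Assumption: the stride and the construction are illustrative only.
    """
    n = len(delegates)
    edges = set()
    for i in range(n):
        for step in (1, stride):
            a, b = delegates[i], delegates[(i + step) % n]
            if a != b:
                edges.add(frozenset((a, b)))
    return edges

def connected_after_failure(nodes, edges, failed):
    """True if the surviving delegates still form one connected group."""
    alive = [n for n in nodes if n != failed]
    if not alive:
        return True
    adj = {n: set() for n in alive}
    for edge in edges:
        if failed not in edge:
            a, b = tuple(edge)
            adj[a].add(b)
            adj[b].add(a)
    seen, stack = {alive[0]}, [alive[0]]
    while stack:
        for nxt in adj[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(alive)

delegates = ["d1", "d2", "d3", "d4", "d5", "d6"]
chain = linear_chain(delegates)
ring = chordal_ring(delegates)

print(len(chain), len(ring))                            # 5 12
print(connected_after_failure(delegates, chain, "d3"))  # False
print(connected_after_failure(delegates, ring, "d3"))   # True
```

With six delegates the chain uses 5 edges and splits when an interior delegate fails, while this chordal ring uses 12 edges and survives the failure of any single delegate.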
Partition node reduction (3) • Remove edges to limit node degree - the new edges added to reduce a partition node cannot be removed; - remove the edge whose corresponding node has the highest load factor. • Total cost of partition node detection and reduction - n: total number of nodes, t: TTL, c: average node degree - total cost is
Performance evaluation (1) • Partition nodes’ significance to overlay topology.
Performance evaluation (2) • Effectiveness of our method
Performance evaluation (3) • Fault tolerance improvement
The End Thanks!