Doctoral Thesis
Failure Recovery of Overlay Tree-based Structures
Ing. Vladimír Dynda
Doc. RNDr. Ing. Petr Zemánek, CSc. (supervisor)
Czech Technical University in Prague
Faculty of Electrical Engineering
Department of Computer Science and Engineering

Agenda
• Introduction
• Solution
• BR Platform
• Bypass Routing
• Leader Link Election
• Tree Reconnection
• Summary of Results
• Conclusion
Vladimír Dynda: Failure Recovery of Overlay Tree-based Structures

Introduction
• Problem statement
[Figure: network S = (N, L) carrying the tree T = (TM, CE); the failed cluster FC separates the fragments T0 to T6, which are joined into the restored tree TR = (TM \ FC, CE')]

Introduction
• Problem statement
• Failure recovery
  • Reconnection of T0, T1, ..., TN-1 into a restored network TR = (TM \ FC, CE')
  • Correctness – TR is acyclic
  • Completeness – TR contains all the fragments

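The two requirements above can be checked mechanically. The following is a minimal sketch, assuming each fragment T0, ..., TN-1 is contracted to a single vertex (the fragments are themselves trees, so contraction does not hide cycles) and the new links are given as pairs of fragment indices. The function name `check_reconnection` and this input encoding are illustrative assumptions, not part of the thesis.

```python
def check_reconnection(num_fragments, links):
    """Return (correct, complete) for a set of reconnection links.

    correct  -> the links close no cycle among the fragments
    complete -> every fragment ends up in one connected structure
    """
    parent = list(range(num_fragments))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    correct = True
    for a, b in links:
        ra, rb = find(a), find(b)
        if ra == rb:
            correct = False      # both ends already connected: a cycle
        else:
            parent[ra] = rb      # merge the two groups of fragments

    roots = {find(i) for i in range(num_fragments)}
    complete = len(roots) == 1
    return correct, complete


if __name__ == "__main__":
    print(check_reconnection(4, [(0, 1), (1, 2), (2, 3)]))  # chain of fragments
    print(check_reconnection(4, [(0, 1), (1, 2), (2, 0)]))  # cycle, one fragment left out
```

A restoration that is both correct and complete uses exactly N-1 links between N fragments, which is why the two properties are usually verified together.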
Introduction
• Problem statement
• Environment
  • Asynchronous distributed system
  • No central authority / no global knowledge
  • Unlimited sizes of S and T
  • Arbitrary traffic direction in T
• Failures
  • Node failures only
  • Fail-stop failure model
  • Failures must not split S

Introduction
• Goals of the thesis
  • Proposal of a generic recovery platform
  • Illustration of the tree restoration methods
  • Simulation & verification of the theoretical properties
  • Survey of possible applications

Introduction
• State of the art
• On-demand / preplanned recovery
• Preplanned methods
  • Employ pre-computed backup structures
• Existing preplanned methods
  • Complete graph (Narada)
  • Ancestor list (Yang-Fei, EFTMRP, HMTP)
  • Administrative hierarchy (Nice, Nemo)
  • Secondary trees (Dual-tree, Coop-net)
  • Link to random nodes (HMTP, Yoid)

Introduction
• State of the art
• Weaknesses of the existing methods
  • Poor scalability
  • Restricted set of applicable trees
  • Single points of failure
  • Fixed level of fault tolerance
  • Unrecoverable multiple failures
  • Non-local restoration

BR Platform
• Bypass ring platform
  • Ensures correctness and completeness
  • Forms a basis for tree reconnection
• Fabric of redundant links in T:
  • Bypass rings of optional diameter
  • Alternative paths in the event of failure
  • Location & routing among the fragments

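To make the idea of redundant ring links concrete, here is a minimal sketch of the simplest case only: a ring that links the neighbors of one node in identifier order, so that the neighbors can still reach each other if that node fails. The toy tree, the helper names `neighbor_ring` and `survivors_connected`, and the use of plain sorted identifiers as the ordering are all assumptions made for illustration; the platform's actual rings of larger diameter are not modelled.

```python
def neighbor_ring(tree, n):
    """Ring links among the neighbors of n, taken in sorted identifier order."""
    seq = sorted(tree[n])
    if len(seq) < 2:
        return []                       # nothing to bypass
    if len(seq) == 2:
        return [(seq[0], seq[1])]       # a single link closes the gap
    return [(seq[i], seq[(i + 1) % len(seq)]) for i in range(len(seq))]


def survivors_connected(tree, failed, extra_links):
    """True if all nodes except `failed` are mutually reachable."""
    alive = [v for v in tree if v != failed]
    adj = {v: {u for u in tree[v] if u != failed} for v in alive}
    for a, b in extra_links:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = {alive[0]}, [alive[0]]
    while stack:                        # depth-first search over survivors
        v = stack.pop()
        for u in adj[v] - seen:
            seen.add(u)
            stack.append(u)
    return len(seen) == len(alive)


# Hypothetical toy tree (adjacency lists); node "n" is the one that fails.
TREE = {
    "n": ["a", "b", "c"],
    "a": ["n", "a1"], "a1": ["a"],
    "b": ["n"],
    "c": ["n", "c1"], "c1": ["c"],
}

if __name__ == "__main__":
    ring = neighbor_ring(TREE, "n")
    print(ring)
    print(survivors_connected(TREE, "n", []))    # fragments are split
    print(survivors_connected(TREE, "n", ring))  # ring bridges the fragments
```

Note that the ring in this sketch is itself a cycle, so it offers alternative paths but cannot be adopted wholesale as tree edges; some further step has to choose an acyclic subset.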
BR Platform
• Failure recovery
[Figure: recovery stages on T = (TM, CE): bypass rings BRT(n1,2), BRT(n1,3), BRT(n1,4), and BRT(n2,2) around the failed cluster FC; bypass routing along BC(FC); leader link election (leader marked); tree reconnection into TR = (TM \ FC, CE')]

BR Platform
• Elemental steps of the recovery
  • Initialization of the platform
  • Failure detection
  • Designated nodes discovery
  • Leader link election
  • Tree reconnection
  • Bypass rings reconfiguration
[Slide annotations: Bypass routing; Correctness & Completeness]

Bypass Routing
• Partially ordered tree (POT)
  • Ordered neighbor sequence
  • Ordered rays
[Figure: tree T = (TM, CE) with hexadecimal node identifiers, showing the ordered neighbor sequences SeqT(A0) and SeqT(3C), BT(A0,3C), and the rays R+(A0,3C) and R-(A0,3C)]

Bypass Routing
• Bypass ring BRT(n, d)
[Figure: rings BRT(n,2), BRT(n,3), and BRT(n,4) = BRT(n,dmax) for dmax = 4 around node n; neighbors n0 to n3 in SeqT(n), with BT(n,n0) to BT(n,n3) and the rays R+(n,ni) and R-(n,ni) for each neighbor ni]

Bypass Routing
• Bypass rings
[Figure: rings BRT(n1,2), BRT(n1,3), BRT(n2,4), BRT(n2,5), ..., BRT(nm,dmax) along the ray R+(n,n1) through the nodes n, n1, n2, ..., ndmax of BT(n,n1), with the failed cluster FC in T = (TM, CE)]

Bypass Routing
• Routing algorithm
  • <FC>T = BT(ni, nj), nj AT(ni)
[Figure: the route BC(FC) around the failed cluster FC in T = (TM, CE), passing BT(ni1,nj1), BT(ni2,nj2), and BT(ni3,nj3) along the ray R+(ni1,nj1)]

Bypass Routing
• Example
[Figure: bypass routing example around the failed cluster FC in T = (TM, CE), showing BC(FC), the rings BRT(3C,2), BRT(3C,3), and BRT(A0,4), and the ray R+(72,3C)]

Bypass Routing
• Properties
  • Memory overhead at node n ∈ T: O(degT(n) * dmax)
  • Routing is successful if lenX(ni, ni+1) ≤ dmax, X = R+(ni, nj), for all neighbors ni and ni+1 ∈ BC(FC)
  • Lower bound of maximum size of FC: dmax/2 nodes for arbitrary clusters

Leader Link Election
• Leader link election (LLE)
  • Guarantees correctness
  • Communication structure – BC(FC)
• Node states
  • Passive – initial state of the election
  • Active – leader candidates
  • Relay – election is lost

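For readers unfamiliar with elections on rings, the sketch below simulates the classic Chang-Roberts algorithm on a unidirectional ring. It is a swapped-in textbook technique, not the thesis's LLE: it elects a node rather than a link, and it only borrows the state names "active" and "relay" from the list above (every node starts as a candidate here, so the passive state is skipped). The function name `ring_election` and the assumption of distinct identifiers are illustrative.

```python
from collections import deque


def ring_election(ids):
    """Simulate a Chang-Roberts election; ids[i] sends to ids[(i + 1) % n].

    Returns (leader_index, message_count, states). Identifiers must be distinct.
    """
    n = len(ids)
    state = ["active"] * n                       # every node is a candidate
    pending = deque(((i + 1) % n, ids[i]) for i in range(n))
    messages = n                                 # one initial message per node
    leader = None
    while pending:
        dest, cid = pending.popleft()
        if cid > ids[dest]:
            state[dest] = "relay"                # lost: only forwards from now on
            pending.append(((dest + 1) % n, cid))
            messages += 1
        elif cid == ids[dest]:
            leader = dest                        # own identifier came back
        # a smaller identifier is silently discarded
    return leader, messages, state


if __name__ == "__main__":
    print(ring_election([1, 2, 3, 4]))   # identifiers ascending along the ring
    print(ring_election([4, 3, 2, 1]))   # identifiers descending along the ring
```

The two runs show how strongly the message count depends on the ordering of identifiers along the ring: 2N-1 messages in the ascending case against N(N+1)/2 in the descending one.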
Leader Link Election
• LLE on ordered rings
[Figure: ordered ring BC(FC) = BRT(n,2) over SeqT(n) = n0, n1, ..., nN-1 around the failed node n, with ID(n0) < ID(n1) < ... < ID(nN-1); messages ELECTION(n0) and ELECTION(n1); comparisons ID(n0) < ID(n1), ID(n1) < ID(n2), and ID(nN-1) < ID(n0); elected leader marked]

Leader Link Election
• LLE in partially ordered trees
  • Sweep process
  • Hierarchical identifier HIDT(nr, ni)
[Figure: BC(FC) along R+ with root nr = 4F, SeqT(nr), and SeqT(A1); hierarchical identifiers HIDT(4F,D8) = 4F.A1.BA.D8, HIDT(4F,97) = 4F.A1.BA.97, and HIDT(4F,16) = 4F.A1.16; messages ELECTION(4F.*), ELECTION(A1.BA.97), and SWEEP(4F.A1); comparison A1.BA < A1.16; elected leader marked]

Leader Link Election
• Example
[Figure: LLE example around the failed cluster FC in T = (TM, CE) with root nr: messages ELECTION(3C.A0.1D), ELECTION(A0.B9.CE), and SWEEP(3C.A0); comparisons 3C.A0 < 3C.A0 and A0.B9 < A0.1D; elected leader marked]

Leader Link Election
• Properties
  • Average message complexity: O(N log_b N), where b is the average branching factor of FC nodes in T
  • Time complexity: O(N)

Tree Reconnection
• Reconnection methods
  • Reconnect the fragments located by the routing algorithm
  • Abide by the results of LLE
  • Designed to meet the specific application requirements
  • Influence properties of the restored tree

Tree Reconnection
• LR method
[Figure: LR method applied to the nodes n1, n2, n3 on BC(FC)]

Tree Reconnection
• HR-x method
  • (q0, qi) if i ≡ 1 (mod x)
  • (qi-1, qi) otherwise
[Figure: HR-1 on BC(FC) with the nodes n1, n2, n3, each taking the role q0 of its own sequence q0, q1, q2, ...]

Tree Reconnection
• HR-x method
  • (q0, qi) if i ≡ 1 (mod x)
  • (qi-1, qi) otherwise
[Figure: HR-2 on BC(FC) with the nodes n1, n2, n3]

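The HR-x rule can be written out directly. The sketch below is a minimal reading of it, assuming q0, ..., qk is one sequence of nodes to be reconnected and each pair denotes a new link; nodes are represented by their indices, and the function name `hr_links` is hypothetical.

```python
def hr_links(k, x):
    """Links over q0..qk under the HR-x rule, as (lower index, higher index) pairs."""
    links = []
    for i in range(1, k + 1):
        if (i - 1) % x == 0:          # i ≡ 1 (mod x): attach directly to q0
            links.append((0, i))
        else:                         # otherwise: attach to the predecessor
            links.append((i - 1, i))
    return links


if __name__ == "__main__":
    print(hr_links(5, 1))   # HR-1: every node links to q0
    print(hr_links(5, 2))   # HR-2: chains of two nodes hanging off q0
```

Under this reading every qi with i ≥ 1 gains exactly one link to a lower-indexed node, so the result is always a tree on q0, ..., qk. With x = 1 it is a star around q0, and larger x trades a lower degree at q0 for longer chains.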
Tree Reconnection
• Example
[Figure: tree reconnection example using the HR-2 method, continuing the election example with ELECTION(3C.A0.1D), ELECTION(A0.B9.CE), and SWEEP(3C.A0); the fragments around FC are joined into TR = (TM \ FC, CE')]

Tree Reconnection
• Properties

Summary of Results
• Properties of the BR platform
  • Node memory overhead: O(degT(n) * dmax)
  • Average message complexity:
    • O(N log_b N) for arbitrary failures
    • N for single failures
  • Lower bound of max. recoverable failure:
    • dmax/2 nodes for arbitrary failed clusters
    • dmax-1 nodes for internal failed clusters

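As a quick worked example of how dmax enters these bounds, the helper below evaluates the three expressions for a given node degree and dmax. The function `br_bounds` and its key names are hypothetical, and `overhead_term` is only the growth term inside the O-notation, not an exact count of stored links. How dmax/2 is rounded for odd dmax is not stated on the slide, so the raw quotient is returned.

```python
def br_bounds(degree, dmax):
    """Evaluate the quoted bounds for one node degree and one dmax value."""
    return {
        "overhead_term": degree * dmax,     # term inside O(degT(n) * dmax)
        "arbitrary_cluster": dmax / 2,      # lower bound, arbitrary failed clusters
        "internal_cluster": dmax - 1,       # lower bound, internal failed clusters
    }


if __name__ == "__main__":
    for dmax in (2, 4, 8):
        print(dmax, br_bounds(degree=3, dmax=dmax))
```

Doubling dmax doubles both the memory term and the guaranteed size of a recoverable arbitrary cluster, which is the fault-tolerance versus cost trade-off in its simplest form.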
Summary of Results
• Simulation results
  • Successfully recovered cluster
    • Average diameter: dmax-2
    • Average size: 1.5 dmax
  • Linear recovery time
  • dmax parameter
    • Controls fault tolerance vs. costs
    • dmax = 4 provides ample tolerance for GFS

Summary of Results
• Properties of the platform
  • Locality
  • Multiple failure recovery
  • Scalability
  • Application requirements consideration
    • Optional level of fault tolerance
    • Protection selectivity
    • Designated nodes discovery
    • Tree reconnection method
  • Independence of the protected tree type

Summary of Results
• Applications
  • Overlay multicast
    • Applicable in all types
  • Network-layer multicast
    • Extension with BR(n,1) needed
  • Sample application – GFS multicast
    • Designed for large-scale P2P systems
    • Based on a layered administrative hierarchy
    • Employs BR platform to achieve fault tolerance

Conclusion
• Thesis summary
  • Analysis of the overlay tree environment and identification of recovery properties
  • Proposal of the BR platform
  • Design of the specialized leader election
  • Illustration of the tree reconnection
  • Simulation of the platform
  • Outline of the overlay multicast scheme

Conclusion
• Ideas for further research
  • Autonomous management of fault-tolerance level and protection selectivity
  • More sophisticated tree reconnection methods
  • Extension of the platform for network-layer multicast