Self-Stabilization in Distributed Systems

Self-Stabilization in Distributed Systems Barath Raghavan Vikas Motwani Debashis Panigrahi

Outline • Introduction • Concept of Self-Stabilization • Dijkstra’s Model of Self-stabilization • Super-Stabilization • Application of Stabilization on Sensor Network • Discussion

Distributed Systems • A distributed system consists of loosely connected machines that do not share a global memory; each machine has only a partial view of the global state • The system consists of components: processes and interconnections (message passing or shared memory) • The topology is formed by the components as nodes and the interconnections as directed edges • Each component has a local state • The global system state is the union of the local states • The behavior of the system is represented by the global states and the transitions between them • Designing a robust distributed system requires the ability to recover from unforeseen perturbations • Previous approaches handle permanent failures by introducing redundant components • A transient malfunction (message corruption, sensor malfunction, memory error) can put the system into an illegal state

Self-stabilization • A self-stabilizing system guarantees that • Regardless of its current state, it recovers to a legal configuration in a finite number of steps • It remains in a legal configuration thereafter until a malfunction occurs • Examples of self-stabilization • Mathematics: Newton-Raphson method for finding a square root • Control theory: systems with feedback • Distributed systems: pioneered by Dijkstra • Questions • Is self-stabilization always possible? If not, under what conditions is it possible? • How do we design algorithms that are self-stabilizing? • What is the cost of self-stabilization, and how fast does it converge?
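The Newton-Raphson example can be sketched in a few lines. This is only an illustration under assumptions that are not in the slides: the function names, the "legal state" test (the estimate is close to the true root) and the injected fault are hypothetical, and the starting guess is assumed to be positive.

```python
def newton_sqrt_step(x, a):
    """One Newton-Raphson step for f(x) = x*x - a."""
    return 0.5 * (x + a / x)


def run(a, x0, steps, fault_at=None, fault_value=None):
    """Iterate from x0; optionally overwrite the state once (a transient fault)."""
    x = x0
    for i in range(steps):
        if fault_at is not None and i == fault_at:
            x = fault_value  # hypothetical fault: the estimate is corrupted
        x = newton_sqrt_step(x, a)
    return x


def is_legal(x, a, tol=1e-9):
    """Assumed legality predicate: x is within tol of sqrt(a)."""
    return abs(x - a ** 0.5) < tol
```

Calling `run(2.0, 1e6, 60)` or `run(2.0, 0.001, 60)` starts far from the root, and `run(2.0, 1.0, 60, fault_at=10, fault_value=1e6)` corrupts the estimate mid-run; in each case the iteration returns to the legal region without any special recovery code, which is the behavior the slide points at.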

Properties of self-stabilization • Closure • Once a predicate P over the system state S becomes true, it remains true thereafter (in the absence of further faults) • Convergence • Regardless of the starting state, P will be satisfied within a finite number of state transitions • OR: Given a starting state in which a predicate Q is satisfied, P will be satisfied within a finite number of state transitions • These properties must hold even against an adversary • The adversary may know the system’s design and may damage or erase state

Using Self-Stabilization for Fault Tolerance • Why use self-stabilization for fault tolerance? • In some sense, because it was not designed for fault tolerance • Fault tolerance is a side effect of the self-stabilizing behavior of a system • No case checking • Guaranteeing that a system is self-stabilizing with respect to some predicate assures that, for that predicate, the system will never “fail” • Self-stabilization implies (finite-time) recovery after failure • Existing approaches rely on avoiding failure at all costs, and then handle only certain failure modes • No probabilistic guarantees necessary • Since the system must stabilize regardless of the initial state, one can argue that a self-stabilizing system tolerates every transient fault, whatever state the fault leaves behind

Dijkstra’s Self-Stabilization Techniques • His question: can a self-stabilizing system exist at all? • Yes. • Can the processes (finite state machines) be the same? • No. • Are some processes “privileged”? • Yes. • What do they do? • No(thing).

Dijkstra’s Self-Stabilization Techniques (2) • How does his K-state self-stabilizing system work? • Each
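The description on this slide is cut off. As a hedged sketch, the following simulates the K-state token ring as it is commonly presented: n processes in a ring each hold a value in 0..K-1 with K >= n, process 0 is the distinguished one, and a legal state has exactly one privileged process. The function names and the random central scheduler are assumptions made for this illustration.

```python
import random


def privileged(x, i):
    """True if process i holds a privilege in ring state x."""
    if i == 0:
        return x[0] == x[-1]      # process 0 compares with the last process
    return x[i] != x[i - 1]       # every other process compares with its predecessor


def move(x, i, k):
    """Fire privileged process i, updating the ring state x in place."""
    if i == 0:
        x[0] = (x[0] + 1) % k     # the distinguished process increments modulo K
    else:
        x[i] = x[i - 1]           # the others copy their predecessor


def privileged_set(x):
    return [i for i in range(len(x)) if privileged(x, i)]


def simulate(x, k, steps, rng):
    """Central scheduler (assumed): one randomly chosen privileged process moves per step.

    Returns the number of privileged processes observed before each step.
    """
    history = []
    for _ in range(steps):
        p = privileged_set(x)
        history.append(len(p))
        move(x, rng.choice(p), k)
    return history
```

Starting from an arbitrary state, for example `simulate([3, 0, 5, 5, 1], 6, 300, random.Random(0))`, the count of privileged processes is never zero, drops to one, and then stays at one, which corresponds to convergence followed by closure.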

Pursuer-Evader Problem in Sensor Networks • Problem: How do you design the motion of the pursuer so that it eventually finds the unpredictable evader? • Examples • Motion tracking • Sensing occurrences of unlikely events • Earlier solutions are not applicable in sensor networks • Limited computational resources (no centralized algorithm) • Communication burden • Fault-prone environment • On-site maintenance infeasible

The Problem • Input • A system consisting of a large number of sensor nodes • Each node j has a neighbor set nbr.j that it can communicate with • Each node has a clock synchronized with the other nodes • Two distinguished processes: pursuer and evader • Both processes are mobile, changing location from node to node • The evader moves at a slower rate than the pursuer • The strategy of evader movement is unknown to the network • Goal: Design a program for the motes and the pursuer so that the pursuer can “catch” the evader, i.e. the pursuer and evader are guaranteed to end up at the same node • A program consists of a set of variables, mote actions, pursuer actions and evader actions

Assumptions • Evader action • The evader picks a random neighbor and moves to that node • Fault model • Transient faults can corrupt the state at any node • Transient faults can also restart motes • The graph is assumed to remain connected despite faults

Evader-centric Program • Each node stores a variable ts: the latest timestamp the node knows of at which the evader was detected • If a node detects the evader, it sets its ts to the current clock time • Otherwise, a node compares its own ts value with those of all its neighbors and stores the maximum, recording the neighbor that supplied it as its parent p • Once the tracking tree is formed, the pursuer simply follows the parent link from each node to reach the evader's node
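A minimal simulation of the timestamp and parent rules can make the tree formation concrete. It is a sketch under simplifying assumptions that are not in the slides: the motes form a grid, all motes act in synchronous rounds, the evader stays at one node, and the initial state is arbitrary except that every ts is below the clock and every parent is a neighbor. All names are hypothetical.

```python
import random


def grid_neighbors(w, h):
    """Neighbor sets for a w-by-h grid of motes (assumed topology)."""
    nbr = {}
    for x in range(w):
        for y in range(h):
            cand = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
            nbr[(x, y)] = [(a, b) for a, b in cand if 0 <= a < w and 0 <= b < h]
    return nbr


def round_step(nbr, ts, parent, evader, clock):
    """One synchronous round: every mote reads the previous round's values."""
    new_ts, new_parent = dict(ts), dict(parent)
    for j in nbr:
        if j == evader:
            new_ts[j], new_parent[j] = clock, j      # detection: stamp with the clock
        else:
            best = max(nbr[j], key=lambda k: ts[k])  # neighbor with the newest ts
            if ts[best] > ts[j]:
                new_ts[j], new_parent[j] = ts[best], best
    return new_ts, new_parent


def hops_to_evader(parent, start, evader):
    """Follow parent links from start; None if the evader is not reached."""
    node, hops = start, 0
    while node != evader:
        if hops > len(parent):
            return None
        node, hops = parent[node], hops + 1
    return hops


def stabilize(w, h, evader, rounds, seed=0, clock=100):
    """Run the rounds from a random (corrupted) initial state; return the parents."""
    rng = random.Random(seed)
    nbr = grid_neighbors(w, h)
    ts = {j: rng.randrange(clock) for j in nbr}        # arbitrary, but ts < clock
    parent = {j: rng.choice(nbr[j]) for j in nbr}      # arbitrary, but a neighbor
    for _ in range(rounds):
        ts, parent = round_step(nbr, ts, parent, evader, clock)
        clock += 1
    return parent
```

With `parent = stabilize(4, 4, (0, 0), 20)`, following parent links from any mote reaches the evader's mote at `(0, 0)`; under these assumptions the stale initial timestamps are overtaken by fresh ones from the evader, so the corrupted parent links are overwritten.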

Evader-centric Program: Observations • The tracking tree is a spanning tree rooted at the mote where the evader resides • The distance between pursuer and evader does not increase once the tree includes the pursuer • The pursuer catches the evader in at most M + 2M*a/(1-a) steps • Stabilization: If we assume (ts < clock) and (parent is a neighbor), it can be proved that the above algorithm stabilizes • Performance: • Energy = deg * N • Time = D + 2D*a/(1-a)

Pursuer-Centric Program • Each node updates its ts when it detects the evader • Motes communicate with neighbors only at the request of the pursuer • When the pursuer visits a node, it sets that node's ts to 0 and moves to the neighboring node with the maximum ts
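The pursuer's rule can be sketched as a small loop. This is an illustration only: the function name, the random tie-breaking among equal timestamps, and the optional `evader_period` parameter (the evader moves once every that many pursuer steps) are assumptions, not details given in the slides.

```python
import random


def pursue(nbr, ts, pursuer, evader, max_steps, rng, evader_period=None):
    """Return the number of pursuer steps needed to catch the evader, or None.

    nbr maps each node to its neighbor list; ts maps each node to its
    timestamp and is modified in place.
    """
    clock = max(ts.values()) + 1
    for step in range(max_steps + 1):
        if pursuer == evader:
            return step
        ts[pursuer] = 0                               # visited node is reset
        best = max(ts[k] for k in nbr[pursuer])       # query the neighbors
        pursuer = rng.choice([k for k in nbr[pursuer] if ts[k] == best])
        if evader_period and (step + 1) % evader_period == 0 and pursuer != evader:
            evader = rng.choice(nbr[evader])          # evader moves at a slower rate
            ts[evader] = clock                        # the new node detects it
            clock += 1
    return None


def path_graph(n):
    """Neighbor sets for motes arranged in a line (assumed topology)."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
```

As a usage example, take `path_graph(10)` with an evader that walked from node 3 to node 9 and stopped, leaving timestamps 1 to 7 on nodes 3 to 9 and 0 elsewhere. A pursuer starting at node 3 follows the increasing timestamps and resets each node it leaves, so it never turns back along the trail.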

Pursuer-Centric Program: Observations • If the pursuer reaches a mote j where ts > 0, the pursuer catches the evader in at most N*a/(1-a) steps • The pursuer catches the evader within O(N^2 * log(N)) steps • Stabilization: Initialization of ts = 0 • Performance metrics • Communication: degree communications per step • Time: O(N^2 * log(N))

A Hybrid Pursuer-Evader Program • Modify the evader-centric program to limit the tracking depth to save energy • Modify the pursuer-centric program to exploit the tracking tree structure • Remove cycles from the graph • Check for and correct fake tree roots • Pursuer: if a parent is defined, follow the parent; otherwise select the neighbor with the maximum ts value • Performance metrics • Communication: n + degree of communication at each step • Time: O((N-n)^2 * log(N-n))

Discussion • We looked at a self-stabilizing program that tracks the presence of an evader in a sensor network • The algorithm is tunable and energy-efficient • Limitations • The fault model is limited: no communication-related failures • Can the concepts be generalized to address generic sensor-related failures? • The number of evaders and the number of pursuers are each limited to one • The need for clock synchronization can possibly be relaxed
