Replicated State Machines

ITV Model-based Analysis and Design of Embedded Software
Techniques and methods for Critical Software
Anders P. Ravn, Aalborg University
September 2011

A simple state machine

Message-oriented:

    process SM {
      (m, args) = getMessage();
      switch m {
        case m_1: ... sendMessage(OB, m, arg) ...
        ...
      }
    }

Object-oriented:

    class SM {
      void method m_1(par_1) { ... OB.m(arg); ... }
      ...
    }

Note: Asynchronous communication, cf. Module 1
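
The message-oriented form can be made concrete with a small runnable sketch. Everything named here is an assumption for illustration: `Mailbox` stands in for the asynchronous channel, and `Accumulator` with its messages `add` and `report` is a hypothetical state machine, not one taken from the course.

```python
from collections import deque


class Mailbox:
    """Unbounded FIFO buffer standing in for asynchronous message passing."""

    def __init__(self):
        self._queue = deque()

    def send(self, message, *args):
        self._queue.append((message, args))  # the sender never blocks

    def get(self):
        return self._queue.popleft()

    def __len__(self):
        return len(self._queue)


class Accumulator:
    """Hypothetical state machine whose state is a single integer."""

    def __init__(self, inbox, outbox):
        self.inbox, self.outbox, self.total = inbox, outbox, 0

    def step(self):
        """Process exactly one message: one pass through `switch m`."""
        m, args = self.inbox.get()
        if m == "add":
            self.total += args[0]
        elif m == "report":
            self.outbox.send("total", self.total)  # sendMessage(OB, m, arg)
        else:
            raise ValueError(f"unknown message {m!r}")


inbox, outbox = Mailbox(), Mailbox()
sm = Accumulator(inbox, outbox)
inbox.send("add", 2)
inbox.send("add", 3)
inbox.send("report")
while len(inbox):
    sm.step()
result = outbox.get()
```

The state changes only inside `step`, one message at a time, which is what makes the machine deterministic and therefore replicable.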

Constraints
• Asynchronous message passing (unbounded buffering). An implementation must therefore be proved free of buffer overflow.
• No timing (delays, timeouts) in state machines. State machines are scheduled as a set of periodic or sporadic processes.
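
To illustrate the buffer-overflow obligation, the sketch below computes the peak backlog of one mailbox for a given arrival trace. It assumes a periodic state machine that serves a fixed number of messages per period; the function name and the trace are hypothetical. Checking one trace is only a simulation, not the proof the constraint asks for.

```python
def peak_backlog(arrivals_per_period, served_per_period):
    """Largest number of messages waiting at any point in the trace.

    In each period the arrivals are enqueued first, then the state
    machine removes up to `served_per_period` messages.
    """
    backlog = peak = 0
    for arrived in arrivals_per_period:
        backlog += arrived
        peak = max(peak, backlog)
        backlog = max(0, backlog - served_per_period)
    return peak


arrivals = [3, 0, 4, 1, 0, 2]  # hypothetical messages arriving per period
peak = peak_backlog(arrivals, served_per_period=2)
fits = peak <= 4               # would a buffer of capacity 4 suffice for this trace?
```

A proof would bound the arrivals over every interval, for example from the periods of the sending processes, instead of inspecting a single trace.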

Fault Tolerance
• Byzantine failures: SMs may fail in any way. Requires 2t+1 replicas to tolerate t failures.
• Fail-stop failures: Failing processors stop and the stop state is detectable. Only t+1 replicas needed.
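
The replica counts can be read off from how the outputs are combined. The sketch below assumes a majority voter for the Byzantine case and "take any live output" for the fail-stop case; the function names are hypothetical.

```python
from collections import Counter


def replicas_needed(t, failure_model):
    """Number of replicas needed to tolerate t failures, as stated above."""
    if failure_model == "byzantine":
        return 2 * t + 1
    if failure_model == "fail-stop":
        return t + 1
    raise ValueError(f"unknown failure model {failure_model!r}")


def vote(outputs, t):
    """Byzantine case: accept a value only if at least t+1 replicas agree.

    With 2t+1 replicas and at most t faulty, the t+1 correct replicas
    always form such a majority.
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count >= t + 1 else None


def first_available(outputs):
    """Fail-stop case: a stopped replica yields None, and any other
    output is correct, so one surviving replica is enough."""
    for output in outputs:
        if output is not None:
            return output
    return None
```

For example, with t = 1 the voter accepts `[7, 7, 9]` as 7 but rejects `[7, 8, 9]`, which can only occur when more than one replica is faulty.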

Agreement and Order
• Agreement: Every request message is received by every non-faulty processor. This requires reliable message passing; a fault in a particular link translates to a Byzantine failure for the receiving state machine.
• Order: Requests are processed in order. Requests sent from the same source cannot overtake each other. Cf. TCP and UDP in the Internet.
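
The "no overtaking" requirement can be sketched with per-source sequence numbers, much as TCP orders the segments of one connection. This is a minimal sketch under assumptions: `FifoReceiver` and its fields are hypothetical, sequence numbers start at 0, and only the order per source is enforced, not a total order across sources.

```python
class FifoReceiver:
    """Delivers the requests of each source in the order they were sent."""

    def __init__(self):
        self.next_seq = {}   # source -> next sequence number to deliver
        self.held = {}       # source -> {seq: request} that arrived early
        self.delivered = []  # (source, request) in delivery order

    def receive(self, source, seq, request):
        expected = self.next_seq.get(source, 0)
        if seq < expected:
            return  # duplicate of a request that was already delivered
        pending = self.held.setdefault(source, {})
        pending[seq] = request
        # Deliver as long as the next expected request is available.
        while expected in pending:
            self.delivered.append((source, pending.pop(expected)))
            expected += 1
        self.next_seq[source] = expected


receiver = FifoReceiver()
receiver.receive("c1", 1, "b")  # arrives early and is held back
receiver.receive("c1", 0, "a")  # releases both "a" and "b"
receiver.receive("c2", 0, "x")
```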

Agreement
IC1: Select a non-faulty transmitter.
IC2: Ensure that the value sent by the transmitter is received by all other non-faulty processors.
The difficult part is implementing a move of the transmitter, cf. token rings. Alternative: broadcasts.

Watch-dogs for Fail-stop
• Logical clock stability test
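
A sketch of a logical clock stability test, following the usual formulation: a request is stable at a replica once a request with a larger logical timestamp has been received from every client that has not been detected as stopped. It assumes FIFO channels and clients that periodically send (possibly null) requests; the function and parameter names are hypothetical.

```python
def stable_requests(pending, latest_seen, failed=frozenset()):
    """Return the pending requests that are safe to process, in order.

    pending     -- list of (timestamp, client, request) tuples
    latest_seen -- client -> highest timestamp received from that client
    failed      -- clients the watch-dog has detected as stopped
    """
    live = [client for client in latest_seen if client not in failed]
    # No request with a timestamp below the horizon can still arrive.
    horizon = min((latest_seen[c] for c in live), default=float("inf"))
    stable = [r for r in pending if r[0] < horizon]
    return sorted(stable, key=lambda r: (r[0], r[1]))


pending = [(5, "c2", "y"), (3, "c1", "x"), (8, "c1", "z")]
latest_seen = {"c1": 8, "c2": 9, "c3": 4}

blocked_by_c3 = stable_requests(pending, latest_seen)
after_detection = stable_requests(pending, latest_seen, failed={"c3"})
```

While `c3` is silent, everything with a timestamp of 4 or more stays blocked. Once the watch-dog reports `c3` as stopped, the horizon moves up and more requests become stable, which is why detectable stops matter in the fail-stop model.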

Dynamic Configurations
C – clients
S – state machines
O – output devices
This state machine could be the watch dog.

Integration after repair
• Resynchronization by getting a check-pointed state from a replica.
• Alignment with received messages.
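
The two steps can be sketched as follows, assuming every request carries a global sequence number so that the repaired replica can tell which buffered messages the checkpoint already covers. The `Replica` class, its integer state, and the `(seq, amount)` request format are hypothetical.

```python
import copy


class Replica:
    def __init__(self):
        self.state = {"total": 0}
        self.last_seq = 0  # sequence number of the last applied request

    def apply(self, seq, amount):
        if seq <= self.last_seq:
            return  # already reflected in the state
        if seq != self.last_seq + 1:
            raise ValueError("gap in request sequence")
        self.state["total"] += amount
        self.last_seq = seq

    def checkpoint(self):
        return self.last_seq, copy.deepcopy(self.state)

    def reintegrate(self, checkpoint, buffered):
        """Resynchronize from a checkpoint, then align with the messages
        that were received while the checkpoint was in transit."""
        self.last_seq, state = checkpoint
        self.state = copy.deepcopy(state)
        for seq, amount in sorted(buffered):
            self.apply(seq, amount)


healthy = Replica()
for seq, amount in [(1, 10), (2, 20), (3, 30)]:
    healthy.apply(seq, amount)
snapshot = healthy.checkpoint()  # taken after request 3
healthy.apply(4, 5)

repaired = Replica()
repaired.reintegrate(snapshot, buffered=[(4, 5), (3, 30)])
```

Request 3 is skipped because the checkpoint already contains it, and request 4 is applied on top, so both replicas end in the same state.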

Perspective
• A general paradigm suitable for highly critical distributed processing.
• Fail-stop may be feasible for medium-level criticality.
• Both may become cost-efficient in a multi-core setting. Requires highly dependable hardware and kernel support.
