Uncoordinated Checkpointing The Global State Recording Algorithm
The Model
[Figure: nodes connected by channels]
Node properties:
• No shared memory
• No global clock
Channel properties:
• FIFO
• loss free
• nonduplicating
CS 5204 – Operating Systems
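The Problem
[Figure: two accounts joined by channels C1 and C2, shown in three successive global states. Balances: $500 and $200, then $450 and $200, then $450 and $250. C1 carries a $50 transfer in one state and is empty in the other two; C2 is empty throughout.]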
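The difficulty the figure points at can be sketched in a few lines of Python. This is only an illustration under assumptions: the pairing of balances with channel contents (the $50 transfer being in transit in the middle state) is read off the figure values, and the names `STATES`, `total`, `double_counted`, and `lost` are hypothetical. It shows that reading each component at a different moment can count the transfer twice or not at all.

```python
# Three successive global states: (balance_a, messages_on_c1, balance_b).
# The pairing of balances with channel contents is an assumption.
STATES = [
    (500, [], 200),    # before the transfer
    (450, [50], 200),  # $50 in transit on C1
    (450, [], 250),    # after delivery
]

def total(balance_a, c1, balance_b):
    """Money in the system: both balances plus anything in transit."""
    return balance_a + sum(c1) + balance_b

# Every true global state holds the same amount of money.
true_totals = [total(*s) for s in STATES]

# Uncoordinated recording reads each component at a different moment.
# Here A is read before the send, C1 and B while the transfer is in transit.
double_counted = total(STATES[0][0], STATES[1][1], STATES[1][2])

# Here A is read after the send, C1 and B before it.
lost = total(STATES[1][0], STATES[0][1], STATES[0][2])

print(true_totals, double_counted, lost)
```

Neither mixed reading corresponds to a state the system was ever in, which is why the recording of local states and channel states has to be coordinated.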
Distributed Snapshot (Global State Recording)
• Motivation for recording a “consistent” state of the global computation:
  • checkpointing for fault tolerance (rollback, recovery)
  • testing and debugging
  • monitoring and auditing
• Method: detecting stable properties in a distributed system via snapshots. A property is “stable” if, once it holds in a state, it holds in all subsequent states. Examples of stable properties:
  • termination
  • deadlock
  • garbage collection
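Definitions
Local State and Actions:
• local state: \(LS_i\)
• message send: \(\mathrm{send}(m_{ij})\)
• message receive: \(\mathrm{rec}(m_{ij})\)
• time: \(\mathrm{time}(x)\)
• \(\mathrm{send}(m_{ij}) \in LS_i\) iff \(\mathrm{time}(\mathrm{send}(m_{ij})) < \mathrm{time}(LS_i)\)
• \(\mathrm{rec}(m_{ij}) \in LS_j\) iff \(\mathrm{time}(\mathrm{rec}(m_{ij})) < \mathrm{time}(LS_j)\)
Predicates:
\[ \mathrm{transit}(LS_i, LS_j) = \{\, m_{ij} \mid \mathrm{send}(m_{ij}) \in LS_i \land \lnot(\mathrm{rec}(m_{ij}) \in LS_j) \,\} \]
\[ \mathrm{inconsistent}(LS_i, LS_j) = \{\, m_{ij} \mid \lnot(\mathrm{send}(m_{ij}) \in LS_i) \land \mathrm{rec}(m_{ij}) \in LS_j \,\} \]
Consistent Global State:
\[ \forall i, j : 1 \le i, j \le n :: \mathrm{inconsistent}(LS_i, LS_j) = \emptyset \]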
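The predicates above can be written almost literally as set comprehensions. The following is a minimal sketch, assuming a local state is represented as the set of send and receive events that happened before it was recorded; that representation, the event tuples, and the names `is_consistent` and `msgs` are illustrative choices, not part of the slides.

```python
def transit(ls_i, ls_j, messages_ij):
    """Messages sent before LS_i was recorded but not received before LS_j."""
    return {m for m in messages_ij
            if ("send", m) in ls_i and ("rec", m) not in ls_j}

def inconsistent(ls_i, ls_j, messages_ij):
    """Messages received before LS_j was recorded but not sent before LS_i."""
    return {m for m in messages_ij
            if ("send", m) not in ls_i and ("rec", m) in ls_j}

def is_consistent(local_states, messages):
    """A global state is consistent if no pair of local states is inconsistent.

    local_states: dict mapping process id -> set of recorded events
    messages:     dict mapping (i, j) -> set of message ids sent from i to j
    """
    return all(not inconsistent(local_states[i], local_states[j], msgs)
               for (i, j), msgs in messages.items())

msgs = {(0, 1): {"m1"}}

# Process 0 recorded after sending m1, process 1 before receiving it:
# m1 is in transit and the global state is consistent.
in_flight = {0: {("send", "m1")}, 1: set()}

# Process 0 recorded before sending m1, process 1 after receiving it:
# the recorded state shows a message received that was never sent.
orphan = {0: set(), 1: {("rec", "m1")}}
```

With these inputs, `transit(in_flight[0], in_flight[1], msgs[(0, 1)])` contains `m1` and `is_consistent(in_flight, msgs)` holds, while `is_consistent(orphan, msgs)` does not.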
Global State Recording Algorithm
Marker Sending Rule for a Process p:
    for (each channel c, incident on, and directed away from p) {
        p sends one marker along c after p records its state and before p sends further messages along c;
    }
Marker Receiving Rule for a Process q (on receiving a marker along channel c):
    if (q has not recorded its state) then {
        q records its state;
        q records the state of c as the empty sequence;
    } else {
        q records the state of c as the sequence of messages received along c after q's state was recorded and before q received the marker along c.
    }
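The two rules can be exercised in a small simulation. This is a sketch under stated assumptions, not a definitive implementation: processes hold a bank balance, channels are FIFO queues, and the class and function names (`Network`, `Process`, `record`, `deliver`, `snapshot_total`) are hypothetical. It also assumes that a process applies the marker sending rule immediately after recording its state on first receipt of a marker.

```python
from collections import deque

MARKER = "MARKER"

class Process:
    def __init__(self, balance):
        self.balance = balance
        self.recorded_state = None  # local state, once recorded
        self.channel_state = {}     # incoming channel -> recorded messages
        self.recording = set()      # incoming channels still being recorded

class Network:
    def __init__(self, balances):
        self.procs = {n: Process(b) for n, b in balances.items()}
        # One FIFO, loss-free channel per ordered pair of processes.
        self.chan = {(s, d): deque()
                     for s in balances for d in balances if s != d}

    def send(self, src, dst, amount):
        self.procs[src].balance -= amount
        self.chan[(src, dst)].append(amount)

    def record(self, name):
        """Record the local state, then apply the marker sending rule."""
        p = self.procs[name]
        p.recorded_state = p.balance
        p.recording = {c for c in self.chan if c[1] == name}
        for c in self.chan:
            if c[0] == name:
                self.chan[c].append(MARKER)

    def deliver(self, src, dst):
        """Deliver the message at the head of channel src -> dst."""
        c = (src, dst)
        msg = self.chan[c].popleft()
        q = self.procs[dst]
        if msg == MARKER:
            # Marker receiving rule.
            if q.recorded_state is None:
                self.record(dst)
                q.channel_state[c] = []         # c recorded as empty
            else:
                q.channel_state.setdefault(c, [])
            q.recording.discard(c)              # c is now fully recorded
        else:
            q.balance += msg
            if c in q.recording:
                # Arrived after q recorded its state, before the marker on c.
                q.channel_state.setdefault(c, []).append(msg)

def snapshot_total(net):
    states = sum(p.recorded_state for p in net.procs.values())
    in_transit = sum(sum(msgs) for p in net.procs.values()
                     for msgs in p.channel_state.values())
    return states + in_transit

net = Network({"p": 500, "q": 200})
net.record("p")          # p records 500 and sends a marker to q
net.send("q", "p", 50)   # q sends 50 before the marker arrives
net.deliver("p", "q")    # q gets the marker, records 150, sends its marker
net.deliver("q", "p")    # p receives 50 and records it as channel state
net.deliver("q", "p")    # p receives the marker; channel q -> p is closed
print(snapshot_total(net))  # 700
```

The recorded snapshot (p at 500, q at 150, and 50 in transit from q to p) accounts for all 700 units, even though the actual balances at the end of the run are 550 and 150.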
[Figure: four panels, S0 to S3, showing processes p and q, the two channels between them, and the resulting recorded state.]
• p records its state (A) and sends marker M on the channel.
• Before receiving the marker, q changes its state and sends message D.
• q receives the marker and records its state (D) and the incoming channel as empty; q sends marker M' on its outgoing channel.
• On receiving the marker, p records the channel as having message D.
Recorded state: p in state A, q in state D, the channel from p to q empty, the channel from q to p containing message D.
Snapshot/State Recording Example
[Figure: processes p, q, and r connected by channels c1, c2, c3, and c4. Balances: p = 500, q = 500, r = 500. The legend identifies the symbol for marker M.]
Snapshot/State Recording Example (Step 1)
[Figure: p = 470, q = 490, r = 500. In transit: amounts 10, 20, and 10, and one marker M.]
Snapshot/State Recording Example (Step 2)
[Figure: p = 480, q = 490, r = 475. In transit: amounts 20, 10, and 25, and two markers M.]
Snapshot/State Recording Example (Step 3)
[Figure: p = 480, q = 470, r = 485. In transit: amounts 20, 20, and 25, and two markers M.]
Snapshot/State Recording Example (Step 4)
[Figure: p = 500, q = 490, r = 485. In transit: amount 25 and one marker M.]
Snapshot/State Recording Example (Step 5)
[Figure: p = 500, q = 515, r = 485. All channels empty.]