Causal Logging : Manetho • Rohit C Fernandes • 10/25/01
Manetho System Model • Non-deterministic events • Message receive • Internal event (kernel call) • Creation of a new process • Output commit • Stable storage + volatile memory
Manetho properties • Tolerate any number of simultaneous failures • Low failure-free overhead • Only failed processes roll back
Causal Logging : Intuition • Piggyback determinant of non-deterministic event on outgoing messages • Determinant? • Piggyback Antecedence Graphs
Antecedence Graph • Directed acyclic graph • Nodes : State intervals • Edges : Happened-before (immediate)
Receive Node • Two incoming edges • Fields • Receiver ID • Sender ID • Index of created state interval • Unique identifier of message
Internal Event Node • One incoming edge • Fields • Type of event • Replay information
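The node and edge layout described on the last three slides can be sketched as a small data structure. This is only an illustration, not Manetho's actual representation: the class names, the `(process, index)` key for a state interval, and the extra `process` and `index` fields on the internal event node are assumptions made so the example is self-contained.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple, Union

# Assumption: a state interval is identified by (process id, interval index).
Interval = Tuple[str, int]


@dataclass(frozen=True)
class ReceiveNode:
    receiver: str   # receiver ID
    sender: str     # sender ID
    index: int      # index of the state interval created by the receive
    msg_id: int     # unique identifier of the message


@dataclass(frozen=True)
class InternalEventNode:
    process: str        # assumption: owner of the interval, not listed on the slide
    index: int          # assumption: index of the created state interval
    event_type: str     # type of event
    replay_info: bytes  # replay information


Node = Union[ReceiveNode, InternalEventNode]


@dataclass
class AntecedenceGraph:
    nodes: Dict[Interval, Node] = field(default_factory=dict)
    edges: Set[Tuple[Interval, Interval]] = field(default_factory=set)

    def add_receive(self, node: ReceiveNode, sender_interval: Interval) -> None:
        key = (node.receiver, node.index)
        self.nodes[key] = node
        # Two incoming edges: the receiver's previous interval and the
        # sender's interval at the time of the send.
        self.edges.add(((node.receiver, node.index - 1), key))
        self.edges.add((sender_interval, key))

    def add_internal(self, node: InternalEventNode) -> None:
        key = (node.process, node.index)
        self.nodes[key] = node
        # One incoming edge: the same process's previous interval.
        self.edges.add(((node.process, node.index - 1), key))

    def in_degree(self, key: Interval) -> int:
        return sum(1 for (_, dst) in self.edges if dst == key)
```

Keying nodes by `(process, index)` keeps the happened-before edges cheap to store and makes it easy to ask which intervals of a given process a graph already contains.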
Failure Free Operation • Each process maintains • AG of its current interval • Log that contains data and ID of each message sent • Message Send : Piggyback AG of current state interval
Optimization • Need not send the complete AG • Incremental piggybacking • AG(p_i) is a proper subgraph of AG(p_(i+1)) • Process q tells p the maximum j such that p_j is in q's AG • p sends AG(p_i) - AG(p_j)
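A rough sketch of the incremental piggybacking step, assuming the AG is stored as a set of happened-before edges between `(process, index)` intervals and that AG(p_i) is the set of intervals reachable backwards from p_i. The function names are hypothetical; the slides do not prescribe a representation.

```python
from typing import Dict, Iterable, List, Set, Tuple

Interval = Tuple[str, int]
Edge = Tuple[Interval, Interval]


def max_known_index(nodes: Iterable[Interval], process: str) -> int:
    """Largest j such that (process, j) is among `nodes`, or -1 if none."""
    return max((i for (pid, i) in nodes if pid == process), default=-1)


def ancestors(edges: Iterable[Edge], target: Interval) -> Set[Interval]:
    """`target` plus every interval that happened before it."""
    preds: Dict[Interval, Set[Interval]] = {}
    for src, dst in edges:
        preds.setdefault(dst, set()).add(src)
    seen: Set[Interval] = {target}
    stack: List[Interval] = [target]
    while stack:
        node = stack.pop()
        for src in preds.get(node, ()):
            if src not in seen:
                seen.add(src)
                stack.append(src)
    return seen


def increment_to_send(edges: Iterable[Edge], sender: str, i: int, j: int) -> Set[Interval]:
    """Nodes of AG(p_i) - AG(p_j): what the receiver is still missing."""
    edges = list(edges)
    full = ancestors(edges, (sender, i))
    known = ancestors(edges, (sender, j)) if j >= 0 else set()
    return full - known
```

For example, if q reports j = 1 and p is in interval 3, only the nodes added since p_1 travel with the message; a receiver that reports nothing (j = -1) gets the whole graph.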
Information on Stable Storage • Checkpoints • AG (written asynchronously) : Need not piggyback the part of the AG that is already on disk • Output commit : Save AG to disk
Incarnation Numbers • Each process starts a new incarnation after recovery • Integer stored in stable storage • Tagged on outgoing messages • Messages from old incarnations discarded
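The discard rule can be illustrated with a small receive-side filter. This is a minimal sketch under assumptions: the `Message` and `IncarnationFilter` names are invented for the example, and unknown senders are treated as being at incarnation 0.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Message:
    sender: str
    incarnation: int  # incarnation number tagged on by the sender
    payload: str


class IncarnationFilter:
    """Tracks the latest known incarnation of each process."""

    def __init__(self) -> None:
        self.incvec: Dict[str, int] = {}

    def update(self, process: str, incarnation: int) -> None:
        # Incarnation numbers only grow; never move one backwards.
        self.incvec[process] = max(incarnation, self.incvec.get(process, 0))

    def accept(self, msg: Message) -> bool:
        # A message from an incarnation older than the latest known one
        # was sent before the sender's failure and is discarded.
        return msg.incarnation >= self.incvec.get(msg.sender, 0)
```

After `update("p", 2)`, a message tagged with incarnation 1 from p is rejected, while one tagged 2 is accepted.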
Recovery Protocol • Recover(p, c, INCNUM, S) • Step 1 • INCNUM ← INCNUM + 1 ; save INCNUM • INCVEC[p] ← INCNUM • G ← AG(p_c) // from stable storage
Recovery Protocol • Step 2 • For all q ∈ S, q ≠ p • (INQ, AGQ) ← remote call at q: GET_AG(p) • G ← G ∪ AGQ • INCVEC[q] ← INQ • For all q ∈ S, q ≠ p • Remote call at q: SEND_INC(p, INCVEC)
Recovery Protocol • Step 3 • m ← max j such that p_j ∈ G • Recover up to p_m • Don't send out application messages, but log them • For a receive, request the message from the sender's log • Replay internal events
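The three recovery steps can be sketched as a single-process simulation. Everything here is an assumption made for illustration: remote calls are modeled as plain method calls on a hypothetical `Peer` class, an AG is reduced to a set of `(process, index)` intervals, and the replay in step 3 is left out, with only the target interval m computed.

```python
from typing import Dict, Iterable, Set, Tuple

Interval = Tuple[str, int]


class Peer:
    """Hypothetical stand-in for a live process reachable by remote call."""

    def __init__(self, name: str, incarnation: int, ag: Iterable[Interval]) -> None:
        self.name = name
        self.incarnation = incarnation
        self.ag: Set[Interval] = set(ag)
        self.incvec: Dict[str, int] = {}

    def get_ag(self, p: str) -> Tuple[int, Set[Interval]]:
        # GET_AG(p): return this peer's incarnation number and its AG.
        return self.incarnation, self.ag

    def send_inc(self, p: str, incvec: Dict[str, int]) -> None:
        # SEND_INC(p, INCVEC): learn the current incarnation vector.
        self.incvec = dict(incvec)


def recover(p: str, stable_incnum: int, stable_ag: Iterable[Interval],
            peers: Iterable[Peer]) -> Tuple[int, Dict[str, int], int]:
    # Step 1: start a new incarnation and load AG(p_c) from the checkpoint.
    incnum = stable_incnum + 1  # a real system would save this to stable storage
    incvec: Dict[str, int] = {p: incnum}
    g: Set[Interval] = set(stable_ag)

    # Step 2: merge every other process's AG, then broadcast INCVEC.
    others = [q for q in peers if q.name != p]
    for q in others:
        inq, agq = q.get_ag(p)
        g |= agq
        incvec[q.name] = inq
    for q in others:
        q.send_inc(p, incvec)

    # Step 3: m is the last interval of p that any process depends on;
    # replay up to p_m would follow here.
    m = max((i for (pid, i) in g if pid == p), default=-1)
    return incnum, incvec, m
```

In this sketch, if one peer's AG contains p_3 and another's contains p_5, the recovering process must replay up to p_5, since some surviving process already depends on that interval.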