Distributed Algorithms for Failure Detection in Crash Environments

R. Cortiñas, A. Lafuente, M. Larrea. Distributed Systems Group, University of the Basque Country UPV/EHU.


Presentation Transcript


  1. Distributed Algorithms for Failure Detection in Crash Environments. R. Cortiñas, A. Lafuente, M. Larrea. Distributed Systems Group, University of the Basque Country UPV/EHU

  2. Guest Stars: P,S and Omega • P: strong completeness, eventual strong accuracy • Eventually every process that crashes is permanently suspected by every correct process • There is a time after which correct processes are not suspected by any correct process • S: strong completeness, eventual weak accuracy • There is a time after which some correct process is never suspected by any correct process • Omega: eventual leader election • There is a time after which all the correct processes always trust the same correct process Master SIA – Sistemas Distribuidos

  3. The First P Algorithm [CT96] Master SIA – Sistemas Distribuidos

  4. Communication Optimality. A ring arrangement of processes. [Figure: processes p1, ..., p6 arranged in a ring]

  5. Communication Optimality. Communication-efficient algorithms: n links are used forever. [Figure: processes p1, ..., p6 arranged in a ring]

  6. Communication Optimality. Communication-optimal algorithms: C links are used forever, where C is the number of correct processes. [Figure: processes p1, ..., p6 arranged in a ring]
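To illustrate the link counts on the ring slides, here is a minimal sketch. It assumes that, once crashes stop, each correct process keeps sending only to its nearest correct successor in the ring. The transcript does not include the algorithm itself, so this monitoring rule and the function name `links_used_forever` are assumptions made for illustration.

```python
from typing import List, Set, Tuple


def links_used_forever(n: int, crashed: Set[int]) -> List[Tuple[int, int]]:
    """Links (sender, receiver) that stay active in a ring p1..pn,
    assuming each correct process sends to its nearest correct successor."""
    correct = [p for p in range(1, n + 1) if p not in crashed]
    links = []
    for i, p in enumerate(correct):
        successor = correct[(i + 1) % len(correct)]
        if successor != p:  # a lone survivor has nobody to send to
            links.append((p, successor))
    return links
```

With n = 6 and no crashes the sketch uses 6 links. With p2 and p5 crashed, the correct processes p1, p3, p4, p6 form a smaller ring of 4 links, which matches C = 4.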

  7. Communication-optimal P Master SIA – Sistemas Distribuidos

  8. Communication-optimal Omega
     • We also propose an optimal implementation of ◇S, the weakest failure detector for solving Consensus:
       • processes ordered: p1, ..., pn
       • heartbeat strategy
       • communication pattern: one-to-successors
       • based on a trusted process (instead of a list of suspected processes)

  9. Communication-optimal Omega
     i) Initially, p1 starts sending messages periodically to the rest of the processes, and all processes trust p1.
     [Figure: p1, ..., p5 with trusted1 = p1, trusted2 = p1, trusted3 = p1, trusted4 = p1, trusted5 = p1]

  10. Communication-optimal Omega
      ii) If a process does not receive a message from its trusted process pi within some timeout period, then it suspects pi and takes the next process, pi+1, as its new trusted process.
      [Figure: p4 times out on p1. trusted1 = p1, trusted2 = p1, trusted3 = p1, trusted4 = p2, trusted5 = p1]

  11. Communication-optimal Omega
      iii) If a process trusts itself, then it starts sending messages periodically to its successors.
      [Figure: p2 times out on p1. trusted1 = p1, trusted2 = p2, trusted3 = p1, trusted4 = p2, trusted5 = p1]

  12. Communication-optimal Omega
      iv) If a process receives a message from a process pi preceding its trusted process, then it will trust pi again, increasing its timeout period with respect to pi.
      [Figure: p2 and p4 receive a message from p1. trusted1 = p1, trusted2 = p1 (timeout_period21++), trusted3 = p2, trusted4 = p1 (timeout_period41++), trusted5 = p1]
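Steps i) to iv) can be sketched as a small round-based simulation. This is an illustration under simplifying assumptions, not the authors' algorithm as published: time advances in discrete rounds, a heartbeat is delivered in the round it is sent, timeouts are counted in rounds, and a temporarily slow process is modelled by a hypothetical `muted` set of (sender, round) pairs in which no heartbeat goes out. All class, function, and parameter names are invented for this sketch.

```python
from typing import Dict, FrozenSet, Optional, Tuple


class Process:
    def __init__(self, pid: int, n: int, initial_timeout: int = 2) -> None:
        self.pid = pid
        self.trusted = 1  # step i: every process initially trusts p1
        # timeout[q]: rounds of silence tolerated from q (assumed unit)
        self.timeout = {q: initial_timeout for q in range(1, n + 1)}
        self.silence = 0  # rounds since the trusted process was last heard

    def on_heartbeat(self, sender: int) -> None:
        if sender < self.trusted:
            # step iv: a preceding process is alive after all; trust it
            # again and be more patient with it from now on
            self.trusted = sender
            self.timeout[sender] += 1
            self.silence = 0
        elif sender == self.trusted:
            self.silence = 0

    def on_tick(self) -> None:
        if self.trusted == self.pid:
            return  # a process that trusts itself monitors nobody
        self.silence += 1
        if self.silence > self.timeout[self.trusted]:
            # step ii: suspect the trusted process, move to the next one
            self.trusted += 1
            self.silence = 0


def simulate(n: int, rounds: int,
             crash_at: Optional[Dict[int, int]] = None,
             muted: FrozenSet[Tuple[int, int]] = frozenset()
             ) -> Dict[int, Process]:
    """Run the sketch; crash_at[p] is the round in which p crashes.
    Returns the processes that never crash."""
    crash_at = crash_at or {}
    procs = {p: Process(p, n) for p in range(1, n + 1)}
    for t in range(rounds):
        alive = [p for p in procs if crash_at.get(p, rounds) > t]
        for s in alive:
            # steps i and iii: a self-trusting process sends to successors
            if procs[s].trusted == s and (s, t) not in muted:
                for r in alive:
                    if r > s:
                        procs[r].on_heartbeat(s)
        for p in alive:
            procs[p].on_tick()
    return {p: proc for p, proc in procs.items() if p not in crash_at}
```

For example, `simulate(5, 20, crash_at={1: 3})` ends with p2, ..., p5 all trusting p2. Calling `simulate(5, 20, muted=frozenset({(1, 3), (1, 4), (1, 5), (1, 6)}))` makes every process switch to p2 for a while and then return to p1 once its heartbeats resume, with the timeout for p1 raised from 2 to 3 rounds, as in step iv.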

  13. Communication-optimal Omega
      • Lemma. With the previous algorithm, eventually all the correct processes will permanently trust the first correct process in p1, ..., pn
      • This property trivially allows us to provide the properties of ◇S:
        • Eventual weak accuracy: by not suspecting the trusted process
        • Strong completeness: by suspecting all the processes except the trusted process
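The reduction on this slide amounts to a one-line transformation from the trusted process to a suspect list. The function name below is hypothetical and the snippet is only a sketch of that rule.

```python
from typing import Iterable, Set


def suspected_from_trusted(processes: Iterable[int], trusted: int) -> Set[int]:
    """Suspect list derived from a leader: everyone except the trusted process."""
    return set(processes) - {trusted}
```

With five processes and p2 trusted, the suspect list is {p1, p3, p4, p5}. Correct processes such as p3 may stay suspected forever, which eventual weak accuracy permits, since it only requires that one correct process (the trusted one) is eventually never suspected.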

  14. Communication-optimal Omega
