
Failure detectors, shared memory/message passing

Failure detectors, shared memory/message passing. Hugues Fauconnier, LIAFA, Université Denis Diderot. Plan: Introduction; Objectives and context; Objects and shared memory; Shared memory; Linearizability; Wait-free implementation; Universality of consensus


Presentation Transcript


  1. Failure detectors, shared memory/message passing • Hugues Fauconnier, LIAFA, Université Denis Diderot

  2. Plan • Introduction • Objectives and context • Objects and shared memory • Shared memory • Linearizability • Wait-free implementation • Universality of consensus • Message passing • Failure detectors • Shared memory implementation • Shared object implementation • Consensus hierarchy and failure detectors • Conclusion(s)

  3. Introduction and context • Possible vs. impossible (FLP) • Shared memory vs. communication by message exchange • Shared objects: • Comparison and hierarchy: • is a test-and-set more powerful than a compare-and-swap? • Towards transactions

  4. Introduction… • Failure detectors: • Minimal detector and comparison (necessary and sufficient knowledge about failures) • hierarchy of problems • Consensus • Agreement on a value • Registers • Mutual exclusion • The weakest of the weakest • k-set consensus (agreement on at most k values)

  5. Shared memory • Set of processes p1, …, pn • (process = sequential thread) • Processes are asynchronous • a step can take an arbitrary (finite) time • Processes communicate through shared data structures (objects) • examples: shared memory, test-and-set, queue, …

  6. Objects: • an object is defined by its type • e.g.: the type of R is atomic register • the type of the object defines a set of possible states and a set of primitive operations • e.g.: the state of the register is the value stored, the primitives are read() and write(v) • processes access objects through primitive operations

  7. Objects: • we consider here only atomic objects • a sequential specification defines the behavior of the object (a transition system) • linearizability (=atomicity) • operations of concurrent processes may overlap, but each operation appears to take effect instantaneously between its invocation and its response: • the operation appears to be atomic • crashes: • if a process crashes between an invocation and the corresponding response the operation completes or aborts • every invocation by correct processes terminates

  8. Example: atomic register • States: the value stored (with a given initial value) • Operations: read() and write(v) • Sequential specification: • read() returns the value stored • write(v) changes the state of the register (the new state is v) • Linearizability: the time interval between the invocation and the response of each operation can be reduced to a point such that the resulting sequential history of reads and writes satisfies the specification

  9. Atomic register • With only one writer linearizability is here equivalent to: • a read returns the last value written • if a read is concurrent with a write the read returns either the previous written value or the value of a concurrent write() • if a read operation r precedes another read operation r' then r' cannot return a value written before the one returned by r • can be generalized to multi-writer atomic registers

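  10. [Figure: two timing diagrams on a register initially 0, each with the operations Write 1 and Write 0 and one Read. In the first the Read returns 0, labelled "Linearizable"; in the second the Read returns 1.]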

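  11. Linearizable? [Figure: timing diagram on a register initially 0 with a Write 1, a Read returning 1 and then a Read returning 0; labelled "impossible".]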
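The two figures above can be checked mechanically on small histories. The following is a minimal brute-force sketch, not taken from the slides: it tries every ordering of the operations that respects real-time precedence and tests it against the sequential specification of a register. The tuple layout `(kind, value, start, end)` and the concrete time intervals are assumptions made for illustration, since the timings of the original diagrams are not recoverable.

```python
from itertools import permutations

def is_linearizable(history, initial=0):
    """Brute-force linearizability check for a read/write register.

    history: list of (kind, value, start, end) tuples, kind is "write" or
    "read" (hypothetical layout chosen for this sketch).
    """
    for order in permutations(history):
        # Real-time order: an operation that finished before another one
        # started must also come first in the sequential order.
        respects_time = all(
            not (b[3] < a[2])
            for i, a in enumerate(order)
            for b in order[i + 1:]
        )
        if not respects_time:
            continue
        state = initial
        for kind, value, _, _ in order:
            if kind == "write":
                state = value
            elif value != state:   # a read must return the current state
                break
        else:
            return True            # this ordering satisfies the specification
    return False

# Assumed timings in the spirit of slide 10: Write 1, then Write 0 overlapping a Read 0.
ok_history = [("write", 1, 0, 2), ("write", 0, 3, 8), ("read", 0, 4, 6)]
# Assumed timings in the spirit of slide 11: Read 1 precedes Read 0 during one Write 1.
bad_history = [("write", 1, 0, 10), ("read", 1, 1, 2), ("read", 0, 3, 4)]
print(is_linearizable(ok_history), is_linearizable(bad_history))  # True False
```

The search is exponential in the number of operations, so it only serves to illustrate the definition on slide-sized examples.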

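  12. Another example • consensus: sequential specification • [Figure: transition system. From the initial state, propose(0) leads to a state in which every propose(*) returns decide(0); propose(1) leads to a state in which every propose(*) returns decide(1).]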

  13. Another example • RMW (read-modify-write)
      RMW(r register, f function) returns value
        previous := r
        r := f(r)
        return previous
      • from RMW we get test-and-set, swap, compare-and-swap.

  14. Implementation • Given some objects O1, …, Om and processes p1, …, pn, is it possible to implement another object O? • Wait-free implementation: • the implementation is correct (in an intuitive sense) • every invocation from correct processes terminates • moreover a correct process can always terminate its invocation with only its own steps (on objects O1, …, Om)

  15. Wait-free • Wait-free implementation • As each process can always finish the work alone, a wait-free implementation tolerates any number of crash failures • very strong assumption!

  16. Wait-free implementations • Consider k-consensus (i.e. consensus between k processes) • Let the consensus number for object X be the largest k such that k-consensus can be implemented with X and atomic registers • (clearly if consensus number for O is strictly greater than consensus number for O', there is no implementation for O using only O')

  17. Wait-free implementations • Results • registers have consensus number equal to 1 (FLP) • test-and-set has consensus number equal to 2 • … • for each n there are objects with consensus number n

  18. Example • FIFO queue:
      decide(v) returns val
        prefer[P] := v
        if deq(q) = (the distinguished first item) then return prefer[P]
        else return prefer[Q]
      • With a FIFO queue and registers it is possible to get 2-consensus but not 3-consensus
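The queue-based protocol above can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's code: the queue is assumed to be initialised with a `"WIN"` item followed by a `"LOSE"` item (the slide's symbol is lost), the two processes are identified by indices 0 and 1, and a lock is used to make `deq` atomic.

```python
import threading
from collections import deque

class TwoConsensus:
    """2-process consensus from one FIFO queue and two registers (sketch)."""

    def __init__(self):
        self._queue = deque(["WIN", "LOSE"])   # assumed initial content
        self._lock = threading.Lock()          # makes deq() atomic
        self.prefer = [None, None]             # one register per process

    def _deq(self):
        with self._lock:
            return self._queue.popleft()

    def decide(self, me, v):
        self.prefer[me] = v                    # publish own proposal first
        if self._deq() == "WIN":
            return self.prefer[me]             # first dequeuer keeps its value
        return self.prefer[1 - me]             # loser adopts the winner's value

def run_once(v0, v1):
    """Run both processes concurrently and return their decisions."""
    obj = TwoConsensus()
    results = [None, None]

    def worker(i, v):
        results[i] = obj.decide(i, v)

    threads = [threading.Thread(target=worker, args=(i, v))
               for i, v in enumerate((v0, v1))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The loser can safely read the winner's register because the winner wrote `prefer` before dequeuing, and the loser's dequeue happens after the winner's. With a third process the queue would be empty on the third `deq`, which is one way to see why this particular construction stops at two processes.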

  19. Results • Universality of consensus (Herlihy): n-consensus is universal in a system of n processes: every object shared by n processes can be (wait-free) implemented with n-consensus and registers • (principle of the proof: with the help of n-consensus, processes agree on the history of the object)

  20. Plan • Objects • shared memory model • linearizability • wait-free implementation • Main results: universality of consensus • Message passing • failure detectors • shared memory implementation • object implementation • Consensus Hierarchy with failure detectors • Conclusion

  21. Message passing • The previous results show that, in general, objects with consensus number > 1 cannot be implemented with registers alone • Instead of sharing data structures it is interesting to consider message passing models • message passing: processes don't share data but can send and receive messages • (Note that message passing could be defined in the previous general framework: communication channels are then the shared data structures)

  22. Message passing model • Processes communicate by messages • Communication is asynchronous (no bound on communication delays) • Communication is point-to-point and reliable • Processes can fail by crashing • Message passing models are suitable and natural for networks • (shared objects models are more suitable for hardware)

  23. Message passing • In message passing it is interesting to implement objects: • objects are easier to work with • some objects are natural in message passing models (e.g. registers, consensus)

  24. Atomic register: practical point of view • Data server • Ensure safety properties • If a value is written it is available (even if the writer disappears) • When a process ends its write(), all subsequent read() operations return this value (or a value written later); note that the writer knows when the write ends

  25. Shared register implementation • With only one reader and one writer and a majority of correct processes (sketch): • to write(v), for the k-th write: the writer sends (v,k) to all processes and waits until it receives an "ack" from a majority of processes • to read(): the reader queries all processes and waits until it receives an answer (v,k) from a majority of processes; the value read is the one with the greatest k • when a process receives (v,k) from the writer, it stores (v,k) and then sends an "ack" to the writer • when a process receives a query from the reader, it answers with the stored (v,k).
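The sketch above can be simulated in a few lines of Python. This is a simplified, synchronous model under stated assumptions, not the author's implementation: message exchanges are replaced by direct method calls, a crashed replica simply never answers, and an operation that cannot gather a majority raises an error instead of blocking forever. The names `Replica` and `MajorityRegister` are hypothetical, and the guard that keeps only the pair with the greater k is an addition of this sketch.

```python
class Replica:
    """One process storing the latest (value, k) pair it has received."""

    def __init__(self):
        self.value, self.k = None, 0
        self.crashed = False

    def on_write(self, v, k):
        if self.crashed:
            return False               # a crashed process never acks
        if k > self.k:                 # assumption: keep the most recent write only
            self.value, self.k = v, k
        return True                    # "ack"

    def on_query(self):
        if self.crashed:
            return None                # a crashed process never answers
        return (self.value, self.k)

class MajorityRegister:
    """Single-writer, single-reader register over n replicas (sketch)."""

    def __init__(self, n):
        self.replicas = [Replica() for _ in range(n)]
        self.majority = n // 2 + 1
        self.k = 0                     # write counter kept by the writer

    def write(self, v):
        self.k += 1
        acks = sum(r.on_write(v, self.k) for r in self.replicas)
        if acks < self.majority:
            raise RuntimeError("no majority of acks: a real writer would block")

    def read(self):
        answers = [a for r in self.replicas if (a := r.on_query()) is not None]
        if len(answers) < self.majority:
            raise RuntimeError("no majority of answers: a real reader would block")
        return max(answers, key=lambda a: a[1])[0]   # value with the greatest k

reg = MajorityRegister(5)
reg.write("a")
reg.replicas[0].crashed = True
reg.replicas[1].crashed = True         # a minority crashes
reg.write("b")
latest = reg.read()                    # "b": every majority intersects the write's majority
```

A faithful asynchronous version would send messages and wait only for the first majority of replies; the simulation keeps just the quorum logic.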

  26. It works… • because: • by the majority assumption there is always at least one process that participates in both the last write and the read • so the read returns the last written value • (but this implementation is not really atomic: if the writer crashes during a write, subsequent reads could return either the previous value or the new one) • (it is not very difficult to fix: the reader always returns the value with the maximal timestamp it has seen so far) • (some classical algorithms make it possible to implement general atomic registers from atomic registers with one reader and one writer)

  27. Implementation issues • in message passing there is no implementation of consensus (even if at most one process can crash) • the implementation of registers requires a majority of correct processes

  28. Then … failure detectors • The impossibility results come from crashes (without failures all these problems are easy to solve). • Then: add oracles giving (possibly unreliable) information about crashes. • what information about process crashes is sufficient to solve the problem? • what information about crashes is needed?

  29. Failure detectors • distributed "oracle" F: • at each time t a process can query the failure detector and get an answer • (generally the answer is a list of processes suspected to have crashed) • the output is not necessarily the same at each process • the output of failure detector F depends only on the history of crashes (not on the states of processes). • Example: perfect failure detector • output: lists of suspected processes • if p is in the list of q then p has crashed • if p has crashed then p will eventually belong to the list of suspected processes of q

  30. Failure detector comparison • Reduction: • Failure detector F is weaker than failure detector F' (F ≤ F') if F can be implemented from F' • ≤ defines a preorder (a partial order up to equivalence)

  31. Minimal Failure Detector • Given a problem P, F is a minimal failure detector for P if and only if • with the help of F, P can be solved • if F' makes it possible to solve P then F ≤ F' • Then if F is a minimal failure detector for P: • F encapsulates the information about crashes needed to solve P

  32. Minimal Failure Detector • Why look for the minimal failure detector? • find the needed information about crashes • compare problems: if the minimal failure detector for P is weaker than the minimal failure detector for P' then P is easier than P' • (from a practical point of view the knowledge of the minimal failure detector helps to find the assumptions on the underlying system to solve the problem)

  33. Then to implement objects: • In message passing • for each object O find the minimal failure detector to implement O • from the comparison between these failure detectors we get a hierarchy on these objects • Then we get 2 hierarchies on objects • consensus number as defined before • minimal failure detector needed for the object

  34. S-register • Begin with registers (consensus number = 1) • An S-register is an atomic register which only processes in S can read or write (but all processes may participate in its implementation)

  35. Weakest failure detector • with a majority of correct processes atomic registers can be implemented without failure detector • but without a majority of correct processes? • Failure detector Σ

  36. Failure detector ΣS • ΣS(p,t) (output for process p of failure detector ΣS at time t) is a list of trusted processes (q ∈ ΣS(p,t) means that p considers that q has not crashed at time t) • Intersection: for all processes p, q in S and all times t, t': ΣS(p,t) ∩ ΣS(q,t') is not empty (at least one process is trusted by both p and q) • Completeness: there is a time t such that for each correct process p in S and each time t' > t, ΣS(p,t') contains only correct processes

  37. Remarks • with a majority of correct processes ΣS can be implemented in asynchronous systems. • ΣS gives a kind of quorum system (a quorum system is a family of sets such that any two elements of the family have a non-empty intersection).
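The link between majorities and the intersection property can be checked directly on small systems. The following sketch is illustrative only: the function names and the representation of a trace as a dictionary from (process, time) to a trusted set are assumptions of this example, and a finite trace can of course only approximate the "eventually" in the completeness property.

```python
from itertools import combinations

def majorities(processes):
    """All subsets of `processes` containing a strict majority of them."""
    procs = list(processes)
    n = len(procs)
    return [frozenset(c)
            for size in range(n // 2 + 1, n + 1)
            for c in combinations(procs, size)]

def intersection_holds(outputs):
    """Intersection: any two trusted sets (any processes, any times) meet."""
    outs = [frozenset(o) for o in outputs]
    return all(a & b for a in outs for b in outs)

def completeness_holds(trace, correct, stable_from):
    """Completeness on a finite trace: after time `stable_from`, every
    correct process trusts only correct processes.

    trace: dict mapping (process, time) -> trusted set (hypothetical layout).
    """
    correct = frozenset(correct)
    return all(frozenset(trusted) <= correct
               for (p, t), trusted in trace.items()
               if p in correct and t > stable_from)

# Any two majorities of 5 processes share at least one process.
all_majorities_meet = intersection_holds(majorities(range(5)))
# Two disjoint trusted sets violate the intersection property.
disjoint_sets_meet = intersection_holds([{0, 1}, {2, 3}])
```

Outputting any majority of processes that recently answered therefore satisfies intersection, which is the idea behind the first remark.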

  38. Theorem • ΣS is the weakest failure detector to implement S-register • sufficient part: adapt the previous algorithm • necessary part: more difficult…

  39. S-Consensus • S is a set of processes • S-consensus • processes in S propose values and have to (irrevocably) decide. The decision has to ensure: • Validity: the decision value has been proposed • Agreement: if p and q decide, they decide the same value • Termination: every correct process eventually decides

  40. ΩS • ΩS(p,t) (output for p of failure detector ΩS at time t) is a process (the leader) • Eventual leader election: there is a time t and a correct process l such that for every correct process p in S and all times t' > t, ΩS(p,t') = l • intuitively: after some time all processes agree on the same leader forever

  41. Theorem • ΣS*ΩS is the weakest failure detector for S-consensus. (ΣS*ΩS outputs both ΣS and ΩS)

  42. For the proof • (necessary condition) • Adaptation of the proof of Chandra, Hadzilacos and Toueg: from an S-consensus algorithm using a failure detector, implement ΩS • With reliable broadcast and S-consensus, implement an S-register (then use the previous theorem)

  43. For the proof • Sufficient condition: code for each process in S
      forever
        C := 1 + (r mod n)
        send (Coord, v, r) to C
        wait until (One, *, r) is received from C or C is suspected according to ΩS
        if received (One, w, r) then FromCoord := w else FromCoord := undef
        send (Keep, FromCoord, r) to all
        wait until (Two, *, r) is received from all processes in ΣS
        if only one value v was received then
          decide v; send (decide, v) to all; stop
        else if only two values (w and undef) were received then v := w
        r := r + 1

  44. All processes • When (Coord,*,k) is received for the first time (let (Coord,x,k) be this message) • send (One,x,k) to all processes in S • When (Keep,*,k) is received for the first time (let (Keep,x,k) be this message) • send (Two,x,k) to all processes in S

  45. k-consensus • k-consensus = consensus between any subset of k processes • Result: for 2 ≤ k ≤ n, the weakest failure detector for k-consensus is Σ*Ω

  46. proof (idea): • consider case k=2 • From the previous results: • the weakest failure detector for 2-consensus is the set of ΣS*ΩS for all subsets with 2 elements

  47. Proof • From these ΣS (S ranging over the subsets with two elements) atomic registers can be implemented, then we get Σ • From these ΩS (S ranging over the subsets with two elements) it is possible to implement Ω: • let G = (X,E) be the graph where X is the set of processes, and (p,q) ∈ E if there is x such that q is an eventual leader for Ω{p,x}. Consider the strongly connected components of G: there is a unique sink component and this sink contains (eventually) only correct processes.

  48. [Figure: the graph G; an edge from p to q means that p has q as leader; the sink component is marked.]

  49. Proof (sketch) • From this we deduce an algorithm for Ω: all processes approximate this graph and compute the sink; the output of the emulated failure detector is this sink. Eventually, this sink contains only correct processes (then extract the same leader from this sink). Then we get Ω
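The local computation in this step (find the sink strongly connected component of the leader graph, then pick one process in it) can be sketched as follows. This is an illustration under assumptions: the graph is given as a dictionary from each process to the set of processes it has as leader, the example graph is made up, and choosing the smallest identifier in the sink is one possible deterministic rule, not necessarily the one used in the original proof.

```python
def reachable(graph, start):
    """Set of nodes reachable from `start` (including `start`)."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in graph.get(u, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def sink_components(graph):
    """Strongly connected components with no edge leaving them."""
    nodes = set(graph) | {w for ws in graph.values() for w in ws}
    reach = {v: reachable(graph, v) for v in nodes}
    sinks = set()
    for v in nodes:
        # The component of v: nodes mutually reachable with v.
        scc = frozenset(u for u in reach[v] if v in reach[u])
        # It is a sink when nothing outside it is reachable from v.
        if reach[v] == scc:
            sinks.add(scc)
    return sinks

def elect_leader(graph):
    """Pick a leader in the sink, assuming the sink is unique."""
    sinks = sink_components(graph)
    if len(sinks) != 1:
        raise ValueError("expected a unique sink component")
    (sink,) = sinks
    return min(sink)   # assumption: smallest identifier as a deterministic choice

# Hypothetical leader graph: an edge p -> q means p has q as leader.
leader_graph = {
    "p1": {"p3"},
    "p2": {"p3", "p4"},
    "p3": {"p4"},
    "p4": {"p3"},
}
leader = elect_leader(leader_graph)   # sink is {"p3", "p4"}, leader is "p3"
```

The reachability-based computation is quadratic and meant for readability; a linear-time algorithm such as Tarjan's would be the usual choice for large graphs.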

  50. Corollary • If the consensus number of atomic object T is 2, then: • The weakest failure detector for T is Σ*Ω • Every failure detector implementing T implements any object. • (in other words, T is universal for all n)
