1 / 20

Fundamentals Stream Session 4: Group Communication

Distributed Systems. Fundamentals Stream Session 4: Group Communication. CSC 253 Gordon Blair, François Ta ïani. Overview of the Session. What is Group Communication about Reliability Guarantees Unreliable Multicast Reliable Multicast Atomic Multicast Ordering Guarantees Unordered

leif
Download Presentation

Fundamentals Stream Session 4: Group Communication

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Distributed Systems Fundamentals Stream Session 4:Group Communication CSC 253 Gordon Blair, François Taïani

  2. Overview of the Session • What is Group Communication about • Reliability Guarantees • Unreliable Multicast • Reliable Multicast • Atomic Multicast • Ordering Guarantees • Unordered • FIFO ordering • Total Ordering • Causal Ordering • Total Ordering and the Stabilisation Problem G. Blair/ F. Taiani

  3. Introducing Group Communication • What is group communication? • Enables the multicasting of a message to a group of processes as a single action • Sender is unaware of the destinations for the message • Why group communication? • Support for replication • fault tolerance & performance  See W8/9 Lectures on FT • Support the efficient dissemination of data • Service discovery, publish/subscribe Associated reading: section 7.4 of Tanenbaum and Van Steen G. Blair/ F. Taiani

  4. A Typical Group Service interfaceGroupCommunicationService{ // creates a new group and returns the groups ID publicGroupIDgroupCreate(); // Adding & Removing a member to/from a group public void groupJoin ( GroupID group, Participant member);public void groupLeave ( GroupID group, Participant member); // multicasts a message to the named group with // the specified delivery semantics, and // optionally collects a number of replies publicMessages[] multicast ( GroupID group, OrderType order,Messages message, int nbReplies) } appli appli groupcomm groupcomm Crucial issue of delivery semantics: reliability & ordering G. Blair/ F. Taiani

  5. Reliability Guarantees • Unreliable multicast • Message sent to all members and may or may not arrive • Reliable multicast • Reasonable efforts are made to ensure delivery in spite of message losses • Can be based on positive or negative acknowledgements • No guarantees is the sender crashes during multicast • Atomic multicast • All members receive message, or none do • Main issue: tolerate the sender’s crash G. Blair/ F. Taiani

  6. Implementing Atomic Multicast • Originator sends a message to each member of the group, and awaits acknowledgements • If some acknowledgements are not received in a given period of time, re-send message; repeat this n times if necessary • If all acknowledgements received • then report success to caller • else remove offending members(s) from group • Each member must monitor the originator • On failure  one of the member must become 'originator' • Each member must also monitor for completion • Awaits arrival of a next message from the same originator to indicate previous multicast completed A ack ack B msg msg D ack msg C G. Blair/ F. Taiani

  7. ? x2 +1 ? Atomic Multicast: Notes (I) • Crashed members never receive the multicast message • Solved by deciding that crashed hosts don’t belong to any group • Importance of message ordering: • No guarantees of message ordering for parallel multicast B A C D A 5 2 4 2 2 4 B Time D 3 6 5 C 6 3 2 G. Blair/ F. Taiani

  8. Atomic Multicast: Notes (II) • What happens if a process joins or leaves during the multicast operation? • Everybody needs to agree about group membership to be able to take over the role of originator • But agreeing requires a new multicast operation! • Multiple “originators” might be active. • The previous algorithm does not handle this • For this we need ordering guarantees + something called virtual synchrony • The previous protocol does not scale well • The originator must wait for the ACKS of all group members • Known as ACK explosion G. Blair/ F. Taiani

  9. NACK(m1) last msg from B = 0 A m1 B m2 m1 m2 m1 counter = 1 C last msg from B = 1 Homework: represent the above protocol on a time-sequence diagram Avoiding ACKs Explosion • Using negative ACKs (abbreviated in NACKs) • If everything is fine the receiver does say anything • If a message is lost the receiver complains to the sender • Problem: How do we know that a message should be there? last msg from B = 0 ? A I’ve missed 1 message! B m2 counter = 0 counter = 1 counter = 2 C last msg from B = 0 last msg from B = 1 last msg from B = 2 G. Blair/ F. Taiani

  10. NACK Mechanism: Notes • ACK explosion is avoided • A NACK explosion might still occur but is less likely • Garbage collection problem • In theory sender should keep all past messages indefinitely • In practice past messages only kept for a “long enough” period • More advanced schemes possible • Limiting NACK instances using NACK broadcast • Hierarchical Feedback Control G. Blair/ F. Taiani

  11. Ordering Guarantees • Unordered multicast • No guarantees • FIFO (First In, First Out) • Messages sent from the same process are delivered in the order they were sent at different sites • Messages sent from different processes may be delivered in different orders at different sites • Totally ordered multicast • Consider messages m1 and m2 sent to the group by (potentially) different processes • Either m1 will be delivered before m2 or vice versa for all members of the group • Causally ordered multicast • As above, except the ordering of m1 and m2 is only important if a “happened-before” relationship exists between the messages G. Blair/ F. Taiani

  12. A Matter of Vocabulary • For some “Atomic Multicast” means • Atomic delivery • And total ordering • Tanenbaum and Van Steen use this definition • We use the meaning of Coulouris et al. • “A-tomic” originally means “that can’t be cut”, all-or-nothing • For us does not imply total-ordering • It’s only a matter of vocabulary • The concepts and algorithms are the same, the names change • But be sure to know what someone means when he/she says “atomic multicast” Atomic Multicast G. Blair/ F. Taiani

  13. Implementing Total Ordering • The sequencer approach (centralised) • All requests sent to a sequencer, where they are given an ID • The sequencer assigns consecutive increasing IDs • Requested arriving at sites are held back until they are next in sequence • Problems: sequencer = bottleneck + single point of failure • Other approaches • Distributed agreement to generate ids (as in Isis) • Assign timestamps from a (global) logical or physical clock G. Blair/ F. Taiani

  14. Total Ordering with Time-Stamps • Based on Lamport’s logical clocks (cf. last week lecture) • Messages belonging to same multicast operation are given the same timestamp • At destination messages delivered according to their timestamp • Logical clocks are needed to make sure delivery order is consistent with “happened before” relation • The key of the algorithm is to solve the stabilisation problem G. Blair/ F. Taiani

  15. mB:5.2 mB:5.2 mB:5.2 mD:7.4 mD:7.4 mD:7.4 Message are delivered by the local middleware in the correct order We know that A should not deliver mD yet but A itself has not way to know that mB is coming! Stabilisation problem • A message is stable when the local middleware is sure no other messages with lower timestamps will be received clockB = 5.2 B C A D clockD = 7.4 G. Blair/ F. Taiani

  16. Stabilisation Problem: Solution • Assumption: underlying network provides FIFO delivery • Messages from same sender delivered in same order as sent • If not provided can be implemented by consecutive IDs • On receiving a message mX a process • Puts the message in its local middleware queue • Multicasts ACK(mX) to all other group members • The acknowledgement is time-stamped with a higher timestamp than mX • A process delivers a message mX to the application when • This message is at the top of the local middleware queue • I.e. it has the lowest timestamp of all the messages in the queue • And all ACK(mX) for this message have been received G. Blair/ F. Taiani

  17. mB:5.2 mD:7.4 ackB(mD):8.2 mB:5.2 ackD(mD):8.4 ackB(mD):8.2 ackA(mD):9.1 mB:5.2 mD:7.4 mD:7.4 ackC(mD):11.3 A knows that all future message from B will have a time-stamp bigger than 8.2! mB:5.2 mD:7.4 ackA(mD) ackB(mD) sorted queue ackD(mD) Total Ordering with Time-Stamps:Final Algorithm (ACKs for mB not represented) clockB = 5.2 clockB = 8.2 B clockA = 8.1 clockA = 9.1 A C clockc = 11.3 clockc = 9.3 D clockD = 8.4 clockD = 7.4 G. Blair/ F. Taiani

  18. Notes On Previous Algo • Completely decentralised • Rapidly becomes messy • Distributed algorithm are hard to design and understand • Still not fault tolerant • All ACKs are needed. Any process crash blocks the system • Can be dealt with by using even more complex and more costly algorithms providing fault-tolerant consensus • Not scalable • ACKs needed from all group members! • Can be improved if we assume message latency is bounded See section 5.2.1 of Tanenbaum and Van Steen for revision G. Blair/ F. Taiani

  19. Example: Isis • What is Isis? • A framework for reliable distributed computing based on process groups • Offers a range of delivery semantics • Unordered (FBCAST), causally ordered (CBCAST), totally ordered (ABCAST), etc • Also provides group-view management and state transfer on join • Status • A successful product • Ongoing research on Horus, Ensemble, Java Groups G. Blair/ F. Taiani

  20. Expected Learning Outcomes At the end of this 4th Unit: • You should be able to define what a group communication service is • You should be able to explain the different reliability and ordering guarantees discussed today • You should appreciate scalability issues involves in reliable multicast • You should be able to distinguish total from causal ordering • You should understand the problem of stabilisation for total ordering G. Blair/ F. Taiani

More Related