1 / 28

TOTEM: A FAULT-TOLERANT MULTICAST GROUP COMMUNICATION SYSTEM

TOTEM: A FAULT-TOLERANT MULTICAST GROUP COMMUNICATION SYSTEM. L. E. Moser, P. M. Melliar Smith, D. A. Agarwal, B. K. Budhia C. A. Lingley-Papadopoulos University of California, Santa Barbara. INTRODUCTION. Totem provides reliable totally-ordered multicasting of messages over LANs

kory
Download Presentation

TOTEM: A FAULT-TOLERANT MULTICAST GROUP COMMUNICATION SYSTEM

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. TOTEM: A FAULT-TOLERANT MULTICASTGROUP COMMUNICATION SYSTEM L. E. Moser, P. M. Melliar Smith,D. A. Agarwal, B. K. BudhiaC. A. Lingley-Papadopoulos University of California, Santa Barbara

  2. INTRODUCTION • Totem provides reliable totally-ordered multicasting of messages over LANs • Intended for complex applications with critical requirements for • fault tolerance • real-time performance • Exploits hardware broadcast of most LANs

  3. TOTEM SERVICES • Built as a hierarchy of protocols: Application layer Process group interface Multiple-ring protocol Single-ring protocol Physical medium

  4. Single-ring protocol • Built on top of a best-effort multicast service, using UDP to exploit the hardware broadcasts of the LAN • Converts these multicasts into the service of reliable totally ordered delivery of messages on a single LAN • Also provides fault-detection, recovery and configuration change service

  5. Multiple-ring protocol • Uses information from the process group interface above it • Provides total ordering of messages as well as network topology maintenance services

  6. Process group interface • Delivers messages to the application processes in the appropriate process groups • Provides process group membership services.

  7. Services provided by Totem • Two reliable totally ordered message delivery services: • Agreed delivery • Safe delivery • Both services deliver messages in a single system-wide total order that respects Lamport’s causal order

  8. Agreed Delivery • Guarantees that a processor will not deliver a message before it has delivered all prior messages that: • Have been issued by processors in the current configuration and • Have time-stamps within the duration of that configuration • All processes receive all messages in theorder they were sent

  9. Safe Delivery • Further guarantee that a processor will not deliver a message unless all processors in its configuration have received it (everyone or nobody). • All processes receive all messages in the same order at the same time

  10. Why Lamport’s causal order? • Otherwise processes that belong to two or more groups could receive message from different groups in different order • A and B both in groups G and H • A receives m from group G then m’ from group H and finally m’’ from group G • B could receive m’ from group H then m from group G and finally m’’ from group G

  11. Example Group G sends messages m and m’’to A and B Group H sends message m’to A and B Both A and B will receive m and m’’in the same order Without total ordering, A could receive m’ before m’’ and B could receive m’’ before m’

  12. Delivery guarantees • Extended virtual synchrony ensures that these guarantees are honored within every configuration • When a fault occurs, Totem forms a transitional configuration with a reduced membership • Message order is guaranteed even in the presence of network partitions

  13. Extended virtual synchrony (I) • We want to ensure that • Messages are received in the same order by all processes • All processes share the same view of the process group to which they belong

  14. Extended virtual synchrony (II) • Virtual synchrony model (K. Birman, ISIS) orders group membership changes along with the regular messages • Ensures that failures do not result in • Incomplete delivery of multicast messages • Holes in the causal delivery order • Problems remain if network can partition

  15. Extended virtual synchrony (III) • Extended virtual synchrony model (Totem) extends the virtual synchrony model to systems • Processes can fail and recover • Network can partition and remerge • Guarantees that same message sent to processes in two or more components of a partitioned network will be in a consistent order in all these components

  16. Ordering of messages • Messages are born-ordered: • Each message includes a time-stamp • Relative order of messages is determined by the message themselves as created by their senders

  17. The single-ring protocol (I) • Uses a circulating token containing among others: • A seq field with the sequence number of the last message that was sent • An aru field with the sequence number of the last message that has been received by all processors • Only the processor that holds the token can send a message

  18. The single-ring protocol (II) • aru field used to implement safe delivery: • Tells processors which messages have been received by every processor in the ring • Token also provides information about the aggregate message backlog of the processors on the ring • Results in a fairer bandwidth allocation among processors than FDDI

  19. Local membership protocol (I) • Part of the single-ring protocol • Allows • Inclusion of new or recovering processors • Deletion of faulty processors

  20. Local membership protocol (II) • Ensures: • Consensus among all members of a configuration about the configuration membership • Terminationas each configuration will be installed on every processor within a bounded time or not at all.

  21. The multiple-ring protocol (I) • Operates over several LANs linked by gateways • Each LAN is organized as a virtual token ring and managed by the single-ring protocol • Offers same services and same guarantees as single-ring protocol Ring A Ring B Ring C

  22. The multiple-ring protocol (II) • Uses Lamport’s timestamps and delivers messages in timestamp order • When a gateway forwards a message from one ring to another, it gives to the message a new sequence number for the new ring • Processor faults and network partitions are detected by the single-ring protocol

  23. Ring B C A Messages mB1, mB2 mC1 mA1, mA2, mA3 Message delivery (I) • Each processor maintains one recv_msgs list of messages received but not yet delivered for each ring from which it can receive messages

  24. Message delivery (II) • A processor will deliver a message as an agreed message as soon as • Message has the lowest time stamp of all the messages in its recv_msgs lists • None of these lists is empty • We could wait forever for rings that have no messages to send but for guaranteed vector messages

  25. Ring Messages (with Timestamps) A mA1(8), mA2(12), mA3(16) B mB1(9), mB2(13) C mC1(10) Example (I) • Consider the following recv_msgs list where the numbers in parentheses indicate the message timestamps.

  26. Ring Messages (with Timestamps) A mA1(8), mA2(12), mA3(16) B mB1(9), mB2(13) C mC1(10) Example (II) • We can deliver messages mA!, mB1 and mC1

  27. Ring Messages (with Timestamps) A mA2(12), mA3(16) B mB2(13) C --- Example (III) • We cannot deliver more messages because we might miss a message mCnfrom ring C with an earlier timestamp, say ts = 11.

  28. Related issues • Totem sends from time to time guaranteed vector messages • They specify among other things, which rings have sent messages • Processor faults and network partitions are detected by the single-ring protocol • Configuration and topology change messages have timestamps and are delivered in strict timestamp order

More Related