Algorithm for Virtually Synchronous Group Communication Idit Keidar, Roger Khazan MIT Lab for Computer Science Theory of Distributed Systems Group
Talk Outline • Motivating applications • Group Communication • Virtual Synchrony • Existing solutions • Problems with existing solutions • Our idea • Our contributions • Implementation
Motivation: Modern Distributed Applications • Highly available servers • Web • Video-on-Demand • Collaborative computing • Military command and control • Shared white-board, shared editor, etc. • Online strategy games • Stock Market
Common, Important Issues in Building Distributed Applications • Reliable communication • Consistency • same picture of game, same shared file • Fault-tolerance, high-availability • failures, recoveries, partitions, merges • Scalability • Performance
Group Communication -- Useful “Building Block” • Group Abstraction • processes interact in a group • dynamic: fail/join/partition/merge • Reliable Group Multicast • Group Membership -- generates “views” • tell each process who it is connected to • Systems: Ensemble, Horus, Isis, Newtop, Psync, Sphynx, Relacs, Totem, Transis
Virtual Synchrony • Synchronization of Messages and Views: • Procs that go together through same views deliver same sets of messages • Powerful abstraction for replication • Semantics: VS [Birman, Joseph 87], EVS, SVS
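The property on this slide can be checked mechanically over an execution trace. The following is a minimal Python sketch and is not part of the talk; the trace layout (each process mapped to an ordered list of `(view_id, delivered_messages)` pairs) and the name `satisfies_virtual_synchrony` are assumptions made for illustration.

```python
from itertools import combinations


def satisfies_virtual_synchrony(trace):
    """Check the 'same views, same message sets' property on a trace.

    trace maps each process to an ordered list of
    (view_id, set of messages delivered in that view) pairs.
    This layout is a hypothetical one chosen for the example.
    """
    # Group processes by the transition they made (old view -> next view)
    # and remember what each delivered in the old view.
    transitions = {}
    for proc, history in trace.items():
        for (view, msgs), (next_view, _) in zip(history, history[1:]):
            transitions.setdefault((view, next_view), {})[proc] = frozenset(msgs)

    # Processes that made the same transition must agree on the delivered set.
    for delivered in transitions.values():
        for a, b in combinations(delivered.values(), 2):
            if a != b:
                return False
    return True
```

In this sketch a process that moves from the old view into a different next view (for example after a partition) is not compared with the others, which matches the "go together" wording of the slide.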
Virtual Synchrony: How To? • Before moving into new view: • Exchange synch messages (“flush”) to agree which msgs to deliver in old view • Need to know which synch msgs to use, since there may be several view proposals
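A rough Python sketch of the flush step follows. The slide does not state the agreement rule, so the one used here is an assumption: each synch message reports the messages its sender has received in the old view, and the processes that move on together deliver the union of those reports (which presumes missing messages can be forwarded among them). The name `messages_to_deliver` is hypothetical.

```python
def messages_to_deliver(synch_msgs, moving_together):
    """Decide what to deliver in the old view before installing the new one.

    synch_msgs maps a process to the set of messages it reported in its
    synch message; moving_together is the set of processes that transition
    from the old view into the new view together.
    Assumed rule (not from the slides): deliver the union of the reports.
    """
    agreed = set()
    for proc in moving_together:
        agreed |= synch_msgs[proc]
    return agreed
```

Because every process in `moving_together` evaluates the same function on the same synch messages, they all arrive at the same set, which is what the virtual synchrony property requires.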
Existing Solutions • Limit Reconfiguration • Do not allow joins during reconfiguration • When someone wants to join: • first, deliver view without joiner; • then, start new reconfiguration. • Use common id to identify synch msgs for same view proposal
Problems with Existing Solutions • Limited Reconfiguration • Obsolete views delivered to application • Creates overhead • Limits usefulness of virtual synchrony • Use of common id to identify synch msgs • Pre-agreement or dissemination is required • Costly, especially in WANs
Our Idea • Don’t limit reconfiguration • Each process issues a locally unique id for each view proposal • Tag synch msgs with these local ids • View includes vector of latest local ids • View is a triple: e.g., < 4, {p, q, r}, [8, 9, 3] > • Procs use the synch msgs identified by the view • Hence, procs use the right synch msgs
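The selection of synch messages by local id can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' C++ implementation: the class names `View` and `SynchStore`, their methods, and the use of message sets as synch payloads are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class View:
    """A view as a triple, e.g. < 4, {p, q, r}, [8, 9, 3] >."""
    view_id: int
    members: tuple      # e.g. ("p", "q", "r")
    local_ids: tuple    # latest local id of each member, in the same order

    def local_id_of(self, proc):
        return self.local_ids[self.members.index(proc)]


class SynchStore:
    """Keeps received synch msgs keyed by (sender, sender's local id)."""

    def __init__(self):
        self._msgs = {}

    def record(self, sender, local_id, payload):
        self._msgs[(sender, local_id)] = payload

    def for_view(self, view):
        """Return the synch msg of every member that matches the view's
        id vector, or None if some matching synch msg has not arrived."""
        selected = {}
        for proc in view.members:
            key = (proc, view.local_id_of(proc))
            if key not in self._msgs:
                return None
            selected[proc] = self._msgs[key]
        return selected
```

With the view from the slide, a synch message that p tagged with local id 7 (an earlier proposal) is ignored, and only the one tagged 8 is used. No common id has to be agreed on or disseminated beforehand, since the view itself names the synch message of each member.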
Contributions • Algorithm for Virtual Synchrony • useful, commonly provided semantics • processes can join during reconfiguration; hence, more benefit from Virtual Synchrony • one round, in parallel with view reconfiguration • Architecture • external membership service ([KSMD, 00]) • Formal Treatment • specs, algorithm, safety and liveness proofs
Implementation • VS library (C++), linked with application • Uses [KSMD, 00] membership service, implemented in C++, with a socket interface to members • Reliable FIFO layer (developed at the Hebrew University): uses IP multicast and recovers lost messages; a library linked with VS