Role of Group Communication in BS Architecture (or: Which platform are we going to use?)
Alberto Bartoli, University of Trieste
[Figure: map, Bologna, 300 km]
Group Communication
• BS will certainly use some form of replication
• BS will certainly use some form of GC
• Group Communication (GC)
  • Suite of communication and membership primitives
  • Very useful for implementing replication algorithms
  • In particular, in the presence of failures (host, network)
Options
• JavaGroups: Reliable Broadcast
  • Used in JBoss clustering extensions
  • Implemented in Java (stack of layers)
• Spread: Uniform Broadcast (much more powerful)
  • Used in a variety of environments
  • Implemented in C (Java interface available)
• JBora: novel idea (?), “Primary Uniform” Broadcast (much simpler to use)
  • Used in my lab only…
  • Thin Java layer on top of Spread
Replication
• Action A executed upon receiving a multicast m
  • Execute a method on a local object
  • Update the serialized state of a local bean
  • Commit a transaction
  • ...
• “Frequent” (informal) requirement: if a process executes A and then crashes, then A must also be executed by all processes that do not crash (“actions must not be lost”)
Reliable broadcast (JavaGroups)
• Either all correct processes deliver a message or none of them does
• No guarantee on processes that are not “correct”!
• [Figure: a failure followed by a membership change]
• Executing A might not be safe!
Uniform broadcast (Spread)
• If a process delivers a message, then all correct processes deliver that message
• [Figure: three cases, each marked “NO”]
• Executing A is always safe!
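The difference between the two guarantees can be checked on a delivery trace. The following is a minimal sketch, not code from JavaGroups or Spread: the function names, the process names and the trace format (message id mapped to the set of processes that delivered it) are all assumptions made for illustration.

```python
def reliable_ok(delivered_by, correct):
    """Reliable broadcast: either all correct processes deliver m or none does."""
    for receivers in delivered_by.values():
        got = receivers & correct  # correct processes that delivered this message
        if got and got != correct:
            return False
    return True


def uniform_ok(delivered_by, correct):
    """Uniform broadcast: if ANY process delivers m, all correct processes do."""
    for receivers in delivered_by.values():
        if receivers and not correct <= receivers:
            return False
    return True


# Hypothetical run: p1 delivers m, executes A, then crashes.
# The surviving (correct) processes p2 and p3 never deliver m.
correct = {"p2", "p3"}
lost_action = {"m": {"p1"}}
everyone = {"m": {"p1", "p2", "p3"}}

print(reliable_ok(lost_action, correct))  # the reliable guarantee still holds
print(uniform_ok(lost_action, correct))   # the uniform guarantee is violated
```

In the `lost_action` trace the action executed by p1 is lost, yet reliable broadcast is not violated, which is why executing A upon delivery is safe only under the uniform guarantee.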
In practice
• “Cross-the-fingers” reliability (don’t know or don’t care)
• “Real” reliability
  • Uniform broadcast
  • Reliable broadcast + additional measures
    • Replicated Databases: if a replica commits a transaction not committed by other replicas, undo the transaction later (Bettina)
    • Replicated Services: whenever a replica crashes, surviving replicas fetch from all clients the last reply they have received (Karamanolis, Magee, IEEE TSE)
    • Replicated Data: wait for an explicit response from every available replica (myself, Ozalp, JPDC) (JBoss SFSB clustering)
“Real reliability”
• Reliable broadcast + additional measures (JavaGroups):
  • Each additional measure is ad hoc
  • Each additional measure is complex (many failure patterns to consider and to cope with)
  • A lot of work above the GC layer
• Uniform broadcast (Spread):
  • No additional measures
  • Systematic approach, complexity within the GC layer
  • Very little work above the GC layer
Uniform broadcast: Hhmmm…
• “The view is about to split; GC can’t tell whether the processes that are about to leave have received the messages that follow”
• [Figure: timeline annotated “All correct processes will receive this” and “In-doubt message: can’t tell who will receive this!”]
• Again the same problem!
• If the network can partition, you need additional measures again!
JBora
• Very simple reasoning for the “common case”
• Processes that leave the primary view deliver a prefix of the sequence of messages in the primary view
• [Figure: message sequences 1..8 and 1..11, labeled “Non-Primary View” and “Primary View”]
• Executing A is always safe! No need for additional measures!
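The prefix property can be stated as a one-line check. This is a minimal sketch with a hypothetical helper name; the sequences reuse the message numbers from the slide's figure (1 to 11 in the primary view, 1 to 8 for processes that left it), which is one reading of that figure.

```python
def is_prefix(seq, of):
    """True if `seq` equals the first len(seq) elements of `of`."""
    return list(of[:len(seq)]) == list(seq)


primary_view = list(range(1, 12))  # messages 1..11 delivered in the primary view
left_view = list(range(1, 9))      # messages 1..8 delivered before leaving

print(is_prefix(left_view, primary_view))  # leaving processes saw a prefix
print(is_prefix([1, 2, 4], primary_view))  # a gap breaks the property
```

A process that left the primary view may be behind, but it never holds a message that the primary view skipped or ordered differently, so its state can be brought up to date by replaying the missing suffix.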
So what ? • JavaGroups • Spread • JBora
Scenario 1
We want to rely (almost) completely on (some snapshot of) JBoss clustering
• My suggestion:
  • Forget about uniform multicast (Spread, JBora)
  • Stick with JavaGroups
• Preliminary WP1 Meeting (Bologna, Trieste)
  • Encapsulating Spread within JavaGroups
  • Encapsulating JBora within JavaGroups
  • …too complex, dubious advantages
  • (see meeting slides for details)
Scenario 2
We don’t want to chase after JBoss clustering (we write our own clustering features)
• My suggestion:
  • We are not interested in uniform multicast: use JavaGroups
  • We are interested in uniform multicast: use JBora
My opinion
• If we use JavaGroups
  • I will ask to restructure WP1 (Task 1.3 “Support for group communication”)
  • Month 18: “Dear reviewer, Trieste has led Task 1.3. We did almost nothing.”
• If we decide to write our own clustering features
  • I don’t see a single reason why we should rule out uniform multicast from the beginning
Experiments: “Throughput under stress”
• Each sender injects 1000 msg/sec (bursty)
• All details available in a separate document (4 PIII 800 MHz, Windows 2000, 100 Mbit/s Ethernet)
• Important findings about JavaGroups (configured as in JBoss clustering):
  • Processes may start missing messages, and this occurs silently (no failure notification whatsoever)
  • You cannot start / recover multiple processes simultaneously (they do not discover each other)
  • Does not seem very “reliable” (at least, when stressed)
A few numbers... (very preliminary)

                       Spread   JBora   JavaGroups
1 sender (500 Byte)       640     576      561
2 senders                1254     871      496
1 sender (5 KBytes)       323     323      165
2 senders                 592     359      275

[Callouts on the slide, positions not recoverable: “FIFO Reliable”, “Total Uniform”, “150 !”, “Failed !”]
• Recall:
  • Spread, JBora: Message throughput = Operation throughput
  • JavaGroups: Message throughput < Operation throughput (N responses for each multicast)
Uniform broadcast: How is it implemented?
• [Figure: messages within the GC layers for one uniform broadcast: m, ack, deliver]
• Uniform broadcast delivered only upon the second broadcast
• In practice, many, many optimizations:
  • The white messages are not separate messages, but fields of other messages required anyway
• Costly, but not as much as it seems
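The two-round pattern (m, ack, then a second broadcast that triggers delivery) can be sketched as a toy, single-threaded simulation. This is an illustration under stated assumptions, not Spread's actual protocol: the `Member` class, the `uniform_broadcast` function and the `crashed_before_ack` parameter are hypothetical, and none of the optimizations mentioned above are modeled.

```python
class Member:
    def __init__(self, name):
        self.name = name
        self.pending = {}    # msg_id -> payload received but not yet delivered
        self.delivered = []  # payloads handed to the application, in order

    def on_message(self, msg_id, payload):
        """First broadcast: buffer the message and acknowledge it."""
        self.pending[msg_id] = payload
        return self.name  # the ack carries the name of the acknowledging member

    def on_deliver(self, msg_id):
        """Second broadcast: every member holds the message, so deliver it."""
        self.delivered.append(self.pending.pop(msg_id))


def uniform_broadcast(view, msg_id, payload, crashed_before_ack=()):
    """Deliver only if every member of the view has acknowledged the message."""
    acks = set()
    for member in view:
        if member.name in crashed_before_ack:
            continue  # this member never receives m and never acks
        acks.add(member.on_message(msg_id, payload))
    if acks != {m.name for m in view}:
        return False  # an ack is missing: nobody delivers in this view
    for member in view:
        member.on_deliver(msg_id)
    return True


view = [Member("p1"), Member("p2"), Member("p3")]
ok = uniform_broadcast(view, 1, "A")
blocked = uniform_broadcast(view, 2, "B", crashed_before_ack=("p3",))
```

After this run every member has delivered "A", while "B" stays buffered at p1 and p2 and is delivered by no one; a real GC layer would resolve that case through a membership change.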
JBoss Clustering (I)
• [Figure: messages from the application layer for one operation: m, done]
• My belief:
  • Less efficient than uniform multicast
  • The application injects N one-to-one messages into the system
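The cost argument can be made concrete by counting application-layer messages per operation. This is a rough sketch with a hypothetical function name; it counts only what the application itself injects (one multicast, plus N one-to-one responses in the JBoss-style scheme) and ignores messages exchanged inside the GC layer.

```python
def app_messages_per_operation(n_replicas, uniform):
    """Count application-layer messages needed to complete one operation."""
    multicast = 1
    # With uniform broadcast the GC layer provides the guarantee, so the
    # application adds nothing; otherwise each replica sends one response.
    responses = 0 if uniform else n_replicas
    return multicast + responses


for n in (2, 4, 8):
    print(n, app_messages_per_operation(n, uniform=False),
          app_messages_per_operation(n, uniform=True))
```

The application-level cost grows linearly with the number of replicas in the response-based scheme, while it stays constant when the guarantee is pushed into the GC layer.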
JBoss Clustering (II)
• [Figure: messages from the application layer for one operation: m, done]
• Devising all possible failure patterns and coping with them correctly is very, very complex
• Difficult to achieve full confidence in the algorithm and its implementation
• I know from our JPDC work that coping with view changes here is VERY complex
• Does JBoss really handle all cases correctly?
Transitional Views: Why can’t they be avoided?
• Suppose a network failure occurs during the protocol
• The GC layer may end up with one side of the partition that does not know whether the other side has received the message
• Two approaches:
  • The GC layer waits until the partition recovers (not feasible)
  • The GC layer notifies the application of the new view after a warning