Messaging, MOMs and Group Communication and pub-sub systems CS 237 Distributed Systems Middleware (with slides from Cambridge Univ and Petri Maaranen)
Inter-process Communication • Two message communication operations • send and receive • a message is a sequence of bytes • Two modes of communication • Synchronous • The sending and receiving processes synchronize at every message (both send and receive are blocking calls) • Asynchronous • Both send and receive can be non-blocking • Issues: • Reliability (all sent messages are received) • Ordering (messages are received in the order they were sent)
IPC through Sockets • The most popular abstraction for IPC • Two protocols: • UDP [unreliable and no ordering guarantee] • Send/receive datagrams • Example applications: DNS, VoIP • TCP [reliable and ordering guarantee] • Send/receive bytes (called byte streams) • Example applications: HTTP, FTP, SMTP, Telnet
UDP Server • Binds to a port that the client knows • Blocking receive • (from Distributed Systems Concepts and Design by Coulouris, Dollimore)
UDP Client • Blocking
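The UDP server and client slides can be sketched with Python's standard `socket` module. This is a minimal illustration, not the textbook's code: the function names, the loopback address, the 1024-byte buffer, and the 2-second timeout are all assumptions chosen for the example.

```python
import socket
import threading

def udp_echo_server(sock: socket.socket) -> None:
    """Serve exactly one request: block in recvfrom, then echo the datagram."""
    data, client_addr = sock.recvfrom(1024)   # blocking receive
    sock.sendto(data, client_addr)

def udp_request(server_addr, payload: bytes, timeout: float = 2.0) -> bytes:
    """Send one datagram and block (up to `timeout` seconds) for the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout)            # UDP gives no delivery guarantee
        client.sendto(payload, server_addr)
        reply, _ = client.recvfrom(1024)
        return reply

def demo() -> bytes:
    """Run the server in a thread and send it one datagram over loopback."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))             # port 0: the OS picks a free port
    addr = server.getsockname()               # the port the client must know
    worker = threading.Thread(target=udp_echo_server, args=(server,), daemon=True)
    worker.start()
    try:
        return udp_request(addr, b"hello")
    finally:
        worker.join(timeout=2.0)
        server.close()
```

The client sets a timeout because a lost datagram would otherwise leave it blocked forever, which is the practical face of "unreliable and no ordering guarantee".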
TCP Server • Creates a server socket on a certain port and listens for new connections • On a connection, starts a new thread with that socket • Opens data streams and sends/receives bytes
TCP Client • Creates a socket and connects to the server
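A thread-per-connection TCP server and its client might look like the following sketch. The upper-casing "service", the helper names, and the loopback setup are assumptions made for illustration; only the structure (listen, accept, hand the socket to a new thread, exchange a byte stream) comes from the slides.

```python
import socket
import threading

def handle_connection(conn: socket.socket) -> None:
    """Per-connection worker: echo the byte stream back upper-cased."""
    with conn:
        while True:
            chunk = conn.recv(1024)
            if not chunk:                  # empty read: client closed its side
                break
            conn.sendall(chunk.upper())

def serve(listener: socket.socket, max_connections: int) -> None:
    """Accept connections and start one thread per connection."""
    for _ in range(max_connections):
        conn, _addr = listener.accept()
        threading.Thread(target=handle_connection, args=(conn,), daemon=True).start()

def tcp_request(addr, payload: bytes) -> bytes:
    """Connect, send the payload, and read the reply until end of stream."""
    with socket.create_connection(addr, timeout=2.0) as sock:
        sock.sendall(payload)
        sock.shutdown(socket.SHUT_WR)      # tell the server the request is complete
        chunks = []
        while True:
            chunk = sock.recv(1024)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)

def demo() -> bytes:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))        # port 0: the OS picks a free port
    listener.listen()
    addr = listener.getsockname()
    acceptor = threading.Thread(target=serve, args=(listener, 1), daemon=True)
    acceptor.start()
    try:
        return tcp_request(addr, b"hello tcp")
    finally:
        acceptor.join(timeout=2.0)
        listener.close()
```

Because TCP is a byte stream with no message boundaries, the client here marks the end of its request by shutting down its write side; a real protocol would more likely use length prefixes or delimiters.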
Message Presentation and Marshalling • Marshalling is the process of taking a collection of data items and assembling them into a form suitable for transmission in a message (or for storage) • Also known as serialization • Unmarshalling is the reverse process • A few popular methods • CORBA’s CDR (Common Data Representation) • Java object serialization • XML (textual representation of structured data) • JSON (JavaScript Object Notation) • Protocol buffers (used by Google for gRPC)
Message Presentation and Marshalling • Example record: {‘Smith’, ‘London’, 1984} • [Figures: the record marshalled in CORBA’s CDR and in Java object serialization]
Message Presentation and Marshalling • Example record: {‘Smith’, ‘London’, 1984} • XML: [figure] • Protocol buffers: message person { required string name = 1; required string place = 2; optional int32 year = 3; } • JSON: { "person": { "name": "Smith", "place": "London", "year": 1984 } }
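To make marshalling concrete, the sketch below serializes the example record two ways: as JSON text and as a hand-rolled binary layout with length-prefixed strings. The binary layout is an assumption invented for this example (it is not CDR, Java serialization, or the protocol buffers wire format), and all function names are hypothetical.

```python
import json
import struct
from typing import Tuple

Person = Tuple[str, str, int]   # (name, place, year)

def marshal_json(person: Person) -> bytes:
    name, place, year = person
    doc = {"person": {"name": name, "place": place, "year": year}}
    return json.dumps(doc).encode("utf-8")

def unmarshal_json(data: bytes) -> Person:
    p = json.loads(data.decode("utf-8"))["person"]
    return p["name"], p["place"], p["year"]

def marshal_binary(person: Person) -> bytes:
    """Each string: 4-byte big-endian length + UTF-8 bytes; year: 4-byte unsigned int."""
    name, place, year = person
    out = b""
    for text in (name, place):
        raw = text.encode("utf-8")
        out += struct.pack("!I", len(raw)) + raw
    return out + struct.pack("!I", year)

def unmarshal_binary(data: bytes) -> Person:
    offset = 0
    strings = []
    for _ in range(2):
        (length,) = struct.unpack_from("!I", data, offset)
        offset += 4
        strings.append(data[offset:offset + length].decode("utf-8"))
        offset += length
    (year,) = struct.unpack_from("!I", data, offset)
    return strings[0], strings[1], year
```

The JSON form is self-describing because field names travel with the data, while the binary form is more compact but only makes sense if both sides agree on the layout in advance.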
Indirect Communications • Indirection is a fundamental concept in computer science, and there is a famous saying: • “All problems in computer science can be solved by another level of indirection” • Indirect communication enables entities to talk through an intermediary with no direct coupling between the sender and the receiver(s) • Two types of uncoupling • Space uncoupling: the sender does not need to know the identities of the receivers and vice versa • Time uncoupling: the sender and receiver are not required to exist and operate at the same time
Indirect Communication • Space and time coupling in distributed systems
Indirect vs Asynchronous Communication • Asynchronous communication and time uncoupling are different • In asynchronous communication • a sender sends a message and then continues (without blocking), and hence there is no need to meet in time with the receiver to communicate • Time uncoupling adds the extra dimension that the sender and receiver(s) can have independent existences • for example, the receiver may not exist at the time communication is initiated
Indirect Communication Techniques • Group communication • communication is via a group abstraction with the sender unaware of the identity of the recipients • Publish-subscribe systems • a family of approaches that all share the common characteristic of disseminating events to multiple recipients through an intermediary • Message queue systems • messages are directed to the familiar abstraction of a queue with receivers extracting messages from such queues • Shared memory–based approaches • including distributed shared memory and tuple space approaches, which present an abstraction of a global shared memory to programmers
Message Queues • Message queues are a point-to-point form of indirect communication • Simple idea: a sender process places the message into a queue, and it is then removed by a receiver process • Also called Message-Oriented Middleware (MOM) • Mainly used in Enterprise Application Integration (EAI): integration between applications within a given enterprise • Used in commercial transaction processing systems • Examples: • IBM’s WebSphere MQ • Microsoft’s MSMQ • Oracle’s Streams Advanced Queuing (AQ)
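The point-to-point queue idea can be sketched in a few lines. This is an in-memory toy, assuming a single process and no persistence; `MessageQueue`, `sender`, and `receiver` are hypothetical names and do not correspond to the API of any product listed above.

```python
from collections import deque
from typing import Deque, List, Optional

class MessageQueue:
    """A FIFO queue that sits between sender and receiver."""

    def __init__(self) -> None:
        self._messages: Deque[bytes] = deque()

    def put(self, message: bytes) -> None:
        self._messages.append(message)

    def get(self) -> Optional[bytes]:
        """Remove and return the oldest message, or None if the queue is empty."""
        return self._messages.popleft() if self._messages else None

def sender(queue: MessageQueue) -> None:
    # The sender only knows the queue, not who will consume the messages.
    for body in (b"order-1", b"order-2"):
        queue.put(body)

def receiver(queue: MessageQueue) -> List[bytes]:
    # The receiver may start long after the sender has finished.
    received = []
    while (message := queue.get()) is not None:
        received.append(message)
    return received
```

Each message is consumed by exactly one `get`, which is what makes the queue point-to-point, and the receiver can run after the sender has exited, which is the time uncoupling described earlier.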
cf: www.cl.cam.ac.uk/teaching/0910/ConcDistS/ (Middleware) Message-Oriented Middleware (MOM) • Communication using messages • Synchronous and asynchronous communication • Messages stored in message queues • Message servers decouple client and server • Various assumptions about message content • [Figure: a client application and a server application, each with local message queues, connected over the network through message servers that hold message queues]
Properties of MOM • Asynchronous interaction • Client and server are only loosely coupled • Messages are queued • Good for application integration • Support for reliable delivery service • Keep queues in persistent storage • Processing of messages by intermediate message server(s) • May do filtering, transforming, logging, … • Networks of message servers • Natural for database integration
Message-Oriented Middleware (4): Message Brokers • A message broker is a software system based on asynchronous, store-and-forward messaging. • It manages interactions between applications and other information resources, utilizing abstraction techniques. • Simple operation: an application puts (publishes) a message to the broker, another application gets (subscribes to) the message. The applications do not need to be session connected. TJTST21 Spring 2006
(Message Brokers, MQ) • MQ is fairly fault tolerant in cases of network or system failure. • Most MQ software lets a message be declared persistent, or stores it to disk during a commit at certain intervals. This allows for recovery in such situations. • Each MQ product implements the notion of messaging in its own way. • Widely used commercial examples include IBM’s MQSeries and Microsoft’s MSMQ.
Message Brokers • Any-to-any – The ability to connect diverse applications and other information resources – The consistency of the approach – Common look-and-feel of all connected resources • Many-to-many – Once a resource is connected and publishing information, the information is easily reusable by any other application that requires it.
Standard Features of Message Brokers • Message transformation engines – Allow the message broker to alter the way information is presented for each application. • Intelligent routing capabilities – Ability to identify a message and route it to the appropriate location. • Rules processing capabilities – Ability to apply rules to the transformation and routing of information.
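The three broker features above (transformation, routing, rules) can be combined in a small sketch. Everything here is hypothetical: the `MessageBroker` class, its method names, and the dictionary message format are assumptions for illustration and are not modelled on any specific broker product.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Message = Dict[str, Any]
Predicate = Callable[[Message], bool]
Transform = Callable[[Message], Message]

class MessageBroker:
    """Store-and-forward broker: rules decide where a message goes and how it looks."""

    def __init__(self) -> None:
        self._rules: List[Tuple[Predicate, str, Transform]] = []
        self._queues: Dict[str, List[Message]] = {}

    def add_rule(self, predicate: Predicate, destination: str,
                 transform: Transform = lambda m: m) -> None:
        """Rule: if predicate(message), route transform(message) to destination."""
        self._rules.append((predicate, destination, transform))
        self._queues.setdefault(destination, [])

    def publish(self, message: Message) -> int:
        """Apply every matching rule; return how many destinations got a copy."""
        delivered = 0
        for predicate, destination, transform in self._rules:
            if predicate(message):
                # Each destination receives its own transformed copy.
                self._queues[destination].append(transform(dict(message)))
                delivered += 1
        return delivered

    def get(self, destination: str) -> Optional[Message]:
        """Fetch the oldest stored message for a destination, if any."""
        queue = self._queues.get(destination, [])
        return queue.pop(0) if queue else None
```

A usage example: a rule `lambda m: m["type"] == "order"` with destination `"billing"` and a transform that adds an `amount_cents` field would hand the billing application its own view of every order, while the publisher stays unaware of who consumes it.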
Vendors • Adea Solutions: Adea ESB Framework • ServiceMix: ServiceMix (Apache) • Synapse (Apache Incubator) • BEA: AquaLogic Service Bus • BIE: Business Integration Engine • Cape Clear Software: Cape Clear 6 • Cordys: Cordys ESB • Fiorano Software Inc.: Fiorano ESB™ 2006 • IBM: WebSphere Platform (specifically WebSphere Message Broker or WebSphere ESB) • IONA Technologies: Artix • iWay Software: iWay Adaptive Framework for SOA • Microsoft: .NET Platform, Microsoft BizTalk Server • ObjectWeb: Celtix (Open Source, LGPL) • Oracle: Oracle Integration products • Petals Services Platform: EBM WebSourcing & Fossil E-Commerce (Open Source) • PolarLake: Integration Suite • LogicBlaze: ServiceMix ESB (Open Source, Apache Lic.) • Software AG: EntireX • Sonic Software: Sonic ESB • SymphonySoft: Mule (Open Source) • TIBCO Software • Virtuoso Universal Server • webMethods: webMethods Fabric
Conclusions • Message-oriented middleware -> Message brokers -> ESB • Services provided by Message Brokers • Common characteristics of ESB • Products and vendors
IBM MQSeries • One-to-one reliable message passing using queues • Persistent and non-persistent messages • Message priorities, message notification • Queue Managers • Responsible for queues • Transfer messages from input to output queues • Keep routing tables • Message Channels • Reliable connections between queue managers • Messaging API:
Java Message Service (JMS) • API specification to access MOM implementations • Two modes of operation *specified*: • Point-to-point • one-to-one communication using queues • Publish/Subscribe • cf. Event-Based Middleware • JMS Server implements JMS API • JMS Clients connect to JMS servers • Java objects can be serialised to JMS messages • A JMS interface has been provided for MQ • pub/sub (one-to-many) - just a specification?
Disadvantages of MOM • Poor programming abstraction (but has evolved) • Rather low-level (cf. Packets) • Request/reply more difficult to achieve, but can be done • Message formats originally unknown to middleware • No type checking (JMS addresses this – implementation?) • Queue abstraction only gives one-to-one communication • Limits scalability (JMS pub/sub – implementation?)
Generalizing communication • Group communication • Synchrony of messaging is a critical issue • Publish-subscribe systems • A form of asynchronous messaging
Group Communication (GC) • Communication to a collection of processes • Called process group • GC is also called (reliable) multicast • Group communication can be exploited to provide • Simultaneous execution of the same operation in a group of workstations • Software installation in multiple workstations • Consistent network table management • Who needs group communication? • Highly available servers (i.e., replicated servers) • Conferencing • Cluster management • Distributed Logging….
What type of group communication? • Peer • All members are equal • All members send messages to the group • All members receive all the messages • Client-Server • Common communication pattern • Replicated servers • Client may or may not care which server answers • Diffusion group • Servers send to other servers and clients • Hierarchical • Highly and easily scalable • [Figure: servers and clients]
GC: Open and Closed Groups • Closed group: • Only members of the group may multicast to it • A process in a closed group delivers to itself any message that it multicasts to the group • Example: replicated servers (one sending updates to another) • Open group • Processes outside the group may send to it • Example: delivering events to groups of interested processes
Group Communication (GC) Examples • Consider a replicated service case • Replicated identical servers, OR • Data are replicated to multiple servers • [Figure: clients multicast to replicated identical servers; clients multicast to servers holding replicated data] • Not a single message can be missed (reliable multicast) • All messages should be delivered in the same order (ordered multicast)
Basic Multicast (B-multicast) • B-multicast (Basic multicast) • B-multicast sends the message to each member in a loop • Not reliable, since the sender may crash while sending • Note that in group communication, the term “deliver” is used rather than “receive” • A multicast message is said to be “delivered” to the receiving process • Incoming messages are “received” and held in a hold-back queue, and are then “delivered” to the app when the conditions are met
Reliable Multicast • Delivering a message reliably to a set of processes (i.e., group) • Properties: • Integrity: A correct process p delivers a message m at most once • Validity: If a correct process multicasts message m, then it will eventually deliver m • Agreement: If a correct process delivers message m, then all other correct processes in group(m) will eventually deliver m
Reliable Multicast (R-Multicast) • R-multicast: • The first time a process receives a message, it multicasts the message again to all processes (unless it is the original sender) • It then delivers the message; duplicate copies received later are discarded
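The relay-on-first-receipt idea can be simulated in a single Python process. This is a sketch under simplifying assumptions: message passing is modelled as direct method calls, the channels never lose messages, and a crash is modelled by the hypothetical `crash_after` parameter, which lets only the first few sends of the B-multicast loop leave the sender.

```python
from typing import List, Optional, Set, Tuple

Message = Tuple[str, int, str]   # (message id, original sender pid, body)

class Process:
    def __init__(self, pid: int, group: List["Process"]) -> None:
        self.pid = pid
        self.group = group              # shared membership list
        self.received: Set[str] = set() # ids of messages already seen
        self.delivered: List[str] = []  # bodies handed to the application
        self.crashed = False
        group.append(self)

    def b_multicast(self, msg: Message, crash_after: Optional[int] = None) -> None:
        """Basic multicast: send to each member in a loop."""
        targets = list(self.group)
        if crash_after is not None:
            targets = targets[:crash_after]   # only these sends leave before the crash
            self.crashed = True
        for member in targets:
            member.b_deliver(msg)

    def r_multicast(self, msg_id: str, body: str,
                    crash_after: Optional[int] = None) -> None:
        self.b_multicast((msg_id, self.pid, body), crash_after)

    def b_deliver(self, msg: Message) -> None:
        if self.crashed:
            return                      # a crashed process does nothing
        msg_id, sender, body = msg
        if msg_id in self.received:
            return                      # duplicate: integrity (at most once)
        self.received.add(msg_id)
        if sender != self.pid:
            self.b_multicast(msg)       # relay first, so others still get it
        self.delivered.append(body)     # R-deliver to the application
```

With plain B-multicast, a sender that crashes after reaching only one member would leave the rest of the group without the message; here the member that did receive it relays it, so all correct processes end up delivering it. The price is that every message is sent on the order of |group|² times.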
Group Communication (GC) Issues • Ordering • Receiving messages in some defined order • Delivery Guarantees (Reliability) • Not missing out any single message sent • Reliable communication • Membership • Managing who are the members of the group • Failure • Handling failure of processes • Comparison with TCP • TCP: reliability and ordering guarantees as a point-to-point service on IP • GC: reliability and ordering guarantees as a group service on IP
Ordering Service • Unordered • Single-source FIFO • First-in-first-out (FIFO) ordering (also referred to as source ordering) is concerned with preserving the order from the perspective of a sender process, in that if a process sends one message before another, it will be delivered in this order at all processes in the group • Causal ordering • Causal ordering takes into account causal relationships between messages, in that if a message happens before another message in the distributed system this so-called causal relationship will be preserved in the delivery of the associated messages at all processes • Total ordering • In total ordering, if a message is delivered before another message at one process, then the same order will be preserved at all processes
Ordering Service • Unordered • Single-Source FIFO (SSF) • If pi sends m1 before it sends m2, then m2 is not delivered at pj before m1 is • Causally Ordered • If m1 happens before m2, then m2 is not delivered at pi before m1 is • Totally Ordered • If m1 is delivered at pi before m2 is, then m2 is not delivered at pj before m1 is
Ordering Service • Total: delivery order is the same in all processes • FIFO: P1 sends F1 and then F2, and they are delivered in that order at the other processes • Causal: C1 happened before C3, and they are delivered in that order at all processes
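Causal ordering is commonly implemented with vector timestamps, which the slides do not spell out. The sketch below is one standard vector-clock scheme, offered as an assumption about how the property could be enforced; the class and method names are hypothetical, and processes are identified by indices 0..n-1.

```python
from typing import List, Tuple

Message = Tuple[int, List[int], str]   # (sender index, vector timestamp, body)

class CausalProcess:
    def __init__(self, pid: int, group_size: int) -> None:
        self.pid = pid
        self.clock = [0] * group_size      # clock[k]: messages delivered from process k
        self.hold_back: List[Message] = []
        self.delivered: List[str] = []

    def multicast(self, body: str) -> Message:
        """Stamp a new message; the sender delivers it to itself immediately."""
        self.clock[self.pid] += 1
        self.delivered.append(body)
        return (self.pid, list(self.clock), body)

    def _deliverable(self, sender: int, stamp: List[int]) -> bool:
        # Next message from this sender, and everything it had seen is delivered here.
        return (stamp[sender] == self.clock[sender] + 1 and
                all(stamp[k] <= self.clock[k]
                    for k in range(len(stamp)) if k != sender))

    def receive(self, msg: Message) -> None:
        if msg[0] == self.pid:
            return                          # own message was delivered at multicast time
        self.hold_back.append(msg)
        progress = True
        while progress:                     # one delivery may unblock held messages
            progress = False
            for held in list(self.hold_back):
                sender, stamp, body = held
                if self._deliverable(sender, stamp):
                    self.hold_back.remove(held)
                    self.delivered.append(body)
                    self.clock[sender] += 1
                    progress = True
```

For example, if process 1 replies to a question from process 0 and the reply reaches process 2 before the question does, process 2 holds the reply back until the question has been delivered.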
GC: FIFO Ordering (deliver in the order they were sent) • Maintains a hold-back queue • DO NOT deliver until it is time (hold back) • Maintains seq#s to check the order • Each process maintains two kinds of seq# • S: the next seq# for sending (for itself) • R: the seq# of the last delivered msg (one per sender) • FIFO order: • Main idea: deliver a message from a sender only when its seq# is the next one expected from that sender (R + 1); hold back messages that arrive early
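The hold-back queue and the two sequence numbers can be sketched as follows. The class names are hypothetical, sequence numbers are assumed to start at 1, and the network is left out: messages are simply passed to `receive` in whatever order they "arrive".

```python
from typing import Dict, List, Tuple

Message = Tuple[str, int, str]   # (sender name, seq#, body)

class FifoSender:
    def __init__(self, name: str) -> None:
        self.name = name
        self.next_seq = 1                       # S: next seq# for sending

    def stamp(self, body: str) -> Message:
        msg = (self.name, self.next_seq, body)
        self.next_seq += 1
        return msg

class FifoReceiver:
    def __init__(self) -> None:
        self.last_delivered: Dict[str, int] = {}        # R, one per sender
        self.hold_back: Dict[str, Dict[int, str]] = {}  # sender -> {seq#: body}
        self.delivered: List[Tuple[str, str]] = []

    def receive(self, msg: Message) -> None:
        sender, seq, body = msg
        last = self.last_delivered.get(sender, 0)
        if seq <= last:
            return                               # duplicate of a delivered message
        pending = self.hold_back.setdefault(sender, {})
        pending[seq] = body                      # received, but not yet delivered
        expected = last + 1
        while expected in pending:               # deliver every in-order message
            self.delivered.append((sender, pending.pop(expected)))
            self.last_delivered[sender] = expected
            expected += 1
```

Messages from different senders are never held back for each other, which is why FIFO ordering alone does not give a single delivery order across the group.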
GC: Total Ordering (all processes deliver in the same order) • Through a sequencer • Serialize through a “sequencer”: the sender asks it for the next seq# and attaches that seq# to its multicast • Each process delivers messages in that order • Distributed sequencing (used in ISIS) • The sender multicasts the message but does not deliver it to itself yet • All other processes reply with a proposed seq# • The sender picks the largest seq# and sends all processes the agreed seq# • The agreed seq# is assigned to the msg
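The sequencer variant can be sketched in a few lines. This covers only the centralised approach (not the ISIS-style distributed agreement), assumes the sequencer never fails, and uses hypothetical class names; messages are handed to each receiver in an arbitrary arrival order.

```python
from typing import Dict, List, Tuple

class Sequencer:
    """Hands out consecutive global sequence numbers, starting at 1."""

    def __init__(self) -> None:
        self._next = 1

    def next_seq(self) -> int:
        seq = self._next
        self._next += 1
        return seq

class TotalOrderReceiver:
    def __init__(self) -> None:
        self.next_expected = 1
        self.hold_back: Dict[int, str] = {}   # seq# -> body, received but not delivered
        self.delivered: List[str] = []

    def receive(self, seq: int, body: str) -> None:
        self.hold_back[seq] = body
        while self.next_expected in self.hold_back:   # deliver strictly in seq# order
            self.delivered.append(self.hold_back.pop(self.next_expected))
            self.next_expected += 1

def to_multicast(sequencer: Sequencer, body: str) -> Tuple[int, str]:
    """Sender side: ask the sequencer for the next seq#, then stamp the message."""
    return (sequencer.next_seq(), body)
```

Every receiver delivers in sequence-number order regardless of arrival order, so all processes see the same order. The sequencer is a single point of failure and a potential bottleneck, which is one motivation for the distributed scheme.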
Delivery guarantees • Agreed Delivery • Guarantees total order of message delivery and allows a message to be delivered as soon as all of its predecessors in the total order have been delivered. • Safe Delivery • Requires, in addition, that if a message is delivered by the GC to any of the processes in a configuration, this message has been received and will be delivered to each of the processes in the configuration unless that process crashes.
Membership • Messages addressed to the group are received by all group members • If processes are added to a group or deleted from it (due to process crash, changes in the network or the user's preference), need to report the change to all active group members, while keeping consistency among them • Every process maintains a local “view” of group members • Every message is delivered in the context of a certain configuration, which is not always accurate. However, we may want to guarantee • Failure atomicity • Uniformity • Termination
Failure Model • Failure types • Message omission and delay • The system discovers message omissions and (usually) recovers lost messages • Processor crashes and recoveries • Network partitions and re-merges • Assume that faults do not corrupt messages (or that message corruption can be detected) • Most systems do not deal with Byzantine behavior • Faults are detected using an unreliable fault detector, based on a timeout mechanism
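A timeout-based fault detector of the kind mentioned in the last bullet can be sketched as below. The heartbeat scheme, the class name, and the explicit `now` parameter (used instead of a real clock so the example is deterministic) are assumptions made for illustration.

```python
from typing import Dict, Set

class TimeoutFailureDetector:
    """Suspects any process not heard from within `timeout` time units."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.last_heard: Dict[str, float] = {}

    def heartbeat(self, pid: str, now: float) -> None:
        """Record that a message (or heartbeat) from `pid` arrived at time `now`."""
        self.last_heard[pid] = now

    def suspected(self, now: float) -> Set[str]:
        """Processes whose last message is older than the timeout."""
        return {pid for pid, seen in self.last_heard.items()
                if now - seen > self.timeout}
```

The detector is unreliable in the sense that a slow process or a delayed message is indistinguishable from a crash: a process can be suspected and then reappear when its next heartbeat arrives.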
Some GC Properties • Atomic Multicast • Message is delivered to all processes or to none at all. May also require that messages are delivered in the same order to all processes. • Failure Atomicity • Failures do not result in incomplete delivery of multicast messages or holes in the causal delivery order • Uniformity • A view change reported to a member is reported to all other members • Liveness • A machine that does not respond to messages sent to it is removed from the local view of the sender within a finite amount of time.
Group membership: View and View Delivery • Group membership can change on the fly (while a computation is in progress) • Processes may join, leave and crash • Group communication still needs to occur reliably and consistently • Membership is maintained through view and view delivery • Each process maintains a “view” • Local knowledge of who the members are • Whenever a view changes (due to a join or leave), the change is multicast to the group (view delivery) • Ideally, all processes must have the same view
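View delivery can be illustrated with a deliberately simplified sketch. It assumes a single, centralised membership service that never fails, which sidesteps the agreement problem real group communication systems must solve; `View`, `Member`, and `GroupMembership` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List

@dataclass(frozen=True)
class View:
    view_id: int
    members: FrozenSet[str]

class Member:
    def __init__(self, name: str) -> None:
        self.name = name
        self.views: List[View] = []      # every view delivered to this process

    def deliver_view(self, view: View) -> None:
        self.views.append(view)

class GroupMembership:
    def __init__(self) -> None:
        self.view = View(0, frozenset())
        self._members: Dict[str, Member] = {}

    def _install(self, names: Iterable[str]) -> None:
        """Create the next view and deliver it to every member of that view."""
        self.view = View(self.view.view_id + 1, frozenset(names))
        for name in self.view.members:
            self._members[name].deliver_view(self.view)

    def join(self, member: Member) -> None:
        self._members[member.name] = member
        self._install(self.view.members | {member.name})

    def leave(self, name: str) -> None:
        """Used for voluntary leaves and for members removed after a suspected crash."""
        self._install(self.view.members - {name})
        self._members.pop(name, None)
```

Because views carry increasing identifiers and are delivered to all current members, every surviving member ends up with the same latest view, which is the "ideal" case named on the slide.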