Master/Slave Architecture Pattern
Source: Pattern-Oriented Software Architecture, Vol. 1, Buschmann et al.
Problem
• A system must be fault-tolerant (i.e., it keeps running even when individual components fail)
  • Example: An e-commerce website should continue operating even if a database server goes down
• A system must produce accurate results with high probability, even when individual components fail (e.g., safety-critical systems)
  • Example: Airplane control software must produce accurate results, even if a piece of hardware malfunctions
• A system needs to solve computationally intensive problems (e.g., problems for which there are no efficient algorithms, problems with large data sets, etc.)
Solution
• A Master component divides work into subtasks
• The Master delegates subtasks to multiple independent Slaves
• Slaves compute results for their subtasks in parallel, and return their partial results to the Master
• The Master computes the final result based on the partial results returned by the Slaves
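
The four steps above can be sketched in a few lines. This is a minimal illustration, not code from the book: the function names (`split`, `slave_sum_of_squares`, `master_sum_of_squares`) and the choice of a thread pool are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Master, step 1: divide the work into roughly equal subtasks."""
    size = max(1, -(-len(data) // parts))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def slave_sum_of_squares(chunk):
    """Slave: compute the partial result for one subtask."""
    return sum(x * x for x in chunk)


def master_sum_of_squares(data, num_slaves=4):
    """Master: delegate subtasks, then combine the partial results."""
    subtasks = split(data, num_slaves)
    with ThreadPoolExecutor(max_workers=num_slaves) as pool:
        # Step 2 and 3: each subtask is handed to a slave and run concurrently.
        partials = list(pool.map(slave_sum_of_squares, subtasks))
    # Step 4: the final result is derived from the partial results.
    return sum(partials)


if __name__ == "__main__":
    print(master_sum_of_squares(list(range(10))))
```

The client only ever calls `master_sum_of_squares`; how many slaves exist and how the work is split stays hidden behind the Master.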
Solution: Fault Tolerance
• Master delegates execution of the service to each Slave
• As soon as the first Slave finishes, its result is returned to the client
• Slaves are kept in sync, so if one of them fails, the system keeps running
• Reliability is achieved through replication
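
A rough sketch of the replication idea, assuming each Slave is a plain callable that offers the same service. The names `master_call`, `AllSlavesFailed`, `healthy_replica`, and `crashed_replica` are hypothetical and exist only for this example. The Master sends the same request to every replica and returns the first result that arrives; a failed replica is simply skipped.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


class AllSlavesFailed(Exception):
    """Hypothetical error raised when no replica produced a result."""


def master_call(replicas, request):
    """Send the same request to every replica; return the first success."""
    if not replicas:
        raise AllSlavesFailed("no replicas configured")
    pool = ThreadPoolExecutor(max_workers=len(replicas))
    try:
        futures = [pool.submit(replica, request) for replica in replicas]
        for future in as_completed(futures):
            try:
                return future.result()  # first slave to finish successfully
            except Exception:
                continue  # this replica failed; wait for another one
        raise AllSlavesFailed("every replica failed")
    finally:
        # Do not block the client on the slower replicas.
        pool.shutdown(wait=False)


def healthy_replica(request):
    """Hypothetical replica that works."""
    return request.upper()


def crashed_replica(request):
    """Hypothetical replica that simulates a failed component."""
    raise ConnectionError("replica is down")


if __name__ == "__main__":
    print(master_call([crashed_replica, healthy_replica], "ping"))
```

Keeping the replicas in sync (so that any of them can answer) is outside the scope of this sketch.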
Solution: Computational Accuracy
• Slaves are not identical
• Requires at least three Slaves
• Each Slave provides a different implementation of the service
• Master delegates execution of the service to each Slave
• Master compares results from Slaves
  • If results agree, return to client
  • If results disagree, take appropriate action
    • Generate exception
    • Have Slaves "vote" (return most common result)
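
The voting scheme might look like the sketch below. The three "implementations" and the deliberately faulty one are invented for illustration, and requiring a strict majority (rather than just the most common result) is an assumption of this example; it combines both actions from the slide by voting first and raising an exception when the vote is inconclusive.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


class NoMajority(Exception):
    """Hypothetical error raised when the slaves cannot agree."""


def sum_loop(n):
    """Implementation 1: explicit loop."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_formula(n):
    """Implementation 2: closed-form formula."""
    return n * (n + 1) // 2


def sum_faulty(n):
    """Implementation 3: deliberately wrong, to simulate a malfunction."""
    return n * n


def master_vote(slaves, request):
    """Run every implementation and return the majority result."""
    with ThreadPoolExecutor(max_workers=len(slaves)) as pool:
        results = list(pool.map(lambda slave: slave(request), slaves))
    value, votes = Counter(results).most_common(1)[0]
    if votes * 2 <= len(results):  # assumption: require a strict majority
        raise NoMajority(f"results disagree: {results}")
    return value


if __name__ == "__main__":
    print(master_vote([sum_loop, sum_formula, sum_faulty], 10))
```

With three Slaves, one faulty implementation is outvoted by the other two, which is why at least three are needed.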
Solution: Parallel Computation
• A complex or very large problem needs to be solved
• Use multiple CPUs/cores and parallelism to substantially reduce the time required to solve the problem
• Master divides problem into subtasks
• Slaves solve subtasks in parallel
• Master combines partial results into final problem solution
• Can be much faster than a sequential solution, provided the problem partitions into independent subtasks
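
As one concrete instance, here is a sketch of a parallel sort in which the combine step is more than a simple sum: Slaves sort chunks, and the Master merges the sorted chunks. The function name `parallel_sort` is an assumption. The thread pool keeps the example self-contained; in CPython, CPU-bound work like this would usually need separate processes to see a real speedup.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def parallel_sort(items, num_slaves=4):
    """Master: split the input, let slaves sort chunks, merge the results."""
    # Divide the problem into at most num_slaves contiguous chunks.
    size = max(1, -(-len(items) // num_slaves))  # ceiling division
    chunks = [items[i:i + size] for i in range(0, len(items), size)]

    # Slaves sort their chunks concurrently.
    with ThreadPoolExecutor(max_workers=num_slaves) as pool:
        sorted_chunks = list(pool.map(sorted, chunks))

    # Combine: a k-way merge of the already sorted partial results.
    return list(heapq.merge(*sorted_chunks))


if __name__ == "__main__":
    print(parallel_sort([5, 3, 9, 1, 4, 8, 2, 7, 6, 0]))
```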
Known Uses
• Fault Tolerance
  • Hardware and software with "failover" capabilities
    • Network routers, database servers, web servers, …
• Computational Accuracy
  • Any safety-critical system (airplanes, spacecraft, nuclear reactors)
• Parallel Computation
  • Parallel sorting algorithms
  • Distributed compilation
  • Parallel test execution
  • Rendering farms (animation companies)
  • Factoring large numbers into prime factors
  • SETI (Search for Extraterrestrial Intelligence)
    • Use Internet-connected computers to analyze radio telescope data
Consequences
• Provides fault tolerance, computational accuracy, and system speedup
• Master/Slave is not always feasible because some tasks cannot be partitioned
• Defining an AbstractSlave interface provides flexibility to add or exchange Slaves without affecting the Master implementation
• Hard to implement
• Implementation will not be very portable if it strongly depends on the particular hardware configuration being used (e.g., for optimization)
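
The AbstractSlave point can be illustrated as follows. Only the name `AbstractSlave` comes from the slide; the method name `process`, the `Master` class shape, and the two word-counting Slaves are assumptions for this sketch. Concurrency is left out on purpose so the interface stays in focus.

```python
import re
from abc import ABC, abstractmethod


class AbstractSlave(ABC):
    """Interface the Master depends on; concrete slaves are interchangeable."""

    @abstractmethod
    def process(self, subtask):
        """Compute and return the partial result for one subtask."""


class SplitWordCounter(AbstractSlave):
    """Hypothetical slave: counts words with str.split."""

    def process(self, subtask):
        return len(subtask.split())


class RegexWordCounter(AbstractSlave):
    """Hypothetical slave: same service, different implementation."""

    def process(self, subtask):
        return len(re.findall(r"\S+", subtask))


class Master:
    """Knows only the AbstractSlave interface, never a concrete class."""

    def __init__(self, slaves):
        self._slaves = list(slaves)

    def service(self, subtasks):
        # Assign subtasks to slaves round-robin (sequentially, for brevity).
        return [
            self._slaves[i % len(self._slaves)].process(task)
            for i, task in enumerate(subtasks)
        ]


if __name__ == "__main__":
    lines = ["to be or not", "to be"]
    print(Master([SplitWordCounter(), RegexWordCounter()]).service(lines))
```

Swapping `SplitWordCounter` for `RegexWordCounter`, or mixing both, requires no change to `Master`.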
Implementation
• Implementation can be relatively complex
• Master/Slave systems are highly concurrent since we want Slaves to work in parallel
• Slaves are implemented as separate threads or processes
• Masters and Slaves may run on one multi-processor machine, or they may be distributed across several computers (e.g., a cluster)
• How will data be transferred between Master and Slave? Does each Slave need its own copy of the data, or can they share?
  • Shared memory
  • Shared database
  • Over a network
• There are tools and libraries available that make building parallel, distributed systems easier
  • PVM (Parallel Virtual Machine) – library for implementing parallel algorithms
  • MPI (Message Passing Interface) – standard API for implementing parallel algorithms
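
To make the data-transfer question concrete, the sketch below imitates message passing inside a single process: Slaves are threads, and the Master sends each one its own copy of a subtask through a queue. It does not use PVM or MPI; the function names and the `None` shutdown sentinel are assumptions chosen for this example.

```python
import queue
import threading


def slave(task_queue, result_queue):
    """Slave thread: receive subtasks as messages until told to stop."""
    while True:
        task = task_queue.get()
        if task is None:  # assumption: None is the shutdown sentinel
            break
        task_id, numbers = task
        result_queue.put((task_id, max(numbers)))


def master_max(chunks, num_slaves=2):
    """Master: send chunks to slaves as messages, combine their replies."""
    tasks, results = queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=slave, args=(tasks, results))
        for _ in range(num_slaves)
    ]
    for worker in workers:
        worker.start()
    for task_id, chunk in enumerate(chunks):
        tasks.put((task_id, list(chunk)))  # each message carries its own copy
    for _ in workers:
        tasks.put(None)  # one sentinel per slave
    for worker in workers:
        worker.join()
    # All slaves have stopped, so the result queue can be drained safely.
    partials = []
    while not results.empty():
        partials.append(results.get())
    return max(value for _, value in partials)


if __name__ == "__main__":
    print(master_max([[1, 5], [9, 2], [3]]))
```

The same structure carries over to a shared-memory design (Slaves read one common data set) or a networked one (queues replaced by sockets or a message-passing library); only the transport changes.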