Distributed Systems (Credit Hours: 3).
Distributed Systems (Credit Hours: 3) This course covers advanced topics in distributed systems, with a special emphasis on distributed computing and the services provided by distributed operating systems. Important topics include parallel processing, remote procedure call, concurrency, transactions, shared memory, message passing, and scalability. Reference Books: 1. Distributed Systems: Concepts and Design by Coulouris, Dollimore, and Kindberg 2. Distributed Operating Systems by Andrew S. Tanenbaum
Course Evaluation • Attendance & Class Participation: 05 • Assignments: 10 • Critical Reviews: 10 • Mid Term: 25 • Final Term: 50 • Total Marks: 100
Overview • Multiprocessing (Parallel processing). • Tightly coupled processors. • Distributed system (DS). • Loosely coupled processors (Distributed). • Key features of DS. • Pros and Cons of DS.
Parallel Processing From the beginning, computer scientists have challenged computers with larger and larger problems. Eventually, multiple processors were combined on the same board to work in parallel on the same task while sharing the same memory. This is called parallel processing.
Parallel Processing • Types of Parallel Processing. • MISD • SIMD • MIMD
Cont… The processors are multiple: • MISD – Multiple Instruction stream, Single Data stream • SIMD – Single Instruction stream, Multiple Data stream • MIMD – Multiple Instruction stream, Multiple Data stream
MISD One piece of data is broken up and sent to many processors, for example to search for a specific record. Fig: one data source feeding four CPUs, each performing a search. Example: An unsorted dataset is broken up into sections of records and assigned to several different processors; each processor searches its section of the database for a specific key.
MISD Another example may be: multiple cryptography algorithms attempting to decrypt a single coded message.
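The decryption example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three "algorithms" (`caesar_shift`, `atbash`, `reverse`) and the helper `try_all` are hypothetical toy ciphers chosen for brevity, and Python threads model the idea of several instruction streams working on one data item rather than true MISD hardware.

```python
from concurrent.futures import ThreadPoolExecutor


def caesar_shift(text, shift):
    """Undo a Caesar cipher by moving each letter back `shift` places."""
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base - shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)


def atbash(text):
    """Mirror each letter within the alphabet (A<->Z, B<->Y, ...)."""
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr(base + 25 - (ord(ch) - base)))
        else:
            out.append(ch)
    return "".join(out)


def reverse(text):
    """Read the message backwards."""
    return text[::-1]


def try_all(ciphertext):
    """Run every algorithm (instruction stream) on the same single message."""
    algorithms = {
        "caesar-3": lambda t: caesar_shift(t, 3),
        "atbash": atbash,
        "reverse": reverse,
    }
    with ThreadPoolExecutor(max_workers=len(algorithms)) as pool:
        futures = {name: pool.submit(fn, ciphertext)
                   for name, fn in algorithms.items()}
        return {name: fut.result() for name, fut in futures.items()}


candidates = try_all("KHOOR")
```

Here `candidates["caesar-3"]` holds the readable plaintext, while the other two workers produce candidates that would be rejected.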
SIMD SIMD (Single Instruction, Multiple Data) is a technique for achieving parallel execution on a set of processors through data-level parallelism.
SIMD Multiple processors execute the same instruction on separate data. Fig: four CPUs, each with its own data, all performing Multiply. Ex: A SIMD machine with 100 processors could multiply 100 numbers, each by the number three (3), at the same time.
SIMD • Single instruction: all processing units execute the same instruction at any given clock cycle • Multiple data: each processing unit can operate on a different data element
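The multiply-by-three example can be modeled as follows. This is a minimal sketch of the SIMD semantics only: `simd_apply` is a hypothetical helper, and the Python loop runs sequentially, whereas a real SIMD machine would process all lanes in the same clock cycle.

```python
def simd_apply(instruction, lanes):
    """Apply ONE instruction to every data lane (one lane per processor)."""
    return [instruction(value) for value in lanes]


data = list(range(1, 101))                  # 100 numbers, one per processor
result = simd_apply(lambda x: x * 3, data)  # the single shared instruction
```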
MIMD Multiple processors execute different instructions on separate data. Fig: four CPUs, each with its own data, performing Multiply, Search, Add, and Subtract respectively. This is the most complex form of parallel processing. It is used for complex simulations, such as modeling the growth of cities.
MIMD • Currently the most common type of parallel computer • Multiple instruction: every processor may be executing a different instruction stream • Multiple data: every processor may be working with a different data stream
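A small sketch of the MIMD idea, assuming Python threads stand in for processors: each worker receives its own instruction stream and its own data. The helper `mimd_run` and the four sample tasks are illustrative names, not part of the course material.

```python
from concurrent.futures import ThreadPoolExecutor


def mimd_run(tasks):
    """Run each (instruction, data) pair on its own worker."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(fn, data) for fn, data in tasks]
        return [fut.result() for fut in futures]


tasks = [
    (lambda xs: [x * 2 for x in xs], [1, 2, 3]),  # multiply
    (lambda xs: xs.index(7), [4, 7, 9]),          # search
    (sum, [10, 20, 30]),                          # add
    (lambda xs: xs[0] - xs[1], [50, 8]),          # subtract
]
results = mimd_run(tasks)
```

The results come back in task order, one per worker, regardless of which worker finishes first.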
Tightly Coupled Processors (H/W concepts) • e.g., Multiprocessors, in which two or more CPUs share a main memory. • More difficult to build than multi-computers. • Easier to program (Desktop programming).
Multiprocessing system • Each processor is assigned a specific duty, but the processors work in close association, possibly sharing one memory module. • These CPUs have local caches and have access to a central shared memory. The IBM p690 Regatta is an example of a multiprocessing system (mainframe).
Multiprocessors • Consist of some number of CPUs, all connected to a common bus along with a memory module. • Bus-based multiprocessors require caching. • With caching, memory incoherence becomes an issue. • Write-through cache (updating): any update goes through to the actual memory (not only the cache). • Snooping (snoopy) cache (reading): every cache monitors the bus, picks up any write-through to memory, and applies it to itself if necessary. • It is possible to put about 32 or possibly 64 CPUs on a single bus.
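The write-through and snooping behaviour described on this slide can be sketched as a toy simulation. The `Bus` and `Cache` classes below are hypothetical teaching models, assuming an update-on-snoop policy and ignoring timing, cache size, and eviction.

```python
class Bus:
    """Shared bus connecting every cache to one memory module."""

    def __init__(self):
        self.memory = {}   # address -> value (main memory)
        self.caches = []   # caches attached to this bus

    def attach(self, cache):
        self.caches.append(cache)

    def write(self, writer, addr, value):
        self.memory[addr] = value        # write-through reaches real memory
        for cache in self.caches:        # every other cache sees the write
            if cache is not writer:
                cache.snoop(addr, value)


class Cache:
    """Per-CPU cache that writes through and snoops the bus."""

    def __init__(self, bus):
        self.lines = {}
        self.bus = bus
        bus.attach(self)

    def read(self, addr):
        if addr not in self.lines:       # miss: fetch from main memory
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.write(self, addr, value)

    def snoop(self, addr, value):
        if addr in self.lines:           # update only if we hold a copy
            self.lines[addr] = value


bus = Bus()
cpu0, cpu1 = Cache(bus), Cache(bus)
bus.memory[0x10] = 1
first = cpu1.read(0x10)    # cpu1 now caches the old value
cpu0.write(0x10, 99)       # write-through, snooped by cpu1
second = cpu1.read(0x10)   # cpu1 sees the new value without a miss
```

Without the `snoop` step, `cpu1` would keep returning the stale value, which is the incoherence problem the slide mentions.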
Fig: A bus-based multiprocessor: three CPUs, each with its own cache, connected to a shared memory over a common bus.
Why Multiprocessors? 1. Microprocessors are the fastest CPUs; collecting several is much easier than redesigning one. 2. IL multithreading is limited due to data dependencies on one processor. 3. Improvements in parallel software (scientific apps, databases, OS) need multiprocessors.
Introduction to distributed systems Definitions • A distributed system is a collection of independent computers that appear to the users of the system as a single computer. (Tanenbaum) • A distributed system is one in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages. (Coulouris, Dollimore, Kindberg)
Distributed Computing • Distributed computing is the process of aggregating the power of several computing entities to collaboratively run a computational task in a transparent and coherent way, so that it appears as a single, centralized system. • A distributed computer system is a loosely coupled collection of autonomous computers connected by a network using system software to produce a single integrated computing environment.
Features of DS • Distributed computing consists of a network of autonomous nodes. • Loosely coupled. • Nodes do not share primary or secondary storage. • A well-designed distributed system does not crash if a node goes down.
Cont… • If you need to perform a computing task that is parallel in nature, scaling your system by adding extra nodes is a lot cheaper than getting a faster single machine. • Of course, if your processing task is highly non-parallel (every result depends on the previous one), using a distributed computing system may not be very beneficial.
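One common way to quantify this trade-off is Amdahl's law, which is not named on the slide but matches its point: the serial part of a task limits the benefit of adding nodes. The sketch below assumes an idealized model with no communication cost; `amdahl_speedup` is an illustrative function name.

```python
def amdahl_speedup(parallel_fraction, nodes):
    """Ideal speedup when only `parallel_fraction` of the work can be spread
    over `nodes` machines; the remaining fraction stays serial."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


fully_parallel = amdahl_speedup(1.0, 8)   # every node helps
fully_serial = amdahl_speedup(0.0, 8)     # extra nodes do not help
half_parallel = amdahl_speedup(0.5, 8)    # capped below 2x
```

With half of the work serial, the speedup can never reach 2 no matter how many nodes are added.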
Cont… • Network connections are the key feature. • Remote access is established by message passing between nodes. • Messages are sent from CPU to CPU. • Protocols are designed for reliability, flow control, failure detection, etc. • Nodes communicate by sending and receiving network messages.
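Message passing between two nodes can be sketched with Python's standard `socket` module. This is a minimal illustration, assuming a connected socket pair on one machine stands in for a network link; the 4-byte length prefix and the helper names (`send_msg`, `recv_msg`, `node_b`, `demo`) are choices made for this example, not a prescribed protocol.

```python
import socket
import struct
import threading


def send_msg(sock, payload):
    """Send one message: 4-byte big-endian length, then the payload bytes."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, since recv() may return fewer than requested."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def recv_msg(sock):
    """Receive one length-prefixed message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def node_b(sock):
    """Remote node: wait for a request and reply with an acknowledgement."""
    request = recv_msg(sock)
    send_msg(sock, b"ack:" + request)
    sock.close()


def demo():
    a, b = socket.socketpair()            # stands in for a network link
    worker = threading.Thread(target=node_b, args=(b,))
    worker.start()
    send_msg(a, b"print job 42")          # node A sends a request
    reply = recv_msg(a)                   # and waits for the reply
    worker.join()
    a.close()
    return reply
```

Calling `demo()` performs one request and reply exchange; the two nodes share no memory and interact only through the messages.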
Distributed OS vs Network OS • With a network operating system, each machine runs its own entire operating system. • The machines supporting a distributed operating system run under a single operating system that spans the network.
Cont… • In a distributed OS, by contrast, the operating system itself is distributed across the network. • Thus the print task might be running on one machine and the file system on another. Each machine cooperates by running its current software part of the DOS.
Advantages of DS over Centralized Systems • Better price/performance than mainframes. • More computing power (parallel and distributed). • Required by some applications, which are inherently distributed. • Improved reliability, because the system can survive the crash of one processor. • Incremental growth can be achieved by adding one processor at a time. • Shared ownership is facilitated.
Disadvantages of DS. • Network performance parameters. • Latency: Delay that occurs after a send operation is executed before data starts to arrive at the destination computer. • Data transfer rate: Speed at which data can be transferred between two computers once transmission has begun. • Total network bandwidth: total volume of traffic that can be transferred across the network in a given time.
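The first two parameters combine into a simple estimate of message transmission time: latency plus message length divided by data transfer rate. The sketch below assumes this idealized model with no congestion or retransmission; the function name and the sample figures are illustrative only.

```python
def transfer_time(latency_s, size_bits, rate_bps):
    """Estimated time to deliver a message: fixed latency plus the time
    needed to push `size_bits` through a link of `rate_bps` bits/second."""
    if rate_bps <= 0:
        raise ValueError("rate_bps must be positive")
    return latency_s + size_bits / rate_bps


# Assumed figures: 5 ms latency on a 100 Mbit/s link.
small = transfer_time(0.005, 800, 100e6)        # 100-byte message
large = transfer_time(0.005, 8_000_000, 100e6)  # 1 MB message
```

For the small message the latency dominates (about 5 ms in total), while for the large one the transfer rate dominates (about 85 ms in total).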
Disadvantages of DS. • Dependency on reliability of the underlying network. • Higher security risk due to more possible access points for intruders and possible communication with insecure systems. • Software complexity.
Loosely Coupled Processors (H/W concepts) • e.g., Multi-computers, in which each of the processors has its own memory. • Easy to build and commercially available (PCs). • More complex to program (Desktop + Socket programming).
Fig: A DS consisting of workstations on a LAN: three workstations, each with its own CPU and local memory, connected by a network.
Software Concepts • Network Operating Systems (NOS) • Loosely-coupled software on loosely-coupled hardware • e.g., a network of workstations connected by a LAN • Each user has a workstation for their exclusive use • Offers local services to remote clients • Distributed Operating Systems (DOS) • Tightly-coupled software on loosely-coupled hardware • Creates the illusion for a user that the entire network of computers is a single timesharing system, rather than a collection of distinct machines (single-system image)
Software Concepts (Cont’d) • DOS • Users should not have to be aware of the existence of multiple CPUs in the system • No current system fulfills this requirement entirely • Multiprocessor Operating Systems • Tightly-coupled software on tightly-coupled hardware • e.g., UNIX timesharing system with multiple CPUs • Key characteristic is the existence of a single run queue (same memory) • Basic design is mostly the same as a traditional OS; however, issues of process synchronization, task scheduling, memory management, and security become more complex because memory is shared by many processors
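The single run queue can be illustrated with a short sketch in which worker threads play the role of CPUs pulling jobs from one shared queue. The function `run_on_cpus` is a hypothetical name, and Python's thread-safe `queue.Queue` stands in for the synchronization a real multiprocessor OS must implement itself.

```python
import queue
import threading


def run_on_cpus(jobs, cpu_count):
    """Execute `jobs` (zero-argument callables) using `cpu_count` workers
    that all take work from ONE shared run queue."""
    run_queue = queue.Queue()
    for job in jobs:
        run_queue.put(job)

    results = []
    results_lock = threading.Lock()   # shared memory needs synchronization

    def cpu(cpu_id):
        while True:
            try:
                job = run_queue.get_nowait()
            except queue.Empty:
                return                # nothing left to schedule
            value = job()
            with results_lock:
                results.append((cpu_id, value))

    workers = [threading.Thread(target=cpu, args=(i,))
               for i in range(cpu_count)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results


finished = run_on_cpus([lambda i=i: i * i for i in range(6)], cpu_count=3)
```

Which CPU runs which job varies from run to run, but every job is executed exactly once because all CPUs draw from the same queue.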
Summary: Comparison of three different ways of organizing n CPUs