Distributed Operating Systems

Coverage • Distributed Systems (DS) Paradigms • DS & OS’s • Services and Models • Communication • Distributed File Systems • Coordination • Distributed Mutual Exclusion (ME) • Distributed Co-ordination • Synchronization • DS Scheduling & Misc. Issues

What is a Distributed System? • “A distributed system is the one preventing you from working because of the failure of a machine that you had never heard of.” (Leslie Lamport) • “A distributed system is a collection of independent computers that appear to the users of the system as a single computer.” (Tanenbaum) • Shared-memory multiprocessor, message-passing multiprocessor, distributed system → multiple computers sharing (the same) state and interconnected by a network

Distribution: Example Pro/Cons • All the good stuff: high performance, distributed access, scalable, heterogeneous, sharing (concurrency), load balancing (migration, relocation), fault tolerance, … • Bank account database (DB) example • Naturally centralized: easy consistency and performance • Fragment DB among regions: exploit locality of reference, security & reduce reliance on network for remote access • Replicate each fragment for fault tolerance • But, we now need (additional) DS techniques • Route request to right fragment • Maintain access/consistency of fragments as a whole database • Maintain access/consistency of each fragment’s replicas • …
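The bank example can be sketched in a few lines of Python. This is a minimal illustration, not a real database design: the `Fragment` and `BankDB` classes, the two-replica default, the write-all/read-one policy, and the convention that an account id starts with its region code are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Fragment:
    """One regional fragment of the account table, kept as identical replicas."""
    region: str
    # Two in-memory copies per fragment (an assumed replication degree).
    replicas: list = field(default_factory=lambda: [{}, {}])

    def write(self, account, balance):
        # Write-all: every replica is updated so the copies stay consistent.
        for replica in self.replicas:
            replica[account] = balance

    def read(self, account):
        # Read-one: any replica will do because writes reach all of them.
        return self.replicas[0].get(account, 0)


class BankDB:
    def __init__(self, regions):
        self.fragments = {region: Fragment(region) for region in regions}

    def _route(self, account):
        # Hypothetical naming scheme: "EU-1001" lives in the "EU" fragment.
        region = account.split("-", 1)[0]
        return self.fragments[region]

    def deposit(self, account, amount):
        fragment = self._route(account)
        fragment.write(account, fragment.read(account) + amount)

    def balance(self, account):
        return self._route(account).read(account)


db = BankDB(["EU", "US"])
db.deposit("EU-1001", 50)
db.deposit("EU-1001", 25)
print(db.balance("EU-1001"))
```

The caller only ever talks to `BankDB`; routing and replica maintenance stay hidden, which is the kind of transparency the following slides discuss.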

Transparency: Global Access • Fragmentation transparency: hide whether the resource is fragmented or not • Illusion of a single computer across a DS • Distribution transparency: • Access transparency + location transparency + replication transparency + fragmentation transparency

Multiprocessor OS Types (1) • Each CPU has its own operating system • Shared bus → comm. blocking & CPU idling!

Multiprocessor OS Types (2) • Master-slave multiprocessors • Master is a bottleneck!

Multiprocessor OS Types (3) • Symmetric multiprocessors (SMP model) • Eliminates the CPU bottleneck, but raises issues of ME and synchronization • Mutex on OS?

OS’s for DS’s • Loosely-coupled OS • A collection of computers, each running its own OS; the OS’s allow sharing of resources across machines • A.K.A. Network Operating System (NOS) • Manages heterogeneous multicomputer DS • Difference: provides local services to remote clients via remote login • Data transfer from remote OS to local OS via FTP (File Transfer Protocol) • Tightly-coupled OS • OS tries to maintain a single global view of the resources it manages • A.K.A. Distributed Operating System (DOS) • Manages multiprocessors & homogeneous multicomputers • Similar “local access feel” as a non-distributed, standalone OS • Data migration or computation migration modes (entire process or threads)

Network Operating Systems (NOS)

Distributed Operating Systems (DOS)

Client Server Model for DOS & NOS

Middleware • Can we have the best of both worlds? • Scalability and openness of a NOS • Transparency and relative ease of a DOS • Solution: additional layer of SW above NOS • Mask heterogeneity • Improve distribution transparency (and others) → “Middleware” (MW)

Middleware (& Openness) • Document-based middleware (e.g. WWW) • Coordination-based MW (e.g., Linda, publish subscribe, Jini etc.) • File system based MW • Shared object based MW

File System-Based Middleware (1) • Approach: make a DS look like a big file system • Transfer models: • Download/upload model (work done locally) • Remote access model (work done remotely)

File System-Based Middleware (2) (a) Two file systems (b) Naming Transparency: All clients have same view of FS (c) Some clients with different FS views

File System-Based Middleware (3) • Semantics of file sharing (ordering and session semantics) • (a) single processor gives sequential consistency • (b) distributed system may return obsolete value
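A small Python sketch shows how the download/upload model can return an obsolete value. The `FileServer` and `Session` classes and the file name are hypothetical; the sketch assumes session semantics, where changes become visible to others only when the file is closed.

```python
class FileServer:
    """Holds the authoritative copy of each file."""
    def __init__(self):
        self.files = {"notes.txt": "v1"}


class Session:
    """Download/upload model: work on a private copy, upload on close."""
    def __init__(self, server, name):
        self.server = server
        self.name = name
        self.copy = server.files[name]      # download at open

    def read(self):
        return self.copy                    # served from the local copy

    def write(self, data):
        self.copy = data                    # only the local copy changes

    def close(self):
        self.server.files[self.name] = self.copy   # upload at close


server = FileServer()
a = Session(server, "notes.txt")
b = Session(server, "notes.txt")

a.write("v2")
a.close()

stale = b.read()                              # b still sees its old copy
fresh = Session(server, "notes.txt").read()   # a new session sees the upload
print(stale, fresh)
```

On a single processor every read would observe the latest write; here client `b` keeps reading the value it downloaded before `a` closed the file.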

Shared Object-Based Middleware (1) • Approach: make a DS look like objects (variables + methods) • Easy scaling to large systems • Replicated objects (C++, Java) • Flexibility • Example 1: main elements of a CORBA (Common Object Request Broker Architecture) based system: • Object Request Broker (ORB) • Internet Inter-ORB Protocol (IIOP)

Shared Object-Based Middleware (2) • Example 2: Globe [1] • provides a model of distributed shared objects and basic support services (naming, locating objects etc.) • an object is completely self-contained • designed to scale to a billion users, a trillion objects around the world GIDS: Globe Infrastructure Directory Service [1] M. van Steen, P. Homburg, and A. Tanenbaum. “Globe: A Wide-Area Distributed System”, 1999.

Shared Object-Based Middleware (3) A distributed shared object in Globe • can have its state copied on multiple computers at once • how to maintain sequential consistency of write operations?

OS’s, DS’s & MW

Network Hardware The Internet

Network Services and Protocols • Network services: blocking and non-blocking • Internet Protocol (IP) • Transmission Control Protocol (TCP): connection-oriented • User Datagram Protocol (UDP): connectionless

Client-Server Communications
• Unbuffered msg. passing: send(addr, msg), recv(addr, msg)
• All requests and replies at C/S level
• All msg. ACKs between kernels only
• Buffered msg. passing: msg. sent to kernel mailbox or kernel/user interface socket
• Msg. directed at a process
• Blocking? Non-blocking?
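Buffered message passing and the blocking/non-blocking choice can be sketched with Python's standard `queue` module. The two queues below merely stand in for kernel mailboxes; their names and the mailbox size are assumptions made for the example.

```python
import queue
import threading

mailbox = queue.Queue(maxsize=8)   # stands in for a kernel mailbox (assumed size)
replies = queue.Queue()            # mailbox for replies back to the client


def server():
    while True:
        msg = mailbox.get()        # blocking receive: waits until a message arrives
        if msg is None:            # sentinel used here to stop the server
            break
        replies.put(msg.upper())   # the "service" is just upper-casing the request


worker = threading.Thread(target=server)
worker.start()

mailbox.put("ping")                # buffered send: returns once the message is queued
answer = replies.get(timeout=2)    # blocking receive with a timeout

try:
    extra = replies.get_nowait()   # non-blocking receive: fails at once if empty
except queue.Empty:
    extra = None

mailbox.put(None)
worker.join()
print(answer, extra)
```

A blocking receive is simpler to program against, while the non-blocking variant lets the caller do other work at the cost of having to poll or handle the empty case.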

Remote Procedure Calls (RPC) • Synchronous/asynchronous (blocking/non-blocking) communication • [Sync] client generates request: stub → kernel • [Sync] kernel blocks process until reply is received from server • [Async] kernel buffers msg.
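The difference between the synchronous and asynchronous styles can be illustrated with `concurrent.futures` from the Python standard library. This is only an analogy under stated assumptions: a thread pool plays the role of the server, and `remote_add` is a hypothetical procedure.

```python
from concurrent.futures import ThreadPoolExecutor


def remote_add(a, b):
    # Hypothetical server procedure.
    return a + b


with ThreadPoolExecutor(max_workers=1) as server:
    # Synchronous style: the caller blocks until the reply is available.
    sync_result = server.submit(remote_add, 2, 3).result(timeout=2)

    # Asynchronous style: the request is queued and the caller keeps working.
    pending = server.submit(remote_add, 10, 20)
    local_work = sum(range(5))                # done while the call is outstanding
    async_result = pending.result(timeout=2)  # collect the reply later

print(sync_result, local_work, async_result)
```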

RPC & Stubs (dummy procedure in place of RPC)
Legend: C: client; CS: client stub; K: kernel; S: server; SS: server stub; NS: network service
Client machine, in order of execution:
• [C] call “client stub” procedure
• [CS] prepare msg. buffer
• [CS] load parameters into buffer
• [CS] prepare msg. header
• [CS] send trap to kernel
• [K] context switch to kernel
• [K] copy msg. to kernel
• [K] determine server address (NS)
• [K] put address in header
• [K] set up network interface
• [K] start timer for msg.
Server machine, in order of execution:
• [K] process interrupt (save PC, kernel state)
• [K] check packet for validity
• [K] decide which stub to assign
• [K] see if stub is waiting
• [K] copy msg. to stub
• [K] context switch to server stub
• [SS] set up parameter stack/unbundle
• [SS] call server
• [S] process req.; initiate “server stub”
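The stub roles can be sketched in Python. This is a simplified illustration, not an actual RPC runtime: JSON is an assumed wire format, `transport` stands in for both kernels and the network, and the `PROCEDURES` registry is a hypothetical dispatch table.

```python
import json

PROCEDURES = {}   # server-side dispatch table (hypothetical registry)


def register(fn):
    PROCEDURES[fn.__name__] = fn
    return fn


@register
def add(a, b):
    # The actual server procedure.
    return a + b


def server_stub(request_bytes):
    # Unbundle the parameters, call the server procedure, marshal the reply.
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


def transport(request_bytes):
    # Stand-in for the kernel and network steps on both machines.
    return server_stub(request_bytes)


def client_stub(proc, *args):
    # Prepare the message (header + parameters), send it, unmarshal the reply.
    message = json.dumps({"proc": proc, "args": list(args)}).encode("utf-8")
    reply = transport(message)
    return json.loads(reply.decode("utf-8"))["result"]


total = client_stub("add", 2, 3)   # looks like a local call to the client
print(total)
```

The client never sees the message format; it calls `client_stub` as if the procedure were local, which is the point of the stub.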

Remote Procedure Call Implementation Issues: • Can we pass pointers? (local context…) • Call by reference becomes copy-restore (but might fail) • Weakly typed languages (e.g., C allows computations, say a product of arrays, without array size specifications) • Can the client stub determine the unspecified size to pass on? • Not always possible to determine parameter types • Cannot use global variables • C/S may get moved to a remote machine
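One way copy-restore "might fail" is aliasing: when the same object is passed as two parameters, the result differs from true call by reference. The Python sketch below simulates this; `call_by_copy_restore` and `incr_both` are hypothetical helpers written for the example.

```python
import copy


def incr_both(a, b):
    # Procedure that updates both of its (list) parameters in place.
    a[0] += 1
    b[0] += 1


def call_by_copy_restore(proc, *args):
    # Copy in: each parameter is shipped to the "server" as a separate copy.
    copies = [copy.deepcopy(arg) for arg in args]
    proc(*copies)                       # runs on the copies, not the originals
    # Restore: copy each result back over the caller's original.
    for original, updated in zip(args, copies):
        original[:] = updated


x = [0]
incr_both(x, x)                         # local call by reference, aliased
local_result = x[0]

y = [0]
call_by_copy_restore(incr_both, y, y)   # simulated remote call, same aliasing
remote_result = y[0]

print(local_result, remote_result)
```

Locally both increments hit the same list; remotely each copy is incremented once and the second restore overwrites the first, so one increment is lost.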

RPC Failures? • C/S failure vs. communication failure? • Who detects? Timeouts? • Does it matter if a node (C/S) failed • BEFORE or AFTER a request arrived? • BEFORE or AFTER a request is processed? • Client failure: orphan requests? add expiration counters • Server crash?

Communication • Delivers messages despite • communication link(s) failure • process failures • Main kinds of failures to tolerate • timing (link and process) • omission (link and process) • value

Communication: Reliable Delivery • Omission failure tolerance (degree k) • Design choices: • Error masking (spatial): several (> k) links • Error masking (temporal): repeat k+1 times • Error recovery: detect error and recover
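Temporal error masking can be sketched as follows. The `LossyLink` class is a hypothetical test link that drops a fixed number of transmissions; the sketch assumes the link omits at most k messages, so k+1 copies guarantee that at least one arrives.

```python
class LossyLink:
    """Test link that drops the first `omissions` messages (assumed failure pattern)."""
    def __init__(self, omissions):
        self.remaining_drops = omissions
        self.delivered = []

    def send(self, msg):
        if self.remaining_drops > 0:
            self.remaining_drops -= 1   # omission failure: message is lost
            return
        self.delivered.append(msg)


def send_masked(link, msg, k):
    # Temporal masking: transmit k+1 copies to tolerate up to k omissions.
    for _ in range(k + 1):
        link.send(msg)


def receive_unique(link):
    # The receiver discards duplicate copies.
    seen = []
    for msg in link.delivered:
        if msg not in seen:
            seen.append(msg)
    return seen


k = 2
worst_case = LossyLink(omissions=k)
send_masked(worst_case, "credit 50", k)

no_loss = LossyLink(omissions=0)
send_masked(no_loss, "credit 50", k)

print(receive_unique(worst_case), len(no_loss.delivered))
```

The cost is visible in the loss-free case: all k+1 copies cross the link even though one would have been enough.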

Reliable Delivery (cont.) Error detection and recovery: ACKs and timeouts • Positive ACK: sent when a message is received • Timeout on sender without ACK: sender retransmits • Negative ACK (NACK): sent when a message loss is detected • Needs sequence #s or time-based reception semantics • Tradeoffs • Positive ACKs: usually faster failure detection • NACKs: fewer msgs… Q: what kind of situations are good for • Spatial error masking? • Temporal error masking? • Error detection and recovery with positive ACKs? • Error detection and recovery with NACKs?
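Both recovery styles can be sketched in Python. The code is a simulation under stated assumptions: `make_channel` is a hypothetical channel whose return value models whether an ACK arrived before the timeout, and `missing_sequence_numbers` shows how a receiver could decide which NACKs to send from gaps in sequence numbers.

```python
def make_channel(drops):
    """Return a test channel that loses the first `drops` transmissions (assumed pattern)."""
    state = {"drops": drops, "received": []}

    def channel(msg):
        if state["drops"] > 0:
            state["drops"] -= 1
            return False                    # no ACK before the timeout
        if msg not in state["received"]:    # receiver ignores duplicates
            state["received"].append(msg)
        return True                         # positive ACK reaches the sender

    return channel, state


def send_with_ack(channel, msg, max_tries=5):
    # Positive ACK + timeout: retransmit until acknowledged or out of tries.
    for attempt in range(1, max_tries + 1):
        if channel(msg):
            return attempt
    raise TimeoutError(f"no ACK after {max_tries} transmissions")


def missing_sequence_numbers(received):
    # NACK side: gaps in the sequence numbers reveal which messages were lost.
    if not received:
        return []
    got = set(received)
    return [s for s in range(min(got), max(got) + 1) if s not in got]


channel, state = make_channel(drops=2)
attempts = send_with_ack(channel, "m1")
gaps = missing_sequence_numbers([1, 2, 4, 6])
print(attempts, state["received"], gaps)
```

Note that the NACK receiver can only notice a loss once a later message arrives, which is why it needs sequence numbers or timing assumptions.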

Resilience to Sender Failure • Multicast FT-communication is harder than point-to-point • Basic problem is failure detection • A subset of receivers may receive the msg., then the sender fails • Solutions depend on flavor of multicast reliability • Unreliable: no effort to overcome link failures • Best-effort: some steps taken to overcome link failures • Reliable: participants coordinate to ensure that all or none of the correct recipients get it
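One common way to get the all-or-none property is for every receiver to relay a message to the whole group the first time it sees it. The slides do not prescribe a specific protocol, so the flooding scheme below is an assumption chosen for illustration, and the `Process` class is hypothetical.

```python
class Process:
    def __init__(self, name):
        self.name = name
        self.peers = []        # the other correct group members
        self.delivered = []

    def receive(self, msg):
        if msg in self.delivered:
            return                     # duplicate: already delivered and relayed
        self.delivered.append(msg)     # record first so relays do not loop
        for peer in self.peers:
            peer.receive(msg)          # relay to everyone in the group


group = [Process("p1"), Process("p2"), Process("p3")]
for p in group:
    p.peers = [q for q in group if q is not p]

# The sender reaches only p1 and then crashes before contacting p2 and p3.
group[0].receive("m")

print([p.delivered for p in group])
```

Because p1 relays before anything else can go wrong at the sender, p2 and p3 still get the message; the price is extra traffic, since every member forwards every message once.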
