Distributed Systems
Processes and Threads
• Processes are programs in execution
• Sequential process: contains a single thread of execution
• Concurrent process: simultaneous interacting sequential processes (asynchronous, each with its own address space): heavyweight processes
• Some components of two processes are disjoint and can execute concurrently
• Others are dependent: communication or synchronization is needed
• A process may spawn new processes (subprocesses)
• When a process and its subprocesses share a common address space, but each has its own local state, they are called light-weight processes, or threads
• The PCB contains the information necessary for context switching of a process and is managed by the native OS
• Each thread has its own local procedures and stack
• The state information of a thread is represented by a Thread Control Block (TCB)
• The TCB is managed by the thread run-time library support
• The thread run-time library is an interface between thread execution and the native OS
• A TCB is local to a thread; the PCB is shared among the interacting threads
• The overhead of context switching is lower for threads
Process Descriptors
• The OS creates and manages the process abstraction
• A descriptor is the data structure kept for each process; it records:
• Register values
• Logical state
• Type and location of resources it holds
• List of resources it needs
• Security keys
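To make the distinction concrete, here is a minimal sketch (Python; all field names are illustrative, not from the slides) of what a heavyweight process descriptor (PCB) and a lightweight TCB might record. The TCB is small, which is why thread context switches are cheap.

```python
from dataclasses import dataclass, field

@dataclass
class ThreadControlBlock:
    """Per-thread state; small, so thread context switches are cheap."""
    tid: int
    program_counter: int = 0
    stack_pointer: int = 0
    state: str = "ready"          # ready | running | blocked
    priority: int = 0

@dataclass
class ProcessDescriptor:
    """Per-process state (PCB); shared by all threads of the process."""
    pid: int
    registers: dict = field(default_factory=dict)
    state: str = "ready"
    resources_held: list = field(default_factory=list)
    resources_needed: list = field(default_factory=list)
    security_keys: list = field(default_factory=list)
    threads: list = field(default_factory=list)   # TCBs of its threads
```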
Thread Applications
• Threads are useful when implementing a server process that provides similar or related services to multiple client processes
• A single-threaded server process has to wait until some condition is met or an operation completes before serving the next request
• Without threads, multiple copies of the same server process must be created to service different requests simultaneously
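As an illustration, the sketch below (Python; the port number and function names are arbitrary choices) dedicates one thread per client connection, so a request that blocks on I/O does not stall the other clients:

```python
import socket
import threading

def handle_client(conn: socket.socket) -> None:
    """Serve one client; blocking here does not stall other clients."""
    with conn:
        while data := conn.recv(1024):
            conn.sendall(data)            # echo the request back

def serve(host: str = "localhost", port: int = 9000) -> None:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((host, port))
        srv.listen()
        while True:
            conn, _addr = srv.accept()
            # One lightweight thread per request stream
            threading.Thread(target=handle_client,
                             args=(conn,), daemon=True).start()
```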
User-space thread implementation
• The main implementation issues are handling blocking system calls issued by a thread and the scheduling of threads
• The processor time of the process is multiplexed among its threads
• Context switching is done by the run-time procedure
• A blocking system call from a thread is not trapped by the OS, but is routed to the run-time procedure
• The run-time procedure saves the TCB of the calling thread and loads the TCB of a thread it selects into the hardware registers
• So no true blocking of the process has occurred, and execution continues with other threads
• The context-switching overhead of threads is low, since only the PC and SP need to be saved and restored
• Users can assign priorities to threads
• Scheduling is non-preemptive FCFS within a priority level; scheduling across priorities is preemptive
• Sleep or yield primitives relinquish execution from one thread to another (see the sketch below)
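A minimal sketch of such a run-time scheduler, using Python generators to stand in for threads (all names are invented): each "thread" runs until it voluntarily yields, and the run-time picks the next ready thread in FCFS order.

```python
from collections import deque

def thread(name, steps):
    """A 'thread' body: runs one step, then yields control to the run-time."""
    for i in range(steps):
        print(f"{name}: step {i}")
        yield                      # the yield primitive: relinquish the CPU

def run(threads):
    """Non-preemptive FCFS run-time scheduler: user-level context switch."""
    ready = deque(threads)
    while ready:
        t = ready.popleft()        # pick the next ready thread (FCFS)
        try:
            next(t)                # "restore its TCB" and run it
            ready.append(t)        # it yielded: back to the ready queue
        except StopIteration:
            pass                   # the thread terminated

run([thread("A", 2), thread("B", 3)])
```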
The thread primitives in a thread package are:
• Thread management: thread creation, suspension, and termination
• Assignment of priority and other thread attributes
• Synchronization and communication support, such as semaphores, monitors, and message passing
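Python's standard threading module offers the same classes of primitives; the short sketch below uses a semaphore to synchronize worker threads (the names worker and slots are illustrative):

```python
import threading

slots = threading.Semaphore(2)     # synchronization primitive: at most 2 at once

def worker(n: int) -> None:
    with slots:                    # acquire/release a slot
        print(f"worker {n} running")

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()                      # thread creation / start
for t in threads:
    t.join()                       # termination: wait for completion
```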
Kernel-space thread implementation
• Threads can also be implemented at the kernel level
• Threads can be preempted easily
• Threads are scheduled like ordinary processes, but with greater flexibility and efficiency
• A thread issuing a system call can be blocked without blocking the other threads in the same process
• The overhead of a context switch is higher
DAG (precedence process graph)
• Represents the precedence relationship between processes
• Used to find the total completion time of interacting processes
• Communication happens only at the completion of a process and at the beginning of its successor process
• Each process has a finite lifetime
Asynchronous process graph
• Indicates the existence of communication paths
• Used to study processor allocation for optimizing interprocessor communication overhead
• How and when communication occurs is not specified
• Processes live indefinitely
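In the DAG model, the total completion time is the length of the longest (critical) path through the graph. A small sketch, with an invented four-process graph and invented execution times:

```python
from functools import lru_cache

# Hypothetical precedence DAG: edges point from a process to its successors
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
time = {"A": 2, "B": 3, "C": 1, "D": 2}   # execution time of each process

@lru_cache(maxsize=None)
def finish(p: str) -> int:
    """Completion time from p = its own time plus the longest chain after it."""
    return time[p] + max((finish(s) for s in succ[p]), default=0)

print(finish("A"))   # total completion time along the critical path A-B-D: 7
```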
Three types of communication scenarios in the asynchronous communication graph model:
• One-way (simplex)
• Client/server (half-duplex)
• Peer (full-duplex)
Service-oriented communication model
• A higher-level abstraction of IPC that can be supported by RPC or message-passing communication
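As one concrete realization, Python's standard xmlrpc module gives a service-oriented RPC interface on top of message passing (the port number and the add service are arbitrary choices for illustration):

```python
from xmlrpc.server import SimpleXMLRPCServer

def add(x: int, y: int) -> int:        # the service offered to clients
    return x + y

server = SimpleXMLRPCServer(("localhost", 8000), allow_none=True)
server.register_function(add, "add")   # export 'add' as a remote procedure
server.serve_forever()

# A client elsewhere calls the service as if it were a local procedure:
#   import xmlrpc.client
#   proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
#   print(proxy.add(2, 3))             # -> 5
```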
Minimal kernel
• Agent servers, or binders, bind the client process and the selected server process together
• Standard horizontal or vertical partitioning of modules can be applied to the structure of servers
Time services
• Events are recorded with respect to each process's own clock time
• Clocks are used to represent time (a relative measure of a point in time) and timers (an absolute measure of a time interval); they describe the occurrence of events in three different ways:
• When an event occurs
• How long it takes
• Which event occurs first
• There is no global time in distributed systems
• Physical clock: a close approximation of real time (both point and interval)
• Logical clock: preserves only the ordering of events
Physical clocks
• Compensating for delay:
• from UTC sources to time servers
• from time servers to clients
• Calibrating discrepancy
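One common way to compensate for the server-to-client delay is Cristian-style estimation; a sketch, under the assumption that the network delay is roughly symmetric (get_server_time is a hypothetical stand-in for the network request):

```python
import time

def get_server_time() -> float:
    """Stand-in for a request to a time server; returns the server's UTC time."""
    return time.time()              # hypothetical: would be a network call

def synchronized_time() -> float:
    """Estimate the current server time, compensating for network delay."""
    t0 = time.monotonic()
    server_time = get_server_time() # round trip to the time server
    t1 = time.monotonic()
    rtt = t1 - t0
    return server_time + rtt / 2    # assume the reply took half the round trip
```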
Applications of physical clocks:
• Protocols that rely on a time-out for handling exceptions
• Timestamping for secure internet communication (avoiding replay attacks)
Logical clocks
• For many applications, events need not be scheduled or synchronized with respect to a real-time clock; it is only the ordering of event execution that matters
• Lamport's logical clock is a fundamental concept for the ordering of processes and events in distributed systems
• Each process Pi in the system maintains a logical clock Ci
• →: the happens-before relation, used to synchronize the logical clocks
• a → b: event a precedes event b
• Within a process, if event a precedes event b, then C(a) < C(b)
• The logical clock in a process is always incremented by an arbitrary positive number as events in the process progress
• Processes interact with each other using pairs of send and receive operations; these are considered events as well
Rules for Lamport's logical clock:
1. If a → b within the same process, then C(a) < C(b)
2. If a is the sending event at Pi and b is the corresponding receiving event at Pj, then Ci(a) < Cj(b) (this can be enforced if the sending process timestamps the message with its logical clock and the receiving process updates its logical clock to the larger of its own clock time and the incremented timestamp)
Implementation of the rules:
1. C(b) = C(a) + d
2. Cj(b) = max(TSa + d, Cj(b)), where TSa is the timestamp of the sending event
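A minimal sketch of these two rules in Python, with the common choice d = 1:

```python
class LamportClock:
    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Rule 1: any local event advances the clock by d = 1."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """A send is a local event; its timestamp travels with the message."""
        return self.tick()

    def receive(self, ts_sender: int) -> int:
        """Rule 2: Cj(b) = max(TSa + d, Cj(b)), with d = 1."""
        self.time = max(ts_sender + 1, self.time + 1)
        return self.time

# Usage: Pi sends, Pj receives and advances past the sender's timestamp
pi, pj = LamportClock(), LamportClock()
ts = pi.send()          # Ci(a) = 1
print(pj.receive(ts))   # Cj(b) = 2 > Ci(a)
```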
• The happens-before relation describes the causality between two events; it is transitive
• Two events a and b are said to be disjoint events, and can run concurrently, if neither a → b nor b → a
• Rules 1 and 2 result only in a partial ordering, so a third rule can be added:
3. For all events a and b, C(a) ≠ C(b)
• System-wide unique logical clock times for all events can be obtained by concatenating the logical clock with a distinct process ID number
• The happens-before relation has an important limitation: Ci(a) < Cj(b) does not imply a → b, i.e., we cannot distinguish disjoint events using the values of logical clocks alone
Vector logical clocks
• For every event a, process Pi maintains the vector VCi(a) = [TS1, TS2, ..., Ci(a), ..., TSn], where n is the number of cooperating processes, Ci(a) ≡ TSi is the logical clock for event a at Pi, and TSk is the best estimate of the logical clock time of process Pk, obtained through the timestamp information carried by messages in the system
• VCi is initialized to the zero vector at system startup
• The logical clock within a process is incremented according to rule 1
• Rule 2 is modified: when sending message m from Pi (event a) to Pj, the vector timestamp VCi(m) is sent along with m to Pj; letting b be the event of receiving m at Pj, Pj sets VCj(b) to the component-wise maximum of VCj and VCi(m), after incrementing its own component as a local event
• The vector logical clock allows the identification of disjoint events, because it is not possible to have VCi(a) < VCj(b) unless a → b
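A sketch of a vector clock, including the comparison that detects disjoint (concurrent) events; process IDs 0..n-1 are assumed:

```python
class VectorClock:
    def __init__(self, pid: int, n: int) -> None:
        self.pid = pid
        self.v = [0] * n            # zero vector at system startup

    def tick(self) -> list:
        """Rule 1: a local event increments this process's own component."""
        self.v[self.pid] += 1
        return list(self.v)

    def send(self) -> list:
        return self.tick()          # the vector timestamp travels with m

    def receive(self, vm: list) -> list:
        """Modified rule 2: component-wise max with the message timestamp."""
        self.v = [max(a, b) for a, b in zip(self.v, vm)]
        self.v[self.pid] += 1       # receiving is itself a local event
        return list(self.v)

def happens_before(va: list, vb: list) -> bool:
    """VC(a) < VC(b): every component <=, and at least one strictly <."""
    return all(x <= y for x, y in zip(va, vb)) and va != vb

def disjoint(va: list, vb: list) -> bool:
    """Concurrent (disjoint) events: neither happens-before the other."""
    return not happens_before(va, vb) and not happens_before(vb, va)
```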
Example (from the accompanying event diagram):
• Disjoint events: (b, f)
• Causally related events: (a, e, h)
• Reader preference: an arriving writer waits until there are no running readers
• Writer preference: an arriving reader waits until there are no running or waiting writers
• Strong reader preference: a waiting reader is scheduled ahead of the waiting writers upon the completion of the running writer
• Weak reader preference: implicit scheduling; it does not specify which is scheduled next
• Weaker reader preference: a waiting writer is scheduled upon the completion of the running writer
Our problem
• A writer must wait until there are no active readers or writers
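A minimal sketch of this constraint (a weak-reader-preference flavor, using Python condition variables; the class and field names are invented):

```python
import threading

class ReadWriteLock:
    """Writers wait until there are no active readers or writers."""
    def __init__(self) -> None:
        self.cond = threading.Condition()
        self.readers = 0            # number of active readers
        self.writing = False        # is a writer active?

    def acquire_read(self) -> None:
        with self.cond:
            while self.writing:     # readers only wait for an active writer
                self.cond.wait()
            self.readers += 1

    def release_read(self) -> None:
        with self.cond:
            self.readers -= 1
            if self.readers == 0:
                self.cond.notify_all()

    def acquire_write(self) -> None:
        with self.cond:
            while self.writing or self.readers > 0:   # the stated condition
                self.cond.wait()
            self.writing = True

    def release_write(self) -> None:
        with self.cond:
            self.writing = False
            self.cond.notify_all()
```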