UBI529 Distributed Algorithms
Instructor: Kayhan Erciyes
Course: T 13:30-16:30
Office Hour: R 13:30-16:30
http://www.ube.ege.edu.tr/~erciyes/UBI529
Distributed Systems
• Set of computers connected by a communication network.
• Gives the user the illusion of a single computer.
• Old platform: usually a number of workstations over a LAN.
• Now ranges from a LAN to a sensor network to a mobile network.
• Each node in a DS:
  - is autonomous
  - communicates by messages
  - needs to synchronize with others
  to achieve a common goal (load balancing, fault tolerance, an application, ...)
Distributed Systems
• A distributed system is a collection of entities, each of which is autonomous, programmable, asynchronous and failure-prone, communicating through an unreliable communication medium.
• Our interest in this course is the study of algorithms rather than systems.
• We will model the distributed system as a graph G(V,E), sketched in code below, where
  - V is the set of nodes (processors, or processes on a device)
  - E is the set of edges (communication links, wired or wireless)
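To make the graph model concrete, here is a minimal C sketch of one possible G(V,E) representation (an adjacency list with fixed bounds). The names and limits are illustrative assumptions for this sketch, not part of the course material.

/* Hypothetical sketch of the graph model G(V,E): nodes are processes,
   edges are communication links. Names and bounds are illustrative. */
#define MAX_NODES  64
#define MAX_DEGREE 16

typedef struct {
    int neighbors[MAX_DEGREE];  /* ids of adjacent nodes (incident edges in E) */
    int degree;                 /* number of incident edges                    */
} Node;

typedef struct {
    Node nodes[MAX_NODES];      /* the vertex set V */
    int  n;                     /* |V|              */
} Graph;

/* Add an undirected edge (u,v) to E, i.e., a bidirectional link. */
static void add_edge(Graph *g, int u, int v)
{
    g->nodes[u].neighbors[g->nodes[u].degree++] = v;
    g->nodes[v].neighbors[g->nodes[v].degree++] = u;
}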
Modern Distributed Applications
• Collaborative computing
  - military command and control
  - shared whiteboard, shared editor, etc.
  - online strategy games
• Stock market
• Distributed real-time systems
  - process control
  - navigation systems; Air Traffic Control (ATC) in the U.S. – the largest DRTS
• Mobile ad hoc networks
  - rescue operations, emergency operations, robotics
• Wireless sensor networks
  - habitat monitoring, intelligent farming
• Grid computing
The Internet (Internet Mapping Project, color coded by ISPs)
Issues in Building Distributed Applications
• Reliable communication
• Consistency
  - same picture of the game, same shared file
• Fault tolerance, high availability
  - failures, recoveries, partitions, merges
• Scalability
  - How is performance affected as the number of nodes increases?
• Performance
  - What is the complexity of the designed algorithm?
Distributed Real-time Systems
• Correct and timely operation is imperative; failure to respond within a deadline may result in loss of lives and property.
• Hard real-time: process control systems, navigation systems, nuclear plants, patient monitoring systems, Air Traffic Control (ATC)
• Soft real-time: banking systems, airline reservation
Future: Mobile Distributed (and Real-Time?) Computing Systems
• Wireless data communications
• The mobile revolution is inevitable
• Two important application areas for distributed algorithms:
  - mobile ad hoc networks (MANETs)
  - wireless sensor networks (WSNs)
• Can we still apply DS principles?
• Problems:
  - location is dynamically changing information
  - security issues
  - limited storage on mobile hosts
Uncertainty in Distributed Systems
• Uncertainty comes from:
  - differing processor speeds
  - varying communication delays
  - (partial) failures
  - multiple input streams and interactive behavior
• Uncertainty makes it hard to be confident that the system is correct.
• To address this difficulty:
  - identify and abstract fundamental problems
  - state problems precisely
  - design algorithms to solve problems
  - prove correctness of algorithms
  - analyze complexity of algorithms (e.g., time, space, messages)
  - prove impossibility results and lower bounds
Application Areas
These areas have provided classic problems in distributed/concurrent computing:
• operating systems
• (distributed) database systems
• software fault-tolerance
• communication networks
• multiprocessor architectures
Course Overview
Part I : Distributed Graph Algorithms (approx. 5 weeks):
• Models of Distributed Computing
• Graph Theory and Algorithms Review
• Vertex and Tree Cover
• Distributed Tree based Communications
• Distributed MST
• Distributed Path Traversals
• Matching
• Independent Sets and Dominating Sets
• Clustering
• Graph Connectivity
Course Overview
Part II : Fundamental Algorithms (approx. 6 weeks):
• Time Synchronization
• Distributed Mutual Exclusion
• Election Algorithms
• Global State
• Termination Detection
• Synchronizers
• Message Ordering
Course Overview
Part III : Fault Tolerance (approx. 3 weeks):
• Self-stabilization
• Consensus
• Failure Detectors
• Recovery
For each topic:
• describe algorithms, usually starting from a sequential one and then describing well-known distributed algorithms
• analyze the cost of these algorithms
• explore limitations
• also mention applications that use the techniques
A message passing model
• System topology is a graph G = (V, E), where
  - V = set of nodes (sequential processes)
  - E = set of edges (links or channels, bi/unidirectional)
• Four types of actions by a process:
  - internal action
  - input action
  - output action
  - communication action
A reliable FIFO channel from P to Q
Axiom 1. Message m sent ⇒ message m received.
Axiom 2. Message propagation delay is arbitrary but finite.
Axiom 3. m1 sent before m2 ⇒ m1 received before m2.
Life of a process
When a message m is received:
1. Evaluate a predicate with m and the local variables;
2. if predicate = true then
       update internal variables;
       send zero or more messages;
   else
       skip {do nothing}
   end if
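This loop can be sketched in C roughly as follows. receive, predicate, update_state and send_responses are hypothetical placeholders for the underlying message-passing layer; they are stubbed with stdin/stdout here only so the sketch compiles and runs.

#include <stdbool.h>
#include <stdio.h>

typedef struct { int type; int payload; } Msg;

/* Hypothetical placeholders for the message-passing layer,
   stubbed so the sketch is runnable. */
static bool receive(Msg *m)
{
    return scanf("%d %d", &m->type, &m->payload) == 2;  /* block for a message */
}
static bool predicate(const Msg *m) { return m->type == 1; }   /* example guard */
static void update_state(const Msg *m) { (void)m; /* update local variables */ }
static void send_responses(const Msg *m) { printf("ack %d\n", m->payload); }

int main(void)
{
    Msg m;
    while (receive(&m)) {        /* when a message m is received ...      */
        if (predicate(&m)) {     /* evaluate the predicate on m and state */
            update_state(&m);    /* update internal variables             */
            send_responses(&m);  /* send zero or more messages            */
        }                        /* else: skip (do nothing)               */
    }
    return 0;
}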
Synchrony vs. Asynchrony
• Synchronism may occur at various levels
• Send and receive can be blocking or non-blocking
• Postal communication is asynchronous; telephone communication is synchronous
• Synchronous communication or not? Remote Procedure Call, Email
Shared memory model
• Address spaces of processes overlap
• Concurrent operations on a shared variable are serialized
(figure: processes 1-4 accessing shared memory modules M1 and M2)
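In practice, this serialization is commonly enforced with a lock. A minimal Pthreads sketch under that assumption (an illustration, not material from the slides; build with -pthread):

#include <pthread.h>
#include <stdio.h>

static int shared = 0;                        /* the shared variable */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);            /* operations on the shared   */
        shared++;                             /* variable are serialized by */
        pthread_mutex_unlock(&lock);          /* the mutex                  */
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("shared = %d\n", shared);          /* always 200000 */
    return 0;
}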
Modeling wireless networks
• Communication via broadcast
• Limited range
• Dynamic topology
• Collision of broadcasts (handled by CSMA/CA)
(figure: RTS/CTS handshake between neighboring nodes)
Weak vs. Strong Models
• One object (or operation) of a strong model = more than one object (or operation) of a weaker model.
• Often, weaker models are synonymous with fewer restrictions. One can add layers (additional restrictions) to create a stronger model from a weaker one.
• Examples:
  - The HLL model is stronger than the assembly language model.
  - Asynchronous is weaker than synchronous.
  - Bounded delay is stronger than unbounded delay (channel).
Model transformation
• Stronger models
  - simplify reasoning, but
  - need extra work to implement
• Weaker models
  - are easier to implement
  - have a closer relationship with the real world
• "Can model X be implemented using model Y?" is an interesting question in computer science.
• Sample problems:
  - Non-FIFO to FIFO channel
  - Message passing to shared memory
  - Non-atomic broadcast to atomic broadcast
Non-FIFO to FIFO channel
(figure: messages m1..m4 sent by P arrive at Q out of order and are held in a buffer)
Non-FIFO to FIFO channel

{Sender process P}
var i : integer {initially 0}
repeat
    send (m[i], i) to Q;
    i := i + 1
forever

{Receiver process Q}
var k : integer {initially 0}
    buffer : array [0..∞] of msg {initially ∀ k : buffer[k] = empty}
repeat
    {STORE}   receive (m[i], i) from P;
              store m[i] into buffer[i];
    {DELIVER} while buffer[k] ≠ empty do begin
                  deliver content of buffer[k];
                  buffer[k] := empty;
                  k := k + 1
              end
forever
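To make the protocol concrete, here is a small self-contained C simulation of Q's STORE/DELIVER logic. The unbounded buffer is replaced by a fixed window (which assumes the sender never runs more than WINDOW messages ahead of delivery), and the out-of-order arrivals are hard-coded; both are illustrative assumptions for this sketch.

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define WINDOW 8   /* assumes the sender stays within WINDOW of delivery */

typedef struct { bool full; char payload[16]; } Slot;

int main(void)
{
    /* Simulated non-FIFO channel: tagged messages (m, i) arrive out of order. */
    struct { int seq; const char *msg; } arrivals[] = {
        {2, "m2"}, {0, "m0"}, {3, "m3"}, {1, "m1"}
    };
    Slot buffer[WINDOW] = {0};
    int k = 0;                              /* next sequence number to deliver */

    for (int a = 0; a < 4; a++) {           /* STORE: buffer each arrival */
        int i = arrivals[a].seq;
        strcpy(buffer[i % WINDOW].payload, arrivals[a].msg);
        buffer[i % WINDOW].full = true;

        while (buffer[k % WINDOW].full) {   /* DELIVER: in sequence order */
            printf("deliver %s\n", buffer[k % WINDOW].payload);
            buffer[k % WINDOW].full = false;
            k++;
        }
    }
    return 0;                               /* prints m0, m1, m2, m3 in order */
}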
Other classifications of models
• Reactive vs. transformational systems
  - A reactive system never sleeps (like a server).
  - A transformational (or non-reactive) system reaches a fixed point after which no further change occurs in the system. (Examples?)
• Named vs. anonymous systems
  - In named systems, the process id is a part of the algorithm.
  - In anonymous systems, it is not; all processes are equal.
    (-) Symmetry breaking is often a challenge.
    (+) Easy to replace one process by another with no side effect. Saves log N bits.
Configuration • Vector of processor states (including outbufs, i.e., channels), one per processor, is a configuration of the system • Captures current snapshot of entire system: accessible processor states (local vars + incoming msg queues) as well as communication channels.
Termination • For technical reasons, admissible executions are defined as infinite. • But often algorithms terminate. • To model algorithm termination, identify terminated states of processors: states which, once entered, are never left • Execution has terminated when all processors are terminated and no messages are in transit (in inbufs or outbufs)
Finite State Machines for Modelling Distributed Algorithms
A finite state machine (with output) is a 6-tuple (I, S, S0, G, O, Ç):
• I : set of inputs
• S : set of states
• S0 : initial state
• G : S × I → S, the state transition function
• O : set of outputs
• Ç : S × I → O, the output function
FSMs
• Moore FSM: the output depends only on the current state.
• Mealy FSM: the output depends on the input and the current state. It usually has fewer states than the equivalent Moore FSM.
• A Moore FSM example: a parity checker, sketched in code below.
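A minimal C sketch of that parity checker; the state names and the example input stream are illustrative.

#include <stdio.h>

typedef enum { EVEN, ODD } State;       /* S, with initial state S0 = EVEN */

static State transition(State s, int bit)   /* G : S × I → S */
{
    return (bit == 1) ? (s == EVEN ? ODD : EVEN) : s;
}

static int output(State s)              /* Moore output: depends on state only */
{
    return s == ODD;                    /* 1 iff the bits seen so far have odd parity */
}

int main(void)
{
    int bits[] = {1, 0, 1, 1};          /* example input stream */
    State s = EVEN;
    for (int i = 0; i < 4; i++) {
        s = transition(s, bits[i]);
        printf("bit %d -> parity %s\n", bits[i], output(s) ? "odd" : "even");
    }
    return 0;
}

Because the output function reads only the current state, this is a Moore machine; a Mealy version would compute the output from the state and the incoming bit together.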
Process FSMs
Each process is an FSM. All processes execute the same FSM code but may be in different states.
Our Implementation Environment for Distributed Algorithms
• Design the distributed algorithm as an FSM
• Implement each FSM as Solaris/Linux threads
• Use my "thread IPC" module for communication
The Unix fork
• procid = fork()
• Replicates the calling process
• Parent and child are identical except for the value of procid
• Use procid to diverge parent and child:

    if (procid == 0)
        do_child_processing();
    else
        do_parent_processing();
Process Concept
• An operating system executes a variety of programs:
  - batch systems: jobs
  - time-shared systems: user programs or tasks
  - the terms job and process are used almost interchangeably
• Process: a program in execution; process execution must progress in a sequential fashion.
• A process includes:
  - program counter
  - stack
  - data section
UNIX Processes
• Each process has its own address space
  - subdivided into text, data, and stack segments
  - the a.out file describes the address space
• The OS kernel creates a descriptor to manage the process
• Process identifier (PID): the user's handle for the process (descriptor)
Creating/Destroying Processes
• UNIX fork() creates a process
  - creates a new address space
  - copies text, data, and stack into the new address space
  - provides the child with access to open files
• UNIX wait() allows a parent to wait for a child to terminate
• UNIX execve() allows a child to run a new program
Creating a UNIX Process

int pidValue;
...
pidValue = fork();        /* Creates a child process */
if (pidValue == 0) {
    /* pidValue is 0 for the child, nonzero for the parent */
    /* The child executes this code concurrently with the parent */
    childsPlay(…);        /* A procedure linked into a.out */
    exit(0);
}
/* The parent executes this code concurrently with the child */
parentsWork(…);
wait(…);
...
Child Executes Something Else

int pid;
...
/* Set up the argv array for the child */
...
/* Create the child */
if ((pid = fork()) == 0) {
    /* The child executes its own absolute program */
    execve("childProgram.out", argv, NULL);
    /* We only return from an execve call if it fails */
    printf("Error in the exec ... terminating the child ...\n");
    exit(0);
}
...
wait(…);   /* Parent waits for child to terminate */
...
Example: Parent

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    int c;
    if ((c = fork()) == 0) {   /* This is the child process */
        execve("child", NULL, NULL);
        exit(0);               /* Should never get here, terminate */
    }
    /* Parent code here */
    printf("Process[%d]: Parent in execution ...\n", getpid());
    sleep(2);
    if (wait(NULL) > 0)        /* Child terminating */
        printf("Process[%d]: Parent detects terminating child\n", c);
    printf("Process[%d]: Parent terminating ...\n", getpid());
    return 0;
}
What is a thread?
• Process:
  - an address space with one or more threads executing within it, plus the required system resources for those threads
  - informally, a program that is running
• Thread:
  - a sequence of control within a process
  - shares the resources of that process
Advantages and Drawbacks of Threads • Advantages: • the overhead for creating a thread is significantly less than that for creating a process • multitasking, i.e., one process serves multiple clients • switching between threads requires the OS to do much less work than switching between processes
Drawbacks:
• not as widely available as longer-established features
• writing multithreaded programs requires more careful thought
• more difficult to debug than single-threaded programs
• on single-processor machines, creating several threads in a program may not necessarily produce a performance increase (only so many CPU cycles to be had)
User Threads • Thread management done by user-level threads library • Examples • - POSIX Pthreads • - Mach C-threads • - Solaris threads
Kernel Threads • Supported by the Kernel • Examples • - Windows 95/98/NT/2000 • - Solaris • - Tru64 UNIX • - BeOS • - Linux
Multithreading Models
• Many-to-One
• One-to-One
• Many-to-Many
Many-to-One:
• Many user-level threads mapped to a single kernel thread.
• Used on systems that do not support kernel threads.
One-to-One
• Each user-level thread maps to a kernel thread.
• Examples:
  - Windows 95/98/NT/2000
  - OS/2
Pthreads
• A POSIX standard (IEEE 1003.1c) API for thread creation and synchronization.
• The API specifies the behavior of the thread library; the implementation is up to the developers of the library.
• Common in UNIX operating systems.
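A minimal usage sketch of the core API calls, pthread_create and pthread_join; the worker function and its argument are illustrative (build with -pthread):

#include <pthread.h>
#include <stdio.h>

static void *worker(void *arg)
{
    int n = *(int *)arg;                  /* argument passed by the creator */
    printf("thread running with argument %d\n", n);
    return NULL;
}

int main(void)
{
    pthread_t tid;
    int arg = 42;
    if (pthread_create(&tid, NULL, worker, &arg) != 0) {
        perror("pthread_create");
        return 1;
    }
    pthread_join(tid, NULL);              /* wait for the thread to finish */
    printf("main: worker finished\n");
    return 0;
}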