
UBI529 Distributed Algorithms


Presentation Transcript


  1. UBI529 Distributed Algorithms Instructor : Kayhan Erciyes Course : T 13:30-16:30 Office Hour : R 13:30-16:30 http://www.ube.ege.edu.tr/~erciyes/UBI529

  2. Distributed Systems • Set of computers connected by a communication network • Gives the user the illusion of a single computer • Old platform: usually a number of workstations (WSs) over a LAN • Now, ranges from a LAN to a sensor network to a mobile network • Each node in a DS: • is autonomous • communicates by messages • needs to synchronize with others • to achieve a common goal (load balancing, fault tolerance, an application ...)

  3. Distributed Systems • A distributed system is a collection of entities, each of which is autonomous, programmable, asynchronous and failure-prone, communicating through an unreliable communication medium • Our interest in this course is the study of algorithms rather than systems • We will model the distributed system as a graph G(V,E) where • V is the set of nodes (processors, or processes on a device) • E is the set of edges (communication links, wired or wireless)
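To make the G(V,E) model concrete, here is a minimal C sketch (not from the slides; the names Graph, add_link and MAX_NODES are illustrative) in which nodes are processes and edges are communication links stored in an adjacency matrix:

    /* Minimal sketch of the G(V,E) model: nodes are processes,
       edges are communication links. */
    #include <stdio.h>

    #define MAX_NODES 8

    typedef struct {
        int n;                          /* number of nodes (processes)       */
        int adj[MAX_NODES][MAX_NODES];  /* adj[u][v] = 1 iff (u,v) is in E   */
    } Graph;

    /* add an undirected communication link between processes u and v */
    void add_link(Graph *g, int u, int v) {
        g->adj[u][v] = 1;
        g->adj[v][u] = 1;
    }

    int main(void) {
        Graph g = { .n = 4 };           /* four processes, no links yet */
        add_link(&g, 0, 1);
        add_link(&g, 1, 2);
        add_link(&g, 2, 3);
        printf("link between 0 and 1: %d\n", g.adj[0][1]);
        return 0;
    }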

  4. Modern Distributed Applications • Collaborative computing • Military command and control • Shared white-board, shared editor, etc. • Online strategy games • Stock market • Distributed Real-time Systems • Process control • Navigation systems, Airline Traffic Monitoring (ATM) in the U.S. – the largest DRTS • Mobile Ad hoc Networks • Rescue operations, emergency operations, robotics • Wireless Sensor Networks • Habitat monitoring, intelligent farming • Grid

  5. The Internet (Internet Mapping Project, color coded by ISPs)

  6. Distributed Systems: Architecture

  7. Issues in Building Distributed Applications • Reliable communication • Consistency • same picture of game, same shared file • Fault-tolerance, high availability • failures, recoveries, partitions, merges • Scalability • How is performance affected as the number of nodes increases? • Performance • What is the complexity of the designed algorithm?

  8. Distributed Real-time Systems • Correct and timely operation is imperative • Failure to respond within a deadline may result in loss of lives and property • Hard real-time: process control systems, navigation systems, nuclear plants, patient monitoring systems, Airline Traffic Control (ATC) • Soft real-time: banking systems, airline reservation

  9. Distributed Real-time Systems : Architecture

  10. Future : Mobile Distributed (and Real-Time ??) Computing Systems • Wireless data communications • Mobile revolution is inevitable • Two important application areas for distributed algorithms: • - Mobile Ad hoc Networks (MANETs) • - Wireless Sensor Networks (WSNs) • Can we still apply DS principles? • Problems: • Location is dynamically changing information • Security issues • Limited storage on mobile hosts

  11. Uncertainty in Distributed Systems • Uncertainty comes from • differing processor speeds • varying communication delays • (partial) failures • multiple input streams and interactive behavior • Uncertainty makes it hard to be confident that system is correct • To address this difficulty: • identify and abstract fundamental problems • state problems precisely • design algorithms to solve problems • prove correctness of algorithms • analyze complexity of algorithms (e.g., time, space, messages) • prove impossibility results and lower bounds

  12. Application Areas • These areas have provided classic problems in distributed/concurrent computing: • operating systems • (distributed) database systems • software fault-tolerance • communication networks • multiprocessor architectures

  13. Course Overview • Part I : Distributed Graph Algorithms (approx. 5 weeks) : • Models of Distributed Computing • Graph Theory and Algorithms Review • Vertex and Tree Cover • Distributed Tree based Communications • Distributed MST • Distributed Path Traversals • Matching • Independent Sets and Dominating Sets • Clustering • Graph Connectivity

  14. Course Overview • Part II : Fundamental Algorithms (approx. 6 weeks): • Time Synchronization • Distributed Mutual Exclusion • Election Algorithms • Global State • Termination Detection • Synchronizers • Message Ordering

  15. Course Overview • Part III : Fault Tolerance (approx. 3 weeks) • Self-stabilization • Consensus • Failure Detectors • Recovery • For each topic : • describe algorithms, usually starting from a sequential one and then describing well-known distributed algorithms • analyze the cost of these algorithms • explore limitations • Also mention applications that use the techniques

  16. Distributed Algorithms : A Perspective (K. Erciyes 2007)

  17. A message passing model • System topology is a graph G = (V, E), where • V = set of nodes (sequential processes) • E = set of edges (links or channels, bi/unidirectional) • Four types of actions by a process: • - Internal action • - Communication action • - Input action • - Output action

  18. A Reliable FIFO channel (from P to Q) • Axiom 1. Message m sent ⇒ message m received. • Axiom 2. Message propagation delay is arbitrary but finite. • Axiom 3. m1 sent before m2 ⇒ m1 received before m2.

  19. Life of a process • When a message m is received: • 1. evaluate a predicate with m and the local variables; • 2. if predicate = true then • - update internal variables; • - send zero or more messages; • else skip {do nothing} • end if
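As a rough illustration of this receive-evaluate-act loop, here is a hedged C sketch; msg_t, receive() and predicate() are hypothetical stand-ins for a real message-passing layer, not an API from the course:

    #include <stdio.h>

    typedef struct { int value; } msg_t;

    /* stub: a real system would block here until a message arrives */
    static msg_t receive(void) { msg_t m = { 42 }; return m; }

    /* the predicate, evaluated with m and the local variables */
    static int predicate(msg_t m, int state) { return m.value != state; }

    static void on_message(int *state) {
        msg_t m = receive();           /* a message m is received      */
        if (predicate(m, *state)) {    /* 1. evaluate the predicate    */
            *state = m.value;          /* 2. update internal variables */
            /* ... send zero or more messages here ... */
        }                              /* else skip {do nothing}       */
    }

    int main(void) {
        int state = 0;
        on_message(&state);
        printf("state = %d\n", state);
        return 0;
    }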

  20. Synchrony vs. Asynchrony • Synchronism may occur at various levels • Send & receive can be blocking or non-blocking • Postal communication is asynchronous; telephone communication is synchronous • Synchronous communication or not? Remote Procedure Call, Email

  21. Shared memory model • Address spaces of processes overlap • Concurrent operations on a shared variable are serialized • (figure: processes 1–4 accessing shared memories M1, M2)
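The serialization point can be illustrated with POSIX threads (introduced later in these slides). This sketch is not from the slides: two threads increment one shared variable, and the mutex forces their operations to take effect one at a time:

    #include <pthread.h>
    #include <stdio.h>

    int shared = 0;
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    void *worker(void *arg) {
        for (int i = 0; i < 1000; i++) {
            pthread_mutex_lock(&lock);    /* operations on 'shared' serialize here */
            shared++;
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("shared = %d\n", shared);  /* 2000, because increments serialize */
        return 0;
    }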

  22. Modeling wireless networks • Communication via broadcast • Limited range • Dynamic topology • Collisions of broadcasts (handled by CSMA/CA with RTS/CTS handshakes)

  23. Weak vs. Strong Models • One object (or operation) of a strong model = more than one object (or operation) of a weaker model. • Often, weaker models are synonymous with fewer restrictions. • One can add layers (additional restrictions) to create a stronger model from a weaker one. • Examples • HLL model is stronger than assembly language model. • Asynchronous is weaker than synchronous. • Bounded delay is stronger than unbounded delay (channel)

  24. Model transformation • Stronger models • - simplify reasoning, but • - need extra work to implement • Weaker models • - are easier to implement • - have a closer relationship with the real world • “Can model X be implemented using model Y?” is an interesting question in computer science. • Sample problems: • Non-FIFO to FIFO channel • Message passing to shared memory • Non-atomic broadcast to atomic broadcast

  25. Non-FIFO to FIFO channel • (figure: messages m1..m4 in transit from P to Q, parked in a buffer at the receiver)

  26. Non-FIFO to FIFO channel

      {Sender process P}
      var i : integer {initially 0}
      repeat
          send ⟨m[i], i⟩ to Q;
          i := i + 1
      forever

      {Receiver process Q}
      var k : integer {initially 0}
          buffer : buffer[0..∞] of msg {initially ∀ k : buffer[k] = empty}
      repeat
          receive ⟨m[i], i⟩ from P;
          store m[i] into buffer[i];              {STORE}
          while buffer[k] ≠ empty do              {DELIVER}
          begin
              deliver content of buffer[k];
              buffer[k] := empty; k := k + 1
          end
      forever
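A self-contained C sketch of the receiver Q above; the small fixed-size buffer, assumed never to overflow, is a simplification of the unbounded buffer[0..∞] in the pseudocode:

    #include <stdio.h>
    #include <stdbool.h>

    #define BUFSZ 16                 /* simplification: pseudocode uses 0..∞ */

    static int  payload[BUFSZ];      /* buffer[i] holds message i            */
    static bool full[BUFSZ];         /* full[i] <=> buffer[i] != empty       */
    static int  k = 0;               /* next sequence number to deliver      */

    static void deliver(int m) { printf("deliver %d\n", m); }

    static void on_receive(int m, int i) {   /* receive <m, i> from P */
        payload[i] = m;                      /* STORE at index i      */
        full[i] = true;
        while (k < BUFSZ && full[k]) {       /* DELIVER in FIFO order */
            deliver(payload[k]);
            full[k] = false;
            k++;
        }
    }

    int main(void) {
        /* messages arrive out of order: <20,2>, <0,0>, <10,1> */
        on_receive(20, 2);   /* parked: message 0 not yet here  */
        on_receive(0, 0);    /* delivers 0                      */
        on_receive(10, 1);   /* delivers 10, then the parked 20 */
        return 0;
    }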

  27. Other classifications of models • Reactive vs Transformational systems • A reactive system never sleeps (like a server) • A transformational (or non-reactive) system reaches a fixed point after which no further change occurs in the system (Examples?) • Named vs Anonymous systems • In named systems, the process id is a part of the algorithm. • In anonymous systems, it is not; all processes are equal. • (-) Symmetry breaking is often a challenge. • (+) Easy to replace one process by another with no side effect. Saves log N bits.

  28. Configuration • Vector of processor states (including outbufs, i.e., channels), one per processor, is a configuration of the system • Captures current snapshot of entire system: accessible processor states (local vars + incoming msg queues) as well as communication channels.

  29. Termination • For technical reasons, admissible executions are defined as infinite. • But often algorithms terminate. • To model algorithm termination, identify terminated states of processors: states which, once entered, are never left • Execution has terminated when all processors are terminated and no messages are in transit (in inbufs or outbufs)

  30. Finite State Machines for Modelling Distributed Algorithms • A Finite State Machine is a 5-tuple (I, S, S0, G, O): • I : set of inputs • S : set of states • S0 : initial state • G : S × I → S, the state transition function • O : set of outputs • with output function Ç : S × I → O

  31. FSMs • Moore FSM : output depends only on the current state • Mealy FSM : output depends on the input and the current state. It usually has fewer states than the equivalent Moore FSM • A Moore FSM example : parity checker
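As a concrete version of the parity-checker example, here is a hedged C sketch of a two-state Moore FSM; the state names and the table G are illustrative, following the (I, S, S0, G, O) notation of the previous slide:

    #include <stdio.h>

    typedef enum { EVEN = 0, ODD = 1 } state_t;

    /* G : S x I -> S, the state transition function */
    static const state_t G[2][2] = {
        /* input 0  input 1 */
        {  EVEN,    ODD  },   /* from EVEN */
        {  ODD,     EVEN }    /* from ODD  */
    };

    /* Moore property: the output is a function of the state alone */
    static int output(state_t s) { return s == ODD; }

    int main(void) {
        state_t s = EVEN;                 /* S0, the initial state */
        int bits[] = { 1, 0, 1, 1 };
        for (int i = 0; i < 4; i++) {
            s = G[s][bits[i]];
            printf("bit %d -> parity %d\n", bits[i], output(s));
        }
        return 0;                         /* 1011 has odd parity */
    }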

  32. Process FSMs • Each process is an FSM. They execute the same FSM code but may be in different states.

  33. Our Implementation Environment for Distributed Algorithms • - Design the distributed algorithm as an FSM • - Implement each FSM as Solaris/Linux Threads • - Use my “thread IPC” module for communication

  34. The Unix fork • procid = fork() • Replicates the calling process • Parent and child are identical except for the value of procid • Use procid to diverge parent and child: • if (procid == 0) do_child_processing(); else do_parent_processing();

  35. Process Concept • An operating system executes a variety of programs: • Batch systems – jobs • Time-shared systems – user programs or tasks • The terms job and process are used almost interchangeably. • Process – a program in execution; process execution must progress in a sequential fashion. • A process includes: • program counter • stack • data section

  36. UNIX Processes • Each process has its own address space • – subdivided into text, data, & stack segments • – the a.out file describes the address space • The OS kernel creates a descriptor to manage the process • Process identifier (PID): user handle for the process (descriptor)

  37. Creating/Destroying Processes • UNIX fork() creates a process • – creates a new address space • – copies text, data, & stack into the new address space • – provides the child with access to open files • UNIX wait() allows a parent to wait for a child to terminate • UNIX execve() allows a child to run a new program

  38. Creating a UNIX Process

      int pidValue;
      ...
      pidValue = fork();   /* Creates a child process */
      if (pidValue == 0) {
          /* pidValue is 0 for the child, nonzero for the parent */
          /* The child executes this code concurrently with the parent */
          childsPlay(...); /* A procedure linked into a.out */
          exit(0);
      }
      /* The parent executes this code concurrently with the child */
      parentsWork(...);
      wait(...);
      ...

  39. Child Executes Something Else

      int pid;
      ...
      /* Set up the argv array for the child */
      ...
      /* Create the child */
      if ((pid = fork()) == 0) {
          /* The child executes its own absolute program */
          execve("childProgram.out", argv, 0);
          /* Only returns from an execve call if it fails */
          printf("Error in the exec ... terminating the child ...");
          exit(0);
      }
      ...
      wait(...);   /* Parent waits for child to terminate */
      ...

  40. Example: Parent

      #include <stdio.h>
      #include <unistd.h>
      #include <sys/wait.h>

      int main(void)
      {
          pid_t c;
          if ((c = fork()) == 0) {   /* This is the child process */
              execve("child", NULL, NULL);
              exit(0);               /* Should never get here; terminate */
          }
          /* Parent code here */
          printf("Process[%d]: Parent in execution ...\n", getpid());
          sleep(2);
          if (wait(NULL) > 0)        /* Child terminating */
              printf("Process[%d]: Parent detects terminating child\n", c);
          printf("Process[%d]: Parent terminating ...\n", getpid());
      }

  41. What is a thread? • process: • an address space with 1 or more threads executing within that address space, and the required system resources for those threads • a program that is running • thread: • a sequence of control within a process • shares the resources in that process

  42. Advantages and Drawbacks of Threads • Advantages: • the overhead for creating a thread is significantly less than that for creating a process • multitasking, i.e., one process serves multiple clients • switching between threads requires the OS to do much less work than switching between processes

  43. Drawbacks: • not as widely available as longer-established features • writing multithreaded programs requires more careful thought • more difficult to debug than single-threaded programs • on single-processor machines, creating several threads in a program may not necessarily increase performance (only so many CPU cycles to be had)

  44. Single and Multithreaded Processes

  45. User Threads • Thread management done by user-level threads library • Examples • - POSIX Pthreads • - Mach C-threads • - Solaris threads

  46. Kernel Threads • Supported by the Kernel • Examples • - Windows 95/98/NT/2000 • - Solaris • - Tru64 UNIX • - BeOS • - Linux

  47. Multithreading Models • Many-to-One • One-to-One • Many-to-Many • Many-to-One : • Many user-level threads mapped to single kernel thread. • Used on systems that do not support kernel threads.

  48. Many-to-One Model

  49. One-to-One • Each user-level thread maps to kernel thread. • Examples • - Windows 95/98/NT/2000 • - OS/2

  50. Pthreads • a POSIX standard (IEEE 1003.1c) API for thread creation and synchronization. • The API specifies the behavior of the thread library; the implementation is up to the developers of the library. • Common in UNIX operating systems.
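A minimal usage sketch (typical Pthreads idiom, not taken from the slides): create one thread and join it, mirroring the fork()/wait() pattern of the earlier slides but within a single address space. Compile with cc -pthread.

    #include <pthread.h>
    #include <stdio.h>

    void *child_thread(void *arg) {
        printf("Thread %ld: running\n", (long)arg);
        return NULL;
    }

    int main(void) {
        pthread_t tid;
        /* analogous to fork(): start a new flow of control */
        if (pthread_create(&tid, NULL, child_thread, (void *)1L) != 0) {
            perror("pthread_create");
            return 1;
        }
        /* analogous to wait(): block until the thread terminates */
        pthread_join(tid, NULL);
        printf("Main: thread joined, exiting\n");
        return 0;
    }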
