Multithreaded and Distributed Programming -- Classes of Problems
ECEN5053 Software Engineering of Distributed Systems, University of Colorado
Foundations of Multithreaded, Parallel, and Distributed Programming, Gregory R. Andrews, Addison-Wesley, 2000
The Essence of Multiple Threads -- review • Two or more processes that work together to perform a task • Each process is a sequential program • One thread of control per process • Communicate using shared variables • Need to synchronize with each other, 1 of 2 ways • Mutual exclusion • Condition synchronization ECEN5053 SW Eng of Distributed Systems, University of Colorado
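As a minimal Java illustration of the two kinds of synchronization (a sketch, not code from Andrews' text; the SharedCounter class and its methods are hypothetical): synchronized methods give mutual exclusion, and wait/notifyAll give condition synchronization.

    // Illustrative sketch: a shared counter showing both kinds of synchronization.
    class SharedCounter {
        private int count = 0;

        // Mutual exclusion: only one thread at a time may update count.
        public synchronized void increment() {
            count++;
            notifyAll();              // wake threads waiting on a condition
        }

        // Condition synchronization: wait until count reaches a threshold.
        public synchronized void awaitAtLeast(int n) throws InterruptedException {
            while (count < n) {       // re-check the condition after every wakeup
                wait();
            }
        }
    }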
Opportunities & Challenges • What kinds of processes to use • How many parts or copies • How they should interact • Key to developing a correct program is to ensure the process interaction is properly synchronized ECEN5053 SW Eng of Distributed Systems, University of Colorado
Focus in this course • Imperative programs • Programmer has to specify the actions of each process and how they communicate and synchronize. (Java, Ada) • Declarative programs (not our focus) • Written in languages designed for the purpose of making synchronization and/or concurrency implicit • Require machine to support the languages, for example, “massively parallel machines.” • Asynchronous process execution • Shared memory, distributed memory, networks of workstations (message-passing) ECEN5053 SW Eng of Distributed Systems, University of Colorado
Multiprocessing monkey wrench • The solutions we addressed last semester presumed a single CPU and therefore the concurrent processes share coherent memory • A multiprocessor environment with shared memory introduces cache and memory consistency problems and overhead to manage it. • A distributed-memory multiprocessor/multicomputer/network environment has additional issues of latency, bandwidth, administration, security, etc. ECEN5053 SW Eng of Distributed Systems, University of Colorado
Recall from multiprogram systems • A process is a sequential program that has its own thread of control when executed • A concurrent program contains multiple processes so every concurrent program has multiple threads, one for each process. • Multithreaded usually means a program contains more processes than there are processors to execute them • A multithreaded software system manages multiple independent activities ECEN5053 SW Eng of Distributed Systems, University of Colorado
Why write as multithreaded? • To be cool (wrong reason) • Sometimes, it is easier to organize the code and data as a collection of processes than as a single huge sequential program • Each process can be scheduled and executed independently • Other applications can continue to execute “in the background” ECEN5053 SW Eng of Distributed Systems, University of Colorado
Many applications, 5 basic paradigms • Iterative parallelism • Recursive parallelism • Producers and consumers (pipelines) • Clients and servers • Interacting peers Each of these can be accomplished in a distributed environment. Some can be used in a single CPU environment. ECEN5053 SW Eng of Distributed Systems, University of Colorado
Iterative parallelism • Example? • Several, often identical processes • Each contains one or more loops • Therefore each process is iterative • They work together to solve a single problem • Communicate and synchronize using shared variables • Independent computations – disjoint write sets ECEN5053 SW Eng of Distributed Systems, University of Colorado
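A possible Java sketch of iterative parallelism (the class name, array size, and worker count are illustrative assumptions): identical worker threads, each iterating over its own disjoint slice of a shared array, so the write sets never overlap.

    // Hypothetical sketch: identical iterative workers with disjoint write sets.
    public class IterativeParallel {
        public static void main(String[] args) throws InterruptedException {
            final double[] a = new double[1000];
            final int nWorkers = 4;
            Thread[] workers = new Thread[nWorkers];
            for (int w = 0; w < nWorkers; w++) {
                final int id = w;
                workers[w] = new Thread(() -> {
                    // disjoint write set: worker id touches only its own slice
                    for (int i = id * a.length / nWorkers;
                             i < (id + 1) * a.length / nWorkers; i++) {
                        a[i] = Math.sqrt(i);   // independent computation
                    }
                });
                workers[w].start();
            }
            for (Thread t : workers) t.join();   // wait for all slices to finish
        }
    }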
Recursive parallelism • One or more independent recursive procedures • Recursion is the dual of iteration • Procedure calls are independent – each works on different parts of the shared data • Often used in imperative languages for • Divide and conquer algorithms • Backtracking algorithms (e.g. tree-traversal) • Used to solve combinatorial problems such as sorting, scheduling, and game playing • If too many recursive procedures, we prune. ECEN5053 SW Eng of Distributed Systems, University of Colorado
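One way to express recursive parallelism in Java is the fork/join framework; the sketch below (a hypothetical SumTask, not from the text) divides and conquers a sum, and the THRESHOLD cutoff is the pruning: below it the task falls back to a sequential loop instead of recursing in parallel.

    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.RecursiveTask;

    // Illustrative divide-and-conquer sum using Java's fork/join framework.
    class SumTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 10_000;   // pruning cutoff (illustrative)
        private final long[] data;
        private final int lo, hi;

        SumTask(long[] data, int lo, int hi) {
            this.data = data; this.lo = lo; this.hi = hi;
        }

        @Override
        protected Long compute() {
            if (hi - lo <= THRESHOLD) {                // prune: solve this piece sequentially
                long sum = 0;
                for (int i = lo; i < hi; i++) sum += data[i];
                return sum;
            }
            int mid = (lo + hi) / 2;                   // divide
            SumTask left = new SumTask(data, lo, mid);
            SumTask right = new SumTask(data, mid, hi);
            left.fork();                               // independent recursive calls
            return right.compute() + left.join();      // combine the halves
        }

        public static void main(String[] args) {
            long[] data = new long[1_000_000];
            java.util.Arrays.fill(data, 1L);
            System.out.println(new ForkJoinPool().invoke(new SumTask(data, 0, data.length)));
        }
    }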
Producers and consumers • One-way communication between processes • Often organized into a pipeline through which info flows • Each process is a filter that consumes the output of its predecessor and produces output for its successor • That is, a producer-process computes and outputs a stream of results • Sometimes implemented with a shared bounded buffer as the pipe, e.g. Unix stdin and stdout • Synchronization primitives: flags, semaphores, monitors ECEN5053 SW Eng of Distributed Systems, University of Colorado
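A minimal Java sketch of a producer/consumer pipeline, assuming a bounded buffer of size 10 and a -1 end-of-stream marker (both illustrative choices): ArrayBlockingQueue supplies the synchronization, blocking put when the buffer is full and take when it is empty.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Sketch of a two-stage pipeline connected by a shared bounded buffer.
    public class Pipeline {
        public static void main(String[] args) {
            BlockingQueue<Integer> pipe = new ArrayBlockingQueue<>(10);

            Thread producer = new Thread(() -> {
                try {
                    for (int i = 0; i < 100; i++) pipe.put(i);   // output a stream of results
                    pipe.put(-1);                                // end-of-stream marker
                } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            });

            Thread consumer = new Thread(() -> {
                try {
                    int x;
                    while ((x = pipe.take()) != -1)              // consume predecessor's output
                        System.out.println(x);
                } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            });

            producer.start();
            consumer.start();
        }
    }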
Clients & Servers • Producer/consumer -- one-way flow of information • independent processes with own rates of progress • Client/server relationship is most common pattern • Client process requests a service & waits for reply • Server repeatedly waits for a request; then acts upon it and sends a reply. • Two-way flow of information ECEN5053 SW Eng of Distributed Systems, University of Colorado
Distributed “procedures” and “calls” • Client and server relationship is the concurrent programming analog of the relationship between the caller of a subroutine and the subroutine itself. • Like a subroutine that can be called from many places, the server has many clients. • Each client request must be handled independently • Multiple requests might be handled concurrently ECEN5053 SW Eng of Distributed Systems, University of Colorado
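A small Java sketch of the client/server pattern with one thread per client (the port number and echo behavior are illustrative assumptions, not from the text): the server repeatedly waits for a request; a per-client thread then acts on each request and sends a reply.

    import java.io.*;
    import java.net.ServerSocket;
    import java.net.Socket;

    // Hypothetical sketch: a server loop that handles each client in its own thread.
    public class EchoServer {
        public static void main(String[] args) throws IOException {
            try (ServerSocket server = new ServerSocket(9000)) {
                while (true) {
                    Socket client = server.accept();        // wait for a client request
                    new Thread(() -> handle(client)).start();
                }
            }
        }

        private static void handle(Socket client) {
            try (BufferedReader in = new BufferedReader(
                     new InputStreamReader(client.getInputStream()));
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                String request;
                while ((request = in.readLine()) != null) {
                    out.println(request);                   // act on the request, send a reply
                }
            } catch (IOException e) { /* client disconnected */ }
        }
    }

A client would simply open a Socket to the server's host and port and exchange lines over it; each connected client is handled independently, and multiple requests are handled concurrently.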
Common example • A common example of client/server interaction in operating systems, OO systems, networks, databases, and others is reading and writing a data file. • Assume a file server module provides 2 ops: read and write; a client process calls one or the other. • Single CPU or shared-memory system: • File server implemented as a set of subroutines and data structures that represent files • Interaction between a client process and a file typically implemented by subroutine calls ECEN5053 SW Eng of Distributed Systems, University of Colorado
Client/Server example • If the file is shared • Probably must be written to by at most one client process at a time • Can safely be read concurrently by multiple clients • Example of what is called the readers/writers problem ECEN5053 SW Eng of Distributed Systems, University of Colorado
Readers/Writers -- many facets • Has a classic solution using mutexes (in chapter 2 last semester) when viewed as a mutual exclusion problem • Can also be solved with • a condition synchronization solution • different scheduling policies • Distributed system solutions include • with encapsulated database • with replicated files • just remote procedure calls & local synchronization • just rendezvous ECEN5053 SW Eng of Distributed Systems, University of Colorado
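For the mutual-exclusion view of readers/writers, here is a Java sketch using ReentrantReadWriteLock (an illustrative choice, not the chapter-2 mutex solution): many readers may hold the read lock concurrently, while a writer holds the write lock exclusively.

    import java.util.concurrent.locks.ReadWriteLock;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    // Illustrative readers/writers sketch for a shared "file".
    class SharedFile {
        private final ReadWriteLock lock = new ReentrantReadWriteLock();
        private String contents = "";

        public String read() {
            lock.readLock().lock();        // concurrent readers allowed
            try { return contents; }
            finally { lock.readLock().unlock(); }
        }

        public void write(String text) {
            lock.writeLock().lock();       // at most one writer, and no readers
            try { contents = text; }
            finally { lock.writeLock().unlock(); }
        }
    }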
Consider a query on the WWW • A user opens a new URL within a Web browser • The Web browser is a client process that executes on a user’s machine. • The URL indirectly specifies another machine on which the Web page resides. • The Web page itself is accessed by a server process that executes on the other machine. • May already exist; may be created • Reads the page specified by the URL • Returns it to the client’s machine • Add’l server processes may be visited or created at intermediate machines along the way ECEN5053 SW Eng of Distributed Systems, University of Colorado
Clients/Servers -- on same or separate machines • Clients are processes regardless of # machines • Server • On a shared-memory machine is a collection of subroutines • With a single CPU, programmed using • mutual exclusion to protect critical sections • condition synchronization to ensure subroutines are executed in appropriate orders • Distributed-memory or network -- processes executing on a different machine than the clients • Often multithreaded with one thread per client ECEN5053 SW Eng of Distributed Systems, University of Colorado
Communication in client/server app • Shared memory -- • servers as subroutines; • use semaphores or monitors for synchronization • Distributed -- • servers as processes • communicate with clients using • message passing • remote procedure call (remote method inv.) • rendezvous ECEN5053 SW Eng of Distributed Systems, University of Colorado
Interacting peers • Occurs in distributed programs, not single CPU • Several processes that accomplish a task • executing copies of the same code (hence, “peers”) • exchanging messages • example: distributed matrix multiplication • Used to implement • Distributed parallel programs including distributed versions of iterative parallelism • Decentralized decision making ECEN5053 SW Eng of Distributed Systems, University of Colorado
Among the 5 paradigms are certain characteristics common to distributed environments: • Distributed memory • Properties of parallel applications • Concurrent computation
Distributed memory implications • Each processor can access only its own local memory • Program cannot use global variables • Every variable must be local to some process or procedure and can be accessed only by that process or procedure • Processes have to use message passing to communicate with each other ECEN5053 SW Eng of Distributed Systems, University of Colorado
Example of a parallel application • Remember concurrent matrix multiplication in a shared memory environment -- last semester? Sequential solution first:

    for [i = 0 to n-1] {
      for [j = 0 to n-1] {
        # compute inner product of a[i,*] and b[*,j]
        c[i,j] = 0.0;
        for [k = 0 to n-1]
          c[i,j] = c[i,j] + a[i,k] * b[k,j];
      }
    }

ECEN5053 SW Eng of Distributed Systems, University of Colorado
Properties of parallel applications • Two operations can be executed in parallel if they are independent. • An operation's read set contains the variables it reads but does not alter • Its write set contains the variables it alters (and possibly also reads) • Two operations are independent if the write set of each is disjoint from both the read and write sets of the other. ECEN5053 SW Eng of Distributed Systems, University of Colorado
Concurrent computation • Computing rows of the result matrix in parallel:

    cobegin [i = 0 to n-1] {
      for [j = 0 to n-1] {
        c[i,j] = 0.0;
        for [k = 0 to n-1]
          c[i,j] = c[i,j] + a[i,k] * b[k,j];
      }
    }   # coend

ECEN5053 SW Eng of Distributed Systems, University of Colorado
Differences: sequential vs. concurrent • Syntactic: • cobegin is used in place of for in the outermost loop • Semantic: • cobegin specifies that its body should be executed concurrently -- at least conceptually -- for each value of index i. ECEN5053 SW Eng of Distributed Systems, University of Colorado
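For concreteness, one possible Java rendering of the cobegin version (an illustrative sketch, not code from the text) creates one thread per row and joins them all; the join loop plays the role of the implicit coend.

    // Sketch: one thread per row i, all rows computed concurrently, then joined.
    public class ParallelMatrixMultiply {
        public static void multiply(double[][] a, double[][] b, double[][] c)
                throws InterruptedException {
            int n = a.length;
            Thread[] rows = new Thread[n];
            for (int i = 0; i < n; i++) {
                final int row = i;
                rows[i] = new Thread(() -> {
                    for (int j = 0; j < n; j++) {
                        c[row][j] = 0.0;
                        for (int k = 0; k < n; k++)
                            c[row][j] += a[row][k] * b[k][j];   // writes only row `row`: disjoint write sets
                    }
                });
                rows[i].start();
            }
            for (Thread t : rows) t.join();   // implicit "coend"
        }
    }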
Previous example implemented matrix multiplication using shared variables • Now -- two ways using message passing as means of communication • 1. Coordinator process & array of independent worker processes • 2. Workers are peer processes that interact by means of a circular pipeline ECEN5053 SW Eng of Distributed Systems, University of Colorado
[Diagram: a coordinator process sending data to Worker 0 ... Worker n-1 and collecting their results; and, in the peer arrangement, Worker 0 ... Worker n-1 exchanging data directly with one another] ECEN5053 SW Eng of Distributed Systems, University of Colorado
Assume n processors for simplicity • Use an array of n worker processes, one worker on each processor; each worker computes one row of the result matrix

    process worker[i = 0 to n-1] {
      double a[n];      # row i of matrix a
      double b[n,n];    # all of matrix b
      double c[n];      # row i of matrix c
      receive initial values for vector a and matrix b;
      for [j = 0 to n-1] {
        c[j] = 0.0;
        for [k = 0 to n-1]
          c[j] = c[j] + a[k] * b[k,j];
      }
      send result c to coordinator;
    }

ECEN5053 SW Eng of Distributed Systems, University of Colorado
Aside -- if not standalone: • The source matrices might be produced by a prior computation and the result matrix might be input to a subsequent computation. • Example of distributed pipeline. ECEN5053 SW Eng of Distributed Systems, University of Colorado
Role of coordinator • Initiates the computation and gathers and prints the results. • First sends each worker the appropriate row of a and all of b. • Waits to receive row of c from every worker. ECEN5053 SW Eng of Distributed Systems, University of Colorado
    process coordinator {
      # source matrices a and b and result matrix c are declared here
      initialize a and b;
      for [i = 0 to n-1] {
        send row i of a to worker[i];
        send all of b to worker[i];
      }
      for [i = 0 to n-1]
        receive row i of c from worker[i];
      print the results, which are now in matrix c;
    }

ECEN5053 SW Eng of Distributed Systems, University of Colorado
Message passing primitives • Send packages up a message and transmits it to another process • Receive waits for a message from another process and stores it in local variables. ECEN5053 SW Eng of Distributed Systems, University of Colorado
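On a shared-memory machine these primitives could be simulated with queues; the Java sketch below (a hypothetical Channel class, an assumption rather than the book's notation) gives each process its own channel, where send deposits a message and receive blocks until one arrives.

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // Sketch of send/receive on top of an unbounded queue owned by the receiver.
    class Channel<T> {
        private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();

        public void send(T message) {                 // package up and transmit
            queue.add(message);
        }

        public T receive() throws InterruptedException {
            return queue.take();                      // wait for a message to arrive
        }
    }

Under this sketch, the coordinator would own one channel for gathering result rows, and each worker would own a channel on which the coordinator sends its row of a and the matrix b.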
Peer approach • Each worker has one row of a & is to compute one row of c • Each worker has only one column of b at a time instead of the entire matrix • Worker i has column i of matrix b. With this much source data, worker i can compute only the result for c[i,i]. • For worker i to compute all of row i of matrix c, it must acquire all columns of matrix b. • We circulate the columns of b among the worker processes via the circular pipeline. • Each worker executes a series of rounds in which it sends its column of b to the next worker and receives a different column of b from the previous worker ECEN5053 SW Eng of Distributed Systems, University of Colorado
See handout • Each worker executes the same algorithm • Communicates with other workers in order to compute its part of the desired result. • In this case, each worker communicates with just two neighbors • In other cases of interacting peers, each worker communicates with all the others. ECEN5053 SW Eng of Distributed Systems, University of Colorado
Worker algorithm

    process worker[i = 0 to n-1] {
      double a[n];          # row i of matrix a
      double b[n];          # one column of matrix b
      double c[n];          # row i of matrix c
      double sum = 0.0;     # storage for inner products
      int nextCol = i;      # next column of results

      receive row i of matrix a and column i of matrix b;
      # compute c[i,i] = a[i,*] x b[*,i]
      for [k = 0 to n-1]
        sum = sum + a[k] * b[k];
      c[nextCol] = sum;

      # circulate columns and compute the rest of c[i,*]
      for [j = 1 to n-1] {
        send my column of b to the next worker;
        receive a new column of b from the previous worker;
        sum = 0.0;
        for [k = 0 to n-1]
          sum = sum + a[k] * b[k];
        if (nextCol == 0)
          nextCol = n-1;
        else
          nextCol = nextCol - 1;
        c[nextCol] = sum;
      }
      send result vector c to the coordinator process;
    }

ECEN5053 SW Eng of Distributed Systems, University of Colorado
Comparisons • In the first program, the values of matrix b are replicated in every worker • In the second, each worker has only one row of a and one column of b at any point in time • The first requires more memory but executes faster. • This is a classic time/space tradeoff. ECEN5053 SW Eng of Distributed Systems, University of Colorado
Summary • Concurrent programming paradigms in a shared-memory environment • Iterative parallelism • Recursive parallelism • Producers and consumers • Concurrent programming paradigms in a distributed-memory environment • Client/server • Interacting peers ECEN5053 SW Eng of Distributed Systems, University of Colorado
Shared-memory programming ECEN5053 SW Eng of Distributed Systems, University of Colorado
Shared-Variable Programming • Frowned on in sequential programs, although convenient (“global variables”) • Absolutely necessary in concurrent programs • Must communicate to work together ECEN5053 SW Eng of Distributed Systems, University of Colorado
Need to communicate • Communication fosters the need for synchronization • Mutual exclusion – processes must not access shared data at the same time • Condition synchronization – one process needs to wait for another • Communicate in a distributed environment via messages, remote procedure call, or rendezvous ECEN5053 SW Eng of Distributed Systems, University of Colorado
Some terms • State – values of the program variables at a point in time, both explicit and implicit. Each process in a program executes independently and, as it executes, examines and alters the program state. • Atomic actions -- A process executes sequential statements. Each statement is implemented at the machine level by one or more atomic actions that indivisibly examine or change program state. • Concurrent program execution interleaves sequences of atomic actions. A history is a trace of a particular interleaving. ECEN5053 SW Eng of Distributed Systems, University of Colorado
Terms -- continued • The next atomic action in any ONE of the processes could be the next one in a history. So there are many ways actions can be interleaved and conditional statements allow even this to vary. • The role of synchronization is to constrain the possible histories to those that are desirable. • Mutual exclusion combines atomic actions into sequences of actions called critical sections where the entire section appears to be atomic. ECEN5053 SW Eng of Distributed Systems, University of Colorado
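A small Java illustration of these terms (a hypothetical class, not from the text): count++ is implemented by several machine-level atomic actions (read, add, write), so interleaved histories can lose updates; declaring the method synchronized combines the sequence into a critical section that appears atomic.

    // Sketch: fine-grain atomic actions vs. a critical section.
    class Counter {
        private int count = 0;

        public void unsafeIncrement() {
            count++;                        // read, add, write: interleavings can lose updates
        }

        public synchronized void safeIncrement() {
            count++;                        // critical section: the whole update appears atomic
        }

        public synchronized int get() { return count; }
    }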
Terms – continued further • A property of a program is an attribute that is true of every possible history. • Safety – the program never enters a bad state • Liveness – the program eventually enters a good state ECEN5053 SW Eng of Distributed Systems, University of Colorado
How can we verify? • How do we demonstrate a program satisfies a property? • A dynamic execution of a test considers just one possible history • Limited number of tests unlikely to demonstrate the absence of bad histories • Operational reasoning -- exhaustive case analysis • Assertional reasoning – abstract analysis • Atomic actions are predicate transformers ECEN5053 SW Eng of Distributed Systems, University of Colorado
Assertional Reasoning • Use assertions to characterize sets of states • Allows a compact representation of states and their transformations • More on this later in the course ECEN5053 SW Eng of Distributed Systems, University of Colorado
Warning • We must be wary of dynamic testing alone • it can reveal only the presence of errors, not their absence. • Concurrent and distributed programs are difficult to test & debug • Difficult (impossible) to stop all processes at once in order to examine their state! • Each execution in general will produce a different history ECEN5053 SW Eng of Distributed Systems, University of Colorado
Why synchronize? • If processes do not interact, all interleavings are acceptable. • If processes do interact, only some interleavings are acceptable. • Role of synchronization: prevent unacceptable interleavings • Combine fine-grain atomic actions into coarse-grained composite actions (we call this ....what?) • Delay process execution until program state satisfies some predicate ECEN5053 SW Eng of Distributed Systems, University of Colorado