Message Passing Fundamentals Self Test
Self Test • In a shared memory computer, the processors have access to: • the memory of other nodes via a proprietary high-speed communications network • a directives-based data-parallel language • a global memory space • communication time
Self Test • A domain decomposition strategy turns out not to be the most efficient algorithm for a parallel program when: • data can be divided into pieces of approximately the same size. • the pieces of data assigned to the different processes require greatly different lengths of time to process. • one needs the advantage of maintaining a single flow of control. • one must parallelize a finite differencing scheme.
Self Test • In the message passing approach: • serial code is made parallel by adding directives that tell the compiler how to distribute data and work across the processors. • details of how data distribution, computation, and communications are to be done are left to the compiler. • the approach is not very flexible. • it is left up to the programmer to explicitly divide data and work across the processors as well as manage the communications among them.
Self Test • Total execution time does not involve: • computation time. • compiling time. • communications time. • idle time.
Self Test • One can minimize idle time by: • occupying a process with one or more new tasks while it waits for communication to finish so it can proceed on another task. • always using nonblocking communications. • never using nonblocking communications. • frequent use of barriers.
Matching Question • Match each term (1–10) with its definition (a–j):
1. Message passing
2. Domain decomposition
3. Idle time
4. Load balancing
5. Directives-based data-parallel language
6. Distributed memory
7. Shared memory
8. Computation time
9. Functional decomposition
10. Communication time
a. Each node has rapid access to its own local memory and access to the memory of other nodes via some sort of communications network.
b. Multiple processor units share access to a global memory space via a high-speed memory bus.
c. Data are divided into pieces of approximately the same size and then mapped to different processors.
d. The problem is decomposed into a large number of smaller tasks, which are then assigned to the processors as they become available.
e. Serial code is made parallel by adding directives that tell the compiler how to distribute data and work across the processors.
f. The programmer explicitly divides data and work across the processors and manages the communications among them.
g. Dividing the work equally among the available processes.
h. The time spent performing computations on the data.
i. The time a process spends waiting for data from other processors.
j. The time for processes to send and receive messages.
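The domain decomposition definition above can be made concrete with a minimal sketch. This is plain Python with no real message-passing library; `decompose` and `nprocs` are illustrative names, not part of any MPI API. It only shows how data are divided into pieces of approximately the same size before being mapped to different processors.

```python
def decompose(data, nprocs):
    """Split `data` into `nprocs` contiguous chunks of near-equal size."""
    n = len(data)
    base, extra = divmod(n, nprocs)  # minimum chunk size, plus leftover elements
    chunks = []
    start = 0
    for rank in range(nprocs):
        # the first `extra` ranks each take one extra element
        size = base + (1 if rank < extra else 0)
        chunks.append(data[start:start + size])
        start += size
    return chunks

chunks = decompose(list(range(10)), 3)
print([len(c) for c in chunks])  # -> [4, 3, 3]: sizes differ by at most one
```

Note that equal-sized pieces give good load balancing only when each piece takes about the same time to process, which is exactly the caveat raised in the earlier self-test question.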
Course Problem • The initial problem implements a parallel search of an extremely large (several thousand elements) integer array. The program finds all occurrences of a certain integer, called the target, and writes all the array indices where the target was found to an output file. In addition, the program reads both the target value and all the array elements from an input file. • Using these concepts of parallel programming, write a description of a parallel approach to solving the problem described above. (No coding is required for this exercise.)
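Although the exercise itself requires no coding, one plausible approach can be sketched: divide the array among the processes by domain decomposition, have each process search its own chunk, and gather the global indices where the target occurs. The sketch below simulates this serially in plain Python; `parallel_search` is a hypothetical name, and the loop over `rank` stands in for what real MPI processes would do concurrently.

```python
def parallel_search(array, target, nprocs):
    """Simulate a domain-decomposition search: each 'process' scans its
    own contiguous chunk and reports the global indices of the target."""
    n = len(array)
    base, extra = divmod(n, nprocs)
    found = []
    start = 0
    for rank in range(nprocs):
        size = base + (1 if rank < extra else 0)
        # worker `rank` searches array[start:start + size]
        for i in range(start, start + size):
            if array[i] == target:
                found.append(i)  # in real MPI this would be sent to the master
        start += size
    return found

print(parallel_search([5, 3, 5, 7, 5], 5, 2))  # -> [0, 2, 4]
```

In an actual message-passing program, a master process would read the target and array from the input file, scatter the chunks to the workers, collect each worker's local results (offset to global indices), and write the combined index list to the output file.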