Introduction to MPI • p processes, each with its own address space • data is explicitly partitioned and placed • all interactions are two-sided (every send must be matched by a receive) • SPMD model (single program, multiple data) • simple and portable, but can be complicated to program
Prototypes • send(void *sendbuf, int nelems, int dest_rank) • receive(void *recvbuf, int nelems, int source_rank) • Processor 0: a = 100; send(&a, 1, 1); a = 0; • Processor 1: receive(&a, 1, 0); printf("%d\n", a);
Blocking • Nonbuffered: send does not return until the matching receive executes, and likewise for receive. This can cause idling and deadlock. • Buffered: data is copied into a buffer and send returns. Avoids idling at the expense of a copy. • Buffer sizes can have a significant impact on performance.
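The difference between nonbuffered and buffered blocking sends can be sketched with an analogy in Java. This is not MPI: it assumes a `SynchronousQueue` stands in for a nonbuffered channel (a `put` waits for a matching `take`) and an `ArrayBlockingQueue` stands in for a buffered one. The class and method names (`SendSemanticsDemo`, `sendReturnsWithoutReceiver`) are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.SynchronousQueue;

class SendSemanticsDemo {
    /**
     * Starts a sender that "sends" one value on the channel while no receiver exists.
     * Returns true if the send completed within waitMillis (must be > 0).
     */
    static boolean sendReturnsWithoutReceiver(BlockingQueue<Integer> channel, long waitMillis)
            throws InterruptedException {
        Thread sender = new Thread(() -> {
            try {
                channel.put(100); // blocking send
            } catch (InterruptedException e) {
                // the blocked send was cancelled by the caller below
            }
        });
        sender.start();
        sender.join(waitMillis);            // give the send some time to complete
        boolean returned = !sender.isAlive();
        sender.interrupt();                 // unblock the sender if it is still waiting
        sender.join();
        return returned;
    }

    public static void main(String[] args) throws InterruptedException {
        // Nonbuffered analogy: put() waits for a matching take(), so the sender idles.
        System.out.println("nonbuffered returns: "
                + sendReturnsWithoutReceiver(new SynchronousQueue<>(), 200));
        // Buffered analogy: the value is copied into the buffer and put() returns.
        System.out.println("buffered returns: "
                + sendReturnsWithoutReceiver(new ArrayBlockingQueue<>(1), 200));
    }
}
```

With no receiver present, the nonbuffered sender stays blocked (the idling case), while the buffered sender returns once the value is stored. A buffer of capacity 1 would block on a second send, which is one way buffer size affects behavior.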
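Non-Blocking • send and receive return before it is safe to reuse (send) or read (receive) the buffer • the programmer must ensure proper usage by checking for completion before touching the buffer • MPI provides both blocking and non-blocking operations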
MPI • Standard library • C and Fortran bindings • MPI_Init: initialization • MPI_Finalize: call at end • MPI_Comm: communicator, a communication domain (group of processes) • MPI_COMM_WORLD: default communicator, includes all processes • MPI_Comm_size: number of processes in the communicator • MPI_Comm_rank: index (rank) of the calling process • MPI_Send, MPI_Recv: blocking point-to-point send and receive
Hello World
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int npes, myrank;
    MPI_Init(&argc, &argv);                  /* start MPI */
    MPI_Comm_size(MPI_COMM_WORLD, &npes);    /* number of processes */
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);  /* rank of this process */
    printf("From process %d out of %d, Hello World!\n", myrank, npes);
    MPI_Finalize();                          /* shut down MPI */
    return 0;
}
Java 5 Concurrency • java.util.concurrent • wait() and notify() insufficient in some instances
No Way (with synchronized and intrinsic locks) • to back off from an attempt to acquire a lock that is already held • to give up waiting after some time • to cancel a lock attempt after an interrupt • to alter the semantics of a lock (reentrancy, read/write protection, fairness) • No access control: any method can synchronize on any accessible object • Can't acquire a lock in one method and release it in another (locking is block-structured)
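As an illustration of the first two points, here is a minimal sketch of backing off after a timeout using `ReentrantLock.tryLock` from `java.util.concurrent.locks`. The class and method names (`TryLockDemo`, `tryWork`, `attemptWhileHeld`) are hypothetical, and the lambda syntax assumes Java 8 or later.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class TryLockDemo {
    private static final ReentrantLock lock = new ReentrantLock();

    /** Waits up to timeoutMillis for the lock; returns false if it gave up. */
    static boolean tryWork(long timeoutMillis) throws InterruptedException {
        if (!lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return false;              // back off instead of blocking forever
        }
        try {
            return true;               // critical section would go here
        } finally {
            lock.unlock();
        }
    }

    /** Holds the lock in the calling thread while another thread tries to take it. */
    static boolean attemptWhileHeld(long timeoutMillis) throws InterruptedException {
        final boolean[] acquired = new boolean[1];
        lock.lock();
        try {
            Thread other = new Thread(() -> {
                try {
                    acquired[0] = tryWork(timeoutMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            other.start();
            other.join();              // the other thread times out while we hold the lock
        } finally {
            lock.unlock();
        }
        return acquired[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("while held: " + attemptWhileHeld(50));   // false
        System.out.println("after release: " + tryWork(50));         // true
    }
}
```

The contended attempt runs in a second thread because the lock is reentrant: the thread that already holds it would succeed immediately.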
java.util.concurrent • Executor: task scheduling framework • Concurrent collections • Atomic variables: atomic operations on single variables • Synchronizers: semaphores, mutexes, barriers, latches, exchangers • Locks: high-performance lock implementation (timeouts, multiple condition variables per lock, interruption of threads waiting for a lock) • Nanosecond-granularity timing: java.lang.System.nanoTime()
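A minimal sketch of the Executor framework, assuming a fixed-size thread pool and a toy workload (summing squares). The names `ExecutorDemo` and `sumOfSquares` are hypothetical; the lambda syntax assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ExecutorDemo {
    /** Computes 1^2 + 2^2 + ... + n^2, one task per term, on a pool of the given size. */
    static int sumOfSquares(int n, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final int x = i;
                futures.add(pool.submit(() -> x * x));   // submitted as a Callable<Integer>
            }
            int sum = 0;
            for (Future<Integer> f : futures) {
                sum += f.get();                           // blocks until that task finishes
            }
            return sum;
        } finally {
            pool.shutdown();                              // let worker threads exit
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumOfSquares(10, 4));          // prints 385
    }
}
```

The pool decouples task submission from thread management: the caller only deals with tasks and `Future` results.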
Synchronizers • Barrier (CyclicBarrier): threads wait for each other to reach a common barrier point. The release condition is the number of threads waiting. • CountDownLatch: waiting threads are released when the count reaches zero • Exchanger: two threads exchange objects at a rendezvous point • Semaphore: maintains a set of permits; acquire blocks until a permit is available
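The CountDownLatch behavior described above can be sketched as follows: a coordinating thread waits until every worker has counted down. The names `LatchDemo` and `runWorkers` are hypothetical, and the workers do no real work.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class LatchDemo {
    /** Starts n workers and returns how many had finished when the latch opened. */
    static int runWorkers(int n) throws InterruptedException {
        CountDownLatch done = new CountDownLatch(n);
        AtomicInteger finished = new AtomicInteger();
        for (int i = 0; i < n; i++) {
            new Thread(() -> {
                finished.incrementAndGet();   // stand-in for the worker's task
                done.countDown();             // signal completion
            }).start();
        }
        done.await();                         // released when the count reaches zero
        return finished.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runWorkers(4));    // prints 4
    }
}
```

Because each worker increments the counter before counting down, the waiting thread always observes all n completions once `await` returns.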
// consumer thread: drains the current buffer, then swaps it for a full one
try {
    while (currentBuffer != null) {
        takeFromBuffer(currentBuffer);
        if (currentBuffer.empty())
            currentBuffer = exchanger.exchange(currentBuffer);
    }
} catch (InterruptedException ex) { /* handle interruption */ }

// producer thread: fills the current buffer, then swaps it for an empty one
try {
    while (currentBuffer != null) {
        addToBuffer(currentBuffer);
        if (currentBuffer.full())
            currentBuffer = exchanger.exchange(currentBuffer);
    }
} catch (InterruptedException ex) { /* handle interruption */ }
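For a self-contained view of the rendezvous itself, here is a minimal sketch in which two threads swap one string each through an `Exchanger`. The names `ExchangerDemo` and `swap` are hypothetical, and plain strings stand in for the buffers.

```java
import java.util.concurrent.Exchanger;

class ExchangerDemo {
    /** Returns {what the calling thread received, what the helper thread received}. */
    static String[] swap(String a, String b) throws InterruptedException {
        Exchanger<String> exchanger = new Exchanger<>();
        String[] got = new String[2];
        Thread helper = new Thread(() -> {
            try {
                got[1] = exchanger.exchange(b);   // offers b, receives a
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        helper.start();
        got[0] = exchanger.exchange(a);           // offers a, receives b
        helper.join();                            // makes got[1] visible to the caller
        return got;
    }

    public static void main(String[] args) throws InterruptedException {
        String[] got = swap("empty", "full");
        System.out.println(got[0] + " " + got[1]); // prints "full empty"
    }
}
```

Each call to `exchange` blocks until the partner thread arrives, so neither side proceeds with a stale object.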
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

class X {
    private final ReentrantLock lock = new ReentrantLock();
    // ...
    public void m() {
        lock.lock(); // block until condition holds
        try {
            // ... method body
        } finally {
            lock.unlock(); // always release, even if the body throws
        }
    }
}

class AtomicCounter {
    private AtomicInteger c = new AtomicInteger(0);

    public void increment() { c.incrementAndGet(); }
    public void decrement() { c.decrementAndGet(); }
    public int value() { return c.get(); }
}
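To show why an atomic variable is enough for a shared counter, the sketch below increments one `AtomicInteger` from several threads and checks that no update is lost. The names `CounterDemo` and `countWith` are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CounterDemo {
    /** Runs the given number of threads, each incrementing a shared counter. */
    static int countWith(int threads, int incrementsPerThread) throws InterruptedException {
        AtomicInteger c = new AtomicInteger(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    c.incrementAndGet();      // atomic read-modify-write, no lock needed
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();                         // wait for all increments to finish
        }
        return c.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countWith(4, 10000));   // prints 40000
    }
}
```

With a plain `int` and `c++`, concurrent increments could overwrite each other and the total could come out lower.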