High Productivity Languages for Parallel Programming Compared to MPI

Scott Spetka – SUNYIT and ITT Corp
Haris Hadzimujic – SUNY Institute of Technology
Stephen Peek – Binghamton University
Christopher Flynn – Air Force Research Laboratory, Information Directorate

HPC Users Group Conference, Seattle, WA, July 15–17, 2008
Outline

• Introduction to Chapel, X10, MPI
• Pub/Sub Case Study
• Language Examples
• Conclusion
DoD HPCS

• Goal: improved programmer productivity
• Latest Chapel release: version 0.775, March 2008 – adds remote processing support (http://chapel.cs.washington.edu/)
• New X10 language report: version 1.7, June 18, 2008
Introduction to Chapel, X10, MPI

• Data distribution – Partitioned Global Address Space (PGAS)
• Communication model – one-sided vs. two-sided (contrasted in the C/MPI sketch below)
• Synchronization – sync variables (Chapel), clocks (X10), atomic sections (both)
• Parallel threads – async and futures (X10), cobegin (Chapel)
• Performance – prototypes demonstrate the features; performance targeted for 2010
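To make the communication-model contrast concrete, here is a minimal C/MPI sketch (not from the original slides) that moves the same integer both ways: first with a matched MPI_Send/MPI_Recv pair, then with an MPI-2 one-sided MPI_Put into an exposed window, where the target issues no matching receive. It assumes exactly two ranks.

/* Minimal sketch contrasting two-sided and one-sided MPI communication.
   Illustrative only; assumes the job is launched with exactly 2 ranks. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, val = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Two-sided: both ranks participate explicitly in the transfer. */
    if (rank == 0) {
        val = 42;
        MPI_Send(&val, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&val, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("two-sided: rank 1 received %d\n", val);
    }

    /* One-sided (MPI-2 RMA): rank 0 writes directly into rank 1's
       exposed window; rank 1 posts no receive at all. */
    int win_buf = 0;
    MPI_Win win;
    MPI_Win_create(&win_buf, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);
    MPI_Win_fence(0, win);
    if (rank == 0) {
        val = 99;
        MPI_Put(&val, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
    }
    MPI_Win_fence(0, win);
    if (rank == 1)
        printf("one-sided: rank 1's window now holds %d\n", win_buf);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}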
PGAS vs. Fragmented

Global-view (PGAS) version:

var n: int = 1000;
var A, B: [1..n] float;
forall i in 2..n-1 do
  B(i) = (A(i-1) + A(i+1)) / 2;

Fragmented version:

var n: int = 1000;
var locN: int = n / numTasks;
var A, B: [0..locN+1] float;
var myItLo: int = 1;
var myItHi: int = locN;
if (iHaveLeftNeighbor) then
  send(left, A(1));
else
  myItLo = 2;
if (iHaveRightNeighbor) {
  send(right, A(locN));
  recv(right, A(locN+1));
} else
  myItHi = locN-1;
if (iHaveLeftNeighbor) then
  recv(left, A(0));
forall i in myItLo..myItHi do
  B(i) = (A(i-1) + A(i+1)) / 2;

Source: B.L. Chamberlain (Cray), D. Callahan (Microsoft), H.P. Zima (JPL / U. of Vienna, Austria), International Journal of High Performance Computing Applications, August 2007
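For comparison with the fragmented pseudocode, a hedged C/MPI rendering of the same three-point stencil is sketched below. It is not from the slides; variable names mirror the pseudocode, it assumes the problem size divides evenly among ranks, and error handling is omitted.

/* Each rank owns locN interior elements plus two halo cells (A[0] and
   A[locN+1]) and exchanges boundary values with its neighbors before
   applying the stencil. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    int rank, size;
    const int n = 1000;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int locN = n / size;                 /* assume size divides n evenly */
    double *A = calloc(locN + 2, sizeof(double));
    double *B = calloc(locN + 2, sizeof(double));

    int left = rank - 1, right = rank + 1;
    int myItLo = (rank == 0)        ? 2        : 1;   /* no left neighbor  */
    int myItHi = (rank == size - 1) ? locN - 1 : locN; /* no right neighbor */

    /* Halo exchange: send my boundary cells, receive the neighbors'. */
    if (rank > 0)
        MPI_Sendrecv(&A[1], 1, MPI_DOUBLE, left, 0,
                     &A[0], 1, MPI_DOUBLE, left, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    if (rank < size - 1)
        MPI_Sendrecv(&A[locN], 1, MPI_DOUBLE, right, 0,
                     &A[locN + 1], 1, MPI_DOUBLE, right, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    for (int i = myItLo; i <= myItHi; i++)
        B[i] = (A[i - 1] + A[i + 1]) / 2.0;

    free(A); free(B);
    MPI_Finalize();
    return 0;
}

Note how the bookkeeping that the global-view forall hides (local bounds, halo cells, neighbor exchange) all reappears explicitly, which is exactly the slide's point.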
Global vs. Local View

(figure from: B.L. Chamberlain (Cray), D. Callahan (Microsoft), H.P. Zima (JPL / U. of Vienna, Austria), International Journal of High Performance Computing Applications, August 2007)
Pub/Sub Introduction

• Publisher – publishes XML documents
• Pubcatcher – feeds publications into the brokers
• Subscriber – submits XPath subscriptions
• Broker – matches subscriptions against publications (see the matching-loop sketch below)
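To illustrate the broker's role, here is a minimal C sketch of the matching loop. It is not code from the original system: a plain substring test stands in for real XPath evaluation, and the types and helper names are hypothetical.

/* Minimal sketch of a broker match loop; strstr() is a stand-in
   for XPath matching. */
#include <stdio.h>
#include <string.h>

#define MAX_SUBS 16

typedef struct {
    int  subscriber_id;
    char xpath[128];   /* stored subscription expression */
} Subscription;

static Subscription subs[MAX_SUBS];
static int num_subs = 0;

/* Broker: compare one publication against every stored subscription. */
void broker_match(const char *pub_xml) {
    for (int s = 0; s < num_subs; s++) {
        if (strstr(pub_xml, subs[s].xpath) != NULL)   /* stand-in match */
            printf("deliver to subscriber %d: %s\n",
                   subs[s].subscriber_id, pub_xml);
    }
}

int main(void) {
    subs[num_subs++] = (Subscription){ 1, "<temp>" };
    subs[num_subs++] = (Subscription){ 2, "<wind>" };
    broker_match("<report><temp>21</temp></report>");  /* matches sub 1 */
    return 0;
}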
Pub/Sub Model – PGAS
Pub/Sub Model – Fragmented
Chapel

type elemType = int(32);
config const numPublishers = 2,
             numBrokers = 2,
             bufferSize = 12;

// Circular publication buffer, cyclically distributed across locales.
const ProblemSpace: domain(1) distributed(Cyclic) = [0..bufferSize-1];
var buff: [ProblemSpace] elemType;

// Full/empty sync variables guarding the next free and next full slots.
var nextFreeSlot$: sync int = 1;
var nextFullSlot$: sync int = 1;

def main() {
  cobegin {   // run publishers and brokers concurrently
    coforall i in 1..numPublishers { publisher(i); }
    coforall i in 1..numBrokers { broker(i); }
  }
}
Chapel

def publisher(id: int) {
  var pub = infile.read(int);
  for slot in getNextFreeSlot() {
    writeln("Publisher:", id, " published:", pub, " in slot:", slot);
    buff(slot) = pub;
    sleep(3);
    pub = infile.read(int);
  }
}
Chapel

def getNextFreeSlot() {   // access the next free message queue slot
  while (1) {
    const locFree = nextFreeSlot$;                // consume sync var
    const nextFree = (locFree + 1) % bufferSize;
    if (nextFree == nextFullSlot$.readXX()) {
      // we wrapped around, so don't yield anything, but allow others to
      // continue by refilling the sync var with the same value
      nextFreeSlot$ = locFree;
    } else {
      nextFreeSlot$ = nextFree;   // refill sync var with advanced value
      yield locFree;              // yield the free slot that we grabbed
    }
  }
}
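For readers more familiar with threads than sync variables, the following pthreads sketch (an assumption, not the authors' code) shows roughly what nextFreeSlot$ and nextFullSlot$ buy: a mutex-protected circular buffer in which claiming a free slot blocks while the buffer is full.

/* Hypothetical pthreads analogue of the Chapel sync-variable buffer. */
#include <pthread.h>
#include <stdio.h>

#define BUFFER_SIZE 12

static int buff[BUFFER_SIZE];
static int next_free = 1, next_full = 1;   /* mirrors nextFreeSlot$/nextFullSlot$ */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  not_full = PTHREAD_COND_INITIALIZER;

/* Claim the next free slot, blocking while the buffer is full; this is
   the role played by consuming and refilling nextFreeSlot$ above. A
   broker-side consumer would advance next_full and signal not_full. */
int get_next_free_slot(void) {
    pthread_mutex_lock(&lock);
    while ((next_free + 1) % BUFFER_SIZE == next_full)
        pthread_cond_wait(&not_full, &lock);   /* wrapped around: wait */
    int slot = next_free;
    next_free = (next_free + 1) % BUFFER_SIZE;
    pthread_mutex_unlock(&lock);
    return slot;
}

void publish(int id, int pub) {
    int slot = get_next_free_slot();
    buff[slot] = pub;
    printf("Publisher:%d published:%d in slot:%d\n", id, pub, slot);
}

int main(void) {
    publish(1, 42);
    return 0;
}

The Chapel version needs no explicit mutex or condition variable; the full/empty semantics of the sync variables provide both.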
X10

// Declaration of a global one-dimensional array that will be distributed.
// Cyclic distribution defined over the region of A.
final static int [.] A = new int [[1:8]] (point[i]) { return i*10; };
final static dist d = dist.factory.cyclic(A.region);

public static void main(String args[]) {
  System.out.println("\n\nTotal places: " + place.MAX_PLACES + "\n");
  System.out.println("ID of the distribution: " + here + "\n");
  finish ateach (final point p : d) {
    System.out.println("Execution place: " + d[p] + " and value: " + A[p]);
  }
  subscription(1);
  subscription(2);
} // end main
X10

static void subscription(final int i) {
  foreach (point p : d) {
    async (d[p]) {   // run at the place that owns element p
      switch (i) {
        case 1:
          if (A[p] > 40) {
            A[p] = A[p] + 1;
            System.out.println("Location " + here + " value " + A[p]);
          }
          break;
        case 2:
          if (A[p] < 40) {
            A[p] = A[p] - 1;
            System.out.println("Location " + here + " value " + A[p]);
          }
          break;
        default:
          break;
      } // switch
    } // async
  } // foreach
} // subscription
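An MPI programmer expresses the same owner-computes pattern by hand. A hedged C/MPI analogue of the ateach/foreach code above (not from the slides; it assumes the same cyclic layout and initial values) might look like:

/* Each rank applies the "subscription" predicates only to the
   elements it owns under a cyclic distribution. */
#include <mpi.h>
#include <stdio.h>

#define N 8

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int A[N];
    for (int i = 0; i < N; i++)
        A[i] = (i + 1) * 10;   /* same initial values as the X10 array A */

    /* Cyclic ownership: rank r owns elements i with i % size == r. */
    for (int i = 0; i < N; i++) {
        if (i % size != rank) continue;
        if (A[i] > 40) {                  /* subscription(1) */
            A[i] = A[i] + 1;
            printf("Location %d value %d\n", rank, A[i]);
        } else if (A[i] < 40) {           /* subscription(2) */
            A[i] = A[i] - 1;
            printf("Location %d value %d\n", rank, A[i]);
        }
    }

    MPI_Finalize();
    return 0;
}

In X10, the cyclic dist and the async at d[p] take care of ownership and placement; here both are recomputed manually on every rank.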
MPI

// Get attribute to determine whether the current process is to store data.
MPI_Attr_get(next_comm, NEXT, &next_store_ptr, &flag);
// Agree on the highest-ranked candidate, wrapping around the ring.
MPI_Allreduce(next_store_ptr, &next_rank, 1, MPI_INT, MPI_MAX, next_comm);
next_rank = next_rank % size;
MPI

if (my_rank == next_rank) {
    MPI_Probe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
    // Record the next storage rank in the communicator attribute.
    if ((next_rank + 1) < size) {
        *next_ptr = next_rank + 1;
    } else {
        *next_ptr = next_rank + 2;
    }
    MPI_Attr_put(next_comm, NEXT, next_ptr);
    printf("stored on process %i\n", next_rank);
    MPI_Recv(&data_recv, 1, MPI_INT, status.MPI_SOURCE, status.MPI_TAG,
             MPI_COMM_WORLD, &status);
    data_store[count][0] = data_recv;
    data_store[count][1] = status.MPI_TAG;
    count++;
}
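The sending side is not shown on the slides; a hedged sketch of the producer that this MPI_Probe/MPI_Recv pairs with (variable names and payload assumed, slotting into the same program) could be:

/* Hypothetical producer side: any non-storing rank publishes a value
   to the current storage rank, using the tag as a sequence number. */
if (my_rank != next_rank) {
    int data = my_rank * 100 + count;   /* assumed payload */
    MPI_Send(&data, 1, MPI_INT, next_rank, count, MPI_COMM_WORLD);
}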
Conclusion

• HPCS languages reduce time to solution
• Object-oriented features – user-defined distributions, reductions, scans
• Global synchronization
• One-sided communication
• Easy addition of new parallel tasks
Acknowledgements

Bradford Chamberlain, Cray
Igor Peshansky, IBM