OpenMP
Open Specifications for Multi Processing
OpenMP is an API used for multi-threaded, shared-memory parallelism
• Compiler Directives
• Runtime Library Routines
• Environment Variables
• Portable
• Standardized
• Available on PSI and CITRIS
OpenMP vs. PThreads
OpenMP allows for a higher level of abstraction
• Easier to turn serial code into a parallel version via OpenMP
• OpenMP pragmas are ignored in serial compilation
• Scoping of thread-safe data is simplified
Fork/Join Parallelism
• The program starts out executing with one master thread
• Master thread forks worker threads
• Worker threads die or suspend at the end of the parallel code
Image courtesy of http://www.llnl.gov/computing/tutorials/openMP/
Simple Parallelization
    for (i = 0; i < max; i++)
        zero[i] = 0;
• The for loop must have a canonical shape for OpenMP to parallelize it
• This is necessary for the run-time system to determine the number of loop iterations
• No premature exits from the loop are allowed, e.g. break, return, exit, or goto statements
parallel for pragma
    #pragma omp parallel for
    for (i = 0; i < max; i++)
        zero[i] = 0;
• Pragmas give the compiler information it can use to optimize; here, that the loop may be run in parallel
• Master thread creates additional threads, each with a separate execution context
• All variables declared outside the parallel for pragma are shared by default, except for the loop index
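The loop from this slide can be wrapped in a function to make it compilable. This is a minimal sketch; the function name `zero_fill` and its parameters are assumptions introduced for illustration.

```c
/* Hypothetical helper: set the first `max` elements of `zero` to 0. */
void zero_fill(int *zero, int max)
{
    int i;

    /* zero and max are shared by all threads; the loop index i is
       made private to each thread automatically. */
    #pragma omp parallel for
    for (i = 0; i < max; i++)
        zero[i] = 0;
}
```

With GCC, OpenMP is enabled with the `-fopenmp` flag. Without it the pragma is ignored and the loop runs serially with the same result.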
Thread Creation
How many threads will OpenMP create?
• Defined by the OMP_NUM_THREADS environment variable
• Set this variable to the maximum number of threads you want OpenMP to use
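One way to check the effect of OMP_NUM_THREADS from inside a program is the runtime library routine `omp_get_max_threads()`. The sketch below assumes a hypothetical wrapper named `max_threads`, and guards the OpenMP header with the standard `_OPENMP` macro so the file still compiles serially.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: upper bound on the number of threads that the
   next parallel region may use. */
int max_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();  /* reflects OMP_NUM_THREADS when it is set */
#else
    return 1;                      /* compiled without OpenMP support */
#endif
}
```

Running the program as, for example, `OMP_NUM_THREADS=4 ./a.out` would then typically make `max_threads()` report 4, although the exact behavior depends on the implementation.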
Private Variables
    for (i = 0; i < height; i++)
        for (j = 0; j < width; j++)
            c[i][j] = 2;
• Want to parallelize the outer loop as well as the inner
• What's the problem with placing a parallel for pragma above the outer loop?
private Clause
Need to declare j a private variable
• Use a private clause to create a private copy of j for each thread inside the loop
    #pragma omp parallel for private(j)
    for (i = 0; i < height; i++)
        for (j = 0; j < width; j++)
            c[i][j] = 2;
• Value of j is undefined at start and exit of loop
• What if we need to initialize a private variable?
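The private clause can be tried out in a complete function. This is a sketch under assumptions: the fixed sizes `HEIGHT` and `WIDTH` and the function name `fill_twos` are invented for the example.

```c
#define HEIGHT 4   /* assumed size, for illustration only */
#define WIDTH  6   /* assumed size, for illustration only */

/* Hypothetical helper: set every element of c to 2. */
void fill_twos(int c[HEIGHT][WIDTH])
{
    int i, j;

    /* i is private automatically as the parallel loop index.
       j is declared outside the loop, so it would be shared by default;
       private(j) gives every thread its own copy. */
    #pragma omp parallel for private(j)
    for (i = 0; i < HEIGHT; i++)
        for (j = 0; j < WIDTH; j++)
            c[i][j] = 2;
}
```

Without `private(j)`, several threads would increment the same `j`, so some elements could be skipped or written more than once.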
private Variants
• firstprivate: private variables with initial values copied from the master thread's copy
• lastprivate: the value from the sequentially last iteration of the loop is copied into the master thread's copy of the variable
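Both clauses can be combined on one loop. The following is a minimal sketch, assuming a hypothetical function `last_value` and assuming `n >= 1` so that a sequentially last iteration exists.

```c
/* Hypothetical helper: returns offset + (n - 1). Assumes n >= 1. */
int last_value(int n, int offset)
{
    int i;
    int last = -1;

    /* firstprivate(offset): each thread's copy starts with the caller's value.
       lastprivate(last): after the loop, last holds the value assigned in
       the sequentially last iteration (i == n - 1). */
    #pragma omp parallel for firstprivate(offset) lastprivate(last)
    for (i = 0; i < n; i++)
        last = offset + i;

    return last;
}
```

For instance, `last_value(10, 5)` yields 14 whether or not the loop actually ran in parallel.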
schedule clause
Can help OpenMP decide how to handle parallelism
    schedule(type [,chunk])
• Types
• Static – Iterations are divided into chunks of size chunk, if specified, and statically assigned to threads
• Dynamic – Iterations are divided into chunks of size chunk, if specified, and dynamically scheduled among threads
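Dynamic scheduling tends to help when iterations take unequal time. The sketch below uses an invented workload (triangular numbers computed by a naive inner loop, so later iterations cost more); the function name `triangular` and the chunk size of 4 are assumptions chosen for illustration.

```c
/* Hypothetical helper: out[i] = 0 + 1 + ... + i, for 0 <= i < n.
   The cost of iteration i grows with i, so the load is uneven. */
void triangular(long *out, int n)
{
    int i;

    /* Chunks of 4 iterations are handed to threads as they become free. */
    #pragma omp parallel for schedule(dynamic, 4)
    for (i = 0; i < n; i++) {
        long sum = 0;   /* declared inside the loop body, so private */
        for (int k = 1; k <= i; k++)
            sum += k;
        out[i] = sum;
    }
}
```

Replacing `dynamic` with `static` keeps the results identical and changes only how iterations are assigned to threads.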
General Parallelism
Can indicate an entire block of code to execute in parallel
• Use #pragma omp parallel before a single line or a block of code enclosed by curly braces
• All threads, including the master thread, will execute everything in the block
• Reduces overhead by forking once for a set of parallel chunks
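To illustrate forking once for several parallel chunks, the sketch below places two loops inside one parallel block. It relies on the `#pragma omp for` work-sharing directive, which these slides do not cover, and the function name `init_two` is an assumption.

```c
/* Hypothetical helper: fill a with 1s and b with 2s, n elements each. */
void init_two(int *a, int *b, int n)
{
    /* Threads are forked once here and reused for both loops. */
    #pragma omp parallel
    {
        int i;   /* declared inside the block, so private to each thread */

        /* omp for divides the iterations among the existing threads. */
        #pragma omp for
        for (i = 0; i < n; i++)
            a[i] = 1;

        #pragma omp for
        for (i = 0; i < n; i++)
            b[i] = 2;
    }
}
```

Writing two separate `parallel for` loops would give the same arrays but would open and close a parallel region twice.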
atomic directive
When updating a shared variable, may need to do so atomically
    int i = 0;
    #pragma omp parallel
    {
        :
        i++;
        :
    }
• A thread might be swapped out after reading the value of i but before storing the new value
• Use the atomic directive to ensure that the update completes as one indivisible operation, so no other thread can interfere between the read and the store
    #pragma omp atomic
    i++;
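A shared counter makes the race easy to see. This is a minimal sketch with an assumed function name `count_iterations`; it uses a parallel for loop rather than the plain parallel block shown on the slide.

```c
/* Hypothetical helper: counts loop iterations through a shared counter. */
int count_iterations(int n)
{
    int count = 0;   /* shared by all threads */
    int i;

    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        /* Without atomic, two threads could read the same old value
           and one of the increments would be lost. */
        #pragma omp atomic
        count++;
    }

    return count;
}
```

With the atomic directive in place the function returns `n`; removing it may produce a smaller value when several threads are running.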
single directive
Useful when entering sections of code that are not thread-safe (e.g. I/O)
• Place a single directive before the block of code that only one thread should perform
• Other threads wait at the end of the single block for the executing thread to finish
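The single directive might be used as follows to print a message once from inside a parallel block. The function name `run_once` and the message text are assumptions made for this sketch.

```c
#include <stdio.h>

/* Hypothetical helper: returns how many times the single block ran. */
int run_once(void)
{
    int executed = 0;   /* shared; written only inside the single block */

    #pragma omp parallel
    {
        /* Exactly one thread executes this block. */
        #pragma omp single
        {
            printf("printed by one thread only\n");
            executed++;
        }
        /* The other threads wait here until the single block finishes. */
    }

    return executed;
}
```

The function returns 1 regardless of how many threads the parallel block creates.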