Lecture 3 Posix Threads. CSCE 713 Advanced Computer Architecture. Topics Pthreads Readings. January 17, 2012. Overview. Last Time Finish Slides from Lecture 1 NP-Completeness and the Dwarves Posix Pthreads Readings for today
Lecture 3 POSIX Threads CSCE 713 Advanced Computer Architecture • Topics • Pthreads • Readings January 17, 2012
Overview • Last Time • Finish Slides from Lecture 1 • NP-Completeness and the Dwarves • POSIX Pthreads • Readings for today • Chapter 23, Threads, from UNIX Network Programming, vol. 1, 2nd ed., Richard Stevens • Chapters 11 and 12, Advanced Programming in the UNIX Environment, 2nd ed., Richard Stevens http://www.kohala.com/start/ • CSE Department Unix machines: /class/csce713-006 • New • Lawrence Livermore National Laboratory Pthreads tutorial • hello.c, hello_arg2.c • POSIX Pthreads • Next time: performance evaluation, barriers and MPI intro
CSAPP – Bryant O’Hallaron • Computer Systems: A Programmer’s Perspective, Bryant and O’Hallaron
Books by Richard Stevens • UNIX Network Programming, Volume 2, Second Edition: Interprocess Communications, Prentice Hall, 1999. • UNIX Network Programming, Volume 1, Second Edition: Networking APIs: Sockets and XTI, Prentice Hall, 1998. • TCP/IP Illustrated, Volume 3: TCP for Transactions, HTTP, NNTP, and the UNIX Domain Protocols, Addison-Wesley, 1996. • TCP/IP Illustrated, Volume 2: The Implementation, Addison-Wesley, 1995. • TCP/IP Illustrated, Volume 1: The Protocols, Addison-Wesley, 1994. • Advanced Programming in the UNIX Environment, Addison-Wesley, 1992. • UNIX Network Programming, Prentice Hall, 1990.
Network Programming vol 1 2nd edition • Chapter 23, not Chapter 25; Network Prog. vol. 1, not APUE • 23.1 Introduction • 23.2 Basic Thread Functions: Creation and Termination • 23.3 str_cli Function Using Threads • 23.4 TCP Echo Server Using Threads • 23.5 Thread-Specific Data • 23.6 Web Client and Simultaneous Connections (Cont.) • 23.7 Mutexes: Mutual Exclusion • 23.8 Condition Variables • 23.9 Web Client and Simultaneous Connections (Cont.) • 23.10 Summary • Note: chapters 11 and 12 of APUE2 (Adv. Prog. Unix Env. 2nd edition) might be better
POSIX Threads Programming (LLNL) • Author: Blaise Barney, Lawrence Livermore National Laboratory • /usr/class/csce713-006 • Code/LLNL • hello.c • hello_arg2.c • !? hello_arg3.c • join.c • condvar.c • dotprod_serial.c • dotprod_mutex.c • stack.c
Example • /* http://en.wikipedia.org/wiki/POSIX_Threads*/ • #include <pthread.h> • #include <stdio.h> • #include <stdlib.h> • #include <assert.h> • #define NUM_THREADS 5 /* http://en.wikipedia.org/wiki/POSIX_Threads*/
void *TaskCode(void *argument) • { • int tid; • tid = *((int *) argument); • printf("Hello World! It's me, thread %d!\n", tid); • /* optionally: insert more useful stuff here */ • return NULL; • } • int main (int argc, char *argv[]) • { • pthread_t threads[NUM_THREADS]; • int thread_args[NUM_THREADS]; • int rc, i; /* http://en.wikipedia.org/wiki/POSIX_Threads*/
/* create all threads */ • for (i=0; i<NUM_THREADS; ++i) { • thread_args[i] = i; • printf("In main: creating thread %d\n", i); • rc = pthread_create(&threads[i], NULL, TaskCode, (void *) &thread_args[i]); • assert(0 == rc); • } • /* wait for all threads to complete */ • for (i=0; i<NUM_THREADS; ++i) { • rc = pthread_join(threads[i], NULL); • assert(0 == rc); • } • exit(EXIT_SUCCESS); • } /* http://en.wikipedia.org/wiki/POSIX_Threads*/
Designing Threaded Programs • What type of parallel programming model to use? • Problem partitioning • Load balancing • Communications • Data dependencies • Synchronization and race conditions • Memory issues • I/O issues • Program complexity • Programmer effort/costs/time https://computing.llnl.gov/tutorials/pthreads/
Scheduling Independent Routines https://computing.llnl.gov/tutorials/pthreads/
Programs suitable for Multithreading • Work that can be executed, or data that can be operated on, by multiple tasks simultaneously • Block for potentially long I/O waits • Use many CPU cycles in some places but not others • Must respond to asynchronous events • Some work is more important than other work (priority interrupts) https://computing.llnl.gov/tutorials/pthreads/
Common models for threaded programs • Manager/worker: • a single thread, the manager, assigns work to other threads, the workers. Typically, the manager handles all input and parcels out work to the other tasks. At least two forms of the manager/worker model are common: static worker pool and dynamic worker pool. • Pipeline: • a task is broken into a series of suboperations, each of which is handled in series, but concurrently, by a different thread. • Peer: • similar to the manager/worker model, but after the main thread creates other threads, it participates in the work. https://computing.llnl.gov/tutorials/pthreads/
Shared Memory Model https://computing.llnl.gov/tutorials/pthreads/
Thread-safeness • Thread-safeness: in a nutshell, refers to an application's ability to execute multiple threads simultaneously without "clobbering" shared data or creating "race" conditions. • For example, suppose that your application creates several threads, each of which makes a call to the same library routine: • This library routine accesses/modifies a global structure or location in memory. • As each thread calls this routine, it is possible that they may try to modify this global structure/memory location at the same time. • If the routine does not employ some sort of synchronization construct to prevent data corruption, then it is not thread-safe. https://computing.llnl.gov/tutorials/pthreads/
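The scenario above can be sketched in C. This is a minimal illustration, not code from the LLNL tutorial: `shared_counter`, `safe_increment` and `run_counter_demo` are hypothetical names, with the counter standing in for the "global structure" and the mutex for the "synchronization construct".

```c
#include <pthread.h>

/* Hypothetical shared state standing in for the "global structure". */
static long shared_counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Thread-safe "library routine": the mutex serializes the
 * read-modify-write, so concurrent callers cannot lose updates. */
static void safe_increment(void)
{
    pthread_mutex_lock(&counter_lock);
    shared_counter++;
    pthread_mutex_unlock(&counter_lock);
}

/* Each worker calls the routine `iterations` times. */
static void *worker(void *arg)
{
    long iterations = *(long *)arg;
    for (long i = 0; i < iterations; i++)
        safe_increment();
    return NULL;
}

/* Runs nthreads workers (at most 64) and returns the final counter
 * value, or -1 if the arguments are invalid or a thread could not
 * be created. */
long run_counter_demo(int nthreads, long iterations)
{
    pthread_t tids[64];
    int created = 0;

    if (nthreads < 1 || nthreads > 64)
        return -1;

    shared_counter = 0;
    for (; created < nthreads; created++)
        if (pthread_create(&tids[created], NULL, worker, &iterations) != 0)
            break;
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);

    return created == nthreads ? shared_counter : -1;
}
```

Calling `run_counter_demo(4, 10000)` should yield 40000. If the lock and unlock calls are removed from `safe_increment`, the increments race and the result will typically come up short on a multicore machine.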
The Pthreads API https://computing.llnl.gov/tutorials/pthreads/
Creating and Terminating Threads https://computing.llnl.gov/tutorials/pthreads/
Thread Termination https://computing.llnl.gov/tutorials/pthreads/
Mutex Variables • pthread_mutex_lock (mutex) • pthread_mutex_trylock (mutex) • pthread_mutex_unlock (mutex) https://computing.llnl.gov/tutorials/pthreads/
The pthread_mutex_lock() routine is used by a thread to acquire a lock on the specified mutex variable. If the mutex is already locked by another thread, this call will block the calling thread until the mutex is unlocked. https://computing.llnl.gov/tutorials/pthreads/
pthread_mutex_trylock() will attempt to lock a mutex. However, if the mutex is already locked, the routine will return immediately with a "busy" error code (EBUSY). This routine may be useful in preventing deadlock conditions, as in a priority-inversion situation. • pthread_mutex_unlock() will unlock a mutex if called by the owning thread. Calling this routine is required after a thread has completed its use of protected data if other threads are to acquire the mutex for their work with the protected data. An error will be returned if: • the mutex was already unlocked • the mutex is owned by another thread • (POSIX guarantees these errors only for error-checking mutexes; for a default mutex the behavior is undefined) https://computing.llnl.gov/tutorials/pthreads/
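A small sketch of the non-blocking pattern that pthread_mutex_trylock() enables. The names `data_lock`, `protected_value`, `try_update` and `read_value` are made up for illustration; only the pthread calls and the EBUSY return code come from the Pthreads API.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical protected data and its mutex. */
static pthread_mutex_t data_lock = PTHREAD_MUTEX_INITIALIZER;
static int protected_value = 0;

/* Try to store new_value without blocking.
 * Returns 1 if the update happened, 0 if the mutex was busy
 * (the caller can do other work and retry), -1 on any other error. */
int try_update(int new_value)
{
    int rc = pthread_mutex_trylock(&data_lock);
    if (rc == EBUSY)
        return 0;               /* someone else holds the lock */
    if (rc != 0)
        return -1;              /* unexpected failure */

    protected_value = new_value;
    pthread_mutex_unlock(&data_lock);
    return 1;
}

/* Blocking read of the protected value, for comparison. */
int read_value(void)
{
    int v;
    pthread_mutex_lock(&data_lock);
    v = protected_value;
    pthread_mutex_unlock(&data_lock);
    return v;
}
```

A caller that gets 0 back from `try_update` has not waited at all, which is the property that lets a thread back off instead of blocking while it holds other locks.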
Hello.c from LLNL tutorial • #include <pthread.h> • #include <stdio.h> • #include <stdlib.h> • #include <assert.h> • #define NUM_THREADS 5 • void *TaskCode(void *argument) • { • int tid; • tid = *((int *) argument); • printf("Hello World! It's me, thread %d!\n", tid); • /* optionally: insert more useful stuff here */ • return NULL; • } https://computing.llnl.gov/tutorials/pthreads/
int main (int argc, char *argv[]) • { • pthread_t threads[NUM_THREADS]; • int thread_args[NUM_THREADS]; • int rc, i; • /* create all threads */ • for (i=0; i<NUM_THREADS; ++i) { • thread_args[i] = i; • printf("In main: creating thread %d\n", i); • rc = pthread_create(&threads[i], NULL, TaskCode, (void *) &thread_args[i]); • assert(0 == rc); • } • /* wait for all threads to complete */ • for (i=0; i<NUM_THREADS; ++i) { • rc = pthread_join(threads[i], NULL); • assert(0 == rc); • } • exit(EXIT_SUCCESS); • } https://computing.llnl.gov/tutorials/pthreads/
Hello output - nondeterminism of time • saluda> ./thread1 • In main: creating thread 0 • In main: creating thread 1 • Hello World! It's me, thread 0! • Hello World! It's me, thread 1! • In main: creating thread 2 • In main: creating thread 3 • Hello World! It's me, thread 2! • In main: creating thread 4 • Hello World! It's me, thread 3! • Hello World! It's me, thread 4! https://computing.llnl.gov/tutorials/pthreads/
Hello_args2.c - declarations • #include <pthread.h> • #include <stdio.h> • #include <stdlib.h> • #include <unistd.h> /* for sleep() */ • #define NUM_THREADS 8 • char *messages[NUM_THREADS]; • struct thread_data • { • int thread_id; • int sum; • char *message; • }; • struct thread_data thread_data_array[NUM_THREADS]; https://computing.llnl.gov/tutorials/pthreads/
Hello_args2.c - thread function • void *PrintHello(void *threadarg) • { • int taskid, sum; • char *hello_msg; • struct thread_data *my_data; • sleep(1); • my_data = (struct thread_data *) threadarg; • taskid = my_data->thread_id; • sum = my_data->sum; • hello_msg = my_data->message; • printf("Thread %d: %s Sum=%d\n", taskid, hello_msg, sum); • pthread_exit(NULL); • } https://computing.llnl.gov/tutorials/pthreads/
Hello_args2.c - Main • int main(int argc, char *argv[]) • { • pthread_t threads[NUM_THREADS]; • int *taskids[NUM_THREADS]; • int rc, t, sum; • sum=0; • messages[0] = "English: Hello World!"; • messages[1] = "French: Bonjour, le monde!"; • messages[2] = "Spanish: Hola al mundo"; • messages[3] = "Klingon: NuqneH!"; • messages[4] = "German: Guten Tag, Welt!"; • messages[5] = "Russian: Zdravstvytye, mir!"; • messages[6] = "Japan: Sekai e konnichiwa!"; • messages[7] = "Latin: Orbis, te saluto!"; https://computing.llnl.gov/tutorials/pthreads/
Hello_args2.c – main loop • for(t=0;t<NUM_THREADS;t++) { • sum = sum + t; • thread_data_array[t].thread_id = t; • thread_data_array[t].sum = sum; • thread_data_array[t].message = messages[t]; • printf("Creating thread %d\n", t); • rc = pthread_create(&threads[t], NULL, PrintHello, (void *) &thread_data_array[t]); • if (rc) { • printf("ERROR; return code from pthread_create() is %d\n", rc); • exit(-1); • } • } • pthread_exit(NULL); • } https://computing.llnl.gov/tutorials/pthreads/
Mutex excerpt – dotprod_mutex.c • … /* excerpt from code of thread_function */ • mysum= 0; • for (i=start; i<end ; i++) • { • mysum += (x[i] * y[i]); • } • /* • Lock a mutex prior to updating the value in the shared • structure, and unlock it upon updating. • */ • pthread_mutex_lock (&mutexsum); • dotstr.sum += mysum; • pthread_mutex_unlock (&mutexsum); • pthread_exit((void*) 0); https://computing.llnl.gov/tutorials/pthreads/
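To show the excerpt in context, here is a compact sketch of the same pattern: each thread accumulates a private partial sum and takes the mutex only once, to add it to the shared total. This is not the LLNL dotprod_mutex.c source; `struct slice`, `dot_slice`, `parallel_dot` and `MAX_THREADS` are assumptions made for the example, and `total_sum` plays the role of `dotstr.sum`.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16          /* arbitrary limit for this sketch */

/* Shared result and its mutex, mirroring dotstr.sum and mutexsum. */
static double total_sum;
static pthread_mutex_t mutexsum = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical per-thread work description: the half-open
 * index range [start, end) of the two vectors. */
struct slice {
    const double *x, *y;
    size_t start, end;
};

static void *dot_slice(void *arg)
{
    const struct slice *s = arg;
    double mysum = 0.0;

    /* Private accumulation: no locking needed here. */
    for (size_t i = s->start; i < s->end; i++)
        mysum += s->x[i] * s->y[i];

    /* Lock only for the single update of the shared total. */
    pthread_mutex_lock(&mutexsum);
    total_sum += mysum;
    pthread_mutex_unlock(&mutexsum);
    return NULL;
}

/* Computes the dot product of x and y (length n) with nthreads threads.
 * Returns 0 and stores the value in *result, or -1 on error. */
int parallel_dot(const double *x, const double *y, size_t n,
                 int nthreads, double *result)
{
    pthread_t tids[MAX_THREADS];
    struct slice slices[MAX_THREADS];
    int created = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    total_sum = 0.0;
    for (; created < nthreads; created++) {
        struct slice *s = &slices[created];
        s->x = x;
        s->y = y;
        s->start = n * (size_t)created / (size_t)nthreads;
        s->end = n * (size_t)(created + 1) / (size_t)nthreads;
        if (pthread_create(&tids[created], NULL, dot_slice, s) != 0)
            break;
    }
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);

    if (created != nthreads)
        return -1;
    *result = total_sum;
    return 0;
}
```

Keeping the loop outside the critical section is the design point: the mutex is acquired once per thread rather than once per element, so contention stays low.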
Mixing MPI with Pthreads: • Design: • Each MPI process typically creates and then manages N threads, where N makes the best use of the available CPUs/node. • Finding the best value for N will vary with the platform and your application's characteristics. • For IBM SP systems with two communication adapters per node, it may prove more efficient to use two (or more) MPI tasks per node. • In general, there may be problems if multiple threads make MPI calls. The program may fail or behave unexpectedly. If MPI calls must be made from within a thread, they should be made only by one thread. https://computing.llnl.gov/tutorials/pthreads/
LLNL MPI/Pthreads Examples • Compiling: • Use the appropriate MPI compile command for the platform and language of choice • Be sure to include the required Pthreads flag as shown in the Compiling Threaded Programs section. • An example code that uses both MPI and Pthreads is available below. The serial, threads-only, MPI-only and MPI-with-threads versions demonstrate one possible progression. • Serial • Pthreads only • MPI only • MPI with pthreads • makefile (for IBM SP) https://computing.llnl.gov/tutorials/pthreads/
apue.2e/threads • apue.2e/threads/ • apue.2e/threads/badexit2.c • apue.2e/threads/cleanup.c • apue.2e/threads/condvar.c • apue.2e/threads/exitstatus.c • apue.2e/threads/linux.mk • apue.2e/threads/macos.mk • apue.2e/threads/mutex1.c • apue.2e/threads/mutex2.c • apue.2e/threads/mutex3.c • apue.2e/threads/rwlock.c • apue.2e/threads/threadid.c http://www.apuebook.com/
apue.2e/threadctl • apue.2e/threadctl/detach.c • apue.2e/threadctl/getenv1.c • apue.2e/threadctl/getenv2.c • apue.2e/threadctl/getenv3.c • apue.2e/threadctl/linux.mk • apue.2e/threadctl/macos.mk • apue.2e/threadctl/suspend.c • apue.2e/threadctl/timeout.c http://www.apuebook.com/
Ten Questions with David Butenhof about Parallel Programming and POSIX Threads • Michael: Are there any specific tools you would like to recommend to people who want to program in POSIX Threads? IDEs? Editors? Debuggers? Profilers? Correctness Tools? Any others? • David: Tru64’s ladebug and Visual Threads were awesome tools, and ATOM allowed constructing simple analyzers. Nobody else really has anything that comprehensive, despite various gdb add-ons. (Then again, Intel has ladebug… but hasn’t really done anything with it.) Totalview is a great portable thread debugging environment, although the GUI is a bit “opaque”. http://www.thinkingparallel.com
Debugging - gdb • 4.10 Debugging Programs with Multiple Threads • gdb provides these facilities for debugging multi-thread programs: • automatic notification of new threads • `thread threadno', a command to switch among threads • `info threads', a command to inquire about existing threads • `thread apply [threadno] [all] args', a command to apply a command to a list of threads • thread-specific breakpoints
`set print thread-events', which controls printing of messages on thread start and exit. • `set libthread-db-search-path path', which lets the user specify which libthread_db to use if the default choice isn't compatible with the program. • Warning: These facilities are not yet available on every gdb configuration where the operating system supports threads. If your gdb does not support threads, these commands have no effect. For example, a system without thread support shows no output from `info threads', and always rejects the thread command, like this: (gdb) info threads (gdb) thread 1 Thread ID 1 not known. Use the "info threads" command to see the IDs of currently known threads.
Gauss-Seidel Method Algorithm • A set of n equations and n unknowns • If the diagonal elements are non-zero, rewrite each equation solving for the corresponding unknown • e.g., first equation, solve for x1; second equation, solve for x2; and so on http://numericalmethods.eng.usf.edu
Gauss-Seidel Method Algorithm • Rewriting each equation • From equation 1 • From equation 2 • From equation n-1 • From equation n http://numericalmethods.eng.usf.edu
Gauss-Seidel Method Algorithm • General form of each equation http://numericalmethods.eng.usf.edu
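The equations on these slides did not survive extraction. As a hedged reconstruction of the standard method: for a system with coefficients a[i][j] and right-hand side b[i] (symbol names assumed here), solving equation i for its own unknown gives x_i = (b_i - sum over j != i of a_ij * x_j) / a_ii. The sketch below applies that update in place, so each new x_i is used immediately by the following rows. The function names and the row-major storage are choices made for this example.

```c
#include <stddef.h>

/* One Gauss-Seidel sweep over an n-by-n system stored row-major in a.
 * Updates x in place and returns the largest absolute change of any
 * component, or -1.0 if a diagonal element is zero. */
double gauss_seidel_sweep(size_t n, const double *a, const double *b,
                          double *x)
{
    double max_change = 0.0;

    for (size_t i = 0; i < n; i++) {
        double diag = a[i * n + i];
        if (diag == 0.0)
            return -1.0;        /* cannot solve row i for x[i] */

        /* x_i = (b_i - sum_{j != i} a_ij * x_j) / a_ii */
        double sum = b[i];
        for (size_t j = 0; j < n; j++)
            if (j != i)
                sum -= a[i * n + j] * x[j];

        double new_xi = sum / diag;
        double change = new_xi - x[i];
        if (change < 0.0)
            change = -change;
        if (change > max_change)
            max_change = change;
        x[i] = new_xi;          /* later rows see the updated value */
    }
    return max_change;
}

/* Repeats sweeps until the largest change drops below tol.
 * Returns the number of sweeps used, 0 if max_iters was reached
 * without converging, or -1 if a diagonal element is zero. */
int gauss_seidel(size_t n, const double *a, const double *b, double *x,
                 double tol, int max_iters)
{
    for (int it = 1; it <= max_iters; it++) {
        double change = gauss_seidel_sweep(n, a, b, x);
        if (change < 0.0)
            return -1;
        if (change < tol)
            return it;
    }
    return 0;
}
```

For example, the system 4x1 + x2 = 6, x1 + 3x2 = 7 started from x = (0, 0) converges toward (1, 2). Convergence is not guaranteed in general; diagonal dominance, as in this example, is a sufficient condition.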
Barriers • Synchronize threads at a point, e.g., after an iteration. • Implementation of a barrier • 1. A mutex controls access to an int variable threads_fini • threads_fini = num_threads; /* initial value at start of iteration */ • 2. As each thread reaches the barrier • Grab mutex • Decrement threads_fini • If threads_fini == 0, start next iteration • Free mutex • 3. Busy wait: while (threads_fini != 0) ;
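The steps above can be sketched as a one-shot barrier. This is an illustrative version under stated assumptions, not code from the lecture: the `busy_barrier` type and function names are hypothetical, the wait loop re-reads the counter under the mutex (an unsynchronized read as written on the slide would be a data race in C), and sched_yield() is added so a spinning thread gives up the CPU between checks.

```c
#include <pthread.h>
#include <sched.h>

/* Hypothetical one-shot barrier built from a mutex and a counter. */
struct busy_barrier {
    pthread_mutex_t lock;
    int threads_fini;           /* threads that have not yet arrived */
};

/* Step 1: set the counter to the number of participating threads.
 * Returns 0 on success or the pthread_mutex_init error code. */
int busy_barrier_init(struct busy_barrier *b, int num_threads)
{
    b->threads_fini = num_threads;
    return pthread_mutex_init(&b->lock, NULL);
}

/* Steps 2 and 3: announce arrival, then spin until everyone arrived.
 * The barrier must be re-initialized before it can be used again. */
void busy_barrier_wait(struct busy_barrier *b)
{
    int remaining;

    /* Step 2: grab mutex, decrement, free mutex. */
    pthread_mutex_lock(&b->lock);
    b->threads_fini--;
    pthread_mutex_unlock(&b->lock);

    /* Step 3: busy wait, reading the counter under the mutex. */
    do {
        pthread_mutex_lock(&b->lock);
        remaining = b->threads_fini;
        pthread_mutex_unlock(&b->lock);
        if (remaining != 0)
            sched_yield();      /* let other threads make progress */
    } while (remaining != 0);
}
```

Each participating thread calls `busy_barrier_wait` on the same barrier object once per initialization. Waiting threads still consume CPU time while they spin, which is the main cost of this approach.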
Busy Wait • Problems • Solutions • Semaphore sets (next time)
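The slide defers semaphores to the next lecture. As a different technique drawn from the reading list (condition variables, section 23.8), the sketch below shows one way to block instead of spinning. The `cv_barrier` type, its fields and the generation counter are assumptions made for this example, not the lecture's solution.

```c
#include <pthread.h>

/* Hypothetical reusable barrier built on a condition variable. */
struct cv_barrier {
    pthread_mutex_t lock;
    pthread_cond_t all_arrived;
    int num_threads;            /* threads expected at the barrier */
    int waiting;                /* threads currently waiting */
    unsigned generation;        /* incremented each time the barrier opens */
};

/* Returns 0 on success or a pthread error code. */
int cv_barrier_init(struct cv_barrier *b, int num_threads)
{
    int rc;

    b->num_threads = num_threads;
    b->waiting = 0;
    b->generation = 0;
    rc = pthread_mutex_init(&b->lock, NULL);
    if (rc != 0)
        return rc;
    rc = pthread_cond_init(&b->all_arrived, NULL);
    if (rc != 0)
        pthread_mutex_destroy(&b->lock);
    return rc;
}

/* Blocks until num_threads threads have called this function.
 * Returns 1 in the last thread to arrive and 0 in the others. */
int cv_barrier_wait(struct cv_barrier *b)
{
    int last = 0;

    pthread_mutex_lock(&b->lock);
    unsigned my_generation = b->generation;

    if (++b->waiting == b->num_threads) {
        /* Last arrival: reset for the next round and wake everyone. */
        b->waiting = 0;
        b->generation++;
        pthread_cond_broadcast(&b->all_arrived);
        last = 1;
    } else {
        /* Sleep until the generation changes; the loop guards
         * against spurious wakeups. */
        while (my_generation == b->generation)
            pthread_cond_wait(&b->all_arrived, &b->lock);
    }
    pthread_mutex_unlock(&b->lock);
    return last;
}
```

Waiting threads sleep inside pthread_cond_wait() rather than consuming CPU, and the generation counter lets the same barrier be reused on every iteration. Where the platform provides it, the POSIX pthread_barrier_wait() interface offers the same service without hand-written code.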
Threads Programming Assignment • Matrix addition (embarrassingly parallel) • Versions • Sequential • Sequential with blocking factor • Sequential read without conversions • Multithreaded, passing the number of threads as a command-line argument (args.c code should be distributed as an example) • Plot of several runs • Next time
Time in the Computer World • Computer Systems: A Programmer’s Perspective, Bryant and O’Hallaron
Time Command • Real • User • System
#include <sys/times.h> • struct tms { • clock_t tms_utime; /* user time */ • clock_t tms_stime; /* system time */ • clock_t tms_cutime; /* user time of reaped children */ • clock_t tms_cstime; /* system time of reaped children */ • }; • clock_t times(struct tms *buf); • Returns: number of clock ticks elapsed since system started Computer Systems: A Programmer’s Perspective, Bryant and O’Hallaron
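A short sketch of how times() might be used to report the user and system components shown by the time command. The helper name `cpu_seconds` is hypothetical; the conversion from clock ticks to seconds uses sysconf(_SC_CLK_TCK), which is the POSIX way to obtain the tick rate.

```c
#include <sys/times.h>
#include <unistd.h>

/* Stores the calling process's user and system CPU time, in seconds,
 * in *user and *sys. Returns 0 on success, -1 on failure. */
int cpu_seconds(double *user, double *sys)
{
    struct tms t;
    long ticks_per_sec = sysconf(_SC_CLK_TCK);

    if (ticks_per_sec <= 0 || times(&t) == (clock_t)-1)
        return -1;

    /* tms fields are in clock ticks; divide to get seconds. */
    *user = (double)t.tms_utime / (double)ticks_per_sec;
    *sys = (double)t.tms_stime / (double)ticks_per_sec;
    return 0;
}
```

Calling `cpu_seconds` before and after a region of code and subtracting gives the CPU time that region consumed. In a multithreaded process the user time is summed over all threads, so it can exceed the elapsed real time.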