CSE 160 - Lecture 15

CSE 160 - Lecture 15 Introduction to Threads, Synchronization and Mutual Exclusion

Heavyweight Processes • Complete stand-alone programs • Code segment • Data Segment • Static data • Heap • Malloc’ed data • Stack • Registers

How can two heavyweight processes communicate? [Figure: Process 1 and Process 2 each hold a pointer, myshmPtr, into a Shared Memory Segment; alternatively they communicate through a Communication Socket]

Shared Memory Segment • Works only on a single CPU or a shared-memory multiprocessor • A “named” segment of memory that processes attach to • shmat() function call for Unix • Processes are given pointers to the beginning of the shared memory segment • Structure of the segment contents is not specified
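The attach step described above can be sketched with the System V calls shmget(), shmat(), shmdt() and shmctl(). This is a minimal illustration, not code from the lecture: the helper name share_int_with_child is hypothetical, and it uses an unnamed IPC_PRIVATE segment shared with a forked child, whereas unrelated processes would normally agree on a key (for example one produced by ftok()).

```c
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: a child process writes `value` into a shared
   segment and the parent reads it back.
   Returns the value read, or -1 if any step fails. */
int share_int_with_child(int value)
{
    /* Create a private segment large enough for one int. */
    int shmid = shmget(IPC_PRIVATE, sizeof(int), IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;

    /* Attach: the process receives a pointer to the start of the segment. */
    int *myshmPtr = (int *)shmat(shmid, NULL, 0);
    if (myshmPtr == (void *)-1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }
    *myshmPtr = 0;

    int result = -1;
    pid_t pid = fork();
    if (pid == 0) {             /* child: inherits the attachment */
        *myshmPtr = value;
        _exit(0);
    }
    if (pid > 0) {
        waitpid(pid, NULL, 0);  /* wait until the child has written */
        result = *myshmPtr;
    }

    shmdt(myshmPtr);                /* detach from the segment */
    shmctl(shmid, IPC_RMID, NULL);  /* mark the segment for removal */
    return result;
}
```

Calling share_int_with_child(42) should return 42 on a system where System V shared memory is available. The waitpid() call is what orders the child's write before the parent's read here; without some such ordering, the two processes would race.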

Concurrent Access Problem
Shared Memory Segment contents (each process attaches through its own myshmPtr): int x; int y; int z;
Process 1 and Process 2 each execute:
    ptrY = myshmPtr + sizeof (int);
    *ptrY = 1;
    if (*ptrY > 0) (*ptrY)--;
What value is y after these programs execute?

Mutual Exclusion • In general, the temporal (time) order in which processes execute code relative to each other is unknown • Portions of code that modify shared variables are called critical sections • Access to critical shared variables must be regulated so that only one process at a time may have access to the section • This is called serialization of access or mutual exclusion

Implementing Mutual Exclusion • Spin Locks
    while (lock == 1) /* wait */ ;
    lock = 1;
    <critical section>
    lock = 0;
• Busy waiting is inefficient • Naïve implementation has pitfalls (how?)

Atomic Operations • Implementing locks, semaphores, monitors requires atomic building blocks
    load  r0, <lock>
    cmp   r0, 0
    jne   again
    add   r0, 1
    store <lock>, r0
again:
• Between the load and the store, a second process could be swapped in (or run simultaneously on an SMP) • Need to make sure all operations complete without interruption (atomically)

Test and Set • CPU designers recognize this need and provide special hardware instructions • test and set • atomically return the old value of a location and set it; the lock is acquired if the old value was zero • fetch and increment • atomically fetch a location and add one
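As an illustration of how a test-and-set primitive repairs the naïve spin lock, here is a minimal sketch using C11's atomic_flag, which the language standard exposes for this purpose. The function names spin_lock, spin_unlock, worker and run_spinlock_demo are hypothetical, and the limit of 16 threads is an arbitrary assumption made to keep the example short.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;  /* clear means unlocked */
static long shared_counter = 0;

static void spin_lock(void)
{
    /* Test-and-set returns the previous value and sets the flag in one
       atomic step, so only one thread can ever observe "was clear". */
    while (atomic_flag_test_and_set(&lock))
        ;  /* busy wait */
}

static void spin_unlock(void)
{
    atomic_flag_clear(&lock);
}

static void *worker(void *arg)
{
    long iterations = *(long *)arg;
    for (long i = 0; i < iterations; i++) {
        spin_lock();
        shared_counter++;  /* critical section */
        spin_unlock();
    }
    return NULL;
}

/* Hypothetical driver: runs nthreads workers and returns the final count. */
long run_spinlock_demo(int nthreads, long iterations)
{
    pthread_t threads[16];
    assert(nthreads > 0 && nthreads <= 16);
    shared_counter = 0;
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, &iterations);
        assert(rc == 0);
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(threads[i], NULL);
    return shared_counter;
}
```

With the lock in place, run_spinlock_demo(4, 10000) should return 40000. The separate "test, then set" of the naïve version leaves a window between the two steps; here they are a single indivisible operation. The busy wait remains, so this is still inefficient when the lock is held for long.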

Semaphores • Introduced by Dijkstra. • Provide higher-level test-and-set semantics • Two operations, P and V. • P(semaphore): if > 0, decrement semaphore; otherwise, wait • V(semaphore): increment semaphore by one • Semaphore initialized > 0 • Provides the functionality needed to implement mutual exclusion • Standard OS construct • semget(), semctl(), semop() system calls
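The P and V operations can be sketched as follows. Note the substitution: instead of the System V calls listed on the slide, this sketch assumes POSIX unnamed semaphores (sem_init(), sem_wait(), sem_post() from <semaphore.h>), which need less setup. The names P, V, depositor and run_semaphore_demo are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t sem;
static long balance = 0;

/* P: wait until the semaphore is > 0, then decrement it. */
static void P(sem_t *s)
{
    while (sem_wait(s) == -1 && errno == EINTR)
        ;  /* retry if interrupted by a signal */
}

/* V: increment the semaphore, releasing one waiter if any. */
static void V(sem_t *s)
{
    sem_post(s);
}

static void *depositor(void *arg)
{
    long n = *(long *)arg;
    for (long i = 0; i < n; i++) {
        P(&sem);
        balance++;  /* critical section */
        V(&sem);
    }
    return NULL;
}

/* Hypothetical driver: two threads each add n to the shared balance. */
long run_semaphore_demo(long n)
{
    pthread_t a, b;
    balance = 0;
    int rc = sem_init(&sem, 0, 1);  /* initial value 1: mutual exclusion */
    assert(rc == 0);
    rc = pthread_create(&a, NULL, depositor, &n);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, depositor, &n);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    sem_destroy(&sem);
    return balance;
}
```

Initializing the semaphore to 1 makes it behave as a lock, so run_semaphore_demo(10000) should return 20000. A larger initial value would admit that many threads into the section at once.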

More Mutual Exclusion • Monitors • Higher-level than Semaphores making them less prone to error • To gain access to shared resource, programs must always go through the monitor. • Condition variables • Gain access to a resource, when a particular condition occurs (more later).

Threads • For SMP, could always use heavyweight processes • Performance penalties • More burden on the programmer to manage shared structures (“pointer hell”) • Threads allow concurrency within a single process • Lighter-weight access

Processes and Threads • Process includes address space. • Thread is program counter and stack pointer. • Process may have many threads. • All the threads share the same address space. • Processes are heavyweight, threads are lightweight. • Processes/threads need not map one-to-one onto processors.

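Three Threads Within a Process [Figure: one address space holding shared code (function f, function g), data and heap; each of the three threads has its own stack (stack 1 to stack 3), stack pointer (SP1 to SP3) and program counter (PC1 to PC3)]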

Thread Execution Model [Figure: a pool of threads scheduled onto a pool of processors] • Each thread of control can be scheduled by the OS when it is in a runnable state. • Threads within one process can run concurrently • Mutual exclusion is very important

Thread Execution Model: Key Points • Pool of processors, pool of threads. • Threads are peers. • Dynamic thread creation. • Can support many more threads than processors. • Threads dynamically switch between processors. • Threads share access to memory. • Synchronization needed between threads.

Why Use Threads? • Representing Concurrent Entities • Concurrency is part of the problem specification. • Examples: systems programming and user interfaces. • Single or multiple processors. • This kind of multithreaded programming is difficult. • Multiprocessing for Performance • Concurrency is under programmer’s control. • Programs could be written sequentially. • This kind of multithreaded programming should be easier.

Commercial Thread Libraries • Win32 threads (Windows NT and Windows 95). • Pthreads (POSIX Thread Interface) (SGI IRIX, Sun Solaris, HP-UX, IBM AIX, Linux, etc.). • Solaris threads (SunOS 5.x). • All designed primarily for systems programming.

Example: Pthreads • POSIX Threads – available on many platforms • Thread Management: pthread_create(), pthread_join(), pthread_exit(), pthread_kill(), pthread_cancel() • Mutexes: pthread_mutex_init(), pthread_mutex_lock(), pthread_mutex_unlock(), pthread_mutex_trylock() • Events: pthread_cond_init(), pthread_cond_wait(), pthread_cond_timedwait(), pthread_cond_signal() • Scheduling: pthread_setschedparam(), pthread_attr_setschedpolicy()
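A minimal sketch of how the thread-management and mutex calls above fit together is shown below. The names total_lock, add_to_total and run_mutex_demo, and the MAX_THREADS limit, are hypothetical choices for this illustration.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_THREADS 16  /* arbitrary limit for this sketch */

static pthread_mutex_t total_lock;
static long total = 0;

/* Thread routine: add `iterations` to the shared total, one step at a time. */
static void *add_to_total(void *arg)
{
    long iterations = *(long *)arg;
    for (long i = 0; i < iterations; i++) {
        pthread_mutex_lock(&total_lock);
        total++;  /* critical section */
        pthread_mutex_unlock(&total_lock);
    }
    return NULL;
}

/* Hypothetical driver: returns the total after all threads have finished. */
long run_mutex_demo(int nthreads, long iterations)
{
    pthread_t threads[MAX_THREADS];
    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    total = 0;

    int rc = pthread_mutex_init(&total_lock, NULL);  /* default attributes */
    assert(rc == 0);

    for (int i = 0; i < nthreads; i++) {
        rc = pthread_create(&threads[i], NULL, add_to_total, &iterations);
        assert(rc == 0);
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(threads[i], NULL);  /* wait for each thread to exit */

    pthread_mutex_destroy(&total_lock);
    return total;
}
```

run_mutex_demo(4, 10000) should return 40000. Unlike a spin lock, a thread that finds the mutex held is put to sleep by the library rather than burning processor time.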

Condition Variables • Would like to be “woken up” when a particular condition occurs • Calling pthread_cond_wait(cond, mutex) releases exclusive access to the mutex. Thread sleeps. • When the condition is signalled, the thread wakes up and is given access back to the mutex

Conditional Waiting
    action() {
        lock();
        while (x != 0) wait(s);
        unlock();
    }
    counter() {
        lock();
        x--;
        if (x == 0) signal(s);
        unlock();
    }
Both must occur before wait() returns (the signal(s) and the unlock() in counter())
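The action()/counter() pattern above might look like this with real Pthreads calls. This is a sketch under assumptions: the mutex m, the flag action_done and the driver run_condvar_demo are hypothetical additions, and the limit of 8 counter threads is arbitrary.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t s = PTHREAD_COND_INITIALIZER;
static int x;            /* number of counter() calls still outstanding */
static int action_done;  /* set once action() has seen x == 0 */

static void *action(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&m);
    while (x != 0)                  /* re-check the condition after every wakeup */
        pthread_cond_wait(&s, &m);  /* releases m while asleep, reacquires on return */
    action_done = 1;                /* here x == 0 and m is held */
    pthread_mutex_unlock(&m);
    return NULL;
}

static void *counter(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&m);
    x--;
    if (x == 0)
        pthread_cond_signal(&s);
    pthread_mutex_unlock(&m);  /* the waiter cannot return until m is free */
    return NULL;
}

/* Hypothetical driver: one waiter, `count` counters. Returns 1 on success. */
int run_condvar_demo(int count)
{
    pthread_t waiter, counters[8];
    assert(count > 0 && count <= 8);
    x = count;
    action_done = 0;

    int rc = pthread_create(&waiter, NULL, action, NULL);
    assert(rc == 0);
    for (int i = 0; i < count; i++) {
        rc = pthread_create(&counters[i], NULL, counter, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < count; i++)
        pthread_join(counters[i], NULL);
    pthread_join(waiter, NULL);
    return action_done && x == 0;
}
```

The while loop rather than an if matters: pthread_cond_wait() may wake up spuriously, so the condition has to be tested again each time it returns.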

A Simple Example: Array Summation
    int array_sum(int n, int data[]) {
        int mid;
        int low_sum, high_sum;
        mid = n/2;
        low_sum = 0; high_sum = 0;
        #pragma multithreadable
        {
            for (int i = 0; i < mid; i++)
                low_sum = low_sum + data[i];
            for (int j = mid; j < n; j++)
                high_sum = high_sum + data[j];
        }
        return low_sum + high_sum;
    }

    #include <pthread.h>

    typedef struct {
        int n, *data, mid;
        int *high_sum, *low_sum;
    } args_block;

    /* Thread routine: sum the lower half, data[0..mid-1]. */
    void *sum_0(void *p) {
        args_block *args = p;
        for (int i = 0; i < args->mid; i++)
            *args->low_sum = *args->low_sum + args->data[i];
        return NULL;
    }

    /* Thread routine: sum the upper half, data[mid..n-1]. */
    void *sum_1(void *p) {
        args_block *args = p;
        for (int j = args->mid; j < args->n; j++)
            *args->high_sum = *args->high_sum + args->data[j];
        return NULL;
    }

    int array_sum(int n, int data[]) {
        int mid, i;
        int low_sum = 0, high_sum = 0;  /* the threads accumulate into these */
        args_block args;
        pthread_t threads[2];
        mid = n/2;
        args.n = n; args.data = data; args.mid = mid;
        args.low_sum = &low_sum; args.high_sum = &high_sum;
        /* pthread_create(thread, attributes, routine to execute, thread args) */
        pthread_create(&threads[0], NULL, sum_0, &args);
        pthread_create(&threads[1], NULL, sum_1, &args);
        for (i = 0; i < 2; i++)  /* wait for threads to complete */
            pthread_join(threads[i], NULL);
        return low_sum + high_sum;
    }

Commodity Multithreaded Applications • Example Problems: Spreadsheets, CAD/CAM, simulation, video/photo editing and production, games, voice/handwriting recognition, real-time 3D rendering, job scheduling, etc. etc. • Need to run as fast as sequential on one processor. • Need to run significantly faster on multiprocessors. • No recompilation, no relinking, no reconfiguration. • Need to adapt dynamically to changing resources. • Need to be reliable and timely.

Last Thoughts on Threading • Threads provide a way to expose parallelism within a task. • Advantages • Straightforward parallelism • Common construction (Java, Win32, Pthreads) • Shared variables eliminate copying • Disadvantages • Mutual exclusion hard to think about • Not scalable to outside of a single SMP • (Active research to eliminate this)

An Aside: Automatic Parallelization? • Write a sequential program. • Compiler transforms sequential program into efficient parallel (multithreaded) program • A very very very very very very very difficult problem. • Decades of work on this problem. • Some success with some regular scientific programs. • Not a general solution (and probably never will be). • Not applicable to large, irregular, dynamic programs. • Compilers must overuse locking to ensure correctness • Compilers need help determining what code blocks can operate independently → OpenMP directives
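As a hint of what such a directive looks like, the array summation can be written with a standard OpenMP pragma. This is a sketch, not part of the lecture's code: the function name array_sum_omp is hypothetical, and it assumes a compiler with OpenMP support enabled (for example via a flag such as -fopenmp). Without that support the pragma is ignored and the loop simply runs sequentially, producing the same sum.

```c
/* Hypothetical OpenMP variant of the array summation example. */
int array_sum_omp(int n, const int data[])
{
    int sum = 0;

    /* The directive tells the compiler that iterations are independent
       apart from the accumulation into `sum`, which it may combine from
       per-thread partial sums. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += data[i];

    return sum;
}
```

The programmer supplies the independence information the compiler cannot reliably infer by itself, and the compiler generates the thread creation, work splitting and final combination.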
