Processes and Threads. Operating Systems, CS 550, Spring 2014, Kenneth Chiu (Chapter 2 of Tanenbaum)
The Process Model • All running software (at user level) is organized into processes. • A process is a program in execution, i.e., an instance of a running program. • The best way to think of a process is as the unit of resource ownership. • In other words, a file is opened by a process. • A virtual memory segment belongs to a process. • Etc. What is the difference between a process and a program? • A program is the set of instructions, the executable, etc. • A process is a running program. There can be two instances of the same program, which results in two processes.
Multiprogramming • Multiprogramming is the ability to rapidly switch between processes (assuming for now single-threaded processes and single-core/CPU).
Suppose you are writing a program, maybe a game. You need to pause for 1/10 of a second. Can you write a loop like this to create that pause? for (int i = 0; i < 1000000; i++); • Programs can't use CPU loops for timing: • for (int i = 0; i < 1000000; i++); • This doesn't work, since the process may get switched out partway through the loop. • CPU speed is also a problem, especially because it may change dynamically due to power management. • No assumptions can be made about scheduling; ask the OS for a timed sleep instead (see the sketch below).
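A minimal sketch of the alternative (the slides do not prescribe a specific call; nanosleep() is one POSIX option): ask the kernel to block the process for 1/10 of a second instead of spinning.

#include <time.h>

int main() {
    struct timespec ts = {0, 100 * 1000 * 1000};   // 0 s, 100,000,000 ns = 1/10 s
    nanosleep(&ts, 0);     // the process blocks; the CPU is free to run other processes
    return 0;
}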
Process Creation • When do new processes need to be created? • Four principal events cause process creation: • System initialization. • Many of these are daemons (services). • Execution of a process-creation system call by a running process. • A user request to create a new process. • Initiation of a batch job. • In UNIX, all processes are created as the result of a fork() call. • The child process is a duplicate of the parent. • To start a new program, exec() must be called. This overlays the old program with a new one (but the process ID, etc., stay the same). Isn't this inefficient, since the process is first copied, but then immediately overlaid? • To avoid the inefficiency, copy-on-write techniques are used. • Linux: clone() • What is duplicated can be selected. • Windows: CreateProcess() • The address space is not duplicated.
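A minimal sketch of the fork()/exec()/wait() pattern described above (the program being launched, /bin/ls, is just an illustrative choice):

#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main() {
    pid_t pid = fork();                  // duplicate this process (copy-on-write)
    if (pid == 0) {
        // Child: overlay the duplicate with a new program (same PID).
        execl("/bin/ls", "ls", "-l", (char *) 0);
        perror("execl");                 // only reached if exec fails
        exit(1);
    }
    int status;
    waitpid(pid, &status, 0);            // parent blocks until the child exits
    printf("child exited with status %d\n", WEXITSTATUS(status));
    return 0;
}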
Process Termination • One of the following conditions: • Normal exit (voluntary). • Error exit (voluntary). • Fatal error (involuntary). • Segfault. • Killed by another process (involuntary). • Signal 9. • Upon exit: • Exit status/output data is passed from child to parent (via wait()). • The process's resources are deallocated by the operating system.
Process Hierarchies • Each process in UNIX has exactly one parent. • A process may have multiple children. • A subtree in the hierarchy is a process group. • Some signals propagate to the whole group. What happens when a child's parent exits? • If a process's parent exits, the orphaned child is re-parented to the init process, which is the ancestor of all processes.
Process States Why might a process not be running? What states can it be in? • A process may not be running for two very different reasons. • A process may not be running because it is waiting for input. • A process may also not be running because the CPU is currently running another process, even though it is ready to run. • These two conditions are completely different, and must be distinguished.
A process may be in three states (simplified): • Running (actually using the CPU at that instant). • Ready (runnable; temporarily stopped to let another process run). • Blocked (unable to run until some external event occurs). • The first two states are similar; in the third, the process couldn't run even if the CPU were free. What causes the transitions between these states? • Four transitions are possible: • Running → Blocked (the process blocks waiting for input or some other event). • Running → Ready (the scheduler picks another process). • Ready → Running (the scheduler picks this process). • Blocked → Ready (the event the process was waiting for occurs).
Processes are controlled by the scheduler. • Interrupts of course interact with the scheduler, but they are hidden from processes. • Interrupts are serviced in such a way that they are transparent to the process.
Modeling Multiprogramming • What's the goal of multiprogramming? • Without multiprogramming, what does the CPU do while the process is doing I/O? • Multiprogramming improves response time and utilization. • How many processes should be run at once? • 2; 5; 10; 100; 1000; 1,000,000? • Assume that processes compute 20% of the time. • With two processes, what's the CPU utilization? • With 3, 4, 5, 6, 1000?
Does multiprogramming ever make an application run faster? Is the utilization/throughput effect more or less pronounced with increasing I/O wait time? • Multiprogramming can result in better CPU utilization if processes spend a significant amount of time blocked. • Multiprogramming allows other programs to run while one is blocked. • If there are N processes, and each spends fraction p of its time blocked, what is the probability that the CPU has nothing to do (i.e., is idle)? • Assuming independence, the CPU is idle only when all N processes are blocked at once, with probability p^N; utilization is therefore 1 - p^N. [Figure: CPU utilization vs. degree of multiprogramming.]
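A small sketch that tabulates the model for the 20%-compute case (p = 0.8); the loop bound of 10 processes is arbitrary:

#include <math.h>
#include <stdio.h>

int main() {
    double p = 0.8;                              // each process waits for I/O 80% of the time
    for (int n = 1; n <= 10; n++)                // N = 2 gives 1 - 0.64 = 36% utilization
        printf("N = %2d  utilization = %4.1f%%\n", n, 100.0 * (1.0 - pow(p, n)));
    return 0;
}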
Motivation • Traditionally, a process is one address space and one thread of control. Why would you ever want multiple threads in one process? • In many applications, multiple things are going on at the same time: • Receiving from a server. • Writing to the screen. • Processing user input. • Doing a long-running computation. • If you have one thread of control, it's like one person trying to keep track of everything. • If you use multiple threads, it's like having multiple people to help you keep track of things. • It makes your code simpler. • It can also have some performance benefit, especially on multicore. Is there a performance benefit when not using multicore? • Yes, there can be: threads help overlap computation with I/O, similarly to multiprogramming.
Word processing example • Suppose you have a 100 page document. You type one character on the first page. What kind of computation has to happen? • That causes all pages to be reformatted. • Other tasks that should happen concurrently are: • Keyboard input • Reformatting • Autosave
Thread Model • Threads are sometimes called lightweight processes (LWPs). [Figure: the thread model (Process 1, Process 2, Process 3).]
Threads share (almost) all resources except those directly related to execution. What states can threads be in? • Threads can be in the same states that processes can: running, ready, blocked.
POSIX Threads (Pthreads) • Pthreads is a standard for threads, mostly supported on UNIX systems. • It is a C standard, but you can use it from C++, as long as you are careful.
Example from book. [Show tanenbaum_pthreads_example.] • Is it correct?

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

#define N_THREADS 10

extern "C" void *thread_func(void *vp) {
    int id = (long) vp;
    printf("This is thread %d...\n", id);
    pthread_exit(0);
}

int main() {
    pthread_t threads[N_THREADS];
    for (int i = 0; i < N_THREADS; i++) {
        int ec = pthread_create(&threads[i], 0, thread_func, (void *) i);
        assert(ec == 0);
    }
    exit(0);
}
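One possible answer to "Is it correct?" (a sketch, not necessarily the fix given in class): main() calls exit(0) right away, so the whole process may terminate before any thread prints, and casting an int directly to void * is dubious on 64-bit systems. Joining the threads and passing the index through intptr_t avoids both problems:

#include <pthread.h>
#include <stdio.h>
#include <assert.h>
#include <stdint.h>

#define N_THREADS 10

extern "C" void *thread_func(void *vp) {
    int id = (int) (intptr_t) vp;          // unpack the index without truncation
    printf("This is thread %d...\n", id);
    return 0;
}

int main() {
    pthread_t threads[N_THREADS];
    for (int i = 0; i < N_THREADS; i++) {
        int ec = pthread_create(&threads[i], 0, thread_func, (void *) (intptr_t) i);
        assert(ec == 0);
    }
    for (int i = 0; i < N_THREADS; i++)
        pthread_join(threads[i], 0);       // wait for each thread before the process exits
    return 0;
}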
Motivation, Revisited • Web server: • Dispatcher reads requests, hands off. • Worker thread checks cache, if not there, then starts a read.
Code for the dispatcher and worker (a sketch follows below). • buf holds the work request; page holds the page.
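The figure's code is not reproduced in these notes; the following is a sketch in the same spirit, where all types and helper functions are placeholders rather than a real API:

// Sketch of the dispatcher/worker structure for the multithreaded web server.
struct request { /* client connection, URL, ... */ };
struct page    { /* page contents */ };

void get_next_request(request *buf);                    // block until a request arrives
void handoff_work(request *buf);                        // wake an idle worker
void wait_for_work(request *buf);                       // block until the dispatcher hands off work
bool look_for_page_in_cache(request *buf, page *pg);    // true if the page was cached
void read_page_from_disk(request *buf, page *pg);       // blocks only this worker
void return_page(page *pg);                             // send the page to the client

void dispatcher_thread() {
    request buf;
    while (true) {
        get_next_request(&buf);     // read the next incoming request
        handoff_work(&buf);         // hand it to a worker and loop for the next one
    }
}

void worker_thread() {
    request buf;
    page pg;
    while (true) {
        wait_for_work(&buf);
        if (!look_for_page_in_cache(&buf, &pg))
            read_page_from_disk(&buf, &pg);   // only this worker blocks on the disk
        return_page(&pg);
    }
}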
If not multithreaded, then the single thread reads the request, checks the cache, and reads the disk if needed. What's the CPU doing while the thread is reading the page from disk? • The CPU is idle while the thread is blocked. • A third approach is to use select() and multiplex requests with a finite-state machine (FSM). [Show server_styles.]
Kernel Threads • Threads can be implemented entirely in the kernel. • The kernel is responsible for scheduling and context switching. How does a thread context switch occur? How does a process context switch occur? • A context switch occurs exactly as with processes: a thread makes a system call, traps into the kernel, and the scheduler eventually runs. If a process fork()s, what happens to the threads? When a process gets a signal, which thread executes it? • There are tricky issues such as dealing with fork() and signals. There is no single right answer.
Threads in User Space • It is possible to implement threads completely in user-space. This means that the kernel doesn’t know about threads at all. It thinks the process has just one thread. • In a user-level threads system, how does a thread context switch occur? • Thread context switches occur only when the user-level scheduler is run. • Are fork() or signals an issue?
What happens when a user-space thread blocks on a system call? • Blocking system calls must be handled in some way. • One way is to check before calling to see whether or not the call would block; if it would, do a thread context switch instead (see the sketch below). What happens when a user-level thread page faults? Kernel-level? • Page faults cannot be handled elegantly. What happens if a thread does a very long computation, without calling any other functions? • User-level threads are not that popular: the argument is that if you have to make a select() call to prevent blocking, you might as well just do a context switch anyway.
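A minimal sketch of the "check before calling" idea, assuming a hypothetical user-level threads library that provides a ult_yield() call:

#include <sys/select.h>
#include <unistd.h>

void ult_yield();   // hypothetical: switch to another user-level thread

// Wrap read() so it never blocks the whole process: poll with select() first,
// and yield to another user-level thread if no data is ready yet.
ssize_t ult_read(int fd, void *buf, size_t count) {
    for (;;) {
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        timeval tv = {0, 0};                   // zero timeout: just poll
        if (select(fd + 1, &readfds, 0, 0, &tv) > 0)
            return read(fd, buf, count);       // data is ready; this read won't block
        ult_yield();                           // would block: run some other user-level thread
    }
}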
Hybrids • Kernel threads: 1-to-1 between an execution unit in the process and an execution unit in the OS. • User-level threads: N-to-1 between execution units in the process and an execution unit in the OS. • Hybrid threads: N-to-M.
Pop-Up Threads • Threads are commonly used in servers. • A request could be: • Handed off to an existing thread (thread-pool approach). • Given to a newly created thread. • Which way is better?
What thread management functions are needed? • Creation • Termination • Self • Cancellation • Stack management • Priority management
Creation • Threads are created via a library call that specifies various parameters for creating the thread. • Stack size. • Function to run. • Initial data. • The library call may invoke system calls.
POSIX: • pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg); • Example (C++):

#include <stdint.h>
extern "C" void *run(void *data) { ... }
// Use default attributes, pass 1234 as void *.
int rv = pthread_create(&tid, nullptr, run, (void *) intptr_t(1234));
assert(rv == 0);
Thread Attributes • Init/destroy: • pthread_attr_init(pthread_attr_t *attr); • pthread_attr_destroy(pthread_attr_t *attr); • Stack characteristics: • pthread_attr_setstacksize(&attr, size_t); • pthread_attr_setstackaddr(&attr, void *stack); • User-level or kernel threads: • pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM); • pthread_attr_setscope(&attr, PTHREAD_SCOPE_PROCESS); • Joinable or not: • pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED); • pthread_detach(pthread_t tid);
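A minimal sketch (not from the slides) showing how these attribute calls fit together: create a detached thread with a 1 MB stack.

#include <pthread.h>
#include <stdio.h>
#include <assert.h>

extern "C" void *worker(void *) {
    printf("detached worker running\n");
    return nullptr;
}

int main() {
    pthread_attr_t attr;
    pthread_attr_init(&attr);                                      // start from the defaults
    pthread_attr_setstacksize(&attr, 1024 * 1024);                 // 1 MB stack
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);   // cannot (and need not) be joined

    pthread_t tid;
    int rv = pthread_create(&tid, &attr, worker, nullptr);
    assert(rv == 0);
    pthread_attr_destroy(&attr);   // safe once the thread has been created

    pthread_exit(nullptr);         // let main exit without terminating the detached thread
}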
Identity • Programs need some way to uniquely name and identify threads. • What are some qualities of an ID that are useful? What do you want to do with an ID? • Pthreads: pthread_t is the thread ID. • A thread can get its own with pthread_self(). • What can you do with a pthread_t? • Compare for equality: pthread_equal(). • Let's say that you want to associate some information with each thread, and want to be able to look it up. Can you do this? • std::map<pthread_t, InfoObj *> thread_map; • No, you cannot: pthread_t is an opaque type with no guaranteed ordering, so it cannot portably be used as a map key. The solution is to create your own thread ID that you can order. • std::map<MyThreadID, InfoObj *> thread_map; • How do you get back the original pthread_t?
Example:

int tid = allocate_new_thread_id();
pthread_create(&ptid, nullptr, run, (void *) intptr_t(tid));
thread_map.insert(make_pair(tid, ptid));
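A slightly fuller sketch of the same idea (allocate_new_thread_id() and run() are illustrative helpers, not a standard API); the original pthread_t is recovered by looking it up in the map:

#include <pthread.h>
#include <stdint.h>
#include <map>

std::map<int, pthread_t> thread_map;   // our orderable ID -> pthread_t

int allocate_new_thread_id() {         // illustrative: a simple counter
    static int next_id = 0;
    return next_id++;
}

extern "C" void *run(void *vp) {
    int my_id = (int) (intptr_t) vp;   // the ID handed to us at creation time
    // ... look up per-thread info keyed by my_id ...
    (void) my_id;
    return nullptr;
}

void spawn_one() {                     // in a real program, protect thread_map with a mutex
    int tid = allocate_new_thread_id();
    pthread_t ptid;
    pthread_create(&ptid, nullptr, run, (void *) (intptr_t) tid);
    thread_map[tid] = ptid;            // later: pthread_t p = thread_map[tid];
}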
Exiting • POSIX threads exit by: • Returning from their function: • void *run(void *) { ...; return 0; } • Or calling pthread_exit(). • What happens if main() returns? • What happens if it calls pthread_exit()? • main() returns: the whole process exits. • main() calls pthread_exit(): only the main thread exits. [Show pthread_test.] • A thread function has a return value: • void *vp; pthread_join(tid, &vp); • If threads are not detached but are never joined, their resources are never reclaimed, causing a leak (see the sketch below).
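A minimal sketch (illustrative, not the pthread_test demo) of returning a value from a thread and collecting it with pthread_join():

#include <pthread.h>
#include <stdio.h>
#include <stdint.h>

extern "C" void *run(void *arg) {
    intptr_t n = (intptr_t) arg;
    return (void *) (n * n);          // same effect as pthread_exit((void *) (n * n))
}

int main() {
    pthread_t tid;
    pthread_create(&tid, nullptr, run, (void *) (intptr_t) 6);

    void *vp;
    pthread_join(tid, &vp);           // reclaims the thread; without this (or a detach), resources leak
    printf("thread returned %ld\n", (long) (intptr_t) vp);
    return 0;                         // returning from main() exits the whole process
}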
Example • [Show tanenbaum_pthreads_example/.]
Processes/threads often need to communicate. • The main issues are how to actually convey the information and how to synchronize. • Synchronizing means doing things in the right order, at the right time, without interfering. • When talking about IPC, we'll often use the term "process" to mean either a thread or a process. (The term is overloaded.)
Sum Example • Two threads incrementing an integer. • [Show sum_rc/.]
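The sum_rc/ code itself is not shown here; the following sketch is the same kind of program: two threads increment a shared counter with no synchronization, so the final total is usually less than expected.

#include <pthread.h>
#include <stdio.h>

long sum = 0;                          // shared, deliberately unprotected
const long N = 10000000;

extern "C" void *adder(void *) {
    for (long i = 0; i < N; i++)
        sum++;                         // read-modify-write: races with the other thread
    return nullptr;
}

int main() {
    pthread_t t1, t2;
    pthread_create(&t1, nullptr, adder, nullptr);
    pthread_create(&t2, nullptr, adder, nullptr);
    pthread_join(t1, nullptr);
    pthread_join(t2, nullptr);
    printf("expected %ld, got %ld\n", 2 * N, sum);   // usually prints less than 2 * N
}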
Shared Memory Consistency • [Show shared_memory-mmap.]
Race Conditions • sum++ could be implemented as:

register1 = sum
register1 = register1 + 1
sum = register1

• Here's how the threads might execute with sum = 0 initially (no interleaving):

T0: Thread 1 executes register1 = sum            {register1 = 0}
T1: Thread 1 executes register1 = register1 + 1  {register1 = 1}
T2: Thread 1 executes sum = register1            {sum = 1}
T3: Thread 2 executes register1 = sum            {register1 = 1}
T4: Thread 2 executes register1 = register1 + 1  {register1 = 2}
T5: Thread 2 executes sum = register1            {sum = 2}

• Now consider this execution interleaving, again with sum = 0 initially:

T0: Thread 1 executes register1 = sum            {register1 = 0}
T1: Thread 1 executes register1 = register1 + 1  {register1 = 1}
T2: Thread 2 executes register1 = sum            {register1 = 0}
T3: Thread 2 executes register1 = register1 + 1  {register1 = 1}
T4: Thread 1 executes sum = register1            {sum = 1}
T5: Thread 2 executes sum = register1            {sum = 1}
A race condition happens when two or more threads/processes interfere with each other. • It’s called a race condition because it’s a little bit like two threads “racing” to reach a certain point in the execution. • Race conditions are caused by invalid assumptions of atomicity. How do you fix race conditions?
Atomicity • What is the fundamental cause of race conditions? • Caused by things that need to be atomic not really being atomic. • What does “atomic” mean? • How can we “fake” atomicity? • A violation of atomicity occurs only when the intermediate steps are “observed”.
In this sequence of instructions, does B “observe” that the increment is not atomic? • What about in this sequence?
“Observing” a violation of atomicity amounts to reading or writing to data that was in the middle of being updated. • Can we prevent that from ever happening?
Critical Regions/Sections • Critical region: Region of execution where there is a temporary inconsistency. • Something needs to be atomic, but can’t be made truly atomic. • Need to hide it. • How? • Make other processes wait.
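A minimal sketch (again not the course demo) of making the other thread wait with a pthread mutex, so the temporarily inconsistent update to sum is never observed:

#include <pthread.h>
#include <stdio.h>

long sum = 0;
const long N = 10000000;
pthread_mutex_t sum_lock = PTHREAD_MUTEX_INITIALIZER;

extern "C" void *adder(void *) {
    for (long i = 0; i < N; i++) {
        pthread_mutex_lock(&sum_lock);    // enter the critical region
        sum++;                            // the intermediate state is hidden from the other thread
        pthread_mutex_unlock(&sum_lock);  // leave the critical region
    }
    return nullptr;
}

int main() {
    pthread_t t1, t2;
    pthread_create(&t1, nullptr, adder, nullptr);
    pthread_create(&t2, nullptr, adder, nullptr);
    pthread_join(t1, nullptr);
    pthread_join(t2, nullptr);
    printf("got %ld (always 2 * N now)\n", sum);
}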
What’s often left out of textbooks is that: • Critical regions are NOT sections of code. • They are regions of EXECUTION. • There are multiple “instances” of a critical region in a real program. • There are multiple critical regions (unrelated). • There are often multiple ways to design your critical regions, all correct.