Chapter 4: Multithreaded Programming

Lecture 3, 10.7.2008, Hao-Hua Chu

I/O Brush (MIT Media Lab)

Outline • Review what a process is. • What is a thread and why? • Multithreading models • Thread libraries • Threading issues

Process overview • What is a process? • A program in execution on the computer • What is multi-programming? • What resources does the OS allocate to run a process? • CPU, memory, files (I/O), network (I/O) • What is a process made up of in memory? • What is the cost of creating & running a process? • What is the cost of context-switching between processes? • Heavy vs. light load

Consider a Web Browser • How to load & render these 9+ image files quickly? • Issue parallel HTTP requests, one per process (for example only, not in a real implementation). • What is the speedup? • What is the cost of using many processes? • How to lower this cost?

Single vs. Multithreaded Processes

Benefit-Cost Analysis (multithreaded vs. single-threaded processes) • Benefits • Responsiveness • When some threads (image-loading threads) are blocked, others (UI thread) can continue to run. • Other examples? • Resource sharing • Threads of a process share the same address space, data/code segments • Economy • Thread creation 30x faster than process creation • Thread context switch 5x faster than process context switch • Utilization of multiprocessor architectures • One process running multiple threads on multiple CPUs • What are the costs?
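The responsiveness and resource-sharing benefits can be sketched in a few lines of Python. This is an illustration only: `load_image`, `load_all`, the file names, and the 0.05 s delay are assumptions made up for the example, and `time.sleep` stands in for a blocking HTTP request.

```python
import threading
import time

def load_image(name, cache, delay=0.05):
    """Pretend to fetch one image; time.sleep stands in for a blocking request."""
    time.sleep(delay)
    cache[name] = f"<pixels of {name}>"  # every thread writes into the same dict

def load_all(names):
    cache = {}  # one object in the process's single address space
    threads = [threading.Thread(target=load_image, args=(n, cache)) for n in names]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every loader thread has finished
    return cache, time.perf_counter() - start

names = [f"img{i}.png" for i in range(9)]  # hypothetical image names
cache, elapsed = load_all(names)
```

Because the nine simulated requests block at the same time, the total wait is close to one delay rather than nine, and no copying is needed to collect the results since all threads see the same `cache`.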

Implementing Multithreading • Poor implementation reduces benefit & increases cost • What are good evaluation metrics for a threading implementation? • How to implement multithreading support? • At user-level (via user library): user threads • POSIX Pthreads, Win32 threads, Java threads • At kernel-level: kernel threads • Windows XP, Mac OS X, Solaris, almost all modern OS • Both • What does it mean to implement at the user-level and kernel-level? • (Kernel threads are also used for VMM and I/O as well as handling syscalls)

Multithreading Models • What are the possible mappings between user threads and kernel threads? • Many-to-one • One-to-one • Many-to-many • How about one-to-many?

Many-to-one Model • All threads from one process map to the same kernel thread • User-level thread library manages context-switching. • Is this implementation good (efficiency vs. concurrency)? • Yes, but … • What if one user thread issues a blocking syscall? • What about a multi-processor system? • How to improve concurrency?
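A toy version of the many-to-one idea, assuming Python generators as stand-ins for user threads: the "thread library" is a round-robin loop that runs entirely inside one kernel thread. The names `user_thread` and `run_many_to_one` are hypothetical, and a real user-level library would also save and restore registers and stacks.

```python
from collections import deque

def user_thread(name, steps, log):
    """A user-level thread: each `yield` is a voluntary context switch."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # hand control back to the user-level scheduler

def run_many_to_one(threads):
    """Round-robin scheduler; everything runs in the one calling kernel thread."""
    ready = deque(threads)
    while ready:
        t = ready.popleft()
        try:
            next(t)          # resume the user thread until its next yield
            ready.append(t)  # still runnable, so it goes to the back of the queue
        except StopIteration:
            pass             # this user thread has finished

log = []
run_many_to_one([user_thread("A", 2, log), user_thread("B", 3, log)])
# log is now ["A:0", "B:0", "A:1", "B:1", "B:2"]
```

Switching is cheap because the kernel is never involved. The weakness is also visible: if any user thread made a blocking call instead of yielding, the scheduler loop and every other user thread would stall with it, and the loop can never use more than one CPU.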

One-to-one Model • Each user thread maps to one kernel thread • Is this implementation good (concurrency vs. efficiency)? • Good concurrency, why? (blocking syscall does not affect other threads) • Expensive, why? (user-thread creation -> kernel-thread creation) • How to have both good concurrency and efficiency?

Many-to-many Model • Many user threads are mapped to a smaller or equal number of kernel threads. • Why is this better than Many-to-one? (concurrency & multi-processor) • Why is this better than one-to-one? (efficiency) • Like one-to-one concurrency? • Two-level model

2-Level Model • Allows a user thread to be bound to kernel thread

Thread Library APIs: POSIX Pthreads
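The create/join pattern of a thread library can be sketched as follows. The sketch uses Python's `threading` module rather than the C Pthreads API: `Thread(...).start()` plays the role of `pthread_create()` and `join()` the role of `pthread_join()`. The summation task and the names `partial_sum` and `threaded_sum` are assumptions chosen for illustration.

```python
import threading

def partial_sum(lo, hi, results, slot):
    """Thread body: sum the integers in [lo, hi) into this thread's own slot."""
    results[slot] = sum(range(lo, hi))

def threaded_sum(n, workers=4):
    """Compute 1 + 2 + ... + n by splitting the range across worker threads."""
    results = [0] * workers  # shared by all threads, one slot per thread
    step = n // workers
    threads = []
    for i in range(workers):
        lo = i * step + 1
        hi = n + 1 if i == workers - 1 else (i + 1) * step + 1
        t = threading.Thread(target=partial_sum, args=(lo, hi, results, i))
        t.start()            # analogous to pthread_create()
        threads.append(t)
    for t in threads:
        t.join()             # analogous to pthread_join()
    return sum(results)

total = threaded_sum(100)
```

Each thread writes only to its own slot of `results`, so no locking is needed here; the parent reads the slots only after every `join()` has returned.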

Threading Issues • Semantics of fork() and exec() system calls • Thread cancellation • Signal handling • Thread pools • Thread specific data • Scheduler activations

Semantics of fork() and exec() • Does fork() duplicate only the calling thread or all threads? • The best answer, when you are not sure, is both. • What would you do if exec() is called immediately after fork()?
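On POSIX systems, `fork()` creates a child that contains only a copy of the calling thread. The sketch below checks this from Python under stated assumptions: it runs only on a POSIX system, the helper name `count_threads_across_fork` is hypothetical, and recent Python versions may print a DeprecationWarning about forking a multithreaded process.

```python
import os
import threading

def count_threads_across_fork():
    """Return (threads alive in the parent, threads alive in the forked child)."""
    stop = threading.Event()
    worker = threading.Thread(target=stop.wait)  # a second thread that just waits
    worker.start()
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report how many threads survived the fork, then exit at once.
        try:
            os.close(r)
            os.write(w, str(threading.active_count()).encode())
        finally:
            os._exit(0)
    # Parent: read the child's answer and clean up.
    os.close(w)
    child_count = int(os.read(r, 16))
    os.close(r)
    os.waitpid(pid, 0)
    parent_count = threading.active_count()
    stop.set()
    worker.join()
    return parent_count, child_count

parent_count, child_count = count_threads_across_fork()
```

The child reports a single thread even though the parent has at least two. This also bears on the last question on the slide: when `exec()` follows immediately, the new program replaces the whole address space, so duplicating every thread first would be wasted work.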

Thread Cancellation • Terminating a thread before it has finished • Two general approaches: • Asynchronous cancellation terminates the target thread immediately • Deferred cancellation allows the target thread to periodically check if it should be cancelled • Why have both? • Concurrency control … (more about this later)
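Deferred cancellation can be sketched with a shared flag that the target thread checks at points where it is safe to stop. Python's `threading` module offers no asynchronous cancellation, so only the deferred approach is shown; the names `worker`, `cancel_requested`, and `progress` are assumptions for illustration.

```python
import threading

def worker(cancel_requested, progress):
    """Do work in small units, checking for cancellation between units."""
    while not cancel_requested.is_set():   # cancellation point
        progress.append(len(progress))     # one unit of work
        cancel_requested.wait(0.01)        # pause, but wake early on a cancel request
    progress.append("cleanup done")        # the target thread releases its own resources

cancel_requested = threading.Event()
progress = []
t = threading.Thread(target=worker, args=(cancel_requested, progress))
t.start()
cancel_requested.set()   # request cancellation; the thread stops at its next check
t.join(timeout=2)
```

The target thread always reaches its cleanup step because it chooses when to stop. Asynchronous cancellation gives no such guarantee, which is why it is risky when the thread holds locks or is halfway through updating shared data.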

Signal Handling • Signals are used in UNIX systems to notify a process that a particular event has occurred • A signal handler is used to process signals • Signal is generated by particular event • Signal is delivered to a process • Signal is handled • Two types of signals: asynchronous vs. synchronous • What is the difference? • How to deliver a signal in a multithreaded process? • Which thread? One or all?

Signal Handling • Options: • Deliver the signal to the thread to which the signal applies • Applied to Synchronous signals • Deliver the signal to every thread in the process • Deliver the signal to certain threads in the process • Assign a specific thread to receive all signals for the process
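One of these options, delivering a signal to one chosen thread, can be sketched with Python's `signal` module on a POSIX system. The thread blocks `SIGUSR1` and then accepts it synchronously with `sigwait()`, while the sender targets that thread with `pthread_kill()`. The choice of `SIGUSR1` and the names `signal_thread`, `ready`, and `received` are assumptions for illustration.

```python
import signal
import threading

def signal_thread(ready, received):
    # Block SIGUSR1 in this thread so it stays pending until sigwait() collects it.
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
    ready.set()  # tell the sender that it is now safe to signal this thread
    received.append(signal.sigwait({signal.SIGUSR1}))  # accept the signal synchronously

ready = threading.Event()
received = []
t = threading.Thread(target=signal_thread, args=(ready, received))
t.start()
ready.wait()                                  # do not send before the signal is blocked
signal.pthread_kill(t.ident, signal.SIGUSR1)  # deliver to that one thread only
t.join(timeout=5)
```

A dedicated signal-handling thread is usually built the same way: every other thread keeps the signal blocked, and one thread loops on `sigwait()` for the whole process.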

Thread Pools • Create a number of threads in a pool where they await work • Advantages: • Usually slightly faster to service a request with an existing thread than to create a new thread • Allows the number of threads in the application(s) to be bounded by the size of the pool
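A minimal thread-pool sketch using Python's standard `concurrent.futures.ThreadPoolExecutor`. The request handler `handle_request` and the pool size of 3 are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

def handle_request(n):
    """Service one request and report which pool thread handled it."""
    return n * n, threading.current_thread().name

# At most three worker threads exist, no matter how many requests arrive.
with ThreadPoolExecutor(max_workers=3) as pool:
    outcomes = list(pool.map(handle_request, range(10)))

squares = [value for value, _ in outcomes]
threads_used = {name for _, name in outcomes}
```

Ten requests are served by at most three threads, which are reused rather than created per request; `threads_used` never holds more names than the pool size.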

Thread Specific Data • Allows each thread to have its own copy of data (not shared!) • Useful when you do not have control over the thread creation process (e.g., when using a thread pool)
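Thread-specific data can be sketched with Python's `threading.local`, here combined with a pool so that the code never creates the threads itself. The per-thread counter and the names `tls` and `handle` are assumptions for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

tls = threading.local()  # each thread sees its own attributes on this object

def handle(request_id):
    """Count how many requests the current pool thread has served so far."""
    if not hasattr(tls, "count"):
        tls.count = 0    # first request on this thread: create its private counter
    tls.count += 1
    return threading.current_thread().name, tls.count

with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(handle, range(6)))

# Highest counter value seen per thread = number of requests that thread served.
per_thread = {}
for name, count in results:
    per_thread[name] = max(per_thread.get(name, 0), count)
```

The counters are private, so no lock is needed, and together they add up to the six requests. The main thread never called `handle`, so it has no `tls.count` at all.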

Scheduler Activations • Both M:M and Two-level models require communication to maintain the appropriate number of kernel threads allocated to the application • Scheduler activations provide upcalls - a communication mechanism from the kernel to the thread library • This communication allows an application to maintain the correct number of kernel threads

Summary • Thread overview • Single-thread process vs. multithreaded processes • Multi-threading models • Many-to-one, one-to-one, many-to-many, two-level model • Thread issues • Semantics of fork() and exec() system calls, thread cancellation, signal handling, thread pools, thread specific data, scheduler activations

End of Chapter 4
