Chapter 4: Threads Book: Operating System Principles, 9th Edition, Abraham Silberschatz, Peter Baer Galvin, Greg Gagne
Threads • A thread is the basic unit of CPU utilization; it comprises: • Thread ID • Program counter • Register set • Stack • A thread shares with the other threads belonging to the same process: • Code section • Data section • Operating-system resources (signals, open files) • A heavyweight (single-threaded) process can perform only one task at a time, whereas a multithreaded process can perform more than one task at a time
Threads (Contd…) • Many applications are typically implemented as a separate process with multiple threads of control (e.g., a web browser or word processor) • Multithreaded processes reduce a client's waiting time by processing several tasks concurrently • Threads play a vital role in remote procedure calls by allowing concurrent processing • Operating-system kernels are now multithreaded, with each kernel thread performing a specific task
Benefits • Increased responsiveness: a multithreaded web browser can still allow user interaction in one thread while an image loads in another thread. • Resource sharing: threads share the memory and resources of the process to which they belong (code sharing allows an application to have several different threads of activity within the same address space). • Economy: resource sharing makes threads cheaper to create and manage than processes (in Solaris 2, creating a process is about 30 times slower than creating a thread, and context switching is about five times slower). • Utilization of multiprocessor architectures: concurrency and efficiency increase because each thread can run in parallel on a different processor.
Multicore Programming • Multithreaded programming provides a mechanism for more efficient use of multiple computing cores and improved concurrency • Concurrency: more than one task making progress (possible even on a single core by interleaving execution) • Parallelism: more than one task executing simultaneously (requires multiple cores)
Programming Challenges • The trend toward multicore systems continues to place pressure on system designers and application programmers to make better use of multiple computing cores • Designers of operating systems must write scheduling algorithms that use multiple processing cores to allow parallel execution • Five areas present challenges in programming for multicore systems: • Identifying tasks • Balance • Data splitting • Data dependency • Testing and debugging
Types of Parallelism • Two types of parallelism: • Data parallelism: distribute subsets of the same data across multiple computing cores and perform the same operation on each core • Task parallelism: distribute tasks across multiple computing cores; each thread performs a unique operation, and different threads may operate on the same data or on different data
User Threads • User-level threads are supported above the kernel and are implemented by a thread library at the user level. • The user-thread library provides support for thread creation, scheduling, and management in user space, with no support from the kernel. • User-level threads are generally fast to create and manage. • A user-level thread performing a blocking system call will cause the entire process to block if the kernel is single-threaded. • User-level thread libraries include POSIX Pthreads, Mach C-threads, and Solaris 2 UI-threads.
Kernel Threads • Kernel threads are supported directly by the kernel (thread creation, scheduling, and management take place in kernel space). • Kernel threads are generally slower to create and manage than user threads because thread management is done by the operating system. • If a thread performs a blocking system call, the kernel can schedule another thread in the application for execution, in either a single-processor or multiprocessor environment. • Kernel threads are supported by Windows 2000, Windows NT, Solaris 2, Tru64 UNIX (Digital UNIX), BeOS, and Linux.
Multithreading Models • A relationship exists between user and kernel threads. • Based on this relationship, the following models exist: • Many-to-One • One-to-One • Many-to-Many
Many-to-One Model • Maps many user-level threads to a single kernel thread • Advantage: • Efficient, because thread management is done in user space. • Disadvantages: • The entire process blocks if a thread makes a blocking system call. • Multiple threads cannot run in parallel on multiprocessors. • Example: Green threads (a thread library) in Solaris 2 uses this model.
One-to-One Model • Maps each user thread to a kernel thread • Advantages: • Provides more concurrency by allowing another thread to run when a thread makes a blocking system call. • Allows multiple threads to run in parallel on multiprocessors. • Disadvantages: • Creating a user thread requires creating the corresponding kernel thread, burdening the performance of an application. • The number of threads supported by the system is restricted. • Examples: Windows NT, Windows 2000, OS/2
Many-to-Many Model • Multiplexes many user-level threads to a smaller or equal number of kernel threads • The number of kernel threads may be specific to either a particular application or a particular machine. • Developers can create as many user threads as necessary, and the corresponding kernel threads can run in parallel on multiprocessors. • The kernel can schedule another thread for execution when a thread performs a blocking system call. • Examples: Solaris 2, IRIX, HP-UX, and Tru64 UNIX
Two-Level Model • Similar to the Many-to-Many model, except that it also allows a user thread to be bound to a kernel thread • Examples: • IRIX • HP-UX • Tru64 UNIX • Solaris 8 and earlier
Thread Library • A thread library provides the programmer with an API for creating and managing threads • Two primary ways of implementing a thread library: • Provide a library entirely in user space with no kernel support; the code and data structures for the library exist in user space, and invoking a library function results in a local function call. • Implement a kernel-level library supported directly by the operating system; the code and data structures for the library exist in kernel space, and invoking a library function results in a system call to the kernel.
Thread Library • Three main thread libraries are in use today: • POSIX Pthreads (may be provided as either a user- or kernel-level library) • Win32 thread library (a kernel-level library) • Java threads (typically implemented on top of the host system's thread library, e.g., the Win32 API on Windows)
Pthreads • A POSIX standard (IEEE 1003.1c) API for thread creation and synchronization • The API specifies the behavior of the thread library; the implementation is up to the developers of the library • Common in UNIX operating systems (Solaris, Linux, Mac OS X) • A multithreaded C program can be written using the Pthreads API
Win32 Threads • The technique used for creating threads in the Win32 API is similar to the one used in Pthreads
Windows XP Threads • Implements the one-to-one mapping • Each thread contains: • A thread ID • Register set • Separate user and kernel stacks • Private data storage area • The register set, stacks, and private storage area are known as the context of the thread • The primary data structures of a thread include: • ETHREAD (executive thread block) • KTHREAD (kernel thread block) • TEB (thread environment block)
Java Threads • Java threads are managed by the JVM • The API provides a rich set of features for the creation and management of threads • Java threads may be created by: • Extending the Thread class • Implementing the Runnable interface • Sharing of data occurs by passing references to the shared objects to the appropriate threads
Strategies for Creating Multiple Threads • Asynchronous threading • Once the parent creates the child thread, the parent resumes its execution, so the parent and child execute concurrently • Threads are independent of each other • Synchronous threading • When the parent thread creates one or more children, it must wait for all of its children to terminate before it resumes
Threading Issues • Semantics of fork() and exec() system calls • Thread cancellation • Signal handling • Thread pools • Thread specific data • Scheduler activations
Fork and Exec System Calls • The fork and exec system calls • Versions of the fork system call: • If one thread in a program calls fork, the new process duplicates all threads – useful when the separate process does not call exec after forking. • If one thread in a program calls fork, the new process is single-threaded – useful when exec is called immediately after forking. • If a thread invokes the exec system call, the program specified in the parameter to exec replaces the entire process (including all threads and LWPs).
Cancellation • Thread cancellation is the task of terminating a thread before it has completed (e.g., when several threads concurrently search a database and one finds the result, or when a user stops a web page from loading). • A thread that is to be cancelled is often referred to as the target thread. • Cancellation of a target thread may occur in two different ways: • Asynchronous cancellation: one thread immediately terminates the target thread. • Deferred cancellation: the target thread periodically checks whether it should terminate, giving it an opportunity to terminate itself in an orderly fashion.
Cancellation • Cancelling a thread asynchronously may not free a needed system-wide resource, because the operating system often cannot reclaim all resources of a cancelled thread (though many operating systems provide this mechanism). • Deferred cancellation allows a thread to check whether it should be cancelled at a point when it can safely be cancelled (Pthreads refers to such points as cancellation points).
Signal Handling • A signal (received either synchronously or asynchronously) is used in UNIX systems to notify a process that a particular event has occurred. • All signals follow the same pattern: • A signal is generated by the occurrence of a particular event. • A generated signal is delivered to a process. • Once delivered, the signal must be handled. • Synchronous signals (e.g., an illegal memory access or division by zero) are delivered to the same process that performed the operation causing the signal (an event internal to a running process).
Signal Handling • Asynchronous signals (e.g., terminating a process with specific keystrokes or having a timer expire) are generated by an event external to a running process and are delivered to another process. • Every signal may be handled by one of two possible handlers: • A default signal handler, run by the kernel when handling the signal. • A user-defined signal handler, which calls a user-defined function to handle the signal. • In single-threaded programs, signals are always delivered to the process (a straightforward method).
Signal Handling • Delivering signals in multithreaded programs is more complicated, as a process may have several threads. The following options exist: • Deliver the signal to the thread to which the signal applies (synchronous signals). • Deliver the signal to every thread in the process (asynchronous signals, e.g., a signal that terminates a process). • Deliver the signal to certain threads in the process (UNIX allows a thread to specify which signals it will accept and which it will block). • Assign a specific thread to receive all signals for the process (Solaris 2, for asynchronous signals).
Thread Pools • A multithreaded server (e.g., a web server) creates a separate thread to service each request – efficient compared with creating a separate process. • Potential problems of multithreaded servers: • The time required to create a thread before serving each request. • An unlimited number of concurrently active threads could exhaust system resources (e.g., CPU time or memory). • Thread pools resolve these issues: a number of threads are created at process startup and placed into a pool; when the server receives a request, an available thread is awakened from the pool, and after completing its service the thread returns to the pool and waits for more work.
Thread Pools • Benefits of thread pools are: • It is usually faster to service a request with an existing thread than waiting to create a thread. • A thread pool limits the number of threads that exist at any one point. This is particularly important on systems that cannot support a large number of concurrent threads. • The number of threads in the pool can be determined by various factors: • Number of CPUs in the system. • The amount of physical memory. • The expected number of concurrent client requests.
Thread Pools • More sophisticated thread-pool architectures dynamically adjust the number of threads in the pool according to usage patterns (e.g., a smaller pool when the load on the system is low).
Thread-Specific Data • Threads belonging to a process share the data of the process. • Thread-specific data is needed when each thread requires its own copy of certain data (for example, each transaction handled in a separate thread in a transaction-processing system). • Supported by Win32, Pthreads, and Java.
Scheduler Activations • Both the Many-to-Many and Two-Level models require communication between the kernel and the thread library to maintain the appropriate number of kernel threads allocated to the application • Scheduler activations provide upcalls – a communication mechanism from the kernel to the thread library • This communication allows an application to maintain the correct number of kernel threads
Scheduler Activations • Many systems implementing either the many-to-many or two-level model place an intermediate data structure between the user and kernel threads, typically known as a lightweight process, or LWP. • To the user-thread library, the LWP appears to be a virtual processor on which the application can schedule a user thread to run. • Each LWP is attached to a kernel thread, and it is kernel threads that the operating system schedules to run on physical processors. • If a kernel thread blocks (such as while waiting for an I/O operation to complete), the LWP blocks as well. Up the chain, the user-level thread attached to the LWP also blocks.
Scheduler Activations • An application may require any number of LWPs to run efficiently: • A CPU-bound application running on a single processor: only one thread can run at a time, so one LWP is sufficient. • An I/O-intensive application may require multiple LWPs; typically, one LWP is needed for each concurrent blocking system call. Suppose, for example, that five file-read requests occur simultaneously: five LWPs are needed, because all could be waiting for I/O completion in the kernel. With only four LWPs, the fifth request must wait for one of the LWPs to return from the kernel.