Advanced Operating Systems

Advanced Operating Systems Lecture 4: Process University of Tehran Dept. of EE and Computer Engineering By: Dr. Nasser Yazdani Distributed Operating Systems

Topic
• How the OS handles processes
• References
  • “Cooperative Task Management Without Manual Stack Management”, by Atul Adya, et al.
  • “Capriccio: Scalable Threads for Internet Services”, by Rob von Behren, et al.
  • “The Performance Implications of Thread Management Alternatives for Shared-Memory Multiprocessors”, by Thomas E. Anderson, et al.

Outline
• Introduction
• Processes
• Process operations
• Threads
• What is bad about threads?
• Different thread issues

Processes
• Users want to run programs on computers
  • The OS should provide the facilities
• The OS treats a running program as a process
  • It is an abstraction
  • The OS needs mechanisms to start, manipulate, suspend, schedule, and terminate processes
• Process types:
  • OS processes executing system code
  • User processes executing user code
• Processes execute concurrently, with the CPU multiplexed among them

So What Is A Process?
• It’s one executing instance of a “program”
• It’s separate from other instances
• It can start (“launch”) other processes
• It can be launched by them

Process Issues
• How do we create, suspend, and terminate processes?
• What information do we need to keep for each process?
• How do we select a process to run? (multiprogramming)
• How do we switch among processes?
• How do we isolate and protect processes from each other?

Process Creation
• By the system, or by a user through the command line
• See clone(), fork(), vfork()

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t childpid;

        if ((childpid = fork()) == -1) {
            perror("can't create a new process");
            exit(1);
        } else if (childpid == 0) {
            /* child process executes */
            printf("child: childpid = %d, parentpid = %d\n",
                   (int)getpid(), (int)getppid());
            exit(0);
        } else {
            /* parent process executes */
            printf("parent: childpid = %d, parentpid = %d\n",
                   (int)childpid, (int)getpid());
            exit(0);
        }
    }

Process Creation (What Do We Need?)
• We need resources such as CPU, memory, files, and I/O devices
  • The child gets resources from its parent
  • This prevents many processes from overloading the system
• When creating a new process, the execution possibilities are:
  • Parent continues concurrently with child
  • Parent waits until child has terminated
• When creating a new process, the address space possibilities are:
  • Child process is a duplicate of the parent process
  • Child process has a program loaded into it
• What other information does the OS need? (comes later)

Process Termination
• Normal exit, end of program (voluntary)
  • Asks the OS to delete it and deallocate its resources
• Error exit (voluntary) (exit(2)) or fatal error (involuntary)
• Killed by another process (involuntary)
• The child process may return output to the parent process, and all of the child’s resources are deallocated
• Other termination possibilities
  • Abort invoked by the parent process because:
    • Child has exceeded its usage of some resources
    • Task assigned to child is no longer required
    • Parent is exiting and the OS does not allow the child to continue without its parent

What Information Does the OS Need?
• Memory management information
  • base/limit information
• Process state
  • new, ready, running, waiting, halted
• Program counter
  • the address of the next instruction to be executed for this process
• CPU registers
  • index registers, stack pointers, general-purpose registers
• CPU scheduling information
  • process priority and pointer
• Accounting information
  • time limits, process number, owner
• I/O status information
  • list of I/O devices allocated to the process

Process States
• Possible process states
  • Running (occupies the CPU)
  • Blocked
  • Ready (does not occupy the CPU)
  • Other states: suspended, terminated
• Transitions between states
• Question: on a single-processor machine, how many processes can be in the running state?

Process Hierarchies
• A parent creates a child process; a child process can create its own processes
• Forms a hierarchy
  • UNIX calls this a "process group"
• Windows has no concept of process hierarchy
  • All processes are created equal

The Process Model
• Multiprogramming of four programs
• Conceptual model of 4 independent, sequential processes
• Only one program active at any instant
• Real-life analogy?
  • A daycare teacher of 4 infants

Address Space
• Program segments
  • Text
  • Data
  • Stack
  • Heap
• Lots of flexibility
  • Allows stack growth
  • Allows heap growth
  • No predetermined division
• Layout, from high addresses (0xffff….) down to low addresses (0x0000…): kernel space, stack, heap, code & data

Process Control Block (PCB) Fields of a process table entry

Process Scheduling
• Objective of multiprogramming: maximal CPU utilization, i.e., always have a process running
• Objective of time-sharing: switch the CPU among processes frequently enough that users can interact with a running program
• Needs context switching

Context Switch
• Switches the CPU from one process to another
• Performed by the scheduler:
  • Save the PCB state of the old process
  • Load the PCB state of the new process
  • Flush the memory cache
  • Change the memory mapping (TLB)
• A context switch is expensive (1-1000 microseconds)
  • No useful work is done (pure overhead)
  • Can become a bottleneck
• Real-life analogy?
• Needs hardware support

Interrupt Processing
• How is the illusion of multiple sequential processes maintained with one CPU and many I/O devices?
• Each I/O device class is associated with a location, called the interrupt vector, which holds
  • the address of the interrupt service procedure
• When an interrupt occurs:
  • Save registers into the process table entry for the current process (assembly)
  • Remove the information pushed onto the stack by the interrupt, and set the stack pointer to a temporary stack used by the process handler (assembly)
  • Call the interrupt service procedure (e.g., to read and buffer input), which does the processing (C language)
  • The scheduler decides which process to run next (C language)
  • Run assembly code to load registers, etc., for that process (assembly)

Process Descriptor
• Process: dynamic, a program in motion
• Kernel data structures maintain its "state"
  • Descriptor, PCB (control block), task_struct
  • Larger than you think! (about 1K)
  • Complex struct with pointers to others
• Type of info in task_struct
  • Registers, state, id, priorities, locks, files, signals, memory maps, queues, list pointers, …
• Some details
  • Address of first few fields hard coded in asm
  • Careful attention to cache line layout

Process State
• Traditional (textbook) view
  • Blocked, runnable, running
  • Also initializing, terminating
  • UNIX adds "stopped" (signals, ptrace())
• Linux (TASK_whatever)
  • Running, runnable (RUNNING)
  • Blocked (INTERRUPTIBLE, UNINTERRUPTIBLE)
    • Interruptible: signals bring you out of a blocked syscall (EINTR)
  • Terminating (ZOMBIE)
    • Dead but still around: "living dead" processes
  • Stopped (STOPPED)

Process Identity
• Users: pid; kernel: address of descriptor
  • Pids dynamically allocated, reused
    • 16 bits – 32767, avoid immediate reuse
  • Pid to address hash
• 2.2: static task_array
  • Statically limited # tasks
  • This limitation removed in 2.4
• current->pid (macro)

Descriptor Storage/Allocation
• Descriptors stored in kernel data segment
• Each process gets a 2-page (8K) "kernel stack" used while in the kernel (security)
  • task_struct stored here; rest for stack
  • Easy to derive descriptor from esp (stack ptr)
  • Implemented as union task_union { }
• Small (16) cache of free task_unions
  • free_task_struct(), alloc_task_struct()

Descriptor Lists, Hashes
• Process list
  • init_task, prev_task, next_task
  • for_each_task(p) iterator (macro)
• Runnable processes: runqueue
  • init_task, prev_run, next_run, nr_running
  • wake_up_process()
    • Calls schedule() if "preemption" is necessary
• Pid to descriptor hash: pidhash
  • hash_pid(), unhash_pid()
  • find_task_by_pid()

Wait Queues
• Blocking implementation
  • Change state to TASK_(UN)INTERRUPTIBLE
  • Add node to wait queue
• All processes waiting for a specific "event"
  • Usually just one element
  • Used for timing, synch, device I/O, etc.
• Structure is a bit optimized
  • struct wait_queue usually allocated on kernel stack

sleep_on(), wake_up()
• sleep_on(), interruptible_sleep_on()
  • See code on LXR
• wake_up(), wake_up_interruptible()
  • See code on LXR
• A process can wake up with the event not true
  • If there are multiple waiters, another may have the resource
  • Always check availability after wakeup
  • Maybe the wakeup was in response to a signal
• 2.4: wake_one()
  • Avoids the "thundering herd" problem
    • A lot of waiting processes wake up and fight over the resource; most then go back to sleep (losers)
    • Bad for performance; very bad for bus, cache on SMP machine

Process Limits
• Optional resource limits (accounting)
  • getrlimit(), setrlimit() (user control)
  • Root can establish rlim_cur, rlim_max
    • Usually RLIM_INFINITY
• Resources (RLIMIT_whatever)
  • CPU, FSIZE (file size), DATA (heap), STACK, CORE, RSS (frames), NPROC (# processes), NOFILE (# open files), MEMLOCK, AS

Process Switching - Context
• Hardware context
  • Registers (including page table register)
• Hardware support, but Linux uses software
  • About the same speed currently
  • Software might be optimized more
  • Better control over validity checking
• prev, next task_struct pointers
• Linux TSS (thread_struct)
  • Base registers, floating-point, debug, etc.
  • I/O permissions bitmap
    • Intel feature to allow userland access to I/O ports!
    • ioperm(), iopl() (Intel-specific)

Process Switching – switch_to()
• Invoked by schedule()
• Very Intel-specific (mostly assembly code)
  • GCC magic makes for tough reading
• Some highlights
  • Save basic registers
  • Switch to kernel stack of next
  • Save fp registers if necessary
  • Unlock TSS
  • Load ldtr, cr3 (paging)
  • Load debug registers (breakpoints)
  • Return

Process Switching – FP Registers
• This is pretty weird
  • Pentium: on-chip FPU
  • Backwards compatibility, ESCAPE prefix
  • Not saved by default
  • MMX instructions use the FPU
• FP registers
  • Saved "on demand", reloaded "when needed" (lazily)
  • TS flag set on context switch
  • FP instructions cause an exception (device unavailable)
  • Kernel intervenes by loading FP regs, clearing TS
  • unlazy_fpu(), math_state_restore()

What Is Wrong with Processes?
• Processes do not share resources very well, so the cost of context switching is very high.
• Process creation and deletion are expensive.
• Context switching is a real bottleneck in real-time and interactive systems.
• Solution? Threads!
  • The idea: do not change the address space, only the stack and control block.
  • Extensive sharing makes CPU switching among peer threads, and creation of threads, inexpensive compared to processes
• A thread context switch still requires
  • A register set switch
  • But no memory-management-related work

Threads
• Threads share some of the resources
• A thread is a lightweight process and is the basic unit of CPU utilization.
• A thread comprises
  • Thread ID
  • Program counter
  • Register set
  • Stack space
• A thread shares
  • Code section
  • Data section
  • OS resources such as open files and signals, belonging to the task

Threads: Lightweight Processes
• Figure: (a) three processes, each with one thread; (b) one process with three threads
• Figure labels: execution Environment (resource)

Thread Model
• Threads in the same process share resources
• Each thread executes separately

Thread Model: Stack

Thread Model: State
• Thread states are Ready, Blocked, Running, and Terminated
• Threads share the CPU, and on a single-processor machine only one thread can run at a time
• Thread management can create child threads, which can block waiting for a system call to complete
• No protection among threads!!

An example program

    /* csapp.h: helper header from CS:APP; provides the error-checking
       Pthread_* wrappers used below */
    #include "csapp.h"

    void *thread(void *vargp);

    int main()
    {
        pthread_t tid;                             /* stores the new thread ID */
        Pthread_create(&tid, NULL, thread, NULL);  /* create a new thread */
        Pthread_join(tid, NULL);                   /* main thread waits for the other thread to terminate */
        exit(0);                                   /* main thread exits */
    }

    void *thread(void *vargp)                      /* thread routine */
    {
        printf("Hello, world!\n");
        return NULL;
    }

Thread Usage: Web Server

Web Server
• Rough outline of code for previous slide
  • (a) Dispatcher thread
  • (b) Worker thread

Benefits of Threads
• Responsiveness
  • Multi-threading allows an application to keep running even if part of it is blocked
• Resource sharing
  • Sharing of memory, files, and other resources of the process to which the threads belong
• Economy
  • Creating and managing processes is much more costly and time consuming than creating and managing threads
• Utilization of multiprocessor architectures
  • Each thread can run in parallel on a different processor

Implementing Threads in User Space (old Linux) A user-level threads package

User-level Threads
• Advantages
  • Fast context switching:
    • User-level threads are implemented by user-level thread libraries rather than system calls, so there are no calls to the OS and no interrupts to the kernel
    • One key difference from processes: when a thread has finished running for the moment, it can call thread_yield. This call (a) saves the thread's information in the thread table itself, and (b) calls the thread scheduler to pick another thread to run.
    • The procedure that saves the local thread state and the scheduler are both local procedures, so there is no trap to the kernel, no context switch, and no memory switch, which makes thread scheduling very fast.
  • Customized scheduling

User-level Threads
• Disadvantages
  • Blocking
    • If the kernel is single-threaded, any user-level thread can block the entire task by executing a single blocking system call
  • No protection
    • There is no protection between threads, since the threads share the memory space

Implementing Threads in the Kernel (Windows 2000/XP) A threads package managed by the kernel

Hybrid Implementations (Solaris) Multiplexing user-level threads onto kernel-level threads

Kernel Threads (Linux)
• Kernel threads differ from regular processes:
  • Each kernel thread executes a single specific kernel C function
    • A regular process executes kernel functions only through system calls
  • Kernel threads run only in kernel mode
    • Regular processes run alternately in kernel mode and user mode
  • Kernel threads use a smaller linear address space than regular processes

Multi-threading Models
• Many-to-One Model: many user threads are mapped to one kernel thread
  • Advantage:
    • Thread management is done in user space, so it is efficient
  • Disadvantages:
    • The entire process blocks if a thread makes a blocking call to the kernel
    • Because only one thread can access the kernel at a time, no parallelism on multiprocessors is possible
• One-to-One Model: each user thread maps to one kernel thread
  • Advantages:
    • More concurrency than in the many-to-one model
    • Multiple threads can run in parallel on multiprocessors
  • Disadvantage:
    • Creating a user thread requires creating the corresponding kernel thread, and the overhead of creating kernel threads can burden performance

Multi-threading Models
• Many-to-Many Model: many user threads are multiplexed onto a smaller or equal number of kernel threads (Sparc machine)
  • Advantages:
    • The application can create as many user threads as it wants
    • Kernel threads run in parallel on multiprocessors
    • When a thread blocks, another thread can still run

Cons and Pros of Threads
• Grew up in the OS world (processes) and evolved into a user-level tool. Proposed as a solution for a variety of problems.
• Should every programmer be a threads programmer?
• Challenge: making single-threaded code multithreaded (global variables!)
• Problem: threads are very hard to program.
• Alternative: events.
• Claims:
  • For most purposes proposed for threads, events are better.
  • Threads should be used only when true CPU concurrency is needed.

What's Wrong With Threads?
• Too hard for most programmers to use.
• Even for experts, development is painful.
• (Figure: a scale from casual programmers to wizards, with nested groups: all programmers, Visual Basic programmers, C programmers, C++ programmers, threads programmers)

Why Threads Are Hard
• Synchronization:
  • Must coordinate access to shared data with locks.
  • Forget a lock? Corrupted data.
• Deadlock:
  • Circular dependencies among locks.
  • Each process waits for some other process: system hangs.
• (Figure: thread 1 and thread 2, lock A and lock B)
