Processes
Announcements • All office hours have been finalized. Available on web. • CS 4410 Homework available via CMS • Due next Tuesday, September 9th • CS 4411 initial design documents due Monday, Sept 8th • Project due following Wednesday, September 17th • Everyone should have access to CMS • (http://cms.csuglab.cornell.edu) • Check and contact me (hweather@cs.cornell.edu) or Bill Hogan (whh@cs.cornell.edu) today if you do not have access to CMS • Also, everyone should have a CSUGLab account • Contact Bill or me if you do not
Announcements • Optional architecture review session today • 315 Upson Hall @ 6:00pm • Computer Networks CS 4450/5450 has been changed • Will be held at Mon/Wed 8:40-9:55 am in Hollister 110.
Review: Protecting Processes from Each Other • Problem: multiplexing resources (fixed pie, infinite demand) • Run multiple applications in such a way that they are protected from one another • Goal: • Keep User Programs from Crashing OS • Keep User Programs from Crashing each other • [Keep Parts of OS from crashing other parts?] • (Some of the required) Mechanisms: • Dual Mode Operation • Address Translation (base/limit registers, page tables, etc) • Privileged instructions (set timer, I/O, etc) • Simple Policy: • Programs are not allowed to read/write memory of other Programs or of Operating System
Review: OS Structure • Monolithic • Advantages: performance • Disadvantages: difficult to extend, debug, secure, and make reliable • Layered • Advantages: simplicity of construction, debugging, and extensible • Disadvantages: defining layers, performance overhead • Micro-kernel • Advantages: easy to extend, port. More reliable and secure. • Disadvantage: performance overhead • Modular • Advantages: monolithic performance with layered flexibility • Disadvantages: modules can still crash system • Virtual Machines • Advantages: protection/isolation, great systems building tool • Disadvantage: difficult to implement
Review: Virtualization • Very old idea • IBM in the 1960s used the term for virtual machines (e.g. CP-40) • 21st century revival (e.g. Denali [Steve Gribble], Xen [Ian Pratt, Steven Hand], VMWare) • Is a very broad term • Platform virtualization: involves the simulation of virtual machines • Full virtualization: VM simulates complete HW, runs unmodified OS • IBM’s CP-40, IBM’s z/VM, VMWare Server/Workstation • Partial virtualization (or “address space virtualization”): VM simulates much (but not all) HW - each VM consists of a different address space • MIT’s CTSS, IBM’s M44/44X • Paravirtualization: VM does not simulate HW, instead offers special API • IBM's CMS, IBM’s z/VM, Xen • Resource virtualization: involves the simulation of combined, fragmented, or simplified resources • RAID, Volume Managers, Storage virtualization, VPN’s, NAT’s • Partitioning vs encapsulation
Review: x86 Virtualization • x86 is particularly difficult to fully virtualize. • Hard to present the illusion of a complete set of standard hardware • Significant costs in hypervisor complexity and runtime performance • Recent x86 CPU virtualization instructions let a hypervisor control ring0 hardware access • Create a new “Ring -1” • Guest OS can run Ring0 operations natively without affecting other guests or the host OS • Both Intel's "Vanderpool" (or VT) and AMD's "Pacifica" (AMD-V) • Although they are mutually incompatible
Why Processes? Simplicity + Speed • Hundreds of things going on in the system • How to make things simple? • Separate each in an isolated process • Decomposition • How to speed-up? • Overlap I/O bursts of one process with CPU bursts of another • [Figure: Finite Pie, Infinite Demand: gcc, nfsd, emacs, ls, www, and lpr running on top of the OS]
Goals for Today • What are processes? • Differences between processes and programs • Creating and running a program • Process details • States • Data structures • Creating new processes • Inter-process communication
What is a process? • A task created by the OS, running in a restricted virtual machine environment –a virtual CPU, virtual memory environment, interface to the OS via system calls • The unit of execution • The unit of scheduling • Thread of execution + address space • Is a program in execution • Sequential, instruction-at-a-time execution of a program. The same as “job” or “task” or “sequential process”
What is a program? A program consists of: • Code: machine instructions • Data: variables stored and manipulated in memory • initialized variables (globals) • dynamically allocated variables (malloc, new) • stack variables (C automatic variables, function arguments) • DLLs: libraries that were not compiled or linked with the program • containing code & data, possibly shared with other programs • mapped files: memory segments containing variables (mmap()) • used frequently in database programs • What's the relationship between a program and a process? • A process is an executing program
Preparing a Program • source file → compiler/assembler → .o files • .o files + static libraries (libc, streams…) → Linker → Executable file (must follow standard format, such as ELF on Linux, Microsoft PE on Windows) • Executable file contents: Header, Code, Initialized data, BSS, Symbol table, Line numbers, Ext. refs
Running a program • OS creates a “process” and allocates memory for it • The loader: • reads and interprets the executable file • sets process’s memory to contain code & data from executable • pushes “argc”, “argv”, “envp” on the stack • sets the CPU registers properly & calls “__start()” [Part of CRT0] • Program starts running at __start(), which calls main() • we say “process” is running, and no longer think of “program” • When main() returns, CRT0 calls “exit()” • destroys the process and returns all resources
Process != Program • Program is passive • Code + data • Process is running program • stack, regs, program counter • Example: • We both run IE: • Same program • Separate processes • [Figure: the Executable (Header, Code, Initialized data, BSS, Symbol table, Line numbers, Ext. refs) beside the Process address space (Code, Initialized data, BSS, Heap, Stack, DLL’s, mapped segments)]
Process States • Many processes in system, only one on CPU • “Execution State” of a process: • Indicates what it is doing • Basically 3 states: • Ready: waiting to be assigned to the CPU • Running: executing instructions on the CPU • Waiting: waiting for an event, e.g. I/O completion • Process moves across different states
Process State Transitions • New → Ready: admitted • Ready → Running: dispatch • Running → Ready: interrupt • Running → Waiting: I/O or event wait • Waiting → Ready: I/O or event completion • Running → Exit: done • Processes hop across states as a result of: • Actions they perform, e.g. system calls • Actions performed by OS, e.g. rescheduling • External actions, e.g. I/O
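The transition diagram can be sketched as a small state machine. This is only an illustration: the names `proc_state`, `proc_event`, and `proc_step` are hypothetical, and the set of legal transitions (admitted, dispatch, interrupt, I/O or event wait, I/O or event completion, done) is taken from the slide's diagram.

```c
#include <stdbool.h>

/* Hypothetical names; states and events mirror the slide's diagram. */
typedef enum { P_NEW, P_READY, P_RUNNING, P_WAITING, P_EXIT } proc_state;

typedef enum {
    EV_ADMIT,     /* New     -> Ready   (admitted) */
    EV_DISPATCH,  /* Ready   -> Running (dispatch) */
    EV_INTERRUPT, /* Running -> Ready   (interrupt) */
    EV_WAIT,      /* Running -> Waiting (I/O or event wait) */
    EV_COMPLETE,  /* Waiting -> Ready   (I/O or event completion) */
    EV_DONE       /* Running -> Exit    (done) */
} proc_event;

/* Applies one event. Returns false and leaves *s unchanged when the
 * diagram has no such transition out of the current state. */
bool proc_step(proc_state *s, proc_event ev) {
    switch (*s) {
    case P_NEW:
        if (ev == EV_ADMIT) { *s = P_READY; return true; }
        break;
    case P_READY:
        if (ev == EV_DISPATCH) { *s = P_RUNNING; return true; }
        break;
    case P_RUNNING:
        if (ev == EV_INTERRUPT) { *s = P_READY; return true; }
        if (ev == EV_WAIT) { *s = P_WAITING; return true; }
        if (ev == EV_DONE) { *s = P_EXIT; return true; }
        break;
    case P_WAITING:
        if (ev == EV_COMPLETE) { *s = P_READY; return true; }
        break;
    case P_EXIT:
        break; /* terminal state: nothing leaves Exit */
    }
    return false;
}
```

Note that a Waiting process never goes straight back to Running: it must pass through Ready and be dispatched again, which is why `proc_step` rejects `EV_DISPATCH` in `P_WAITING`.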
Process Data Structures • OS represents a process using a PCB • Process Control Block • Has all the details of a process: • Process Id • Security Credentials • Username of owner • Process State • General Purpose Registers • Queue Pointers • Stack Pointer • Signal Masks • Program Counter • Memory Management • Accounting Info • …
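As a rough sketch, a PCB might be laid out as a C struct like the one below. The field list follows the slide; the field types, `NUM_GP_REGS`, and the `pcb_queue` helpers are assumptions made for illustration, not the layout of any real kernel.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

#define NUM_GP_REGS 16 /* assumption: the real count is machine dependent */

enum proc_state { PROC_READY, PROC_RUNNING, PROC_WAITING };

/* Hypothetical PCB layout; one field per item on the slide. */
struct pcb {
    pid_t pid;                     /* Process Id */
    uid_t owner;                   /* Security credentials / owner */
    enum proc_state state;         /* Process State */
    uint64_t gp_regs[NUM_GP_REGS]; /* General Purpose Registers */
    uint64_t stack_pointer;        /* Stack Pointer */
    uint64_t program_counter;      /* Program Counter */
    uint64_t signal_mask;          /* Signal Masks */
    void *page_table;              /* Memory Management (placeholder) */
    uint64_t cpu_ticks_used;       /* Accounting Info */
    struct pcb *next;              /* Queue Pointer: link in a scheduler queue */
};

/* A FIFO of PCBs, e.g. a ready queue, threaded through pcb.next. */
struct pcb_queue { struct pcb *head, *tail; };

void pcb_enqueue(struct pcb_queue *q, struct pcb *p) {
    p->next = NULL;
    if (q->tail) q->tail->next = p; else q->head = p;
    q->tail = p;
}

/* Removes and returns the oldest PCB, or NULL if the queue is empty. */
struct pcb *pcb_dequeue(struct pcb_queue *q) {
    struct pcb *p = q->head;
    if (p) {
        q->head = p->next;
        if (!q->head) q->tail = NULL;
        p->next = NULL;
    }
    return p;
}
```

The queue pointer lives inside the PCB itself, so moving a process between queues (ready, waiting on a device) needs no extra allocation.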
Context Switch • For a running process • All registers are loaded in CPU and modified • E.g. Program Counter, Stack Pointer, General Purpose Registers • When process relinquishes the CPU, the OS • Saves register values to the PCB of that process • To execute another process, the OS • Loads register values from PCB of that process • Context Switch • Process of switching CPU from one process to another • Very machine dependent for types of registers
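The save/load steps can be simulated in user space. The sketch below is not real kernel code: `struct cpu` stands in for the hardware registers, `NUM_GP_REGS` is an arbitrary assumption, and a real switch is written in machine-dependent assembly.

```c
#include <stdint.h>

#define NUM_GP_REGS 4 /* assumption: kept small for the illustration */

/* Stand-in for the CPU's register file. */
struct cpu {
    uint64_t pc;              /* Program Counter */
    uint64_t sp;              /* Stack Pointer */
    uint64_t gp[NUM_GP_REGS]; /* General Purpose Registers */
};

/* Hypothetical, stripped-down PCB holding only the saved registers. */
struct pcb {
    int pid;
    struct cpu saved;
};

/* Saves the running process's registers into its PCB, then loads the
 * registers of the next process from its PCB. */
void context_switch(struct cpu *cpu, struct pcb *from, struct pcb *to) {
    from->saved = *cpu; /* save register values to the PCB of `from` */
    *cpu = to->saved;   /* load register values from the PCB of `to` */
}
```

After `context_switch(&cpu, &a, &b)` the simulated CPU continues where `b` last stopped, and `a` can later be resumed from `a.saved` with no visible gap in its execution.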
Details of Context Switching • Very tricky to implement • OS must save state without changing state • Should run without touching any registers • CISC: single instruction saves all state • RISC: reserve registers for kernel • Or way to save a register and then continue • Overheads: CPU is idle during a context switch • Explicit: • direct cost of loading/storing registers to/from main memory • Implicit: • Opportunity cost of flushing useful caches (cache, TLB, etc.) • Wait for pipeline to drain in pipelined processors
How to create a process? • Double click on an icon? • After boot OS starts the first process • E.g. sched for Solaris, ntoskrnl.exe for XP • The first process creates other processes: • the creator is called the parent process • the created is called the child process • the parent/child relationship is expressed by a process tree • For example, in UNIX the second process is called init • it creates all the gettys (login processes) and daemons • it should never die • it controls the system configuration (#processes, priorities…) • Explorer.exe in Windows for graphical interface
Processes Under UNIX • fork() system call is the only way to create a new process • int fork() does many things at once: • creates a new address space (called the child) • copies the parent’s address space into the child’s • starts a new thread of control in the child’s address space • parent and child are equivalent -- almost • in parent, fork() returns a non-zero integer (the child’s process id) • in child, fork() returns zero • the difference allows parent and child to tell themselves apart • int fork() returns TWICE!
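Example
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Run as: ./a.out <name> */
int main(int argc, char **argv) {
    char *myName = argv[1];
    int cpid = fork();   /* 0 in the child, the child's pid in the parent */
    if (cpid == 0) {
        printf("The child of %s is %d\n", myName, (int)getpid());
        exit(0);
    } else {
        printf("My child is %d\n", cpid);
        exit(0);
    }
}
What does this program print?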
Bizarre But Real
lace:tmp<15> cc a.c
lace:tmp<16> ./a.out foobar
The child of foobar is 23874
My child is 23874
[Figure: Parent and Child above the Operating System. fork() enters the OS once; retsys returns twice, with v0=23874 in the Parent and v0=0 in the Child]
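Because fork() copies the parent's address space into the child's, a write made by the child after the fork is not visible to the parent. A minimal sketch of that behavior on a POSIX system follows; the function name `counter_after_child_write` is made up for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that overwrites its own copy of `counter`, waits for the
 * child to finish, and returns the parent's value of `counter`.
 * Returns -1 if fork() or waitpid() fails. */
int counter_after_child_write(void) {
    int counter = 1;
    pid_t cpid = fork();
    if (cpid < 0)
        return -1;
    if (cpid == 0) {
        counter = 99; /* changes only the child's copy */
        _exit(counter == 99 ? 0 : 1);
    }
    int status;
    if (waitpid(cpid, &status, 0) < 0)
        return -1;
    return counter; /* the parent's copy was never touched */
}
```

The parent still sees its original value even though the child has already run and exited, which is what makes explicit IPC necessary when processes need to share data.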
Fork is half the story • fork() gets us a new address space, • but the child starts out as a copy of the parent in EVERYTHING • memory, operating system state • int exec(char *programName) completes the picture • throws away the contents of the calling address space • replaces it with the program named by programName • starts executing at header.startPC • Does not return on success • Pros: Clean, simple • Con: duplicate operations (fork copies an address space that exec then throws away)
Starting a new program
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Run as: ./a.out <name> */
int main(int argc, char **argv) {
    char *myName = argv[1];
    int cpid = fork();
    if (cpid == 0) {
        printf("The child of %s is %d\n", myName, (int)getpid());
        execlp("/bin/ls",     /* executable name */
               "ls", NULL);   /* null terminated argv */
        /* Reached only if execlp() failed */
        printf("OH NO. THEY LIED TO ME!!!\n");
    } else {
        printf("My child is %d\n", cpid);
        exit(0);
    }
}
Process Termination • Process executes last statement and asks the OS to delete it (exit) • Output data from child to parent (via wait) • Process’ resources are deallocated by operating system • Parent may terminate execution of child process (abort) • Child has exceeded allocated resources • Task assigned to child is no longer required • If parent is exiting • Some OSes don’t allow child to continue if parent terminates • All children terminated - cascading termination
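On a POSIX system, both termination paths can be sketched with waitpid(): a child that exits on its own passes a status back to its parent, and a parent can abort a child with a signal. The function names `collect_exit_status` and `abort_child` are hypothetical, and using SIGKILL for the abort is just one possible choice.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child exits with `code`; the parent retrieves it via waitpid().
 * Returns the child's exit status, or -1 on error. */
int collect_exit_status(int code) {
    pid_t cpid = fork();
    if (cpid < 0)
        return -1;
    if (cpid == 0)
        _exit(code); /* child: terminate immediately with `code` */
    int status;
    if (waitpid(cpid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Parent aborts a child that would otherwise never finish.
 * Returns the signal that terminated the child, or -1 on error. */
int abort_child(void) {
    pid_t cpid = fork();
    if (cpid < 0)
        return -1;
    if (cpid == 0) {
        for (;;)
            pause(); /* child: sleep until a signal arrives */
    }
    if (kill(cpid, SIGKILL) < 0)
        return -1;
    int status;
    if (waitpid(cpid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

In both cases the parent calls waitpid() so the OS can release the child's remaining bookkeeping; until then the terminated child lingers as a zombie entry.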
Why IPC? • Independent vs Cooperating processes • Why let processes cooperate? • Information sharing • Computation speedup • Modularity • Convenience • Two fundamental models • Message Passing • Shared Memory
Shared Memory • Processes establish a segment of memory as shared • Typically part of the memory of the process creating the shared memory. Other processes attach this to their memory space. • Requires processes to agree to remove memory protection for the shared section • Recall that the OS normally protects processes from writing in each other's memory.
Producer/Consumer using shared memory • Producer process produces information consumed by Consumer process. • Very common paradigm.
#define BUFFER_SIZE 10
typedef struct { ..some stuff.. } item;
item buffer[BUFFER_SIZE];
int in = 0;    /* next free slot, advanced by the producer */
int out = 0;   /* next full slot, advanced by the consumer */
Producer/Consumer (1/2) • Producer process:
item nextProduced;
while (true) {
    /* Produce an item in nextProduced */
    while (((in + 1) % BUFFER_SIZE) == out)
        ;   /* do nothing: buffer is full */
    buffer[in] = nextProduced;
    in = (in + 1) % BUFFER_SIZE;
}
Producer/Consumer (2/2) • Consumer process:
item nextConsumed;
while (true) {
    while (in == out)
        ;   /* do nothing: buffer is empty */
    nextConsumed = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    /* Consume item in nextConsumed */
}
Synchronization • The previous code only allows BUFFER_SIZE-1 items at the same time • To remedy this, the processes would need to synchronize their access to the buffer. (This is a large topic, later).
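The BUFFER_SIZE-1 limit follows from the full test `(in + 1) % BUFFER_SIZE == out`: one slot must stay empty so that "full" can be told apart from "empty" (`in == out`). The single-process sketch below shows this. It replaces the busy-wait loops with non-blocking `try_produce`/`try_consume` helpers, which are hypothetical, and assumes `item` is an `int` since the slides elide the real type.

```c
#include <stdbool.h>

#define BUFFER_SIZE 10
typedef int item; /* assumption: the slides elide the item type */

static item buffer[BUFFER_SIZE];
static int in = 0;  /* next free slot */
static int out = 0; /* next full slot */

/* Same full test as the producer loop, but returns instead of spinning. */
bool try_produce(item x) {
    if ((in + 1) % BUFFER_SIZE == out)
        return false; /* full: one slot is always left unused */
    buffer[in] = x;
    in = (in + 1) % BUFFER_SIZE;
    return true;
}

/* Same empty test as the consumer loop, but returns instead of spinning. */
bool try_consume(item *x) {
    if (in == out)
        return false; /* empty */
    *x = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    return true;
}

/* Produces until the buffer reports full; returns how many items fit. */
int fill_count(void) {
    int n = 0;
    while (try_produce(n))
        n++;
    return n;
}
```

Starting from an empty buffer, `fill_count()` returns `BUFFER_SIZE - 1`. Using all `BUFFER_SIZE` slots would need extra shared state such as an item count, and updating that safely from two processes is where synchronization comes in.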
Example
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char **argv) {
    char *shared_memory;
    const int size = 4096;
    /* Create the segment before fork() so both processes know its id */
    int segment_id = shmget(IPC_PRIVATE, size, S_IRUSR | S_IWUSR);
    int cpid = fork();
    if (cpid == 0) {
        shared_memory = (char *) shmat(segment_id, NULL, 0);
        sprintf(shared_memory, "Hi from process %d", (int)getpid());
        shmdt(shared_memory);
    } else {
        wait(NULL);   /* the child has finished writing once wait() returns */
        shared_memory = (char *) shmat(segment_id, NULL, 0);
        printf("Process %d read: %s\n", (int)getpid(), shared_memory);
        shmdt(shared_memory);
        shmctl(segment_id, IPC_RMID, NULL);
    }
    return 0;
}
Message Passing • Send(P, msg): Send msg to process P • Recv(Q, msg): Receive msg from process Q • Typically requires kernel intervention • Naming: • Hardcode sender/receiver • Indirection using mailboxes/ports
Synchronization • Possible primitives: • Blocking send/receive • Non-blocking send/receive • Also known as synchronous and asynchronous. • When both send and receive are blocking, we have a rendezvous between the processes. Other combinations need buffering.
Buffering • Zero capacity buffer • Needs synchronous sender. • Bounded capacity buffer • If the buffer is full, the sender blocks. • Unbounded capacity buffer • The sender never blocks.
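A UNIX pipe is one concrete instance of message passing through a bounded-capacity buffer: the kernel holds the data, a blocking read() waits while the pipe is empty, and a blocking write() waits while it is full. The sketch below sends one message from a child to its parent. The function name `send_via_pipe` is made up, and a pipe carries a byte stream rather than discrete messages, so this is an approximation of Send/Recv rather than an exact match.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child sends `msg` through a pipe; the parent receives it into `buf`
 * (always NUL-terminated). Returns the number of bytes received, or -1. */
ssize_t send_via_pipe(const char *msg, char *buf, size_t buflen) {
    int fds[2]; /* fds[0] is the read end, fds[1] is the write end */
    if (buflen == 0 || pipe(fds) < 0)
        return -1;
    pid_t cpid = fork();
    if (cpid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (cpid == 0) { /* child: the sender */
        close(fds[0]);
        size_t len = strlen(msg);
        ssize_t w = write(fds[1], msg, len); /* blocks if the pipe is full */
        close(fds[1]);
        _exit(w == (ssize_t)len ? 0 : 1);
    }
    close(fds[1]); /* parent: the receiver; close the unused write end */
    size_t total = 0;
    ssize_t n;
    /* read() blocks until data arrives, and returns 0 once the sender
     * has closed its end of the pipe. */
    while (total < buflen - 1 &&
           (n = read(fds[0], buf + total, buflen - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(fds[0]);
    waitpid(cpid, NULL, 0);
    return (ssize_t)total;
}
```

Calling `send_via_pipe("hello", buf, sizeof buf)` with a 32-byte `buf` leaves `"hello"` in `buf`. The parent needs no explicit synchronization with the child, because the blocking read already orders the receive after the send.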
Summary • The unit of execution and scheduling • Thread of execution (next time) + address space • A task created by the OS, running in a restricted virtual machine environment –a virtual CPU, virtual memory environment, interface to the OS via system calls • Sequential, instruction-at-a-time execution of a program. Operating system abstraction to represent what is needed to run a single, multithreaded program • Program != Process • A process is a program in execution • Abstraction used for protection • Main Memory State (contents of Address Space) • Multiprogramming: overlap IO and CPU • Context Switches are expensive