ecs150 Fall 2011 : Operating System #2: Scheduling & Mutual Exclusion

ecs150 Fall 2011:Operating System#2: Scheduling & Mutual Exclusion Dr. S. Felix Wu Computer Science Department University of California, Davis ecs150 Fall 2011

Kernel and User Space Memory space for this process Process FOO program System call (or trap into the kernel) conceptually Process FOO in the Kernel System Call Kernel Resources (disk or IO devices) ecs150 Fall 2011

Processes > ps PID TTY TIME CMD 2910 pts/4 0:00 tcsh > ps -ef UID PID PPID C STIME TTY TIME CMD root 0 0 0 Sep 25 ? 0:01 sched root 1 0 0 Sep 25 ? 0:00 /etc/init - root 2 0 0 Sep 25 ? 0:00 pageout root 3 0 0 Sep 25 ? 0:01 fsflush root 223 1 0 Sep 25 ? 0:00 /usr/lib/utmpd root 179 1 0 Sep 25 ? 0:00 /usr/sbin/cron root 273 1 0 Sep 25 ? 0:00 /usr/lib/saf/sac -t 300 root 56 1 0 Sep 25 ? 0:00 /usr/lib/devfsadm/devfseventd root 58 1 0 Sep 25 ? 0:00 /usr/lib/devfsadm/devfsadmd root 106 1 0 Sep 25 ? 0:00 /usr/sbin/rpcbind root 197 1 0 Sep 25 ? 0:01 /usr/sbin/nscd root 108 1 0 Sep 25 ? 0:00 /usr/sbin/keyserv root 168 1 0 Sep 25 ? 0:00 /usr/sbin/syslogd root 118 1 0 Sep 25 ? 0:00 /usr/lib/netsvc/yp/ypbind root 159 1 0 Sep 25 ? 0:00 /usr/lib/autofs/automountd ecs150 Fall 2011

ecs150 Fall 2011

Running Waiting Ready States of a Process • Running, Blocked, and Ready ecs150 Fall 2011

Running Running Running Running Blocked Blocked Blocked Blocked Ready Ready Ready Ready Scheduling & Context Switching ecs150 Fall 2011

kg_estcpu 1 RR 0 0 : : . 256 different priorities 64 scheduling classes 0~63 bottom-half kernel (interrupt) 64~127 top-half kernel 128~159 real-time user 160~223 timeshare 224~255 idle 1 0 1 ecs150 Fall 2011

/usr/src/sys/kern/sched_4bsd.c static void schedcpu(void *arg) { ... FOREACH_PROC_IN_SYSTEM(p) { FOREACH_KSEGRP_IN_PROC(p, kg) { awake = 0; FOREACH_KSE_IN_GROUP(kg, ke) { ... } /* end of kse loop */ kg->kg_estcpu = decay_cpu(loadfac, kg->kg_estcpu); resetpriority(kg); FOREACH_THREAD_IN_GROUP(kg, td) { if (td->td_priority >= PUSER) { sched_prio(td, kg->kg_user_pri); } } } /* end of ksegrp loop */ } /* end of process loop */ } ecs150 Fall 2011

Scheduling Basics • Maximum CPU utilization obtained with multiprogramming • CPU–I/O Burst Cycle – Process execution consists of a cycle of CPU execution and I/O wait. • CPU burst distribution • Processes classified as CPU bound or I/O bound ecs150 Fall 2011

Context Switching ecs150 Fall 2011

Dispatcher • Dispatcher module gives control of the CPU to the process selected by the short-term scheduler; this involves: • switching context • switching to user mode • jumping to the proper location in the user program to restart that program • Dispatch latency – time it takes for the dispatcher to stop one process and start another running. ecs150 Fall 2011

ecs150 Fall 2011

5.x Kernel ecs150 Fall 2011

CPU Scheduler • Selects from among the processes in memory that are ready to execute, and allocates the CPU to one of them. • CPU scheduling decisions may take place when a process: 1. Switches from running to waiting state. 2. Switches from running to ready state. 3. Switches from waiting to ready. 4. Terminates. • Scheduling under 1 and 4 is nonpreemptive. • All other scheduling is preemptive. ecs150 Fall 2011

Preemptive vs.Nonpreemptive • Preemptive: • Nonpreemptive: • Pros and Cons….. ecs150 Fall 2011

User P  Kernel Process • Kernel P  Kernel Process ecs150 Fall 2011

Preemptive Scheduling • fixed time window timer/clock interrupt • scheduling decision • “bill” the process • context switching might or might not happen • Priority • Fairness • … ecs150 Fall 2011

Example: two schedules 100 msec slot with around 5 msec context switching overhead…. Which one is better? And Why? ecs150 Fall 2011

Optimization Criteria • Max CPU utilization • Max throughput • Min turnaround time • Min waiting time • Min response time • but obviously, we have some trade-off… ecs150 Fall 2011

P1 P2 P3 0 24 27 30 First-Come, First-Served (FCFS) ProcessBurst Time P1 24 P2 3 P3 3 • Suppose that the processes arrive in the order: P1 , P2 , P3 The Gantt Chart for the schedule is: • Waiting time for P1 = 0; P2 = 24; P3 = 27 • Average waiting time: (0 + 24 + 27)/3 = 17 ecs150 Fall 2011

Example 1 ecs150 Fall 2011

Shortest-Job-First (SJF) • Associate with each process the length of its next CPU burst. Use these lengths to schedule the process with the shortest time. • Two schemes: • nonpreemptive – once CPU given to the process it cannot be preempted until completes its CPU burst. • preemptive – if a new process arrives with CPU burst length less than remaining time of current executing process, preempt. This scheme is know as the Shortest-Remaining-Time-First (SRTF). • SJF is optimal – gives minimum average waiting time for a given set of processes. ecs150 Fall 2011

P1 P3 P2 P4 0 3 7 8 12 16 Example Process Arrival TimeBurst Time P1 0.0 7 P2 2.0 4 P3 4.0 1 P4 5.0 4 • SJF (non-preemptive) • Average waiting time = (0 + 6 + 3 + 7)/4 - 4 ecs150 Fall 2011

Issues for SJF • ??? ecs150 Fall 2011

Hard to Predict? ecs150 Fall 2011

CPU Bursts • CPU–I/O Burst Cycle – Process execution consists of a cycle of CPU execution and I/O wait. • CPU burst distribution • Processes classified as CPU bound or I/O bound ecs150 Fall 2011

Determining Length of Next CPU Burst • Can only estimate the length. • Can be done by using the length of previous CPU bursts, using exponential averaging. ecs150 Fall 2011

ecs150 Fall 2011

Priority Scheduling • A priority number (integer) is associated with each process • The CPU is allocated to the process with the highest priority (smallest integer  highest priority). • Preemptive • Non-preemptive • SJF is a priority scheduling where priority is the predicted next CPU burst time. • FCFS is a priority scheduling where priority is the arrival time. ecs150 Fall 2011

“Fixed” Priority • What is it? • The process sticks with the origin assigned priority. • A good or bad idea? • What other possible policy? • Dynamic policy. • Problem  Starvation – low priority processes may never execute. • Solution  Aging – as time progresses increase the priority of the process. • Hybrid policy. ecs150 Fall 2011

Round Robin (RR) • Each process gets a small unit of CPU time (time quantum), usually 10-100 milliseconds. After this time has elapsed, the process is preempted and added to the end of the ready queue. • If there are n processes in the ready queue and the time quantum is q, then each process gets 1/n of the CPU time in chunks of at most q time units at once. No process waits more than (n-1)q time units. • Performance • q large  FIFO • q small  q must be large with respect to context switch, otherwise overhead is too high. ecs150 Fall 2011

P1 P2 P3 P4 P1 P3 P4 P1 P3 P3 0 20 37 57 77 97 117 121 134 154 162 ProcessBurst Time P1 53 P2 17 P3 68 P4 24 • The Gantt chart is: • Typically, higher average turnaround than SJF, but better response. ecs150 Fall 2011

Priority Scheduling • A priority number (integer) is associated with each process • The CPU is allocated to the process with the highest priority (smallest integer  highest priority). • Preemptive • Non-preemptive • SJF is a priority scheduling where priority is the predicted next CPU burst time. • FCFS is a priority scheduling where priority is the arrival time. ecs150 Fall 2011

I/O system calls 1 2 3 ecs150 Fall 2011

Past CPU Usage • Round-Robin • Insensitive to the CPU usage! • How many CPU cycles so far? • fairness ecs150 Fall 2011

Past CPU Usage • Round-Robin • Insensitive to the CPU usage! • How many CPU cycles so far? • fairness • How many CPU cycles you have used lately? • aging ecs150 Fall 2011

BSD Scheduling • For each tick (10 msec): • Piestcpu = Piestcpu + Tick(Pi) • For each decaying period (1 second): • Piestcpu = Piestcpu*(2*load)/(2*load+1)+Pinice • load: # of processes in the ready queue • Example: 1 process in the queue • Piestcpu = Piestcpu*0.66 + Pinice • in 4 seconds, (0.66)4 = 0.001296 ecs150 Fall 2011

Load = 1 • 2Load/(2Load+1) ~ 0.66 • Piestcpu = Piestcpu*(0.66)1 – 20 • Piestcpu = Piestcpu*(0.66)2 – 20 • Piestcpu = Piestcpu*(0.66)3 – 20 • Piestcpu = Piestcpu*(0.66)4 – 20 • (0.66)4 = 0.001296 ecs150 Fall 2011

Load = 50 • 2Load/(2Load+1) ~ 0.99 • Piestcpu = Piestcpu*(0.99)1 – 20 • Piestcpu = Piestcpu*(0.99)2 – 20 • Piestcpu = Piestcpu*(0.99)3 – 20 • Piestcpu = Piestcpu*(0.99)4 – 20 • (0.99)4 = 0.9606 ecs150 Fall 2011

ecs150 Fall 2011

/usr/src/sys/kern/sched_4bsd.c static void schedcpu(void *arg) { ... FOREACH_PROC_IN_SYSTEM(p) { FOREACH_KSEGRP_IN_PROC(p, kg) { awake = 0; FOREACH_KSE_IN_GROUP(kg, ke) { ... } /* end of kse loop */ kg->kg_estcpu = decay_cpu(loadfac, kg->kg_estcpu); resetpriority(kg); FOREACH_THREAD_IN_GROUP(kg, td) { if (td->td_priority >= PUSER) { sched_prio(td, kg->kg_user_pri); } } } /* end of ksegrp loop */ } /* end of process loop */ } ecs150 Fall 2011

/usr/src/sys/kern/sched_4bsd.c static void resetpriority(struct ksegrp *kg) { register unsigned int newpriority; struct thread *td; if (kg->kg_pri_class == PRI_TIMESHARE) { newpriority = PUSER + kg->kg_estcpu / INVERSE_ESTCPU_WEIGHT + NICE_WEIGHT * (kg->kg_nice - PRIO_MIN); newpriority = min(max(newpriority, PRI_MIN_TIMESHARE), PRI_MAX_TIMESHARE); kg->kg_user_pri = newpriority; } FOREACH_THREAD_IN_GROUP(kg, td) { maybe_resched(td); /* XXXKSE silly */ } } ecs150 Fall 2011

/usr/src/sys/kern/sched_4bsd.c #define loadfactor(loadav) (2 * (loadav)) #define decay_cpu(loadfac, cpu) (((loadfac) * (cpu)) / ((loadfac) + FSCALE)) static fixpt_t ccpu = 0.95122942450071400909 * FSCALE; #define CCPU_SHIFT 11 ecs150 Fall 2011

/usr/src/sys/kern/kern_synch.c ~ 5.4/7.0 /* * Compute a tenex style load average of a quantity on * 1, 5 and 15 minute intervals. * XXXKSE Needs complete rewrite when correct info is * available. * Completely Bogus.. * only works with 1:1 (but compiles ok now :-) */ static void loadav(void *arg) { } ecs150 Fall 2011

/usr/src/sys/kern/sched_4bsd.c struct kse * sched_choose(void) { struct kse *ke; ke = runq_choose(&runq); if (ke != NULL) { runq_remove(&runq, ke); ke->ke_state = KES_THREAD; KASSERT((ke->ke_thread != NULL), ("runq_choose: No thread on KSE")); KASSERT((ke->ke_thread->td_kse != NULL), ("runq_choose: No KSE on thread")); KASSERT(ke->ke_proc->p_sflag & PS_INMEM, ("runq_choose: process swapped out")); } return (ke); } ecs150 Fall 2011

kg_estcpu 1 RR 0 0 : : . 256 different priorities 64 scheduling classes 0~63 bottom-half kernel (interrupt) 64~127 top-half kernel 128~159 real-time user 160~223 timeshare 224~255 idle 1 0 1 ecs150 Fall 2011

Lottery Scheduling(a dollar and a dream) • A process P(I) has L(I) lottery tickets. • Totally, we have N tickets among all processes. • P(I) has L(I)/N probability to get CPU cycles. • Difference between Priority and Tickets: • Priority: which process is more important? • The importance is “infinitely” big!! • Tickets: how much more important? • Quantify the importance. ecs150 Fall 2011

J, Pri(J) = 1 I, Pri(I) = 120 J,80% I J I I J J J I J J J J J J J I J J J J I J J ecs150 Fall 2011

Lottery Scheduling • At each scheduling decision point: • Ticket assignment: • For each ticket in each process, produce a unique random number (we can not have more than one process sharing the prize). • Ticket drawing: • Produce another random number until a winner is identified. ecs150 Fall 2011

Efficient Implementation • Maybe only the # of tickets matters… • One rand( ) could produce 32 random bits. 0-16 ecs150 Fall 2011

ecs150 Fall 2011 : Operating System #2: Scheduling & Mutual Exclusion