ecs150 Fall 2007 : Operating System #2: Scheduling and Mutual Exclusion (chapter 4)

ecs150 Fall 2007:Operating System#2: Scheduling and Mutual Exclusion(chapter 4) Dr. S. Felix Wu Computer Science Department University of California, Davis http://www.cs.ucdavis.edu/~wu/ sfelixwu@gmail.com ecs150 Fall 2007

Kernel and User Space Memory space for this process Process FOO program System call (or trap into the kernel) conceptually Process FOO in the Kernel System Call Kernel Resources (disk or IO devices) ecs150 Fall 2007

Running Waiting Ready States of a Process • Running, Blocked, and Ready ecs150 Fall 2007

Running Running Running Running Blocked Blocked Blocked Blocked Ready Ready Ready Ready Scheduling & Context Switching ecs150 Fall 2007

Basic Concepts • Maximum CPU utilization obtained with multiprogramming • CPU–I/O Burst Cycle – Process execution consists of a cycle of CPU execution and I/O wait. • CPU burst distribution • Processes classified as CPU bound or I/O bound ecs150 Fall 2007

CPU Scheduler • Selects from among the processes in memory that are ready to execute, and allocates the CPU to one of them. • CPU scheduling decisions may take place when a process: 1. Switches from running to waiting state. 2. Switches from running to ready state. 3. Switches from waiting to ready. 4. Terminates. • Scheduling under 1 and 4 is nonpreemptive. • All other scheduling is preemptive. ecs150 Fall 2007

Preemptive vs.Nonpreemptive • Preemptive: • Nonpreemptive: • Pros and Cons….. ecs150 Fall 2007

User P  Kernel Process • Kernel P  Kernel Process ecs150 Fall 2007

Context Switching ecs150 Fall 2007

Dispatcher • Dispatcher module gives control of the CPU to the process selected by the short-term scheduler; this involves: • switching context • switching to user mode • jumping to the proper location in the user program to restart that program • Dispatch latency – time it takes for the dispatcher to stop one process and start another running. ecs150 Fall 2007

Preemptive Scheduling • fixed time window timer/clock interrupt • scheduling decision • “bill” the process • context switching might or might not happen • Priority • Fairness • … ecs150 Fall 2007

Example: two schedules 100 msec slot with around 5 msec context switching overhead…. Which one is better? And Why? ecs150 Fall 2007

Optimization Criteria • Max CPU utilization • Max throughput • Min turnaround time • Min waiting time • Min response time • but obviously, we have some trade-off… ecs150 Fall 2007

P1 P2 P3 0 24 27 30 First-Come, First-Served (FCFS) ProcessBurst Time P1 24 P2 3 P3 3 • Suppose that the processes arrive in the order: P1 , P2 , P3 The Gantt Chart for the schedule is: • Waiting time for P1 = 0; P2 = 24; P3 = 27 • Average waiting time: (0 + 24 + 27)/3 = 17 ecs150 Fall 2007

Example 1 ecs150 Fall 2007

Shortest-Job-First (SJF) • Associate with each process the length of its next CPU burst. Use these lengths to schedule the process with the shortest time. • Two schemes: • nonpreemptive – once CPU given to the process it cannot be preempted until completes its CPU burst. • preemptive – if a new process arrives with CPU burst length less than remaining time of current executing process, preempt. This scheme is know as the Shortest-Remaining-Time-First (SRTF). • SJF is optimal – gives minimum average waiting time for a given set of processes. ecs150 Fall 2007

P1 P3 P2 P4 0 3 7 8 12 16 Example Process Arrival TimeBurst Time P1 0.0 7 P2 2.0 4 P3 4.0 1 P4 5.0 4 • SJF (non-preemptive) • Average waiting time = (0 + 6 + 3 + 7)/4 - 4 ecs150 Fall 2007

Issues for SJF • ??? ecs150 Fall 2007

CPU Bursts • CPU–I/O Burst Cycle – Process execution consists of a cycle of CPU execution and I/O wait. • CPU burst distribution • Processes classified as CPU bound or I/O bound ecs150 Fall 2007

Determining Length of Next CPU Burst • Can only estimate the length. • Can be done by using the length of previous CPU bursts, using exponential averaging. ecs150 Fall 2007

ecs150 Fall 2007

Priority Scheduling • A priority number (integer) is associated with each process • The CPU is allocated to the process with the highest priority (smallest integer  highest priority). • Preemptive • Non-preemptive • SJF is a priority scheduling where priority is the predicted next CPU burst time. • FCFS is a priority scheduling where priority is the arrival time. ecs150 Fall 2007

“Fixed” Priority • What is it? • The process sticks with the origin assigned priority. • A good or bad idea? • What other possible policy? • Dynamic policy. • Problem  Starvation – low priority processes may never execute. • Solution  Aging – as time progresses increase the priority of the process. • Hybrid policy. ecs150 Fall 2007

Round Robin (RR) • Each process gets a small unit of CPU time (time quantum), usually 10-100 milliseconds. After this time has elapsed, the process is preempted and added to the end of the ready queue. • If there are n processes in the ready queue and the time quantum is q, then each process gets 1/n of the CPU time in chunks of at most q time units at once. No process waits more than (n-1)q time units. • Performance • q large  FIFO • q small  q must be large with respect to context switch, otherwise overhead is too high. ecs150 Fall 2007

P1 P2 P3 P4 P1 P3 P4 P1 P3 P3 0 20 37 57 77 97 117 121 134 154 162 ProcessBurst Time P1 53 P2 17 P3 68 P4 24 • The Gantt chart is: • Typically, higher average turnaround than SJF, but better response. ecs150 Fall 2007

I/O system calls 1 2 3 ecs150 Fall 2007

Past CPU Usage • Round-Robin • How many CPU cycles so far • fairness • How about #3? ecs150 Fall 2007

Past CPU Usage • Round-Robin • How many CPU cycles so far • fairness • How many CPU cycles you have used lately • aging factor: 4 cycles ecs150 Fall 2007

BSD Scheduling • For each tick (10 msec): • Piestcpu = Piestcpu + Tick(Pi) • For each decaying period (1 second): • Piestcpu = Piestcpu*(2*load)/(2*load+1)+Pinice • load: # of processes in the ready queue • Example: 1 process in the queue • Piestcpu = Piestcpu*0.66 + Pinice • in 4 seconds, (0.66)4 = 0.001296 ecs150 Fall 2007

Linux • Linux provides two process scheduling algorithms • Time sharing for fairness • Real-time, where priorities are more important than fairness • Linux allows only user-mode processes to be preempted • kernel mode processes may not be interrupted • Formula – a balance between software real-time and fairness • Picredit = Picredit / 2 + Priorityi ecs150 Fall 2007

Scheduling • Fairness: Round-Robin • Responsiveness: SJF • Utilization: Aging, promoting System Calls. • Fairness in Inter-process communication ecs150 Fall 2007

BSD Scheduling • For each tick (10 msec): • Piestcpu = Piestcpu + Tick(Pi) • For each decaying period (1 second): • Piestcpu = Piestcpu*(2*load)/(2*load+1)+Pinice • load: # of processes in the ready queue • Example: 1 process in the queue • Piestcpu = Piestcpu*0.66 + Pinice • in 4 seconds, (0.66)4 = 0.001296 ecs150 Fall 2007

Load = 1 • 2Load/(2Load+1) ~ 0.66 • Piestcpu = Piestcpu*(0.66)1 – 20 • Piestcpu = Piestcpu*(0.66)2 – 20 • Piestcpu = Piestcpu*(0.66)3 – 20 • Piestcpu = Piestcpu*(0.66)4 – 20 • (0.66)4 = 0.001296 ecs150 Fall 2007

Load = 50 • 2Load/(2Load+1) ~ 0.99 • Piestcpu = Piestcpu*(0.99)1 – 20 • Piestcpu = Piestcpu*(0.99)2 – 20 • Piestcpu = Piestcpu*(0.99)3 – 20 • Piestcpu = Piestcpu*(0.99)4 – 20 • (0.99)4 = 0.9606 ecs150 Fall 2007

1 RR 0 0 : : . 256 different priorities 64 scheduling classes 1 0 1 ecs150 Fall 2007

kg_estcpu 1 RR 0 0 : : . 256 different priorities 64 scheduling classes 0~63 bottom-half kernel (interrupt) 64~127 top-half kernel 128~159 real-time user 160~223 timeshare 224~255 idle 1 0 1 ecs150 Fall 2007

ecs150 Fall 2007

/usr/src/sys/kern/sched_4bsd.c static void schedcpu(void *arg) { ... FOREACH_PROC_IN_SYSTEM(p) { FOREACH_KSEGRP_IN_PROC(p, kg) { awake = 0; FOREACH_KSE_IN_GROUP(kg, ke) { ... } /* end of kse loop */ kg->kg_estcpu = decay_cpu(loadfac, kg->kg_estcpu); resetpriority(kg); FOREACH_THREAD_IN_GROUP(kg, td) { if (td->td_priority >= PUSER) { sched_prio(td, kg->kg_user_pri); } } } /* end of ksegrp loop */ } /* end of process loop */ } ecs150 Fall 2007

/usr/src/sys/kern/sched_4bsd.c static void resetpriority(struct ksegrp *kg) { register unsigned int newpriority; struct thread *td; if (kg->kg_pri_class == PRI_TIMESHARE) { newpriority = PUSER + kg->kg_estcpu / INVERSE_ESTCPU_WEIGHT + NICE_WEIGHT * (kg->kg_nice - PRIO_MIN); newpriority = min(max(newpriority, PRI_MIN_TIMESHARE), PRI_MAX_TIMESHARE); kg->kg_user_pri = newpriority; } FOREACH_THREAD_IN_GROUP(kg, td) { maybe_resched(td); /* XXXKSE silly */ } } ecs150 Fall 2007

/usr/src/sys/kern/sched_4bsd.c struct kse * sched_choose(void) { struct kse *ke; ke = runq_choose(&runq); if (ke != NULL) { runq_remove(&runq, ke); ke->ke_state = KES_THREAD; KASSERT((ke->ke_thread != NULL), ("runq_choose: No thread on KSE")); KASSERT((ke->ke_thread->td_kse != NULL), ("runq_choose: No KSE on thread")); KASSERT(ke->ke_proc->p_sflag & PS_INMEM, ("runq_choose: process swapped out")); } return (ke); } ecs150 Fall 2007

/usr/src/sys/kern/sched_4bsd.c #define loadfactor(loadav) (2 * (loadav)) #define decay_cpu(loadfac, cpu) (((loadfac) * (cpu)) / ((loadfac) + FSCALE)) static fixpt_t ccpu = 0.95122942450071400909 * FSCALE; #define CCPU_SHIFT 11 ecs150 Fall 2007

/usr/src/sys/kern/kern_synch.c /* * Compute a tenex style load average of a quantity on * 1, 5 and 15 minute intervals. * XXXKSE Needs complete rewrite when correct info is * available. * Completely Bogus.. * only works with 1:1 (but compiles ok now :-) */ static void loadav(void *arg) { } ecs150 Fall 2007

Lottery Scheduling(a dollar and a dream) • A process P(I) has L(I) lottery tickets. • Totally, we have N tickets among all processes. • P(I) has L(I)/N probability to get CPU cycles. • Difference between Priority and Tickets: • Priority: which process is more important? • The importance is “infinitely” big!! • Tickets: how much more important? • Quantify the importance. ecs150 Fall 2007

J, Pri(J) = 1 I, Pri(I) = 120 J,80% I J I I J J J I J J J J J J J I J J J J I J J ecs150 Fall 2007

Lottery Scheduling • At each scheduling decision point: • Ticket assignment: • For each ticket in each process, produce a unique random number (we can not have more than one process sharing the prize). • Ticket drawing: • Produce another random number until a winner is identified. ecs150 Fall 2007

Efficient Implementation • Maybe only the # of tickets matters… • One rand( ) could produce 32 random bits. 0-16 ecs150 Fall 2007

4 2 2 Example ecs150 Fall 2007

/usr/src/sys/kern/sched_4bsd.c static void schedcpu(void *arg) { ... FOREACH_PROC_IN_SYSTEM(p) { mtx_lock_spin(&sched_lock); FOREACH_KSEGRP_IN_PROC(p, kg) { awake = 0; FOREACH_KSE_IN_GROUP(kg, ke) { ... ke->ke_sched->ske_cpticks = 0; } /* end of kse loop */ /* * If there are ANY running threads in this KSEGRP, * then don't count it as sleeping. */ ecs150 Fall 2007

/usr/src/sys/kern/sched_4bsd.c if (awake) { ... kg->kg_slptime = 0; } else { kg->kg_slptime++; } if (kg->kg_slptime > 1) continue; kg->kg_estcpu = decay_cpu(loadfac, kg->kg_estcpu); resetpriority(kg); FOREACH_THREAD_IN_GROUP(kg, td) { if (td->td_priority >= PUSER) { sched_prio(td, kg->kg_user_pri); } } } /* end of ksegrp loop */ ecs150 Fall 2007

/usr/src/sys/kern/sched_4bsd.c void sched_clock(struct kse *ke) { struct ksegrp *kg; struct thread *td; mtx_assert(&sched_lock, MA_OWNED); kg = ke->ke_ksegrp; td = ke->ke_thread; ke->ke_sched->ske_cpticks++; kg->kg_estcpu = ESTCPULIM(kg->kg_estcpu + 1); if ((kg->kg_estcpu % INVERSE_ESTCPU_WEIGHT) == 0) { resetpriority(kg); if (td->td_priority >= PUSER) td->td_priority = kg->kg_user_pri; } } ecs150 Fall 2007

ecs150 Fall 2007 : Operating System #2: Scheduling and Mutual Exclusion (chapter 4)