Efficient CPU Scheduling Strategies for Operating Systems

Chapter 6: CPU Scheduling

Chapter 6: CPU Scheduling • Basic Concepts • Scheduling Criteria • Scheduling Algorithms • Thread Scheduling • Multiple-Processor Scheduling • Operating Systems Examples • Algorithm Evaluation

Objectives • To introduce CPU scheduling, which is the basis for multiprogrammed operating systems • To describe various CPU-scheduling algorithms • To discuss evaluation criteria for selecting a CPU-scheduling algorithm for a particular system

Basic Concepts • Process execution consists of a cycle of • CPU burst: CPU running process code • I/O burst: process waiting for I/O resource • Goal: Maximize CPU utilization • Requires multitasking and good scheduling

Histogram of CPU-burst Times • I/O-bound processspends more time doing I/O than computations, many short CPU bursts • CPU-bound processspends more time doing computations; few very long CPU bursts

CPU Scheduler • Short-term scheduler selects from among the processes in memory that are ready to execute, and allocates the CPU to one of them • CPU scheduling decisions may take place when a process: 1. Switches from running to waiting state 2. Switches from running to ready state 3. Switches from waiting to ready state 4. Terminates • Scheduling under 1 and 4 is nonpreemptive • Once a process is running, it keeps running until it releases the CPU • Windows 3.X • All other scheduling is preemptive • Processes can be forced to release the CPU before they’re done

Dispatcher • Dispatcher module gives control of the CPU to the process selected by the short-term scheduler; this involves: • switching context • switching to user mode • jumping to the proper location in the user program to restart that program • Dispatch latency – time it takes for the dispatcher to stop one process and start another running

Scheduling Criteria • CPU utilization – keep the CPU as busy as possible • Throughput – # of processes that complete their execution per time unit • Turnaround time – amount of time to execute a particular process • Waiting time – amount of time a process has been waiting in the ready queue • Response time – amount of time it takes from when a request was submitted until the first response is produced, not output (for time-sharing environment)

Scheduling Criteria • Maximize CPU utilization • Maximize throughput • Minimize turnaround time • Minimize waiting time • Minimize response time

Scheduling Algorithms • First-Come, First Serve Scheduling • Shortest-Job-First Scheduling • Shortest-Remaining-Time-First Scheduling • Priority Scheduling • Round-Robin Scheduling • Multilevel Queue Scheduling • Multilevel Feedback Queue Scheduling

P1 P2 P3 0 24 27 30 First-Come, First-Served (FCFS) Scheduling ProcessBurst Time P1 24 P2 3 P3 3 • Suppose that the processes arrive in the order: P1 , P2 , P3 The Gantt Chart for the schedule is: • Waiting time for P1 = 0; P2 = 24; P3 = 27 • Average waiting time: (0 + 24 + 27)/3 = 17

P2 P3 P1 0 3 6 30 FCFS Scheduling (Cont) Suppose that the processes arrive in the order P2 , P3 , P1 • The Gantt chart for the schedule is: • Waiting time for P1 = 6;P2 = 0; P3 = 3 • Average waiting time: (6 + 0 + 3)/3 = 3 • Much better than previous case • Convoy effect: short process wait for one long process to finish

P2 P3 P1 0 3 6 30 P1 P2 P3 0 24 27 30 FCFS Scheduling (Cont) • Turnaround time for P1 = 24; P2 = 27; P3 = 30 (A = 27) • Response time for P1 = 0; P2 = 24; P3 = 27 (A = 17) • Waiting time for P1 = 0; P2 = 24; P3 = 27 (A = 17) • Turnaround time for P1 = 30; P2 = 3; P3 = 6 (A = 13) • Response time for P1 = 6; P2 = 0; P3 = 3 (A = 3) • Waiting time for P1 = 6; P2 = 0; P3 = 3 (A = 3)

Shortest-Job-First (SJF) Scheduling • Associate with each process the length of its next CPU burst. Use these lengths to schedule the process with the shortest time • SJF is optimal – gives minimum average waiting time for a given set of processes • The difficulty is knowing the length of the next CPU request

P3 P2 P4 P1 3 9 16 24 0 Example of SJF Process Burst Time P1 6 P2 8 P3 7 P4 3 • SJF scheduling chart • Average waiting time = (3 + 16 + 9 + 0) / 4 = 7 • FCFS doing 1-2-3-4 = 10.25

Shortest-Remaining-Time-First Scheduling • What if the processes aren’t all scheduled at the same time? • And what if a new process is shorter than the one currently running? • We need a preemptive version of SJF

Example of SRTF Process Arrival TimeBurst Time P1 0.0 6 P2 0.0 7 P3 1.0 4 P4 6.0 5 • Average waiting time = (4 + 15 + 0 + 4) / 4 = 5.75 • SJF = (0 + 15 + 5 + 4) / 4 = 6 • FCFS = (0 + 6 + 12 + 11) / 4 = 7.25 P1 P3 P1 P4 P2 1 22 0 5 10 15

Determining Length of Next CPU Burst • Can only estimate the length • Can be done by using the length of previous CPU bursts, using exponential averaging

Examples of Exponential Averaging •  =0 • n+1 = n • Recent history does not count •  =1 • n+1 = tn • Only the actual last CPU burst counts • If we expand the formula, we get: n+1 =  tn+(1 - ) n n+1 =  tn+(1 - ) ( tn-1+(1 - ) n-1) n+1 =  tn+(1 - ) tn-1+ (1 - )² n-1 n+1 =  tn+(1 - ) tn-1+ (1 - )² ( tn-2+(1 - ) n-2) n+1 =  tn+(1 - ) tn-1+ (1 - )²  tn-2+ (1 - )³ n-2 … n+1 =  tn+(1 - ) tn-1+ (1 - )²  tn-2+…+ (1 - )j  tn-j+…+ (1 - )n+1 0 • Since both  and (1 - ) are less than 1, each successive term has less weight than its predecessor

Prediction of the Length of the Next CPU Burst  =0.5

Priority Scheduling • A priority number (integer) is associated with each process • The CPU is allocated to the process with the highest priority (we’ll define smallest integer = highest priority) • Preemptive • Nonpreemptive • Priority can be many things • Interactive process vs. background process • In real-time systems: time to execution deadline • SJF: priority is the predicted next CPU burst time • Problem  Starvation– low priority processes may never execute • Solution  Aging– as time progresses increase the priority of the process

Priority Scheduling Process PriorityBurst Time P1 0 6 P2 4 7 P3 6 4 P4 1 5 • Assuming they all arrive at time 0 • Average waiting time = (0 + 11 + 18 + 6) / 4 = 8.75 • SJF = (9 + 13 + 0 + 4) / 4 = 6.5 • FCFS doing 1-2-3-4 = (0 + 6 + 13 + 17) / 4 = 9 P1 P4 P2 P3 22 0 6 11 18

Round Robin (RR) • Each process gets a small unit of CPU time (time quantum), usually 10-100 milliseconds. After this time has elapsed, the process is preempted and added to the end of the ready queue. • Implement ready queue as FIFO queue, each new process added to end of queue, next process to run is start of queue • If there are n processes in the ready queue and the time quantum is q, then each process gets 1/n of the CPU time in chunks of at most q time units at once. No process waits more than (n-1)q time units. • Performance • q large  FIFO • q small  q must be large with respect to context switch, otherwise overhead is too high

P1 P2 P3 P2 P3 P1 P3 P1 P1 P3 P1 16 0 0 12 16 20 24 28 30 30 4 4 8 Example of RR with Time Quantum = 4 ProcessBurst Time P1 14 P2 4 P3 12 • The Gantt chart is: Compare with SJF: • More generally: Worse waiting time and turnaround time, but better response time than SJF • Turnaround time (30+8+28) / 3 = 22 • Response time (0+4+8) / 3 = 4 • Waiting time (16+4+16) / 3 = 12 • Turnaround time (30+4+16) / 3 = 12.5 • Response time (16+0+4) / 3 = 6.6 • Waiting time (16+0+4) / 3 = 6.6

Time Quantum and Context Switch Time

Turnaround Time Varies With The Time Quantum

Multilevel Queue • Ready queue is partitioned into separate queues:foreground (interactive)background (batch) • Each queue has its own scheduling algorithm • foreground – RR • background – FCFS • Scheduling must be done between the queues • Fixed priority scheduling; (i.e., serve all from foreground then from background). Possibility of starvation. • Time slice – each queue gets a certain amount of CPU time which it can schedule amongst its processes; i.e., 80% to foreground in RR, 20% to background in FCFS

Multilevel Queue Scheduling

Multilevel Feedback Queue • A process can move between the various queues • Aging can be implemented this way • CPU-bound processes moved down, I/O-bound & interactive processes moved up • Old processes in lower queues moved up • Multilevel-feedback-queue scheduler defined by the following parameters: • number of queues • scheduling algorithms for each queue • method used to determine when to upgrade a process • method used to determine when to demote a process • method used to determine which queue a process will enter when that process needs service

Example of Multilevel Feedback Queue • Three queues: • Q0 – RR with time quantum 8 milliseconds • Q1 – RR time quantum 16 milliseconds • Q2 – FCFS • Scheduling • A new job enters queue Q0which is servedFCFS. When it gains CPU, job receives 8 milliseconds. If it does not finish in 8 milliseconds, job is moved to queue Q1. • At Q1 job is again served FCFS and receives 16 additional milliseconds. If it still does not complete, it is preempted and moved to queue Q2.

Thread Scheduling • Distinction between user-level and kernel-level threads • Many-to-one and many-to-many models, thread library schedules user-level threads to run on LWP • Known as process-contention scope (PCS) since scheduling competition is within the process • Usually priority scheduling; thread priority set by programmer • Kernel thread scheduled onto available CPU is system-contention scope (SCS) – competition among all threads in system

Pthread Scheduling • API allows specifying contention scope during thread creation • PTHREAD_SCOPE_PROCESS schedules threads using PCS scheduling • PTHREAD_SCOPE_SYSTEM schedules threads using SCS scheduling. • Only scope allowed on some OS, like Linux and Mac OS X

Pthread Scheduling API #include <pthread.h> #include <stdio.h> #define NUM THREADS 5 int main(int argc, char *argv[]) { int i; pthread t tid[NUM THREADS]; pthread attr t attr; pthread attr init(&attr); /*get default attributes*/ /* set scheduling algorithm PROCESS or SYSTEM */ pthread attr setscope(&attr, PTHREAD_SCOPE_SYSTEM); /* set the scheduling policy - FIFO, RT, or OTHER */ pthread attr setschedpolicy(&attr, SCHED_OTHER); for (i = 0; i < NUM THREADS; i++) pthread create(&tid[i],&attr,runner,NULL); /* create the threads */ for (i = 0; i < NUM THREADS; i++) pthread join(tid[i], NULL); /* join on each thread */ } /* Each thread will begin control in this function */ void *runner(void *param) { printf("I am a thread\n"); pthread exit(0); }

Multiple-Processor Scheduling • CPU scheduling more complex when multiple CPUs are available • We assume homogeneous processors, i.e. all processors of the system are identical • Asymmetric multiprocessing • Master processor in charge of scheduling, accessing system resources & data structures • Simpler, alleviates the need for data sharing • Symmetric multiprocessing • Each processor is self-scheduling • All processes in common ready queue, or each has its own private queue of ready processes • All processors have access to resources & data

NUMA and CPU Scheduling • Processor affinity – process has affinity for processor on which it is currently running • soft affinity: Preference for a processor, but no guarantees • hard affinity: Forced to stay with a processor

Load Balancing • Need to balance processes between CPUs • Otherwise some are overworked while others are idle • Only an issue if each CPU has its own private ready queue, not if they all share the same ready queue • Solution: process migration • Push Migration: Balancing process checks the CPU queues, moves (pushes) processes to balance • Pull Migration: Idle CPU takes (pulls) a process for a busy CPU’s queue • Linux and FreeBSD implement both migrations at once

Operating System Examples • Solaris scheduling • Windows XP scheduling • Linux scheduling

Solaris Dispatch Table

Windows XP Priorities Class Relative Priority

Linux Scheduling Real-time range Nice value

Selecting a Scheduling Algorithm • Select evaluation criteria • We’ve seen five – could be any one of them • Could be a balance of several of them • Max CPU utilisation with min response time • Max throughput while keeping turnaround time linearly proportional to execution time • Pick a model to evaluate the algorithms

Algorithm Evaluation • Deterministic modeling • Takes a fixed predetermined workload and evaluates the performance of each algorithm for that workload • Requires exact values, and solution only applies to those or are general trends • Queueing models • Knowing distribution of CPU & I/O bursts, and the arrival rate & service rate of CPU ready queue and I/O device queues, we can perform a mathematical queueing-network analysis • Limited to a few well-studied distributions and algorithms • Simulation • Build computer simulation, generate processes with given probability distribution, and test algorithms • Might be inaccurate, can be very slow and expensive • Implementation • Actually implement OSes with various scheduling systems and get real-world results • Very expensive, user-sensitive

Review • Given this schedule, compute the turnaround time, response time and waiting time of • First-Come-First-Serve • Non-preemptive shortest-job-first • Shortest-remaining-time-first • Round-Robin with time quantum of 4 Process Arrival Time Burst Time P1 0.0 40 P2 0.0 18 P3 5.0 6 P4 8.0 30 P5 20.0 15

Exercises • Read sections 6.1 to 6.5 and 6.7 to 6.8 • If you have the “with Java” textbook, skip the Java sections and subtract 1 to the following section numbers • 6.2 • 6.3 • 6.4 • 6.5 • 6.6 • 6.9 • 6.10 • 6.11 • 6.12 • 6.14 • 6.15 • 6.16 • 6.17 • 6.19 • 6.21 • 6.24 • 6.25 • 6.27

End of Chapter 6

Efficient CPU Scheduling Strategies for Operating Systems