Explore scheduling algorithms for real-time systems with deadlines, multilevel queues, and feedback queues. Learn about thread scheduling, kernel-level threads, multiprocessor scheduling, NUMA, and Solaris scheduling.
CGS 3763 Operating Systems Concepts Spring 2013 Dan C. Marinescu Office: HEC 304 Office hours: M-Wd 11:30 - 12:30 AM
Lecture 25 – Friday, March 15, 2013
• Last time: CPU scheduling
• Today: CPU scheduling; process synchronization
• Next time: process synchronization
• Reading assignment: Chapter 6 of the textbook
Scheduling algorithms for real-time systems
• Activities in a real-time (RT) system are subject to deadlines.
• If a thread T does not finish execution by its deadline, the system fails.
• Earliest deadline first (EDF) is a scheduling algorithm for RT systems: the thread/process with the earliest deadline is scheduled first.
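The EDF rule can be sketched as a small single-CPU simulation. This is a minimal illustration, not a production scheduler: the task tuple layout (name, release time, execution time, deadline), the unit-tick clock, and the function name edf_schedule are all assumptions made for the example.

```python
import heapq

def edf_schedule(tasks):
    """Preemptive EDF on one CPU, simulated in unit time ticks.

    tasks: list of (name, release, exec_time, deadline) tuples (hypothetical layout).
    Returns (timeline, missed): the task run in each tick (None = idle)
    and the names of tasks that finished after their deadline.
    """
    pending = sorted(tasks, key=lambda t: t[1])  # order by release time
    ready = []                                   # min-heap keyed by deadline
    timeline, missed = [], []
    time, i = 0, 0
    while i < len(pending) or ready:
        # Admit every task that has been released by now.
        while i < len(pending) and pending[i][1] <= time:
            name, _, exec_time, deadline = pending[i]
            heapq.heappush(ready, (deadline, name, exec_time))
            i += 1
        if not ready:
            timeline.append(None)                # nothing to run: idle tick
            time += 1
            continue
        # Run the task with the earliest deadline for one tick.
        deadline, name, remaining = heapq.heappop(ready)
        timeline.append(name)
        time += 1
        if remaining > 1:
            heapq.heappush(ready, (deadline, name, remaining - 1))
        elif time > deadline:
            missed.append(name)                  # completed too late
    return timeline, missed
```

For example, with tasks A (release 0, 3 ticks, deadline 7), B (release 1, 2 ticks, deadline 4), and C (release 2, 1 tick, deadline 9), B preempts A as soon as it is released because its deadline is earlier.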
Multilevel queues
• The ready queue is partitioned into separate queues:
  1. foreground (interactive)
  2. background (batch)
• Each queue has its own scheduling algorithm:
  • foreground – RR
  • background – FCFS
• The CPU must be shared among the queues:
  • Fixed-priority scheduling: serve all processes from the foreground queue, then from the background queue. Starvation is possible.
  • Time slice: each queue gets a certain amount of CPU time, which it can schedule among its processes; e.g.,
    • 80% to foreground; RR scheduling algorithm
    • 20% to background; FCFS scheduling algorithm
Figure: Individual queues for different types of processes
Multilevel feedback queue
• Multiple queues are defined, each one with its own scheduling strategy and time quantum.
• As a process ages, it moves among the queues.
• Parameters of the multilevel-feedback-queue scheduler:
  • number of queues
  • scheduling algorithm for each queue
  • method used to determine when to upgrade a process
  • method used to determine when to demote a process
  • method used to determine which queue a process will enter when that process needs service
Three queues:
• Q0 – RR with time quantum 8 milliseconds
• Q1 – RR with time quantum 16 milliseconds
• Q2 – FCFS
• Scheduling
  • A new job enters queue Q0, which is served FCFS. When it gains the CPU, the job receives 8 milliseconds. If it does not finish in 8 milliseconds, the job is moved to queue Q1.
  • At Q1 the job is again served FCFS and receives 16 additional milliseconds. If it still does not complete, it is preempted and moved to queue Q2.
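The three-queue example can be traced with a short simulation. This is a sketch under simplifying assumptions that are not in the slides: all jobs arrive at time 0, a lower queue runs only when every higher queue is empty, and the names QUANTA and mlfq are made up for the illustration.

```python
from collections import deque

# Time quantum per queue in milliseconds; None means FCFS (run to completion).
QUANTA = [8, 16, None]

def mlfq(jobs):
    """Simulate the three-queue feedback scheduler.

    jobs: list of (name, burst_ms); all jobs are assumed to arrive at time 0.
    Returns CPU slices as (name, queue_level, start_ms, end_ms).
    """
    queues = [deque(), deque(), deque()]
    for name, burst in jobs:
        queues[0].append((name, burst))          # every new job enters Q0
    clock, slices = 0, []
    while any(queues):
        # Serve the highest-priority non-empty queue, FCFS within the queue.
        level = next(i for i, q in enumerate(queues) if q)
        name, remaining = queues[level].popleft()
        quantum = QUANTA[level]
        run = remaining if quantum is None else min(quantum, remaining)
        slices.append((name, level, clock, clock + run))
        clock += run
        if remaining > run:
            # Quantum used up without finishing: demote to the next queue.
            queues[level + 1].append((name, remaining - run))
    return slices
```

With bursts of 5, 20, and 40 ms, the first job finishes in Q0, the second finishes in Q1, and the third uses its full 8 + 16 ms before completing in Q2.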
Thread scheduling
• Distinction between user-level and kernel-level threads.
• In the many-to-one and many-to-many models, the thread library schedules user-level threads to run using the LWP (lightweight process) model.
• User-level threads compete with one another for the time allocated to the process: process-contention scope (PCS).
• Kernel-level threads compete with all the threads in the system: system-contention scope (SCS).
Scheduling for multiprocessor/multicore
• Scheduling is more complex when multiple CPUs/cores are available.
• Homogeneous processors within a multiprocessor.
• NUMA – Non-Uniform Memory Access.
• Multiprocessing:
  • Symmetric multiprocessing (SMP) – each processor is self-scheduling; all processes are in a common ready queue, or each processor has its own private queue of ready processes.
  • Asymmetric multiprocessing – only one processor accesses the system data structures, alleviating the need for data sharing.
• Processor affinity – a process has affinity for the processor on which it is currently running:
  • soft affinity
  • hard affinity
Figure: NUMA and CPU scheduling
Multicore processors
• Faster and consume less power.
• Multiple threads per core: the core takes advantage of a memory stall to make progress on another thread while the memory retrieval happens.
Solaris scheduling
• The Solaris 10 kernel threads model consists of the following objects:
  • kernel threads: what is scheduled/executed on a processor.
  • user threads: the user-level thread state within a process.
  • process: the object that tracks the execution environment of a program.
  • lightweight process (LWP): execution context for a user thread; associates a user thread with a kernel thread.
• The Fair Share Scheduler (FSS) allows more flexible process priority management.
  • Each project is allocated a certain number of CPU shares via the project.cpu-shares resource control.
  • Each project is allocated CPU time based on its cpu-shares value divided by the sum of the cpu-shares values for all active projects.
  • Anything with a zero cpu-shares value will not be granted CPU time until all projects with non-zero cpu-shares are done with the CPU.
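The share arithmetic is simple enough to check by hand. The sketch below only illustrates the ratio described above; the function name fss_cpu_fractions and the project names are hypothetical, and it does not call any Solaris interface.

```python
def fss_cpu_fractions(shares):
    """Map each active project to its fraction of CPU time.

    shares: dict of project name -> cpu-shares value (hypothetical input).
    A project's fraction is its shares divided by the sum over all
    active projects; a zero-share project gets nothing while any
    project with non-zero shares is active.
    """
    total = sum(shares.values())
    if total == 0:
        return {project: 0.0 for project in shares}
    return {project: value / total for project, value in shares.items()}
```

For instance, projects with 40, 40, and 20 shares receive 40%, 40%, and 20% of the CPU, and a fourth project with zero shares receives none while the others are active.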
Solaris scheduling classes
• TS (timeshare): default class for processes and their associated kernel threads. Priority range 0-59; priorities are adjusted dynamically (they vary during the lifetime of a process) to allocate processor resources evenly.
• IA (interactive): enhanced version of the TS class that applies to the in-focus window in the GUI. Its intent is to give extra resources to processes associated with that specific window. Like TS, IA's range is 0-59.
• FSS (fair-share scheduler): share-based rather than priority-based. Threads are scheduled based on their associated shares and the processor's utilization. FSS also has a range of 0-59.
• FX (fixed-priority): the priorities of threads in this class are fixed; they do not vary dynamically over the lifetime of the thread. Range 0-59.
• SYS (system): used to schedule kernel threads. Threads in this class are "bound" threads, which means that they run until they block or complete. Priorities for SYS threads are in the 60-99 range.
• RT (real-time): threads in the RT class are fixed-priority, with a fixed time quantum. Their priorities range from 100 to 159, so an RT thread will preempt a system thread.
Linux scheduling
• Two priority ranges:
  • time-sharing
  • real-time
• nice is used on Unix and Unix-like operating systems such as Linux.
• It invokes a utility or shell script with a particular priority, giving the process more or less CPU time than other processes.
• −20 is the highest priority and 19 is the lowest priority.
• The default niceness of a process is inherited from its parent process and is usually 0.
• Real-time priorities range from 0 to 99; nice values map to priorities 100 to 139.
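The relation between the two ranges can be written as a one-line mapping. This sketch assumes the conventional nice range of −20 to 19 and the usual kernel convention that nice 0 corresponds to priority 120; the function name nice_to_priority is made up for the illustration.

```python
def nice_to_priority(nice):
    """Map a nice value to a time-sharing priority.

    Assumes nice values in -20..19 and that nice 0 maps to priority 120,
    so the time-sharing range sits directly above the real-time range 0..99.
    A lower number means a higher priority.
    """
    if not -20 <= nice <= 19:
        raise ValueError("nice value must be between -20 and 19")
    return 120 + nice
```

Under these assumptions the most favorable nice value, −20, lands at priority 100, immediately after the real-time range.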
Evaluation of scheduling algorithms
• Deterministic modeling: defines the performance of each scheduling algorithm for a particular type of workload.
• Simulation: uses a set of trace data obtained from past execution of a certain type of workload.
• Benchmarks: sets of programs used to test hardware performance, scheduling algorithms, and other types of software.
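Deterministic modeling amounts to fixing a workload and computing a metric for each algorithm. The sketch below compares average waiting time under FCFS and non-preemptive SJF; the burst values, the assumption that all processes arrive at time 0, and the function names are illustrative choices, not part of the slides.

```python
def average_wait(bursts):
    """Average waiting time when bursts run back to back in the given order.

    All processes are assumed to arrive at time 0.
    """
    total_wait, elapsed = 0, 0
    for burst in bursts:
        total_wait += elapsed    # this process waited for everything before it
        elapsed += burst
    return total_wait / len(bursts)

def fcfs_wait(bursts):
    """First-come, first-served: run in arrival order."""
    return average_wait(bursts)

def sjf_wait(bursts):
    """Non-preemptive shortest-job-first: run the shortest burst first."""
    return average_wait(sorted(bursts))
```

For the workload 10, 29, 3, 7, 12 ms, FCFS gives an average wait of 28 ms and SJF gives 13 ms, which is the kind of exact, workload-specific answer deterministic modeling produces.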
Chapter 6 - Process synchronization
• The need for concurrency control:
  • Concurrent access to shared data may result in data inconsistency.
  • Maintaining data consistency requires mechanisms to ensure the orderly execution of cooperating processes.
• The producer-consumer problem.
Critical concepts for thread coordination
• Critical section: code that accesses a shared resource.
• Race condition: two or more threads access shared data and the result depends on the order in which the threads access the shared data.
• Mutual exclusion: only one thread should execute a critical section at any one time.
• Lock: shared variable which acts as a flag to coordinate access to shared data.
• Spin lock: a thread keeps checking a control variable/semaphore “until the light turns green”.
• Side effects of thread coordination:
  • Deadlock
  • Priority inversion: a lower-priority activity is allowed to run before one with a higher priority.
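A lock guarding a critical section can be illustrated with a shared counter. This is a minimal sketch using Python's threading.Lock; the function name run_counter and the thread and iteration counts are arbitrary choices for the example.

```python
import threading

def run_counter(num_threads=4, increments=10_000):
    """Increment a shared counter from several threads under a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            with lock:           # mutual exclusion: one thread at a time
                counter += 1     # critical section: read-modify-write on shared data

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

Because every update happens inside the lock, the final value equals num_threads × increments regardless of how the threads interleave.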
Thread coordination with a bounded buffer
• Producer-consumer problem: two threads cooperate; the producer writes into a buffer and the consumer reads from the buffer.
• Basic assumptions:
  • We have only two threads.
  • Threads proceed concurrently at independent speeds/rates.
  • Bounded buffer: only N buffer cells.
  • Messages are of fixed size and occupy only one buffer cell.
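One way to sketch this setup is a buffer of N cells indexed by two ever-increasing counters, one updated only by the producer and one only by the consumer. The class and attribute names (BoundedBuffer, in_count, out_count) and the helper transfer are assumptions made for this illustration, and the busy-waiting relies on CPython executing the two threads with sequentially consistent memory.

```python
import threading
import time

class BoundedBuffer:
    """N-cell buffer for exactly one producer and one consumer."""

    def __init__(self, n):
        self.n = n
        self.cells = [None] * n
        self.in_count = 0    # messages written so far; updated only by the producer
        self.out_count = 0   # messages read so far; updated only by the consumer

    def send(self, message):
        while self.in_count - self.out_count == self.n:
            time.sleep(0)    # buffer full: spin, yielding the CPU
        self.cells[self.in_count % self.n] = message
        self.in_count += 1

    def receive(self):
        while self.in_count == self.out_count:
            time.sleep(0)    # buffer empty: spin, yielding the CPU
        message = self.cells[self.out_count % self.n]
        self.out_count += 1
        return message

def transfer(messages, n=4):
    """Push messages through an n-cell buffer and return them as received."""
    messages = list(messages)
    buffer, received = BoundedBuffer(n), []

    def producer():
        for m in messages:
            buffer.send(m)

    def consumer():
        for _ in messages:
            received.append(buffer.receive())

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Since each counter has a single writer, no lock is needed in this two-thread case; adding a second producer or consumer would break that property.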
Locks; Before-or-After actions
• Locks: shared variables which act as flags to coordinate access to shared data. Manipulated with two primitives:
  • ACQUIRE
  • RELEASE
• Support the implementation of Before-or-After actions; only one thread can acquire the lock, the others have to wait.
• All threads must obey the convention regarding the locks.
• The two operations ACQUIRE and RELEASE must be atomic.
• Hardware support for the implementation of locks:
  • RSM – Read and Set Memory
  • CMP – Compare and Swap
• RSM (mem)
  • If mem=LOCKED then RSM returns r=LOCKED and sets mem=LOCKED
  • If mem=UNLOCKED then RSM returns r=UNLOCKED and sets mem=LOCKED
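ACQUIRE and RELEASE can be built on RSM, which returns the previous content of the memory word and leaves it set to LOCKED. The sketch below simulates the hardware instruction in software: the Memory class and the internal threading.Lock that stands in for hardware atomicity are assumptions of this illustration, not a real instruction interface.

```python
import threading
import time

LOCKED, UNLOCKED = 1, 0

class Memory:
    """A single memory word with a simulated atomic RSM instruction."""

    def __init__(self):
        self.value = UNLOCKED
        self._atomic = threading.Lock()   # stands in for hardware atomicity

    def rsm(self):
        """Read and Set Memory: return the old value and set the word to LOCKED."""
        with self._atomic:
            old = self.value
            self.value = LOCKED
            return old

def acquire(mem):
    """Spin until RSM reports that the word was UNLOCKED before we set it."""
    while mem.rsm() == LOCKED:
        time.sleep(0)                     # someone else holds the lock: retry

def release(mem):
    """Give the lock back by storing UNLOCKED."""
    mem.value = UNLOCKED
```

Only the thread whose RSM call observes UNLOCKED proceeds; every other caller sees LOCKED and keeps spinning until the holder calls release.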
Deadlocks
• Happen quite often in real life, and the proposed solutions are not always logical: “When two trains approach each other at a crossing, both shall come to a full stop and neither shall start up again until the other has gone.” A pearl from Kansas legislation.
• Deadlocked jury.
• Deadlocked legislative body.
Examples of deadlock
• Traffic only in one direction.
• Solution: one car backs up (preempt resources and roll back). Several cars may have to back up.
• Starvation is possible.
Thread deadlock
• Deadlocks prevent sets of concurrent threads/processes from completing their tasks.
• How does a deadlock occur? A set of blocked threads, each holding a resource and waiting to acquire a resource held by another thread in the set.
• Example: locks A and B, initialized to 1
  P0: wait(A); wait(B)
  P1: wait(B); wait(A)
• Aim: prevent or avoid deadlocks.
System model
• Resource types R1, R2, . . ., Rm (CPU cycles, memory space, I/O devices).
• Each resource type Ri has Wi instances.
• Resource access model:
  • request
  • use
  • release
Simultaneous conditions for deadlock
• Mutual exclusion: only one process at a time can use a resource.
• Hold and wait: a process holding at least one resource is waiting to acquire additional resources held by other processes.
• No preemption: a resource can be released only voluntarily by the process holding it (presumably after that process has finished).
• Circular wait: there exists a set {P0, P1, …, Pn} of waiting processes such that P0 is waiting for a resource that is held by P1, P1 is waiting for a resource that is held by P2, …, Pn–1 is waiting for a resource that is held by Pn, and Pn is waiting for a resource that is held by P0.
Figure: Wait-for graphs
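In a wait-for graph there is an edge from Pi to Pj when Pi is waiting for a resource held by Pj, so a circular wait shows up as a cycle. The sketch below detects such a cycle with a depth-first search; the dictionary representation and the function name has_circular_wait are assumptions made for this illustration.

```python
def has_circular_wait(wait_for):
    """Return True if the wait-for graph contains a cycle.

    wait_for: dict mapping a process name to the processes it is waiting on
    (hypothetical representation).
    """
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {}

    def visit(process):
        state[process] = IN_PROGRESS
        for other in wait_for.get(process, ()):
            other_state = state.get(other, UNVISITED)
            if other_state == IN_PROGRESS:
                return True               # back edge: we returned to the current path
            if other_state == UNVISITED and visit(other):
                return True
        state[process] = DONE
        return False

    return any(state.get(p, UNVISITED) == UNVISITED and visit(p) for p in wait_for)
```

For example, P0 waiting on P1 while P1 waits on P0 is reported as a circular wait, whereas a simple chain P0 → P1 → P2 is not.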
Semaphores
• Abstract data structure introduced by Dijkstra to reduce the complexity of thread coordination. A semaphore s has two components:
  • C: a count giving the status of the contention for the resource guarded by s
  • L: a list of threads waiting for the semaphore s
• Counting semaphore: for an arbitrary resource count. Supports two operations:
  • V, signal(): increments the semaphore count C
  • P, wait(): decrements the semaphore count C
• Binary semaphore: C is either 0 or 1.
The wait and signal operations
P(s) (wait) {
    if s.C > 0 then s.C−−;
    else join s.L;
}
V(s) (signal) {
    if s.L is empty then s.C++;
    else release a process from s.L;
}
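The two operations translate almost line by line into a small class. This is a teaching sketch, not a replacement for a library semaphore: the internal threading.Lock that makes P and V atomic and the per-waiter threading.Event used to block and release a thread are implementation choices assumed for the example.

```python
import threading
from collections import deque

class Semaphore:
    """Semaphore with a count C and a list L of waiting threads."""

    def __init__(self, count):
        self.C = count
        self.L = deque()
        self._atomic = threading.Lock()   # makes each P and V indivisible

    def P(self):
        """wait: take a unit if one is available, otherwise join the waiting list."""
        with self._atomic:
            if self.C > 0:
                self.C -= 1
                return
            gate = threading.Event()
            self.L.append(gate)
        gate.wait()                       # block outside the atomic section

    def V(self):
        """signal: release one waiter if any, otherwise increment the count."""
        with self._atomic:
            if not self.L:
                self.C += 1
            else:
                self.L.popleft().set()    # hand the unit directly to a waiter
```

Note that V does not increment C when a thread is waiting: the unit is passed straight to the released thread, exactly as in the pseudocode.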
Monitors
• Semaphores can be used incorrectly:
  • multiple threads may be allowed to enter the critical section guarded by the semaphore
  • may cause deadlocks
• Threads may access the shared data directly without checking the semaphore.
• Solution: encapsulate shared data together with the access methods that operate on them.
• Monitor: an abstract data type that allows access to shared data only through specific methods that guarantee mutual exclusion.
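Python has no monitor construct, but the idea can be approximated by a class whose shared data is touched only inside methods that hold one internal lock. The sketch below uses threading.Condition for this; the AccountMonitor class and its methods are hypothetical names chosen for the illustration.

```python
import threading

class AccountMonitor:
    """Monitor-style object: the balance is reachable only through its methods."""

    def __init__(self, balance=0):
        self._balance = balance
        self._cond = threading.Condition()   # one lock shared by all methods

    def deposit(self, amount):
        with self._cond:                     # enter the monitor
            self._balance += amount
            self._cond.notify_all()          # wake threads waiting for funds

    def withdraw(self, amount):
        with self._cond:
            while self._balance < amount:    # wait inside the monitor for funds
                self._cond.wait()
            self._balance -= amount

    def balance(self):
        with self._cond:
            return self._balance
```

Callers never see the lock, so they cannot forget to take it; the convention that must be followed by hand with semaphores is enforced by the object's interface.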
Asynchronous events and signals
• Signals, or software interrupts, were originally introduced in Unix to notify a process about the occurrence of a particular event in the system.
• Signals are analogous to hardware I/O interrupts:
  • When a signal arrives, control abruptly switches to the signal handler.
  • When the handler finishes and returns, control goes back to where it came from.
• After receiving a signal, the receiver reacts to it in a well-defined manner. That is, a process can tell the system (OS) what it wants to do when a signal arrives:
  • Ignore it.
  • Catch it and deliver it. In this case, the process must specify (register) the signal-handling procedure. This procedure resides in user space. The kernel makes a call to this procedure during signal handling, and control returns to the kernel after it is done.
  • Kill the process (default for most signals).
• Examples: event: child exit, signal: to the parent. Control signal from the keyboard.
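Registering a handler and catching a signal can be sketched with Python's signal module. The example assumes a Unix-like system (SIGUSR1 does not exist on Windows) and that it runs in the main thread, which is where Python allows handlers to be installed; the function name catch_sigusr1 is made up for the illustration.

```python
import os
import signal
import time

def catch_sigusr1(timeout=1.0):
    """Register a handler for SIGUSR1, send the signal to ourselves,
    and return the list of signal numbers the handler received."""
    received = []

    def handler(signum, frame):
        received.append(signum)           # runs when the signal is delivered

    previous = signal.signal(signal.SIGUSR1, handler)   # register the handler
    try:
        os.kill(os.getpid(), signal.SIGUSR1)            # send the signal
        deadline = time.monotonic() + timeout
        while not received and time.monotonic() < deadline:
            time.sleep(0.01)              # give the handler a chance to run
    finally:
        # Restore whatever disposition was in place before the example.
        signal.signal(signal.SIGUSR1,
                      signal.SIG_DFL if previous is None else previous)
    return received
```

Without the registered handler, the default action for SIGUSR1 would terminate the process, which matches the "kill the process" default described above.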
Implicit assumptions for the correctness of the implementation
• One sending and one receiving thread. Only one thread updates each shared variable.
• Sender and receiver threads run on different processors to allow spin locks.
• in and out are implemented as integers large enough so that they do not overflow (e.g., 64-bit integers).
• The shared memory used for the buffer provides read/write coherence.
• The memory provides before-or-after atomicity for the shared variables in and out.
• The result of executing a statement becomes visible to all threads in program order. No compiler optimization is supported.
Solutions to thread coordination problems must satisfy a set of conditions
• Safety: the required condition will never be violated.
• Liveness: the system should eventually progress irrespective of contention.
• Freedom from starvation: no process should be denied progress forever; that is, every process should make progress in a finite time.
• Bounded wait: every process is assured of not more than a fixed number of overtakes by other processes in the system before it makes progress.
• Fairness: dependent on the scheduling algorithm.
  • FIFO: no process will ever overtake another process.
  • LRU: the process which received the service least recently gets the service next.
• For example, for the mutual exclusion problem the solution should guarantee that:
  • Safety: the mutual exclusion property is never violated.
  • Liveness: a thread will access the shared resource in a finite time.
  • Freedom from starvation: a thread will access the shared resource in a finite time.
  • Bounded wait: a thread will access the shared resource after at most a fixed number of accesses by other threads.
Thread coordination problems
• Dining philosophers
• Critical section
A solution to the critical section problem
• Applies only to two threads Ti and Tj, with i, j ∈ {0, 1}, which share:
  • integer turn: if turn = i then it is the turn of Ti to enter the critical section
  • boolean flag[2]: if flag[i] = TRUE then Ti is ready to enter the critical section
• To enter the critical section, thread Ti
  • sets flag[i] = TRUE
  • sets turn = j
• If both threads want to enter, then turn will end up with a value of either i or j and the corresponding thread will enter the critical section.
• Ti enters the critical section only if either flag[j] = FALSE or turn = i.
• The solution is correct:
  • Mutual exclusion is guaranteed
  • Liveness is ensured
  • The bounded-waiting requirement is met
• But this solution may not work, as load and store instructions can be interrupted on modern computer architectures.
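The protocol described here is commonly known as Peterson's solution, and it can be tried out with two Python threads. The sketch relies on CPython running the threads with sequentially consistent reads and writes, which is exactly the guarantee that may be missing on real hardware; the function name peterson_demo and the deliberately slow counter update are assumptions of the example.

```python
import threading
import time

def peterson_demo(iterations=200):
    """Two threads increment a shared counter, protected only by flag[] and turn."""
    flag = [False, False]
    turn = 0
    counter = 0

    def worker(i):
        nonlocal turn, counter
        j = 1 - i
        for _ in range(iterations):
            flag[i] = True                    # Ti is ready to enter
            turn = j                          # give priority to the other thread
            while flag[j] and turn == j:
                time.sleep(0)                 # busy-wait, yielding the CPU
            # Critical section: a deliberately non-atomic read-modify-write.
            snapshot = counter
            time.sleep(0)
            counter = snapshot + 1
            flag[i] = False                   # leave the critical section

    threads = [threading.Thread(target=worker, args=(i,)) for i in (0, 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

If mutual exclusion holds, no update is lost and the result is 2 × iterations; removing the entry protocol makes lost updates likely because of the yield inside the critical section.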
Signals state and implementation
• A signal has the following states:
  • Signal sent: a process can send a signal to one of the processes in its group (parent, siblings, children, and further descendants).
  • Signal delivered: the signal bit is set.
  • Pending signal: delivered but not yet received (action has not been taken).
  • Signal lost: either ignored or overwritten.
• Implementation: each process has a kernel space (created by default) called the signal descriptor, with a bit for each signal. Setting a bit delivers the signal, and resetting the bit indicates that the signal has been received. A signal can be blocked/ignored; this requires an additional bit for each signal. Most signals are system-controlled signals.