Chapter 3, Processes
3.1 Process Concept • The process is the unit of work in a system. Both user work and system work are divided into individual jobs, or processes. • As already defined, a process is a program in execution, or a program that has been given a footprint in memory and can be scheduled to run.
Recall what multi-processing means • The importance of processes may not be immediately apparent because of terminology and also because of the progress of technology. • Keep in mind that multi-processing refers to multiple physical processors. • Also, most recent general purpose computer chips are in fact multi-core, which means that at the physical level, they are multi-processor systems on a single chip.
Why processes are important • The importance of processes stems from the fact that all modern, general purpose systems are multi-tasking. • For the purposes of clarity, in this course the main topic is multi-tasking on a single physical processor. • The point is this: • In a multi-tasking system, each individual task exists as a process.
Defining an O/S by means of processes • Chapters 1 and 2 were concerned with trying to define an operating system. • Given the fundamental nature of processes, another possible definition presents itself: • The operating system is that collection of processes which manages and coordinates all of the processes on a machine, both operating system processes and user processes.
From the point of view of making the system run, the fact that the operating system is able to manage itself is fundamental. • From the point of view of getting any useful work done, the fact that the operating system manages user processes is fundamental.
Why we are considering multi-tasking on one processor rather than multi-processing • One final note on the big picture before going on: • Managing processes requires managing memory and secondary storage, but it will become clear soon that getting work done means scheduling processes on the CPU. • As mentioned, we are restricting our attention to scheduling multiple processes, one after the other, on a single physical processor.
In multiple core systems, some of the problems of scheduling multiple jobs concurrently on more than one processor would be handled in microcode on the hardware. • However, the operating system for such a system would have to be “multiple-core” aware.
This is a way of saying that modern operating systems are complex because they are multi-processor operating systems. • The point is that you can’t begin to address the complexities of multi-processing until you’ve examined and come to an understanding of operating system functions in a uni-processing environment.
What is a Process? • A process is a running or runnable program. • It has the six aspects listed on the next overhead. • In other words, a process is in a sense defined by a certain set of data values, and by certain resources which have been allocated to it. • At various times in the life of a process, the values representing these characteristics may be stored for future reference, or the process may be in active possession of them, using them.
Text section = the program code • Program counter = instruction pointer = address or id of the current/next instruction • Register contents = current state of the machine • Process stack = method parameters, return addresses, local variables, etc. • Data section = global variables • Heap = dynamically allocated memory
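As a rough illustration only (operating system internals are not written in Java, as noted later in this chapter), the six aspects might be modeled as the fields of a class. All names here are invented for the example:

```java
// Conceptual sketch: the six aspects that define a process, as fields.
// Field names are illustrative; no real kernel looks like this.
public class ProcessImage {
    byte[] textSection;    // the program code
    long   programCounter; // address/id of the current/next instruction
    long[] registers;      // register contents = current machine state
    byte[] stack;          // method parameters, return addresses, locals
    byte[] dataSection;    // global variables
    byte[] heap;           // dynamically allocated memory
}
```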
The term state has two meanings • Machine state = current contents of CPU/hardware (registers…) for a given process. • Process state = what scheduling state the O/S has assigned to a process = ready to run, waiting, etc.
Process state refers to the scheduling status of the process • Systems may vary in the exact number and names of scheduling states. • As presented in this course, a straightforward operating system would have the five process (scheduling) states listed on the next overhead.
Process scheduling states • New • Running • Waiting • Ready • Terminated
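For this course's five-state model, the states can be captured in a simple Java enum (a sketch; as noted, real systems vary in the number and names of states):

```java
// The five process scheduling states used in this course.
public enum ProcessState {
    NEW, RUNNING, WAITING, READY, TERMINATED
}
```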
Process life cycle • A process begins in the new state and ends in the terminated state. • In order to get from one to the other it has to pass through other states. • It may pass through the other states more than one time, cycling through periods when it is scheduled to run and periods when it is not running.
In a classic system, there are six fundamental actions which trigger state transitions, which are listed on the following overheads. • The relationship between states and transitions is summarized in the state transition diagram which follows that list.
1. The operating system is responsible for bringing processes in initially. 2. It is also responsible for bringing jobs to an end, whether they completed successfully or not. 3. Interrupts can be viewed as temporarily ending the running of a given process.
4. Processes are scheduled to run by the operating system. 5. Processes relinquish the processor and wait when they issue a system request for I/O from secondary storage which only the O/S can satisfy. 6. The successful completion of an I/O request makes the requesting process eligible to run again.
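The six triggers can be paired with the transitions they cause. The following is a hypothetical sketch assuming the ProcessState enum above; a real O/S would also verify the current state before allowing a transition:

```java
// Sketch: each of the six triggering actions mapped to the state the
// process ends up in. Event names are invented for the example.
public class Transitions {
    static ProcessState next(String event) {
        switch (event) {
            case "admit":      return ProcessState.READY;      // 1. O/S brings the process in
            case "exit":       return ProcessState.TERMINATED; // 2. O/S brings the job to an end
            case "interrupt":  return ProcessState.READY;      // 3. interrupt temporarily ends its run
            case "dispatch":   return ProcessState.RUNNING;    // 4. O/S schedules it to run
            case "ioRequest":  return ProcessState.WAITING;    // 5. it waits on an I/O request
            case "ioComplete": return ProcessState.READY;      // 6. I/O done; eligible to run again
            default: throw new IllegalArgumentException(event);
        }
    }
}
```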
How does the operating system keep track of processes and states? • In a sense, what the operating system does is manage processes. • Inside the operating system software it is necessary to maintain representations of processes. • In other words, it’s necessary to have data structures which contain the following data: • The definition of the process—its aspects and resources • The process’s state—what state it is in, as managed by the operating system in its scheduling role
What is a process control block? • The Process Control Block (PCB) is the representation of a process in the O/S. • In other words, it is a data structure (like an object) containing fields (instance variables) which define the process and its state. • PCBs don't exist in isolation. • They may be stored in linked collections of PCBs, where the collection a PCB is in, and its location within that collection, implicitly define the process's state.
The PCB contains the following 7 pieces of information. • In effect, these 7 pieces consist of technical representations of the 6 items which define a process, plus process state. 1. Current process state = new, running, waiting, ready, terminated 2. Program counter value = current/next instruction 3. CPU general purpose register contents = machine state—saved and restored upon interrupt
4. CPU scheduling info = process priority and pointers to scheduling queues 5. Memory management info = values of base and limit registers 6. Accounting info = job id, user id, time limit, time used, etc. 7. I/O status info = I/O devices allocated to process, open files, etc.
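In Java-like terms, a PCB might look like the following sketch, assuming the ProcessState enum above. The field names and types are invented; a real PCB is a structure inside the kernel:

```java
// Hypothetical PCB sketch: the seven pieces of information, plus a link
// field so PCBs can be chained into the scheduling queues described later.
public class PCB {
    ProcessState state;                // 1. new, running, waiting, ready, terminated
    long programCounter;               // 2. current/next instruction
    long[] registers;                  // 3. general purpose register contents
    int priority;                      // 4. CPU scheduling info
    long baseRegister, limitRegister;  // 5. memory management info
    int jobId, userId;                 // 6. accounting info (time limit/used, etc.)
    java.util.List<Integer> openFiles; // 7. I/O status info (devices, open files)
    PCB next;                          // link to the next PCB in a queue
}
```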
This is a graphical representation of a PCB, indicating how it might be linked with others
Threads • In the latest edition of the book this subsection is very short • I continue to give the longer version in these overheads • This serves as an introduction to Chapter 4, Threads
You may already have encountered the term thread in the context of Java programming. • Threads come up in this operating systems course for two reasons: • Threads themselves are a modern operating system concept • This book is based on Java, which implements threads at the programming level
That means it’s possible to work directly with threads in Java. • You can learn the concept without actually working with operating system code
Processes and threads • What has been referred to up to this point as a process can also be called a heavyweight thread. • It is also possible to refer to lightweight threads. • Lightweight threads are what is meant when using the term thread in Java. • Not all systems necessarily support lightweight threads, but the ubiquity of Java means that threads are “everywhere” now
What is a lightweight thread? • The term (lightweight) thread means that more than one execution path can be started through the code of a process (heavyweight thread). • Each lightweight thread will have its own data, but it will share the same code with other lightweight threads
The origin of the terminology and its meaning are illustrated in this picture
There are two vertical threads (the warp in weaving) and six horizontal threads (the woof or weft in weaving) • The horizontal threads represent lines of code in a program • The vertical threads represent two independent paths of execution through the code • The paths of execution have come to be known as threads in computer science
A concrete example: A word processor might have separate threads for character entry, spell checking, etc. • When the user opens a document, a thread becomes active for character entry. • When the user selects the spell checking option in the menu, a separate thread of execution (in a different part of) the same program is started.
These two threads can run concurrently. • They don’t run simultaneously, but the user enters characters so slowly that it is possible to run spell checking “at the same time”.
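The word processor scenario can be approximated with Java's built-in threads. This is a toy sketch; the thread names and the printed messages are invented stand-ins for real character entry and spell checking work:

```java
// Two lightweight threads sharing one program, running concurrently.
public class WordProcessor {
    public static void main(String[] args) {
        Thread entry = new Thread(() -> {
            for (int i = 0; i < 3; i++) System.out.println("accepting keystrokes");
        }, "character-entry");
        Thread spell = new Thread(() -> {
            for (int i = 0; i < 3; i++) System.out.println("checking spelling");
        }, "spell-check");
        entry.start();
        spell.start(); // both execution paths are now live at the same time
    }
}
```

The interleaving of the two outputs is up to the scheduler, which is exactly the point: the two paths of execution proceed independently through the same program.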
The relationship between process scheduling and thread scheduling • In effect, threads are like processes in microcosm. • This accounts for the lightweight/heavyweight thread terminology. • They differ in that processes run different program code, while threads share the same program code.
The operating system schedules processes so that they run concurrently. • They do not run simultaneously. • Each process runs for a short span of time. • It then waits while another process runs for a short span of time. • From the user’s (human speed) point of view, multiple processes are running “at the same time”.
An operating system supports threads in a similar way. • The implementation of the JVM on a given system depends on that system’s implementation of threads. • Within each process, threads are run concurrently, just as the processes themselves are run concurrently.
This book is oriented towards Java, not C, and you can’t write operating system internals in Java. • However, you can write threaded code with a familiar programming language API, rather than having to learn an operating system API. • All of the challenges of correct scheduling exist for Java programs, and the tools for achieving this are built into Java.
You can learn some of the deeper aspects of actual Java programming at the same time that you learn the concepts which they are based on, which come from operating system theory. • This is covered in detail in the following chapter.
3.2 Process Scheduling • Multi-programming (= concurrent batch jobs) objective = maximum CPU utilization—have a process running at all times • Multi-tasking (= interactive time sharing) objective = switch between jobs quickly enough to support multiple users in real time • Process scheduler = the part of the O/S that picks the next job to run
One aspect of scheduling is system driven, not policy driven: Interrupts force a change in what job is running • Aside from handling interrupts as they occur, it is O/S policy, the scheduling algorithm, that determines what job is scheduled • The O/S maintains data structures, including PCB’s, which define current scheduling state • There are privileged machine instructions which the O/S can call in order to switch the context (move one job out and another one in)
Scheduling queues = typically some type of linked list data structure • Job queue = all processes in the system—some may still be in secondary storage—may not have been given a memory footprint yet • Ready queue = processes in main memory that are ready and waiting to execute (not waiting for I/O, etc.) • I/O device (wait) queues = processes either in possession of or waiting for I/O device service
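As a sketch, the three kinds of queues might be modeled as collections of the PCB objects from the earlier sketch. Standard library queues stand in for the hand-built linked lists an actual O/S would use:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Sketch: which queue a PCB sits in implicitly defines its process state.
public class SchedulingQueues {
    Queue<PCB> jobQueue   = new ArrayDeque<>(); // all processes in the system
    Queue<PCB> readyQueue = new ArrayDeque<>(); // in memory, ready to execute
    Queue<PCB> diskQueue  = new ArrayDeque<>(); // waiting for disk I/O service
}
```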
Diagram key • Rectangles represent queues • Circles represent resources • Ovals represent events external to the process • Events internal to the process which trigger a transition are simply indicated by the queue that the process ends up in • Upon termination the O/S removes a process’s PCB from all queues and deallocates all resources held
Schedulers • The term scheduler refers to a part of the O/S software • In a monolithic system it may be implemented as a module or routine. • In a non-monolithic system, a scheduler may run as a separate process.
Long term scheduler—this is the scheduler you usually think of second, not first, although it acts first • Picks jobs from secondary storage to enter CPU ready queue • Controls degree of multiprogramming (total # of jobs in system) • Responsible for stability—number of jobs entering should = number of jobs finishing • Responsible for job mix, CPU bound vs. I/O bound • Runs infrequently; can take some time to choose well
Short term scheduler, a.k.a. the CPU scheduler, the scheduler you usually think of first • This module implements the algorithm for picking processes from the ready queue to give the CPU to • This is the heart of interactive multi-tasking • This runs relatively frequently • It has to be fast so you don’t waste CPU time on switching overhead
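The core of the short term scheduler's job can be sketched as follows, assuming the PCB and queue sketches above. Which process sits at the head of the ready queue is determined by the scheduling algorithm, a topic covered later:

```java
import java.util.Queue;

// Hypothetical dispatch step: take the process at the head of the ready
// queue and give it the CPU.
public class ShortTermScheduler {
    PCB dispatch(Queue<PCB> readyQueue) {
        PCB next = readyQueue.poll();          // pick the next job, if any
        if (next != null) {
            next.state = ProcessState.RUNNING; // context switch it onto the CPU
        }
        return next;
    }
}
```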
Medium term scheduler—the one you usually think of last • Allows jobs to be swapped out to secondary storage if the multi-programming level is too high • Not all systems have to have long or medium term schedulers • Simple versions of Unix had just a short term scheduler. • The multi-programming level was determined by the number of attached terminals
The relationship between the short, medium, and long term schedulers