Practical System Software (実践システムソフトウェア), part 2
Yutaka Ishikawa (石川 裕)
Department of Computer Science, Graduate School of Information Science and Technology
Schedule
• Oct. 6: Introduction
• Oct. 13: Linux and Windows Basic Architecture
• Oct. 20: Lab Experiment 1
  • Setting up and introducing the WRK environment
• Oct. 27: Scheduler
• Nov. 10: Lab Experiment 2
  • Adding a new system call for counting kernel events (context switches, page faults, process creation/deletion, etc.)
  • Assignment 1
• Nov. 17: No class
• Nov. 24: Synchronization
• Dec. 1: Lab Experiment 3
  • User space synchronization and priority inversion
  • Assignment 2
• Dec. 8: Virtual Memory
• Dec. 15: Lab Experiment 4
  • Experiment on working set
  • Assignment 3
• Dec. 22: I/O & File System
• Jan. 12: Lab Experiment 5
  • Final Assignment
• Jan. 19: Lab Experiment 6
• Jan. 26: Lab Experiment 7
Grading: based on attendance and the evaluation of four reports
Outline
• Windows NT OS architecture
• GNU/Linux OS architecture
• Comparing the Architectures
• Windows NT kernel components
• Linux kernel components
• Processes and threads in Windows
• Processes and threads in Linux
• Comparing process representations
Note: The material includes part of the Windows Operating System Internals Curriculum Resource Kit, developed by David A. Solomon and Mark E. Russinovich with Andreas Polze
Windows OS architecture
[Figure: layered architecture diagram. User space: system support processes, service processes, user applications, and environment subsystems (Windows, OS/2, POSIX), running on top of subsystem DLLs (Windows DLLs). Kernel space: Executive, Windows User/GDI, device drivers, Kernel, and Hardware Abstraction Layer (HAL).]
Multiple OS Personalities
• Windows was designed to support multiple “personalities”, called environment subsystems
  • Programming interface
  • File system syntax
  • Process semantics
• Environment subsystems provide the exposed, documented interface between applications and the Windows native API
  • Each subsystem defines a different set of APIs and semantics
  • Subsystems implement these by invoking native APIs
  • Example: Windows CreateFile in Kernel32.Dll calls native NtCreateFile
• .exes and .dlls you write are associated with a subsystem
  • Specified by the LINK /SUBSYSTEM option
  • Cannot mix calls between subsystems
Environment Subsystems
• Three environment subsystems originally provided with NT:
  • Windows: Windows API (originally 32-bit, now also 64-bit)
  • OS/2: 1.x character-mode apps only
    • Removed in Windows 2000
  • POSIX: only POSIX 1003.1 (bare minimum Unix services: no networking, windowing, threads, etc.)
    • Removed in XP/Server 2003; an enhanced version ships with Services For Unix 3.0
• Of the three, the Windows subsystem provides access to the majority of native OS functions
• Of the three, Windows is required to be running
  • System crashes if the Windows subsystem process exits
  • POSIX and OS/2 subsystems are actually Windows applications
  • POSIX & OS/2 start on demand (first time an app is run)
    • Stay running until system shutdown
GNU/Linux OS architecture
[Figure: layered architecture diagram. User space: Xorg X Window System, /sbin/init, daemon services, and user applications, running on top of the GNU C Library (glibc) and other libraries. Kernel space: kernel and architecture-dependent kernel code.]
Comparing the Architectures
• Both Linux and Windows are monolithic
  • All core operating system services run in a shared address space in kernel mode
  • All core operating system services are part of a single module
    • Linux: vmlinuz
    • Windows: ntoskrnl.exe
• Windowing is handled differently:
  • Windows has a kernel-mode windowing subsystem
  • Linux has the user-mode X Window System
Comparing the Architectures
[Figure: two side-by-side stacks. Windows: application in user mode; Win32 windowing, system services (process management, memory management, I/O management, etc.), device drivers, and hardware-dependent code in kernel mode. Linux: application and X-Windows in user mode; system services (process management, memory management, I/O management, etc.), device drivers, and hardware-dependent code in kernel mode.]
Windows Kernel
• Windows is a monolithic but modular system
  • No protection among pieces of kernel code and drivers
• Support for modularity is somewhat weak:
  • Windows drivers allow for dynamic extension of kernel functionality
  • Windows XP Embedded has special tools / packaging rules that allow coarse-grained configuration of the OS
• Windows drivers are dynamically loadable kernel modules
  • A significant amount of code runs as drivers (including network stacks such as TCP/IP and many services)
  • Built independently from the kernel
  • Can be loaded on demand
  • Dependencies among drivers can be specified
Windows Kernel Components
[Figure: component diagram. Below the System Service Dispatcher sit the executive components: I/O Manager, File System Cache, Object Manager, Plug and Play Manager, Power Manager, Security Reference Monitor, Virtual Memory, Processes & Threads, Configuration Manager (registry), Local Procedure Call, and Windows graphics. Beneath them: device & file system drivers, graphics drivers, the Kernel, and the Hardware Abstraction Layer (HAL).]
Linux Kernel
• Linux is a monolithic but modular system
  • All kernel subsystems form a single piece of code with no protection between them
• Modularity is supported in two ways:
  • Compile-time options
  • Most kernel components can be built as a dynamically loadable kernel module (DLKM)
• DLKMs
  • Built separately from the main kernel
  • Loaded into the kernel at runtime and on demand (infrequently used components take up kernel memory only when needed)
  • Kernel modules can be upgraded incrementally
  • Support for minimal kernels that automatically adapt to the machine and load only those kernel components that are used
Linux Kernel Components
[Figure: component diagram. Below the System Call Interface (SCI) sit the Virtual File System, Buffer & Page Cache, Networking Subsystem, Virtual Memory (MM), Process Management (PM), and Interprocess Communication. Beneath them: file system drivers, device drivers, and architecture-specific kernel code.]
Comparing Layering, APIs, Complexity
• Windows
  • Kernel exports about 250 system calls (accessed via ntdll.dll)
  • Layered Windows/POSIX subsystems
  • Rich Windows API (17,500 functions on top of native APIs)
• Linux
  • Kernel supports about 200 different system calls
  • Layered BSD, Unix Sys V, POSIX shared system libraries
  • Compact APIs (1742 functions in Single UNIX Specification Version 3; not including X Window APIs)
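The layering on the Linux side (application, shared system library, kernel) can be observed from any process. The following is a minimal sketch, assuming a Linux system where `ctypes.CDLL(None)` resolves symbols from the C library already loaded into the process; Python is used only for brevity, and the helper name `pid_via_libc` is hypothetical.

```python
import ctypes
import os

# Handle to the symbols already loaded into this process, including the
# C library. CDLL(None) is assumed to work as on Linux (dlopen(NULL)).
libc = ctypes.CDLL(None)


def pid_via_libc() -> int:
    """Call the C library's getpid() wrapper, which issues the system call."""
    return libc.getpid()


if __name__ == "__main__":
    # Both paths end in the same kernel system call, so the values agree.
    print(pid_via_libc(), os.getpid())
```

Applications normally never issue the system call instruction themselves; the library wrapper plays the role that ntdll.dll plays on Windows.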
Windows vs. Linux kernel complexity
• Measured via a web server scenario: both systems serve the same picture and the resulting system call graph is recorded
  • http://www.visualcomplexity.com/vc/project.cfm?id=392
• cf. Keith Curtis, “After the Software Wars,” http://www.lulu.com/product/download/after-the-software-wars/5222163
Windows + IIS web server system call complexity
[Figure: system call graph. Source: http://www.visualcomplexity.com/vc/project.cfm?id=392]
Linux + Apache web server system call complexity
[Figure: system call graph. Source: http://www.visualcomplexity.com/vc/project.cfm?id=392]
Programs, Processes and Threads
• What is a program?
  • A sequence of instructions that can be executed
• What is a process?
  • Represents an instance of a running program
    • You create a process to run a program
    • Starting an application creates a process
  • Process defined by:
    • Address space
    • Resources (e.g. files)
• What is a thread?
  • An execution context within a process
  • Unit of scheduling (threads run, processes don’t run)
  • All threads in a process share the same per-process address space
    • Services provided so that threads can synchronize access to shared resources (critical sections, mutexes, events, semaphores)
[Figure: several threads inside one per-process address space, above the systemwide address space.]
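The points about a shared address space and synchronization services can be sketched in a few lines. This is an illustrative example rather than course material: the function name `count_with_threads` and its parameters are hypothetical, and Python's `threading.Lock` stands in for the mutex mentioned above.

```python
import threading


def count_with_threads(num_threads: int = 4, increments: int = 10_000) -> int:
    """Let several threads update one shared counter under a mutex."""
    state = {"counter": 0}      # one object, visible to every thread in the process
    lock = threading.Lock()     # mutex guarding the shared counter

    def worker() -> None:
        for _ in range(increments):
            with lock:          # critical section: one thread at a time
                state["counter"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["counter"]


if __name__ == "__main__":
    print(count_with_threads())
```

No data is copied between the threads: each worker modifies the same dictionary because all of them run in the same per-process address space.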
Single and Multithreaded Processes
[Figure: a single-threaded process has code, data, and files plus one set of registers and one stack. A multithreaded process shares code, data, and files among its threads, while each thread has its own registers and stack.]
Windows Process and Thread Representation
• Data structures for each process/thread:
  • Executive process block (EPROCESS)
  • Executive thread block (ETHREAD)
  • Win32 process block
  • Process environment block
  • Thread environment block
[Figure: the process environment block and thread environment blocks reside in the process address space; the process block (EPROCESS), Win32 process block, handle table, and thread blocks (ETHREAD) reside in the system address space.]
Process
• Container for an address space and threads
• Associated user-mode Process Environment Block (PEB)
• Primary access token
• Quota, debug port, handle table, etc.
• Unique process ID
• Queued to the job, global process list, and session list
• MM structures like the working set, VAD tree, etc.
Process Block Layout (EPROCESS & KPROCESS)
[Figure: field layout of the two structures.]
• EPROCESS: Kernel Process Block (or PCB), Process ID, Parent Process ID, Exit Status, Create and Exit Time, Next Process Block (link to the next EPROCESS), Quota Block, Memory Management Information, Exception Port, Debugger Port, Primary Access Token, Handle Table, Process Environment Block, Image File Name, Image Base Address, Process Priority Class, Win32 Process Block
• Kernel Process Block (KPROCESS): Dispatcher Header, Process Page Directory, Kernel Time, User Time, Inswap/Outswap List Entry, list of KTHREADs, Process Spin Lock, Processor Affinity, Resident Kernel Stack Count, Process Base Priority, Default Thread Quantum, Process State, Thread Seed, Disable Boost Flag
Objects and Handles
• Many Windows APIs take arguments that are handles to system-defined data structures, or “objects”
  • App calls CreateXxx, which creates an object and returns a handle to it
  • App then uses the handle value in API calls that operate on that object
• Three types of Windows objects (and therefore handles):
  • Windows “kernel objects” (events, mutexes, files, processes, threads, etc.)
    • Objects are managed by the Windows “Object Manager”, and represent data structures in system address space
    • Handle values are private to each process
  • Windows “GDI objects” (pens, brushes, fonts, etc.)
    • Objects are managed by the Windows subsystem
    • Handle values are valid system-wide / session-wide
  • Windows “User objects” (windows, menus, etc.)
    • Objects are managed by the Windows subsystem
    • Handle values are valid system-wide / session-wide
Handles and Security
• Process handle table
  • Is unique for each process
  • But is in system address space, hence cannot be modified from user mode
  • Hence, is trusted
• Security checks are made when the handle table entry is created
  • i.e. at CreateXxx time
  • Handle table entry indicates the “validated” access rights to the object
    • Read, Write, Delete, Terminate, etc.
• APIs that take an “already-opened” handle look in the handle table entry before performing the function
  • For example: TerminateProcess checks to see if the handle was opened for Terminate access
  • No need to check file ACL, process or thread access token, etc., on every write request: checking is done at file handle creation, i.e. “file open”, time
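The same "check at open time" idea can be observed with POSIX file descriptors, which is used here as an analogy since the Windows handle APIs are not available on every system. The sketch below is not Windows code; the function name `access_is_checked_at_open` is hypothetical, and the behavior shown is that of a POSIX system.

```python
import errno
import os
import tempfile


def access_is_checked_at_open() -> dict:
    """Show that access rights are bound to the descriptor when it is opened."""
    results = {}
    tmp_fd, path = tempfile.mkstemp()
    try:
        os.write(tmp_fd, b"hello")
        os.close(tmp_fd)

        fd = os.open(path, os.O_RDONLY)      # rights are validated here, once
        try:
            try:
                os.write(fd, b"x")           # descriptor lacks write access
                results["write_errno"] = None
            except OSError as exc:
                results["write_errno"] = exc.errno

            os.chmod(path, 0)                # remove every permission bit on the file
            # The open descriptor still works: permissions are not re-checked per read.
            results["read_after_chmod"] = os.read(fd, 5)
        finally:
            os.close(fd)
    finally:
        os.unlink(path)
    return results


if __name__ == "__main__":
    print(access_is_checked_at_open())
```

The write fails because the descriptor was opened read-only, and the read still succeeds after the file's permissions are removed, which mirrors the validated access rights stored in a handle table entry.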
Thread
• Fundamental schedulable entity in the system
• Represented by ETHREAD that includes a KTHREAD
• Queued to the process (both E and K thread)
• Unique thread ID
• Associated user-mode Thread Environment Block (TEB)
• User-mode stack
• Kernel-mode stack
• Processor Control Block (in KTHREAD) for CPU state when not running
Thread Block (ETHREAD & KTHREAD)
[Figure: field layout of the two structures.]
• ETHREAD: KTHREAD, Create and Exit Time, Process ID, pointer to EPROCESS, Thread Start Address, Access Token, Impersonation Information, LPC Message Information, Timer Information, Pending I/O Requests
• KTHREAD: Dispatcher Header, Total User Time, Total Kernel Time, Kernel Stack Information, System Service Table, Thread Scheduling Information, Trap Frame, Thread Local Storage, Synchronization Information, List of Pending APCs, Timer Block and Wait Blocks, List of Objects Being Waited On
• The thread block also references the TEB
Windows Process & Threads: Internal Data Structures
[Figure: a process object references its virtual address space descriptors (VADs), a handle table pointing to objects, and its list of threads.]
Linux Process and Thread Representation = Task
• Data structure for each thread:
  • Central structure: task_struct
  • No separate process and thread structures
  • Process notion is expressed via shared structures among multiple tasks
[Figure: in the system address space, each task (task_struct) points to memory management and file descriptor table structures.]
Task structure
• Virtual memory address space
• State and execution information
• Process ID, thread group ID
• Pointers to parent, children
• File descriptor table
• Signal handlers and pending signals
• Credentials (uid, gid)
• IPC information
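The difference between the per-task ID and the thread group ID can be seen from user space. The following is a small sketch, assuming Python 3.8 or later on a system where each Python thread is backed by a kernel task (as on Linux); the function name `task_ids` is hypothetical.

```python
import os
import threading


def task_ids(num_threads: int = 3) -> list:
    """Collect (process ID, kernel thread ID) pairs from the caller and workers."""
    ids = [(os.getpid(), threading.get_native_id())]
    lock = threading.Lock()
    # The barrier keeps all workers alive at the same moment, so the kernel
    # cannot hand a finished worker's ID to a later one.
    barrier = threading.Barrier(num_threads)

    def worker() -> None:
        barrier.wait()
        with lock:
            ids.append((os.getpid(), threading.get_native_id()))

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids


if __name__ == "__main__":
    for pid, tid in task_ids():
        print(f"pid={pid} tid={tid}")
```

Every pair reports the same process ID, which user space sees as the thread group ID, while each thread has its own kernel-assigned ID: several tasks sharing one address space and one file descriptor table.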
Linux task_struct details (1/2)
[Figure: fields of task_struct.] Task state, stack pointer, process flags, ptrace info, priority information, task list (links to other task_struct entries), memory management information (mm_struct), binary format, exit state, process ID, thread group ID, parent process (task_struct), list of children (task_struct), list of siblings (task_struct), thread group leader (task_struct), real-time priority, UID/GID credentials
Linux task_struct details (2/2)
[Figure: fields of task_struct, continued.] Executable name, IPC information, CPU-specific info (thread_struct), file descriptor table (files_struct), namespaces, signal handlers (signal_struct), VM state, terminal information (tty_struct), memory policy, control group information, tracer state flags, ...
Linux Tasks: Internal Data Structures
[Figure: task objects reference an MM object, which holds the virtual memory area (VMA) descriptors, and a file descriptor table, which references file objects.]
Process Management Comparison
• Linux
  • Process is called a task
    • Basic address space, file descriptor table, statistics
    • Parent/child relationship
    • Basic scheduling unit
  • Threads
    • No threads per se
    • Tasks can act like Windows threads by sharing file descriptor table, PID and address space
    • PThreads: the POSIX thread API; glibc's NPTL implementation maps each pthread to one kernel task
• Windows
  • Process
    • Address space, handle table, statistics and at least one thread
    • No inherent parent/child relationship
  • Threads
    • Basic scheduling unit
    • Fibers: cooperative user-mode threads
Process Lifecycle
• Linux
  • Process creation
    • fork(): child's address space is a copy of the parent's
    • exec(): a new process image gets loaded
  • Process termination, two steps:
    • Process finishes execution or receives SIGTERM / SIGKILL
    • Parent is expected to call waitpid() to learn the exit status of the child
  • Zombie processes: terminated children that have not yet been waited for
  • Re-parenting to init when the parent exits first
• Windows
  • Process creation
    • CreateProcess()
    • Address space filled with ntdll and process image
  • Process termination
    • ExitProcess() / TerminateProcess()
    • Partially destroyed on last thread exit
    • Totally destroyed on last dereference
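The Linux creation and termination steps listed above can be sketched as follows. Python's `os` module is used because its functions map directly onto the POSIX calls of the same names; the function name `run_child` and the choice of re-executing the Python interpreter as the new image are assumptions made for illustration.

```python
import os
import sys


def run_child(exit_code: int = 7) -> int:
    """fork() a child, exec() a new image in it, and reap it with waitpid()."""
    pid = os.fork()                  # the child starts as a copy of the parent
    if pid == 0:
        try:
            # Replace the copied image with a fresh interpreter that exits at once.
            os.execv(sys.executable,
                     [sys.executable, "-c", f"raise SystemExit({exit_code})"])
        finally:
            os._exit(127)            # reached only if exec itself failed

    # Until this call returns, the terminated child remains a zombie.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)      # child was killed by a signal


if __name__ == "__main__":
    print(run_child())
```

The parent learns the child's exit status only through `waitpid()`; skipping that call is what leaves zombie entries behind.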
Asynchronous Procedure Calls (APC) and signals
• Windows
  • APC
    • For executing a procedure in a process’ context
    • Kernel / user mode APC
    • POSIX subsystem uses kernel mode APCs to emulate UNIX signals
    • User mode APC is only delivered when the thread is in an alertable wait state
    • QueueUserAPC()
• Linux
  • Signals
    • Asynchronous notification for processes
    • Handler routines for processing signals (only user-space)
    • Signal queue processing initiated during switches from kernel to user mode
    • Signals can be blocked, except SIGKILL (and SIGSTOP)
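The Linux signal behavior described above (user-space handlers, blocking, and the special status of SIGKILL) can be tried out with the sketch below. It assumes Python 3.8 or later on a POSIX system and must run in the main thread, since Python only allows handlers to be installed there; the function name `demo_blocked_signal` is hypothetical.

```python
import signal


def demo_blocked_signal() -> dict:
    """Block SIGUSR1, raise it, then unblock it and watch the handler run."""
    received = []
    result = {}
    old_handler = signal.signal(signal.SIGUSR1,
                                lambda signum, frame: received.append(signum))
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
    try:
        signal.raise_signal(signal.SIGUSR1)          # sent while blocked
        result["pending_while_blocked"] = signal.SIGUSR1 in signal.sigpending()
        result["handled_while_blocked"] = bool(received)

        signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGUSR1})
        result["handled_after_unblock"] = bool(received)

        # A request to block SIGKILL is silently ignored by the kernel.
        signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGKILL})
        current = signal.pthread_sigmask(signal.SIG_BLOCK, set())
        result["sigkill_blocked"] = signal.SIGKILL in current

        # Installing any handler or disposition for SIGKILL is refused.
        try:
            signal.signal(signal.SIGKILL, signal.SIG_IGN)
            result["sigkill_ignorable"] = True
        except OSError:
            result["sigkill_ignorable"] = False
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
        if old_handler is not None:
            signal.signal(signal.SIGUSR1, old_handler)
    return result


if __name__ == "__main__":
    print(demo_blocked_signal())
```

While the signal is blocked it is only recorded as pending; delivery, and therefore the handler, happens once the mask allows it.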
Process groups, Sessions, Jobs
• Linux
  • Process groups
    • Independent processes can be combined into a group
    • setpgrp()
    • Sending signals to the whole group
    • Note: processes connected with pipes are in the same process group
  • Sessions
    • Process groups can be combined into sessions
    • Login sessions
    • setsid()
• Windows
  • Jobs
    • Processes can be grouped into a job object
    • CreateJobObject() / AssignProcessToJobObject()
    • Maximum number of processes
    • Job-wide user mode CPU time limit
    • Job scheduling/priority class
    • Terminate all processes
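The Linux half of this comparison can be exercised with the following sketch, which covers only process groups and sessions and does not model Windows job objects. It assumes a POSIX system; the function names are hypothetical, and `os.setpgid()` is used as the general form of `setpgrp()`.

```python
import os
import signal
import time


def _spawn_sleeper() -> int:
    """Fork a child that only sleeps; the parent gets the child's PID."""
    pid = os.fork()
    if pid == 0:
        try:
            time.sleep(60)
        finally:
            os._exit(0)               # never fall back into the caller's code
    return pid


def signal_whole_group() -> dict:
    """Put two children into one process group and signal the group once."""
    leader = _spawn_sleeper()
    os.setpgid(leader, leader)        # first child leads a new group
    member = _spawn_sleeper()
    os.setpgid(member, leader)        # second child joins that group
    same_group = os.getpgid(leader) == os.getpgid(member) == leader
    os.killpg(leader, signal.SIGKILL)  # one call reaches every member
    killed = []
    for pid in (leader, member):
        _, status = os.waitpid(pid, 0)  # reap both children
        killed.append(os.WIFSIGNALED(status)
                      and os.WTERMSIG(status) == signal.SIGKILL)
    return {"same_group": same_group, "all_killed": all(killed)}


def ids_after_setsid() -> tuple:
    """Return (pid, process group ID, session ID) of a child that called setsid()."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(r)
            os.setsid()               # new session and new process group
            os.write(w, f"{os.getpid()} {os.getpgrp()} {os.getsid(0)}".encode())
        finally:
            os._exit(0)
    os.close(w)
    data = os.read(r, 100)
    os.close(r)
    os.waitpid(pid, 0)
    return tuple(int(x) for x in data.split())


if __name__ == "__main__":
    print(signal_whole_group())
    print(ids_after_setsid())
```

After `setsid()` the child is the leader of both a new session and a new process group, so all three IDs are equal. Terminating a whole group with one `killpg()` call is the closest Linux counterpart to the "terminate all processes" operation on a Windows job.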