Internet Services Administration CS35910

System Startup/Shutdown and Process Management (3 lectures)

How An Operating System Loads Sidney Harris

Bootstrapping How to go from nothing to something… xkcd … or something to nothing

Hard Disk Drives

Hard Disk Layout

Partitioning • Why partition a drive? • Separate O.S. from data (Principle 12: Separation) • Make better use of disk space • Ease of upgrades • Improved performance (less metadata) • Disk size limitations • Avoid filesystem fragmentation • Problems with partitioning • How big should the partition be? • What letter should we assign CDROM etc.?

PC Disk Partitions [Diagram: the MBR holds the boot code and a partition table with entries 1 to 4; one entry may be an extended partition, which contains further partitions; each partition starts with its own boot sector]
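The MBR layout in the diagram can be sketched in code. This is a minimal illustration, assuming the conventional PC layout (a 512-byte sector, four 16-byte table entries starting at offset 446, and the 0x55 0xAA signature in the last two bytes). The function names and the demo partition values are hypothetical and exist only for this example.

```python
import struct

SECTOR = 512
TABLE_OFFSET = 446                      # boot code occupies the bytes before this
ENTRY = struct.Struct("<B3sB3sII")      # status, CHS start, type, CHS end, first LBA, sector count
EXTENDED_TYPES = {0x05, 0x0F}           # type codes conventionally used for extended partitions


def parse_mbr(sector):
    """Return the non-empty partition table entries of a 512-byte MBR sector."""
    if len(sector) != SECTOR or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    partitions = []
    for slot in range(4):
        status, _chs_first, ptype, _chs_last, first_lba, count = ENTRY.unpack_from(
            sector, TABLE_OFFSET + slot * ENTRY.size
        )
        if ptype == 0:                  # type 0 marks an unused entry
            continue
        partitions.append({
            "slot": slot + 1,
            "bootable": status == 0x80,
            "type": ptype,
            "extended": ptype in EXTENDED_TYPES,
            "first_lba": first_lba,
            "sectors": count,
        })
    return partitions


def make_demo_mbr():
    """Build a synthetic MBR (hypothetical values) so the parser can be tried safely."""
    sector = bytearray(SECTOR)
    ENTRY.pack_into(sector, TABLE_OFFSET, 0x80, bytes(3), 0x07, bytes(3), 2048, 204800)
    ENTRY.pack_into(sector, TABLE_OFFSET + 3 * ENTRY.size, 0x00, bytes(3), 0x05, bytes(3), 206848, 409600)
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


if __name__ == "__main__":
    for part in parse_mbr(make_demo_mbr()):
        print(part)
```

The parser works on a byte string rather than a device, so it can be tried on a synthetic sector without touching a real disk.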

NT From PC BIOS Then a miracle occurs

Extensible Firmware Interface (image: M. Sikma, Creative Commons BY-SA 2.5)

GUID Partition Table • Supported in Windows, MacOS and Linux • Requires EFI firmware in Windows/MacOS • Supports disks up to 9.4 ZB (9.4 × 10^21 bytes) • 128+ partitions • Negative LBAs count down from end of volume [Diagram: Protective MBR, Primary GPT Header, partition entries 1-128, Partition 1, Partition 2, more partitions, backup partition entries 1-128, Secondary GPT Header]
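The 9.4 ZB limit can be checked with a little arithmetic. This sketch assumes 64-bit logical block addresses and 512-byte sectors; the 32-bit case is included only for comparison with the older MBR scheme. The function name is made up for this example.

```python
SECTOR_BYTES = 512


def max_disk_bytes(lba_bits, sector_bytes=SECTOR_BYTES):
    """Bytes addressable when sector numbers are lba_bits wide."""
    return (2 ** lba_bits) * sector_bytes


if __name__ == "__main__":
    gpt = max_disk_bytes(64)
    mbr = max_disk_bytes(32)
    print(f"GPT, 64-bit LBAs: {gpt} bytes = {gpt / 10**21:.1f} ZB")
    print(f"MBR, 32-bit LBAs: {mbr} bytes = {mbr / 2**40:.0f} TiB")
```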

Minimum Tasks Necessary • 32 bit mode after 10 instructions • Open Source – Linux Based • Extensible • Multiple Payloads

Cluster Sizes • PC disks use 512 byte sectors • NTFS allows a cluster size to be specified • Always a multiple of 512 bytes • Cluster sizes up to 64K supported, but only up to 4K allow compression [Diagram: files laid out across clusters 1 to 9]

Filesystem Cluster Size • Small cluster size wastes less space • Especially when many small files will be stored • Large cluster size reduces metadata • Data used to keep track of data on the filesystem • For improved performance, create a partition for the swap file and use format /a:64K /fs:ntfs
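The trade-off between wasted space and metadata can be made concrete with a small calculation. This is a rough sketch: the file sizes are invented for illustration, and "metadata" is approximated simply as the number of clusters that have to be tracked.

```python
def clusters_needed(file_bytes, cluster_bytes):
    """Whole clusters occupied by a file (integer ceiling division)."""
    return -(-file_bytes // cluster_bytes)


def slack_bytes(file_sizes, cluster_bytes):
    """Space allocated but not used, summed over all files."""
    return sum(clusters_needed(size, cluster_bytes) * cluster_bytes - size
               for size in file_sizes)


def clusters_tracked(file_sizes, cluster_bytes):
    """Number of clusters the filesystem must keep track of."""
    return sum(clusters_needed(size, cluster_bytes) for size in file_sizes)


if __name__ == "__main__":
    sizes = [100, 700, 5000, 70000]          # hypothetical file sizes in bytes
    for cluster in (512, 4096, 65536):
        print(cluster, slack_bytes(sizes, cluster), clusters_tracked(sizes, cluster))
```

With these sample sizes the slack grows steeply as the cluster size rises while the number of clusters to track falls, which is why a large cluster suits one big swap file and a small cluster suits many small files.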

Which Filesystem?

UNIX Filesystems • EXT2 – a non journaling Unix (Linux) filesystem • Index nodes (inodes) point to disk data blocks • Adjustable blocks per inode (cluster size) • May adjust number of inodes • Only when filesystem is created • EXT3 adds a journal to EXT2, but at a performance penalty • Many others • Linus Torvalds “HFS+ is complete and utter …” • File and directory catalog is in a single data structure • C.f. Windows registry

Delayed Write • Advantages • Processes need not be blocked for slow I/O operations • Many writes never make it to disk at all • More data can be handled at once • Disadvantages • System failure can cause file loss • Databases and processes using shared files must avoid caching
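A program that cannot tolerate the file-loss risk of delayed write can ask the kernel to flush its data explicitly. The sketch below shows one common way to do that from Python; `write_durably` is a name made up for this example, and how far "on disk" really extends still depends on the drive and filesystem.

```python
import os
import tempfile


def write_durably(path, data):
    """Write data and ask the OS to push it to the device before returning."""
    with open(path, "wb") as handle:
        handle.write(data)            # held in the program's own buffer
        handle.flush()                # handed to the kernel cache (still a delayed write)
        os.fsync(handle.fileno())     # request that the kernel write it out now


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "demo.dat")
    write_durably(target, b"important record\n")
    print(os.path.getsize(target))
```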

Delayed Allocation (Allocate on Flush) • On write, subtract space from free space counter • Allocate actual blocks only on flush • Reduces filesystem fragmentation
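The idea can be illustrated with a toy model. This is not how any particular filesystem implements it: the class, its fields and the "next free block" placement policy are all assumptions chosen to show why deferring placement until flush lets each file be laid out in one contiguous run.

```python
class DelayedAllocator:
    """Toy allocate-on-flush model: reserve space on write, place blocks on flush."""

    def __init__(self, total_blocks):
        self.free_count = total_blocks   # free space counter
        self.next_free = 0               # next unplaced block number
        self.pending = {}                # file name -> blocks reserved, not yet placed
        self.extents = {}                # file name -> list of (start, length)

    def write(self, name, blocks):
        """Only adjust the free space counter; no block numbers are chosen yet."""
        if blocks > self.free_count:
            raise OSError("no space left on device")
        self.free_count -= blocks
        self.pending[name] = self.pending.get(name, 0) + blocks

    def flush(self):
        """Place each file's accumulated writes as one contiguous extent."""
        for name, blocks in self.pending.items():
            self.extents.setdefault(name, []).append((self.next_free, blocks))
            self.next_free += blocks
        self.pending.clear()


if __name__ == "__main__":
    fs = DelayedAllocator(total_blocks=100)
    fs.write("a", 2)
    fs.write("b", 3)
    fs.write("a", 4)                     # interleaved with writes to "b"
    fs.flush()
    print(fs.extents)
```

Although the writes to the two files were interleaved, each file ends up in a single extent because placement was decided only once the total size was known.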

Filesystem Checking • fsck on Unix/Linux, CHKDSK/Scandisk on Windows • Checks for file consistency, orphaned files and incorrect information in inodes/filesystem table, and attempts to fix these • Journaled filesystems track changes more carefully, and reduce the work of fsck • After fsck is run on Linux, halt -n is called to prevent a sync. A sync might otherwise overwrite the fsck changes, putting the filesystem back to where it was before fsck was run.

Journaled Filesystems • Multiple blocks updated in filesystem write (data + metadata) • Active data often stays in system cache for performance reasons (delayed write) • Non-journaled filesystems cannot track what is being updated at time of system failure, so all blocks must be checked for consistency • Journals log all block updates before they happen, to a round-robin location or file • Journals may log metadata (filesystem structure) or data and metadata (everything) • After system failure the log is “replayed,” and the filesystem is brought into a consistent state

Journaled Filesystems • An area of the disk is set aside for all journaling activities ($Logfile in NTFS) • Atomic Operation: • Header written to the log • Block updates written to the log • Header updated to reflect completion of block updates (“completed”) • Later the blocks are copied from the journal and the operation is marked “committed”

Journal Checking and Rollback • When the system restarts the file system check utility reads the journal log • Actions not marked completed are ignored • Actions marked completed but not committed have all associated blocks copied out to the disk again • Actions marked completed and committed are ignored – they are completed and consistent
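The three recovery rules above can be sketched as a short replay loop. The journal representation here (a list of dictionaries with `completed` and `committed` flags) and the dictionary standing in for the disk are assumptions made for illustration, not the on-disk format of any real filesystem.

```python
def replay(journal, disk):
    """Bring disk (block number -> contents) up to date from the journal.

    Returns the ids of the entries that had to be copied out again.
    """
    replayed = []
    for entry in journal:
        if not entry["completed"]:
            continue                      # log write was interrupted: ignore
        if entry["committed"]:
            continue                      # already copied to disk: nothing to do
        disk.update(entry["blocks"])      # completed but not committed: copy out again
        entry["committed"] = True
        replayed.append(entry["id"])
    return replayed


if __name__ == "__main__":
    journal = [
        {"id": 1, "completed": True, "committed": True, "blocks": {10: "old-A"}},
        {"id": 2, "completed": True, "committed": False, "blocks": {10: "A", 11: "B"}},
        {"id": 3, "completed": False, "committed": False, "blocks": {12: "C"}},
    ]
    disk = {10: "stale"}
    print(replay(journal, disk), disk)
```

Entry 3 never reaches the disk: its log record was not completed, so the filesystem stays as it was before that operation began.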

Loading UNIX • Boot manager starts kernel • E.g. LILO starts vmlinuz on the /boot partition • First process (init) is started • Init checks filesystem integrity before mounting filesystems in read/write mode • Paging areas designated • Filesystem cleanup, preservation of editor buffers etc. • Daemons started for major subsystems • Networking code and daemons started, remote disk space mounted • Remove /etc/nologin and start getty

Unix Single User Mode • Designed for administrative maintenance • No networking code loaded • Getty not started • Single user password provides security • Allows hardware maintenance and debugging of problematic services • Only root filesystem is mounted by default

UNIX Run Levels Typical Values – No Standard

Process Management

The Lifecycle of a Process • Init – the parent of all processes • Processes created by fork • Image replaced by exec • Known as “fork and exec” • Inherits environment of parent
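The fork and exec pattern can be tried directly from Python on a Unix-like system. This is a minimal sketch: `fork_and_exec` and the `LIFECYCLE_DEMO` variable are names invented for the example, and the exit status 127 for a failed exec is a shell convention adopted here, not something the kernel imposes.

```python
import os
import sys


def fork_and_exec(argv):
    """Create a child with fork, replace its image with exec, return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: same environment as the parent, about to become a new program.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)                 # only reached if exec failed
    # Parent: wait for the child so it does not linger in the process table.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else None


if __name__ == "__main__":
    os.environ["LIFECYCLE_DEMO"] = "7"    # set in the parent, inherited by the child
    child_code = "import os, sys; sys.exit(int(os.environ['LIFECYCLE_DEMO']))"
    print(fork_and_exec([sys.executable, "-c", child_code]))
```

The child exits with the value it read from the inherited environment, which shows both halves of the slide: the new image comes from exec, the environment comes from the parent.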

Process Cleanup • When a process finishes, it calls _exit to notify the kernel it is ready to die. An exit code is supplied • Parent process must acknowledge the death of a child before the process can die (using a call to wait) • Parent receives exit code and may access summary of child resources • If the parent does not call wait, the dead child lingers as a zombie • If the parent dies before the child, the child becomes orphaned • Init adopts these orphaned processes and performs the wait for them
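The exit and wait handshake described above looks like this in practice. The sketch assumes a Unix-like system; `reap_child` is a name invented for the example, and `os.wait4` is used because it returns the resource usage summary along with the exit status.

```python
import os


def reap_child(exit_code):
    """Fork a child that exits at once, then collect its status and resource usage."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)               # child: tell the kernel we are ready to die
    # Parent: until this call returns, the dead child remains in the process table.
    reaped_pid, status, usage = os.wait4(pid, 0)
    return reaped_pid == pid, os.WEXITSTATUS(status), usage


if __name__ == "__main__":
    matched, code, usage = reap_child(42)
    print(matched, code, usage.ru_utime, usage.ru_maxrss)
```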

Linux Idiosyncrasies • Several spontaneous processes created – not created by fork and exec of init • Number and nature vary but usually include • kpiod • kswapd • kflushd • kupdated • Etc. • These are really portions of the kernel made to look like fully fledged processes, but are not truly so.

Daemons and Services • Daemons • background processes that perform a specific function or system related task • Coined by Mick Bailey, who cited the O.E.D definition for the older form, Daemon, as “an attendant spirit that influences one’s character or personality” – neither good nor evil! • Often called “services”, e.g. in NT/XP • Often started at boot time • XP services can start at various stages of boot or can be manually started • Unix services started by init scripts at each run level • inetd/xinetd are special daemons that can spawn other daemons on demand

Principle 41 • Separate UIDs for Services • Each service which does not require privileged access to the system should be given a separate non-privileged user-ID. This restricts service privileges, preventing any potential abuse should the service be hijacked by system attackers; it also makes clear which service is responsible for which process in the process table.

Periodic Processes • Some tasks must be run at specific times • E.g. backups tend to run at night when the network is quiet • Unix/Linux solutions • crond – regular scheduled commands • atd – run a command at a particular time • NT/XP • winat – Runs programs at specific times, and can do so on a regular schedule • Security context of winat commands can be problematic

Requirement for Orderly Startup and Shutdown • Disk write buffers must be flushed • TCP connections may be left half open • Application level programs should be shut down neatly in case of poor crash handling

Techniques for Startup and Shutdown • Windows NT/XP • Start/Shutdown • Shutdown resource kit tool • E.g. shutdown -t 120 -m \\ws-123 -c "System Maintenance" • Linux • Telinit • Move to init state • Init 0 usually runs halt • Halt • At other run levels does the same as telinit 0 • Shutdown (Recommended) • Interface that allows network messages • E.g. shutdown -h +2 "System Maintenance" (the time argument is in minutes)

A Different Philosophy • NT/XP are derived from consumer oriented operating systems • Many problems may immediately be solved with a reboot • Installations and service packs require reboots before new system files can be loaded • UNIX systems are designed for high availability • Problems often too subtle to be fixed with a reboot – and a reboot can occasionally be disastrous • Services are decoupled from the kernel and may be stopped, upgraded and restarted without a reboot

Principle 24 (Reliability) • Any model of system infrastructure must have reliability as one of its chief goals. Down-time can often be measured in real money • Principles of Network and System Administration

Process Management in Linux • Processes accept signals • It is possible to send any of these signals with the kill command, as long as the user has ownership of the process or is acting as the superuser • Use man 7 signal to find all signals supported on a system
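A process can also install its own handler for most signals. The sketch below has a process send SIGUSR1 to itself, which is the same system call the kill command uses. The function names are hypothetical, the short polling loop is only there to make the example robust, and it must run in the main thread of a Unix-like system.

```python
import os
import signal
import time

received = []                             # signal numbers seen by the handler


def handler(signum, frame):
    """Record the signal instead of letting the default action end the process."""
    received.append(signum)


def send_to_self(sig=signal.SIGUSR1, timeout=1.0):
    """Install the handler, signal this process, and report whether it arrived."""
    signal.signal(sig, handler)
    os.kill(os.getpid(), sig)             # what `kill -USR1 <pid>` does from a shell
    deadline = time.monotonic() + timeout
    while sig not in received and time.monotonic() < deadline:
        time.sleep(0.01)
    return sig in received


if __name__ == "__main__":
    print(send_to_self())
    print(signal.SIGTERM.value, signal.SIGKILL.value)
```

SIGKILL and SIGSTOP cannot be caught this way, which is why they always work against a misbehaving process.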

Process Management in Linux

Niceness • Processes are prioritised and scheduled through their priority value, which can be affected by a nice value • Most user processes have a nice value of 0 by default • A user may lower the priority of a system-intensive process by using the nice and renice commands. Values up to 19 are supported • Negative nice values are used to increase a process priority, down to -20 • Only a superuser can specify a negative nice value • Avoid nice values below -10, as this tends to give the process higher priority than important system processes in some versions of UNIX • NTP uses a very low nice value to avoid any system delays
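A process can raise its own nice value without any special privilege. This sketch uses `os.nice`, which adds an increment to the current value and returns the result; `lower_priority` is a name made up for the example. The change cannot be undone by an unprivileged user, so try it in a throwaway process.

```python
import os


def lower_priority(increment=5):
    """Make this process nicer by increment; return (old, new) nice values."""
    before = os.nice(0)                   # an increment of 0 just reads the value
    after = os.nice(increment)            # positive increments need no privilege
    return before, after


if __name__ == "__main__":
    old, new = lower_priority(5)
    print(f"nice value went from {old} to {new}")
```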

Principle 4 (Communities) • What one member of a cooperative community does affects every other member and vice versa. Each member of the community therefore has a responsibility to consider the effect of his/her actions on all the other users • Principles of Network and System Administration

Process Management in NT/XP

Permissions and Privileges • Permissions (rights) always associated with a particular object • Permission to read a file etc. • Privileges associated with particular actions on the system and granted to users • E.g. SE_SYSTEMTIME_NAME privilege to change system time

Runaway processes • May be user processes or system processes • Identified by excessive CPU usage • Often best to stop them quickly with kill -STOP in Linux. A kill -CONT can allow the process to continue later • Cause of problem may then be removed (e.g. filesystem that is full) • NT/XP usually forces you to kill the process or its associated application
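The stop and continue technique can be tried safely on a harmless child process. In this sketch the sleeping child stands in for the runaway process, and the function names are invented for the example. The `waitpid` check only works because the target is a child of the caller; for an arbitrary process an administrator would just send the two signals.

```python
import os
import signal
import subprocess
import sys


def pause_and_resume(pid):
    """Suspend a child with SIGSTOP, confirm it stopped, then let it continue."""
    os.kill(pid, signal.SIGSTOP)                    # same as: kill -STOP pid
    _, status = os.waitpid(pid, os.WUNTRACED)       # returns once the child is stopped
    stopped = os.WIFSTOPPED(status)
    # ... the cause of the problem would be fixed here ...
    os.kill(pid, signal.SIGCONT)                    # same as: kill -CONT pid
    return stopped


def demo():
    """Run the pause and resume cycle against a sleeping child, then clean up."""
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    try:
        return pause_and_resume(child.pid)
    finally:
        child.kill()
        child.wait()


if __name__ == "__main__":
    print(demo())
```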

Memory Leaks • Common in code written in C where there is no garbage collection • Steadily consumes all the system's memory • On a Windows system a crashed application often continues to consume memory – a reboot will clear the problem • May be identified and handled on Linux • top is a useful tool for discovering memory leaks and CPU hogs

Windows v Unix • Perceptions: • Windows has more bugs/is more vulnerable to viruses? • Unix simpler to manage? • Microsoft is untrustworthy? • Windows is slow? • Windows Microkernel is better than Unix monolithic kernel?
