Forking, w/ Unified VM Wrapup Vivek Pai / Kai Li Princeton University
Gedankenverification • Why is perfect LRU reasonable for the filesystem disk cache but not for the VM pages? • How does the filesystem know what pages it has? • What are the steps involved prior to a page being demand loaded in a VM system? • What are the possible results of a TLB miss?
Mechanics • Last quiz graded • Midterm on Thursday • In class • Closed everything (no book, no notes, etc) • Apply Occam’s razor when answering questions • Some short answer, some quiz-like questions, maybe some multiple choice or true/false • Don’t panic
Fear Is The Mind-killer “I must not fear. Fear is the mind-killer Fear is the little death that brings total obliteration I will face my fear I will permit it to pass over me and through me And when it has gone past I will turn the inner eye to see its path Where the fear has gone there will be nothing Only I will remain” – Dune, Frank Herbert, specifically from the Bene Gesserit Litany Against Fear
The Big Picture • We’ve talked about single evictions • Most computers are multiprogrammed • Single eviction decision still needed • New concern – allocating resources • How to be “fair enough” and achieve good overall throughput • This is a competitive world – local and global resource allocation decisions
Imagine a Global LRU • Global – across all processes • Idea – when a page is needed, pick the oldest page in the system • Problems? Process mixes? • Interactive processes • Active large-memory sweep processes • Mitigating damage?
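The sweep problem can be made concrete with a toy simulation. This is a minimal sketch, not how any real kernel tracks page age: the names `struct memory`, `touch`, and `resident`, the 8-frame memory, and the logical clock are all assumptions made for illustration. It shows a large-memory sweep pushing every page of an interactive process out of a global LRU.

```c
#include <assert.h>
#include <stddef.h>

#define NFRAMES 8   /* hypothetical, tiny physical memory */

struct frame {
    int pid;              /* owning process, -1 if the frame is free */
    int page;             /* virtual page number within that process */
    unsigned long used;   /* logical time of the most recent access */
};

struct memory {
    struct frame frames[NFRAMES];
    unsigned long clock;
};

void memory_init(struct memory *m)
{
    m->clock = 0;
    for (size_t i = 0; i < NFRAMES; i++)
        m->frames[i].pid = -1;
}

/* Access (pid, page). Returns 1 if the access faulted, 0 if it hit. */
int touch(struct memory *m, int pid, int page)
{
    assert(pid >= 0);
    m->clock++;

    /* Hit: refresh the page's age and return. */
    for (size_t i = 0; i < NFRAMES; i++) {
        struct frame *f = &m->frames[i];
        if (f->pid == pid && f->page == page) {
            f->used = m->clock;
            return 0;
        }
    }

    /* Fault: take a free frame if any, else the oldest frame system-wide. */
    size_t victim = 0;
    for (size_t i = 0; i < NFRAMES; i++) {
        if (m->frames[i].pid == -1) { victim = i; break; }
        if (m->frames[i].used < m->frames[victim].used)
            victim = i;
    }
    m->frames[victim].pid = pid;
    m->frames[victim].page = page;
    m->frames[victim].used = m->clock;
    return 1;
}

/* Number of frames currently held by pid. */
int resident(const struct memory *m, int pid)
{
    int n = 0;
    for (size_t i = 0; i < NFRAMES; i++)
        if (m->frames[i].pid == pid)
            n++;
    return n;
}
```

If process 1 touches two pages and process 2 then sweeps through `NFRAMES` distinct pages, `resident(&m, 1)` drops to zero and process 1 faults on its next access, even though the sweep never reuses a page.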
Source of Disk Access • VM System • Main memory caches - full image on disk • Filesystem • Even here, caching very useful • New competitive pressure/decisions • How do we allocate memory to these two? • How do we know we’re right?
Partitioning Memory • Originally, specified by administrator • 20% used as filesystem cache by default • On fileservers, admin would set to 80% • Each subsystem owned pages, replaced them • Observation: they’re all basically pages • Why not let them compete? • Result: unified memory systems – file/VM
File Access Efficiency • read(fd, buf, size) • Buffer in process’s memory • Data exists in two places – filesystem cache & process’s memory • Known as “double buffering” • Various scenarios • Many processes read same file • Process wants only parts of a file, but doesn’t know which parts in advance
Result: Memory-Mapped Files (Figure: Processes A, B, and C shown twice: once with a separate copy of the file in each process, and once with each process holding a map onto a single file.)
Lazy Versus Eager • Eager: do things right away • read(fd, buf, size) – returns # bytes read • Bytes must be read before read completes • What happens if size is big? • Lazy: do them as they’re needed • mmap(…) – returns pointer to mapping • Mapping must exist before mmap completes • When/how are bytes read? • What happens if size is big?
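The two styles can be put side by side. This is a sketch under stated assumptions: `load_eager` and `load_lazy` are hypothetical helper names, the file is assumed non-empty, and whether pages are actually faulted in on first access depends on the operating system. Only `open`, `read`, `fstat`, `mmap`, and `close` are real POSIX calls.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Eager: copy up to `size` bytes of the file into the caller's buffer.
 * The data now exists in the filesystem cache and in `buf`.
 * Returns the number of bytes copied, or -1 on error. */
ssize_t load_eager(const char *path, char *buf, size_t size)
{
    assert(path != NULL && buf != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    size_t total = 0;
    while (total < size) {
        ssize_t n = read(fd, buf + total, size - total);
        if (n < 0) { close(fd); return -1; }
        if (n == 0) break;              /* end of file */
        total += (size_t)n;
    }
    close(fd);
    return (ssize_t)total;
}

/* Lazy: map the whole file read-only. The mapping must exist when mmap
 * returns, but the file bytes need not have been read yet.
 * Returns the mapping and stores its length in *len, or NULL on error. */
const char *load_lazy(const char *path, size_t *len)
{
    assert(path != NULL && len != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                          /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return NULL;
    *len = (size_t)st.st_size;
    return p;
}
```

With a large `size`, `load_eager` cannot return until every byte has been copied, while `load_lazy` returns as soon as the mapping is set up and defers the I/O to whichever pages the caller later touches.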
Semantics: How Things Behave • What happens when • Two processes obtain data (read or mmap) • One process modifies data • Two processes obtain data (read or mmap) • A third process modifies data • The two processes access the data
Being Too Smart… • Assume a unified VM/File scheme • You’ve implemented perfect Global LRU • What happens on a filesystem “dump”?
Amdahl’s Law • Gene Amdahl (IBM, then Amdahl) • Noticed the bottlenecks to speedup • Assume speedup affects one component • New time = (1 - affected) + affected/speedup, with the old time normalized to 1 and “affected” the fraction of it that is sped up • In other words, diminishing returns
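A quick numeric check of the law, as a sketch: the function names are hypothetical, the original run time is normalized to 1, `affected` is the fraction of that time the improvement applies to, and `speedup` is how much faster that fraction becomes.

```c
#include <assert.h>

/* Fraction of the original run time that remains after the affected
 * fraction is made `speedup` times faster. */
double amdahl_new_time(double affected, double speedup)
{
    assert(affected >= 0.0 && affected <= 1.0 && speedup > 0.0);
    return (1.0 - affected) + affected / speedup;
}

/* Overall speedup of the whole program. */
double amdahl_overall(double affected, double speedup)
{
    return 1.0 / amdahl_new_time(affected, speedup);
}
```

With `affected = 0.5`, doubling the speed of that half gives a new time of 0.75, and even an arbitrarily large `speedup` cannot push the overall speedup to 2: the unaffected half becomes the bottleneck.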
NT x86 Virtual Address Space Layouts • Default layout • 00000000–7FFFFFFF: application code, globals, per-thread stacks, DLL code • From 80000000: kernel & exec, HAL, boot drivers • From C0000000: process page tables, hyperspace • From C0800000 to FFFFFFFF: system cache, paged pool, nonpaged pool • 3-GB layout • 00000000–BFFFFFFF: 3-GB user space • C0000000–FFFFFFFF: 1-GB system space
Virtual Address Space in Win95 and Win98 • 00000000–7FFFFFFF: user accessible; unique per process (per application), user mode • From 80000000: shared, process-writable (DLLs, shared memory, Win16 applications); systemwide, user mode • From C0000000 to FFFFFFFF: operating system (Ring 0 components); systemwide, kernel mode
Details with VM Management • Create a process’s virtual address space • Allocate page table entries (reserve in NT) • Allocate backing store space (commit in NT) • Put related info into PCB • Destroy a virtual address space • Deallocate all disk pages (decommit in NT) • Deallocate all page table entries (release in NT) • Deallocate all page frames
More Lazy Versus Eager Issues • Assume 1GB of swap space • Assume 6 processes each do buf = malloc(1024*1024*1024); • Should these operations proceed? • What if they memset(buf, 0, 1024*1024)? • What if they memset(buf, 0, 1024*1024*1024)? • This happened in reality: IBM’s AIX OS
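The experiment can be tried at any scale with a small helper. This is a sketch only: `allocate_and_touch` is a hypothetical name, and the outcome for large sizes depends on the system's allocation policy. An eager system must find backing store for all `alloc` bytes before `malloc` returns, while a lazy one needs backing only for the pages the `memset` actually touches.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Request `alloc` bytes but write only the first `touch` bytes.
 * Returns 1 if the allocation succeeded and was written, 0 if malloc
 * refused the request. */
int allocate_and_touch(size_t alloc, size_t touch)
{
    assert(touch <= alloc);
    char *buf = malloc(alloc);
    if (buf == NULL)
        return 0;                   /* eager policy: refused up front */
    memset(buf, 0, touch);          /* lazy policy: backing needed only here */
    int ok = (touch == 0 || buf[touch - 1] == 0);
    free(buf);
    return ok;
}
```

Calling it with `alloc` of 1 GB and `touch` of 1 MB versus `touch` of 1 GB mirrors the two `memset` questions on the slide. On a lazily allocating system the failure, if any, shows up during the write rather than as a `NULL` return, which this helper cannot report.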
Page States (NT) • Active: Part of a working set and a PTE points to it • Transition: I/O in progress (not in any working sets) • Standby: Was in a working set, but removed. A PTE points to it, not modified and invalid. • Modified: Was in a working set, but removed. A PTE points to it, modified and invalid. • Modified no write: Same as modified but no write back • Free: Free with non-zero content • Zeroed: Free with zero content • Bad: hardware errors
Dynamics in NT VM (Figure: pages moving among the process working set, modified list, standby list, free list, zero list, and bad list. Labeled transitions: demand zero fault, page in or allocation, “soft” faults, working set replacement, modified writer, zero thread.)
How To Launch a New Process? • Obvious choice: “start process” system call • But not all processes start the same • “testprogram” versus “testprogram >& outfile” versus “testprogram arg1 arg2 >& outfile” • The “parent” process wants to specify various aspects of the child’s “environment” • Next step: add more parameters to specify environment
Can We Generalize? • What happens as more information gets added to the process’s “environment” – more parameters? New system calls? This gets ugly • What’s the most general way of setting up all of the environment? • So, why not allow process setup at any point? • This is the exec( ) system call (and its variants)
But We Want a Parent and a Child • The exec call “destroys” the current process • So, instead, destroy a copy of the process • The fork( ) call duplicates the current process • Better yet, don’t tightly couple fork and exec • This way, you can customize the child’s environment • So what does fork( ) entail? • Making a copy of everything about the process • Ouch!
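Keeping fork and exec separate leaves a window in which the child can set up its own environment. A minimal sketch, assuming a hypothetical helper `run_redirected` that does roughly what a shell does for “testprogram arg1 arg2 >& outfile”; the system calls themselves (`fork`, `open`, `dup2`, `execvp`, `waitpid`) are standard POSIX.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with stdout and stderr redirected to `outfile`.
 * Returns the child's exit status, or -1 on failure. */
int run_redirected(const char *outfile, char *const argv[])
{
    assert(outfile != NULL && argv != NULL && argv[0] != NULL);

    pid_t pid = fork();             /* duplicate the current process */
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: customize the environment before replacing the image. */
        int fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            _exit(127);
        if (dup2(fd, STDOUT_FILENO) < 0 || dup2(fd, STDERR_FILENO) < 0)
            _exit(127);
        if (fd > STDERR_FILENO)
            close(fd);
        execvp(argv[0], argv);      /* "destroys" the copy, not the parent */
        _exit(127);                 /* reached only if exec failed */
    }

    /* Parent: unchanged, waits for the child to finish. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

No extra parameters to a “start process” call are needed: anything the parent wants to change (open files, working directory, signal handling) is ordinary code that runs in the child between `fork` and `execvp`.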
What Gets Copied • So far, we’ve covered the following: • VM system • File system • Signals • How do we go about copying this information? • What parts are easy to copy, and what’s hard? • What’s the common case with fork/exec? • What needs to get preserved in this scenario?
Shared Memory • How to destroy a virtual address space? • Link all PTEs • Reference count • How to swap out/in? • Link all PTEs • Operation on all entries • How to pin/unpin? • Link all PTEs • Reference count (Figure: the page tables of Process 1 and Process 2 each hold a writable (w) entry that points to the same physical page.)
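From user level, the picture of two page tables pointing at one writable physical page can be produced with a shared mapping that survives a fork. A sketch, assuming a hypothetical helper `shared_page_roundtrip` and a system that provides `MAP_ANONYMOUS` (widely available, but an extension rather than part of older POSIX editions).

```c
#define _DEFAULT_SOURCE             /* for MAP_ANONYMOUS on some systems */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map one shared page, fork, let the child store `value` in it, and
 * return what the parent then reads from the same page (-1 on failure). */
int shared_page_roundtrip(int value)
{
    assert(value >= 0);
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {
        *shared = value;            /* child writes through its own PTE */
        _exit(0);
    }

    int result = -1;
    if (waitpid(pid, NULL, 0) == pid)
        result = *shared;           /* parent reads the same physical page */
    munmap(shared, sizeof *shared);
    return result;
}
```

Because both processes reference the same frame, the kernel cannot free, swap, or pin it by looking at one page table alone, which is the bookkeeping problem the slide raises.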
Copy-On-Write • Child’s virtual address space uses the same page mapping as parent’s • Make all pages read-only • Make child process ready • On a read, nothing happens • On a write, generates an access fault • Map to a new page frame • Copy the page over • Restart the instruction (Figure: the page tables of the parent process and the child process each hold read-only (r) entries that point to the same physical pages.)
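The copy-on-write steps can be traced in a user-level toy model. This is a simulation under loud assumptions, not kernel code: the 16-byte “pages”, the global `frames` array, and the names `space_init`, `space_fork`, `space_read`, and `space_write` are all invented for illustration, and a real system does this work in the page-fault handler.

```c
#include <assert.h>
#include <string.h>

#define PAGE_SIZE 16    /* tiny pages keep the sketch readable */
#define NFRAMES   8
#define NPAGES    2

struct frame { char data[PAGE_SIZE]; int refs; };   /* refs: PTEs pointing here */
struct pte   { int frame; int writable; };
struct space { struct pte pt[NPAGES]; };

static struct frame frames[NFRAMES];

static int alloc_frame(void)
{
    for (int i = 0; i < NFRAMES; i++)
        if (frames[i].refs == 0) { frames[i].refs = 1; return i; }
    return -1;
}

/* Give a fresh address space private, writable, zeroed pages. */
int space_init(struct space *s)
{
    for (int p = 0; p < NPAGES; p++) {
        int f = alloc_frame();
        if (f < 0)
            return -1;
        memset(frames[f].data, 0, PAGE_SIZE);
        s->pt[p].frame = f;
        s->pt[p].writable = 1;
    }
    return 0;
}

/* fork: the child uses the same mapping; every page becomes read-only. */
void space_fork(struct space *parent, struct space *child)
{
    for (int p = 0; p < NPAGES; p++) {
        parent->pt[p].writable = 0;
        child->pt[p] = parent->pt[p];
        frames[parent->pt[p].frame].refs++;
    }
}

/* On a read, nothing happens beyond the access itself. */
char space_read(const struct space *s, int page, int off)
{
    return frames[s->pt[page].frame].data[off];
}

/* On a write to a read-only page, take the "access fault" path. */
int space_write(struct space *s, int page, int off, char c)
{
    assert(page >= 0 && page < NPAGES && off >= 0 && off < PAGE_SIZE);
    struct pte *e = &s->pt[page];
    if (!e->writable) {
        if (frames[e->frame].refs > 1) {
            int f = alloc_frame();                  /* map to a new frame */
            if (f < 0)
                return -1;
            memcpy(frames[f].data, frames[e->frame].data, PAGE_SIZE);
            frames[e->frame].refs--;                /* copy the page over */
            e->frame = f;
        }
        e->writable = 1;        /* sole owner now, no further faults */
    }
    frames[e->frame].data[off] = c;                 /* restart the write */
    return 0;
}
```

After `space_fork`, parent and child share every frame; the first `space_write` by either side moves only that one page to a private frame, and pages that are never written are never copied. The reference count is what tells the last remaining owner that it can simply turn write permission back on.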
Issues of Copy-On-Write • How to destroy an address space? • Same as shared memory case? • How to swap in/out? • Same as shared memory • How to pin/unpin? • Same as shared memory