Concurrency case studies in UNIX
John Chapin, 6.894, October 26, 1998
OS kernel inherently concurrent
  • From 60s: multiprogramming
      • Context switch on I/O wait
      • Reentrant interrupts
      • Threads simplified the implementation
  • 90s servers, 00s PCs: multiprocessing
      • Multiple CPUs executing kernel code
Thread (-centric concurrency) control
  • Single CPU kernel:
      • Only one thread in kernel at a time
      • No locks
      • Disable interrupts to control concurrency
  • MP kernels inherit this mindset
      1. Control concurrency of threads
      2. Add locks to objects only where required
Case study: memory mapping
  • Background
  • Page faults
      • challenges
      • pseudocode
  • Page victimization
      • challenges
      • pseudocode
  • Discussion & design lessons
Other interesting patterns
  • nonblocking queues
  • asymmetric reader-writer locks
  • one lock/object, different lock for chain
  • priority donation locks
  • immutable message buffers
Page mapping --- background
  [Figure: segment list, virtual addr space, pfdat array, physical memory]
Life cycle of a page frame
  [Figure: state diagram. States: unallocated, invalid, IO_pending, valid, dirty, pushout. Transition labels: Allocate, Read from disk, Modify, Victimize (from valid and from dirty), Write to disk]
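The life cycle above can be written down as a small transition table. The following is a minimal sketch, not the kernel's own representation: the event names (`allocate`, `start_read`, `read_done`, `modify`, `victimize`) are hypothetical labels chosen for illustration, and the `allocate` and `modify` edges are assumptions read off the diagram labels. The remaining edges follow the fault and victimization pseudocode. Where a page goes after its pushout write completes is not recoverable from the slide text, so that edge is left out.

```python
from enum import Enum, auto


class Status(Enum):
    UNALLOCATED = auto()
    INVALID = auto()
    IO_PENDING = auto()
    VALID = auto()
    DIRTY = auto()
    PUSHOUT = auto()


# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    (Status.UNALLOCATED, "allocate"): Status.INVALID,    # assumed from the diagram
    (Status.INVALID, "start_read"): Status.IO_PENDING,   # fault stage 2
    (Status.IO_PENDING, "read_done"): Status.VALID,      # fault stage 2
    (Status.VALID, "modify"): Status.DIRTY,              # assumed from the diagram
    (Status.VALID, "victimize"): Status.UNALLOCATED,     # victimization stage 3
    (Status.DIRTY, "victimize"): Status.PUSHOUT,         # victimization stage 3
    # The exit from PUSHOUT after the disk write is intentionally omitted.
}


def step(state, event):
    """Return the next state, or raise ValueError for a transition not in the table."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {state.name} on {event!r}") from None
```

Keeping the table explicit makes it easy to see that only VALID and DIRTY frames can be victimized, which is the check victimization stage 1 performs.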
Page fault challenges
  • Multiple processes fault to same file page
  • Multiple processes fault to same pfdat
  • Multiple threads of same process fault to same segment (low frequency)
  • Bidirectional mapping between segment pointers and pfdats
  • Stop only minimal process set during disk I/O
  • Minimize locking/unlocking on fast path
Page fault stage 1
  vfault(virtual_address addr)
    segment.lock();
    if ((pfdat = segment.fetch(addr)) == null)
      pfdat = lookup(segment.file, segment.pageNum(addr));
      /* returns locked pfdat */
      if (pfdat.status == PUSHOUT)
        /* do something complicated */
      install pfdat in segment;
      add segment to pfdat owner list;
    else
      pfdat.lock();
Page fault stage 2
    if (pfdat.status == IO_PENDING)
      segment.unlock();
      pfdat.wait();
      goto top of vfault;
    else if (pfdat.status == INVALID)
      pfdat.status = IO_PENDING;
      pfdat.unlock();
      fetch_from_disk(pfdat);
      pfdat.lock();
      pfdat.status = VALID;
      pfdat.notify_all();
Page fault stage 3
    segment.insert_TLB(addr, pfdat.paddr());
    pfdat.unlock();
    segment.unlock();
    restart application
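The IO_PENDING handshake in stages 1 to 3 can be modeled in a few lines of Python. This is a minimal sketch under stated assumptions, not the kernel code: the segment lock, the TLB insert, and victimization are left out, `Pfdat`, `fault`, and the `disk_reads` counter are hypothetical names, and `threading.Condition` stands in for `pfdat.lock()`, `pfdat.wait()`, and `pfdat.notify_all()`. Because nothing here can victimize the page, a waiter simply re-checks the status in a loop rather than restarting the whole fault.

```python
import threading
import time
from enum import Enum, auto


class Status(Enum):
    INVALID = auto()
    IO_PENDING = auto()
    VALID = auto()


class Pfdat:
    """Hypothetical stand-in for one page frame descriptor."""

    def __init__(self):
        self.cond = threading.Condition()  # lock + wait/notify_all in one object
        self.status = Status.INVALID
        self.data = None
        self.disk_reads = 0                # instrumentation for the sketch only


def fetch_from_disk(pfdat):
    """Simulated slow disk read; called WITHOUT the pfdat lock held."""
    pfdat.disk_reads += 1
    time.sleep(0.05)
    pfdat.data = "page contents"


def fault(pfdat):
    """Return the page data, reading it from disk at most once overall."""
    pfdat.cond.acquire()
    try:
        while pfdat.status == Status.IO_PENDING:
            pfdat.cond.wait()              # releases the lock while blocked
        if pfdat.status == Status.INVALID:
            pfdat.status = Status.IO_PENDING
            pfdat.cond.release()           # do not hold the lock across disk I/O
            try:
                fetch_from_disk(pfdat)
            finally:
                pfdat.cond.acquire()
            pfdat.status = Status.VALID
            pfdat.cond.notify_all()        # wake every thread parked on this page
        return pfdat.data
    finally:
        pfdat.cond.release()
```

Starting several threads that all call `fault` on the same `Pfdat` leaves `disk_reads` at 1: the first thread to see INVALID marks the page IO_PENDING before dropping the lock, so later arrivals wait instead of issuing a second read.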
Page victimization challenges
  • Bidirectional mapping between segment pointers and pfdats
  • Stop no processes during batch writes
  • Deadlock caused by paging thread racing with faulting thread
Page victimization stage 1
  next_victim:
    pfdat p = choose_victim();
    p.lock();
    if (! (p.status == VALID || p.status == DIRTY))
      p.unlock();
      goto next_victim;
Page victimization stage 2
    foreach segment s in p.owner_list
      if (s.trylock() == ALREADY_LOCKED)
        p.unlock();
        /* do something! (p.r.d.) */
      remove p from s;
      /* also deletes any TLB mappings */
      delete s from p.owner_list;
      s.unlock();
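The fault path takes the segment lock and then the pfdat lock, while the paging thread holds the pfdat lock and then wants the segment lock. The trylock in stage 2 is what keeps this opposite ordering from deadlocking. A minimal Python sketch of that one step follows; `try_unmap` and its parameters are hypothetical names, and Python's `Lock.acquire(blocking=False)` plays the role of `s.trylock()`. What the caller should do after a failed attempt is the open question marked p.r.d. in the slide, so the sketch only reports the failure and leaves the policy to the caller.

```python
import threading


def try_unmap(page_lock, segment_lock, unmap):
    """Attempt one iteration of victimization stage 2.

    The caller must already hold page_lock.
    Returns True if the segment was locked, unmap() ran, and the segment
    lock was released again (page_lock is still held).
    Returns False if the segment lock was busy; in that case page_lock
    has been released so the faulting thread can make progress.
    """
    if not segment_lock.acquire(blocking=False):   # trylock: never blocks
        page_lock.release()                        # back off instead of waiting
        return False
    try:
        unmap()                                    # remove p from s, drop TLB mappings
    finally:
        segment_lock.release()
    return True
```

The key property is that the paging thread never sleeps on a segment lock while holding a pfdat lock, so a faulting thread that holds the segment lock and is waiting for the pfdat lock can always proceed.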
Page victimization stage 3
    if (p.status == DIRTY)
      p.status = PUSHOUT;
      schedule p for disk write;
      p.unlock();
      goto next_victim;
    else
      unbind(p.file, p.pageNum, p);
      p.status = UNALLOCATED;
      add_to_free_list(p);
      p.unlock();
Discussion questions (1)
  • Why have IO_PENDING state; why not just keep pfdat locked until data valid?
  • What happens when:
      • Some thread discovers IO_PENDING and blocks. Before it restarts, that page is victimized.
      • Page chosen as victim is being actively used by application code.
Discussion questions (2)
  • What mechanisms ensure that a page is only read from disk once despite multiple processes faulting at the same time?
  • Why is it safe to skip checking for PUSHOUT in fault stage 2?
  • Write out the invariants that support your reasoning.
Discussion questions (3)
  • Louis Reasoner suggests releasing the segment lock at the end of fault stage 1 and reacquiring it for stage 3. This will speed up parallel threads. What could go wrong?
  • At the point marked p.r.d. (victim stage 2), Louis suggests "goto next_victim;". What could go wrong?
Design lessons
  • Causes of complexity:
      • data structure traversed in multiple directions
      • high level of concurrency for performance
  • Symptoms of complexity:
      • nontrivial mapping from locks to objects
      • invariants relating thread, lock, and object states across multiple data structures
Loose vs tight concurrency
  • Loose
      • Separate subsystems connected by simple protocols
      • Use often, for performance or simplicity
  • Tight
      • Shared data structures with complex invariants
      • Only use where you have to
      • Minimize code and states involved
Page frame sample invariants
  All pfdat p:
    (p.status == UNALLOCATED) || lookup(p.file, p.pageNum) == p
      ; all processes will find same pfdat
    p.status != INVALID
      ; therefore only 1 process will read disk
    (p.status == UNALLOCATED || p.status == PUSHOUT) => p.owner_list empty
      ; therefore no TLB mappings to PUSHOUT
      ; avoiding cache consistency problems
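Invariants like these are easy to turn into a debugging check. The sketch below is one possible rendering, with several assumptions: `Pfdat` is a hypothetical record type, the `table` dictionary stands in for `lookup(file, pageNum)`, and the check is assumed to run at a quiescent point where no pfdat lock is held (a thread inside the fault path may legitimately see INVALID while it holds the lock).

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Status(Enum):
    UNALLOCATED = auto()
    INVALID = auto()
    IO_PENDING = auto()
    VALID = auto()
    DIRTY = auto()
    PUSHOUT = auto()


@dataclass
class Pfdat:
    """Hypothetical page frame descriptor holding only the fields the invariants mention."""
    file: str
    page_num: int
    status: Status
    owner_list: list = field(default_factory=list)


def check_invariants(pfdats, table):
    """Return a list of violation messages; an empty list means all invariants hold.

    table maps (file, page_num) -> Pfdat and stands in for lookup().
    """
    violations = []
    for p in pfdats:
        key = (p.file, p.page_num)
        # Invariant 1: every allocated frame is the one lookup() returns.
        if p.status != Status.UNALLOCATED and table.get(key) is not p:
            violations.append(f"{key}: lookup does not return this pfdat")
        # Invariant 2: no unlocked frame is ever observed as INVALID.
        if p.status == Status.INVALID:
            violations.append(f"{key}: status is INVALID")
        # Invariant 3: unallocated or pushout frames have no owners.
        if p.status in (Status.UNALLOCATED, Status.PUSHOUT) and p.owner_list:
            violations.append(f"{key}: {p.status.name} frame still has owners")
    return violations
```

A check like this could run from a test harness after each simulated fault or victimization step, which is one way to approach the "write out the invariants" discussion question.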