Distributed Operating Systems CS551 Colorado State University at Lockheed-Martin Lecture 4 -- Spring 2001
CS551: Lecture 4 • Topics • Memory Management • Simple • Shared • Distributed • Migration • Concurrency Control • Mutex and Critical Regions • Semaphores • Monitors CS-551, Lecture 4
Centralized Memory Management • Review • Memory: cache, RAM, auxiliary • Virtual Memory • Pages and Segments • Internal/External Fragmentation • Page Replacement Algorithm • Page Faults => Thrashing • FIFO, NRU, LRU • Second Chance; Lazy (Dirty Pages)
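As a rough illustration of the replacement policies reviewed above, the sketch below counts page faults for FIFO and LRU on a reference string. The function names, the reference string, and the frame counts are assumptions chosen for this example; they do not come from the lecture.

```python
from collections import OrderedDict, deque

def fifo_faults(refs, frames):
    """Count page faults when the page loaded earliest is evicted first."""
    resident, order, faults = set(), deque(), 0
    for page in refs:
        if page in resident:
            continue                      # hit: FIFO order is not updated
        faults += 1
        if len(resident) == frames:
            resident.discard(order.popleft())
        resident.add(page)
        order.append(page)
    return faults

def lru_faults(refs, frames):
    """Count page faults when the least recently used page is evicted."""
    resident, faults = OrderedDict(), 0
    for page in refs:
        if page in resident:
            resident.move_to_end(page)    # hit: mark as most recently used
            continue
        faults += 1
        if len(resident) == frames:
            resident.popitem(last=False)  # evict the least recently used
        resident[page] = True
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3), fifo_faults(refs, 4))  # 9 10
print(lru_faults(refs, 3), lru_faults(refs, 4))    # 10 8
```

On this particular string FIFO faults more often with four frames than with three (Belady's anomaly), while LRU improves as frames are added.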
Figure 4.1 Fragmentation in Page-Based Memory versus a Segment-Based Memory. (Galli, p.83)
Figure 4.2 Algorithms for Choosing Segment Location. (Galli, p.84)
Simple Memory Model • Used in parallel UMA systems • Access times equal for all processors • Too many processors • => thrashing • => need for lots of memory • High performance parallel computers • May not use cache -- to avoid overhead • May not use virtual memory
Shared Memory Model • Shared memory can be a means of interprocess communication • Virtual memory with multiple physical memories, caches, and secondary storage • Easy to partition data for parallel processing • Easy migration for load balancing • Example systems: • Amoeba: shared segments on same system • Unix System V: sys/shm.h
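A minimal sketch of shared memory as a communication channel, using Python's standard `multiprocessing.shared_memory` module. This is not the System V `sys/shm.h` interface named on the slide; it is a stand-in chosen so the example stays in one language. The 16-byte size and the message are arbitrary, and the second handle is opened in the same process only to keep the sketch short; in practice it would be opened by another process that knows the segment name.

```python
from multiprocessing import shared_memory

# Create a named segment (size is an arbitrary choice for this sketch).
segment = shared_memory.SharedMemory(create=True, size=16)
try:
    # A second handle attached by name stands in for another process.
    peer = shared_memory.SharedMemory(name=segment.name)
    segment.buf[:5] = b"hello"        # writer side
    received = bytes(peer.buf[:5])    # reader side sees the same bytes
    peer.close()
finally:
    segment.close()
    segment.unlink()                  # remove the segment from the system

print(received)
```

Nothing here coordinates the writer and the reader, which is why shared memory needs the concurrency control discussed later in the lecture.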
Shared Memory via Bus [Diagram: processors P1 through P10 connected by a bus to a single shared memory]
Shared Memory Disadvantages • All processors read/write common memory • Requires concurrency control • Processors may be linked by a bus • Too much memory activity may cause bus contention • Bus can be a bottleneck • Each processor may have own cache • => cache coherency (consistency) problems • Snoopy (snooping) cache is a solution
Bused Shared Memory w/Caches [Diagram: processors P1 through P10, each with its own cache, connected by a bus to a single shared memory]
Shared Memory Performance • Try to overlap communication and computation • Try to prefetch data from memory • Try to migrate processes to processors that hold needed data in local memory • Page scanner • Bused shared memory does not scale well • More processors => bus contention • Faster processors => bus contention
Figure 4.3 Snoopy Cache. (Galli, p.89)
Cache Coherency (Consistency) • Want local caches to have consistent data • If two processor caches contain same data, the data should have the same value • If not, caches are not coherent • But what if one/both processors change the data value? • Mark modified cache value as dirty • Snoopy cache picks up new value as it is written to memory
Cache Consistency Protocols • Write-through protocol • Write-back protocol • Write-once protocol • Cache block invalid, dirty, or clean • Cache ownership • All caches snoop • Protocol part of MMU • Performs within a memory cycle
Write-through protocol • Read miss • Fetch data from memory to cache • Read hit • Fetch data from local cache • Write miss • Update data in memory and store in cache • Write hit • Update memory and cache • Other local processors invalidate cache entry
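The four cases above can be sketched as a toy model with one shared memory and one private cache per processor. The class name, the dictionary-based caches, and the default value of 0 for untouched memory are assumptions made for this illustration; a real protocol runs in hardware within a memory cycle.

```python
class WriteThroughSystem:
    """Toy model: one shared memory, one private cache per processor."""

    def __init__(self, n_processors):
        self.memory = {}
        self.caches = [{} for _ in range(n_processors)]

    def read(self, cpu, addr):
        cache = self.caches[cpu]
        if addr not in cache:                  # read miss: fetch from memory
            cache[addr] = self.memory.get(addr, 0)
        return cache[addr]                     # read hit: served locally

    def write(self, cpu, addr, value):
        self.memory[addr] = value              # every write goes to memory
        self.caches[cpu][addr] = value         # and to the writer's cache
        for other, cache in enumerate(self.caches):
            if other != cpu:
                cache.pop(addr, None)          # snooping caches invalidate

system = WriteThroughSystem(2)
system.write(0, 0x10, 7)
first = system.read(1, 0x10)    # miss, fetched from memory
system.write(0, 0x10, 9)        # invalidates processor 1's copy
second = system.read(1, 0x10)   # miss again, sees the new value
print(first, second)            # 7 9
```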
Distributed Shared Memory • NUMA • Global address space • All memories together form one global memory • True multiprocessors • Maintains directory service • NORMA • Specialized message-passing network • Example: workstations on a LAN
Distributed Shared Memory [Diagram: processors P1 through P11, each with its own local memory]
How to distribute shared data? • How many readers and writers are allowed for a given set of data? • Two approaches • Replication • Data copied to different processors that need it • Migration • Data moved to different processors that need it
Single Reader / Single Writer • No concurrent use of shared data • Data use may be a bottleneck • Static
Multiple Reader / Single Writer • Readers may have an invalid copy after the writer writes a new value • Protocol must have an invalidation method • Copy set: list of processors that have a copy of a memory location • Implementation • centralized, distributed, or combination
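A minimal sketch of a copy set with invalidation, assuming a single in-process model in which a method call stands in for a network message. The `Page` and `Node` classes and the function names are hypothetical, introduced only for this example.

```python
class Page:
    """One shared data item with a single writer and a copy set of readers."""

    def __init__(self, value):
        self.value = value
        self.copy_set = set()       # nodes currently holding a read copy

class Node:
    def __init__(self, name):
        self.name = name
        self.local = {}             # page id -> locally cached value

def read(node, page_id, page):
    if page_id not in node.local:           # no valid copy: fetch one
        node.local[page_id] = page.value
        page.copy_set.add(node)             # remember who holds a copy
    return node.local[page_id]

def write(page_id, page, value):
    """Performed by the single writer: invalidate first, then update."""
    for reader in page.copy_set:
        reader.local.pop(page_id, None)     # invalidation message
    page.copy_set.clear()
    page.value = value

a, b = Node("A"), Node("B")
x = Page(1)
read(a, "x", x)
read(b, "x", x)
write("x", x, 2)            # both read copies are invalidated
print(read(a, "x", x))      # 2, fetched again after invalidation
```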
Centralized MR/SW • One server • Processes all requests • Maintains all data and data locations • Increases traffic near server • Potential bottleneck • Server must perform more work than others • Potential bottleneck
Figure 4.4 Centralized Server for Multiple Reader/Single Writer DSM. (Galli, p.92)
Partially distributed centralization of MR/SW • Distribution of data static • One server receives all requests • Requests sent to processor with desired data • Handles requests • Notifies readers of invalid data
Figure 4.5 Partially Distributed Invalidation for Multiple Reader/Single Writer DSM. (Galli, p.92) [Read X as C below]
Dynamic distributed MR/SW • Data may move to different processor • Send broadcast message for all requests in order to reach current owner of data • Increases number of messages in system • More overhead • More work for entire system
Figure 4.6 Dynamic Distributed Multiple Reader/Single Writer DSM. (Galli, p.93)
A Static Distributed Method • Data is distributed statically • Data owner • Handles all requests • Notifies readers when their data copy invalid • All processors know where all data is located, since it is statically located
Figure 4.7 Dynamic Data Allocation for Multiple Reader/Single Writer DSM. (Galli, p.96)
Multiple Readers/Multiple Writers • Complex algorithms • Use sequencers • Time read • Time written • May be centralized or distributed
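One way to picture a sequencer is as a counter that stamps every write with a global order, so that replicas apply writes identically even when messages arrive out of order. The sketch below shows only the centralized variant and uses plain sequence numbers rather than read and write times; the class names and the hold-back buffer are assumptions made for this example.

```python
import itertools

class Sequencer:
    """Hands out one global order for all writes (centralized variant)."""

    def __init__(self):
        self._counter = itertools.count(1)

    def next(self):
        return next(self._counter)

class Replica:
    def __init__(self):
        self.data = {}
        self.applied = 0        # last sequence number applied
        self.pending = {}       # writes that arrived too early

    def deliver(self, seq, key, value):
        self.pending[seq] = (key, value)
        # Apply every write whose predecessor has already been applied.
        while self.applied + 1 in self.pending:
            k, v = self.pending.pop(self.applied + 1)
            self.data[k] = v
            self.applied += 1

sequencer = Sequencer()
w1 = (sequencer.next(), "x", 1)
w2 = (sequencer.next(), "x", 2)

r1, r2 = Replica(), Replica()
r1.deliver(*w1); r1.deliver(*w2)    # arrives in order
r2.deliver(*w2); r2.deliver(*w1)    # arrives out of order
print(r1.data, r2.data)             # both end with x == 2
```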
DSM Performance Issues • Thrashing (in a DSM): “when multiple locations desire to modify a common data set” (Galli) • False sharing: Two or more processors fighting to write to the same page, but not the same data • One solution: temporarily freeze a page so one processor can get some work done on it • Another: proper block size (= page size?)
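False sharing comes down to address arithmetic: two unrelated variables conflict when they fall on the same page. The short sketch below assumes a 4096-byte page and uses made-up addresses purely for illustration.

```python
PAGE_SIZE = 4096  # bytes; an assumed page size

def page_of(address):
    """Page number that contains the given byte address."""
    return address // PAGE_SIZE

def falsely_shared(addr_a, addr_b):
    """True when two different data items land on the same page."""
    return addr_a != addr_b and page_of(addr_a) == page_of(addr_b)

counter_a = 0x2000   # written only by processor A (hypothetical address)
counter_b = 0x2008   # written only by processor B, 8 bytes later
counter_c = 0x3000   # placed on the next page

print(falsely_shared(counter_a, counter_b))  # True: same page, different data
print(falsely_shared(counter_a, counter_c))  # False: different pages
```

Moving `counter_b` to its own page, or using a smaller block size, removes the conflict at the cost of space or more blocks to manage.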
More DSM Performance Issues • Data location (compiler?) • Data access patterns • Synchronization • Real-time systems issue? • Implementation: • Hardware? • Software?
Mach Operating System • Uses virtual memory, distributed shared memory • Mach kernel supports memory objects • “a contiguous repository of data, indexed by byte, upon which various operations, such as read and write, can be performed. Memory objects act as a secondary storage …. Mach allows several primitives to map a virtual memory object into an address space of a task. …In Mach, every task has a separate address space.” (Singhal & Shivaratri, 1994)
Memory Migration • Time-consuming • Moving virtual memory from one processor to another • When? • How much?
MM: Stop and Copy • Least efficient method • Simple • Halt process execution (freeze time) while moving entire process address space and data to new location • Unacceptable to real-time and interactive systems
Figure 4.8 Stop-and-Copy Memory Migration. (Galli, p.99)
Concurrent Copy • Process continues execution while being copied to new location • Some migrated pages may become dirty • Send over more recent versions of pages • At some point, stop execution and migrate remaining data • Algorithms include dirty page ratio and/or time criteria to decide when to stop • Wastes time and space
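The stopping rule can be sketched as a loop that resends dirtied pages until the dirty page ratio falls below a threshold or a round limit is reached. The function name, the threshold of 0.1, the round limit, and the workload are all assumptions chosen for this example, not values from the lecture.

```python
def concurrent_copy(n_pages, dirtied_in_round, max_dirty_ratio=0.1, max_rounds=5):
    """Return (pages sent while running, pages sent while stopped)."""
    to_send = set(range(n_pages))       # the first round sends everything
    sent_running = 0
    for round_no in range(max_rounds):
        sent_running += len(to_send)
        # Pages written during this round must be sent again.
        to_send = set(dirtied_in_round(round_no))
        if len(to_send) / n_pages <= max_dirty_ratio:
            break                       # few enough dirty pages: stop now
    # The process is halted and the remaining dirty pages move in one step.
    return sent_running, len(to_send)

# Hypothetical workload: the set of pages written shrinks each round.
workload = [range(0, 40), range(0, 20), range(0, 8)]
running, stopped = concurrent_copy(100, lambda r: workload[min(r, 2)])
print(running, stopped)   # 160 8
```

In this run 168 page transfers are needed for a 100-page process, which is the wasted time and space noted above, but the process is halted only while the last 8 pages move.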
Figure 4.9 Concurrent-Copy Memory Migration. (Galli, p.99)
Copy on Reference • Process stops • All process state information is moved • Process resumes at new location • Other process pages are moved only when accessed by process • Alternate may have virtual memory pages transferred to file server, then moved as needed to new process location
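A minimal sketch of the on-demand part, assuming the process has already resumed at its new location. The `MigratedProcess` class and the string page contents are hypothetical; a dictionary lookup stands in for fetching a page from the old node or a file server.

```python
class MigratedProcess:
    """Pages stay at the source until the resumed process touches them."""

    def __init__(self, source_pages):
        self.remote = dict(source_pages)   # still on the old node or file server
        self.local = {}                    # pages already pulled across
        self.fetches = 0

    def access(self, page_no):
        if page_no not in self.local:      # first reference: fetch on demand
            self.local[page_no] = self.remote.pop(page_no)
            self.fetches += 1
        return self.local[page_no]

proc = MigratedProcess({n: f"page-{n}" for n in range(8)})
for n in (0, 3, 3, 0, 5):
    proc.access(n)

print(proc.fetches, len(proc.remote))   # 3 fetched, 5 never moved
```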
Figure 4.10 Copy-on-Reference Memory Migration. (Galli, p.100)
Table 4.1 Memory Management Choices Available for Advanced Systems. (Galli, p.101)
Table 4.2 Performance Choices for Memory Management. (Galli, p.101)
Concurrency Control (Chapter 5) • Topics • Mutual Exclusion and Critical Regions • Semaphores • Monitors • Locks • Software Lock Control • Token-Passing Mutual Exclusion • Deadlocks
Critical Region • “the portion of code or program accessing a shared resource” • Must prevent concurrent execution by more than one process at a time • Mutex: mutual exclusion
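A small sketch of a critical region guarded by a mutex, using threads in place of processes. The shared dictionary, the thread count, and the iteration count are assumptions made for this example.

```python
import threading

shared = {"count": 0}            # the shared resource
mutex = threading.Lock()

def worker(iterations):
    for _ in range(iterations):
        with mutex:              # enter the critical region
            shared["count"] += 1
        # leaving the with-block releases the lock

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared["count"])           # 40000
```

Without the lock, the read-modify-write on `shared["count"]` could interleave between threads and updates could be lost.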
Figure 5.1 Critical Regions Protecting a Shared Variable. (Galli, p.106)
Mutual Exclusion • Three-point test (Galli) • Solution must ensure that two processes do not enter critical regions at same time • Solution must prevent interference from processes not attempting to enter their critical regions • Solution must prevent starvation
Critical Section Solutions • Recall: Silberschatz & Galvin • A solution to the critical section problem must show that • mutual exclusion is preserved • progress requirement is satisfied • bounded-waiting requirement is met
Figure 5.2 Example Utilizing Semaphores. (Galli, p.109)
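The figure itself is not reproduced here. As a separate, minimal sketch of how a counting semaphore is used, the code below lets at most two threads hold a resource at once. The resource count, the sleep duration, and the bookkeeping dictionary are assumptions made for this example and are not taken from Galli's figure.

```python
import threading
import time

slots = threading.Semaphore(2)           # two units of the resource
state = {"active": 0, "peak": 0}
state_lock = threading.Lock()            # protects the bookkeeping only

def use_resource():
    with slots:                          # P (wait): blocks when no unit is free
        with state_lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.01)                 # pretend to use the resource
        with state_lock:
            state["active"] -= 1
    # leaving the with-block performs V (signal)

threads = [threading.Thread(target=use_resource) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(state["peak"] <= 2)                # True: never more than two holders
```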
Figure 5.3 Atomic Swap. (Galli, p.114)
Figure 5.4 Centralized Lock Manager. (Galli, p.116)
Figure 5.5 Resource Allocation Graph. (Galli, p.120)
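A resource allocation graph can be checked for a potential deadlock by searching for a cycle. In the sketch below an edge from a process to a resource means a request, and an edge from a resource to a process means an assignment. The node names and the adjacency-list representation are assumptions made for this example.

```python
def has_cycle(graph):
    """Depth-first search for a cycle in a directed graph of adjacency lists."""
    WHITE, GREY, BLACK = 0, 1, 2         # unvisited, on current path, finished
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:            # back edge: cycle found
                return True
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in graph)

# P1 holds R1 and requests R2; P2 holds R2 and requests R1.
deadlocked = {"P1": ["R2"], "R2": ["P2"], "P2": ["R1"], "R1": ["P1"]}
# Same assignments, but P2 is not waiting for anything.
safe = {"P1": ["R2"], "R2": ["P2"], "P2": [], "R1": ["P1"]}

print(has_cycle(deadlocked), has_cycle(safe))   # True False
```

A cycle guarantees deadlock only when every resource has a single instance; with multiple instances per resource a cycle is necessary but not sufficient.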