Cache Coherence in Shared Memory Multiprocessors Chinmay Ashok
An Introduction to The Problem • Caches allow greater performance by storing frequently used data in faster memory (closer to the processor) • Since all processors share the same address space, it is possible for more than one processor to cache an item at a time • If one processor updates the data item without informing the other processors, inconsistencies may result and cause incorrect executions
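The Problem [Figure: processors P1, P2 and P3, each with a private cache ($), share MEMORY and I/O devices. Memory holds U : 5, and steps 1 and 2 bring U : 5 into the caches. In step 3, P3 has U : 7. In steps 4 and 5, P1 and P2 still see U : 5.]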
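The scenario in the figure can be replayed in a few lines of Python. This is a minimal sketch, assuming write-back caches with no coherence mechanism and assuming steps 1 and 2 are reads by P1 and P3; the `Cache` class and all variable names are illustrative and do not come from the slides.

```python
class Cache:
    """A private write-back cache with no coherence mechanism (illustrative)."""

    def __init__(self, memory):
        self.memory = memory   # shared main memory, modelled as a dict
        self.lines = {}        # locally cached copies: address -> value

    def read(self, addr):
        if addr not in self.lines:              # miss: fetch from memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]                 # hit: return the local copy

    def write(self, addr, value):
        # Write-back: only the local copy changes; memory is not updated
        # and the other caches are not informed.
        self.lines[addr] = value


memory = {"U": 5}
p1, p2, p3 = Cache(memory), Cache(memory), Cache(memory)

p1.read("U")          # step 1 (assumed): P1 caches U = 5
p3.read("U")          # step 2 (assumed): P3 caches U = 5
p3.write("U", 7)      # step 3: P3 updates only its own copy
v4 = p1.read("U")     # step 4: P1 hits on its stale copy
v5 = p2.read("U")     # step 5: P2 misses and reads stale memory

print(v4, v5, p3.read("U"))
```

P1 and P2 both read the old value while P3 holds the new one, which is the inconsistency a coherence mechanism has to prevent.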
The Requirement • To ensure that whenever a processor reads a memory location, it receives the correct value • Correctness implies that each read from a location should return the last value written to that location • The last value is produced by the latest write in program order
Factors to Consider for Solutions • For correct execution, coherence must be enforced between the caches • Two major factors are: • performance • implementation cost • Four primary design issues are: • coherence detection strategy – incoherent memory accesses • coherence enforcement strategy – invalidate or update • precision of block-sharing information – storage of sharing information • cache block size – granularity
Types of Coherence Mechanisms • Snoopy coherence mechanisms for bus-based multiprocessors (these depend on the speed of the communication medium) • Directory based coherence mechanisms use a central directory to implement coherence • Compiler directed coherence mechanisms (static) let the compiler detect coherence issues and use special instructions to enforce coherence.
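To make the directory based approach concrete, here is a minimal sketch of a central directory that records which caches hold each block and sends invalidations only to those caches. It assumes an invalidation-based scheme; the `Directory` class, its methods and the cache ids are hypothetical names chosen for illustration.

```python
class Directory:
    """Central directory: for each block, the set of caches holding a copy."""

    def __init__(self):
        self.sharers = {}     # block -> set of cache ids with a copy
        self.messages = []    # invalidations sent, as (cache_id, block)

    def read(self, cache_id, block):
        # A read miss registers the requester as a sharer of the block.
        self.sharers.setdefault(block, set()).add(cache_id)

    def write(self, cache_id, block):
        # Invalidate every other sharer with a point-to-point message,
        # then record the writer as the only holder.
        others = self.sharers.get(block, set()) - {cache_id}
        for other in sorted(others):
            self.messages.append((other, block))
        self.sharers[block] = {cache_id}


directory = Directory()
directory.read(0, "U")
directory.read(2, "U")
directory.write(2, "U")
print(directory.messages)      # invalidations that were sent
print(directory.sharers["U"])  # remaining holders of the block
```

Unlike a snoopy scheme, where every cache observes every bus transaction, only the caches listed in the directory receive a message.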
Snoopy Protocols • Specified by • Set of states associated with memory blocks in the local caches • State transition diagram • Actions associated with each state transition • Examples • Valid-Invalid • MSI • MESI • MOESI • Dragon (Update based)
Valid – Invalid Protocol • Assumption – Bus transactions are atomic • Transitions (event/bus action) • V, stay in V: PrRd/-, PrWr/BusWr, BusRd/- • V to I: BusWr/- • I to V: PrRd/BusRd, PrWr/BusWr (Write Allocate) • I, stay in I: PrWr/BusWr (Write No-Allocate), BusRd/-, BusWr/-
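The Valid – Invalid transitions can be exercised with a small simulation. This is a sketch under stated assumptions: a write-through cache, the write no-allocate variant, and an atomic bus modelled as ordinary method calls. The `Bus` and `VICache` classes and their method names are illustrative, not part of the slides.

```python
class Bus:
    """Atomic shared bus: one transaction at a time, seen by every cache."""

    def __init__(self, memory):
        self.memory = memory
        self.caches = []

    def bus_rd(self, addr):
        return self.memory[addr]          # write-through keeps memory current

    def bus_wr(self, writer, addr, value):
        self.memory[addr] = value         # every write goes to memory
        for cache in self.caches:
            if cache is not writer:
                cache.snoop_bus_wr(addr)  # other caches observe the BusWr


class VICache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}                   # an entry means state V, absence means I
        bus.caches.append(self)

    def state(self, addr):
        return "V" if addr in self.lines else "I"

    def pr_rd(self, addr):
        if addr not in self.lines:        # I: PrRd/BusRd, go to V
            self.lines[addr] = self.bus.bus_rd(addr)
        return self.lines[addr]           # V: PrRd/-

    def pr_wr(self, addr, value):
        if addr in self.lines:            # V: PrWr/BusWr, stay in V
            self.lines[addr] = value
        # I: PrWr/BusWr with write no-allocate, stay in I
        self.bus.bus_wr(self, addr, value)

    def snoop_bus_wr(self, addr):
        self.lines.pop(addr, None)        # V: BusWr/-, go to I


memory = {"U": 5}
bus = Bus(memory)
p1, p2, p3 = VICache(bus), VICache(bus), VICache(bus)

p1.pr_rd("U")
p3.pr_rd("U")
p3.pr_wr("U", 7)
after_write = (p1.state("U"), p3.state("U"))
v4 = p1.pr_rd("U")    # P1 was invalidated, so it re-reads from memory
v5 = p2.pr_rd("U")
print(after_write, v4, v5)
```

Because P3's write invalidates P1's copy and updates memory, the later reads by P1 and P2 return the new value.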
Memory Consistency • For a shared address space this constrains the order in which memory operations must appear to be performed (visibility) • This includes operations to the same locations or to different locations, by the same process or by different processes. • Memory consistency subsumes coherence
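The difference between coherence and consistency shows up when two locations are involved. The sketch below is an illustration under assumptions that are not in the slides: P1 writes `A` and then `flag`, P2 reads `flag` and then `A`, and all operations are interleaved on a single shared memory. It enumerates every interleaving, first with P1's writes visible in program order, then with them visible in the opposite order. The names `P1`, `P2` and `outcomes` are hypothetical.

```python
from itertools import combinations

# P1 publishes data through A, then raises flag; P2 checks flag, then reads A.
P1 = [("write", "A", 1), ("write", "flag", 1)]
P2 = [("read", "flag"), ("read", "A")]


def outcomes(p1_ops, p2_ops):
    """Return every (flag, A) pair P2 can observe over all interleavings
    that keep the given order within each processor's list."""
    results = set()
    total = len(p1_ops) + len(p2_ops)
    for slots in combinations(range(total), len(p1_ops)):
        memory = {"A": 0, "flag": 0}
        seen = {}
        it1, it2 = iter(p1_ops), iter(p2_ops)
        for position in range(total):
            op = next(it1) if position in slots else next(it2)
            if op[0] == "write":
                memory[op[1]] = op[2]
            else:
                seen[op[1]] = memory[op[1]]
        results.add((seen["flag"], seen["A"]))
    return results


in_order = outcomes(P1, P2)         # writes become visible in program order
reordered = outcomes(P1[::-1], P2)  # writes become visible in the other order

print(sorted(in_order))
print(sorted(reordered))
```

With writes visible in program order, P2 never observes `flag == 1` together with `A == 0`. Once the two writes may become visible out of order, that outcome appears, even though each individual location still behaves coherently. Ruling such outcomes in or out is the job of the memory consistency model.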
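Three State MSI Write-Back Invalidation Protocol • Assumption – Bus transactions are atomic • Transitions (event/bus action) • M, stay in M: PrRd/-, PrWr/- • M to S: BusRd/Flush • M to I: BusRdX/Flush • S, stay in S: PrRd/-, BusRd/- • S to M: PrWr/BusRdX • S to I: BusRdX/- • I to S: PrRd/BusRd • I to M: PrWr/BusRdX • I, stay in I: BusRd/-, BusRdX/-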
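The MSI state diagram can be written as a lookup table and stepped through for a single block. This is a minimal sketch: the table encodes the transitions listed on the slide, while the `access` helper, the list-of-states representation and the three-cache scenario are illustrative assumptions.

```python
# (current state, event) -> (next state, bus action or None)
MSI = {
    ("M", "PrRd"):   ("M", None),
    ("M", "PrWr"):   ("M", None),
    ("M", "BusRd"):  ("S", "Flush"),
    ("M", "BusRdX"): ("I", "Flush"),
    ("S", "PrRd"):   ("S", None),
    ("S", "PrWr"):   ("M", "BusRdX"),
    ("S", "BusRd"):  ("S", None),
    ("S", "BusRdX"): ("I", None),
    ("I", "PrRd"):   ("S", "BusRd"),
    ("I", "PrWr"):   ("M", "BusRdX"),
    ("I", "BusRd"):  ("I", None),
    ("I", "BusRdX"): ("I", None),
}


def access(states, who, op):
    """Apply processor event `op` at cache `who` for one block.

    `states` holds the block's state in each cache and is updated in place.
    Returns the list of bus actions the access generated."""
    traffic = []
    next_state, bus_op = MSI[(states[who], op)]
    if bus_op:
        traffic.append(bus_op)
        for other in range(len(states)):
            if other != who:                      # every other cache snoops
                states[other], reply = MSI[(states[other], bus_op)]
                if reply:
                    traffic.append(reply)
    states[who] = next_state
    return traffic


states = ["I", "I", "I"]                 # block state in P1, P2, P3
t1 = access(states, 0, "PrRd")           # P1 reads
t2 = access(states, 2, "PrRd")           # P3 reads
t3 = access(states, 2, "PrWr")           # P3 writes, P1 is invalidated
after_write = list(states)
t4 = access(states, 0, "PrRd")           # P1 reads again, P3 flushes
print(t1, t2, t3, after_write, t4, states)
```

The write by P3 moves its copy to M and invalidates P1. P1's next read then forces P3 to flush the block and drop to S, so P1 receives the latest value.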
Other Snoopy Protocols • MESI – Exclusive State • MOESI – Owned State • Dragon (Update Based) – M, E, Sm, Sc
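The benefit of the Exclusive state can be shown by counting bus transactions for a read followed by a write to a block the cache does not yet hold. This is a simplified sketch, assuming MESI enters E on a read miss when no other cache holds the block and that the E to M transition needs no bus transaction. The function `bus_transactions` and its parameters are hypothetical names.

```python
def bus_transactions(protocol, other_sharers):
    """Count bus transactions for PrRd then PrWr on a block in state I.

    protocol: "MSI" or "MESI"
    other_sharers: True if another cache already holds the block."""
    count = 1                                   # the read miss issues BusRd
    if protocol == "MESI" and not other_sharers:
        state = "E"                             # sole copy: Exclusive
    else:
        state = "S"                             # MSI, or MESI with sharers
    if state == "S":
        count += 1                              # S to M needs a BusRdX
    # E to M is silent: no other cache has a copy to invalidate
    return count


msi_private = bus_transactions("MSI", other_sharers=False)
mesi_private = bus_transactions("MESI", other_sharers=False)
mesi_shared = bus_transactions("MESI", other_sharers=True)
print(msi_private, mesi_private, mesi_shared)
```

For data that only one processor touches, MESI saves the second bus transaction that MSI would issue.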
Conclusion • Cache coherence is necessary • Performance and implementation costs are critical factors while choosing a solution • Types – Snoopy, directory based and compiler directed • Memory consistency models • Standard snoopy protocols – Valid/Invalid, MSI, MESI, MOESI and Dragon