MIMD Shared Memory

MIMD Shared Memory Multiprocessors

MIMD -- Shared Memory
• Each processor has a full CPU
• Each processor runs its own code
  • can be the same program as other processors or a different one
• All processors access the same memory
  • Same address space for all processors
• UMA: Uniform Memory Access
  • all memory is accessible in the same time by every processor
• NUMA: Non-Uniform Memory Access
  • memory is localized
  • each processor can access some memory faster than the rest

MIMD - SM - UMA
[Figure: diagram labeled Processors, Connection, Memory Modules]

Options for Connection -- UMA
• Bus
  • Sequential, can be used for one message at a time
• Switching Network
  • Can send many messages at once
  • depends on connection scheme
• Crossbar
  • Maximal connections
  • expensive
• Omega (also called Butterfly, Banyan)
  • several permutations of proc-mem possible

Bus
• Needs smart local cache schemes to reduce bus traffic
• Works for a low number of processors
• Depending on technology, 20-50 processors overload the bus and performance degrades
• Common on 4- and 8-processor SMP servers

Bus
[Figure: diagram labeled Memory, Bus, Cache, Processors]

Crossbar switch
• Every permutation of processor to memory can work
• Expensive: N*M switches, where N = number of processors, M = number of memory modules

Crossbar switch
[Figure: diagram labeled Memory, Switches, Processors]

Omega Network
• Every processor connects to every memory
• Many, but not all, permutations possible
• An extra stage adds redundancy and more permutations
• Number of switches = (N/2) log2 N
  • for N processors, N memory modules, built from 2x2 switches
• Number of stages = log2 N (determines latency)

Omega Network
[Figure: diagram labeled Processors, Memory]
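The switch counts quoted on the crossbar and Omega slides can be compared with a short calculation. This is a minimal sketch, assuming 2x2 switches and N a power of two for the Omega network; the function names are made up for illustration and do not come from the slides.

```python
def crossbar_switches(n_procs: int, n_mems: int) -> int:
    """Crossbar cost from the slide: one switch per processor-memory pair."""
    return n_procs * n_mems


def omega_stages(n: int) -> int:
    """Number of stages = log2(N); assumes N is a power of two (N >= 2)."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    return n.bit_length() - 1


def omega_switches(n: int) -> int:
    """Number of 2x2 switches = (N/2) * log2(N)."""
    return (n // 2) * omega_stages(n)


if __name__ == "__main__":
    for n in (8, 64, 1024):
        print(n, crossbar_switches(n, n), omega_switches(n), omega_stages(n))
```

For N = 8 this gives 64 crossbar switches against 12 Omega switches in 3 stages, which is the cost and latency trade-off the slides describe.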

Omega Network
[Figure: 8x8 network with inputs and outputs labeled 000 to 111; Destination = 101]

Omega Network -- A Permutation
[Figure: 8x8 network with inputs and outputs labeled 000 to 111; Destination = 101]
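The slides do not spell out how a message finds its way to "Destination = 101". A common scheme is destination-tag routing: each stage applies a perfect shuffle to the line number, then the 2x2 switch picks its upper or lower output from one bit of the destination address, most significant bit first. The sketch below assumes that scheme; `route` and `is_routable` are hypothetical helper names. It also checks whether a whole permutation can pass at once, meaning no two messages need the same line after any stage.

```python
def route(src: int, dst: int, n_bits: int) -> list[int]:
    """Return the line numbers a message occupies: at the input, then after each stage."""
    mask = (1 << n_bits) - 1
    pos = src
    path = [pos]
    for stage in range(n_bits):
        # Perfect shuffle: rotate the line number left by one bit.
        pos = ((pos << 1) | (pos >> (n_bits - 1))) & mask
        # The switch output is chosen by the next destination bit (MSB first).
        bit = (dst >> (n_bits - 1 - stage)) & 1
        pos = (pos & ~1) | bit
        path.append(pos)
    return path


def is_routable(perm: list[int], n_bits: int) -> bool:
    """True if all messages src -> perm[src] can pass at once without sharing a line."""
    paths = [route(src, dst, n_bits) for src, dst in enumerate(perm)]
    for stage in range(1, n_bits + 1):
        lines = [path[stage] for path in paths]
        if len(set(lines)) != len(lines):
            return False  # two messages need the same switch output
    return True


if __name__ == "__main__":
    print([format(p, "03b") for p in route(0b010, 0b101, 3)])
    print(is_routable(list(range(8)), 3))               # identity permutation
    print(is_routable([0, 4, 2, 6, 1, 5, 3, 7], 3))     # bit-reversal permutation
```

In this model the identity permutation passes while the bit-reversal permutation is blocked, an instance of "many, but not all, permutations possible".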

Omega Network with combining
• Smart switches
  • combine two requests with the same destination
  • make memory accesses equivalent to a serial sequence
  • split return values appropriately
• Time trade-off
• Used in NYU Ultracomputer
  • also in IBM RP3 experimental machine
• Example: Fetch and Increment
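The combining idea can be illustrated with a toy model of a single switch handling two fetch-and-add requests to the same address. This is a sketch under simplifying assumptions (one switch, two requests, no timing), not a description of the Ultracomputer or RP3 hardware; `Memory` and `combining_switch` are hypothetical names.

```python
class Memory:
    """A memory module that supports an atomic fetch-and-add."""

    def __init__(self):
        self.cells = {}

    def fetch_and_add(self, addr: int, inc: int) -> int:
        old = self.cells.get(addr, 0)
        self.cells[addr] = old + inc
        return old


def combining_switch(memory: Memory, addr: int, inc_a: int, inc_b: int) -> tuple[int, int]:
    """Combine two fetch-and-add requests for the same address into one memory access.

    The switch forwards a single request with the summed increment, remembers
    inc_a, and splits the reply so the results match the serial order
    "request A, then request B".
    """
    old = memory.fetch_and_add(addr, inc_a + inc_b)  # one trip to memory
    reply_a = old            # what A would have seen going first
    reply_b = old + inc_a    # what B would have seen going after A
    return reply_a, reply_b


if __name__ == "__main__":
    mem = Memory()
    mem.cells[5] = 10
    print(combining_switch(mem, 5, 1, 1), mem.cells[5])
```

Two fetch-and-increment requests on a cell holding 10 return 10 and 11 and leave 12 in memory, the same as if they had run one after the other, yet memory was accessed only once.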

Omega Network
[Figure: 8x8 network with inputs and outputs labeled 000 to 111; Destination = 101]

Options for Connection -- NUMA
• Each processor has a segment of memory closer than others
• Could be several different levels of access
• All processors still use the same address space
• Omega network with wrap around
  • BBN Butterfly
• Hierarchy of Rings (or other switches)
  • Kendall Square Research KSR-1
  • SGI Origin series

Hierarchical Rings
[Figure: diagram labeled Directory Nodes, Compute Node, To higher level ring]

Issues for MIMD Shared Memory
• Memory Access
  • Can reads be simultaneous?
  • How to control multiple writes?
• Synchronization mechanism needed
  • semaphores
  • monitors
• Local caches need to be coordinated
  • cache coherency protocols
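As a small illustration of the synchronization point, the sketch below uses Python threads as stand-ins for processors sharing one address space, with a binary semaphore controlling writes to a shared counter. The function name and parameters are assumptions made for this example, and threads in one process are only an analogy for a hardware multiprocessor.

```python
import threading


def run_counter(n_threads: int = 4, n_increments: int = 10_000) -> int:
    """Increment one shared variable from several threads, guarded by a semaphore."""
    counter = 0
    sem = threading.Semaphore(1)  # binary semaphore: one writer at a time

    def worker() -> None:
        nonlocal counter
        for _ in range(n_increments):
            with sem:           # acquire before the read-modify-write
                counter += 1    # released automatically on leaving the block

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(run_counter())
```

Without the semaphore, the read-modify-write in `counter += 1` is not guaranteed to be atomic, so concurrent updates could be lost.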
