Multiprocessor Architecture

Multiprocessor Architecture 2005.11.14 양회석 梁會石 Hoeseok Yang

Contents • Introduction • Types of multiprocessing • Cluster Systems • Evolution of Shared-Memory System

Introduction • Food-chain of High-Performance Computers

Introduction • Just for fun [Figure labels: Linux cluster, PVP (Parallel Vector Processor), Linux cluster]

Contents • Introduction • Types of multiprocessing • Clustered computing • Shared-memory multiprocessing • Hybrids • Cluster Systems • Evolution of Shared-Memory System

Clustered computing • Each node (processor) has its own memory and other peripherals • Commodity computers can be used as nodes • Each node runs its own system software (OS) [Figure: four nodes, each with a processor (P), memory (Mem.), and I/O]

Clustered computing • Beneficial when individual application processes are often independent of each other. (e.g. Web, DB server)

Shared-memory multiprocessing • Multiprocessor system with a single operating system • The OS manages the memory in the system as one large structure shared symmetrically among all the processors [Figure: three processors (P) sharing one memory (Mem.) and I/O]

Shared-memory multiprocessing • Beneficial when multiple processes share data • e.g. scientific applications

Hybrids of clustered and shared-memory • Distributed Shared Memory • Implemented like clustered systems but capable of supporting a single OS image across the multiple processors • NUMA (non-uniform memory access); cf. SMP, which is UMA (uniform memory access) • SMP clusters • Each node itself is a small SMP.

Contents • Introduction • Types of multiprocessing • Cluster Systems • Evolution of Shared-Memory System

Terminal and server • Generally speaking, the most common form of cluster is simply a network of computers. • terminal – I/O aspect • server – service aspect • Workload balancing and degradation from failures both matter.

Beowulf (1997) • Beowulf clusters are scalable performance clusters based on commodity hardware, on a private system network, with open source software (Linux) infrastructure. • Uses: simulations, biotechnology, and petro-clusters; financial market modeling, data mining and stream processing; and Internet servers for audio and games …

ParADE [SNU] 1. Intelligent OpenMP translator 2. Explicit message passing primitives 3. Multi-threaded software distributed shared memory (SDSM) 4. Home-based lazy release consistency (HLRC) with migratory home

Distinction from each other • From a programming point of view, the distinction between the shared-memory and clustered types arises from the way processes communicate with each other • Shared memory • Message passing (cluster) • Explicit synchronization by the programmer is needed, e.g. a barrier
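
The two communication styles can be sketched with Python threads. This is only an illustration: threads stand in for processors, the function names `shared_memory_sum` and `message_passing_sum` are hypothetical, and placing the barrier in the shared-memory version is an assumption about where the explicit synchronization is needed.

```python
import queue
import threading


def shared_memory_sum(values, workers=2):
    """Workers write partial sums into shared memory, then meet at a barrier."""
    partial = [0] * workers               # memory visible to every worker
    barrier = threading.Barrier(workers)  # explicit synchronization point
    totals = [None] * workers

    def work(rank):
        partial[rank] = sum(values[rank::workers])
        barrier.wait()                    # nobody reads until everybody has written
        totals[rank] = sum(partial)

    threads = [threading.Thread(target=work, args=(r,)) for r in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals


def message_passing_sum(values, workers=2):
    """Workers share no data; each one sends its partial sum as a message."""
    mailbox = queue.Queue()

    def work(rank):
        mailbox.put(sum(values[rank::workers]))

    threads = [threading.Thread(target=work, args=(r,)) for r in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(mailbox.get() for _ in range(workers))
```

In the first function every worker can compute the full total itself because the partial results live in one shared structure; in the second, data moves only through explicit messages.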

Contents • Introduction • Types of multiprocessing • Cluster Systems • Evolution of Shared-Memory System • Memory Coherence Models • Memory Consistency Models

Memory coherence Models • Memory coherence refers to the visibility of a write to a given memory location by all other processors in the system. • If the order of writes to a given location by one processor is maintained when observed by the other processors in the system, the system is said to be memory coherent.

Memory coherence Models • Example: P1 writes 1 and P2 writes 2 to address 50 • P1: requests write 1@50 → reads 1 from cached 50 → P1's write request arrives → 50 gets evicted from P1 cache → reads 1@50 → reads 2@50 • P2: requests write 2@50 → reads 2 from cached 50 → 50 gets evicted from P2 cache → reads 1@50 → P2's write request arrives → 50 gets evicted from P2 cache → reads 2@50 • P1 sees the order "1-1-2", while P2 sees "2-1-2": violation of memory coherency
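
A minimal sketch of how such a violation could be detected from the observed values alone. It assumes every write to the location stores a distinct value exactly once, and the helper names (`collapse`, `is_subsequence`, `is_coherent`) are hypothetical. The location is coherent only if a single write order explains what every processor saw.

```python
from itertools import permutations


def collapse(seq):
    """Drop consecutive repeats: [1, 1, 2] -> [1, 2]."""
    out = []
    for value in seq:
        if not out or out[-1] != value:
            out.append(value)
    return out


def is_subsequence(small, big):
    """True if the items of small appear in big in the same order."""
    it = iter(big)
    return all(value in it for value in small)


def is_coherent(observations):
    """True if one total order of the writes explains every processor's view."""
    views = [collapse(seq) for seq in observations.values()]
    values = sorted({v for view in views for v in view})
    return any(
        all(is_subsequence(view, order) for view in views)
        for order in permutations(values)
    )
```

For the slide's example, `is_coherent({"P1": [1, 1, 2], "P2": [2, 1, 2]})` is false: P2 sees the value 2 both before and after the value 1, which no single write order allows.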

Memory coherence Models • Update • Invalidation [Figure: two-state diagram with states V (valid) and I (invalid); processor-initiated transitions PrRd/-, PrWt/BusWt, PrRd/BusRd; bus-initiated transition BusWt/-]
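
The difference between the two policies can be sketched as follows. This is a simplified model under stated assumptions: caches are plain dictionaries, memory is written through on every write, and the function name `write` and its parameters are hypothetical.

```python
def write(caches, memory, writer, addr, value, policy):
    """Write through to memory, then handle copies held by the other caches."""
    if policy not in ("update", "invalidate"):
        raise ValueError(f"unknown policy: {policy}")
    memory[addr] = value
    caches[writer][addr] = value
    for index, cache in enumerate(caches):
        if index == writer or addr not in cache:
            continue
        if policy == "update":
            cache[addr] = value   # push the new value into the remote copy
        else:
            del cache[addr]       # drop the remote copy; the next read misses
```

With `policy="update"` the other caches keep a fresh copy at the cost of extra bus traffic on every write; with `policy="invalidate"` they lose the line and must fetch it again on their next read.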

Memory coherence Models • MSI, MESI… [Figure: state diagrams for MSI (states M, S, I) and MESI (states M, E, S, I); edge labels: PrRd/-, PrWr/-, PrWr/BusRdX, PrRd/BusRd, PrRd/BusRd(S), BusRd/Flush, BusRdX/Flush, BusRd/-, BusRdX/-]
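
A sketch of the MSI protocol as a transition table, using the edge labels from the diagram. The table follows the textbook MSI protocol, which is assumed (not confirmed) to match the slide's figure; the helper `access` and the list-of-states representation are hypothetical.

```python
# (state, event) -> (next state, action put on the bus or None)
MSI = {
    ("M", "PrRd"): ("M", None),
    ("M", "PrWr"): ("M", None),
    ("M", "BusRd"): ("S", "Flush"),
    ("M", "BusRdX"): ("I", "Flush"),
    ("S", "PrRd"): ("S", None),
    ("S", "PrWr"): ("M", "BusRdX"),
    ("S", "BusRd"): ("S", None),
    ("S", "BusRdX"): ("I", None),
    ("I", "PrRd"): ("S", "BusRd"),
    ("I", "PrWr"): ("M", "BusRdX"),
    ("I", "BusRd"): ("I", None),
    ("I", "BusRdX"): ("I", None),
}


def access(states, cpu, op):
    """Apply a processor operation to one cache and snoop it in the others."""
    states = list(states)
    states[cpu], bus = MSI[(states[cpu], op)]
    actions = [bus] if bus else []
    if bus:
        for other in range(len(states)):
            if other != cpu:
                states[other], reply = MSI[(states[other], bus)]
                if reply:
                    actions.append(reply)
    return states, actions
```

Starting from `["I", "I"]`, a read by cache 0 gives `["S", "I"]`, a write by cache 1 then gives `["I", "M"]`, and a second read by cache 0 forces cache 1 to flush and leaves both in `S`.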

Memory Consistency Models • Memory consistency characterizes the order in which accesses by one processor to different locations in memory are observed by another processor.

Memory Consistency Models • Sequential Consistency [Lamport] • “A multiprocessor is sequentially consistent if the result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program.”

Memory Consistency Models • Sequential Consistency • Example 1: P1: W(x)1 / P2: R(x)1 R(x)2 / P3: R(x)1 R(x)2 / P4: W(x)2 → sequentially consistent: O • Example 2: P1: W(x)1 / P2: R(x)1 R(x)2 / P3: R(x)2 R(x)1 / P4: W(x)2 → sequentially consistent: X (cf. causal consistency: O)
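
The two examples can be checked mechanically with a brute-force search over interleavings. This is a minimal sketch: the history encoding (tuples of kind, variable, value) and the function name `sequentially_consistent` are hypothetical, and the search is exponential, so it only suits slide-sized histories.

```python
def sequentially_consistent(history, initial=0):
    """Search for one interleaving that respects program order and read values."""
    procs = list(history)

    def search(pos, mem):
        if all(pos[p] == len(history[p]) for p in procs):
            return True
        for p in procs:
            i = pos[p]
            if i == len(history[p]):
                continue
            kind, var, val = history[p][i]
            nxt = {**pos, p: i + 1}
            if kind == "W":
                if search(nxt, {**mem, var: val}):
                    return True
            elif mem.get(var, initial) == val:  # a read must see the latest write
                if search(nxt, mem):
                    return True
        return False

    return search({p: 0 for p in procs}, {})


example1 = {
    "P1": [("W", "x", 1)],
    "P2": [("R", "x", 1), ("R", "x", 2)],
    "P3": [("R", "x", 1), ("R", "x", 2)],
    "P4": [("W", "x", 2)],
}
example2 = {**example1, "P3": [("R", "x", 2), ("R", "x", 1)]}
```

`sequentially_consistent(example1)` succeeds with the order W(x)1, reads of 1, W(x)2, reads of 2. `sequentially_consistent(example2)` fails because P2 and P3 disagree on the order of the two writes.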

Memory Consistency Models • Read-read hazard • May matter in multithreaded programming • P1: W(x)1 / P2: R(x)1 R(x)0 ? • Write-read hazard • W(x)0 … W(x)1 … R(x) … W(x)2: what if R(x) reads 0? • Read-write hazard • W(x)0 … W(x)1 … R(x) … W(x)2: what if R(x) reads 2? • Write-write hazard • P1: W(x)1 … W(x)2 / P2: R(x)2 … R(x)1 ?

Memory Consistency Models • Processor consistency [Goodman] • Writes done by a single processor are received by all other processors in the order in which they were issued, but writes from different processors may be seen in a different order by different processors • Relaxes the write-read (WR) hazard; this concerns the propagation delay of writes

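Memory Consistency Models • Ex1 • Processor1: Write A = 1; Read B; • Processor2: Write B = 1; Read A; • Orders are listed by issuing processor; results are the values read for A and B • 1 1 2 2: A=1, B=0 • 1 2 1 2: A=1, B=1 • 2 1 1 2: A=1, B=1 • 2 2 1 1: A=0, B=1 • These results: sequential consistency o, processor consistency o • A=0, B=0 (each read completes before the other processor's write becomes visible): sequential consistency x, processor consistency o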
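
Ex1 can be enumerated exhaustively. In the sketch below, per-processor write buffers are one assumed mechanism that produces the relaxed result; the slide does not say how processor consistency is implemented, and the names `PROGRAMS` and `outcomes` are hypothetical. Each result is the pair (A read by Processor2, B read by Processor1).

```python
PROGRAMS = {1: [("W", "A"), ("R", "B")], 2: [("W", "B"), ("R", "A")]}


def outcomes(store_buffers):
    """Collect (A, B) read values over every possible execution."""
    results = set()

    def run(pc, mem, bufs, reads):
        if all(pc[p] == len(PROGRAMS[p]) for p in PROGRAMS):
            results.add((reads["A"], reads["B"]))
            return
        for p in PROGRAMS:
            if bufs[p]:  # drain the oldest buffered write to memory
                var = bufs[p][0]
                run(pc, {**mem, var: 1}, {**bufs, p: bufs[p][1:]}, reads)
            if pc[p] == len(PROGRAMS[p]):
                continue
            kind, var = PROGRAMS[p][pc[p]]
            step = {**pc, p: pc[p] + 1}
            if kind == "W" and store_buffers:
                run(step, mem, {**bufs, p: bufs[p] + (var,)}, reads)
            elif kind == "W":
                run(step, {**mem, var: 1}, bufs, reads)
            else:  # a read sees the processor's own buffered write, else memory
                value = 1 if var in bufs[p] else mem[var]
                run(step, mem, bufs, {**reads, var: value})

    run({1: 0, 2: 0}, {"A": 0, "B": 0}, {1: (), 2: ()}, {})
    return results
```

Without buffers, every execution is a plain interleaving and the result (0, 0) never appears. With buffers, both writes can still be waiting in their buffers when both reads execute, so (0, 0) becomes possible.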

Memory Consistency Models • Ex2 • Processor1: A=1; • Processor2: while(A==0); B=1; • Processor3: while(B==0); Print A; • Sequential consistency: only ‘1’ will be printed • Processor consistency: ‘0’ could be printed; this is the case where committing A takes longer than committing B for Processor3.
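
The timing argument in Ex2 can be written as a small calculation. This is a sketch under assumed conditions: Processor1 writes A at time 0, each write reaches each other processor after its own delay, and the function and parameter names are hypothetical.

```python
def printed_value(a_to_p2, a_to_p3, b_to_p3):
    """Value of A that Processor3 prints, given per-link propagation delays."""
    t_p2_sees_a = a_to_p2                  # Processor1 writes A at time 0
    t_p3_sees_b = t_p2_sees_a + b_to_p3    # Processor2 writes B once it sees A
    t_p3_sees_a = a_to_p3
    # Processor3 leaves its loop when B arrives and prints whatever A it has.
    return 1 if t_p3_sees_a <= t_p3_sees_b else 0
```

When a write becomes visible to everyone at the same moment (`a_to_p2 == a_to_p3`), the function always returns 1, matching sequential consistency. A slow path for A to Processor3, such as `printed_value(1, 5, 1)`, returns 0.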

Memory Consistency Models • Release consistency [Gharachorloo] • Explicit synchronization points (acquire, release) • Eager RC vs Lazy RC [Figure: timeline with the commit point marked; P1: Acq. W(x)1 Rel.; P2: Acq. R(x) Rel.]
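
The eager and lazy variants differ in when released writes become visible. The class below is a deliberately simplified model, not an actual SDSM implementation: it assumes a single lock, keeps one private view per node, and all names (`ReleaseConsistentMemory`, `views`, `pending`, `released`) are hypothetical.

```python
class ReleaseConsistentMemory:
    """Toy model of one lock protecting shared variables across nodes."""

    def __init__(self, nodes, lazy):
        self.views = {n: {} for n in nodes}    # each node's local copy
        self.pending = {n: {} for n in nodes}  # writes made since the last release
        self.released = {}                     # updates attached to the lock (lazy)
        self.lazy = lazy

    def acquire(self, node):
        # Lazy RC: the acquirer pulls the updates of earlier releases.
        self.views[node].update(self.released)

    def write(self, node, var, value):
        self.views[node][var] = value
        self.pending[node][var] = value

    def read(self, node, var):
        return self.views[node].get(var, 0)

    def release(self, node):
        if self.lazy:
            self.released.update(self.pending[node])
        else:
            # Eager RC: push the updates to every node at release time.
            for view in self.views.values():
                view.update(self.pending[node])
        self.pending[node] = {}
```

In both variants P2 cannot see W(x)1 before P1 releases. With eager RC the value is visible everywhere right after the release; with lazy RC it reaches P2 only when P2 performs its own acquire.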
