Multiprocessor Architecture 2005.11.14 양회석 梁會石 Hoeseok Yang
Contents • Introduction • Types of multiprocessing • Cluster Systems • Evolution of Shared-Memory System
Introduction • Food-chain of High-Performance Computers
Introduction • Just for fun • [Figure: pictures labeled "Linux cluster", "PVP (Parallel Vector Processor)", and "Linux cluster"]
Contents • Introduction • Types of multiprocessing • Clustered computing • Shared-memory multiprocessing • Hybrids • Cluster Systems • Evolution of Shared-Memory System
Clustered computing • Each node (processor) has its own memory and other peripherals • Commodity computers can be used as nodes • Each node runs its own system software (OS) • [Figure: four nodes, each with a processor (P), memory (Mem.), and I/O]
Clustered computing • Beneficial when individual application processes are largely independent of each other (e.g., web and DB servers)
Shared-memory multiprocessing • Multiprocessor system with a single operating system • The OS manages the memory in the system as one large structure shared symmetrically among all the processors • [Figure: several processors (P) sharing one memory (Mem.) and I/O]
Shared-memory multiprocessing • Beneficial when multiple processes share data • e.g., scientific applications
Hybrids of clustered and shared-memory • Distributed Shared Memory • Implemented like clustered systems but capable of supporting a single OS image across the multiple processors • NUMA (cf. SMP, which is UMA) • SMP clusters • Each node itself is a small SMP.
Contents • Introduction • Types of multiprocessing • Cluster Systems • Evolution of Shared-Memory System
Terminal and server • Generally speaking, the most common form of cluster is simply a network of computers. • terminal – I/O aspect • server – service aspect • Workload balancing and degradation under failures matter.
Beowulf (1997) • Beowulf clusters are scalable performance clusters based on commodity hardware, on a private system network, with open source software (Linux) infrastructure. • simulations, biotechnology, and petro-clusters; financial market modeling, data mining and stream processing; and Internet servers for audio and games …
ParADE [SNU] 1. Intelligent OpenMP translator 2. Explicit message passing primitives 3. Multi-threaded software distributed shared memory (SDSM) 4. Home-based lazy release consistency (HLRC) with migratory home
Distinction from each other • From a programming point of view, the distinction between the shared-memory and clustered types arises from the way processes communicate with each other • Shared memory • Message passing – cluster • explicit synchronization by the programmer is needed (e.g., a barrier)
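The two communication styles can be contrasted with a small sketch. It uses Python threads as stand-ins for processors, which is an assumption made for illustration only; the names `shared_memory_sum`, `message_passing_sum`, and `NUM_WORKERS` are hypothetical and do not come from the slides.

```python
import queue
import threading

NUM_WORKERS = 4  # hypothetical number of "processors"

def shared_memory_sum(values):
    """Workers update one shared location; a lock and a barrier synchronize them."""
    total = [0]                               # shared location visible to all threads
    lock = threading.Lock()
    barrier = threading.Barrier(NUM_WORKERS)
    seen = []                                 # value each worker reads after the barrier

    def worker(chunk):
        partial = sum(chunk)
        with lock:                            # protect the read-modify-write
            total[0] += partial
        barrier.wait()                        # explicit synchronization point
        with lock:
            seen.append(total[0])             # every worker now reads the full sum

    threads = [threading.Thread(target=worker, args=(values[i::NUM_WORKERS],))
               for i in range(NUM_WORKERS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0], seen

def message_passing_sum(values):
    """Workers share no data; each sends its partial sum to a collector."""
    mailbox = queue.Queue()

    def worker(chunk):
        mailbox.put(sum(chunk))               # explicit send

    threads = [threading.Thread(target=worker, args=(values[i::NUM_WORKERS],))
               for i in range(NUM_WORKERS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(mailbox.get() for _ in range(NUM_WORKERS))  # explicit receives
```

In the shared-memory version the result lives in one location that every worker can read after the barrier; in the message-passing version the only way data moves is through explicit sends and receives.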
Contents • Introduction • Types of multiprocessing • Cluster Systems • Evolution of Shared-Memory System • Memory Coherence Models • Memory Consistency Models
Memory coherence Models • Memory coherence refers to the visibility of a write to a given memory location by all other processors in the system. • If the order of writes to a given location by one processor is maintained when observed by every other processor in the system, the system is said to be memory coherent.
Memory coherence Models • [Figure: timeline of accesses to location 50; P1's write request arrives at memory first, then P2's (order 1, 2)] • P1: requests write 1@50, reads 1 from cached 50, 50 gets evicted from P1 cache, reads 1@50, reads 2@50 • P2: requests write 2@50, reads 2 from cached 50, 50 gets evicted from P2 cache, reads 1@50, 50 gets evicted from P2 cache, reads 2@50 • P1 sees the order "1-1-2", while P2 sees "2-1-2": violation of memory coherence
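The violation above can be checked mechanically. The following is a minimal sketch, assuming each value is written to the location exactly once and that reads of the initial value are ignored; `is_coherent` and its helpers are hypothetical names. It asks whether one single order of the writes exists that explains what every processor observed.

```python
from itertools import permutations

def dedup(seq):
    """Collapse consecutive repeats: reading 1, 1, 2 shows the write order 1, 2."""
    out = []
    for v in seq:
        if not out or out[-1] != v:
            out.append(v)
    return out

def is_subsequence(small, big):
    """True if `small` appears in `big` in order (not necessarily contiguously)."""
    it = iter(big)
    return all(v in it for v in small)

def is_coherent(writes, observations):
    """writes: distinct values written to one location, each written once.
    observations: dict mapping a processor name to the values it read, in order."""
    for order in permutations(writes):
        if all(is_subsequence(dedup(obs), order) for obs in observations.values()):
            return True   # one write order explains every observer
    return False
```

For the example on this slide, `is_coherent([1, 2], {"P1": [1, 1, 2], "P2": [2, 1, 2]})` is `False`, because P2 sees the value 2 again after it was overwritten by 1.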
Memory coherence Models • Update • Invalidation • [Figure: two-state diagram (V = valid, I = invalid) with processor-initiated and bus-initiated transitions; edge labels: PrRd/-, PrWt/BusWt, PrRd/BusRd, BusWt/-]
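An invalidation-based protocol with the two states V and I can be sketched as a transition table. The mapping of edge labels to transitions below is an assumption: it follows the usual textbook write-through, no-write-allocate protocol, since the original diagram is not recoverable. `TRANSITIONS` and `step` are hypothetical names.

```python
# (state, event) -> (next_state, bus_action)
# Assumed transitions; the labels PrRd, PrWt, BusRd, BusWt are those on the slide.
TRANSITIONS = {
    ("V", "PrRd"):  ("V", None),      # read hit, no bus traffic
    ("V", "PrWt"):  ("V", "BusWt"),   # write through to memory
    ("V", "BusWt"): ("I", None),      # another cache wrote: invalidate our copy
    ("I", "PrRd"):  ("V", "BusRd"),   # read miss: fetch the line
    ("I", "PrWt"):  ("I", "BusWt"),   # write miss: write through, do not allocate
    ("I", "BusWt"): ("I", None),      # nothing to invalidate
}

def step(states, cpu, event):
    """Apply a processor event on `cpu`, then let the other caches snoop the bus."""
    states = dict(states)
    states[cpu], bus = TRANSITIONS[(states[cpu], event)]
    if bus == "BusWt":
        for other in states:
            if other != cpu:
                states[other], _ = TRANSITIONS[(states[other], "BusWt")]
    return states, bus
```

Starting from `{"P1": "I", "P2": "I"}`, letting both processors read and then P1 write leaves P1 in V and P2 in I, so P2's next read misses and fetches the new value.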
Memory coherence Models • MSI, MESI… • [Figure: state-transition diagrams for MSI (states M, S, I) and MESI (states M, E, S, I); edge labels: PrRd/-, PrWr/-, PrRd/BusRd, PrRd/BusRd(S), PrWr/BusRdX, BusRd/-, BusRd/Flush, BusRdX/-, BusRdX/Flush]
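The MSI protocol can be written down as a table using the edge labels from the diagram. This is a sketch of the standard textbook MSI transitions, assumed here because the arrows of the original figure are lost; `MSI` and `access` are hypothetical names, and only a single cache line is modeled.

```python
# (state, event) -> (next_state, action). Processor events: PrRd, PrWr.
# Snooped bus events: BusRd, BusRdX.
MSI = {
    ("M", "PrRd"):   ("M", None),
    ("M", "PrWr"):   ("M", None),
    ("M", "BusRd"):  ("S", "Flush"),    # supply the dirty line, keep a shared copy
    ("M", "BusRdX"): ("I", "Flush"),    # supply the dirty line, then invalidate
    ("S", "PrRd"):   ("S", None),
    ("S", "PrWr"):   ("M", "BusRdX"),   # upgrade: invalidate other copies
    ("S", "BusRd"):  ("S", None),
    ("S", "BusRdX"): ("I", None),
    ("I", "PrRd"):   ("S", "BusRd"),
    ("I", "PrWr"):   ("M", "BusRdX"),
    ("I", "BusRd"):  ("I", None),
    ("I", "BusRdX"): ("I", None),
}

def access(states, cpu, event):
    """Apply a processor event on `cpu` and let every other cache snoop the bus."""
    states = dict(states)
    states[cpu], bus = MSI[(states[cpu], event)]
    actions = [bus] if bus else []
    if bus in ("BusRd", "BusRdX"):
        for other in states:
            if other != cpu:
                states[other], reply = MSI[(states[other], bus)]
                if reply:
                    actions.append(reply)
    return states, actions
```

A write by P1 followed by a read by P2 moves the line from M in P1 to S in both caches, with P1 flushing the dirty data on the way. At no point do two caches hold the line in M.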
Memory Consistency Models • Memory consistency characterizes the order in which accesses by one processor to different locations in memory are observed by another processor.
Memory Consistency Models • Sequential Consistency [Lamport] • “A multiprocessor is sequentially consistent if the result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program”
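Memory Consistency Models • Sequential Consistency • Execution 1: P1: W(x)1 P2: R(x)1 R(x)2 P3: R(x)1 R(x)2 P4: W(x)2 • sequentially consistent: O (cf. causal consistency: O) • Execution 2: P1: W(x)1 P2: R(x)1 R(x)2 P3: R(x)2 R(x)1 P4: W(x)2 • sequentially consistent: X (cf. causal consistency: O)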
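Lamport's definition can be tested by brute force on small executions like the two above. The sketch below enumerates every interleaving that respects each processor's program order and checks whether any of them makes all reads return the latest write. The function names and the tuple encoding `("W", "x", 1)` / `("R", "x", 1)` are assumptions chosen for this illustration.

```python
def interleavings(programs):
    """Yield every merge of the per-processor operation lists that keeps program order."""
    if all(not p for p in programs):
        yield []
        return
    for i, p in enumerate(programs):
        if p:
            rest = programs[:i] + [p[1:]] + programs[i + 1:]
            for tail in interleavings(rest):
                yield [p[0]] + tail

def is_legal(sequence):
    """In a legal sequence every read returns the most recent write to its location."""
    memory = {}
    for kind, loc, val in sequence:
        if kind == "W":
            memory[loc] = val
        elif memory.get(loc) != val:
            return False
    return True

def sequentially_consistent(programs):
    """True if some program-order-preserving interleaving is legal."""
    return any(is_legal(s) for s in interleavings(programs))

# The two executions from the slide (P1..P4, in that order).
execution_1 = [[("W", "x", 1)],
               [("R", "x", 1), ("R", "x", 2)],
               [("R", "x", 1), ("R", "x", 2)],
               [("W", "x", 2)]]
execution_2 = [[("W", "x", 1)],
               [("R", "x", 1), ("R", "x", 2)],
               [("R", "x", 2), ("R", "x", 1)],
               [("W", "x", 2)]]
```

`sequentially_consistent(execution_1)` is `True` and `sequentially_consistent(execution_2)` is `False`: P2 needs W(x)1 before W(x)2 while P3 needs the opposite, so no single order satisfies both. The search is exponential and only meant for examples of this size.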
Memory Consistency Models • Read-read hazard • May matter in multithreaded programming • P1: W(x)1; P2: R(x)1 R(x)0 ? • Write-read hazard • W(x)0 … W(x)1 … R(x) … W(x)2: what if R(x) reads 0? • Read-write hazard • W(x)0 … W(x)1 … R(x) … W(x)2: what if R(x) reads 2? • Write-write hazard • P1: W(x)1 … W(x)2; P2: R(x)2 … R(x)1 ?
Memory Consistency Models • Processor consistency [Goodman] • Writes done by a single processor are received by all other processors in the order in which they were issued, but writes from different processors may be seen in a different order by different processors • Relaxes the write-read (WR) hazard; it concerns propagation delay
Memory Consistency Models • Ex1 • Processor1: Write A = 1; Read B; • Processor2: Write B = 1; Read A; • [Figure: interleavings of the four operations, labeled by issuing processor (1 1 2 2, 1 2 1 2, 2 1 1 2, 2 2 1 1), with the values read] • Values read A=1,B=0; A=1,B=1; A=0,B=1: sequential consistency o, processor consistency o • Values read A=0,B=0: sequential consistency x, processor consistency o
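The outcomes of Ex1 can be enumerated with a small model. The sketch below assumes a per-processor FIFO store buffer as the mechanism that lets a read complete before an earlier write becomes visible; this is one common way to obtain the relaxed behavior and is used here as a stand-in, not as the definition of processor consistency. `outcomes` is a hypothetical name, and each result is the pair (value read for A, value read for B).

```python
def outcomes(buffered):
    """Explore every schedule of Ex1 and collect the (A, B) values that are read."""
    programs = (
        (("W", "A", 1), ("R", "B")),   # Processor1
        (("W", "B", 1), ("R", "A")),   # Processor2
    )
    results = set()

    def explore(pcs, buffers, memory, reads):
        if all(pc == len(prog) for pc, prog in zip(pcs, programs)):
            results.add((reads["A"], reads["B"]))
            return
        for i, prog in enumerate(programs):
            # Choice 1: drain processor i's oldest buffered write to memory (FIFO).
            if buffers[i]:
                loc, val = buffers[i][0]
                drained = list(buffers)
                drained[i] = buffers[i][1:]
                explore(pcs, tuple(drained), {**memory, loc: val}, reads)
            # Choice 2: execute processor i's next instruction.
            if pcs[i] < len(prog):
                op = prog[pcs[i]]
                new_pcs = list(pcs)
                new_pcs[i] += 1
                if op[0] == "W":
                    _, loc, val = op
                    if buffered:
                        filled = list(buffers)
                        filled[i] = buffers[i] + ((loc, val),)
                        explore(tuple(new_pcs), tuple(filled), memory, reads)
                    else:
                        explore(tuple(new_pcs), buffers, {**memory, loc: val}, reads)
                else:
                    loc = op[1]
                    own = [v for l, v in buffers[i] if l == loc]
                    val = own[-1] if own else memory[loc]  # own pending write wins
                    explore(tuple(new_pcs), buffers, memory, {**reads, loc: val})

    explore((0, 0), ((), ()), {"A": 0, "B": 0}, {})
    return results
```

Without buffering (writes go straight to memory, which behaves like sequential consistency here) the result A=0,B=0 never appears. With buffering it does: both writes sit in their buffers while both reads fetch the old values from memory.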
Memory Consistency Models • Ex2 • Processor1: A=1; • Processor2: while(A==0); B=1; • Processor3: while(B==0); Print A; • Only ‘1’ will be printed under sequential consistency • ‘0’ could be printed under processor consistency: this is the case where the commit of A takes longer to reach processor 3 than the commit of B.
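Ex2 can be modeled by treating the arrival of each write at each processor as a separate event. This is a sketch under that assumption; the event names and `printed_values` are hypothetical. With atomic writes, A = 1 reaches processors 2 and 3 in one indivisible step, which is modeled here by requiring the two arrivals to be adjacent in the schedule.

```python
from itertools import permutations

# "A->P2" means: the write A = 1 becomes visible to Processor2, and so on.
EVENTS = ("A->P2", "A->P3", "B->P3", "print")

def printed_values(atomic_writes):
    """Return the set of values Processor3 can print over all valid schedules."""
    values = set()
    for order in permutations(EVENTS):
        pos = {event: i for i, event in enumerate(order)}
        # Processor2 writes B only after it has seen A = 1.
        if pos["A->P2"] > pos["B->P3"]:
            continue
        # Processor3 prints only after it has seen B = 1.
        if pos["B->P3"] > pos["print"]:
            continue
        # Atomic writes: A becomes visible to P2 and P3 in one indivisible step.
        if atomic_writes and abs(pos["A->P2"] - pos["A->P3"]) != 1:
            continue
        values.add(1 if pos["A->P3"] < pos["print"] else 0)
    return values
```

`printed_values(True)` contains only 1, while `printed_values(False)` also contains 0: the schedule "A->P2, B->P3, print, A->P3" is exactly the case where A is slower than B on its way to processor 3.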
Memory Consistency Models • Release consistency [Gharachorloo] • Explicit synchronization points • Eager RC vs. Lazy RC • P1: Acq. W(x)1 Rel. • P2: Acq. R(x) Rel. • [Figure: the commit point marked on the P1/P2 timeline]
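The difference between eager and lazy release consistency can be illustrated with a toy model. It assumes one lock, whole-value updates, and a simple message counter; `Node`, `Cluster`, and `run` are hypothetical names, and this is not the protocol of any particular system. Eager RC pushes the writes to every other node at release; lazy RC leaves them with the lock until the next acquire pulls them.

```python
class Node:
    def __init__(self):
        self.memory = {}   # this node's local copy of the shared data
        self.dirty = {}    # writes made since the last release

class Cluster:
    def __init__(self, names, lazy):
        self.nodes = {name: Node() for name in names}
        self.lazy = lazy
        self.pending = {}  # lazy RC: updates recorded with the lock
        self.messages = 0  # update messages sent between nodes

    def acquire(self, name):
        if self.lazy and self.pending:
            self.nodes[name].memory.update(self.pending)  # pull at acquire time
            self.messages += 1

    def write(self, name, loc, val):
        node = self.nodes[name]
        node.memory[loc] = val
        node.dirty[loc] = val

    def read(self, name, loc):
        return self.nodes[name].memory.get(loc, 0)

    def release(self, name):
        node = self.nodes[name]
        if node.dirty:
            if self.lazy:
                self.pending.update(node.dirty)           # record, send nothing yet
            else:
                for other in self.nodes.values():         # eager: push to everyone
                    if other is not node:
                        other.memory.update(node.dirty)
                        self.messages += 1
        node.dirty = {}

def run(lazy):
    """P1: Acq. W(x)1 Rel.  P2: Acq. R(x) Rel.  P3 never synchronizes."""
    c = Cluster(["P1", "P2", "P3"], lazy)
    c.acquire("P1")
    c.write("P1", "x", 1)
    c.release("P1")
    c.acquire("P2")
    seen_by_p2 = c.read("P2", "x")
    c.release("P2")
    return seen_by_p2, c.messages, c.read("P3", "x")
```

In both variants P2 reads 1 after its acquire, which is all release consistency promises. The eager run sends two update messages and P3 already holds the new value; the lazy run sends one, and P3, which never acquired the lock, still holds the old value.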