In the name of God. Interconnection Networks. Dr. Mohammad Kazem Akbari, Morteza Sargolzaei Javan. http://crc.aut.ac.ir
Taxonomy
MIMD • Multiprocessor (shared memory) • (Tightly Coupled Architecture) • [Figure: processors P1, P2, ..., Pn connected through an interconnection network (IN) to memory modules M1, M2, ..., Mn]
Shared Memory • Uniform Memory Access (UMA): tightly coupled system • Non-Uniform Memory Access (NUMA): loosely coupled system, e.g., Cedar (University of Illinois) and BBN Butterfly • Cache-Only Memory Access (COMA): uses global distributed caches, e.g., Kendall Square Research KSR-1
MIMD (cont.) • (Loosely Coupled Architecture) - Cedar • [Figure: three clusters, each with processors P1, P2, ..., Pn, memories CM1, CM2, CM3 and a CIN, connected through a global interconnection network (Global IN) to global memories GM1, GM2, ..., GMn]
MIMD (cont.) • (Loosely Coupled Architecture) - BBN Butterfly • [Figure: processor-memory pairs P1/M1, P2/M2, ..., Pn/Mn connected by an interconnection network (IN)]
MIMD (cont.) • (COMA Architecture) • [Figure: processors P1, P2, ..., Pn with blocks C1, C2, ..., Cn and D1, D2, ..., Dn, connected by an interconnection network (IN)]
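MIMD (cont.) • Multicomputer (message passing) • [Figure: processors P1, P2, ..., Pn, each with its own memory M1, M2, ..., Mn, connected by an interconnection network (IN)]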
MIMD (cont.) • Data flow machine: an instruction is ready for execution as soon as the data for all of its operands are available • Purely self-contained • No program counter
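The firing rule on this slide can be illustrated with a minimal Python sketch. It is not a model of any particular data flow machine: the `Node` class, the `run` function and the example graph for (a + b) * (c - d) are all hypothetical names introduced here. Instructions execute purely when their operands have arrived, and no program counter orders them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Node:
    """One dataflow instruction: it fires once all operand slots hold a value."""
    name: str
    op: Callable[..., float]
    arity: int
    targets: List[Tuple[str, int]]           # (consumer node, operand slot)
    operands: Dict[int, float] = field(default_factory=dict)

    def ready(self) -> bool:
        return len(self.operands) == self.arity


def run(nodes: Dict[str, Node], tokens: List[Tuple[str, int, float]]):
    """Deliver data tokens and execute every instruction whose operands are complete."""
    results, fired = {}, []
    pending = list(tokens)
    while pending:
        name, slot, value = pending.pop(0)
        node = nodes[name]
        node.operands[slot] = value
        if node.ready():                      # firing rule: data availability only
            out = node.op(*(node.operands[i] for i in range(node.arity)))
            fired.append(name)
            results[name] = out
            node.operands.clear()
            pending.extend((t, s, out) for t, s in node.targets)
    return results, fired


# Hypothetical graph for (a + b) * (c - d); input tokens arrive in arbitrary order.
graph = {
    "add": Node("add", lambda x, y: x + y, 2, [("mul", 0)]),
    "sub": Node("sub", lambda x, y: x - y, 2, [("mul", 1)]),
    "mul": Node("mul", lambda x, y: x * y, 2, []),
}
results, fired = run(
    graph,
    [("sub", 1, 1.0), ("add", 0, 2.0), ("sub", 0, 5.0), ("add", 1, 3.0)],
)
```

In this run `sub` fires before `add` simply because its second operand happens to arrive first; the execution order follows the data, not the order in which the instructions were written.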
SIMD • Array processor • Centralized control unit
MISD • Pipelined vector processor
MISD (cont.) • Systolic array
Hybrid Architecture • Combines features of different architectures to provide better performance for parallel computations • Two types of parallelism: control parallelism (MIMD) and data parallelism (SIMD)
Special Purpose Devices • Artificial Neural Networks (ANN) • Fuzzy logic
Neural Networks (Definition) • Power of processors vs. power of connectivity • A large number of processing elements (PEs) • Connected in parallel • Capable of learning • Adaptive to change • Able to cope with serious disruptions
Fuzzy logic (Definition) • Approximate reasoning • Formal principles of reasoning
Interconnection Network (IN) • The measure of an IN is “how quickly it can deliver how much of what’s needed to the right place, reliably and at good cost and value”.
Performance Criteria for IN • Latency: transit time for a single message • Bandwidth: how much message traffic the IN can handle, e.g., in Mbytes/s • Connectivity: how many immediate neighbors each node has, and how often each neighbor can be reached • Hardware cost: the fraction of the total hardware cost that the IN represents (e.g., wires, switches, connectors, arbitration logic)
Performance Criteria for IN (cont.) • Reliability: redundant paths • Functionality: additional functions performed by the IN, such as message combining and fault tolerance (e.g., data routing, interrupt handling, request/message combining, coherence) • Scalability: the ability to be expanded
Definitions • Node degree: the number of links (edges) connected to the node • Diameter: the largest minimum distance between any pair of nodes, where the minimum distance between two nodes is the minimum number of communication links (hops) that data must traverse to get from one node to the other • Network size: the number of nodes in the IN
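The three definitions above translate directly into a few lines of Python. The sketch below is illustrative only: the helper names (`ring`, `linear_array`, `hops_from`, `node_degree`, `diameter`, `network_size`) are assumptions made for this example, and the two topologies are just convenient test cases. Minimum distances are found by breadth-first search.

```python
from collections import deque


def ring(n):
    """Adjacency lists of a bidirectional ring with n nodes (illustrative topology)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def linear_array(n):
    """Adjacency lists of a linear array with n nodes."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def hops_from(adj, src):
    """Minimum hop count from src to every reachable node (breadth-first search)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def node_degree(adj):
    """Largest number of distinct links attached to any node."""
    return max(len(set(neigh)) for neigh in adj.values())


def diameter(adj):
    """Largest minimum distance over all pairs of nodes."""
    return max(max(hops_from(adj, s).values()) for s in adj)


def network_size(adj):
    """Number of nodes in the network."""
    return len(adj)
```

For example, `diameter(ring(8))` evaluates to 4 and `diameter(linear_array(8))` to 7, while both networks have node degree 2.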
Data Routing • Functions in data routing • Shifting • Rotation • Permutation (one-to-one) • Broadcast (one-to-all) • Multicast (many-to-many) • Personalized communication (one-to-many) • Shuffle / Exchange
Types of IN • Static Networks • Dynamic Networks
Static Networks • Shared Bus • Degree = 1 • Diameter = 1
Static Networks (cont.) • Linear Array • Degree = 2 • Diameter = n-1
Static Networks (cont.) • Ring • Degree = 2 • Diameter: unidirectional = n-1, bidirectional = Ceil((n-1)/2)
Static Networks (cont.) • Binary tree • Degree: leaf = 1, root = 2, others = 3 • Diameter = 2(h-1), where h is the number of levels
Static Networks (cont.) • Fat tree • Degree and diameter are the same as for the binary tree • Because traffic is heaviest near the root, the number of links gradually increases toward the root (e.g., CM-5)
Static Networks (cont.) • Star • Degree: central = n-1, others = 1 • Diameter = 2
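Static Networks (cont.) • Perfect shuffle (source → destination): 000 → 000, 001 → 010, 010 → 100, 011 → 110, 100 → 001, 101 → 011, 110 → 101, 111 → 111 • $\mathrm{Shuffle}(s_{n-1} s_{n-2} \ldots s_0) = s_{n-2} s_{n-3} \ldots s_0 s_{n-1}$ • $\mathrm{Exchange}(s_{n-1} s_{n-2} \ldots s_1 s_0) = s_{n-1} s_{n-2} \ldots s_1 \overline{s_0}$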
Shuffle-Exchange Network • For N=8 • Applications: • The shuffle-exchange network provides suitable interconnection patterns for implementing certain parallel algorithms, such as polynomial evaluation, Fast Fourier Transform (FFT), sorting, and matrix transposition.
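The two functions that define this network are easy to express on integer addresses. The sketch below is a minimal illustration for N = 8 (3-bit addresses); the names `shuffle`, `exchange`, `N_BITS` and `table` are chosen for this example only. Shuffle is a one-bit left rotation of the address, and exchange complements the least significant bit.

```python
def shuffle(addr: int, n_bits: int) -> int:
    """Perfect shuffle: rotate the n-bit address left by one position."""
    msb = (addr >> (n_bits - 1)) & 1
    return ((addr << 1) & ((1 << n_bits) - 1)) | msb


def exchange(addr: int) -> int:
    """Exchange: complement the least significant address bit."""
    return addr ^ 1


N_BITS = 3  # N = 8 nodes
table = {format(s, "03b"): format(shuffle(s, N_BITS), "03b") for s in range(2 ** N_BITS)}
```

Here `table` maps each 3-bit source address to its shuffle destination, for instance `"001"` to `"010"` and `"100"` to `"001"`. Applying `shuffle` three times returns any address to itself, since three one-bit rotations of a 3-bit word form a full rotation.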
Static Networks (cont.) • Mesh. • Degree: • Corner= 2 • Sides = 3 • Middle= 4 • Diameter= 2(n-1)
Mesh Routing Algorithm • A simple routing algorithm routes a packet from source S to destination D in a mesh with $n^2$ nodes: 1. Compute the row distance R. 2. Compute the column distance C. 3. Add the values R and C to the packet header at the source node. 4. Starting from the source, send the packet for R rows and then for C columns.
Example (Mesh) • To route a packet from node 6 (i.e., S = 6) to node 12 (i.e., D = 12), the packet goes through two paths, as shown in the figure.
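The routing steps can be sketched in Python under one stated assumption: nodes are numbered 0 to n*n-1 row by row, so that a node's row is its number divided by n and its column is the remainder. The slide's own formulas and figure are not available, so the expressions for R and C below, and the function name `mesh_route`, are assumptions of this sketch rather than the author's exact definitions.

```python
def mesh_route(s: int, d: int, n: int):
    """Return (R, C, path) for routing from s to d in an n-by-n mesh.

    Assumes row-major node numbering starting at 0 (an assumption of this sketch).
    """
    r = d // n - s // n           # signed row distance R
    c = d % n - s % n             # signed column distance C
    path, node = [s], s
    row_step = n if r > 0 else -n
    for _ in range(abs(r)):       # step 4: travel R rows first ...
        node += row_step
        path.append(node)
    col_step = 1 if c > 0 else -1
    for _ in range(abs(c)):       # ... then C columns
        node += col_step
        path.append(node)
    return r, c, path
```

With a 4-by-4 mesh, `mesh_route(6, 12, 4)` gives R = 2, C = -2 and the path 6, 10, 14, 13, 12. The sign of each distance selects the direction of travel.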
Static Networks (cont.) • Illiac mesh • Degree = 4 • Diameter = n-1 • [Figure: chordal ring]
Static Networks (cont.) • Torus • Degree = 4 • Diameter = 2(Floor(n/2))
Static Networks (cont.) • Hypercube • Degree = n • Diameter = n • Address bits = n • Dimensions = n • Neighbors = n
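These hypercube properties follow from the addressing scheme: two nodes are linked exactly when their n-bit addresses differ in one bit. The short Python sketch below checks the degree and diameter for n = 4; the function names `neighbors` and `distance` are illustrative choices, not part of the slides.

```python
def neighbors(node: int, n: int):
    """Neighbors of a node in an n-cube: flip each of the n address bits once."""
    return [node ^ (1 << bit) for bit in range(n)]


def distance(a: int, b: int) -> int:
    """Minimum hop count = number of differing address bits (Hamming distance)."""
    return bin(a ^ b).count("1")


n = 4
nodes = range(2 ** n)
degree = {len(neighbors(v, n)) for v in nodes}                # same for every node
diam = max(distance(a, b) for a in nodes for b in nodes)      # farthest pair
```

For n = 4 every node has 4 neighbors, and the farthest pairs (such as 0000 and 1111) are 4 hops apart, matching Degree = n and Diameter = n.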
Example Embedding a 4-by-4 mesh in a 4-cube
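One standard way to build such an embedding uses binary-reflected Gray codes, sketched below in Python. The slide's figure is not available, so this is not necessarily the mapping it shows. The names `gray`, `embed`, `hamming`, `mapping` and `mesh_edges` are introduced for illustration. Each mesh coordinate is Gray-coded with 2 bits, and the row and column codes are concatenated into a 4-bit cube address.

```python
def gray(i: int) -> int:
    """i-th binary-reflected Gray code word."""
    return i ^ (i >> 1)


def embed(row: int, col: int, bits: int = 2) -> int:
    """Map mesh coordinates to a hypercube address: Gray(row) followed by Gray(col)."""
    return (gray(row) << bits) | gray(col)


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two addresses differ."""
    return bin(a ^ b).count("1")


SIDE = 4
mapping = {(r, c): embed(r, c) for r in range(SIDE) for c in range(SIDE)}

# All horizontal and vertical links of the 4-by-4 mesh.
mesh_edges = (
    [((r, c), (r, c + 1)) for r in range(SIDE) for c in range(SIDE - 1)]
    + [((r, c), (r + 1, c)) for r in range(SIDE - 1) for c in range(SIDE)]
)
```

Because consecutive Gray code words differ in exactly one bit, every mesh link in `mesh_edges` maps to a pair of cube addresses at Hamming distance 1, that is, to a single hypercube link.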
Static Networks (cont.) • n-Mesh • Degree: corner = n, internal = 2n, others between n and 2n • Diameter: [formula missing in source]
Static Networks (cont.) • k-ary n-cube • Degree: n if k = 2, 2n if k > 2 • Diameter = n * Floor(k/2) • [Figure: (a) 4-ary 2-cube network, (b) 3-ary 3-cube network]
Cache Coherence • Multiprocessor environment • Cache dedicated to each processor • Cache coherence problem: how to keep multiple copies of the data consistent during execution?
Cache Coherence Mechanisms • Hardware-based schemes: snoopy cache protocols (if the IN has broadcast features) or directory cache protocols (if the IN has no broadcast features) • Software-based schemes • Combination
Cache Coherence Mechanisms (cont.) • Action taken on • Read Miss • Write Hit • Write Miss
Snoopy Cache Protocol • [Figure: a two-processor configuration with copies of data block x] • Write-through • Write-back
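The two-processor situation can be illustrated with a very small Python model. The slide does not say which snoopy variant it depicts, so the sketch assumes a write-through cache with write-invalidate snooping. The `Bus` and `Cache` classes and the demo values are hypothetical and leave out everything a real protocol needs beyond the basic idea (line states, write-back, replacement).

```python
class Bus:
    """Shared broadcast medium that every cache snoops (hypothetical model)."""

    def __init__(self):
        self.memory = {}
        self.caches = []

    def broadcast_invalidate(self, addr, sender):
        for cache in self.caches:
            if cache is not sender:
                cache.lines.pop(addr, None)    # snooping caches drop their stale copy


class Cache:
    """Private cache of one processor, attached to the shared bus."""

    def __init__(self, bus):
        self.bus = bus
        self.lines = {}
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:             # read miss: fetch the block from memory
            self.lines[addr] = self.bus.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        self.bus.broadcast_invalidate(addr, self)
        self.lines[addr] = value
        self.bus.memory[addr] = value          # write-through: memory is updated at once


# Two processors both cache block x, then P1 writes it.
bus = Bus()
bus.memory["x"] = 1
p1, p2 = Cache(bus), Cache(bus)
p1.read("x")
p2.read("x")
p1.write("x", 2)
p2_had_copy_after_write = "x" in p2.lines      # False: P2's copy was invalidated
p2_value = p2.read("x")                        # read miss, refetches the new value
```

After P1's write, P2 no longer holds block x, so its next read misses and fetches the updated value from memory. Without the invalidation, P2 would keep returning the stale value 1.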
Centralized Directory Protocols • Full-map protocol directory
Dynamic Networks (Single-Stage) • In a single-stage network, any permutation can be realized in at most $3\log_2 N - 1$ passes.
Multistage Networks - Blocking • Examples: multistage cube, Omega
Multistage Networks - Nonblocking • Example: three-stage Clos