Taxonomy of parallel machines

• Control: SIMD, MIMD
• Memory: shared memory, distributed memory

Shared Memory Multiprocessor

Conventional Computer
• Consists of a processor executing a program stored in a (main) memory. Instructions flow from main memory to the processor; data flows to or from the processor.
• Each main memory location is located by its address. Addresses start at 0 and extend to 2^b - 1 when there are b bits (binary digits) in the address.
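As a quick check of the address range stated above, here is a minimal Python sketch. The function name `address_range` is illustrative and does not come from the slides.

```python
def address_range(b):
    """Return (lowest, highest) address reachable with a b-bit address."""
    if b < 1:
        raise ValueError("need at least one address bit")
    # b bits encode 2**b distinct addresses, numbered from 0.
    return 0, 2**b - 1


if __name__ == "__main__":
    print(address_range(16))  # (0, 65535)
```

With 32 address bits the same formula gives a highest address of 4294967295.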

Shared Memory Multiprocessor System
• Natural way to extend the single-processor model: multiple processors are connected to multiple memory modules through an interconnection network, such that each processor can access any memory module.
• The memory modules form one address space.

Simplistic view of a small shared memory multiprocessor: processors connected by a bus to a shared memory.

Typical Shared Memory Multiprocessor
• Each of the four processors has its own L1 cache, L2 cache, and bus interface.
• The bus interfaces attach to a common processor/memory bus.
• A memory controller connects the bus to the shared memory, and an I/O interface connects it to the I/O bus.

Programming Shared Memory Multiprocessors
• Threads: the programmer decomposes the program into individual parallel sequences (threads), each able to access variables declared outside the threads. Example: Pthreads.
• A sequential programming language with preprocessor compiler directives to declare shared variables and specify parallelism. Example: OpenMP or Cilk, which need an OpenMP or Cilk compiler.
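The thread model above can be sketched in a few lines. This example uses Python's standard `threading` module as a stand-in for Pthreads (an assumption made for brevity; the slides name Pthreads, not Python). The names `counter`, `worker`, and `run_shared_counter` are hypothetical. The point it illustrates is that every thread reads and writes a variable declared outside the threads, so access has to be synchronized.

```python
import threading

counter = 0               # shared variable, declared outside the threads
lock = threading.Lock()   # protects counter against concurrent updates


def worker(n_increments):
    """Each thread adds to the same shared counter."""
    global counter
    for _ in range(n_increments):
        with lock:        # one thread at a time updates the shared value
            counter += 1


def run_shared_counter(n_threads=4, n_increments=1000):
    """Start n_threads workers and return the final shared counter."""
    global counter
    counter = 0
    threads = [threading.Thread(target=worker, args=(n_increments,))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()          # wait for every thread to finish
    return counter


if __name__ == "__main__":
    print(run_shared_counter())  # 4000
```

Note that CPython threads do not run Python bytecode in parallel on multiple cores in standard builds; the sketch shows the shared-variable programming model, not a speedup.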

Distributed Memory Multiprocessor

Computers connected through an interconnection network. Each computer consists of a processor and its local memory; the computers exchange messages over the network.

Interconnection Networks
• Limited and exhaustive interconnections
• 2- and 3-dimensional meshes
• Hypercube (not now common)
• Using switches:
  • Crossbar
  • Trees
  • Multistage interconnection networks

Two-dimensional array (mesh)
• Each node is a computer/processor joined to its neighbors by links.
• Also three-dimensional: used in some large high performance systems.
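To make the mesh connectivity concrete, here is a small Python sketch that lists the nodes directly linked to a given node. It assumes a plain mesh without wraparound links (that is, not a torus), which the slide does not specify; the function name `mesh_neighbors` is illustrative.

```python
def mesh_neighbors(row, col, n_rows, n_cols):
    """Return the (row, col) positions linked to a node in a 2-D mesh.

    Assumption: no wraparound links, so edge and corner nodes
    have fewer than four neighbors.
    """
    candidates = [(row - 1, col), (row + 1, col),
                  (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates
            if 0 <= r < n_rows and 0 <= c < n_cols]


if __name__ == "__main__":
    print(mesh_neighbors(0, 0, 4, 4))  # [(1, 0), (0, 1)]
    print(mesh_neighbors(1, 1, 4, 4))  # [(0, 1), (2, 1), (1, 0), (1, 2)]
```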

Three-dimensional hypercube

IBM Blue Gene

Tree
• The root and the internal nodes are switch elements joined by links; the processors sit at the leaves.

Four-dimensional hypercube
• Hypercubes were popular in the 1980s and 1990s, but not now.
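The slides show hypercubes only as figures. As a hedged illustration of the usual labeling (nodes numbered in binary, with a link between any two nodes whose labels differ in exactly one bit), the Python sketch below lists a node's neighbors and the minimum hop count between two nodes. The function names are hypothetical.

```python
def hypercube_neighbors(node, d):
    """Neighbors of `node` in a d-dimensional hypercube.

    Flipping each of the d address bits in turn gives the d neighbors.
    """
    return [node ^ (1 << k) for k in range(d)]


def hypercube_hops(a, b):
    """Minimum number of links between nodes a and b.

    This is the number of bit positions in which the labels differ.
    """
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    print(hypercube_neighbors(0b0101, 4))  # [4, 7, 1, 13]
    print(hypercube_hops(0b0000, 0b1111))  # 4
```

In a four-dimensional hypercube every node therefore has four links, and no two nodes are more than four hops apart.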

Multistage Interconnection Network
Example: Omega network
• Built from 2 × 2 switch elements (straight-through or crossover connections).
• The eight inputs and the eight outputs are each labeled 000 through 111.
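The slide gives only the picture of the Omega network. The sketch below shows one common textbook way to route through such a network, destination-tag routing: before each stage the lines are permuted by a perfect shuffle (a one-bit left rotation of the line number), and the 2 × 2 switch then sends the message to its upper or lower output according to the next destination bit, most significant bit first. That this is the scheme the slides intend is an assumption, and the function name `omega_route` is illustrative.

```python
def omega_route(src, dst, n_bits):
    """Trace a message through an Omega network with 2**n_bits inputs.

    Returns the line number occupied after each stage, starting with src.
    Assumes destination-tag routing with a perfect shuffle before each stage.
    """
    mask = (1 << n_bits) - 1
    pos = src
    path = [pos]
    for stage in range(n_bits):
        # Perfect shuffle: rotate the line number left by one bit.
        pos = ((pos << 1) | (pos >> (n_bits - 1))) & mask
        # 2 x 2 switch: pick upper (0) or lower (1) output from the
        # destination bit for this stage, most significant bit first.
        bit = (dst >> (n_bits - 1 - stage)) & 1
        pos = (pos & ~1) | bit
        path.append(pos)
    return path


if __name__ == "__main__":
    print(omega_route(0b101, 0b010, 3))  # [5, 2, 5, 2]
```

After `n_bits` stages every original source bit has been shifted out and replaced by a destination bit, so the message always ends on line `dst`.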

Crossbar switch: switches connect the processors to the memories.

Message-Passing
• Distributed memory parallel machines are usually programmed via message passing.
• Industry standard: MPI
• The computers, each a processor with its local memory, exchange messages over the interconnection network.
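The message-passing style can be sketched without an MPI installation. The example below is not MPI: it emulates the model with Python threads that share nothing except `queue.Queue` mailboxes, so that each "computer" keeps its data in local variables and can only learn another's data by receiving a message. All names (`node`, `run_message_passing`, the rank numbering) are hypothetical, chosen to resemble common MPI usage.

```python
import queue
import threading


def node(rank, inboxes, done):
    """One emulated computer: local data plus send/receive only."""
    local_value = (rank + 1) * 10        # lives in this node's local memory
    if rank == 0:
        total = local_value
        # Receive one message from every other node and accumulate.
        for _ in range(len(inboxes) - 1):
            _sender, value = inboxes[0].get()
            total += value
        done.put(total)                  # report the result to the caller
    else:
        # Send the local value to node 0 as a (sender, value) message.
        inboxes[0].put((rank, local_value))


def run_message_passing(n_nodes=4):
    """Sum one value per node at node 0 using messages only."""
    inboxes = [queue.Queue() for _ in range(n_nodes)]
    done = queue.Queue()
    threads = [threading.Thread(target=node, args=(r, inboxes, done))
               for r in range(n_nodes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done.get()


if __name__ == "__main__":
    print(run_message_passing())  # 100
```

A real MPI program would replace the queues with send and receive calls from an MPI library and would run each rank as a separate process, possibly on a separate computer.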

Flynn’s Classifications

Taxonomy of parallel machines
• Distributed memory, MIMD: clusters
• Distributed memory, SIMD: CM/2 (legacy)
• Shared memory, MIMD: multi-core
• Shared memory, SIMD: GPU
