“I think there is a world market for maybe five computers.” Thomas Watson Senior, Chairman of IBM, 1943 ICSS531 - Parallel Architecture
Architecture Classification
• SISD: Single Instruction, Single Data
• SIMD: Single Instruction, Multiple Data
• MIMD: Multiple Instruction, Multiple Data
• MISD: Multiple Instruction, Single Data
Vector Processors
• The earliest parallel computers
• Pipeline design (MISD)
• Typically viewed as SIMD
• Important machines include
  • Cray-1, etc.
  • CDC Cyber 205
  • IBM 3090 Vector
Seymour Cray (1925-1996)
• Packaging, including heat removal
• High-level bit plumbing: getting the bits from I/O into memory, through a processor, back to memory, and out to I/O
• Parallelism
• Programming: O/S and compiler
• Problems being solved
Cray’s Contributions
• Creative and productive throughout his entire career, 1951-1996
• Creator and undisputed designer of supercomputers from 1960
• Circuits, packaging, and cooling…
• “The mini” as a peripheral computer
• Established the template for vector supercomputer architecture
Cray’s Attitudes
• Did not adopt paging and segmentation because they slowed computation
• In general, would cut his losses and move on when an approach didn’t work…
• Ignored CMOS and microprocessors until the SRC Company design
• Went against conventional wisdom
Computers
• CDC 6600 (6xxx Series)
  • Employed “peripheral processors”
  • Influenced architecture probably more than any other computer
• Cray 1 (1/M, 1/S, XMP, YMP, C90, T90)
• Cray 2 GaAs… and Cray 3, Cray 4
Cray XMP/4
Cray 2
Vector Processing
• Vector processors have high-level operations that work on linear arrays of numbers: vectors
Styles of Vector Architectures
• Memory-memory vector processors
  • All vector operations are memory to memory
• Vector-register processors
  • All vector operations are between vector registers (except vector load and store)
  • Vector equivalent of a load-store architecture
  • Includes all vector machines since the late 1980s
  • Cray, Convex, Fujitsu, Hitachi, NEC
Components of Vector Processor
• Vector Register
  • Fixed-length bank holding a single vector
  • Has at least 2 read ports and 1 write port
  • Typically 8-32 vector registers, each holding 64-128 64-bit elements
• Vector Functional Units
  • Fully pipelined, start a new operation every clock
  • Typically 4-8 FUs: FP add, FP mult, FP reciprocal, integer add, logical, shift
• Scalar Registers
  • Single element for FP scalar or address
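To make "fully pipelined, start a new operation every clock" concrete, here is a minimal sketch of the cycle count for one vector operation. The function names and the pipeline depth of 6 are hypothetical, chosen only for illustration, and the model ignores memory stalls and chaining.

```python
def pipelined_cycles(n_elements, pipeline_depth):
    """Cycles for one vector operation on a fully pipelined unit.

    The first result appears after `pipeline_depth` cycles; after that,
    one result completes every clock.
    """
    return pipeline_depth + n_elements - 1


def unpipelined_cycles(n_elements, pipeline_depth):
    """Cycles if each element had to finish before the next one started."""
    return pipeline_depth * n_elements


# Assumed values: a 64-element vector register and a 6-stage FP unit.
print(pipelined_cycles(64, 6), unpipelined_cycles(64, 6))
```

Under these assumptions the pipelined unit approaches one result per clock as the vector gets longer, which is why long vector registers pay off.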
Vector-Register Architecture
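Y = a * X + Y

Scalar code:
      ld    f0,a          ; load scalar a
      addi  r4,rx,#512    ; last address to load (64 elements x 8 bytes)
loop: ld    f2,0(rx)      ; load X(i)
      multd f2,f0,f2      ; a * X(i)
      ld    f4,0(ry)      ; load Y(i)
      addd  f4,f2,f4      ; a * X(i) + Y(i)
      sd    0(ry),f4      ; store into Y(i)
      addi  rx,rx,#8      ; increment index to X
      addi  ry,ry,#8      ; increment index to Y
      sub   r20,r4,rx     ; compute bound
      bnez  r20,loop      ; check if done

Vector code:
      ld    f0,a          ; load scalar a
      lv    v1,rx         ; load vector X
      multv v2,f0,v1      ; vector-scalar multiply
      lv    v3,ry         ; load vector Y
      addv  v4,v2,v3      ; add
      sv    ry,v4         ; store the result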
Y = a * X + Y: vector code with both loads ahead of the arithmetic
      ld    f0,a          ; load scalar a
      lv    v1,rx         ; load vector X
      lv    v3,ry         ; load vector Y
      multv v2,f0,v1      ; vector-scalar multiply
      addv  v4,v2,v3      ; add
      sv    ry,v4         ; store the result
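The contrast between the scalar loop and the vector sequence can be sketched in Python. This is an illustration only: `MVL = 64` is an assumed maximum vector length (matching 512 bytes of 8-byte elements), the function names are hypothetical, and the list comprehensions merely stand in for single vector instructions.

```python
MVL = 64  # assumed maximum vector length (elements per vector register)


def daxpy_scalar(a, x, y):
    """One multiply-add per loop iteration, like the scalar loop."""
    out = list(y)
    for i in range(len(x)):
        out[i] = a * x[i] + out[i]
    return out


def daxpy_vector(a, x, y):
    """Handle up to MVL elements per step, one 'vector instruction' per line."""
    out = list(y)
    for start in range(0, len(x), MVL):
        v1 = x[start:start + MVL]              # lv    v1,rx
        v2 = [a * e for e in v1]               # multv v2,f0,v1
        v3 = out[start:start + MVL]            # lv    v3,ry
        v4 = [p + q for p, q in zip(v2, v3)]   # addv  v4,v2,v3
        out[start:start + len(v4)] = v4        # sv    ry,v4
    return out


x = [float(i) for i in range(100)]
y = [1.0] * 100
print(daxpy_vector(2.0, x, y) == daxpy_scalar(2.0, x, y))
```

The outer loop in `daxpy_vector` only runs when the data is longer than one vector register; for 64 elements the whole computation is the five vector steps, with no loop overhead or branch.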
CM2
Basic Organization
• Host sends commands and data to microcontroller
• Microcontroller broadcasts control signals and data to the processor array
• Microcontroller collects data from the processor array
[Diagram: Host Computer, Microcontroller, CM Processors and Memories]
CM Processors and Memories
• Processors and memories are 1 bit wide; memory is bit-addressable
• Operation is bit-serial
• Fields may be any number of bits and may start anywhere
• Context bit (flag) of a processor determines whether the processor is active
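A small sketch may help picture bit-serial operation and the context flag. It is a software model only, not CM microcode: the function names, the little-endian bit lists, and the 8-bit field width in the usage lines are all assumptions made for illustration.

```python
def to_bits(value, width):
    """Little-endian list of `width` bits for a non-negative integer."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def bit_serial_add(a_bits, b_bits):
    """Add two equal-width fields one bit per step, as a 1-bit ALU would."""
    carry = 0
    out = []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return out  # the final carry is dropped: the result wraps to the field width


def simd_add(xs, ys, context, width):
    """Broadcast one add to all processors; only active ones store a result."""
    result = []
    for x, y, active in zip(xs, ys, context):
        if active:
            total = bit_serial_add(to_bits(x, width), to_bits(y, width))
            result.append(from_bits(total))
        else:
            result.append(x)  # inactive processor keeps its old value
    return result


print(simd_add([1, 2, 3], [10, 20, 30], context=[1, 0, 1], width=8))
```

Because the field width is just a loop bound here, the same routine handles 3-bit or 37-bit fields, which mirrors the point that fields may be any number of bits.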
Programming Languages
• PARIS - PArallel Instruction Set, similar to assembly language
• *LISP - Common Lisp extension with explicit parallel operations
• C* - C extension with explicit parallel data, implicit parallel operations
• CM-Fortran - Fortran 90 variant implemented on the CM
CM2
• The heart of the CM2 is the parallel processing unit, which consists of
  • Up to 64K processors
    • Each processor has up to 128 KB of RAM
    • Processors are bit-serial!
  • An interprocessor communications network
  • One or more sequencers
  • An interface to one or more front-end computers
  • Zero or more I/O controllers and/or framebuffers
CM2 System Organization
[Diagram: Front End, Nexus, Sequencers 0-3, and four groups of Connection Machine Processors]
Interprocessor Network
• Each node of the network is a cluster (“chip”)
  • 16 data processors on the chip
  • Memory
  • One router node
• The nodes are connected using a 12D hypercube
  • 4096 nodes, each directly connected to 12 other nodes
  • Thus the maximum size of a CM is 16 times 4096, or 64K processors
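The hypercube arithmetic can be checked with a short sketch. It assumes the usual binary labeling, in which two nodes are linked exactly when their addresses differ in one bit; the constant and function names are hypothetical.

```python
DIMENSIONS = 12          # 12D hypercube of router nodes
PROCESSORS_PER_NODE = 16  # data processors per chip


def neighbors(node, dims=DIMENSIONS):
    """Nodes whose address differs from `node` in exactly one bit."""
    return [node ^ (1 << d) for d in range(dims)]


def hops(src, dst):
    """Minimum number of links between two nodes (Hamming distance)."""
    return bin(src ^ dst).count("1")


total_nodes = 1 << DIMENSIONS                      # 2**12 nodes
total_processors = total_nodes * PROCESSORS_PER_NODE

print(total_nodes, len(neighbors(0)), total_processors, hops(0, total_nodes - 1))
```

In this model a d-dimensional hypercube gives every node d direct neighbors, and no two nodes are more than d links apart.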
Arith.cs

/* Simple arithmetic demonstration - file arith.cs */
#include <stdio.h>
#include <stdlib.h>      /* rand() */
#define NPROCS 1048576

shape [NPROCS]A;         /* one-dimensional shape with NPROCS positions */
float:A s, x, y;         /* parallel floats, one element per position */

void main()
{
    int k, i;

    with ( A ) {
        x = (rand()/1.0e7) - 60.0;
        y = (rand()/1.0e7) - 60.0;
        for ( i = 0; i < 3; i++ ) {      /* three timed runs */
            CM_start_timer(1);
            with ( A )
                for ( k = 0; k < 200; k++ )
                    s = x * y;           /* elementwise parallel multiply */
            CM_stop_timer(1);
            CM_reset_timer();
        }
    }
}
CM5