Chapter 2 Styles of Architecture
2.1 Parallel Processing Models and Terminology • Terminology • Single instruction stream • Single data stream
2.1 Parallel Processing Models and Terminology (continued) • Motivation of Parallel Processing • To enhance the throughput of systems by dedicating each processor to a particular function and by operating as many processors simultaneously as possible.
2.1.1 Effect of Application on the Architecture • Degree of parallelism: the number of computations that can be executed concurrently. • For the processing to be most efficient, the degree of parallelism offered by the hardware should match (or exceed) the degree of parallelism of the application.
2.1.2 Application Characteristics • Four application characteristics of parallel processing to be considered in evaluating performance: • granularity: coarse, medium, fine • degree of parallelism: a measure of the number of threads • level of parallelism: procedure, task, instruction, operation, microcode • data dependency: the result of precedence constraints between operations.
2.1.2 Application Characteristics (continued) • F = A*B + C*D • What kinds of operations are needed? • How many operations are required? • What is the degree of parallelism? • Why is the maximum parallelism not achieved?
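A minimal sketch of this example (operand values are illustrative, not from the text): the two multiplications are independent and can execute concurrently, giving a degree of parallelism of 2, while the final addition depends on both products, so all three operations can never execute at once.

# The two multiplications have no data dependency and could be issued
# in the same step on two processors; the addition must wait for both.
A, B, C, D = 2, 3, 4, 5     # illustrative operand values
t1 = A * B                  # step 1, processor 1
t2 = C * D                  # step 1, processor 2
F = t1 + t2                 # step 2: data dependency on t1 and t2
print(F)                    # 26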
2.1.3 Performance • Trivial parallelism • Overhead for parallel execution • communication between tasks • allocation of tasks to processors • control of execution of multiple tasks • Maximizing the throughput of the system requires a compromise between parallelism and its overhead.
2.1.4 Processing Paradigms • Completely serial: degree of parallelism = 1 (Figure 2.1) • Serial-parallel-serial without data dependencies (supervisor/worker model): easy (or trivial) parallelism (Figure 2.2) • Serial-parallel-serial with data dependencies: communication-bound model (Figure 2.3)
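A hedged sketch of the serial-parallel-serial (supervisor/worker) paradigm without data dependencies, assuming Python's standard process pool stands in for the worker processors: the supervisor splits the work serially, the workers run independently in parallel, and the supervisor combines the results serially.

from concurrent.futures import ProcessPoolExecutor

def work(chunk):
    # Independent piece of work: no communication with other workers.
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    chunks = [range(0, 100), range(100, 200), range(200, 300)]   # serial split
    with ProcessPoolExecutor() as pool:                          # parallel phase
        partials = list(pool.map(work, chunks))
    print(sum(partials))                                         # serial combine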
2.2 Taxonomy • Milutinovic's taxonomy, based on what drives the computational flow: • control-driven: RISC, CISC, HLL architectures • data-driven: dataflow architectures • demand-driven: reduction architectures
2.2 Taxonomy (continued) • Flynn’s taxonomy • SISD (uniprocessor) • SIMD (array processors) • MISD (not practical) • MIMD (multiprocessor system)
2.2.1 SIMD (Figure 2.5) • CP (Central Processor): a full-fledged processor • retrieves instructions from memory, sends instructions to processors, and executes control instructions. • P1 through Pn • execute the same instructions, each on its own data stream.
2.2.1 SIMD (continued) • The most important characteristic of SIMD is that the arithmetic processors are synchronized at the instruction level (data-parallel architecture). • SIMDs are sometimes called array processors because computations involving arrays of data are natural targets for this class of architecture.
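A minimal sketch of the data-parallel idea (array contents are illustrative; NumPy is used only as a stand-in for the lockstep hardware): one "add" instruction is applied to many data elements, one element per arithmetic processor.

import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8])        # one operand per processor P1..P8
y = np.array([10, 20, 30, 40, 50, 60, 70, 80])
z = x + y          # the same instruction, executed on every data stream
print(z)           # [11 22 33 44 55 66 77 88]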
2.2.2 MIMD • Figure 2.6 shows a shared memory MIMD structure. • Advantages of MIMDs • a high throughput if processing can be broken into parallel streams • a degree of fault tolerance • possibility of dynamic reconfiguration
2.2.2 MIMD (continued) • Major issues in the design of MIMD systems • processor scheduling • processor synchronization • interconnection network • overhead • partitioning
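A hedged sketch of one of these issues, processor synchronization on a shared-memory MIMD (the shared counter and thread count are illustrative): several threads update one shared variable, and a lock keeps concurrent updates from being lost.

import threading

counter = 0                      # data in shared memory
lock = threading.Lock()          # synchronization primitive

def work(n):
    global counter
    for _ in range(n):
        with lock:               # serialize the read-modify-write
            counter += 1

threads = [threading.Thread(target=work, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)                   # 40000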
2.3 Skillicorn’s Taxonomy • computer system models • computational model • abstract machine model • performance model • implementation model • The abstract machine model forms the top level in Skillicorn’s taxonomy.
2.3 Skillicorn’s Taxonomy (continued) • Four types of functional units used to construct an abstract machine • IP (instruction processor) • DP (data processor) • DM and IM (data memory and instruction memory hierarchies) • A switch
2.3 Skillicorn’s Taxonomy (continued) • Table 2.1 shows a set of possible architectures • Class 1-5: reduction/dataflow architectures • Class 6: von Neumann uniprocessor • Class 7-10: SIMDs • Class 11-12: MISDs • Class 13-20: MIMDs • Class 21-28: Unexplored MIMDs
2.3.1 SISD • Figure 2.7 shows the abstract machine level model of an SISD architecture. • IP • determines the address of the instruction in the IM • informs the DP of the operation required, determines the addresses of the operands, and passes them to the DP. • DP • gets the results from memory and provides them to the IP • Memory Hierarchy • retains the next pieces of data required by the processor. • Switch: not needed
2.3.1 SISD (continued) • Figure 2.8 shows the implementation-level details of the SISD structure. • IP • DP • (c) two operations in parallel
Connections • 1-to-1 • n-to-n: a 1-to-1 connection duplicated n times. • 1-to-n: one FU connects to all n devices of another set. • n-by-n: any one of the n devices can be connected to any device on the other side.
How to enhance the performance of SISD • Optimizing state diagrams • Allowing more than one state to be active at a time.
2.3.2 SIMD • Figure 2.9 shows three SIMD models. • All models have a 1-to-n switch between a single IP and the DPs • Three types • A single memory hierarchy • Separate IM and DM; the DP-to-DP interconnections are n-by-n and the DP-to-DM interconnections are n-to-n. • No direct connection between DPs; the DP-to-DM interconnections are n-by-n.
2.3.3 MIMD • Figure 2.10 shows two MIMD models • tightly coupled • UMA • Shared memory • Ease of programming • loosely coupled • NUMA • Private memory • Message passing • Better scalability
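A minimal sketch of the loosely coupled (message-passing) style, assuming Python processes with queues stand in for processors with private memories: the workers never share variables and cooperate only by exchanging messages.

from multiprocessing import Process, Queue

def worker(inbox, outbox):
    chunk = inbox.get()                       # receive work into private memory
    outbox.put(sum(x * x for x in chunk))     # send a partial result back

if __name__ == "__main__":
    inbox, outbox = Queue(), Queue()
    procs = [Process(target=worker, args=(inbox, outbox)) for _ in range(2)]
    for p in procs:
        p.start()
    inbox.put([1, 2, 3])                      # messages, not shared variables
    inbox.put([4, 5, 6])
    total = outbox.get() + outbox.get()
    for p in procs:
        p.join()
    print(total)                              # 91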
2.3.4 Reduction Architectures • Figure 2.11 shows the abstract machine model for a reduction machine (demand-driven). • Reduction architectures employ either a string or graph reduction scheme. • Figures 2.12 (a) and (b) show an example of evaluating the expression a = (d+e) + (f*g)
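A hedged sketch of the demand-driven idea behind this example (the operand values are illustrative): nothing is computed until the value of a is demanded, and that demand propagates down the graph, reducing each subexpression in turn.

d, e, f, g = 1, 2, 3, 4

# Each node of the graph is a suspension (a zero-argument function);
# calling it is the "reduction" of that node.
plus_de  = lambda: d + e
times_fg = lambda: f * g
a        = lambda: plus_de() + times_fg()

print(a())   # demanding a reduces the whole graph: (1+2) + (3*4) = 15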
2.3.5 Dataflow Architectures • Figures 2.11 and 2.14 represent the abstract machine models for a dataflow machine (data-driven). • Static: Figure 2.11 • Dynamic: Figure 2.14
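A minimal sketch of the data-driven execution rule (the graph and token values are illustrative, not from the text): a node fires as soon as data tokens are present on all of its inputs, independent of any program counter.

# Dataflow graph for f = a*b + c*d: each node names its input tokens.
graph = {
    "m1":  {"op": lambda x, y: x * y, "inputs": ["a", "b"]},
    "m2":  {"op": lambda x, y: x * y, "inputs": ["c", "d"]},
    "add": {"op": lambda x, y: x + y, "inputs": ["m1", "m2"]},
}
tokens = {"a": 2, "b": 3, "c": 4, "d": 5}    # initial data tokens

fired = set()
while len(fired) < len(graph):
    for name, node in graph.items():
        if name not in fired and all(i in tokens for i in node["inputs"]):
            # Firing rule: all operand tokens have arrived.
            tokens[name] = node["op"](*(tokens[i] for i in node["inputs"]))
            fired.add(name)

print(tokens["add"])   # 2*3 + 4*5 = 26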
2.4 Duncan’s Taxonomy • This taxonomy classifies parallel architectures into the following classes: • Synchronous architectures • Vector processors • SIMD architectures (processor arrays, associative processors) • Systolic architectures • MIMD architectures (distributed memory, shared memory) • MIMD paradigms (MIMD/SIMD, Dataflow, Reduction, Wavefront)
2.4.1 Vector Processors • Figure 2.15 shows a vector processor architecture; note the chaining of the pipelined functional units. • register-based (Figure 2.16) • memory-based • One SIMD variant, bit-plane array processing, is shown in Figure 2.17.
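A hedged sketch of the kind of vector operation that chaining accelerates (names and values are illustrative; NumPy evaluates whole vectors rather than streaming elements): Y = A*B + C, where on a chained machine the add pipeline would consume multiply results as they emerge from the multiply pipeline.

import numpy as np

A = np.array([1.0, 2.0, 3.0, 4.0])
B = np.array([5.0, 6.0, 7.0, 8.0])
C = np.array([0.5, 0.5, 0.5, 0.5])

Y = A * B + C      # element-wise multiply feeding an element-wise add
print(Y)           # [ 5.5 12.5 21.5 32.5]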
2.4.2 Associative Processors • An associative processor is an SIMD whose main component is an associative memory (AM). • AMs are used in fast search operations. • Figure 2.18 shows the structure and operations of an AM. • An associative processor (Figure 2.19) is formed by including an ALU at each word of an AM.
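A minimal sketch of the associative-memory search idea (word width, contents, key, and mask are illustrative): every stored word is compared with the key under a mask, conceptually all at once, and a response vector marks the words that match.

words = [0b1010, 0b1100, 0b1011, 0b1010]   # memory contents, one word per row
key   = 0b1010
mask  = 0b1110                             # compare only the masked bit positions

# In hardware all comparisons happen in parallel; here a list comprehension
# stands in for the per-word match logic.
response = [(w & mask) == (key & mask) for w in words]
print(response)    # [True, False, True, True]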
2.4.3 Systolic Architectures • Systolic arrays are pipelined multiprocessors. (Figure 2.22) • Figure 2.23 shows a systolic matrix multiplication.
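A hedged sketch of the timing behind a systolic matrix multiplication (the 2x2 matrices are illustrative, and the loops only model when operands reach each processing element, not the physical data movement): with skewed inputs, PE (i, j) sees a[i][k] and b[k][j] at beat t = i + j + k and accumulates c[i][j].

N = 2
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[0] * N for _ in range(N)]      # one accumulator per PE (output-stationary)

for t in range(3 * N - 2):           # total number of beats
    for i in range(N):
        for j in range(N):
            k = t - i - j
            if 0 <= k < N:           # operands reach PE (i, j) at this beat
                C[i][j] += A[i][k] * B[k][j]

print(C)    # [[19, 22], [43, 50]]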
2.4.4 MIMD/SIMD Architectures, 2.4.5 Wavefront Architectures • Figure 2.24 shows an MIMD/SIMD operation. • Wavefront architectures combine systolic data pipelining with an asynchronous dataflow execution paradigm.