EKT303/4 Superscalar vs Super-pipelined
Scalar to SUPERSCALAR
• The simplest processors are scalar processors. Each instruction executed by a scalar processor typically manipulates one or two data items at a time. By contrast, each instruction executed by a vector processor operates simultaneously on many data items.
• A superscalar processor is a mixture of the two. Each instruction processes one data item, but there are multiple functional units within each CPU, so multiple instructions can process separate data items concurrently.
• Superscalar CPU design emphasizes improving the accuracy of the instruction dispatcher and allowing it to keep the multiple functional units in use at all times.
Scalar to SUPERSCALAR
• Early superscalar CPUs had 2 ALUs and a single FPU; a more recent design such as the PowerPC 970 includes 4 ALUs, 2 FPUs, and 2 SIMD units. If the dispatcher is ineffective at keeping all of these units fed with instructions, the performance of the system will suffer.
• Usually, a superscalar processor sustains an execution rate in excess of one instruction per machine cycle. But merely processing multiple instructions concurrently does not make an architecture superscalar, since pipelined, multiprocessor, and multicore architectures also achieve that, by different methods.
• In a superscalar CPU the dispatcher reads instructions from memory and decides which ones can be run in parallel, dispatching each to one of the several functional units (FUs) contained inside a single CPU. A superscalar processor can therefore be envisioned as having multiple parallel pipelines, each of which processes instructions simultaneously from a single instruction thread.
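The dispatcher's job described above can be sketched in a few lines of Python. This is a minimal illustration, not a model of any real CPU: the `Instr` record, the `dispatch` function, the register names, and the assumption that every instruction executes in one cycle are all hypothetical. Each cycle it issues instructions in program order until it runs out of issue slots, runs out of a free functional unit of the required type, or meets an instruction that depends on one issued in the same cycle.

```python
from collections import namedtuple

# Hypothetical instruction record: label, FU type, destination register, source registers.
Instr = namedtuple("Instr", "name unit dest srcs")

def dispatch(program, units, width):
    """Group instructions into per-cycle issue packets, in program order.

    Assumes single-cycle execution, so results of earlier cycles are always ready.
    """
    cycles = []
    i = 0
    while i < len(program):
        free = dict(units)   # functional units still unused in this cycle
        written = set()      # registers written by instructions issued this cycle
        packet = []
        while i < len(program) and len(packet) < width:
            ins = program[i]
            if free.get(ins.unit, 0) == 0:
                break        # no free FU of the required type
            if written & set(ins.srcs) or ins.dest in written:
                break        # depends on (or overwrites) a result from this cycle
            free[ins.unit] -= 1
            written.add(ins.dest)
            packet.append(ins.name)
            i += 1
        if not packet:
            raise ValueError(f"no functional unit can execute {program[i].name}")
        cycles.append(packet)
    return cycles

program = [
    Instr("i1", "alu", "r1", ("r2", "r3")),
    Instr("i2", "fpu", "f1", ("f2", "f3")),
    Instr("i3", "alu", "r4", ("r1", "r5")),  # needs r1 from i1
    Instr("i4", "alu", "r6", ("r4", "r1")),  # needs r4 from i3
    Instr("i5", "alu", "r7", ("r2", "r2")),
]
packets = dispatch(program, {"alu": 2, "fpu": 1}, width=3)
for cycle, packet in enumerate(packets, start=1):
    print(cycle, packet)
```

With 2 ALUs and 1 FPU, the five instructions issue in three cycles rather than five; the dependencies on `r1` and `r4` are what keep the functional units from being fully used.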
SUPERSCALAR
• A superscalar CPU architecture implements a form of parallelism called instruction-level parallelism within a single processor. This allows faster CPU throughput than would otherwise be possible at a given clock rate.
• A superscalar processor executes more than one instruction during a clock cycle by simultaneously dispatching multiple instructions to different FUs on the processor. Each FU is not a separate CPU core but an execution resource within a single CPU, such as an ALU, a bit shifter, or a multiplier.
• In Flynn’s taxonomy, a single-core superscalar processor is classified as an SISD processor (single instruction stream, single data stream), while a multi-core superscalar processor is classified as an MIMD processor.
SUPERSCALAR vs SUPER-PIPELINED
• While a superscalar CPU is typically also pipelined, pipelining and superscalar architecture are considered different performance enhancement techniques.
• The superscalar technique is traditionally associated with several identifying characteristics (within a given CPU core):
  • Instructions are issued from a sequential instruction stream
  • CPU hardware dynamically checks for data dependencies between instructions at run time (versus software checking at compile time)
  • The CPU processes multiple instructions per clock cycle
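The run-time dependency check mentioned above amounts to comparing the registers that two instructions read and write. The sketch below is an assumption-laden illustration: the dictionary layout and the `hazards` function are hypothetical, and real hardware does this with comparators on register numbers rather than in software.

```python
def hazards(earlier, later):
    """Return the data hazards that arise if `later` follows `earlier` in program order.

    Each instruction is a dict with a "dest" register (or None) and a tuple of "srcs".
    """
    found = set()
    if earlier["dest"] is not None and earlier["dest"] in later["srcs"]:
        found.add("RAW")  # read after write: later needs earlier's result
    if later["dest"] is not None and later["dest"] in earlier["srcs"]:
        found.add("WAR")  # write after read: later overwrites an input of earlier
    if earlier["dest"] is not None and earlier["dest"] == later["dest"]:
        found.add("WAW")  # write after write: both write the same register
    return found

add = {"dest": "r1", "srcs": ("r2", "r3")}
sub = {"dest": "r4", "srcs": ("r1", "r5")}
mov = {"dest": "r2", "srcs": ("r6",)}
xor = {"dest": "r8", "srcs": ("r9", "r9")}

print(hazards(add, sub))  # sub reads r1, which add writes
print(hazards(add, mov))  # mov overwrites r2, which add reads
print(hazards(add, xor))  # no shared registers
```

Only a pair with no hazards (such as `add` and `xor` here) can safely be issued in the same cycle without further hardware support such as register renaming.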
SUPERSCALAR vs SUPER-PIPELINED
Figure: Simple superscalar pipeline. By fetching and dispatching two instructions at a time, a maximum of two instructions per cycle can be completed. (IF = Instruction Fetch, ID = Instruction Decode, EX = Execute, MEM = Memory access, WB = Register write back, i = Instruction number, t = Clock cycle [i.e., time])
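The timing diagram described in this caption can be regenerated with a short script. This is a sketch of an ideal, stall-free pipeline only; the function names `schedule` and `render` and the text layout are choices made here for illustration.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def schedule(n_instr, width=2):
    """Map each instruction to {stage: cycle} for an ideal pipeline issuing `width` per cycle."""
    table = []
    for i in range(n_instr):
        start = i // width + 1  # 1-based cycle in which instruction i is fetched
        table.append({stage: start + k for k, stage in enumerate(STAGES)})
    return table

def render(table):
    """Return a text diagram with one row per instruction and one column per clock cycle."""
    last = max(row["WB"] for row in table)
    lines = []
    for i, row in enumerate(table, start=1):
        cells = [""] * last
        for stage, cycle in row.items():
            cells[cycle - 1] = stage
        lines.append(f"i{i:<2} " + " ".join(f"{c:<3}" for c in cells).rstrip())
    return "\n".join(lines)

print(render(schedule(6, width=2)))
```

In this ideal model six instructions finish in 7 cycles with two-wide issue, compared with 10 cycles when `width=1`.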
SUPERSCALAR - Drawbacks
The performance improvement available from superscalar techniques is limited in three key areas:
• The degree of intrinsic parallelism in the instruction stream: instructions that depend on each other's results, or that require the same computational resources from the CPU, cannot be issued together.
• The complexity and time cost of the dispatcher and its associated dependency-checking logic.
• The processing of branch instructions.
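The first limit, intrinsic parallelism, can be estimated from the dependency structure of a code sequence. The sketch below assumes unit-latency instructions and unlimited functional units, and the `ilp_bound` function and its input format are hypothetical: it divides the instruction count by the length of the longest dependency chain, which bounds the average number of instructions per cycle that any issue width could achieve.

```python
def ilp_bound(deps):
    """Upper bound on average instructions per cycle for a dependency graph.

    deps[i] lists the indices (all smaller than i) of the instructions that
    instruction i depends on. Assumes every instruction takes one cycle.
    """
    depth = []  # depth[i] = earliest cycle in which instruction i can execute
    for preds in deps:
        depth.append(1 + max((depth[p] for p in preds), default=0))
    critical = max(depth, default=0)  # length of the longest dependency chain
    return len(deps) / critical if critical else 0.0

chain = [[], [0], [1], [2]]          # each instruction needs the previous result
independent = [[], [], [], []]       # no dependencies at all
mixed = [[], [], [0, 1], [2], []]    # two inputs feed a short chain, plus one free instruction

print(ilp_bound(chain), ilp_bound(independent), ilp_bound(mixed))
```

A fully dependent chain gives a bound of 1.0, so extra functional units cannot help it, while fully independent code scales with the number of units.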
SUPER-PIPELINED
• In contrast to a superscalar processor, a superpipelined processor splits the main computational pipeline into more stages. Each stage is simpler (does less work), so the clock speed can be increased.
• However, the latency, measured in clock cycles, for any instruction to complete has increased from 4 cycles in early RISC processors to 8 or more.
• Benefit: the major benefit of superpipelining is the increase in the number of instructions that can be in the pipeline at one time, and hence in the level of parallelism.
• Drawback: the larger number of instructions "in flight" (i.e., in some part of the pipeline) at any time increases the potential for data dependencies to introduce stalls. Simulation studies have suggested that a pipeline depth of more than 8 stages tends to be counter-productive.
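The trade-off between a faster clock and more stalls can be made concrete with a toy model. Everything here is an assumption chosen for illustration: the total logic delay, the per-stage latch overhead, the stall fraction, and the idea that a stall costs one cycle per extra pipeline stage. The numbers are not meant to reproduce the 8-stage figure quoted above, only the shape of the curve.

```python
def time_per_instruction(stages, logic_ns=10.0, latch_ns=0.5, stall_fraction=0.05):
    """Average time per instruction (ns) in a toy superpipelining model.

    cycle time = logic delay split across the stages, plus a fixed latch overhead
    CPI        = 1 + stall_fraction * (stages - 1), i.e. stalls grow with depth
    """
    cycle_ns = logic_ns / stages + latch_ns
    cpi = 1.0 + stall_fraction * (stages - 1)
    return cycle_ns * cpi

for depth in (4, 8, 16, 32):
    print(depth, round(time_per_instruction(depth), 3))
```

Under these assumed parameters, deepening the pipeline helps at first, but the gain shrinks and eventually reverses as the stall penalty outgrows the shorter cycle time.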
COMPARISON
• Superscalar machines can issue several instructions per cycle. Superpipelined machines can issue only one instruction per cycle, but their cycle time is shorter than the time required for any operation.
• Both techniques exploit instruction-level parallelism, which is limited in many applications. Superpipelined machines have been reported to offer better performance at lower cost than superscalar machines, although the outcome depends on the workload and the implementation.
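The two issue styles can be compared with a commonly used idealized timing model, sketched below. It assumes a base pipeline of `k` stages, no stalls, and perfectly independent instructions, and it says nothing about hardware cost; the function names are choices made here. Times are measured in base-machine cycles.

```python
import math

def base_time(n_instr, k):
    """Scalar pipeline: the first result after k cycles, then one per cycle."""
    return k + n_instr - 1

def superscalar_time(n_instr, k, m):
    """Degree-m superscalar: m instructions issued together every base cycle."""
    return k + math.ceil(n_instr / m) - 1

def superpipelined_time(n_instr, k, n):
    """Degree-n superpipelined: one instruction issued every 1/n of a base cycle."""
    return k + (n_instr - 1) / n

N, K = 12, 4
print("base          ", base_time(N, K))
print("superscalar   ", superscalar_time(N, K, 3))
print("superpipelined", superpipelined_time(N, K, 3))
```

In this stall-free model the two techniques of equal degree give similar speedups over the base machine, so differences in practice come from the factors the model leaves out, such as dependency stalls, branch handling, and implementation cost.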