LAB: PIPELINING

LAB: PIPELINING Electrical and Computer Engineering University of Cyprus 17-10-2007

PIPELINING • Technique in which the execution of several instructions is overlapped. • Each instruction is broken into several stages. • Stages can operate concurrently PROVIDED WE HAVE SEPARATE RESOURCES FOR EACH STAGE! => each step uses a different functional unit. • Note: execution time for a single instruction is NOT improved. Throughput of several instructions is improved.

Major Pipeline Benefit = Performance • Ideal Performance • Time/instruction = non-piped-time/#stages • This is an asymptote of course, but +10% is commonly achieved Two ways to view the performance mechanism • Reduced CPI (i.e. non-piped to piped change) • Close to 1 instruction/cycle if you’re lucky • Reduced cycle-time (i.e. increasing pipeline depth) • Work split into more stages • Simpler stages result in faster clock cycles

Other Pipeline Benefits • Completely hardware mechanism • All modern machines are pipelined • This was the key technique to advancing performance in the 80’s • In the 90’s the move was to multiple pipelines • Beware, no benefit is totally free/good

Implementation of a RISC (Unpipelined, Multicycle) • Implementation of an integer subset of a RISC architecture that takes at most 5 clock cycles. • Instruction Fetch (IF) • Instruction Decode/Register Fetch (ID) • Execution/Effective Address Calculation (EX) • Memory Access (MEM) • Write-Back (WB)

PC Registers ALU IFtch Dcd Exec Mem WB Stage 1 Stage 2 Stage 3 Stage 4 IM DM Reg Reg ALU Review: Single-cycle Datapath for MIPS Stage 5 Instruction Memory (Imem) Data Memory (Dmem) • Use datapath figure to represent pipeline

Classic 5 Stage Pipeline for a RISC Processor

IFtch Dcd Exec Mem WB Pipelined Execution Representation Time IFtch Dcd Exec Mem WB • To simplify pipeline, every instruction takes same number of steps, called stages • One clock cycle per stage IFtch Dcd Exec Mem WB IFtch Dcd Exec Mem WB IFtch Dcd Exec Mem WB Program Flow

Important Pipeline Characteristics • Latency • Time it takes an instruction to go through the pipe • Latency = # stages * stage-delay • Dominant feature if there are a lot of exceptions… • Throughput • Determined by the rate at which instructions can start/finish • Dominant feature if there are few exceptions

Pipelining Lessons • Pipelining doesn’t help latency (execution time) of single task, it helps throughput of entire workload • Multiple tasks operating simultaneously using different resources • Potential speedup = Number of pipe stages • Time to “fill” pipeline and time to “drain” it reduces speedup • Pipeline rate limited by slowest pipeline stage • Unbalanced lengths of pipe stages also reduces speedup

a d d SignExtend Single Cycle Datapath M u x a d d 4 << 2 PCSrc MemWrite 25:21 ReadReg1 Read Addr P C Readdata Readdata1 Zero ReadReg2 31:0 20:16 A L U Instruc- tion Address Readdata2 M u x MemTo- Reg WriteReg M u x Dmem Imem Regs ALU- con WriteData WriteData 15:11 M u x RegDst ALU- src RegWrite MemRead 15:0 ALUOp

Required Changes to Datapath • Introduce registers to separate 5 stages by putting IF/ID, ID/EX, EX/MEM, and MEM/WB registers in the datapath. • Next PC value is computed in the 3rd step, but we need to bring in next instruction in the next cycle. • Branch address is computed in 3rd stage. With pipeline, the PC value has changed! Must carry the PC value along with instruction.

Changes to Datapath. • For lw instruction, we need write register address at stage 5. But the IR is now occupied by another instruction! So, we must carry the IR destination field as we move along the stages. See connection in fig.

Pipelined Datapath (with Pipeline Regs) Fetch Decode Execute Memory Write Back 0 M u x 1 IF/ID EX/MEM ID/EX MEM/WB A d d A d d 4 A d d r e s u l t S h i f t l e f t 2 R e a d n o r e g i s t e r 1 i A d d r e s s P C t R e a d c u d a t a 1 r R e a d t s Z e r o n r e g i s t e r 2 I A L U R e a d A L U 0 R e a d W r i t e A d d r e s s d a t a 2 r e s u l t 1 d a t a r e g i s t e r M M Imem Regs u u W r i t e x x d a t a 1 0 W r i t e Dmem d a t a 1 6 3 2 S i g n e x t e n d 5 69 bits 133 bits 64 bits 102 bits

Can pipelining get us into trouble? • Yes: Pipeline Hazards • structural hazards: attempt to use the same resource two different ways at the same time • control hazards: attempt to make a decision before condition is evaluated • branch instructions • interrupts

Pipelining troubles (cont.) • data hazards: attempt to use item before it is ready • instruction depends on result of prior instruction still in the pipeline • Can always resolve hazards by waiting • pipeline control must detect the hazard • take action (or delay action) to resolve hazards

LAB: PIPELINING

LAB: PIPELINING

Presentation Transcript

Pipelining Datapath

Enhancing Performance with Pipelining

Pipelining: Basic and Intermediate Concepts

Pipelining: Basic and Intermediate Concepts

Chapter 2 Pipelined Processors

Appendix C

CS2100 Computer Organisation http://www.comp.nus.edu.sg/~cs2100/

IA-64: Advanced Loads Speculative Loads Software Pipelining

Lecture 5. MIPS Processor Design Pipelined MIPS #2

Advanced Computer Architectures

Computer Architecture

March 16, 2001 Prof. David A. Patterson Computer Science 252 Spring 2001

Chapter 5 Overview

Pipelining: Basic and Intermediate Concepts

ECE200 – Computer Organization

CS184a: Computer Architecture (Structure and Organization)

Day 2

Pipelining Appendix A

Multiprocessors - Parallel Computing

Lecture 4: Introduction to Advanced Pipelining

Advanced Pipelining