
Pipelining

This article discusses the concept of pipelining in processor data paths, its benefits in terms of throughput and efficiency, and the challenges and limitations associated with implementing pipelined architectures. It also covers hazards that can occur in pipelined architectures and ways to mitigate them.



Presentation Transcript


  1. Pipelining

  2. Processor Data Path • Single cycle processor makes poor use of units:

  3-7. Processor Data Path • ADD r1, r2, r3 running (the same instruction shown stepping through successive stages of the data path)

  8. Assembly Lines • Single cycle laundry:

  9. Assembly Lines • Assembly line laundry:

  10. Segmented Data Path

  11. Segmented Data Path • Registers to hold values between stages

  12. Pipelined • Each stage can work on different instruction:

  13. Pipeline vs Not: • Pipeline: 4 ins / 8 cycles • No Pipeline: 2 ins / 10 cycles

  14. Throughput • N stage pipeline: • n - 1 cycles to "prime it" • Then one instruction per cycle

  15. Throughput • n-stage pipeline: • Time for i instructions in n-stage pipeline: (n + i − 1) cycles • Time for i instructions without pipelining: n × i cycles

  16. Throughput • n-stage pipeline: • Time for i instructions with pipelining: (n + i − 1) cycles • Without pipelining: n × i cycles • Max speedup: n × i / (n + i − 1) → n as i → ∞
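The cycle counts and speedup limit on these slides can be checked with a quick Python sketch (assuming, as the slides do, that every stage takes one cycle):

```python
def pipelined_cycles(i, n):
    """Cycles for i instructions in an n-stage pipeline:
    n - 1 cycles to prime it, then one instruction per cycle."""
    return n + i - 1

def unpipelined_cycles(i, n):
    """Without pipelining, each instruction uses all n stages in turn."""
    return n * i

def speedup(i, n):
    return unpipelined_cycles(i, n) / pipelined_cycles(i, n)

# matches slide 13: 4 instructions / 8 cycles pipelined,
# 2 instructions / 10 cycles without (n = 5)
print(pipelined_cycles(4, 5), unpipelined_cycles(2, 5))  # 8 10
# speedup approaches n as i grows
print(round(speedup(1_000_000, 5), 3))  # 5.0 (effectively n)
```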

  17. Pipelining Limits • In theory: n-times speedup for an n-stage pipeline • But • Only if all stages are balanced • Only if the pipeline can be kept full

  18. Weak Link & Latency • Total data path = 800ps

  19. Weak Link & Latency • Pipelined : can't run faster than slowest step

  20. Weak Link & Latency • Pipelined: can't run faster than slowest step: 5 × 200 ps = 1000 ps • Plus delay of the memory between stages

  21. Pipeline vs Not • Clock time: 800 ps with no pipeline, 200 ps with pipeline

  22. Weak Link & Latency • First Instruction • No pipeline: 800 ps / 1 instruction • Pipeline: 1000 ps / 1 instruction • "Speedup" on first instruction: 0.8× (25% slower) → Increased latency

  23. Weak Link & Latency • Full Pipeline • No pipeline: 800 ps / 1 instruction • Pipeline: 1000 ps / 5 instructions = 200 ps / instruction • Speedup with full pipeline = 800 / 200 = 4× → Increased throughput
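The latency and throughput numbers follow directly from the stage delays; a small sketch using the slides' figures (an 800 ps single-cycle data path split into five 200 ps stages):

```python
single_cycle_ps = 800   # one instruction per 800 ps clock, no pipeline
stage_ps = 200          # pipelined clock, limited by the slowest stage
n_stages = 5

# latency of the first instruction through the pipeline
first_inst_ps = n_stages * stage_ps        # 1000 ps
print(single_cycle_ps / first_inst_ps)     # 0.8 -> 25% slower (latency)

# steady-state throughput once the pipeline is full
print(single_cycle_ps / stage_ps)          # 4.0 -> 4x speedup (throughput)
```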

  24. Designed for Pipelining • Consistent instruction length • Simple decode logic • No feeding data from memory to ALU

  25. Hazards • Hazard : Situation preventing next instruction from continuing in pipeline • Structural : Resource (shared hardware) conflict • Data : Needed data not ready • Control : Correct action depends on earlier instruction

  26. Structural Hazards • What if there is only one memory? IF and MEM access the same unit

  27. Structural Hazards • Conflict between MEM and IF

  28. Dealing with Conflict • Bubble: unused pipeline stage (diagram: MOV, LDR, SUB, ADD with a bubble inserted)

  29. Dealing with Conflict • Bubbles to handle shared memory
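This bubbling can be simulated in a few lines. The sketch below assumes a classic 5-stage IF/ID/EX/MEM/WB pipeline with a single memory port, so IF must stall whenever an older memory instruction occupies MEM (3 cycles after its own IF); the stage layout is an assumption, not taken from the slides:

```python
def fetch_schedule(is_mem_op):
    """Return the cycle in which each instruction enters IF, stalling
    (a bubble) whenever an older memory op is in MEM that cycle."""
    fetch = []
    cycle = 0
    for _ in is_mem_op:
        # MEM of instruction j happens at fetch[j] + 3
        while any(is_mem_op[j] and fetch[j] + 3 == cycle
                  for j in range(len(fetch))):
            cycle += 1  # bubble: the shared memory port is busy
        fetch.append(cycle)
        cycle += 1
    return fetch

# one load (instruction 0) followed by ALU ops:
# instruction 3 takes one bubble before it can fetch
print(fetch_schedule([True, False, False, False, False]))  # [0, 1, 2, 4, 5]
```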

  30. Avoiding Structural Hazards • Separate Inst/Data cache • Can’t send memory data to ALU

  31. Data Hazards • Sequence of instructions to be executed:

  32. Data Hazards • RAW: Read After Write • Later instruction depends on result from earlier • ADD writes r1 at time 5 • SUB wants r1 at time 3

  33. Dealing with Data Hazards • Option 1: NOP = No-op = Bubble • Assuming the new value of r1 can be read in the same cycle it is written: 2 cycles of bubble (otherwise 3)

  34. Dealing with Data Hazards • Option 2: Clever compiler/programmer reorders instructions: • 1 bubble eliminated by moving the LDR before the SUB

  35. Reorder = New Problems • While reordering, need to maintain critical ordering: • RAW: Read after Write (ADD r1, r3, r4 then ADD r2, r1, r0) • WAR: Write after Read (ADD r2, r1, r0 then ADD r1, r3, r4) • WAW: Write after Write (ADD r1, r4, r0 then ADD r1, r3, r4)
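These three orderings can be checked mechanically. A sketch where each instruction is reduced to a (destination register, source registers) pair:

```python
def classify(earlier, later):
    """Return the hazards between an earlier and a later instruction,
    each given as (destination_register, set_of_source_registers)."""
    e_dst, e_srcs = earlier
    l_dst, l_srcs = later
    hazards = []
    if e_dst in l_srcs:
        hazards.append("RAW")  # later reads what earlier writes
    if l_dst in e_srcs:
        hazards.append("WAR")  # later overwrites what earlier reads
    if e_dst == l_dst:
        hazards.append("WAW")  # both write the same register
    return hazards

# ADD r1, r3, r4 then ADD r2, r1, r0 -> RAW on r1
print(classify(("r1", {"r3", "r4"}), ("r2", {"r1", "r0"})))  # ['RAW']
```

A scheduler may only swap two instructions when this function returns an empty list for the pair.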

  36. Dealing with Data Hazards • Option 3 : Forwarding • Shortcut to send results back to earlier stages

  37. Dealing with Data Hazards • r1’s value forwarded to ALU

  38. Dealing with Data Hazards • Forwarding may not eliminate all bubbles
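One way to see why forwarding still leaves a bubble: in a classic 5-stage pipeline (an assumed layout, consistent with the earlier slides) an ALU result is ready at the end of EX, but a load's data only arrives after MEM, one cycle too late for the next instruction's EX. A sketch:

```python
def bubbles_needed(producer_is_load, forwarding):
    """Bubbles between a producer and an immediately following consumer
    in an assumed 5-stage pipeline (IF ID EX MEM WB)."""
    if forwarding:
        # ALU results forward from EX to EX with no stall; load data
        # is only available after MEM, one cycle too late
        return 1 if producer_is_load else 0
    # without forwarding, wait for write-back (assuming the register
    # file can be written and read in the same cycle, as on slide 33)
    return 2

print(bubbles_needed(producer_is_load=False, forwarding=True))  # 0
print(bubbles_needed(producer_is_load=True, forwarding=True))   # 1
```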

  39. Dealing with Data Hazards • Requires complex hardware • Potentially slows down pipeline

  40. Pipeline History • Pipelines:
