
Chapter 7 :: Topics



  1. Chapter 7 :: Topics • Introduction • Performance Analysis • Single-Cycle Processor • Multicycle Processor • Pipelined Processor • Exceptions • Advanced Microarchitecture

  2. Introduction • Microarchitecture: how to implement an architecture in hardware • Processor: datapath (functional blocks) and control (control signals)

  3. Microarchitecture • Multiple implementations for a single architecture: • Single-cycle: each instruction executes in a single cycle • Multicycle: each instruction is broken into a series of shorter steps • Pipelined: each instruction is broken into a series of steps, and multiple instructions execute at once

  4. Processor Performance • Program execution time: Execution Time = (# instructions)(cycles/instruction)(seconds/cycle) • Definitions: • CPI: cycles/instruction • Clock period: seconds/cycle • IPC: instructions/cycle = 1/CPI • The challenge is to satisfy constraints of cost, power, and performance
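The execution-time formula on this slide can be sketched in a few lines of Python. The function names and the workload numbers are illustrative assumptions, not values from the slides.

```python
def execution_time(instructions: int, cpi: float, clock_period_s: float) -> float:
    """Execution time = (# instructions)(cycles/instruction)(seconds/cycle)."""
    return instructions * cpi * clock_period_s


def ipc(cpi: float) -> float:
    """Instructions per cycle is the reciprocal of cycles per instruction."""
    return 1.0 / cpi


# Hypothetical workload: 1 billion instructions, CPI of 2, 1 ns clock period.
t = execution_time(1_000_000_000, 2.0, 1e-9)
print(f"execution time: {t:.3f} s, IPC: {ipc(2.0):.2f}")
```

Halving the CPI or the clock period halves the execution time, which is why both appear as levers in the designs that follow.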

  5. MIPS Processor • Consider a subset of MIPS instructions: • R-type instructions: and, or, add, sub, slt • Memory instructions: lw, sw • Branch instructions: beq

  6. Architectural State • Determines everything about a processor: • PC • 32 registers • Memory

  7. MIPS State Elements

  8. Single-Cycle MIPS Processor • Datapath • Control

  9. Single-Cycle Datapath: lw Fetch • STEP 1: Fetch instruction

  10. Single-Cycle Datapath: lw Register Read • STEP 2: Read source operands from RF

  11. Single-Cycle Datapath: lw Immediate • STEP 3: Sign-extend the immediate

  12. Single-Cycle Datapath: lw Address • STEP 4: Compute the memory address

  13. Single-Cycle Datapath: lw Memory Read • STEP 5: Read data from memory and write it back to register file

  14. Single-Cycle Datapath: lw PC Increment • STEP 6: Determine address of next instruction
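The six lw steps above can be traced in software. This is a minimal behavioral sketch, not the hardware datapath: the memories are modeled as Python dicts keyed by byte address, and the names `execute_lw`, `imem`, `dmem`, and `regs` are assumptions made for illustration.

```python
def sign_extend16(imm: int) -> int:
    """Interpret the low 16 bits of imm as a two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def execute_lw(pc: int, imem: dict, regs: list, dmem: dict) -> int:
    """Carry one lw instruction through the six steps; return the next PC."""
    instr = imem[pc]                        # Step 1: fetch instruction
    rs = (instr >> 21) & 0x1F               # base register field
    rt = (instr >> 16) & 0x1F               # destination register field for lw
    base = regs[rs]                         # Step 2: read source operand from RF
    offset = sign_extend16(instr & 0xFFFF)  # Step 3: sign-extend the immediate
    addr = (base + offset) & 0xFFFFFFFF     # Step 4: compute the memory address
    regs[rt] = dmem[addr]                   # Step 5: read memory, write back to rt
    return (pc + 4) & 0xFFFFFFFF            # Step 6: address of next instruction


# lw $s0, -4($t1): opcode 0x23, rs = 9 ($t1), rt = 16 ($s0), imm = 0xFFFC
instr = (0x23 << 26) | (9 << 21) | (16 << 16) | 0xFFFC
regs = [0] * 32
regs[9] = 0x1004
imem = {0x0: instr}
dmem = {0x1000: 0xDEADBEEF}
next_pc = execute_lw(0x0, imem, regs, dmem)
print(hex(regs[16]), hex(next_pc))
```

In the real single-cycle datapath all six steps happen within one clock cycle; the sequential code only mirrors the order in which the slides introduce the hardware.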

  15. Single-Cycle Datapath: sw • Write data in rt to memory

  16. Single-Cycle Datapath: R-Type • Read from rs and rt • Write ALUResult to register file • Write to rd (instead of rt)
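A rough behavioral sketch of the R-type path follows, covering the five R-type instructions in the subset (and, or, add, sub, slt). The field positions and funct codes are the standard MIPS encodings; the helper names (`execute_rtype`, `ALU_OPS`, `to_signed`) are assumptions for illustration.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x: int) -> int:
    """Interpret a 32-bit value as two's-complement."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


# funct field -> ALU operation for the R-type subset
ALU_OPS = {
    0x20: lambda a, b: (a + b) & MASK32,                  # add
    0x22: lambda a, b: (a - b) & MASK32,                  # sub
    0x24: lambda a, b: a & b,                             # and
    0x25: lambda a, b: a | b,                             # or
    0x2A: lambda a, b: int(to_signed(a) < to_signed(b)),  # slt
}


def execute_rtype(instr: int, regs: list) -> int:
    """Read rs and rt, compute ALUResult, and write it to rd."""
    rs = (instr >> 21) & 0x1F
    rt = (instr >> 16) & 0x1F
    rd = (instr >> 11) & 0x1F   # destination is rd, not rt
    funct = instr & 0x3F
    alu_result = ALU_OPS[funct](regs[rs], regs[rt])
    if rd != 0:                 # register $0 always reads as zero
        regs[rd] = alu_result
    return alu_result


regs = [0] * 32
regs[17], regs[18] = 5, 7       # $s1 = 5, $s2 = 7
execute_rtype(0x02328020, regs)  # add $s0, $s1, $s2
print(regs[16])                  # prints 12
```

The only difference from the lw path that matters here is the write-register choice: bits 15:11 (rd) rather than bits 20:16 (rt).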

  17. Single-Cycle Datapath: beq • Determine whether values in rs and rt are equal • Calculate branch target address: BTA = (sign-extended immediate << 2) + (PC+4)
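The branch target calculation can be checked with a short sketch. The function names are illustrative assumptions; the arithmetic follows the BTA formula on the slide.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm: int) -> int:
    """Interpret the low 16 bits of imm as a two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def branch_target(pc: int, imm16: int) -> int:
    """BTA = (sign-extended immediate << 2) + (PC + 4)."""
    return ((sign_extend16(imm16) << 2) + (pc + 4)) & MASK32


def next_pc_beq(pc: int, rs_val: int, rt_val: int, imm16: int) -> int:
    """Take the branch only when the values in rs and rt are equal."""
    if rs_val == rt_val:
        return branch_target(pc, imm16)
    return (pc + 4) & MASK32


print(hex(next_pc_beq(0x40, 1, 1, 3)))       # taken, forward: 0x44 + 12
print(hex(next_pc_beq(0x40, 1, 1, 0xFFFE)))  # taken, backward: 0x44 - 8
print(hex(next_pc_beq(0x40, 1, 2, 3)))       # not taken: PC + 4
```

The shift by 2 converts the immediate from a word offset to a byte offset, and the offset is relative to PC+4 rather than to the address of the beq itself.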

  18. Single-Cycle Processor

  19. Single-Cycle Control

  20. Review: ALU

  21. Review: ALU

  22. Control Unit: ALU Decoder

  23. Control Unit: Main Decoder

  24. Control Unit: Main Decoder

  25. Single-Cycle Datapath: or

  26. Extended Functionality: addi • No change to datapath

  27. Control Unit: addi

  28. Control Unit: addi

  29. Extended Functionality: j

  30. Control Unit: Main Decoder

  31. Control Unit: Main Decoder

  32. Review: Processor Performance • Program Execution Time = (# instructions)(cycles/instruction)(seconds/cycle) = # instructions × CPI × Tc

  33. Single-Cycle Performance • Tc is limited by the critical path (lw)

  34. Single-Cycle Performance • Single-cycle critical path: • Tc = tpcq_PC + tmem + max(tRFread, tsext + tmux) + tALU + tmem + tmux + tRFsetup • Typically, limiting paths are: • memory, ALU, register file • Tc = tpcq_PC + 2tmem + tRFread + tmux + tALU + tRFsetup

  35. Single-Cycle Performance Example Tc = ?

  36. Single-Cycle Performance Example Tc = tpcq_PC + 2tmem + tRFread + tmux + tALU + tRFsetup = [30 + 2(250) + 150 + 25 + 200 + 20] ps = 925 ps
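As a quick check of the arithmetic, the general critical-path expression (the form with max) can be evaluated with the delays from this example. The dictionary keys mirror the slide's symbols; the sign-extend delay `t_sext` is not given in the example, so the value used here is a labeled assumption chosen so that the register file read remains the limiting path.

```python
def single_cycle_tc(d: dict) -> int:
    """Tc = tpcq_PC + tmem + max(tRFread, tsext + tmux) + tALU + tmem + tmux + tRFsetup."""
    return (
        d["t_pcq_PC"]
        + d["t_mem"]                                # instruction memory read
        + max(d["t_RFread"], d["t_sext"] + d["t_mux"])
        + d["t_ALU"]
        + d["t_mem"]                                # data memory read
        + d["t_mux"]
        + d["t_RFsetup"]
    )


delays_ps = {
    "t_pcq_PC": 30,
    "t_mem": 250,
    "t_RFread": 150,
    "t_mux": 25,
    "t_ALU": 200,
    "t_RFsetup": 20,
    "t_sext": 50,   # assumption: not stated in the example
}

print(single_cycle_tc(delays_ps), "ps")  # prints 925 ps
```

With any `t_sext` of 125 ps or less, the max term selects `t_RFread` and the result matches the simplified formula.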

  37. Single-Cycle Performance Example • Program with 100 billion instructions: Execution Time = # instructions × CPI × Tc = (100 × 10^9)(1)(925 × 10^-12 s) = 92.5 seconds

  38. Multicycle MIPS Processor • Single-cycle: + simple; - cycle time limited by longest instruction (lw); - 2 adders/ALUs & 2 memories • Multicycle: + higher clock speed; + simpler instructions run faster; + reuse expensive hardware on multiple cycles; - sequencing overhead paid many times • Same design steps: datapath & control

  39. Multicycle State Elements • Replace Instruction and Data memories with a single unified memory – more realistic

  40. Multicycle Datapath: Instruction Fetch • STEP 1: Fetch instruction

  41. Multicycle Datapath: lw Register Read • STEP 2a: Read source operands from RF

  42. Multicycle Datapath: lw Immediate • STEP 2b: Sign-extend the immediate

  43. Multicycle Datapath: lw Address • STEP 3: Compute the memory address

  44. Multicycle Datapath: lw Memory Read • STEP 4: Read data from memory

  45. Multicycle Datapath: lw Write Register • STEP 5: Write data back to register file

  46. Multicycle Datapath: Increment PC • STEP 6: Increment PC

  47. Multicycle Datapath: sw • Write data in rt to memory

  48. Multicycle Datapath: R-Type • Read from rs and rt • Write ALUResult to register file • Write to rd (instead of rt)

  49. Multicycle Datapath: beq • rs == rt? • BTA = (sign-extended immediate << 2) + (PC+4)

  50. Multicycle Processor
