
Processor Design and Instruction Execution Overview

An overview of processor design and instruction execution. The slides cover the CPU performance factors (instruction count, CPI, and cycle time), the steps of processor design, logic design conventions, the MIPS instruction subset (memory reference, arithmetic/logical, and control transfer instructions), the execution cycle, and the datapath components and control logic that determine clock cycle time and CPI.

randolphc

Presentation Transcript


  1. Computer Organization, CS224, Fall 2012, Lesson 22

  2. The Big Picture: The Five Classic Components of a Computer
     Chapter 4 Topic: Processor Design
     [Figure: the five components: Processor (Control and Datapath), Memory, Input, Output]

  3. Introduction (§4.1)
     • CPU performance factors
       • Instruction count: determined by ISA and compiler
       • CPI and cycle time: determined by CPU hardware
     • We will examine two MIPS implementations
       • A simplified version
       • A more realistic pipelined version
     • Simple subset, shows most aspects
       • Memory reference: lw, sw
       • Arithmetic/logical: add, sub, and, ori, slt
       • Control transfer: beq, j

  4. The Performance Perspective
     Performance of a machine is determined by:
     • Instruction count
     • Clock cycle time
     • Clock cycles per instruction
     Processor design (datapath and control) will determine:
     • Clock cycle time (CCT)
     • Clock cycles per instruction (CPI)
     This week: single-cycle processor (datapath + control)
     • Advantage: one clock cycle per instruction
     • Disadvantage: long cycle time
     [Figure: CPI, Inst. Count, and Cycle Time]
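
The three factors on this slide multiply together to give CPU time. A minimal sketch in Python, assuming made-up instruction counts and cycle times (the numbers and the `cpu_time` helper are illustrative, not from the lecture):

```python
def cpu_time(instruction_count, cpi, cycle_time_s):
    """CPU time = instruction count x CPI x clock cycle time (seconds)."""
    return instruction_count * cpi * cycle_time_s


# Hypothetical numbers, chosen only to illustrate the trade-off:
# a single-cycle design has CPI = 1 but a long cycle,
# a multi-cycle design has a shorter cycle but a higher CPI.
single_cycle = cpu_time(1_000_000, cpi=1.0, cycle_time_s=8e-9)
multi_cycle = cpu_time(1_000_000, cpi=3.5, cycle_time_s=2e-9)

print(f"single-cycle: {single_cycle * 1e3:.1f} ms")
print(f"multi-cycle:  {multi_cycle * 1e3:.1f} ms")
```

Which design wins depends entirely on the assumed numbers; the point is only that CPI and cycle time trade off against each other.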

  5. Processor Design Steps
     1. Analyze instruction set => datapath requirements
        • The meaning of each instruction is given by the register transfers (ISA model => RTL model)
        • Datapath must include storage elements for the ISA registers, and possibly more
        • Datapath must support each register transfer
     2. Select set of datapath components and establish clocking methodology
     3. Assemble datapath meeting the RTL requirements

  6. Processor Design (cont’d)
     4. Analyze implementation of each instruction to determine setting of control points that effect the register transfer
     5. Assemble the control logic
     6. RTL datapath and control design are refined to track physical design and functional validation
        • Changes made for timing and errata (a.k.a. “bug”) fixes
        • Amount of work varies with capabilities of CAD tools and degree of optimization for cost/performance

  7. Subset of Instructions
     To simplify our study of processor design, we will focus on a subset of the MIPS instructions:
     • Memory: lw and sw
     • Arithmetic: add, sub, and, ori, and slt
     • Branch: beq and j
     The lecture example uses ori rather than the or covered in the text, to demonstrate one more category of instructions.
     The method of implementing other instructions should come naturally from these.

  8. MIPS Format Review: R-Format
     add rd, rs, rt
     sub rd, rs, rt

     Field   op (= 0)   rs   rt   rd   sa   funct
     Bits    6          5    5    5    5    6

     • rs: first source register
     • rt: second source register
     • rd: result register
     • sa: shift amount
     • funct: function code
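
To make the field layout concrete, here is a minimal sketch that slices a 32-bit R-format word into its six fields with shifts and masks. The helper name `decode_r_format` and the dictionary keys are illustrative choices, not part of the lecture.

```python
def decode_r_format(word):
    """Split a 32-bit R-format instruction word into its fields."""
    return {
        "op": (word >> 26) & 0x3F,     # bits 31..26, 0 for R-format
        "rs": (word >> 21) & 0x1F,     # bits 25..21, first source register
        "rt": (word >> 16) & 0x1F,     # bits 20..16, second source register
        "rd": (word >> 11) & 0x1F,     # bits 15..11, result register
        "sa": (word >> 6) & 0x1F,      # bits 10..6, shift amount
        "funct": word & 0x3F,          # bits 5..0, function code
    }


# 0x02324020 encodes add $t0, $s1, $s2 (rd = 8, rs = 17, rt = 18).
fields = decode_r_format(0x02324020)
print(fields)
```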

  9. MIPS Format Review (cont): I-Format
     lw rt, rs, imm
     sw rt, rs, imm
     beq rs, rt, imm
     ori rt, rs, imm

     Field   op   rs   rt   imm
     Bits    6    5    5    16

     • rs: first source register
     • rt: second source register
     • imm: immediate
     Reminder: branch uses PC-relative addressing (PC + 4 + 4 × imm)
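
The PC-relative formula on this slide can be sketched as follows. The sketch assumes the 16-bit immediate is sign-extended before scaling (standard for MIPS branches, though the slide does not spell it out); the function names are illustrative.

```python
def sign_extend_16(imm):
    """Interpret the low 16 bits of imm as a two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def branch_target(pc, imm):
    """PC-relative branch target: PC + 4 + 4 * imm, wrapped to 32 bits."""
    return (pc + 4 + 4 * sign_extend_16(imm)) & 0xFFFFFFFF


forward = branch_target(0x00400000, 3)        # skips 3 instructions ahead
backward = branch_target(0x00400010, 0xFFFF)  # imm = -1: branch to itself
print(hex(forward), hex(backward))
```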

  10. MIPS Format Review (cont): J-Format
      j target

      Field   op   target
      Bits    6    26

      • target: jump target address
      Reminders:
      • Uses pseudodirect addressing (target × 4) to allow addressing 2^28 bytes directly
      • Uses top 4 bits from PC
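
A minimal sketch of pseudodirect addressing: the 26-bit target is shifted left by 2 and combined with the top 4 bits of the PC. The sketch assumes those 4 bits come from PC + 4, as in the MIPS architecture; the slide only says "from PC". The name `jump_target` is illustrative.

```python
def jump_target(pc, target):
    """Pseudodirect jump address: {(PC + 4)[31:28], target, 00}."""
    upper = (pc + 4) & 0xF0000000           # top 4 bits (assumed from PC + 4)
    lower = (target & 0x03FFFFFF) << 2      # 26-bit target x 4 = 28-bit offset
    return upper | lower


addr = jump_target(0x00400000, 0x0100000)
print(hex(addr))
```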

  11. Execution Cycle
      • Instruction Fetch: obtain instruction from program storage
      • Instruction Decode: determine required actions and instruction size
      • Operand Fetch: locate and obtain operand data
      • Execute: compute result value or status
      • Result Store: deposit results in storage for later use
      • Next Instruction: determine successor instruction

  12. What Happens?
      It’s hard to see how we should go about organizing the processor. To start thinking about it, look at what happens on each instruction:
      • The instruction specified by the PC is fetched from memory
      • One or two registers are read (lw vs. add, for instance)
      • The ALU must be used to add, subtract, etc.
      • The results are stored (to memory or a register)

  13. Instruction Execution
      • PC → instruction memory, fetch instruction
      • Register numbers → register file, read registers
      • Depending on instruction class
        • Use ALU to calculate
          • Arithmetic result
          • Memory address for load/store
          • Branch target address
        • Access data memory for load/store
        • PC ← target address or PC + 4
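
The steps above can be sketched as a tiny software model that executes one instruction per call. This is an illustration under simplifying assumptions, not the lecture's datapath: instructions are stored already decoded as tuples, memories are Python dicts, and the names `step`, `imem`, and `dmem` are hypothetical.

```python
def step(pc, regs, imem, dmem):
    """Execute the instruction at pc and return the next PC."""
    op, a, b, c = imem[pc]                 # PC -> instruction memory (fetch)
    next_pc = pc + 4                       # default successor
    if op in ("add", "sub"):               # op rd, rs, rt
        x, y = regs[b], regs[c]            # read two registers
        result = x + y if op == "add" else x - y
        regs[a] = result & 0xFFFFFFFF      # ALU result -> register file
    elif op == "lw":                       # lw rt, rs, imm
        regs[a] = dmem.get(regs[b] + c, 0)     # ALU computes address, read memory
    elif op == "sw":                       # sw rt, rs, imm
        dmem[regs[b] + c] = regs[a]            # ALU computes address, write memory
    elif op == "beq":                      # beq rs, rt, imm
        if regs[a] == regs[b]:             # ALU compares the two registers
            next_pc = pc + 4 + 4 * c       # PC <- branch target
    regs[0] = 0                            # register 0 always reads as 0
    return next_pc


regs = [0] * 32
dmem = {100: 21}
imem = {
    0: ("lw", 1, 0, 100),    # $1 = mem[100]
    4: ("add", 2, 1, 1),     # $2 = $1 + $1
    8: ("sw", 2, 0, 104),    # mem[104] = $2
    12: ("beq", 0, 0, -4),   # always taken: back to address 0
}
pc = 0
for _ in range(4):
    pc = step(pc, regs, imem, dmem)
print(pc, regs[1], regs[2], dmem[104])
```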

  14. Processor Overview • Data flows through memory and functional units

  15. Multiplexers • Can’t just join wires together • Use multiplexers

  16. Control

  17. Logic Design Basics (§4.2 Logic Design Conventions)
      • Information encoded in binary
        • Low voltage = 0, High voltage = 1
        • One wire per bit
        • Multi-bit data encoded on multi-wire buses
      • Combinational element
        • Operate on data
        • Output is a function of input
        • Example: ALU
      • State (sequential) elements
        • Store information or state
        • Example: Register File

  18. 1 bit ALU
      Using a MUX, we can combine the AND, OR, and adder operations into a single ALU
      [Figure: 1-bit ALU with inputs A, B, Cin, and ALUOp; a 1-bit full adder and a mux produce Result and Cout]

  19. 4 bit ALU
      [Figure: four 1-bit ALUs (stages 0 to 3). Stage i takes Ai, Bi, and CIni and produces Resulti and COuti; ALUop goes to every stage; A and B are 4-bit inputs; the final carry out is COut3]
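
The 1-bit ALU and its 4-bit composition can be sketched in software. This is an illustration under stated assumptions: the ALUOp encoding (0 = AND, 1 = OR, 2 = ADD) is invented for the example, and the stages are assumed to be chained ripple-carry style, with each carry out feeding the next carry in.

```python
def full_adder(a, b, cin):
    """1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def alu_1bit(a, b, cin, alu_op):
    """1-bit ALU. alu_op: 0 = AND, 1 = OR, 2 = ADD (hypothetical encoding)."""
    s, cout = full_adder(a, b, cin)
    result = (a & b, a | b, s)[alu_op]     # the mux picks one of three results
    return result, cout


def alu_4bit(a, b, alu_op, cin0=0):
    """Chain four 1-bit ALUs; each carry out feeds the next carry in."""
    result, carry = 0, cin0
    for i in range(4):
        bit, carry = alu_1bit((a >> i) & 1, (b >> i) & 1, carry, alu_op)
        result |= bit << i
    return result, carry


print(alu_4bit(0b0101, 0b0011, 2))   # 5 + 3
print(alu_4bit(0b1111, 0b0001, 2))   # 15 + 1 overflows 4 bits
```

The carry output is only meaningful for the ADD operation; for AND and OR the adder still runs, but its carry is ignored.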

  20. Combinational Elements
      • Adder: 32-bit inputs A and B plus Carry_In; outputs 32-bit Sum and Carry
      • MUX: 32-bit inputs A and B plus Select; 32-bit output Y
      • ALU: 32-bit inputs A and B plus OP; outputs 32-bit Result and Zero

  21. D Latches
      • Modified SR Latch
      • Latches value when C is asserted
      [Figure: D latch with inputs D and C and two Q outputs]

  22. D Flip Flop
      Uses Master/Slave D Latches
      [Figure: two D latches in series, clocked from CLK]
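
A behavioral sketch of the latch and the master/slave flip-flop follows. It assumes the master latch is transparent while the clock is high and the slave while it is low, which makes the output change on the falling edge; the slide does not state the polarity, so treat that as an assumption. Class and method names are illustrative.

```python
class DLatch:
    """Level-sensitive D latch: Q follows D while C is asserted."""

    def __init__(self):
        self.q = 0

    def update(self, d, c):
        if c:
            self.q = d
        return self.q


class DFlipFlop:
    """Master/slave pair of D latches (assumed falling-edge triggered)."""

    def __init__(self):
        self.master = DLatch()
        self.slave = DLatch()

    def update(self, d, clk):
        m = self.master.update(d, clk)         # master open while clk = 1
        return self.slave.update(m, 1 - clk)   # slave open while clk = 0


ff = DFlipFlop()
trace = [
    ff.update(1, 1),   # clk high: master captures 1, output unchanged
    ff.update(0, 0),   # falling edge: output takes the captured 1
    ff.update(0, 1),   # clk high: master captures 0, output still 1
    ff.update(0, 0),   # falling edge: output becomes 0
]
print(trace)
```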

  23. Storage Element: Register
      • Similar to D Flip Flop
      • N-bit input and output
      • Write Enable input
        • 0: Data Out will not change
        • 1: Data Out will become Data In
      • Data changes only on falling edge!
      [Figure: register with N-bit Data In and Data Out, Write Enable, and Clk]

  24. Storage Element: Reg File
      • Register File consists of 32 registers
        • Two 32-bit output busses: busA and busB
        • One 32-bit input bus: busW
        • Register 0 hard wired to value 0
      • Register selected by:
        • RA selects register to put on busA
        • RB selects register to put on busB
        • RW selects register to be written via busW when Write Enable is 1
      • Clock input (CLK)
        • CLK input is a factor only for write operation
        • During read, behaves as combinational logic block: RA or RB stable → busA or busB valid after “access time”
        • Minor simplification of reality
      [Figure: register file of 32 32-bit registers with 5-bit RW, RA, and RB inputs, Write Enable, Clk, 32-bit busW input, and 32-bit busA and busB outputs]
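
The register file's behavior can be modeled in a few lines. This is a behavioral sketch only: reads are plain function calls (combinational), and a write happens when `clock_edge` is called with Write Enable set. The class and method names are hypothetical.

```python
class RegisterFile:
    """32 x 32-bit register file with two read ports and one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra, rb):
        """Combinational read: returns (busA, busB)."""
        return self.regs[ra], self.regs[rb]

    def clock_edge(self, rw, bus_w, write_enable):
        """On a clock edge, write busW into register RW if enabled."""
        if write_enable and rw != 0:       # register 0 is hard wired to 0
            self.regs[rw] = bus_w & 0xFFFFFFFF


rf = RegisterFile()
rf.clock_edge(rw=5, bus_w=1234, write_enable=1)
rf.clock_edge(rw=6, bus_w=99, write_enable=0)   # ignored: write disabled
rf.clock_edge(rw=0, bus_w=77, write_enable=1)   # ignored: register 0
print(rf.read(5, 6), rf.read(0, 5))
```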

  25. Storage Element: Memory
      • Memory
        • One input bus: Data In
        • One output bus: Data Out
      • Address selection
        • Address selects the word to put on Data Out
        • To write to address, set Write Enable to 1
      • Clock input (CLK)
        • CLK input is a factor only for write operation
        • During read, behaves as combinational logic block: valid Address → Data Out valid after “access time”
        • Minor simplification of reality
      [Figure: memory with Address, Write Enable, and Clk inputs, 32-bit Data In, and 32-bit Data Out]

  26. Some Logic Design…
      • All storage elements have same clock
      • Edge-triggered clocking
      • “Instantaneous” state change (simplification!)
      • Timing always works if the clock is slow enough
      • Cycle Time = Clk-to-Q + Longest Delay + Setup + Clock Skew
      [Figure: clock timing diagram marking Setup and Hold windows around each clock edge, with Don’t Care regions in between]
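
The cycle time formula can be evaluated directly. In the sketch below the delay values are invented for illustration (in picoseconds), and `min_cycle_time` is a hypothetical helper name.

```python
def min_cycle_time(clk_to_q, longest_delay, setup, clock_skew):
    """Cycle Time = Clk-to-Q + Longest Delay + Setup + Clock Skew."""
    return clk_to_q + longest_delay + setup + clock_skew


# Hypothetical delays in picoseconds, not measurements from any real design.
cycle_ps = min_cycle_time(clk_to_q=50, longest_delay=800, setup=30, clock_skew=20)
max_freq_ghz = 1000 / cycle_ps     # 1000 ps per ns, so GHz = 1000 / ps

print(f"cycle time: {cycle_ps} ps, max clock: {max_freq_ghz:.2f} GHz")
```

The longest combinational delay usually dominates, which is why a single-cycle design, whose longest path spans a whole instruction, has a long cycle time.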
