16.482 / 16.561 Computer Architecture and Design

16.482 / 16.561 Computer Architecture and Design Instructor: Dr. Michael Geiger Fall 2013 Lecture 4: Datapath and control introduction

Lecture outline • Announcements/reminders • HW 3 due today • HW 4 to be posted; due 10/7 ... or 10/16 • Review: Arithmetic for computers • Integer multiplication/division • Floating point representations • Floating point arithmetic • Today’s lecture: Processor datapath and control discussion Computer Architecture Lecture 4

Review: Binary multiplication • Generate shifted partial products and add them • Hardware can be condensed to two registers • N-bit multiplicand • 2N-bit running product / multiplier • At each step • Check LSB of multiplier • Add multiplicand or 0 to left half of product/multiplier • Shift product/multiplier right • Signed multiplication: Booth’s algorithm • Add extra bit to left of all registers, right of product/multiplier • At each step • Check two rightmost bits of product/multiplier • Add multiplicand, -multiplicand, or 0 to left half of product/multiplier • Shift product/multiplier right • Discard extra bits to get final product
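The unsigned shift-and-add steps above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's hardware: the function name `shift_add_multiply` and the parameter `n` (operand width in bits) are assumptions made for the example, and a Python integer stands in for the combined 2N-bit product/multiplier register.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, n: int) -> int:
    """Multiply two unsigned n-bit values with one combined product/multiplier register."""
    limit = 1 << n
    if not (0 <= multiplicand < limit and 0 <= multiplier < limit):
        raise ValueError("operands must fit in n unsigned bits")

    # Right half of the register starts as the multiplier; left half starts at 0.
    product = multiplier
    for _ in range(n):
        if product & 1:                    # check LSB of multiplier
            product += multiplicand << n   # add multiplicand to the left half
        product >>= 1                      # shift product/multiplier right
    return product


if __name__ == "__main__":
    print(shift_add_multiply(6, 7, 4))
```

After N iterations the multiplier bits have all been shifted out and the register holds the 2N-bit product. The temporary carry out of the left half is absorbed here by Python's unbounded integers; real hardware would need one extra bit for it.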

Review: IEEE Floating-Point Format • Fields: S (1 bit) | Exponent (single: 8 bits, double: 11 bits) | Fraction (single: 23 bits, double: 52 bits) • S: sign bit (0 → non-negative, 1 → negative) • Normalize significand: 1.0 ≤ |significand| < 2.0 • Always has a leading pre-binary-point 1 bit, so no need to represent it explicitly (hidden bit) • Significand is Fraction with the “1.” restored • Actual exponent = (encoded value) - bias • Ensures encoded exponent is unsigned • Single: Bias = 127; Double: Bias = 1023 • Encoded exponents 0 and 111 ... 111 reserved
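As a quick check of the field layout, the sketch below splits a single-precision value into its three fields and rebuilds the value from them. The helper names `decode_single` and `value_from_fields` are made up for this example, and the reconstruction only handles normalized numbers (it ignores the reserved encoded exponents 0 and 255).

```python
import struct


def decode_single(x: float):
    """Return (sign, encoded_exponent, fraction) of x as an IEEE single."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    encoded_exp = (bits >> 23) & 0xFF   # 8-bit biased exponent
    fraction = bits & 0x7FFFFF          # 23-bit fraction
    return sign, encoded_exp, fraction


def value_from_fields(sign: int, encoded_exp: int, fraction: int) -> float:
    """Rebuild a normalized value: restore the hidden 1 and remove the bias (127)."""
    significand = 1 + fraction / 2**23
    return (-1) ** sign * significand * 2.0 ** (encoded_exp - 127)


if __name__ == "__main__":
    fields = decode_single(-0.75)
    print(fields, value_from_fields(*fields))
```

For -0.75 = -1.1 (binary) × 2^-1, the sign is 1, the encoded exponent is -1 + 127 = 126, and the fraction holds only the bit after the hidden 1.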

Datapath & control intro • Recall: How does a processor execute an instruction? • Fetch the instruction from memory • Decode the instruction • Determine addresses for operands • Fetch operands • Execute instruction • Store result (and go back to step 1 … ) • First two steps are the same for all instructions • Next steps are instruction-dependent

Basic processor implementation • Shows datapath elements: logic blocks that operate on or store data • Need control signals to specify multiplexer selection, functional unit operations

Implementation issues: datapath / control • Basic implementation shows datapath elements: logic blocks that either • Operate on data (ALU) • Store data (registers, memory) • Need some way to determine • What data to use • What operations to perform • Control signals: signals used for • Multiplexer selection • Specifying functional unit operation

Design choices • Single-cycle implementation • Every instruction takes one clock cycle • Instruction fetch, decoding, execution, memory access, and result writeback in same cycle • Cycle time determined by longest instruction • Will discuss first as a simple way to introduce the hardware required to execute each instruction • Pipelined implementation • Simultaneously execute multiple instructions in different stages of execution
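To make "cycle time determined by longest instruction" concrete, here is a small sketch that sums the delays of the units each instruction class passes through and takes the maximum. All latency values are hypothetical numbers chosen for illustration; they are not from the lecture, and the unit names are likewise assumptions.

```python
# Hypothetical unit latencies in picoseconds (illustrative only).
LATENCY_PS = {
    "instr_mem": 200,
    "reg_read": 100,
    "alu": 200,
    "data_mem": 200,
    "reg_write": 100,
}

# Units on the critical path of each instruction class in the subset.
PATHS = {
    "R-type": ["instr_mem", "reg_read", "alu", "reg_write"],
    "lw": ["instr_mem", "reg_read", "alu", "data_mem", "reg_write"],
    "sw": ["instr_mem", "reg_read", "alu", "data_mem"],
    "beq": ["instr_mem", "reg_read", "alu"],
    "j": ["instr_mem"],
}


def instruction_delay(name: str) -> int:
    """Total delay of one instruction class, in picoseconds."""
    return sum(LATENCY_PS[unit] for unit in PATHS[name])


def single_cycle_time() -> int:
    """The clock period must cover the slowest instruction."""
    return max(instruction_delay(name) for name in PATHS)


if __name__ == "__main__":
    for name in PATHS:
        print(name, instruction_delay(name))
    print("cycle time:", single_cycle_time())
```

With these assumed numbers the load sets the clock period, so every other instruction wastes part of each cycle.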

MIPS subset • We’ll go through simple processor design for subset of MIPS ISA • add, sub, and, or (also immediate versions) • lw, sw • slt • beq, j • Datapath design • What elements are necessary (and which ones can be re-used for multiple instructions)? • Control design • How do we get the datapath elements to perform the desired operations? • Will then discuss other operations

Datapath design • What steps in the execution sequence do all instructions have in common? • Fetch the instruction from memory • Decode the instruction • Instruction fetch brings data in from memory • Decoding determines control over datapath

Instruction fetch • What logic is necessary for instruction fetch? • Instruction storage → Memory • Register to hold instruction address → program counter • Logic to generate next instruction address • Sequential code is easy: PC + 4 • Jumps/branches more complex (we’ll discuss later)

Instruction fetch (cont.) • Loader places starting address of main program in PC • PC + 4 points to next sequential address in main memory

Executing MIPS instructions • Look at different classes within our subset (ignore immediates for now) • add, sub, and, or • lw, sw • slt • beq, j • 2 things processor does for all instructions except j • Reads operands from register file • Performs an ALU operation • 1 additional thing for everything but sw, beq, and j • Processor writes a result into register file

Instruction decoding • Generate control signals from instruction bits • Recall: MIPS instruction formats • Register instructions: R-type • Immediate instructions: I-type • Jump instructions: J-type
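Decoding starts by slicing fixed bit ranges out of the 32-bit instruction word. The sketch below extracts every field used by the three formats; the function name `decode_fields` and the dictionary keys are assumptions for illustration. Which fields are meaningful depends on the format, so the caller picks the relevant ones after looking at the opcode.

```python
def decode_fields(instr: int) -> dict:
    """Slice a 32-bit MIPS instruction word into its possible fields."""
    instr &= 0xFFFFFFFF
    return {
        "opcode": (instr >> 26) & 0x3F,   # bits 31:26, all formats
        "rs": (instr >> 21) & 0x1F,       # bits 25:21, R-type and I-type
        "rt": (instr >> 16) & 0x1F,       # bits 20:16, R-type and I-type
        "rd": (instr >> 11) & 0x1F,       # bits 15:11, R-type
        "shamt": (instr >> 6) & 0x1F,     # bits 10:6,  R-type
        "funct": instr & 0x3F,            # bits 5:0,   R-type
        "imm16": instr & 0xFFFF,          # bits 15:0,  I-type
        "target26": instr & 0x03FFFFFF,   # bits 25:0,  J-type
    }


if __name__ == "__main__":
    # add $4, $10, $30: opcode 0, rs = 10, rt = 30, rd = 4, funct = 0x20
    word = (10 << 21) | (30 << 16) | (4 << 11) | 0x20
    print(hex(word), decode_fields(word))
```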

R-type execution datapath • R-type instructions (in our subset) • add, sub, and, or, slt • What logic is necessary? • Register file to provide operands • ALU to operate on data

Memory instructions • lw, sw: I-type instructions • Use register file • Base address • Source data for store • Destination for load • Compute address using ALU • What additional logic is necessary? • Data memory • Sign extension logic (why?)
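One way to answer the "why?" on sign extension: the 16-bit immediate is a signed offset, but the ALU adds 32-bit values, so a negative offset must have its sign bit copied into the upper 16 bits before the add. A minimal sketch, with the hypothetical helper names `sign_extend16` and `effective_address`:

```python
def sign_extend16(imm16: int) -> int:
    """Interpret the low 16 bits of imm16 as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def effective_address(base: int, imm16: int) -> int:
    """Address for lw/sw: base register plus sign-extended offset, wrapped to 32 bits."""
    return (base + sign_extend16(imm16)) & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(effective_address(0x1000, 10)))       # positive offset
    print(hex(effective_address(0x1000, 0xFFFC)))   # offset field encodes -4
```

Without sign extension, an offset field of 0xFFFC would be treated as +65532 instead of -4.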

Branch instructions • Simple beq implementation • Branch if (a - b) == 0; use ALU to evaluate condition • Branch address = (PC + 4) + (offset << 2) • Where’s offset encoded? • Immediate field • Why shift by 2? • Instruction addresses are word aligned, so the 2 least significant bits are always 0
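The branch address formula can be checked with a short sketch. The function names and the example addresses are assumptions for illustration; the offset is the 16-bit immediate field, sign-extended and counted in instructions (words) relative to PC + 4.

```python
def sign_extend16(imm16: int) -> int:
    """Interpret the low 16 bits of imm16 as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def branch_target(pc: int, offset16: int) -> int:
    """Branch address = (PC + 4) + (offset << 2), wrapped to 32 bits."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF


def beq_next_pc(pc: int, a: int, b: int, offset16: int) -> int:
    """beq: the ALU computes a - b; branch only when the result is zero."""
    zero = ((a - b) & 0xFFFFFFFF) == 0
    return branch_target(pc, offset16) if zero else (pc + 4) & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(beq_next_pc(0x00400000, 5, 5, 3)))   # taken
    print(hex(beq_next_pc(0x00400000, 5, 6, 3)))   # not taken
```

Because the two low address bits are always 0, storing the offset in words rather than bytes extends the reach of the 16-bit field by a factor of four.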

Branch instruction datapath

Putting the pieces together • Multiplexers in the combined datapath: • Chooses PC+4 or branch target • Chooses ALU output or memory output • Chooses register or sign-extended immediate

Datapath for R-type instructions • EXAMPLE: add $4, $10, $30 ($4 = $10 + $30)

Datapath for I-type ALU instructions • EXAMPLE: addi $4, $10, 15 ($4 = $10 + 15)

Datapath for beq (not taken) • EXAMPLE: beq $1,$2,label (branch to label if $1 == $2)

Datapath for beq (taken) • EXAMPLE: beq $1,$2,label (branch to label if $1 == $2)

Datapath for lw instruction • EXAMPLE: lw $2, 10($3) ($2 = mem[$3 + 10])

Datapath for sw instruction • EXAMPLE: sw $2, 10($3) (mem[$3 + 10] = $2)
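The walkthrough examples (add, addi, beq, lw, sw) can be replayed with a very small behavioral model. This is a sketch under several assumptions: instructions are written as Python tuples rather than encoded words, memory is a dictionary keyed by byte address, immediates are plain Python integers, and the names `step`, `ALU`, and `to_signed` are invented for the example. It models what each instruction does in one cycle, not the wiring of the datapath.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x: int) -> int:
    """Interpret a 32-bit pattern as a signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


ALU = {
    "add": lambda a, b: (a + b) & MASK32,
    "sub": lambda a, b: (a - b) & MASK32,
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "slt": lambda a, b: int(to_signed(a) < to_signed(b)),
}


def step(regs: list, mem: dict, pc: int, instr: tuple) -> int:
    """Execute one instruction; update regs/mem in place and return the next PC."""
    op, x, y, z = instr
    next_pc = (pc + 4) & MASK32                       # default: PC + 4
    if op in ALU:                                     # op rd, rs, rt
        regs[x] = ALU[op](regs[y], regs[z])           # ALU B input: register
    elif op == "addi":                                # addi rt, rs, imm
        regs[x] = ALU["add"](regs[y], z & MASK32)     # ALU B input: immediate
    elif op == "lw":                                  # lw rt, imm(rs)
        addr = ALU["add"](regs[y], z & MASK32)
        regs[x] = mem.get(addr, 0)                    # write-back: memory output
    elif op == "sw":                                  # sw rt, imm(rs)
        addr = ALU["add"](regs[y], z & MASK32)
        mem[addr] = regs[x]
    elif op == "beq":                                 # beq rs, rt, word_offset
        if ALU["sub"](regs[x], regs[y]) == 0:         # Zero output of ALU
            next_pc = (pc + 4 + (z << 2)) & MASK32    # branch target
    else:
        raise ValueError(f"unsupported instruction: {op}")
    regs[0] = 0                                       # $0 always reads as zero
    return next_pc


if __name__ == "__main__":
    regs, mem = [0] * 32, {}
    regs[10], regs[30] = 7, 5
    pc = step(regs, mem, 0, ("add", 4, 10, 30))       # $4 = $10 + $30
    print(pc, regs[4])
```

Each branch of `step` corresponds to one setting of the three multiplexers: which value feeds the ALU's second input, which value is written back to the register file, and which address becomes the next PC.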

The Main Control Unit • Control signals derived from instruction

Format      | 31:26    | 25:21 | 20:16 | 15:11 | 10:6  | 5:0
R-type      | 0        | rs    | rt    | rd    | shamt | funct
Load/Store  | 35 or 43 | rs    | rt    | address (15:0)
Branch      | 4        | rs    | rt    | address (15:0)

• Bits 31:26: opcode • rs (25:21): always read • rt (20:16): read, except for load • Destination (rd for R-type, rt for load): write for R-type and load • address (15:0): sign-extend and add
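The main control unit is essentially a lookup from opcode to a set of control signal values. The table below is a sketch, not a transcription of the slide: the signal names (RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp) and their values follow the common textbook single-cycle MIPS design and are assumed here, since the slide's figure is not available. Only the opcode values (0, 35, 43, 4) come from the slide. `None` marks a don't-care.

```python
# Opcode -> control signals (assumed textbook single-cycle values; None = don't care).
CONTROL_TABLE = {
    0: {"RegDst": 1, "ALUSrc": 0, "MemtoReg": 0, "RegWrite": 1,
        "MemRead": 0, "MemWrite": 0, "Branch": 0, "ALUOp": 0b10},    # R-type
    35: {"RegDst": 0, "ALUSrc": 1, "MemtoReg": 1, "RegWrite": 1,
         "MemRead": 1, "MemWrite": 0, "Branch": 0, "ALUOp": 0b00},   # lw
    43: {"RegDst": None, "ALUSrc": 1, "MemtoReg": None, "RegWrite": 0,
         "MemRead": 0, "MemWrite": 1, "Branch": 0, "ALUOp": 0b00},   # sw
    4: {"RegDst": None, "ALUSrc": 0, "MemtoReg": None, "RegWrite": 0,
        "MemRead": 0, "MemWrite": 0, "Branch": 1, "ALUOp": 0b01},    # beq
}


def main_control(opcode: int) -> dict:
    """Return the control signal settings for a supported opcode."""
    try:
        return CONTROL_TABLE[opcode]
    except KeyError:
        raise ValueError(f"opcode {opcode} is not in the supported subset") from None


if __name__ == "__main__":
    for opcode in (0, 35, 43, 4):
        print(opcode, main_control(opcode))
```

The don't-cares fall out of the field usage above: sw and beq never write a register, so it does not matter which field would name the destination or which value would be written back.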

Datapath With Control

ALU control • Two steps to generate control inputs • Use opcode to choose between R-type (and, or, add, sub, slt), lw / sw, and beq • If R-type, use funct field to determine ALU control inputs

ALU control (continued) • Opcode generates 2-bit signal (ALUOp)
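The two-step ALU control decision can be sketched as a small function. The ALUOp encoding used here (00 for lw/sw, 01 for beq, 10 for R-type) is the usual textbook convention and is an assumption, since the transcript does not show the encoding table; the funct values are the standard MIPS codes for these five instructions. The function returns an operation name rather than a specific bit pattern for the ALU control lines.

```python
# Standard MIPS funct codes for the R-type instructions in the subset.
FUNCT_TO_OP = {0x20: "add", 0x22: "sub", 0x24: "and", 0x25: "or", 0x2A: "slt"}


def alu_control(alu_op: int, funct: int = 0) -> str:
    """Pick the ALU operation from the 2-bit ALUOp and, for R-type, the funct field."""
    if alu_op == 0b00:          # lw / sw: add base and offset
        return "add"
    if alu_op == 0b01:          # beq: subtract and test for zero
        return "sub"
    if alu_op == 0b10:          # R-type: funct field decides
        try:
            return FUNCT_TO_OP[funct]
        except KeyError:
            raise ValueError(f"unsupported funct code: {funct:#x}") from None
    raise ValueError(f"unsupported ALUOp: {alu_op:#b}")


if __name__ == "__main__":
    print(alu_control(0b00))          # lw / sw
    print(alu_control(0b01))          # beq
    print(alu_control(0b10, 0x2A))    # slt
```

Splitting the decision this way keeps the main control unit small: it only needs the opcode, and the funct field is examined locally next to the ALU.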

Adding j instruction • J-type format • Next PC = (PC+4)[31:28] concatenated with (target<<2) • 3 choices for next PC • PC + 4 • beq target address • j target address • Need another multiplexer and control bit!
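The jump address and the three-way next-PC choice can be sketched as follows. The function names and the `is_jump` / `branch_taken` flags are assumptions standing in for the extra control bit and the branch decision; the addresses in the usage lines are arbitrary examples.

```python
def jump_target(pc: int, target26: int) -> int:
    """Next PC for j: upper 4 bits of PC + 4, then the 26-bit target shifted left by 2."""
    upper = (pc + 4) & 0xF0000000
    return upper | ((target26 & 0x03FFFFFF) << 2)


def next_pc(pc: int, branch_taken: bool, branch_addr: int,
            is_jump: bool, jump_addr: int) -> int:
    """Two cascaded multiplexers: PC + 4 vs. branch target, then that result vs. jump target."""
    candidate = branch_addr if branch_taken else (pc + 4) & 0xFFFFFFFF
    return jump_addr if is_jump else candidate


if __name__ == "__main__":
    print(hex(jump_target(0x00400000, 0x0100004)))
    print(hex(next_pc(0x00400000, False, 0x00400010, True, 0x00400020)))
```

The shifted target supplies bits 27:0, so a jump can reach any word within the current 256 MB region selected by the upper four PC bits.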

Datapath with jump support

Control signals • PCWriteCond: conditionally enables write to PC if branch condition is true (if Zero flag = 1) • PCWrite: enables write to PC; used for both normal PC increment and jump instructions • IorD: indicates if memory address is for instruction (i.e., PC) or data (i.e., ALU output) • MemRead, MemWrite: control memory operation • MemtoReg: determines if write data comes from memory or ALU output • IRWrite: enables writing of instruction register

Control signals (cont.) • PCSource: controls multiplexer that determines if value written to PC comes from ALU (PC = PC+4), branch target (stored ALU output), or jump target • ALUOp: feeds ALU control module to determine ALU operation • ALUSrcA: controls multiplexer that chooses between PC and register as A input of ALU • ALUSrcB: controls multiplexer that chooses between register, constant ‘4’, sign-extended immediate, or shifted immediate as B input of ALU • RegWrite: enables write to register file • RegDst: determines if rt or rd field indicates destination register

Summary of execution phases

Final notes • Next time: • Pipelining • Announcements/reminders • HW 3 due today • HW 4 to be posted; due date TBD
