Chapter 5: Datapath and Control (Part 3)

Chapter 5: Datapath and Control (Part 3) CS 447 Jason Bakos

Single-Cycle CPU • The single-cycle CPU from the last lecture had a CPI of 1 • Clock cycle is determined by the longest possible path in the machine • Loads are the worst: they use 5 functional units in series • Performance, utilization, and efficiency suffer, because most instructions don’t need such a long clock cycle • A variable-speed clock could be used to solve this problem, but hinders parallelism • Pipelining overlaps instruction executions
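To make the longest-path argument concrete, here is a small Python sketch that sums unit latencies along the path each instruction class takes. The latency values and the per-class unit lists are illustrative assumptions, not figures from this lecture.

```python
# Illustrative unit latencies in picoseconds (assumed values, not from the lecture).
LATENCY_PS = {"imem": 200, "reg_read": 50, "alu": 100, "dmem": 200, "reg_write": 50}

# Functional units each instruction class uses in series (simplified assumption).
PATHS = {
    "load":   ["imem", "reg_read", "alu", "dmem", "reg_write"],
    "store":  ["imem", "reg_read", "alu", "dmem"],
    "r_type": ["imem", "reg_read", "alu", "reg_write"],
    "branch": ["imem", "reg_read", "alu"],
    "jump":   ["imem"],
}

def path_latency(kind):
    """Total latency of the units one instruction class uses in series."""
    return sum(LATENCY_PS[unit] for unit in PATHS[kind])

# A single-cycle clock must be long enough for the slowest instruction class.
clock_period = max(path_latency(kind) for kind in PATHS)

for kind in PATHS:
    used = path_latency(kind)
    print(f"{kind:7s} needs {used} ps, idles {clock_period - used} ps per cycle")
```

Under these assumed latencies the load sets the clock period, and every other instruction class leaves part of each cycle unused.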

Multicycle Implementation • Break instructions into steps, where each step requires one clock cycle • We want to reuse functional units within an instruction instead of just across instructions • Reduces hardware • Use a single memory for instructions and data • Use a single ALU instead of one ALU and two adders • Add registers after functional units to hold intermediate results (state data) for future cycles • These are used within an instruction's execution • Register file and memory hold state data to be used across instruction executions • These are programmer-visible • We will need an FSM to control the CPU

Registers • Locations of registers are determined by the following: • What combinational units will fit in one clock cycle • Assume one memory access, one regfile access (two reads or one write), or one ALU operation per cycle • Any data needed by these operations must be stored in a temporary register • Instruction Register (IR), Memory Data Register (MDR), A, B, and ALUOut registers added to design • All of these except IR only need to hold data between two adjacent clock cycles • What data are needed in later cycles implementing the instruction

Multiplexors • Need to add extra multiplexors (or expand existing muxes) to facilitate the reuse of the ALU within instructions • Add mux to first ALU input • Expand mux to second ALU input

Multicycle CPU

Breaking Instruction Execution into Clock Cycles • Goal is to balance the latency of the operations performed during each clock cycle • Each clock cycle can contain at most one of the following: • One ALU operation • One register file access (or multiple in parallel) • One memory access (this is a joke, but we’ll accept this for now)

Execution Stages • To clearly define the CPU operation for each step of an instruction, we’ll use RTL (register transfer language) • Architecture research has defined 5 standard phases of instruction execution • Instruction fetch • Decode • Fetch register values from register file • Execute • Perform arithmetic/logic operation • Memory • Load/Store memory • Write back • Write register result back to register file

Execution Stages • Fetch • IR=Memory[PC] • PC=PC+4 • Decode • A=Reg[IR[25..21]] • B=Reg[IR[20..16]] • ALUOut=PC+(sign_extend(IR[15..0]) << 2)
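The fetch and decode RTL above can be traced in software. This is a minimal Python sketch, assuming a dictionary keyed by byte address for memory and a 32-entry list for the register file; the names `bits`, `sign_extend16`, `fetch`, `decode`, and the `cpu` dictionary are hypothetical helpers, not part of the lecture.

```python
MASK32 = 0xFFFFFFFF

def bits(word, hi, lo):
    """Return word[hi..lo] as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

def sign_extend16(value):
    """Interpret a 16-bit field as a signed two's-complement number."""
    return value - 0x10000 if value & 0x8000 else value

def fetch(cpu):
    cpu["IR"] = cpu["mem"][cpu["PC"]]          # IR = Memory[PC]
    cpu["PC"] = (cpu["PC"] + 4) & MASK32       # PC = PC + 4

def decode(cpu):
    ir = cpu["IR"]
    cpu["A"] = cpu["reg"][bits(ir, 25, 21)]    # A = Reg[IR[25..21]]
    cpu["B"] = cpu["reg"][bits(ir, 20, 16)]    # B = Reg[IR[20..16]]
    # The branch target is computed for every instruction, in case it is a branch.
    cpu["ALUOut"] = (cpu["PC"] + (sign_extend16(bits(ir, 15, 0)) << 2)) & MASK32

# Example: beq $1, $2 with an offset of +3 words, encoded as 0x10220003 at address 0.
cpu = {"PC": 0, "mem": {0: 0x10220003}, "reg": [0] * 32}
cpu["reg"][1], cpu["reg"][2] = 7, 7
fetch(cpu)
decode(cpu)
print(hex(cpu["IR"]), cpu["A"], cpu["B"], cpu["ALUOut"])
```

Note that `ALUOut` is relative to the already-incremented PC, which is why the example's target is 4 + 12 = 16 rather than 12.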

Execution Stages • Execute • Memory reference • ALUOut=A+sign_extend(IR[15..0]) • R-type • ALUOut=A op B • Branch (beq) • if (A==B) PC=ALUOut • Jump • PC=PC[31..28] || (IR[25..0]<<2)
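The execute step does something different for each instruction class. The following Python sketch is one way to model it, assuming a `cpu` dictionary that already holds `PC`, `IR`, `A`, `B`, and `ALUOut` from the earlier steps; the function name `execute` and the class labels are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(value):
    """Interpret a 16-bit field as a signed two's-complement number."""
    return value - 0x10000 if value & 0x8000 else value

def execute(kind, cpu, op=None):
    """Apply the execute-step RTL for one instruction class to the cpu dict."""
    ir = cpu["IR"]
    if kind == "mem":                           # lw / sw address calculation
        cpu["ALUOut"] = (cpu["A"] + sign_extend16(ir & 0xFFFF)) & MASK32
    elif kind == "r_type":                      # op is a two-argument function
        cpu["ALUOut"] = op(cpu["A"], cpu["B"]) & MASK32
    elif kind == "beq":                         # ALUOut already holds the target
        if cpu["A"] == cpu["B"]:
            cpu["PC"] = cpu["ALUOut"]
    elif kind == "jump":                        # PC[31..28] || (IR[25..0] << 2)
        cpu["PC"] = (cpu["PC"] & 0xF0000000) | ((ir & 0x03FFFFFF) << 2)
    else:
        raise ValueError(f"unknown instruction class: {kind}")

cpu = {"PC": 0x40000004, "IR": 0x8C22FFFC, "A": 100, "B": 0, "ALUOut": 0}
execute("mem", cpu)          # lw $2, -4($1) with $1 = 100
load_address = cpu["ALUOut"]

cpu["IR"] = 0x08000040       # j with 26-bit target field 0x40
execute("jump", cpu)
jump_target = cpu["PC"]
print(load_address, hex(jump_target))
```

Branches and jumps finish in this step, which is why they need fewer cycles than loads, stores, and R-type instructions.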

Execution Stages • Memory Access/Write Back • Load • MDR=Memory[ALUOut] • Store • Memory[ALUOut]=B • R-type • Reg[IR[15..11]]=ALUOut • Memory Read Completion • Load • Reg[IR[20..16]]=MDR
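The last one or two steps can be modeled the same way. This Python sketch assumes a `cpu` dictionary holding `IR`, `B`, `ALUOut`, a memory dictionary keyed by byte address, and a register list; the function name `finish` is hypothetical. It returns how many clock cycles the instruction spends in these final steps.

```python
def finish(kind, cpu):
    """Run the final step(s) for one instruction class; return cycles used here."""
    ir = cpu["IR"]
    rt = (ir >> 16) & 0x1F                      # IR[20..16]
    rd = (ir >> 11) & 0x1F                      # IR[15..11]
    if kind == "load":
        cpu["MDR"] = cpu["mem"][cpu["ALUOut"]]  # step 4: MDR = Memory[ALUOut]
        cpu["reg"][rt] = cpu["MDR"]             # step 5: Reg[IR[20..16]] = MDR
        return 2
    if kind == "store":
        cpu["mem"][cpu["ALUOut"]] = cpu["B"]    # step 4: Memory[ALUOut] = B
        return 1
    if kind == "r_type":
        cpu["reg"][rd] = cpu["ALUOut"]          # step 4: Reg[IR[15..11]] = ALUOut
        return 1
    raise ValueError(f"unknown instruction class: {kind}")

# Example: lw $2, -4($1), with the address 96 already computed into ALUOut.
cpu = {"IR": 0x8C22FFFC, "ALUOut": 96, "B": 0, "mem": {96: 1234}, "reg": [0] * 32}
load_cycles = finish("load", cpu)
print(load_cycles, cpu["reg"][2])
```

Only the load needs the fifth step, because the value read from memory must pass through MDR before it can be written to the register file.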

Control Signals • Control Unit signals • Refer to figure 5.34 (pg. 384) in the book • ALU Control signals • Provide an appropriate ALUOp signal based on what the ALU is being used for (if for an R-type, perform lookup based on function code)

Control Signals • All that’s left is for us to build the control unit as a FSM and the ALU control as a lookup table

Control Unit • The fetch and decode stages are the same for every instruction...

Control Unit • Here are the states and transitions for the memory-reference instructions

Control Unit • Here are the states and transitions for R-type, branch, and jump instructions

Control Unit • Final control unit FSM...
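Since the state diagram itself is not reproduced here, the following Python sketch shows one way the control FSM's next-state logic could look. The state names and opcode labels are hypothetical; the transitions are derived from the execution steps listed earlier in this lecture, not copied from the figure.

```python
def next_state(state, opcode):
    """Return the state that follows `state` for the given instruction class."""
    if state == "fetch":
        return "decode"                         # same for every instruction
    if state == "decode":
        return {"lw": "mem_addr", "sw": "mem_addr", "r_type": "execute",
                "beq": "branch", "j": "jump"}[opcode]
    if state == "mem_addr":
        return "mem_read" if opcode == "lw" else "mem_write"
    if state == "mem_read":
        return "load_writeback"
    if state == "execute":
        return "r_type_writeback"
    return "fetch"                              # every final state returns to fetch

def cycles(opcode):
    """Count the clock cycles one instruction spends in the FSM."""
    state, count = "fetch", 0
    while True:
        count += 1
        state = next_state(state, opcode)
        if state == "fetch":
            return count

for opcode in ("lw", "sw", "r_type", "beq", "j"):
    print(opcode, cycles(opcode))
```

Walking the FSM this way gives 5 cycles for a load, 4 for a store or R-type instruction, and 3 for a branch or jump, matching the number of RTL steps each one needs.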

Problems to Think About • How could we add bne, blt, and bgez instructions to our CPU? • How do you calculate CPI for our CPU if we are given instruction-type distributions?
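For the CPI question, one approach is a weighted average of the cycles each instruction class needs. In this Python sketch the cycle counts follow from the execution steps in this lecture (5 for loads, 4 for stores and R-type, 3 for branches and jumps), while the instruction mix is an illustrative assumption rather than measured data.

```python
# Cycles per instruction class, from the multicycle execution steps.
CYCLES = {"load": 5, "store": 4, "r_type": 4, "branch": 3, "jump": 3}

# Assumed instruction-type distribution (fractions must sum to 1).
MIX = {"load": 0.22, "store": 0.11, "r_type": 0.49, "branch": 0.16, "jump": 0.02}

def average_cpi(cycles, mix):
    """CPI = sum over classes of (fraction of instructions) * (cycles per class)."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("instruction mix must sum to 1")
    return sum(cycles[kind] * fraction for kind, fraction in mix.items())

cpi = average_cpi(CYCLES, MIX)
print(f"CPI = {cpi:.2f}")  # prints "CPI = 4.04"
```

A CPI above 1 does not by itself mean the multicycle design is slower than the single-cycle one, because its clock cycle is much shorter; execution time depends on the product of CPI and clock period.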
