COMPUTER ARCHITECTURE Processor: Single-Cycle Datapath (Based on text: David A. Patterson & John L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, 3rd Ed., Morgan Kaufmann, 2007)
COURSE CONTENTS
• Introduction
• Instructions
• Computer Arithmetic
• Performance
• Processor: Datapath
• Processor: Control
• Pipelining Techniques
• Memory
• Input/Output Devices
PROCESSOR: DATAPATH & CONTROL
• Elements of Datapath
• Single-Cycle Datapath
Subset of MIPS Instructions
• We aim to design a processor that implements a subset of the core MIPS instruction set:
  • Memory-reference instructions: lw and sw
  • ALU instructions (R-type): add, sub, and, or, slt
  • Control-flow instructions: branch (beq) and jump (j)
Performance
[Figure: performance expressed in terms of Inst. Count, CPI, and Cycle Time]
• Performance of a machine is determined by:
  • Instruction count
  • Clock cycle time
  • Clock cycles per instruction (CPI)
• Processor design (datapath and control) will affect:
  • Clock cycle time
  • Clock cycles per instruction
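The three factors above combine multiplicatively into execution time. The sketch below is a minimal illustration; the function name `cpu_time` and the sample numbers are assumptions made for this example, not figures from the course.

```python
def cpu_time(instruction_count, cpi, cycle_time_s):
    """Execution time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_s


# Hypothetical workload: one million instructions on a single-cycle
# design (CPI = 1) with an assumed 600 ps clock cycle.
t = cpu_time(1_000_000, 1.0, 600e-12)
print(f"{t * 1e6:.1f} us")
```

A single-cycle design fixes CPI at 1, so any change to the datapath shows up in execution time only through the clock cycle time.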
Instruction Execution Steps
• lw/sw:
  • fetch instruction, increment PC (needs instruction memory, PC, +4 adder)
  • select registers rs, rt (needs register file)
  • calculate memory address (needs instruction, sign extender, ALU)
  • read/write memory (needs data memory)
  • write register, for lw only (needs register file)
• ALU (R-type):
  • fetch instruction, increment PC (needs instruction memory, PC, +4 adder)
  • select registers rs, rt (needs register file)
  • perform ALU operation on the two operands (needs ALU)
  • write register rd (needs register file)
• Branch (beq):
  • fetch instruction, increment PC (needs instruction memory, PC, +4 adder)
  • select registers rs, rt (needs register file)
  • test condition, calculate target address (needs ALU, instruction, sign extender, x4 shifter)
  • update PC (needs PC, adder)
• Jump (j):
  • fetch instruction, increment PC (needs instruction memory, PC, +4 adder)
  • calculate target address (needs instruction, x4 shifter)
  • update PC (needs PC)
• The first step is common to all instructions; the second step is common to all except jump
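Every step after the fetch relies on fields sliced out of the 32-bit instruction word. The sketch below shows that slicing using the standard MIPS field positions; the function name `decode` and the dictionary layout are illustrative choices, not part of the slides.

```python
def decode(instr):
    """Split a 32-bit MIPS instruction word into its fields.

    All fields are extracted unconditionally, as the hardware does;
    the opcode later decides which of them are actually used.
    """
    return {
        "op":     (instr >> 26) & 0x3F,   # bits 31-26: opcode
        "rs":     (instr >> 21) & 0x1F,   # bits 25-21: first source register
        "rt":     (instr >> 16) & 0x1F,   # bits 20-16: second source / lw destination
        "rd":     (instr >> 11) & 0x1F,   # bits 15-11: R-type destination
        "funct":  instr & 0x3F,           # bits 5-0:   R-type function code
        "imm16":  instr & 0xFFFF,         # bits 15-0:  lw/sw/beq immediate
        "addr26": instr & 0x03FFFFFF,     # bits 25-0:  jump address field
    }


# add $t0, $t1, $t2 encodes as 0x012A4020 ($t0 = 8, $t1 = 9, $t2 = 10)
fields = decode(0x012A4020)
print(fields["rs"], fields["rt"], fields["rd"])  # prints: 9 10 8
```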
Generic Implementation
• use the program counter (PC) to supply the instruction address
• get the instruction from memory
• read registers
• use the instruction to decide exactly what to do
Resources Needed
[Figure: datapath building blocks. a. Program counter; b. Instruction memory (Read address, Instruction); c. Adder; d. Sign-extension unit (16 to 32 bits); e. Register file (two 5-bit read register numbers, one 5-bit write register number, Write data, Read data 1, Read data 2, RegWrite); f. ALU (4-bit ALU control, Zero and ALU result outputs); g. Data memory unit (Address, Write data, Read data, MemRead, MemWrite); h. Shifter]
ALU
[Figure: f. ALU, with a 4-bit ALU control input and Zero and ALU result outputs]
• Note: NOR is needed for other parts of the MIPS instruction set
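The ALU's behavior can be sketched as a function of the 4-bit control input. The slides do not list the control codes, so the sketch assumes the encodings used in the Patterson & Hennessy text (0000 AND, 0001 OR, 0010 add, 0110 subtract, 0111 set on less than, 1100 NOR); the names `alu` and `to_signed` are illustrative.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def alu(control, a, b):
    """Return (result, zero) for 32-bit operands a and b."""
    a &= MASK32
    b &= MASK32
    if control == 0b0000:            # AND
        result = a & b
    elif control == 0b0001:          # OR
        result = a | b
    elif control == 0b0010:          # add (used by add, lw, sw)
        result = (a + b) & MASK32
    elif control == 0b0110:          # subtract (used by sub, beq)
        result = (a - b) & MASK32
    elif control == 0b0111:          # set on less than (signed compare)
        result = 1 if to_signed(a) < to_signed(b) else 0
    elif control == 0b1100:          # NOR
        result = ~(a | b) & MASK32
    else:
        raise ValueError(f"unsupported ALU control {control:04b}")
    return result, int(result == 0)  # Zero output is 1 when result == 0


print(alu(0b0110, 5, 5))  # prints: (0, 1)
```

The Zero output is what beq uses: subtracting the two registers and checking Zero tests them for equality.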
Register File (Read)
[Figure: register file read ports. Read register number 1 and Read register number 2 each drive a multiplexer that selects among Register 0, Register 1, ..., Register n – 1, Register n to produce Read data 1 and Read data 2]
Register File (Write)
[Figure: register file write port. A decoder on the Register number, gated by the Write signal, drives the clock input (C) of one of Register 0, Register 1, ..., Register n – 1, Register n; Register data feeds the D input of every register]
• Note: we still use the real clock to determine when to write
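The read and write behavior of the register file can be modeled in a few lines. This is a behavioral sketch only: the class name `RegisterFile` and the method `clock_edge` are assumptions for illustration, and the hardwired-zero treatment of register 0 follows the MIPS convention rather than anything shown on the slide.

```python
class RegisterFile:
    """Two read ports (combinational) and one clocked write port."""

    def __init__(self, n=32):
        self.regs = [0] * n

    def read(self, read_reg1, read_reg2):
        # Reads are combinational: the register numbers act as mux
        # selects and the outputs follow them without waiting for a clock.
        return self.regs[read_reg1], self.regs[read_reg2]

    def clock_edge(self, reg_write, write_reg, write_data):
        # Writes happen only on a clock edge, and only if RegWrite is set.
        # Register 0 is kept at zero, as in MIPS.
        if reg_write and write_reg != 0:
            self.regs[write_reg] = write_data & 0xFFFFFFFF


rf = RegisterFile()
rf.clock_edge(reg_write=True, write_reg=8, write_data=42)
print(rf.read(8, 0))  # prints: (42, 0)
```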
Clocking Methodology
• Defines when signals can be read and when they can be written
• We assume edge-triggered clocking: values are updated only on a clock edge
• In the diagram, all signals must propagate from state element 1, through the combinational logic, to state element 2 within one clock cycle
• The time needed for signals to reach state element 2 defines the length of the clock cycle
[Figure: State element 1 → Combinational logic → State element 2, spanning one clock cycle]
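Because the slowest path between state elements sets the cycle length, the minimum clock period can be estimated by summing unit delays along each instruction's path and taking the maximum. The delays and the per-instruction unit lists below are hypothetical values chosen for illustration, not measurements from the slides.

```python
# Assumed unit delays in picoseconds (illustrative only).
DELAY_PS = {"imem": 200, "reg_read": 100, "alu": 200, "dmem": 200, "reg_write": 100}

# Units each instruction class passes through, in order (assumed).
PATHS = {
    "R-type": ["imem", "reg_read", "alu", "reg_write"],
    "lw":     ["imem", "reg_read", "alu", "dmem", "reg_write"],
    "sw":     ["imem", "reg_read", "alu", "dmem"],
    "beq":    ["imem", "reg_read", "alu"],
    "j":      ["imem"],
}


def path_delay(units):
    """Total propagation delay along one path."""
    return sum(DELAY_PS[u] for u in units)


def min_cycle_time():
    """The clock period must cover the longest path."""
    return max(path_delay(units) for units in PATHS.values())


print(min_cycle_time())  # prints: 800
```

With these assumed delays, lw is the longest path, so every instruction, including the much shorter j, would take the full 800 ps cycle.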
Abstract View of Datapath
[Figure: abstract datapath. The PC supplies the Address to the Instruction memory; the Instruction supplies three Register # inputs to the Registers; register Data feeds the ALU; the ALU output is the Address for the Data memory, whose Data returns to the Registers]
• Two types of functional units:
  • elements that operate on data values (combinational), e.g. ALU
  • elements that contain state (sequential), e.g. register, memory
Instruction Fetch Datapath
[Figure: the PC supplies the Read address to the instruction memory, which outputs the Instruction; the next address logic is an adder that computes PC + 4]
R-type (ALU) Instructions
• The datapath below works for R-type (ALU) instructions
[Figure: abstract datapath with PC, Instruction memory, Registers, ALU, and Data memory]
Load/Store Instructions
• We add a sign extender (16 bits to 32 bits)
[Figure: abstract datapath with PC, Instruction memory, Registers, ALU, and Data memory, plus a sign-extend unit from 16 to 32 bits]
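The sign extender widens the 16-bit offset so the ALU can add it to the 32-bit base register. A minimal sketch follows; the function names `sign_extend16` and `effective_address` are illustrative.

```python
def sign_extend16(imm16):
    """Replicate bit 15 into the upper 16 bits of a 32-bit word."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if (imm16 & 0x8000) else imm16


def effective_address(base, imm16):
    """lw/sw address = base register + sign-extended offset (mod 2^32)."""
    return (base + sign_extend16(imm16)) & 0xFFFFFFFF


# An offset field of 0xFFFC means -4, so the address is 4 below the base.
print(hex(effective_address(0x1000, 0xFFFC)))  # prints: 0xffc
```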
Branch & Jump Instructions
• We add next address logic.
• Note for “jump”: the destination address is formed by concatenating the upper 4 bits of the current PC + 4, the 26-bit address field of the “jump” instruction, and 00 as the 2 lower bits
Branch & Jump Instructions
[Figure: abstract datapath extended with next address logic. Inst [25-0] is shifted left 2 (26 to 28 bits) and combined with PC + 4 [31-28] to form the 32-bit jump address, J addr [31-0]; the sign-extended 16-bit offset is shifted left 2 and added to PC + 4 to form the branch target; a multiplexer, using the ALU zero output, selects the next PC]
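The next address logic for branch and jump can be sketched as a single function. The name `next_pc` and its boolean arguments `branch`, `jump`, and `zero` are illustrative stand-ins for the control signals and the ALU Zero output.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16):
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if (imm16 & 0x8000) else imm16


def next_pc(pc, instr, branch, jump, zero):
    """Select among PC + 4, the branch target, and the jump target."""
    pc_plus4 = (pc + 4) & MASK32

    # Branch target: PC + 4 plus the sign-extended offset shifted left 2.
    offset = sign_extend16(instr & 0xFFFF) << 2
    branch_target = (pc_plus4 + offset) & MASK32

    # Jump target: upper 4 bits of PC + 4, then the 26-bit field, then 00.
    jump_target = (pc_plus4 & 0xF0000000) | ((instr & 0x03FFFFFF) << 2)

    if jump:
        return jump_target
    if branch and zero:        # beq is taken only when the ALU reports Zero
        return branch_target
    return pc_plus4


# beq with offset field 3, taken, at PC = 0x00400000
print(hex(next_pc(0x00400000, 0x11090003, branch=True, jump=False, zero=True)))
# prints: 0x400010
```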
Building the Datapath
• Use multiplexers (MUX) to stitch the individual datapaths together
• Do not duplicate functional units that are common to different instructions
• Add control signals (for MUX selection, ALU operation, state element read/write)
• Independent operations can proceed in parallel
Building the Datapath
[Figure: combined datapath. PC, Instruction memory, Registers, ALU, Data memory, Sign extend (16 to 32), two Shift left 2 units (one widening the jump field from 26 to 28 bits), a PC + 4 adder, and a branch-target adder, joined by multiplexers. Control signals: PCSrc, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, and the 4-bit ALU operation]
A Single-cycle Datapath
[Figure: the combined datapath, with control signals PCSrc, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, and the 4-bit ALU operation]
• This datapath executes each basic instruction in a single clock cycle
• No resource (functional unit) can be used more than once during a single cycle, so any unit needed more than once per instruction must be duplicated (hence the separate instruction and data memories and the dedicated adders)
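To tie the pieces together, here is a behavioral sketch in which one call to `step` corresponds to one clock cycle. It is a software model under stated assumptions, not a hardware description: the opcode and funct values are the standard MIPS encodings, while the function name `step`, the word-indexed list used for instruction memory, and the dictionary used for data memory are choices made for this example.

```python
MASK32 = 0xFFFFFFFF


def sext16(x):
    return (x | 0xFFFF0000) if (x & 0x8000) else x


def signed(x):
    return x - (1 << 32) if (x & 0x80000000) else x


# R-type funct codes: add, sub, and, or, slt
R_FUNCT = {
    0x20: lambda a, b: a + b,
    0x22: lambda a, b: a - b,
    0x24: lambda a, b: a & b,
    0x25: lambda a, b: a | b,
    0x2A: lambda a, b: int(signed(a) < signed(b)),
}


def step(pc, regs, imem, dmem):
    """Execute the instruction at pc and return the next PC."""
    instr = imem[pc >> 2]                      # fetch
    pc4 = (pc + 4) & MASK32                    # PC + 4 adder
    op = instr >> 26
    rs, rt, rd = (instr >> 21) & 31, (instr >> 16) & 31, (instr >> 11) & 31
    imm = sext16(instr & 0xFFFF)               # sign extender
    a, b = regs[rs], regs[rt]                  # register file read

    if op == 0x00:                             # R-type: ALU op, write rd
        result = R_FUNCT[instr & 0x3F](a, b) & MASK32
        if rd != 0:
            regs[rd] = result
    elif op == 0x23:                           # lw: read memory, write rt
        if rt != 0:
            regs[rt] = dmem.get((a + imm) & MASK32, 0)
    elif op == 0x2B:                           # sw: write rt to memory
        dmem[(a + imm) & MASK32] = b
    elif op == 0x04:                           # beq: compare, maybe branch
        if a == b:
            return (pc4 + (imm << 2)) & MASK32
    elif op == 0x02:                           # j: build jump target
        return (pc4 & 0xF0000000) | ((instr & 0x03FFFFFF) << 2)
    else:
        raise ValueError(f"unsupported opcode {op:#x}")
    return pc4


# lw $t0,0($zero); lw $t1,4($zero); add $t2,$t0,$t1; sw $t2,8($zero)
program = [0x8C080000, 0x8C090004, 0x01095020, 0xAC0A0008]
regs, dmem, pc = [0] * 32, {0: 5, 4: 7}, 0
for _ in program:
    pc = step(pc, regs, program, dmem)
print(dmem[8])  # prints: 12
```

Note that the model keeps instruction and data memory as separate objects and computes PC + 4 independently of the main ALU operation, mirroring the rule that no functional unit is used twice in one cycle.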