Computer Architecture: A Constructive Approach Sequential Circuits Arvind

Computer Architecture: A Constructive Approach Sequential Circuits Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology http://csg.csail.mit.edu/SNU

OpSelect - Add, Sub, ... - And, Or, Xor, Not, ... - GT, LT, EQ, Zero, ... Sel Sel lg(n) lg(n) O0 O1 On-1 O0 O1 On-1 A0 A1 An-1 A O A Result A Demux Decoder ALU Mux lg(n) Comp? . . . B . . . . . . Combinational circuits Such circuits have no cycles (feedback) or state

A simple synchronous state element Edge-Triggered Flip-flop D Q ff C C D Q Metastability Data is sampled at the rising edge of the clock

EN D Q ff C C EN D Q Flip-flops with Write Enables D Q ff EN C dangerous! EN 0 1 Q D C ff Data is captured only if EN is on

D D D D D D D D En ff ff ff ff ff ff ff ff C Q Q Q Q Q Q Q Q Registers Register: A group of flip-flops with a common clock and enable Register file: A group of registers with a common clock, input and output port(s)

Clock WE ReadSel1 ReadData1 Register file 2R + 1W ReadSel2 ReadData2 WriteSel WriteData RSel1 WSel WData C RData1 register 0 RSel2 WE register 1 RData2 Register Files No timing issues in reading a selected register

WE ReadSel1 ReadData1 Register file 2R + 1W ReadSel2 ReadData2 WriteSel WriteData WE ReadSel ReadData Register file 1R + 1R/W R/WSel R/WData Register Files and Ports Ports were expensive multiplex a port for read & write

We can build useful and compact circuits using registers Example: Multiplication by repeated addition http://csg.csail.mit.edu/SNU

Multiplication by repeated addition a0 m0 b Multiplicand a Muliplier * 1101 (13) 1011 (11) 1101 + 1101 + 0000 + 1101 10001111 (143) a1 m1 0 add4 a2 m2 add4 a3 m3 mi = (a[i]==0)? 0 : b; add4 http://csg.csail.mit.edu/SNU

Combinational 32-bit multiply function Bit#(64) mul32(Bit#(32) a, Bit#(32) b); Bit#(32) prod = 0; Bit#(32) tp = 0;for(Integer i = 0; i < 32; i = i+1)begin Bit#(32) m = (a[i]==0)? 0 : b; Bit#(33) sum = add32(m,tp,0); prod[i] = sum[0];tp = truncateLSB(sum); endreturn {tp,prod};endfunction http://csg.csail.mit.edu/SNU

Combinational n-bit multiply function Bit#(w+w) mulN(Bit#(w) a, Bit#(w) b); Bit#(w) prod = 0; Bit#(w) tp = 0;for(Integer i = 0; i < w; i = i+1)begin Bit#(w) m = (a[i]==0)? 0 : b; Bit#(w+1) sum = addN(m,tp,0); prod[i] = sum[0];tp = truncateLSB(sum); endreturn {tp,prod};endfunction What is wrong with this code? http://csg.csail.mit.edu/SNU

Combinational n-bit multiply function Bit#(Tadd#(w,w)) mulN(Bit#(w) a, Bit#(w) b); Bit#(w) prod = 0; Bit#(w) tp = 0;for(Integer i = 0; i < valueOf(w); i = i+1)begin Bit#(w) m = (a[i]==0)? 0 : b; Bit#(Tadd#(w,1))sum = addN(m,tp,0); prod[i] = sum[0];tp = truncateLSB(sum); endreturn {tp,prod};endfunction http://csg.csail.mit.edu/SNU

Design issues with combinational multiply • Lot of hardware • 32-bit multiply uses 31 addN circuits • Long chains of gates • 32-bit ripple carry adder has a 31 long chain of gates • 32-bit multiply has 31 ripple carry adders in sequence! The speed of a combinational circuit is determined by its longest input-to-output path http://csg.csail.mit.edu/SNU

Sequential Multiply y i b a mul_en & tp prod << add 0 0 << x == 32 done result (high) result (low) http://csg.csail.mit.edu/SNU

Sequential multiply Reg#(Bit#(32)) a <- mkRegU(); Reg#(Bit#(32)) b <- mkRegU(); Reg#(Bit#(32)) prod <-mkRegU();Reg#(Bit#(32)) tp <- mkRegU(); Reg#(Bit#(6)) i <- mkReg(32); rule mulStep if (i < 32); Bit#(32) m = (a[i]==0)? 0 : b; Bit#(33) sum = add32(m,tp,0); prod[i] <= sum[0];tp <= truncateLSB(sum); i <= i+1;endrule state elements a rule to describe dynamic behavior http://csg.csail.mit.edu/SNU

Tadd#(t,t) t t t t t Int#(32) Int#(32) start enab rdy Multiply module Int#(64) rdy #(Numeric type t) result A Multiply Module tcould be Int#(32), Int#(13), ... • The module can easily be made polymorphic implicit conditions interfaceMultiply; methodAction start (Int#(32) a, Int#(32) b); methodInt#(64) result(); endinterface Many different implementations can provide the same interface: module mkMultiply (Multiply) http://csg.csail.mit.edu/6.375

State External interface Multiply Module module mkMultiply32 (Multiply32); … Reg#(Bit#(32)) prod <-mkRegU();Reg#(Bit#(32)) tp <- mkRegU(); rule mulStep if (i < 32); … prod[i]<= sum[0]; tp<= truncateLSB(sum); endrule method Action start(Bit#(32) aIn, Bit#(32) bIn) if (i == 32); a <= aIn; b <= bIn; tp <= 0; prod <= 0; i <= 0; endmethod methodBit#(64) result() if (i == 32); return {tp,prod}; endmethodendmodule Internal behavior http://csg.csail.mit.edu/SNU

Sequential 32-bit multiply module mkMultiply32 (Multiply32); Reg#(Bit#(32)) a <- mkRegU(); Reg#(Bit#(32)) b <- mkRegU(); Reg#(Bit#(32)) prod <-mkRegU();Reg#(Bit#(32)) tp <- mkRegU(); Reg#(Bit#(6)) i <- mkReg(32);rule mulStep if (i < 32); Bit#(32) m = (a[i]==0)? 0 : b; Bit#(33) sum = add32(m,tp,0); prod[i] <= sum[0]; tp <= truncateLSB(sum);i <= i+1;endrule method Action start(Bit#(32) aIn, Bit#(32) bIn) if (i == 32); a <= aIn; b <= bIn; tp <= 0; prod <= 0; i <= 0; endmethod method Bit#(64) result() if (i == 32); return {tp,prod}; endmethodendmodule http://csg.csail.mit.edu/SNU

Sequential n-bit multiply modulemkMultiplyN(MultiplyN); Reg#(Bit#(w)) a <- mkRegU(); Reg#(Bit#(w)) b <- mkRegU(); Reg#(Bit#(w)) prod <-mkRegU();Reg#(Bit#(w)) tp <- mkRegU(); Reg#(Bit#(Add#(Tlog(w),1)) i <- mkReg(w);rule mulStep if (i < valueOf(w)); Bit#(w) m = (a[i]==0)? 0 : b; Bit#(Tadd#(w,1)) sum = addN(m,tp,0); prod[i] <= sum[0];tp<= truncateLSB(sum);i <= i+1;endrule method Action start(Bit#(w) aIn, Bit#(w) bIn) if (i== valueOf(w)); a <= aIn; b <= bIn; tp <= 0; prod <= 0; i <= 0; endmethod method Bit#(Tadd#(w,w)) result() if (i== valueOf(w)); return {tp,prod}; endmethodendmodule http://csg.csail.mit.edu/SNU

x y en rdy start start_en y_en x_en start_en x y x rdy result !(=0) > swap? subtract? Module: Method Interface sub Not complete (taken from the GCD example) http://csg.csail.mit.edu/6.375

Sequential Multiply Module y i a b start_rdy start_en tp prod add & 0 << 0 -1 << x != 32 result (high) result (low) result_rdy Not complete http://csg.csail.mit.edu/SNU

Pipelined Barrel Shifter • By inserting a register we can shorten the combinational critical path 0 0 0 0 0 0 Lab exercise 2 http://csg.csail.mit.edu/SNU

Computer Architecture: A Constructive Approach Sequential Circuits Arvind