ECE 448 Lecture 13: Multipliers, Timing Parameters
ECE 448 – FPGA and ASIC Design with VHDL
Required reading
• S. Brown and Z. Vranesic, Fundamentals of Digital Logic with VHDL Design
• Chapter 10.2.3, Shift-and-Add Multiplier
Shift-and-Add Multiplier
An algorithm for multiplication
                      Decimal    Binary
    Multiplicand, A        13            1 1 0 1
    Multiplier, B        × 11          × 1 0 1 1
                        -----    ---------------
                           13            1 1 0 1
                          13           1 1 0 1
                                     0 0 0 0
                                   1 1 0 1
                        -----    ---------------
    Product               143    1 0 0 0 1 1 1 1
    (a) Manual method

    P = 0 ;
    for i = 0 to n - 1 do
        if b_i = 1 then
            P = P + A ;
        end if ;
        Left-shift A ;
    end for ;
    (b) Pseudo-code
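The pseudo-code can be checked in simulation with a small behavioral model. The following is only a sketch: the names shift_add_demo and shift_add are hypothetical, and it uses ieee.numeric_std rather than the std_logic_unsigned package used in the lecture code. It assumes n = 4 and asserts the 13 x 11 = 143 example from the manual method.

```vhdl
LIBRARY ieee ;
USE ieee.std_logic_1164.all ;
USE ieee.numeric_std.all ;

-- Hypothetical simulation-only entity (no ports); not part of the lecture code
ENTITY shift_add_demo IS
END shift_add_demo ;

ARCHITECTURE Sim OF shift_add_demo IS
    CONSTANT N : INTEGER := 4 ;   -- assumed operand width for this sketch

    -- Behavioral model of the pseudo-code: P = sum of A shifted by i for each b_i = 1
    FUNCTION shift_add ( A_in, B_in : UNSIGNED(N-1 DOWNTO 0) )
        RETURN UNSIGNED IS
        VARIABLE A : UNSIGNED(2*N-1 DOWNTO 0) ;
        VARIABLE P : UNSIGNED(2*N-1 DOWNTO 0) := (OTHERS => '0') ;
    BEGIN
        A := RESIZE(A_in, 2*N) ;          -- zero-extend the multiplicand to 2n bits
        FOR i IN 0 TO N-1 LOOP
            IF B_in(i) = '1' THEN
                P := P + A ;              -- add the shifted multiplicand
            END IF ;
            A := SHIFT_LEFT(A, 1) ;       -- Left-shift A
        END LOOP ;
        RETURN P ;
    END shift_add ;
BEGIN
    Check: PROCESS
    BEGIN
        ASSERT shift_add(TO_UNSIGNED(13, N), TO_UNSIGNED(11, N)) = TO_UNSIGNED(143, 2*N)
            REPORT "13 x 11 should be 143" SEVERITY FAILURE ;
        WAIT ;                            -- stop the process after the single check
    END PROCESS ;
END Sim ;
```

The function processes all n bits in a loop within one call, whereas the hardware version spreads the same steps over several clock cycles under FSM control.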
Expected behavior of the multiplier
Datapath for the multiplier
    [Figure: n-bit DataA, zero-extended to 2n bits, loads shift-left register A (load LA, enable EA); n-bit DataB loads shift-right register B (load LB, enable EB). A 2n-bit adder forms Sum = A + P. A 2n-bit 2-to-1 multiplexer controlled by Psel selects 0 or Sum as DataP, the input of register P (enable EP). z flags B = 0; b0 is the least significant bit of B. All registers share Clock.]
ASM chart for the multiplier
    [Figure: ASM chart. After Reset, state S1: P ← 0, Load A, Load B; stay while s = 0, go to S2 when s = 1. State S2: shift left A, shift right B; if b0 = 1 then P ← P + A; stay until B = 0, then go to S3. State S3: Done; stay while s = 1, return to S1 when s = 0.]
ASM chart for the multiplier control circuit
VHDL code of multiplier circuit (1)
    LIBRARY ieee ;
    USE ieee.std_logic_1164.all ;
    USE ieee.std_logic_unsigned.all ;
    USE work.components.all ;

    ENTITY multiply IS
        GENERIC ( N  : INTEGER := 8 ;
                  NN : INTEGER := 16 ) ;    -- NN = 2*N, the product width
        PORT ( Clock     : IN  STD_LOGIC ;
               Resetn    : IN  STD_LOGIC ;
               LA, LB, s : IN  STD_LOGIC ;
               DataA     : IN  STD_LOGIC_VECTOR(N-1 DOWNTO 0) ;
               DataB     : IN  STD_LOGIC_VECTOR(N-1 DOWNTO 0) ;
               P         : OUT STD_LOGIC_VECTOR(NN-1 DOWNTO 0) ;   -- product is 2n bits
               Done      : OUT STD_LOGIC ) ;
    END multiply ;
VHDL code of multiplier circuit (2)
    ARCHITECTURE Behavior OF multiply IS
        TYPE State_type IS ( S1, S2, S3 ) ;
        SIGNAL y : State_type ;
        SIGNAL Psel, z, EA, EB, EP, Zero : STD_LOGIC ;
        SIGNAL B, N_Zeros : STD_LOGIC_VECTOR(N-1 DOWNTO 0) ;
        -- PF is the 2n-bit output of register P, so it is declared NN bits wide
        SIGNAL A, Ain, DataP, Sum, PF : STD_LOGIC_VECTOR(NN-1 DOWNTO 0) ;
    BEGIN
        FSM_transitions: PROCESS ( Resetn, Clock )
        BEGIN
            IF Resetn = '0' THEN
                y <= S1 ;
            ELSIF (Clock'EVENT AND Clock = '1') THEN
                CASE y IS
                    WHEN S1 =>
                        IF s = '0' THEN y <= S1 ; ELSE y <= S2 ; END IF ;
                    WHEN S2 =>
                        IF z = '0' THEN y <= S2 ; ELSE y <= S3 ; END IF ;
                    WHEN S3 =>
                        IF s = '1' THEN y <= S3 ; ELSE y <= S1 ; END IF ;
                END CASE ;
            END IF ;
        END PROCESS ;
VHDL code of multiplier circuit (3)
        FSM_outputs: PROCESS ( y, s, B(0) )
        BEGIN
            -- Default values of the control signals
            EP <= '0' ; EA <= '0' ; EB <= '0' ; Done <= '0' ; Psel <= '0' ;
            CASE y IS
                WHEN S1 =>
                    EP <= '1' ;                 -- Psel = '0', so P is loaded with 0
                WHEN S2 =>
                    EA <= '1' ; EB <= '1' ; Psel <= '1' ;
                    IF B(0) = '1' THEN EP <= '1' ; ELSE EP <= '0' ; END IF ;
                WHEN S3 =>
                    Done <= '1' ;
            END CASE ;
        END PROCESS ;
VHDL code of multiplier circuit (4)
        -- Define the datapath circuit
        Zero <= '0' ;
        N_Zeros <= (OTHERS => '0') ;
        Ain <= N_Zeros & DataA ;                -- zero-extend DataA to 2n bits
        ShiftA: shiftlne GENERIC MAP ( N => NN )
            PORT MAP ( Ain, LA, EA, Zero, Clock, A ) ;
        ShiftB: shiftrne GENERIC MAP ( N => N )
            PORT MAP ( DataB, LB, EB, Zero, Clock, B ) ;
        z <= '1' WHEN B = N_Zeros ELSE '0' ;
        Sum <= A + PF ;
        P <= PF ;
        -- Define the 2n 2-to-1 multiplexers for DataP
        GenMUX: FOR i IN 0 TO NN-1 GENERATE
            Muxi: mux2to1 PORT MAP ( Zero, Sum(i), Psel, DataP(i) ) ;
        END GENERATE ;
        RegP: regne GENERIC MAP ( N => NN )
            PORT MAP ( DataP, Resetn, EP, Clock, PF ) ;
    END Behavior ;
Array Multiplier
Notation
    a    Multiplicand       a_{k-1} a_{k-2} . . . a_1 a_0
    x    Multiplier         x_{k-1} x_{k-2} . . . x_1 x_0
    p    Product (a × x)    p_{2k-1} p_{2k-2} . . . p_2 p_1 p_0
Unsigned Multiplication
                                            a4    a3    a2    a1    a0
                                      ×     x4    x3    x2    x1    x0
    ------------------------------------------------------------------
    a x0 2^0                                a4x0  a3x0  a2x0  a1x0  a0x0
    a x1 2^1                          a4x1  a3x1  a2x1  a1x1  a0x1
    a x2 2^2                    a4x2  a3x2  a2x2  a1x2  a0x2
    a x3 2^3              a4x3  a3x3  a2x3  a1x3  a0x3
    a x4 2^4        a4x4  a3x4  a2x4  a1x4  a0x4
    ------------------------------------------------------------------
              p9    p8    p7    p6    p5    p4    p3    p2    p1    p0
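The partial-product rows above can be summarized in one formula. The block below is a sketch of that sum for k = 5, followed by a worked check that reuses the operands 13 and 11 from the shift-and-add example (written here as 5-bit values, which is an assumption made for illustration).

```latex
\[
p = a \cdot x = \sum_{i=0}^{k-1} x_i \, a \, 2^{i}, \qquad k = 5
\]
\[
a = 01101_2 = 13,\quad x = 01011_2 = 11:\qquad
p = 13\,(2^0 + 2^1 + 2^3) = 13 + 26 + 104 = 143 = 0010001111_2
\]
```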
5 x 5 Array Multiplier
Array Multiplier - Basic Cell
    [Figure: full adder (FA) with inputs x, y, cin and outputs s, cout]
Array Multiplier - Modified Basic Cell
    [Figure: full adder (FA) cell with inputs am, xn, si-1, ci and outputs si, ci+1]
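One way to describe the modified basic cell in VHDL is sketched below. It assumes the cell ANDs a_m with x_n to form a partial-product bit and feeds that bit, the incoming sum, and the incoming carry into a full adder. The entity name mult_cell and all port names are hypothetical, chosen only to mirror the labels in the figure.

```vhdl
LIBRARY ieee ;
USE ieee.std_logic_1164.all ;

-- Hypothetical cell: AND gate followed by a full adder
ENTITY mult_cell IS
    PORT ( a_m, x_n     : IN  STD_LOGIC ;    -- multiplicand bit, multiplier bit
           s_in, c_in   : IN  STD_LOGIC ;    -- s(i-1) and c(i) in the figure
           s_out, c_out : OUT STD_LOGIC ) ;  -- s(i) and c(i+1) in the figure
END mult_cell ;

ARCHITECTURE Dataflow OF mult_cell IS
    SIGNAL pp : STD_LOGIC ;                  -- partial-product bit a_m * x_n
BEGIN
    pp    <= a_m AND x_n ;
    -- Full-adder sum and carry of pp, s_in and c_in
    s_out <= pp XOR s_in XOR c_in ;
    c_out <= (pp AND s_in) OR (pp AND c_in) OR (s_in AND c_in) ;
END Dataflow ;
```

An array multiplier would instantiate this cell in a grid, for example with nested FOR ... GENERATE statements; that wiring is not shown here.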
5 x 5 Array Multiplier with modified cells
Pipelined 5 x 5 Multiplier
Array Multiplier - Modified Basic Cell
    [Figure: the modified basic cell (inputs am, xn, si-1, ci; outputs si, ci+1) with flip-flops added for pipelining]
Timing parameters
    parameter          units      definition
    delay              ns         time from point to point
    clock period T     ns         rising edge to rising edge of clock
    clock frequency    MHz        1 / clock period
    latency            ns         time from input to output
    throughput         Mbits/s    #output bits / time unit
Latency
• Latency is the time between input(n) and output(n)
    • i.e., the time it takes from first input to first output, second input to second output, etc.
• Latency is usually constant for a system (but not always)
• Also called input-to-output latency
• Count the number of rising edges of the clock!
    • In this example, there are 2 rising edges from input to output, so latency is 2 cycles
• Latency is measured in clock cycles and then translated to units of time (nanoseconds)
    • In this example, say the clock period is 10 ns; then latency is 20 ns
    [Figure: top-level entity with 8-bit input and 8-bit output, D flip-flops separated by combinational logic, 100 MHz clk; timing diagram of clk, input(0), input(1), input(2) and output(0), output(1)]
Throughput
• Throughput = (bits per output sample) / (time between consecutive output samples)
• Bits per output sample:
    • In this example, 8 bits per output sample
• Time between consecutive output samples: clock cycles between output(n) and output(n+1)
    • Can be measured in clock cycles, then translated to time
    • In this example, time between consecutive output samples = 1 clock cycle = 10 ns
• Throughput = (8 bits per output sample) / (10 ns) = 0.8 bits/ns = 800 Mbits/s
    [Figure: top-level entity with 8-bit input and 8-bit output, D flip-flops separated by combinational logic; timing diagram of clk, input(0), input(1), input(2) and output(0), output(1), with 1 cycle between output samples]
Pipelining - Conceptual
• The purpose of pipelining is to reduce the critical path of the circuit by inserting an additional register (called a pipeline register)
• This splits the combinational logic in half
• Now the critical path delay is 5 ns, so the maximum clock frequency is 200 MHz
    • Double the clock frequency
• Area is increased due to the additional register
• In general, pipelining increases throughput at the cost of increased area/power and a minor increase in latency
    [Figure: a pipeline register splits the combinational logic in half, with tLOGICA = 5 ns and tLOGICB = 5 ns]
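The frequency figures above follow from the clock period being bounded by the longest register-to-register path. The block below is a sketch of that arithmetic; it assumes register setup time and clock-to-output delay are negligible, which the slide's numbers also appear to do. The subscripts "before" and "after" are labels introduced here for illustration.

```latex
\begin{align*}
T_{\text{before}} &= t_{\text{LOGICA}} + t_{\text{LOGICB}} = 5\,\text{ns} + 5\,\text{ns} = 10\,\text{ns}
  & \Rightarrow\quad f_{\max} &= \frac{1}{10\,\text{ns}} = 100\,\text{MHz} \\
T_{\text{after}} &= \max\left(t_{\text{LOGICA}},\, t_{\text{LOGICB}}\right) = 5\,\text{ns}
  & \Rightarrow\quad f_{\max} &= \frac{1}{5\,\text{ns}} = 200\,\text{MHz}
\end{align*}
```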