CSE246 Adder – Part I
Instructor: Prof. Chung-Kuan Cheng
Framework
• Adder Design Specification
  • Half/Full adder
  • Carry ripple adder
• Adder Design Optimization
  • Circuit level – asynchronous adder, Manchester adder
  • Logic level – carry look-ahead adder, Ling's adder, etc.
  • Algorithm level – prefix adders
    • Generic parallel prefix adder optimization using dynamic programming
    • Zero-deficiency prefix adder
  • Function level – carry skip adder
• Multi-operand Addition
Half Adder
• "Half" means no carry-in
• Input: x_i, y_i
• Sum: s_i = x_i ⊕ y_i
• Carry out: c_{i+1} = x_i y_i
• Notation:
  • ⊕ means logical XOR
  • + means logical OR
  • Juxtaposition means logical AND
Full Adder
• Input: x_i, y_i and carry-in c_i
• Output:
  • s_i = x_i ⊕ y_i ⊕ c_i
  • c_{i+1} = x_i y_i + c_i (x_i + y_i) = x_i y_i + c_i (x_i ⊕ y_i)
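The half and full adder equations can be checked exhaustively in a few lines. The following is a minimal Python sketch, not part of the original slides; the function names `half_adder`, `full_adder`, and `full_adder_or_form` are illustrative assumptions. It shows that the OR form and the XOR form of the carry-out agree on all eight input combinations.

```python
def half_adder(x, y):
    """Return (sum, carry_out) for one-bit inputs with no carry-in."""
    return x ^ y, x & y


def full_adder(x, y, c):
    """Return (sum, carry_out) using the XOR form of the carry-out."""
    s = x ^ y ^ c
    c_out = (x & y) | (c & (x ^ y))
    return s, c_out


def full_adder_or_form(x, y, c):
    """Same full adder, but the carry-out uses (x OR y) instead of (x XOR y)."""
    s = x ^ y ^ c
    c_out = (x & y) | (c & (x | y))
    return s, c_out


if __name__ == "__main__":
    for x in (0, 1):
        for y in (0, 1):
            for c in (0, 1):
                print(x, y, c, full_adder(x, y, c))
```

The two carry-out forms coincide because the only input pair where x + y and x ⊕ y differ is x = y = 1, and in that case the x y term is already 1.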
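Ripple Carry Adder
• Overflow flag = c_n ⊕ c_{n-1}
[Figure: chain of full adders with inputs x_0 y_0, x_1 y_1, …, x_{n-1} y_{n-1}, carry-in c_0 = c_in, internal carries c_1, c_2, …, carry-out c_n = c_out, and sum outputs s_0, s_1, …]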
Understanding Carry Ripple Chain
• Carry generation signal: g_i = x_i y_i
• Carry propagation signal: p_i = x_i ⊕ y_i
• Carry annihilation signal: a_i = (x_i + y_i)'
• Carry ripple in terms of p, g: c_{i+1} = g_i + p_i c_i
• In practice, we might use t_i = x_i + y_i = p_i + g_i and c_{i+1} = g_i + t_i c_i
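To make the carry recurrence concrete, here is a small Python sketch of a ripple carry adder driven by the per-bit signals. The helper names (`bit_signals`, `ripple_carries`, `ripple_add`) and the little-endian bit lists are assumptions made for illustration only. The `use_t` flag switches between the p-based and the t-based recurrence so the two can be compared.

```python
def bit_signals(x, y):
    """Return (g, p, a, t) for one-bit inputs x and y."""
    g = x & y          # carry generation
    p = x ^ y          # carry propagation
    a = 1 - (x | y)    # carry annihilation
    t = x | y          # OR-based alternative to p
    return g, p, a, t


def ripple_carries(x_bits, y_bits, c_in=0, use_t=False):
    """Return [c_0, ..., c_n]; bit lists are little-endian (index i is bit i)."""
    carries = [c_in]
    for x, y in zip(x_bits, y_bits):
        g, p, _, t = bit_signals(x, y)
        k = t if use_t else p
        carries.append(g | (k & carries[-1]))   # c_{i+1} = g_i + k_i c_i
    return carries


def ripple_add(x, y, n, c_in=0):
    """Add two n-bit integers; return (n-bit sum, carry-out)."""
    x_bits = [(x >> i) & 1 for i in range(n)]
    y_bits = [(y >> i) & 1 for i in range(n)]
    c = ripple_carries(x_bits, y_bits, c_in)
    s = sum((x_bits[i] ^ y_bits[i] ^ c[i]) << i for i in range(n))
    return s, c[n]
```

For example, `ripple_add(9, 7, 4)` yields a 4-bit sum of 0 with carry-out 1, since 9 + 7 = 16.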
Carry Ripple using (g,p) signals
• Consider the (g, p) chain
• Break the long paths
[Figure: carry chain from c_1 to c_4 through the stages (g_1, p_1), (g_2, p_2), (g_3, p_3)]
Circuit level optimization
• Manchester adder – concept
• Exactly one of g_i, p_i, and a_i is 1 for each bit
[Figure: Manchester carry stage with VDD and signals g_i, p_i, a_i, c_i]
Circuit level optimization
• Manchester adder – static logic implementation
[Figure: static Manchester carry stage with signals (g_i)', p_i, a_i, c_i, c_{i+1}]
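Circuit level optimization
• Manchester adder – dynamic logic implementation
• Precharge in the first half cycle
• Evaluation in the second half cycle
[Figure: dynamic Manchester carry stage with CLK, p_i, a_i, c_i, c_{i+1}; timing diagram of node Q over time showing the precharge and evaluation phases]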
Logic level optimization
• Carry look-ahead adder
  • Instead of generating carries bit by bit, look ahead to generate a group of consecutive carries simultaneously
  • Use logic manipulation to save hardware
• Recursively unroll:
  • c_i = g_{i-1} + p_{i-1} c_{i-1}
  • c_i = g_{i-1} + p_{i-1} g_{i-2} + p_{i-1} p_{i-2} c_{i-2}
  • c_i = g_{i-1} + p_{i-1} g_{i-2} + p_{i-1} p_{i-2} g_{i-3} + p_{i-1} p_{i-2} p_{i-3} g_{i-4} + p_{i-1} p_{i-2} p_{i-3} p_{i-4} c_{i-4}
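The unrolled expression can be compared against the bit-by-bit recurrence. The sketch below takes i = 4, so it computes c_4 from c_0 in one flat expression; the function names and the little-endian lists `g` and `p` are illustrative assumptions, not notation from the slides.

```python
def gp_signals(x_bits, y_bits):
    """Return per-bit (g, p) lists for little-endian bit lists."""
    g = [x & y for x, y in zip(x_bits, y_bits)]
    p = [x ^ y for x, y in zip(x_bits, y_bits)]
    return g, p


def lookahead_carry4(g, p, c0):
    """c_4 from the fully unrolled look-ahead expression (two logic levels)."""
    return (g[3]
            | (p[3] & g[2])
            | (p[3] & p[2] & g[1])
            | (p[3] & p[2] & p[1] & g[0])
            | (p[3] & p[2] & p[1] & p[0] & c0))


def ripple_carry4(g, p, c0):
    """c_4 from four applications of c_{i+1} = g_i + p_i c_i."""
    c = c0
    for gi, pi in zip(g, p):
        c = gi | (pi & c)
    return c
```

Both functions compute the same value; the look-ahead form trades a longer serial chain for wider gates.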
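Logic level optimization
• Ling's adder
  • Notice g_i = g_i t_i, where t_i = x_i + y_i (the identity does not hold for p_i = x_i ⊕ y_i, since g_i p_i = 0)
  • Write the look-ahead expansion with t_i in place of p_i and factor out t_{i-1}:
    c_i = t_{i-1} (g_{i-1} + g_{i-2} + t_{i-2} g_{i-3} + t_{i-2} t_{i-3} g_{i-4} + t_{i-2} t_{i-3} t_{i-4} c_{i-4})
  • Define the expression in parentheses to be h_i
    • c_i = t_{i-1} h_i
    • h_i = g_{i-1} + g_{i-2} + t_{i-2} g_{i-3} + t_{i-2} t_{i-3} g_{i-4} + t_{i-2} t_{i-3} t_{i-4} t_{i-5} h_{i-4}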
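As a sanity check on Ling's factorization, the sketch below evaluates h_4 and c_4 = t_3 h_4 for a 4-bit block and compares the result with the ordinary ripple recurrence. It assumes little-endian bit lists and uses the hypothetical names `ling_carry4` and `ripple_carry4`; it is an illustration of the algebra, not a gate-level design.

```python
from itertools import product


def ling_carry4(x, y, c0):
    """c_4 computed as t_3 * h_4, with g_i = x_i y_i and t_i = x_i + y_i."""
    g = [a & b for a, b in zip(x, y)]
    t = [a | b for a, b in zip(x, y)]
    h4 = (g[3]
          | g[2]
          | (t[2] & g[1])
          | (t[2] & t[1] & g[0])
          | (t[2] & t[1] & t[0] & c0))
    return t[3] & h4


def ripple_carry4(x, y, c0):
    """c_4 from c_{i+1} = g_i + p_i c_i with p_i = x_i XOR y_i."""
    c = c0
    for a, b in zip(x, y):
        c = (a & b) | ((a ^ b) & c)
    return c


def all_cases_agree():
    """Exhaustively compare the two carry computations on 4-bit inputs."""
    for bits in product((0, 1), repeat=9):
        x, y, c0 = bits[0:4], bits[4:8], bits[8]
        if ling_carry4(x, y, c0) != ripple_carry4(x, y, c0):
            return False
    return True
```

Note that h_4 has one fewer literal in its leading terms than the look-ahead expression, which is where Ling's adder saves logic.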
Asynchronous Adder • Carry completion detection
Group (G,P) signals
• Generating g[3:2] and p[3:2]
[Figure: the (g_3, p_3) and (g_2, p_2) stages of the carry chain between c_1 and c_4 merged into one group stage (g[3:2], p[3:2])]
Group (G,P) signals
• Generating g[1:0] and p[1:0]
[Figure: the (g_1, p_1) stage and the carry-in c_in merged into one group stage (g[1:0], p[1:0])]
Group (G,P) signals
• Generating g[3:0] and p[3:0]
[Figure: the groups (g[3:2], p[3:2]) and (g[1:0], p[1:0]) merged into (g[3:0], p[3:0]), with carry-in c_in]
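Group (G,P) signals
• Carry-out as a nested expression: g_4 + p_4 (g_3 + p_3 (g_2 + p_2 (g_1 + p_1 (g_0 + p_0 c_in))))
• Start: (g_4, p_4), (g_3, p_3), (g_2, p_2), (g_1, p_1), (g_0, p_0), c_in
• Step 1: (g[4:3], p[4:3]) = (g_4 + p_4 g_3, p_4 p_3); (g[2:1], p[2:1]) = (g_2 + p_2 g_1, p_2 p_1); (g_0, p_0 c_in)
• Step 2: (g[4:1], p[4:1]) = (g_4 + p_4 g_3 + p_4 p_3 (g_2 + p_2 g_1), p_4 p_3 p_2 p_1); (g_0, p_0 c_in)
• Step 3: (g_4 + p_4 g_3 + p_4 p_3 (g_2 + p_2 g_1) + (p_4 p_3 p_2 p_1) g_0, (p_4 p_3 p_2 p_1) p_0 c_in)
• c_out = g[4:0] + p[4:0] c_in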
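The grouping steps above can be replayed in code. The following Python sketch assumes a helper named `combine` that merges a more significant group with the adjacent less significant one; the names and the little-endian lists are illustrative, not taken from the slides. It groups bits in the same order as the 5-bit example and checks the result against the nested expression.

```python
def combine(hi, lo):
    """Merge group `hi` with the adjacent, less significant group `lo`."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def carry_out_grouped(g, p, c_in):
    """Carry-out of a 5-bit adder using the grouping [4:3], [2:1], [4:1], [4:0]."""
    gp = list(zip(g, p))                 # gp[i] = (g_i, p_i)
    group_43 = combine(gp[4], gp[3])
    group_21 = combine(gp[2], gp[1])
    group_41 = combine(group_43, group_21)
    big_g, big_p = combine(group_41, gp[0])
    return big_g | (big_p & c_in)        # c_out = g[4:0] + p[4:0] c_in


def carry_out_nested(g, p, c_in):
    """Carry-out from the nested form g_4 + p_4 (g_3 + p_3 (... (g_0 + p_0 c_in)))."""
    c = c_in
    for gi, pi in zip(g, p):
        c = gi | (pi & c)
    return c
```

Because the operator is associative, any other grouping of adjacent blocks would give the same carry-out; only the circuit depth and size change.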
Parallel Prefix Adder
• What is the parallel prefix problem?
• How is binary addition modeled as a parallel prefix problem?
Parallel Prefix Problem (PPP)
Given n inputs x_1, x_2, …, x_n, which can be either scalars or vectors, and an arbitrary associative operator •, compute the products y_i = x_i • x_{i-1} • … • x_1 for 1 ≤ i ≤ n
Parallel Prefix Problem
• A direct example is the prefix sum problem
• The operator • is simply natural addition: y_i = x_i + x_{i-1} + … + x_1 for 1 ≤ i ≤ n
• Partial sum: s[i:j] = x_i + x_{i-1} + … + x_j (n ≥ i ≥ j ≥ 1)
  • y_n = s[n:1] = x_n + x_{n-1} + … + x_1
  • y_{n-1} = s[n-1:1] = x_{n-1} + x_{n-2} + … + x_1
  • …
  • y_2 = s[2:1] = x_2 + x_1
  • y_1 = s[1:1] = x_1
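A serial reference implementation of the prefix problem is short. The sketch below is an illustration under the assumption that the inputs are given as a Python list `[x_1, ..., x_n]`; the function name `prefix` is hypothetical. It applies the operator with the newer element on the left, which matters when the operator is associative but not commutative.

```python
import operator
from itertools import accumulate


def prefix(xs, op):
    """Return [y_1, ..., y_n] with y_i = x_i op x_{i-1} op ... op x_1."""
    ys = []
    for x in xs:
        ys.append(x if not ys else op(x, ys[-1]))
    return ys


if __name__ == "__main__":
    data = [3, 1, 4, 1, 5]
    print(prefix(data, operator.add))            # prefix sums
    print(list(accumulate(data, operator.add)))  # standard-library equivalent for addition
    print(prefix(["a", "b", "c"], operator.add)) # concatenation shows the operand order
```

For addition the result matches `itertools.accumulate`; for string concatenation the outputs are "a", "ba", "cba", reflecting the x_i • x_{i-1} • … • x_1 ordering.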
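Binary addition as a PPP
• Addends: x and y, with bits x_i and y_i
• Sum: s, with bits s_i
• Carry generation signals: g_i = x_i y_i
• Carry propagation signals: p_i = x_i ⊕ y_i
• Carry bits: c_{i+1} = g_i + p_i c_i
• Sum bits: s_i = p_i ⊕ c_i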
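Binary Addition as a PPP
• Block carry generation signal: g[i:j] = g_i + p_i g_{i-1} + … + p_i p_{i-1} … p_{j+1} g_j
• Block carry propagation signal: p[i:j] = p_i p_{i-1} … p_j
• Introducing the (G,P) operator ∘ for adjacent blocks (i ≥ k > j): (g[i:k], p[i:k]) ∘ (g[k-1:j], p[k-1:j]) = (g[i:k] + p[i:k] g[k-1:j], p[i:k] p[k-1:j]) = (g[i:j], p[i:j])
• The calculation of (G,P) pairs becomes a prefix problem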
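Putting the pieces together, binary addition can be written as a prefix computation over (G,P) pairs. The sketch below is a minimal Python illustration; the names `combine` and `prefix_add` are assumptions, and the prefix stage is done serially for clarity, whereas a real prefix adder would evaluate the same associative operator in a tree.

```python
def combine(hi, lo):
    """(G,P) operator: merge block `hi` with the adjacent lower block `lo`."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def prefix_add(x, y, n, c_in=0):
    """Add two n-bit integers; return (n-bit sum, carry-out)."""
    x_bits = [(x >> i) & 1 for i in range(n)]
    y_bits = [(y >> i) & 1 for i in range(n)]

    # Pre-processing: single-bit (g, p) signals.
    g = [a & b for a, b in zip(x_bits, y_bits)]
    p = [a ^ b for a, b in zip(x_bits, y_bits)]

    # Prefix processing: groups[i] = (g[i:0], p[i:0]).
    groups = []
    for i in range(n):
        pair = (g[i], p[i])
        groups.append(pair if i == 0 else combine(pair, groups[-1]))

    # Carries: c_0 = c_in and c_{i+1} = g[i:0] + p[i:0] c_in.
    carries = [c_in] + [big_g | (big_p & c_in) for big_g, big_p in groups]

    # Post-processing: s_i = p_i XOR c_i.
    total = sum((p[i] ^ carries[i]) << i for i in range(n))
    return total, carries[n]
```

For instance, `prefix_add(200, 100, 8)` returns `(44, 1)`, since 300 = 256 + 44.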
The General Prefix Adder Structure
• Pre-processing: single-bit (g, p) generator
• Prefix processing: group (G,P) operators and feed-through nodes
• Post-processing: final sum calculator
[Figure: parallel prefix adder block diagram showing the three stages]
Prefix Adder: Graph Representation
• Strictly leveled directed acyclic graph (DAG) of n columns
• Size = number of computation (black) nodes
• Depth = level of the latest output
[Figure: serial prefix circuit]
Prefix Adders: Conditional Sum Adder
[Figure: prefix graph of the conditional sum adder over columns 8 down to 1]
Prefix Adders: size and depth
• Objective:
  • Minimize the number of nodes, sc(n)
  • Minimize the depth, dc(n)
• Tradeoff between size and depth
• Ripple carry adder:
  • sc(8) = 7
  • dc(8) = 7
  • total = 14
• Conditional sum adder:
  • sc(8) = 12
  • dc(8) = 3
  • total = 15
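The size and depth figures above can be reproduced by counting nodes in the two prefix graphs. The sketch below is an illustration under stated assumptions: columns are indexed 0 to n-1 rather than 1 to n, n is a power of two, and the log-depth structure is built by the usual divide-and-conquer rule in which, at level k, every column whose bit k is set combines with the last column of the preceding block. The function names are hypothetical.

```python
def serial_prefix(n):
    """Return (size, depth) of the serial (ripple) prefix graph on n columns."""
    depth = [0] * n
    size = 0
    for i in range(1, n):
        depth[i] = max(depth[i], depth[i - 1]) + 1   # node combines column i with i-1
        size += 1
    return size, max(depth)


def divide_and_conquer_prefix(n):
    """Return (size, depth) of the log-depth graph; n is assumed to be a power of two."""
    depth = [0] * n
    size = 0
    k = 0
    while (1 << k) < n:
        new_depth = depth[:]
        for i in range(n):
            if (i >> k) & 1:
                j = ((i >> k) << k) - 1              # last column of the lower block
                new_depth[i] = max(depth[i], depth[j]) + 1
                size += 1
        depth = new_depth
        k += 1
    return size, max(depth)


if __name__ == "__main__":
    print(serial_prefix(8))
    print(divide_and_conquer_prefix(8))
```

With n = 8 the two functions return (7, 7) and (12, 3), matching the ripple carry and conditional sum figures on the slide.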
Prefix Adders: size and depth
• Minimum size = n-1, achieved by the serial prefix (ripple carry) adder
• Minimum depth = ceil(log2(n)), achieved by the conditional sum adder
• Given a depth constraint, what is the minimum size?
Prefix Adders: Conditional Sum Adder
• For output y_i, there is an alphabetical tree covering inputs (x_i, x_{i-1}, …, x_1)
• Alphabetical tree:
  • Binary tree
  • Edges do not cross
[Figure: prefix graph over columns 8 down to 1]
Prefix Adders: Conditional Sum Adder
• From input x_1, there is a tree covering all outputs (y_i, y_{i-1}, …, y_1)
• The nodes in this tree can be reduced to (g, p) ∘ c = g + pc
[Figure: prefix graph over columns 8 down to 1]
Prefix Adders: size and depth
• Theorem: sc(n) + dc(n) >= sc(n) + dnc(n) >= 2n-2
  • dnc(n) means the depth of the last output, so dc(n) >= dnc(n)
• Proof:
  • The alphabetical tree of y_n contains n-1 internal nodes.
  • For each column where the prefix is not ready, at least one extra node is needed; therefore we need at least n-(dnc(n)+1) extra nodes.
  • sc(n) >= n-1 + (n-(dnc(n)+1)) = 2n-2-dnc(n)
  • sc(n) + dnc(n) >= 2n-2
Zero-deficiency/depth-size optimal
• Define the deficiency of a prefix circuit as def = size + depth – (2n – 2)
• A prefix circuit is said to be zero-deficiency if its deficiency is zero
• A prefix circuit is said to be depth-size optimal if it achieves the minimum size under a given depth requirement
[Figure: diagram relating the depth-size optimal and zero-deficiency classes]
Zero-deficiency/depth-size optimal
• The big picture
• What is the minimum depth of zero-deficiency circuits for a given width?
Prefix Adders: Brent–Kung Adder
• sc(16) = 26
• dc(16) = 6
• total = 32
[Figure: Brent–Kung prefix graph over columns 15 down to 0]
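The Brent–Kung counts can be reproduced by building the graph as an up-sweep followed by a down-sweep and counting nodes. The Python sketch below is an illustration only: the function name `brent_kung`, the 0-based column indexing, and the choice to measure depth as the longest path to any output are assumptions, and n is taken to be a power of two. The `low` list tracks which group each column currently holds, so the code also checks that every combination joins adjacent groups and that every column ends with a full prefix.

```python
def brent_kung(n):
    """Return (size, depth) of a Brent-Kung style prefix graph on n columns."""
    depth = [0] * n           # longest path to the value currently in each column
    low = list(range(n))      # column i currently holds the group [i : low[i]]
    size = 0

    def node(i, j):
        """Combine column i with column j, storing the result in column i."""
        nonlocal size
        assert low[i] == j + 1, "groups must be adjacent"
        depth[i] = max(depth[i], depth[j]) + 1
        low[i] = low[j]
        size += 1

    levels = n.bit_length() - 1
    # Up-sweep: build groups of width 2, 4, 8, ... ending at aligned columns.
    for k in range(levels):
        step = 1 << (k + 1)
        for i in range(step - 1, n, step):
            node(i, i - (1 << k))
    # Down-sweep: complete the remaining columns from coarse to fine.
    for k in range(levels - 2, -1, -1):
        step = 1 << (k + 1)
        for i in range(step + (1 << k) - 1, n, step):
            node(i, i - (1 << k))

    assert all(lo == 0 for lo in low), "every column must hold a full prefix"
    return size, max(depth)


if __name__ == "__main__":
    print(brent_kung(16))
```

Under these assumptions `brent_kung(16)` returns (26, 6), in agreement with sc(16) = 26 and dc(16) = 6 above.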