
CSE 246: Computer Arithmetic Algorithms and Hardware Design

CSE 246: Computer Arithmetic Algorithms and Hardware Design. Lecture 4: Adders. Instructor: Prof. Chung-Kuan Cheng. Topics: Adders; AND/OR gate vs. Circuit; Logic Design; Graph Design (Prefix Adder). Chapter 2: Adders. Half Adders


Presentation Transcript


  1. CSE 246: Computer Arithmetic Algorithms and Hardware Design Lecture 4: Adders Instructor: Prof. Chung-Kuan Cheng

  2. Topics: • Adders • AND/OR gate vs. Circuit • Logic Design • Graph Design (Prefix Adder)

  3. Chapter 2: ADDERS • Half Adders • Half adders can add two 1-bit binary numbers when there is no carry in. • If the inputs are x[i] and y[i], the sum and carry-out are given by the formulas • s[i] = x[i] ^ y[i] • c[i+1] = x[i].y[i] • We use the following notations throughout the slides • . means logical AND • + means logical OR • ^ means logical XOR • ' means complementation

  4. Full Adder • The inputs are x[i], y[i] (operand bits) and c[i] (carry in) • The outputs are s[i] (result bit) and c[i+1] (carry out) • Inputs and outputs are related by these relations • s[i] = x[i] ^ y[i] ^ c[i] • c[i+1] = x[i].y[i] + c[i].(x[i] + y[i]) = x[i].y[i] + c[i].(x[i] ^ y[i])

  5. Full Adder • If carry-in bit is zero, then full adder becomes half adder • If carry-in bit is one, then • s[i] = (x[i] ^ y[i])’ • c[i+1] = x[i] + y[i] • To add two n-bit numbers, we can chain n full adders to build a ripple carry adder
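The full-adder relations on slides 4 and 5 can be checked exhaustively over all eight input combinations. The following is a minimal Python sketch, not part of the original slides; the function names `full_adder` and `carry_xor_form` are illustrative assumptions.

```python
from itertools import product

def full_adder(x, y, c):
    """Return (s, c_out) using c_out = x.y + c.(x + y)."""
    s = x ^ y ^ c
    c_out = (x & y) | (c & (x | y))
    return s, c_out

def carry_xor_form(x, y, c):
    """Alternative carry form from slide 4: c_out = x.y + c.(x ^ y)."""
    return (x & y) | (c & (x ^ y))

for x, y, c in product((0, 1), repeat=3):
    s, c_out = full_adder(x, y, c)
    assert 2 * c_out + s == x + y + c          # arithmetic meaning of the two outputs
    assert c_out == carry_xor_form(x, y, c)    # both carry forms agree
    if c == 0:                                 # carry-in 0: behaves as a half adder
        assert (s, c_out) == (x ^ y, x & y)
    else:                                      # carry-in 1: s = (x ^ y)', c_out = x + y
        assert (s, c_out) == (1 - (x ^ y), x | y)
```

The two carry forms agree because the only case where `x | y` and `x ^ y` differ is x = y = 1, where the term x.y is already 1.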

  6. Ripple Carry Adder [Figure: chain of n full adders; stage i takes x[i], y[i] and carry c[i] and produces s[i] and c[i+1]; c[0] is cin and the carry out of stage n-1 is cout] • Overflow happens when the operands have the same sign and the result has a different sign. • If we use 2's complement to represent negative numbers, overflow occurs when (cout ^ c[n-1]) is 1
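As an illustration of the ripple carry structure and the overflow rule, here is a hedged Python sketch. The helper names (`ripple_carry_add`, `to_bits`, `from_bits`, `add_signed`) and the little-endian bit-list representation are assumptions made for this example, not part of the slides.

```python
def ripple_carry_add(x_bits, y_bits, c_in=0):
    """Add two little-endian bit lists; return (sum bits, carries c[0..n])."""
    carries = [c_in]
    s_bits = []
    for x, y in zip(x_bits, y_bits):
        c = carries[-1]
        s_bits.append(x ^ y ^ c)
        carries.append((x & y) | (c & (x | y)))
    return s_bits, carries

def to_bits(value, n):
    """Little-endian n-bit two's complement encoding of value."""
    return [(value >> i) & 1 for i in range(n)]

def from_bits(bits):
    """Decode little-endian two's complement bits to a signed integer."""
    unsigned = sum(b << i for i, b in enumerate(bits))
    return unsigned - (1 << len(bits)) if bits[-1] else unsigned

def add_signed(a, b, n=4):
    """Return (n-bit result, overflow flag) with overflow = cout ^ c[n-1]."""
    s_bits, carries = ripple_carry_add(to_bits(a, n), to_bits(b, n))
    overflow = carries[n] ^ carries[n - 1]
    return from_bits(s_bits), overflow
```

For example, with n = 4 the representable range is -8 to 7, so `add_signed(5, 4)` sets the overflow flag while `add_signed(-3, -4)` does not.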

  7. Ripple Carry Adder • For the sake of brevity, we use the following notations: • g[i] = x[i].y[i] • p[i] = x[i] + y[i] • In terms of these notations, we can rewrite the carry equations as • c[1] = g[0] + p[0].c[0] • c[2] = g[1] + p[1].c[1] • and so on… • We shall use these notations afterwards while discussing the design of other kinds of adders • It has been observed that, for random operands, the expected length of a carry chain is about 2, while the expected length of the longest carry chain is about lg n. Hence, ripple carry adders are fast on average.
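The average-case claim can be explored empirically. The sketch below is only an illustration under an assumed definition: a chain starts where a carry is generated (x[i] = y[i] = 1) and grows by one for every following position that passes it on (exactly one of x[i], y[i] is 1). The function names and the use of uniformly random operands are assumptions, and no particular numeric result is claimed here.

```python
import random

def longest_carry_chain(x, y, n):
    """Length of the longest run of carries started by one generate position."""
    longest = current = 0
    for i in range(n):
        xi, yi = (x >> i) & 1, (y >> i) & 1
        if xi & yi:                    # generate: a new chain starts here
            current = 1
        elif (xi ^ yi) and current:    # propagate: a live chain is extended
            current += 1
        else:                          # kill, or nothing to propagate
            current = 0
        longest = max(longest, current)
    return longest

def average_longest_chain(n, trials=10_000, seed=0):
    """Monte Carlo estimate of the mean longest chain for random n-bit operands."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += longest_carry_chain(rng.getrandbits(n), rng.getrandbits(n), n)
    return total / trials
```

Calling `average_longest_chain(n)` for several word lengths and comparing the result with lg n gives a rough feel for how slowly the longest chain grows.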

  8. Ripple Carry Adder • How do we know that an adder has completed the operation? • Worst case scenario: Wait for the longest chain in the carry propagation network • We might inspect c[i+1] and its complement b[i+1] to determine the status of the adder

  9. Improvement to Ripple Carry Adder: Manchester Adders • By intelligently using our device properties, we can reduce the complexity of the circuit used to compute carries in a ripple carry adder. • Define: a[i] = (x[i])’.(y[i])’ • Next we observe that c[i+1] is 1 in exactly these scenarios: • g[i] is 1, i.e. both x[i] & y[i] are 1 • c[i] is 1 and it is propagated because p[i] is 1 • c[i+1] is ‘pulled down’ to logic 0 irrespective of the value of c[i], when a[i] is 1, i.e. both x[i] and y[i] are 0 • From these conditions, and keeping in mind the general characteristics of transistor devices we can design simplified circuits for computing carries – as shown in the next slide
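The three cases listed on this slide (generate, propagate, pull down) can be modeled behaviorally. This is a logic-level Python sketch only, with an assumed function name `manchester_carry`; it says nothing about the transistor-level circuit.

```python
from itertools import product

def manchester_carry(x, y, c):
    """Carry out of one stage, decided by the kill / generate / propagate cases."""
    kill = (1 - x) & (1 - y)   # a[i] = x'.y'
    generate = x & y           # g[i] = x.y
    if kill:
        return 0               # carry is pulled down regardless of c
    if generate:
        return 1               # carry is produced regardless of c
    return c                   # exactly one input is 1: the carry-in passes through

# The case analysis agrees with c[i+1] = g[i] + p[i].c[i], where p[i] = x[i] + y[i].
for x, y, c in product((0, 1), repeat=3):
    assert manchester_carry(x, y, c) == (x & y) | ((x | y) & c)
```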

  10. Improvement to Ripple Carry Adder: Manchester Adders

  11. Implementation of Manchester Adder using MOS transistors This is essentially the same circuit for computing carry, but implemented with MOS devices

  12. Manchester Adder: Alternate design • We divide the computation cycle into two distinct half-cycles: 'precharge' and 'evaluate'. In the precharge half-cycle, g[i] and c[i+1] are assigned a tentative value of logic 1. This is evaluated in the next half-cycle with the actual value of a[i]. • The actual circuit for computing carries is shown in the next slide.

  13. Manchester Adder: Alternate design [Figure: timing diagram of Q over time, alternating precharge and evaluation phases]

  14. Carry Look-ahead Adder • In a carry look-ahead adder, m full adders are grouped together (m is usually equal to 4). Once the carry-in to the group is known, all the internal carries and the output carry are calculated simultaneously. • We can use some algebraic manipulations to minimize hardware complexity. • Consider the carry out of the group • c[i] = g[i-1] + p[i-1].c[i-1] • Substituting the value of c[i-1], we can rewrite this as c[i] = g[i-1] + p[i-1].g[i-2] + p[i-1].p[i-2].c[i-2] • Proceeding in this manner we get c[i] = g[i-1] + p[i-1].g[i-2] + p[i-1].p[i-2].g[i-3] + p[i-1].p[i-2].p[i-3].g[i-4] + p[i-1].p[i-2].p[i-3].p[i-4].c[i-4] • To further simplify the equation, we note that g[i-1] = g[i-1].p[i-1] (since p[i-1] = x[i-1] + y[i-1]), so p[i-1] can be factored out
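To make the unrolled group-carry equation concrete, the following Python sketch compares it against the step-by-step recurrence for a 4-bit group. The function names and the little-endian lists `g` and `p` (index 0 is the least significant bit of the group) are assumptions for illustration.

```python
from itertools import product

def group_carry_recurrence(g, p, c_in):
    """Apply c[k+1] = g[k] + p[k].c[k] four times, one stage after another."""
    c = c_in
    for gk, pk in zip(g, p):
        c = gk | (pk & c)
    return c

def group_carry_lookahead(g, p, c_in):
    """Unrolled 4-bit expression: every product term depends only on g, p and c_in."""
    return (g[3]
            | (p[3] & g[2])
            | (p[3] & p[2] & g[1])
            | (p[3] & p[2] & p[1] & g[0])
            | (p[3] & p[2] & p[1] & p[0] & c_in))

for x, y in product(range(16), repeat=2):
    g = [((x >> k) & 1) & ((y >> k) & 1) for k in range(4)]
    p = [((x >> k) & 1) | ((y >> k) & 1) for k in range(4)]
    for c_in in (0, 1):
        expected = (x + y + c_in) >> 4      # true carry out of the 4-bit group
        assert group_carry_recurrence(g, p, c_in) == expected
        assert group_carry_lookahead(g, p, c_in) == expected
```

The recurrence needs four dependent steps, while the unrolled form is a two-level AND/OR expression, which is the point of the look-ahead idea.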

  15. Ling's Adder • c[i] = g[i-1] + p[i-1].g[i-2] + p[i-1].p[i-2].g[i-3] + p[i-1].p[i-2].p[i-3].g[i-4] + p[i-1].p[i-2].p[i-3].p[i-4].c[i-4] • We replace p[i] = x[i] ^ y[i] with t[i] = x[i] + y[i]. Because g[i] = g[i].t[i], we have • c[i] = g[i-1].t[i-1] + t[i-1].g[i-2] + t[i-1].t[i-2].g[i-3] + t[i-1].t[i-2].t[i-3].g[i-4] + t[i-1].t[i-2].t[i-3].t[i-4].c[i-4] • Let h[i] = g[i-1] + g[i-2] + t[i-2].g[i-3] + t[i-2].t[i-3].g[i-4] + t[i-2].t[i-3].t[i-4].t[i-5].h[i-4] • Then c[i] = h[i].t[i-1]

  16. Ling's Adder • h[0] = c[0] • h[3] = g[2] + g[1] + t[1].g[0] + t[1].t[0].h[0] • s[3] = p[3] ^ c[3] = p[3] ^ (h[3].t[2]) = p[3]'.h[3].t[2] + p[3].(h[3]' + t[2]') = h[3]'.p[3] + h[3].(p[3] ^ t[2]) • h[6] = g[5] + g[4] + t[4].g[3] + t[4].t[3].t[2].h[3] • s[6] = h[6]'.p[6] + h[6].(p[6] ^ t[5])
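The pseudo-carry relation c[i] = h[i].t[i-1] and the sum-bit selection can be checked numerically. The Python sketch below uses the 4-bit-group recurrence for h[i] from slide 15 on 8-bit operands (slide 16 writes out the 3-bit-group version instead). The function names are assumptions, and treating the missing term t[-1] as 1 so that h[0] = c[0] is an assumption made for this example.

```python
def bits(value, n):
    """Little-endian list of the n low bits of value."""
    return [(value >> k) & 1 for k in range(n)]

def ling_check(x, y, c0):
    """Return True if Ling's relations reproduce the true carries for 8-bit x, y."""
    n = 8
    xb, yb = bits(x, n), bits(y, n)
    g = [a & b for a, b in zip(xb, yb)]
    t = [a | b for a, b in zip(xb, yb)]
    p = [a ^ b for a, b in zip(xb, yb)]

    c = [c0]                                   # reference carries via c[k+1] = g[k] + t[k].c[k]
    for k in range(n):
        c.append(g[k] | (t[k] & c[k]))

    def h_next(i, h_prev, t_prev):
        # h[i] from slide 15, with h_prev = h[i-4] and t_prev = t[i-5]
        return (g[i - 1] | g[i - 2] | (t[i - 2] & g[i - 3])
                | (t[i - 2] & t[i - 3] & g[i - 4])
                | (t[i - 2] & t[i - 3] & t[i - 4] & t_prev & h_prev))

    h4 = h_next(4, c0, 1)                      # h[0] = c[0]; t[-1] assumed to be 1
    h8 = h_next(8, h4, t[3])
    carries_ok = c[4] == (h4 & t[3]) and c[8] == (h8 & t[7])

    # Sum bit selected by h: s[4] = h[4]'.p[4] + h[4].(p[4] ^ t[3])
    s4 = ((1 - h4) & p[4]) | (h4 & (p[4] ^ t[3]))
    return carries_ok and s4 == (p[4] ^ c[4])
```

For instance, `ling_check(0b10110111, 0b01101001, 1)` evaluates the relations for one operand pair; looping over a range of operands exercises them more broadly.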

  17. Generalized Design for Adders: Prefix Adder • Prefix computation • Given n inputs x[1], x[2], x[3], …, x[n] and an associative operator ×, we want to compute y[i] = x[i] × x[i-1] × x[i-2] × … × x[2] × x[1] for all i, 1 ≤ i ≤ n • x can be a scalar/vector/matrix • For the design of adders, we define the operator × in the following manner, where (g'', p'') is the more significant operand and the primes are labels, not complements • (g, p) = (g'', p'') × (g', p') • g = g'' + p''.g' • p = p''.p'
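A short Python sketch of the adder operator follows. It checks associativity exhaustively and then uses a serial prefix scan to produce the carries. The names `op` and `prefix_carries`, and the modeling of the carry-in as the pair (c_in, 0), are assumptions made for this example.

```python
from itertools import product

def op(hi, lo):
    """(g, p) = (g'', p'') x (g', p'), where hi = (g'', p'') is the more significant pair."""
    g2, p2 = hi
    g1, p1 = lo
    return (g2 | (p2 & g1), p2 & p1)

# Associativity over every combination of three (g, p) pairs.
pairs = list(product((0, 1), repeat=2))
for a, b, c in product(pairs, repeat=3):
    assert op(op(a, b), c) == op(a, op(b, c))

def prefix_carries(x_bits, y_bits, c_in=0):
    """Serial prefix scan over little-endian bit lists; returns carries c[1..n]."""
    acc = (c_in, 0)                    # carry-in modeled as g = c_in, p = 0 (assumption)
    carries = []
    for x, y in zip(x_bits, y_bits):
        acc = op((x & y, x | y), acc)  # fold in the next, more significant (g, p)
        carries.append(acc[0])
    return carries
```

Because the operator is associative, the same prefixes can be computed in any bracketing, which is what allows tree-shaped networks instead of this serial scan.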

  18. Alternate modeling of Prefix Computer: Finite State Machine • A finite state machine has a set of states, and it 'moves' from one state to another according to input. Mathematically, • s[k] = f(s[k-1], a[k-1]) • The problem is to determine the final state s[n] in O(lg n) operations, given the initial state s[0] and the sequence of inputs (a[0], a[1], …, a[n-1]) • This problem can be formulated in terms of prefix computation

  19. Alternate modeling of Prefix Computer: Finite State Machine • We assume that the number of states is small and finite. • Let s[k] = f[a[k-1]](s[k-1]); each f[a[k-1]] can be represented by a matrix M[a[k-1]] • Now we are ready to represent our problem in terms of prefix computation.

  20. Alternate Modeling of Prefix Computer: Finite State Machine • The algorithm • Compute M[a[i]] in parallel • Compute • N[1] = M[a[1]] • N[2] = M[a[2]].M[a[1]] • … • N[n] = M[a[n]].M[a[n-1]]…M[a[1]] • Compute S[i+1] = N[i](S[0])

  21. Prefix Computation • FSM example: • Given: • initial state S0 = A • A sequence of inputs: (0 0 1 1 1 0 1 0 1) • Derive the sequence of outputs • State table (PS = present state, NS = next state): M0 (X=0): A → B, B → B, C → B; M1 (X=1): A → A, B → C, C → A • Compute the N's for the input sequence 0 0 1 1 …: N1 = M0, N2 = M0.M0, N3 = M1.M0.M0, N4 = M1.M1.M0.M0, … [Figure: state diagram with states A, B, C and edges labeled input/output: 0/0, 0/0, 0/0, 1/0, 1/0, 1/1]
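The FSM example can be sketched in Python by treating each transition table as a map and composing maps as the prefix operator. The tables below are read from the flattened state table on this slide (M0: A→B, B→B, C→B; M1: A→A, B→C, C→A) and should be treated as an assumption; the output labels of the state diagram are not modeled, and the function names are illustrative.

```python
# Transition maps per input symbol, as read from the slide's state table (assumption).
M = {
    0: {"A": "B", "B": "B", "C": "B"},
    1: {"A": "A", "B": "C", "C": "A"},
}

def compose(later, earlier):
    """Map equivalent to applying `earlier` first and `later` second."""
    return {state: later[earlier[state]] for state in earlier}

def prefix_maps(inputs):
    """N1 = M[a1], N2 = M[a2].M[a1], ... computed as a serial prefix scan."""
    maps = []
    acc = None
    for a in inputs:
        acc = M[a] if acc is None else compose(M[a], acc)
        maps.append(acc)
    return maps

def state_sequence(inputs, s0="A"):
    """State reached after each input, obtained by applying every N to s0."""
    return [n[s0] for n in prefix_maps(inputs)]

print(state_sequence([0, 0, 1, 1, 1, 0, 1, 0, 1]))
# ['B', 'B', 'C', 'A', 'A', 'B', 'C', 'B', 'C']
```

The scan here is serial for clarity; since map composition is associative, the same N's could be produced by a logarithmic-depth prefix network.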

  22. Graph Based Approach • Consider the (g, p) chain • Break the long paths [Figure: carry chain from C1 through (g1, p1), (g2, p2), (g3, p3) to C4]

  23. Graph Based Approach • Generating g32 and p32 [Figure: (g3, p3) and (g2, p2) combined into (g32, p32)]

  24. Graph Based Approach • Generating g10 and p10 [Figure: (g1, p1) and cin combined into (g10, p10)]

  25. Graph Based Approach • Generating g30 and p30 [Figure: (g32, p32) and (g10, p10) combined into (g30, p30)]

  26. Boolean Approach • Carry expression: g4 + p4 ( g3 + p3 ( g2 + p2 ( g1 + p1 ( g0 + p0 cin ) ) ) ) • Level 0: g4 , p4 | g3 , p3 | g2 , p2 | g1 , p1 | g0 , p0 cin • Level 1: g4+p4g3 , p4p3 | g2+p2g1 , p2p1 | g0 , p0cin • Level 2: g4+p4g3+p4p3(g2+p2g1) , p4p3p2p1 | g0 , p0cin • Level 3: g4+p4g3+p4p3(g2+p2g1)+(p4p3p2p1)g0 , (p4p3p2p1) p0cin
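The idea of regrouping the nested carry expression into levels can be illustrated with a pairwise reduction. In this Python sketch, `combine` and `tree_reduce` are assumed names, the carry-in is modeled as an extra pair (cin, 0), and the grouping is a plain balanced pairing rather than the exact grouping drawn on the slide.

```python
from itertools import product

def combine(hi, lo):
    """(g, p) of two adjacent blocks, hi being the more significant one."""
    return (hi[0] | (hi[1] & lo[0]), hi[1] & lo[1])

def tree_reduce(pairs):
    """Combine neighbours level by level; pairs are ordered most significant first."""
    level = list(pairs)
    levels = 0
    while len(level) > 1:
        nxt = [combine(level[k], level[k + 1]) for k in range(0, len(level) - 1, 2)]
        if len(level) % 2:             # an unpaired block moves up unchanged
            nxt.append(level[-1])
        level = nxt
        levels += 1
    return level[0], levels

def nested_carry(g, p, cin):
    """g4 + p4(g3 + p3(g2 + p2(g1 + p1(g0 + p0.cin)))), with g and p indexed 0..4."""
    c = cin
    for k in range(5):
        c = g[k] | (p[k] & c)
    return c

for values in product((0, 1), repeat=11):
    g, p, cin = values[0:5], values[5:10], values[10]
    pairs = [(g[k], p[k]) for k in range(4, -1, -1)] + [(cin, 0)]
    (carry, _), levels = tree_reduce(pairs)
    assert carry == nested_carry(g, p, cin)
    assert levels == 3                 # six blocks need three levels of pairing
```

The nested form has one dependent step per bit, while the regrouped form reaches the same carry in a logarithmic number of levels.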

  27. Prefix Adder • Given: n inputs (gi, pi) and an operation o • Compute: yi = (gi, pi) o … o (g1, p1) (1 <= i <= n) • Associativity: (A o B) o C = A o (B o C) • gi = a if i = 1, ai bi otherwise • pi = 1 if i = 1, ai xor bi otherwise • (g'', p'') o (g', p') = (g, p) • g = g'' + p''g' • p = p''p'

  28. Prefix Adder: Graph Representation • Example: Ripple Carry Adder [Figure: inputs ai, bi produce (gi, pi); each graph node takes x and y and outputs x o y]

  29. Prefix Adders: Conditional Sum Adder [Figure: prefix graph over columns 8 down to 1]

  30. Prefix Adders: Conditional Sum Adder • For output yi, there is an alphabetical tree covering inputs (xi, xi-1, …, x1) • Alphabetical tree: • Binary tree • Edges do not cross [Figure: prefix graph over columns 8 down to 1]

  31. Prefix Adders: Conditional Sum Adder • From input x1, there is a tree covering all outputs (yi, yi-1, …, y1) • The nodes in this tree can be reduced to (g, p) o c = g+pc [Figure: prefix graph over columns 8 down to 1]

  32. Prefix Adders: size and depth • Objective: • Minimize # of nodes, sc(n). • Minimize depth, dc(n) • Ripple Carry Adder: • sc(8) = 7 • dc(8) = 7 • total = 14 • Conditional Sum Adder: • sc(8) = 12 • dc(8) = 3 • total = 15
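The size and depth figures can be reproduced by building the two prefix graphs and counting. The Python sketch below assumes that the conditional sum structure corresponds to the Sklansky-style divide-and-conquer network, which matches the counts on this slide; the function names and the 0-based column numbering are also assumptions.

```python
def ripple(n):
    """Serial prefix graph: column i combines with column i-1. Nodes are (target, source)."""
    return [(i, i - 1) for i in range(1, n)]

def sklansky(n):
    """Divide-and-conquer prefix graph for n a power of two."""
    nodes = []
    block = 1
    while block < n:
        for i in range(n):
            if i & block:                           # upper half of a group of 2*block columns
                nodes.append((i, (i // block) * block - 1))
        block *= 2
    return nodes

def size_and_depth(n, nodes):
    """Count operator nodes and the longest dependency path; verify all prefixes are formed."""
    level = [0] * n
    span_lo = list(range(n))          # lowest input column covered by each column so far
    for target, source in nodes:
        assert span_lo[target] == source + 1        # operands must cover adjacent ranges
        span_lo[target] = span_lo[source]
        level[target] = 1 + max(level[target], level[source])
    assert all(lo == 0 for lo in span_lo)           # every column ends as a full prefix
    return len(nodes), max(level)

print(size_and_depth(8, ripple(8)))     # (7, 7)
print(size_and_depth(8, sklansky(8)))   # (12, 3)
```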

  33. Prefix Adder –Well-known and Well-developed? • Classic prefix networks: Sklansky, Kogge-Stone, Brent-Kung, Ladner-Fischer, Han-Carlson, Knowles etc.

  34. Prefix Adders: Brent-Kung Adder • sc(16) = 26 • dc(16) = 6 • total = 32 [Figure: prefix graph over columns 15 down to 0]
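The Brent-Kung counts can be reproduced with a small construction. This Python sketch uses the usual up-sweep and down-sweep description of the network, which is an assumption about the figure that is not preserved here; depth is measured as the longest dependency path, and the function names are illustrative.

```python
def brent_kung(n):
    """Brent-Kung prefix graph for n a power of two, as (target, source) nodes in evaluation order."""
    nodes = []
    step = 1
    while step < n:                                  # up-sweep: build a binary tree
        for i in range(2 * step - 1, n, 2 * step):
            nodes.append((i, i - step))
        step *= 2
    step //= 4
    while step >= 1:                                 # down-sweep: fill in remaining columns
        for i in range(3 * step - 1, n, 2 * step):
            nodes.append((i, i - step))
        step //= 2
    return nodes

def size_and_depth(n, nodes):
    """Count operator nodes and the longest dependency path; verify all prefixes are formed."""
    level = [0] * n
    span_lo = list(range(n))          # lowest input column covered by each column so far
    for target, source in nodes:
        assert span_lo[target] == source + 1        # operands must cover adjacent ranges
        span_lo[target] = span_lo[source]
        level[target] = 1 + max(level[target], level[source])
    assert all(lo == 0 for lo in span_lo)
    return len(nodes), max(level)

print(size_and_depth(16, brent_kung(16)))   # (26, 6)
```

The depth comes out as 6 rather than 7 because the first down-sweep node can be evaluated at the same level as the last up-sweep node.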

  35. Prefix Adder – New Respects, New Method • Realistic design considerations: Timing, Power and Area. • Integer Linear Programming for prefix adder: • Logical effort timing model (gate cap. + wire cap.) • Activity-statistic power model • Non-uniform signal arrival/required times [Figure labels: Logic Levels, Max Fanouts, Max Wire Tracks; Timing, Power, Area]

  36. Prefix Adder – Optimum Prefix adders • Uniform signal arrival/required times [Figure panels: Sklansky Adder; Kogge-Stone Adder; Fastest depth-3 optimal prefix adder; Fastest depth-4 optimal prefix adder]

  37. Prefix Adder –Optimum Prefix adders • Uniform signal arrival/required times

  38. The Big Picture What is the minimum depth of zero-deficiency circuits for a given width?

  39. Proof for Snir's Theorem • Theorem: given an arbitrary prefix graph of width n, we have depth + size ≥ 2n - 2 • Proof • Consider the alphabetical tree rooted at the MSB output with all the input nodes being its leaves; • The size of this tree is n-1 while its depth is dM; • At most dM prefix outputs can be generated from this tree; • At least one extra node is needed for each column where the prefix result is not ready. • Consequently size ≥ (n-1) + (n - (dM + 1)) = 2n - 2 - dM, and since depth ≥ dM, this gives size + depth ≥ 2n - 2
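As a small worked check of the bound, the Python sketch below computes how far a design sits above 2n - 2, using the size and depth figures quoted on the earlier slides. The name `deficiency` for the quantity size + depth - (2n - 2) follows the "zero-deficiency" wording of the previous slide; the function itself is an illustrative assumption.

```python
def deficiency(n, size, depth):
    """Amount by which size + depth exceeds Snir's lower bound 2n - 2."""
    return size + depth - (2 * n - 2)

# Figures quoted on the earlier slides: (width, size, depth).
designs = {
    "ripple carry, n=8": (8, 7, 7),
    "conditional sum, n=8": (8, 12, 3),
    "Brent-Kung, n=16": (16, 26, 6),
}

for name, (n, size, depth) in designs.items():
    d = deficiency(n, size, depth)
    assert d >= 0                      # the bound holds for every prefix graph
    print(name, d)
```

With these figures the ripple carry graph meets the bound exactly, while the other two exceed it by 1 and 2 respectively.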

  40. Backbone Affiliated Tree Definitions For a prefix circuit, define • Backbone • The binary alphabetical tree generating MSB prefix output; • Affiliated tree • rooted at the LSB input, with all the prefix outputs (except MSB output) as its tree nodes • Ridge • the path from the LSB input to the MSB output.

  41. How to … ? • Look from the MSB output • Since the circuit is of zero-deficiency, the ridge has exactly d nodes (excluding the first input node), one node per level. • The idea: try to stretch the ridge as long as possible while maintaining zero-deficiency

  42. T-tree • Definition of Tk(k) tree

  43. T-tree example – T3(5)

  44. A-tree • Definition of Ak(t) tree

  45. A-tree example – A3(5)

  46. Compound of A tree and T-tree

  47. Example

  48. Proposed Prefix Circuit

  49. An Example: Z(d)|d=8, Width = 88 • Columns 1 to 32: BK(32) • Columns 33 to 58: T3(5) + A3(5) • Columns 59 to 80: T2(6) + A2(6) • Columns 81 to 88: T1(7) + A1(7)

  50. The width of Z(d) Circuit • The width of the Z(d) circuit is Nz(d) = F(d+3) – 1 (d ≥ 1), where F(i) are the Fibonacci numbers • Numerical Comparison • LYD: Design by S. Lakshmivarahan, C.M. Yang & S.K. Dhall, 1987 • LS: Design by Lin & Shih, 1999
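The width formula can be evaluated directly. This Python sketch assumes the common indexing F(1) = F(2) = 1, which reproduces the width of 88 quoted for d = 8; the function names are illustrative.

```python
def fibonacci(i):
    """Fibonacci number with F(1) = F(2) = 1 (indexing is an assumption)."""
    a, b = 1, 1
    for _ in range(i - 1):
        a, b = b, a + b
    return a

def z_width(d):
    """Width of the Z(d) circuit, Nz(d) = F(d+3) - 1, for d >= 1."""
    return fibonacci(d + 3) - 1

print([z_width(d) for d in range(1, 9)])
# [2, 4, 7, 12, 20, 33, 54, 88]
```

The last value matches the d = 8 example, whose column ranges have widths 32 + 26 + 22 + 8 = 88.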
