Learn a step-by-step ALU design process: reducing complexity, improving performance, and implementing a carry-lookahead adder.
Topic 3b Computer Arithmetic: ALU Design Introduction to Computer Systems Engineering (CPEG 323) cpeg323-08F\Topic3b
Design Process
• Design: the components and how they are put together
• Top-down decomposition of complex functions
• Bottom-up composition of primitives
Problem: Design an ALU
• Operations
• add, addu, sub, subu, addi, addiu: 2's complement adder/subtractor with overflow detection
• and, or, andi, ori: bitwise operations
• Total number of operations = 10
Design: divide & conquer method
• Break the problem into simpler parts
• Work on the parts
• Put the pieces together
• Verify that the solution works as a whole
• Example: separate the immediate instructions from the rest
• Process immediates before the ALU
• ALU inputs are now uniform
• 6 non-immediate operations remain
• Need 3 bits to specify the ALU mode
Design – First Steps
• Complete the functional specification first
• Inputs: 2 x 32-bit operands A, B; 3-bit operation code
• Outputs: 32-bit result R, 1-bit carry, 1-bit overflow
• Operations: add, addu, sub, subu, and, or
• Complete the high-level block diagram next
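The functional specification above can be written down as a small behavioral reference model before any gates are drawn. This is only a sketch: the opcode numbering is hypothetical (the slides do not fix an encoding), and reporting overflow only for the signed operations add and sub is an assumption.

```python
MASK32 = 0xFFFFFFFF

# Hypothetical 3-bit opcode assignment; the slides do not specify one.
ADD, ADDU, SUB, SUBU, AND, OR = range(6)

def alu(a, b, op):
    """Return (result, carry, overflow) for 32-bit operands a and b."""
    a &= MASK32
    b &= MASK32
    if op == AND:
        return a & b, 0, 0
    if op == OR:
        return a | b, 0, 0
    if op in (SUB, SUBU):
        # a - b is computed as a + (NOT b) + 1
        b_eff, cin = (~b) & MASK32, 1
    elif op in (ADD, ADDU):
        b_eff, cin = b, 0
    else:
        raise ValueError("unknown opcode")
    total = a + b_eff + cin
    result = total & MASK32
    carry = total >> 32
    # Signed overflow: both adder inputs share a sign and the result's sign differs.
    sa, sb, sr = a >> 31, b_eff >> 31, result >> 31
    # Assumption: the unsigned variants never report overflow.
    overflow = int(sa == sb and sr != sa) if op in (ADD, SUB) else 0
    return result, carry, overflow
```

For example, `alu(0x7FFFFFFF, 1, ADD)` reports overflow, while the same operands with `ADDU` do not. A model like this is useful later for checking the gate-level design against the specification.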
Design – Reducing the problem to something simpler
• For our ALU, reduce the 32-bit problem into simpler 1-bit slices.
• This changes a big combinational problem into a small combinational problem.
• Put the pieces together to solve the big problem.
Designing with lower-level block diagrams
• 1-bit ALU block
• Replicate 32 times for a 32-bit ALU
The 1-Bit ALU Block
• Partition into separate, independent blocks: logic and arithmetic
• Complete each block at this level or refine it further
• Complete the logic block
• Complete the function select
• Decompose the arithmetic block into simpler parts
1-bit Add
• Computing A + B (x' denotes NOT x)
• Sum = (a' * b' * Ci) + (a' * b * Ci') + (a * b' * Ci') + (a * b * Ci)
• Co = (a * Ci) + (b * Ci) + (a * b)
• This is called a full adder. A half adder assumes no Ci.
• Can you draw the 1-bit adder according to the above logic?
• # of gate delays for Sum = 3
• # of gate delays for Carry = 2
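The full-adder equations can be checked against ordinary arithmetic with a short truth-table sweep. The sketch below assumes the usual sum-of-products form of Sum (the four input combinations with an odd number of 1s); the function name is illustrative.

```python
from itertools import product

def full_adder(a, b, ci):
    """One-bit full adder in sum-of-products form; returns (sum, carry_out)."""
    na, nb, nci = 1 - a, 1 - b, 1 - ci  # complemented inputs
    s = (na & nb & ci) | (na & b & nci) | (a & nb & nci) | (a & b & ci)
    co = (a & ci) | (b & ci) | (a & b)
    return s, co

# Sweep all 8 input combinations and compare with integer addition.
for a, b, ci in product((0, 1), repeat=3):
    s, co = full_adder(a, b, ci)
    assert 2 * co + s == a + b + ci
```

Setting `ci` to 0 reduces the same equations to a half adder: Sum becomes a XOR b and Co becomes a AND b.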
1-bit Subtraction
• Convert subtraction to addition
• An XOR gate complements the input B
• Setting CarryIn adds 1 (at the least significant bit)
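A minimal sketch of the add/subtract trick, rippling through 1-bit slices: a single `sub` control bit is XORed into every bit of B and also fed in as the carry into bit 0, so that a - b is computed as a + (NOT b) + 1. The function names and the default width are assumptions for illustration.

```python
def full_adder(a, b, ci):
    """One-bit full adder; returns (sum, carry_out)."""
    s = a ^ b ^ ci
    co = (a & ci) | (b & ci) | (a & b)
    return s, co

def add_sub(a, b, sub, n=32):
    """Ripple through n one-bit slices. sub=0 gives a + b, sub=1 gives a - b."""
    carry = sub          # CarryIn of the least significant slice
    result = 0
    for i in range(n):
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ sub   # XOR complements B when subtracting
        s, carry = full_adder(ai, bi, carry)
        result |= s << i
    return result, carry
```

For instance, `add_sub(3, 5, 1)` returns the 32-bit two's complement pattern for -2.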
Completing the ALU
• Overflow detection & opcode decoder
Overflow
• Overflow can be detected by decoding the carry into the MSB and the carry out of the MSB
Overflow Detection Logic
• Carry into MSB XOR Carry out of MSB
• For an N-bit ALU: Overflow = CarryIn[N - 1] XOR CarryOut[N - 1]
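The XOR rule can be checked exhaustively on a small word size. The sketch below uses an 8-bit adder (the width and helper names are assumptions chosen to keep the sweep small) and compares CarryIn[N - 1] XOR CarryOut[N - 1] with the arithmetic definition of overflow, namely that the true signed sum does not fit in N bits.

```python
def add_with_overflow(a, b, n=8):
    """Add two n-bit two's complement bit patterns; return (result, overflow)."""
    carry = 0
    result = 0
    carry_into_msb = 0
    for i in range(n):
        if i == n - 1:
            carry_into_msb = carry      # CarryIn[N - 1]
        ai, bi = (a >> i) & 1, (b >> i) & 1
        result |= (ai ^ bi ^ carry) << i
        carry = (ai & bi) | (ai & carry) | (bi & carry)
    # After the loop, carry holds CarryOut[N - 1].
    return result, carry_into_msb ^ carry

def to_signed(x, n=8):
    """Interpret an n-bit pattern as a two's complement integer."""
    return x - (1 << n) if x >> (n - 1) else x

# Compare against the arithmetic definition for every pair of 8-bit operands.
for a in range(256):
    for b in range(256):
        true_sum = to_signed(a) + to_signed(b)
        assert add_with_overflow(a, b)[1] == int(not -128 <= true_sum <= 127)
```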
Evaluating Performance
• Logic path has three gate delays: XOR + AND/OR + MUX
• Add/sub
• 1 gate delay for XOR
• 3 gate delays for Sum and 2 for CarryOut
• Each bit slice depends on Ci, the carry output of the previous slice.
• For an N-bit adder the worst-case delay is then 2 * N gate delays
• This worst-case delay describes a ripple adder
Evaluating Performance – ALU Block
• The ALU speed is limited by its slowest block.
• The logic block has 2 gate delays
• The add/subtract block has 2 * N + 1 gate delays, where N >> 1
• The arithmetic block significantly limits performance
• Consider ways to reduce the gate delays in the adder
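To put numbers on the comparison, the delay formula can be evaluated for a few word sizes. This sketch simply plugs N into 2 * N + 1 and uses the 2-gate-delay figure quoted for the logic block; the function and constant names are illustrative.

```python
def ripple_add_sub_delay(n):
    """Worst-case gate delays: 1 for the XOR on B plus 2 per carry stage."""
    return 2 * n + 1

LOGIC_BLOCK_DELAY = 2  # gate delays quoted for the logic block

for n in (4, 16, 32):
    delay = ripple_add_sub_delay(n)
    print(n, delay, delay / LOGIC_BLOCK_DELAY)
```

For N = 32 the add/subtract path comes to 65 gate delays, more than thirty times the logic block.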
Speeding up the ripple carry adder
• Eliminating the ripple
c1 = b0*c0 + a0*c0 + a0*b0
c2 = b1*c1 + a1*c1 + a1*b1
c3 = b2*c2 + a2*c2 + a2*b2
c4 = b3*c3 + a3*c3 + a3*b3
Carry Look Ahead
• When both inputs are 0, no carry
• When one input is 0 and the other is 1, propagate the carry input
• When both inputs are 1, generate a carry
Carry-lookahead adder
• Generate: gi = ai * bi
• Propagate: pi = ai + bi
• Write each carry out as a function of the preceding g, p, and c0
c1 = g0 + p0*c0
c2 = g1 + p1*c1
c3 = g2 + p2*c2
c4 = g3 + p3*c3
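The generate/propagate form is only a rewriting of the ripple carry equations, which a short sweep can confirm. The sketch below (function names are illustrative; bit lists are LSB first) computes the carries both ways for every 4-bit operand pair and both values of c0.

```python
from itertools import product

def carries_from_gp(a_bits, b_bits, c0):
    """Carries c1..cn via c[i+1] = g[i] + p[i]*c[i]."""
    carries = []
    c = c0
    for ai, bi in zip(a_bits, b_bits):
        g = ai & bi      # generate
        p = ai | bi      # propagate
        c = g | (p & c)
        carries.append(c)
    return carries

def carries_from_majority(a_bits, b_bits, c0):
    """Carries c1..cn via c[i+1] = b[i]*c[i] + a[i]*c[i] + a[i]*b[i]."""
    carries = []
    c = c0
    for ai, bi in zip(a_bits, b_bits):
        c = (bi & c) | (ai & c) | (ai & bi)
        carries.append(c)
    return carries

# 4 bits of a, 4 bits of b, and c0: 512 cases in total.
for bits in product((0, 1), repeat=9):
    a, b, c0 = bits[:4], bits[4:8], bits[8]
    assert carries_from_gp(a, b, c0) == carries_from_majority(a, b, c0)
```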
Reducing the complexity
• c1 = g0 + (p0 * c0)
• c2 = g1 + (p1 * [g0 + p0 * c0]) = g1 + (p1 * g0) + (p1 * p0 * c0)
• c3 = g2 + (p2 * g1) + (p2 * p1 * g0) + (p2 * p1 * p0 * c0)
• c4 = ?
• Increase speed at what cost?
• Can you illustrate how to build a 32-bit adder with carry lookahead?
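The expansion pattern can be sketched in general form: each carry is an OR of terms, one per earlier position, where a carry generated at position j (or c0 itself) is ANDed with every propagate signal between it and the target bit. The code below is an illustrative software model, not a gate netlist; the function names are assumptions and the lists are LSB first.

```python
def lookahead_carries(g, p, c0):
    """Flattened carries c1..cn, each computed directly from g, p and c0."""
    carries = []
    for i in range(len(g)):
        # c0 propagated through every position 0..i
        c = c0
        for k in range(i + 1):
            c &= p[k]
        # a carry generated at position j, propagated through j+1..i
        for j in range(i + 1):
            term = g[j]
            for k in range(j + 1, i + 1):
                term &= p[k]
            c |= term
        carries.append(c)
    return carries

def cla4(a, b, c0=0):
    """4-bit add using the flattened carries; returns (sum, carry_out)."""
    abits = [(a >> i) & 1 for i in range(4)]
    bbits = [(b >> i) & 1 for i in range(4)]
    g = [x & y for x, y in zip(abits, bbits)]
    p = [x | y for x, y in zip(abits, bbits)]
    c = [c0] + lookahead_carries(g, p, c0)
    s = 0
    for i in range(4):
        s |= (abits[i] ^ bbits[i] ^ c[i]) << i
    return s, c[4]
```

No carry in `lookahead_carries` depends on another carry, which is where the speedup comes from. The price shows in the loops: the number of terms and the width of the widest AND both grow by one with each bit position.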
Limitations
• The number of gate inputs increases drastically
• Technology permits only a certain maximum number of inputs (fan-in)
• A gate with high fan-in must be realized as a chain of gates with low fan-in
From Prof. Michal G. Wahl
Use the principle to build a 16-bit adder
• Let us add a second level of abstraction!
• Use a 4-bit adder as the first-level abstraction
4-bit wide carry-lookahead
• P0 = p3 * p2 * p1 * p0
• P1 = p7 * p6 * p5 * p4
• P2 = p11 * p10 * p9 * p8
• P3 = p15 * p14 * p13 * p12
• G0 = g3 + (p3 * g2) + (p3 * p2 * g1) + (p3 * p2 * p1 * g0)
• G1 =
• G2 =
• G3 =
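A sketch of how the group signals are used at the second level: each 4-bit block produces a group propagate and a group generate, and the carry into the next block follows the same form as the bit-level equation, C[k+1] = G[k] + P[k] * C[k]. The helper names are illustrative, and the block carries are chained in a loop for brevity; in hardware they would be expanded as in the bit-level lookahead.

```python
def group_pg(g, p):
    """Group propagate and generate for one 4-bit block (lists are LSB first)."""
    P = p[3] & p[2] & p[1] & p[0]
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    return P, G

def block_carries(a, b, c0=0):
    """Carries at the 4-bit block boundaries of a 16-bit add: [C0, C1, C2, C3, C4]."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(16)]
    p = [((a >> i) & 1) | ((b >> i) & 1) for i in range(16)]
    carries = [c0]
    for k in range(4):
        P, G = group_pg(g[4 * k:4 * k + 4], p[4 * k:4 * k + 4])
        carries.append(G | (P & carries[-1]))
    return carries
```

Here `block_carries(a, b)[4]` is the carry out of the 16-bit adder, and entries 1 to 3 are the carries into the upper three 4-bit blocks.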