
Low-power, High-speed Multiplier Architectures

Shawn Nicholl, ELEC-5705y, March 7, 2005. Agenda/Overview: Design Abstraction; Numbering Systems; Addition and Subtraction; Adder Architectures; Multiplication; Traditional Multiplier Architectures; Advanced Multiplier Architectures.


Presentation Transcript


  1. Low-power, High-speed Multiplier Architectures. Shawn Nicholl, ELEC-5705y, March 7, 2005

  2. Agenda/Overview
     • Design Abstraction
     • Numbering Systems
     • Addition and Subtraction
     • Adder Architectures
     • Multiplication
     • Traditional Multiplier Architectures
     • Advanced Multiplier Architectures

  3. Levels of Abstraction in Digital ICs
     • Low-power, high-speed techniques can be used at many levels of abstraction
     • Higher levels of abstraction have a greater effect on overall system performance
     [Figure: levels listed from most to least abstract: Systems, Modules (Multiplier Architectures), Logic Gates, Circuits, Devices]

  4. Numbering Systems – A Quick Review
     • Some common numbering systems:
       • Decimal. Range: 0 to 10^n - 1
       • Unsigned Binary. Range: 0 to 2^n - 1
       • Two’s-Complement. Range: -2^(n-1) to +(2^(n-1) - 1)
     E.g. 45d = 0010 1101b = 0 + 0 + 2^5 + 0 + 2^3 + 2^2 + 0 + 2^0
     2’s comp of 45d: invert to get 1101 0010b, then add 1 to get 1101 0011b = -45d
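
The ranges and the invert-and-add-one rule above can be checked with a short sketch. The function names and the 8-bit default are illustrative assumptions, not something taken from the slides.

```python
def to_twos_complement(value: int, bits: int = 8) -> str:
    """Encode a signed integer as an n-bit two's-complement bit string."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking a negative Python int yields its two's-complement pattern.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(pattern: str) -> int:
    """Decode a two's-complement bit string back to a signed integer."""
    raw = int(pattern, 2)
    # The MSB carries weight -2^(n-1) instead of +2^(n-1).
    return raw - (1 << len(pattern)) if pattern[0] == "1" else raw


print(to_twos_complement(45))            # 00101101
print(to_twos_complement(-45))           # 11010011
print(from_twos_complement("11010011"))  # -45
```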

  5. Adding and Subtracting
     • Two’s-complement algorithm is consistent
     • Addition and subtraction behave the same
     • Negative numbers are treated the same as positive numbers
     Example: Add -45d to 10d
     Signed Decimal Method
       Step 1) Initialize: 10d + (-45d)
       Step 2) Compare so that augend holds larger number: -45d + 10d
       Step 3) Treat as a subtraction: 45d - 10d
       Step 4) Do subtraction (borrows may be required): 45d - 10d = 35d
       Step 5) Negate result (knowing that augend was negative): -35d
     Two’s Complement Method
       Step 1) Initialize: 10d = 0000 1010b, -45d = 1101 0011b
       Step 2) Add (no special rules): 0000 1010b + 1101 0011b = 1101 1101b
       Converting 2’s comp back to decimal: 1101 1101b = -35d
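
To make the contrast between the two methods concrete, here is a minimal sketch of both. The function names are hypothetical, and the sign/magnitude routine is only a model of the pencil-and-paper steps, not a description of any particular hardware.

```python
def add_sign_magnitude(a: int, b: int) -> int:
    """Add using separate signs and magnitudes (compare, subtract, fix sign)."""
    sign_a, mag_a = a < 0, abs(a)
    sign_b, mag_b = b < 0, abs(b)
    if sign_a == sign_b:
        magnitude, negative = mag_a + mag_b, sign_a
    elif mag_a >= mag_b:
        # Larger magnitude first, then subtract; result takes its sign.
        magnitude, negative = mag_a - mag_b, sign_a
    else:
        magnitude, negative = mag_b - mag_a, sign_b
    return -magnitude if negative else magnitude


def add_twos_complement(a: int, b: int, bits: int = 8) -> int:
    """Add two n-bit two's-complement values with one plain binary addition."""
    mask = (1 << bits) - 1
    raw = ((a & mask) + (b & mask)) & mask  # carry out of the MSB is dropped
    return raw - (1 << bits) if raw >> (bits - 1) else raw


print(add_sign_magnitude(10, -45))   # -35
print(add_twos_complement(10, -45))  # -35
```

The two's-complement version has no comparison or sign branch, which is the consistency the slide points out.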

  6. Adding and Subtracting (Example 2)
     Example 2: Subtract -45d from 10d
     Signed Decimal Method
       Step 1) Initialize: 10d - (-45d)
       Step 2) Subtrahend is negative, so negate it and do an addition: 10d + 45d = 55d
     Two’s Complement Method
       Step 1) Initialize: 10d = 0000 1010b, -45d = 1101 0011b
       Step 2) Invert subtrahend and set CIN = 1: 0000 1010b + 0010 1100b + 1b = 0011 0111b
       Converting 2’s comp back to decimal: 0011 0111b = 55d
     • Subtraction logic can be shared with addition logic!
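
The shared add/subtract datapath can be sketched as a single function with a mode flag. The name `add_sub8` and the fixed 8-bit width are assumptions made for illustration.

```python
def add_sub8(a: int, b: int, subtract: bool) -> int:
    """8-bit adder/subtractor: subtraction inverts b and sets carry-in to 1."""
    b_in = (b ^ 0xFF) if subtract else b   # conditional inversion (XOR per bit)
    cin = 1 if subtract else 0
    return (a + b_in + cin) & 0xFF         # one adder serves both operations


print(format(add_sub8(0b00001010, 0b11010011, subtract=True), "08b"))   # 00110111 (55d)
print(format(add_sub8(0b00001010, 0b11010011, subtract=False), "08b"))  # 11011101 (-35d)
```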

  7. Adder Building Blocks
     • Half Adder
       S_n = A_n ⊕ B_n
       CO_n = A_n • B_n
     • Full Adder
       S_n = A_n ⊕ B_n ⊕ CIN_n
       COUT_n = A_n • B_n + A_n • CIN_n + B_n • CIN_n
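
A bit-level sketch of the two building blocks follows. It uses the standard XOR sum and majority carry-out for the full adder; the function names are illustrative.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (sum, carry_out) for two 1-bit inputs."""
    return a ^ b, a & b


def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (sum, carry_out) for two 1-bit inputs plus a carry-in."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)  # carry when at least two inputs are 1
    return s, cout


print(half_adder(1, 1))     # (0, 1)
print(full_adder(1, 0, 1))  # (0, 1)
print(full_adder(1, 1, 1))  # (1, 1)
```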

  8. Adder Architectures (CRA)
     • Carry Ripple Adder (CRA)
     • Gate count ∝ N, so area ∝ N
     • Delay ∝ N
     • Power ∝ N
     • Layout friendly (low fan-in/fan-out; regular structure)
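
The linear delay comes from the carry chain: each bit position waits for the carry of the one below it. A minimal software model, with an assumed function name and default width, looks like this:

```python
def ripple_carry_add(a: int, b: int, bits: int = 8, cin: int = 0) -> tuple[int, int]:
    """Add two n-bit values one full adder at a time; return (sum, carry_out)."""
    carry = cin
    total = 0
    for i in range(bits):  # the loop order mirrors the carry rippling upward
        ai, bi = (a >> i) & 1, (b >> i) & 1
        total |= (ai ^ bi ^ carry) << i
        carry = (ai & bi) | (ai & carry) | (bi & carry)
    return total, carry


print(ripple_carry_add(0b00001010, 0b11010011))  # (221, 0), i.e. 1101 1101b
print(ripple_carry_add(0xFF, 0x01))              # (0, 1)
```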

  9. Adder Architectures (CLA)
     • Carry Lookahead Adder (CLA)
     • Generate: G_n = A_n • B_n
     • Propagate: P_n = A_n + B_n
     • Recursive relationship:
       CIN_n = G_(n-1) + P_(n-1) • CIN_(n-1)
       CIN_n = G_(n-1) + P_(n-1)G_(n-2) + … + P_(n-1)P_(n-2)…P_1G_0 + P_(n-1)P_(n-2)…P_0CIN_0
     • CLA:
       • Delay ∝ log2 N (if built right)
       • Gate count and power are greater than CRA
       • Not layout friendly (high fan-in; difficult to route)
     [Figure: carry generate and propagate cases. Source: Patterson and Hennessy, Figure A.14]
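
The expanded carry equation can be evaluated directly from the generate and propagate signals, without waiting for lower carries. The sketch below does exactly that in software; `lookahead_carries` is a hypothetical name, and the nested loops stand in for what would be wide AND-OR gates in hardware.

```python
def lookahead_carries(a: int, b: int, bits: int = 4, cin: int = 0) -> list[int]:
    """Return [CIN_0, CIN_1, ..., CIN_bits] computed from G and P only."""
    g = [(a >> i) & (b >> i) & 1 for i in range(bits)]    # G_i = A_i AND B_i
    p = [((a >> i) | (b >> i)) & 1 for i in range(bits)]  # P_i = A_i OR B_i
    carries = [cin]
    for n in range(1, bits + 1):
        c = 0
        for j in range(n):
            term = g[j]                # carry generated at position j ...
            for k in range(j + 1, n):
                term &= p[k]           # ... and propagated through j+1 .. n-1
            c |= term
        tail = cin
        for k in range(n):
            tail &= p[k]               # original carry-in propagated all the way
        carries.append(c | tail)
    return carries


print(lookahead_carries(0b1011, 0b0110))  # [0, 0, 1, 1, 1]
```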

  10. Adder Architectures (CSA)
      • Carry Save Adder (CSA)
      • Adders work independently, so very fast
      • Pipelined architecture results in flops and control logic, which increase area and latency
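
A carry-save stage reduces three operands to two without propagating any carry between bit positions, which is why the per-bit adders can work independently. A word-level sketch, assuming non-negative Python integers and an illustrative function name:

```python
def carry_save_add(x: int, y: int, z: int) -> tuple[int, int]:
    """Reduce three operands to a (sum_word, carry_word) pair with no carry chain."""
    sum_word = x ^ y ^ z                          # per-bit sum, no carries
    carry_word = ((x & y) | (x & z) | (y & z)) << 1  # per-bit carries, shifted up
    return sum_word, carry_word


s, c = carry_save_add(5, 6, 7)
print(s, c, s + c)  # 4 14 18
```

A conventional carry-propagating adder is still needed once, at the end, to add the final sum and carry words.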

  11. Unsigned Multiplication
      Example: Multiply 118d by 99d
      • Shift-and-Add Algorithm
      Decimal Method
        Step 1) Initialize: multiplicand 118d, multiplier 99d
        Step 2) Find partial products: 1062d, 1062d
        Step 3) Sum up the shifted partial products: 1062d + 10620d = 11682d
      Two’s Complement Method
        Step 1) Initialize: 118d = 0111 0110b, 99d = 0110 0011b
        Step 2) Find partial products (one per multiplier bit, low-order bit first):
                01110110b, 01110110b, 00000000b, 00000000b, 00000000b, 01110110b, 01110110b, 00000000b
        Step 3) Sum up the shifted partial products: 0010 1101 1010 0010b
        Convert 2’s comp back to decimal: 0010 1101 1010 0010b = 11682d

  12. Shift-and-Add Multiplier
      • Takes N cycles to complete: T_Lat = (T_N-bitADD + T_shift) x N
      • Requires minimal logic (most logic is in the adder)
      [Figure: B (multiplicand) x A (multiplier) = P (product)]
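
The N-cycle behavior can be modeled with one loop iteration per clock cycle. This is a sketch under the assumption of unsigned N-bit operands; the register names follow the B, A, P labels, while the function name and cycle counter are illustrative.

```python
def shift_and_add_multiply(multiplicand: int, multiplier: int, bits: int = 8) -> tuple[int, int]:
    """Unsigned shift-and-add multiply; returns (product, cycles_used)."""
    limit = 1 << bits
    if not (0 <= multiplicand < limit and 0 <= multiplier < limit):
        raise ValueError("operands must be unsigned and fit in the given width")
    b, a, p = multiplicand, multiplier, 0
    cycles = 0
    for _ in range(bits):      # one add (if needed) and one shift per cycle
        if a & 1:
            p += b             # add the current shifted multiplicand
        b <<= 1                # multiplicand shifts left
        a >>= 1                # multiplier shifts right, exposing the next bit
        cycles += 1
    return p, cycles


product, cycles = shift_and_add_multiply(118, 99)
print(product, cycles)             # 11682 8
print(format(product, "016b"))     # 0010110110100010
```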

  13. Basic Signed Multiplication
      • Basic Idea
        • Convert to Unsigned
        • Use Shift-and-Add Multiplier
        • Convert to Signed
      • Extra Hardware!
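
The three steps can be sketched as a wrapper around an unsigned core. Both function names are hypothetical; the sign handling before and after the core corresponds to the extra hardware the slide warns about.

```python
def unsigned_multiply(b: int, a: int, bits: int = 8) -> int:
    """Unsigned shift-and-add core (one multiplier bit per iteration)."""
    p = 0
    for i in range(bits):
        if (a >> i) & 1:
            p += b << i
    return p


def signed_multiply(b: int, a: int, bits: int = 8) -> int:
    """Signed multiply built from the unsigned core plus sign fix-up."""
    negative = (b < 0) != (a < 0)                        # result sign
    magnitude = unsigned_multiply(abs(b), abs(a), bits)  # convert to unsigned
    return -magnitude if negative else magnitude         # convert back to signed


print(signed_multiply(-118, -99))  # 11682
print(signed_multiply(-118, 99))   # -11682
```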

  14. Signed Multiplication
      • Booth Recoding
        • Reduces the number of partial products by recoding the multiplier operand
        • Works for signed numbers
      Example: Multiply -118d by -99d
        Recall, 99d = 0110 0011b; invert to get 1001 1100b, then add 1b: -99d = 1001 1101b
      [Figure: Radix-2 Booth recoding of -99d, based on the low-order bit and the last bit shifted out]

  15. Radix-2 Booth Multiplication
      Example: Multiply -118d by -99d
        B = -118d = 1000 1010b
        -B = 118d = 0111 0110b
        A = -99d = 1001 1101b
      Radix-2 Booth
        Step 1) Initialize: recoded multiplier -99d, low-order digit first: -B, B, -B, 0, 0, B, 0, -B
        Step 2) Find partial products (with sign extension):
                01110110b, 110001010b, 01110110b, 00000000b, 00000000b, 1110001010b, 000000000b, 01110110b
        Step 3) Sum up the shifted partial products: 0010 1101 1010 0010b
        Convert 2’s comp back to decimal: 0010 1101 1010 0010b = 11682d
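
The recoding in this example can be reproduced with the usual radix-2 Booth rule: each digit is (last bit shifted out) minus (current low-order bit), starting with an implied 0 below the LSB. The sketch below assumes that rule and uses illustrative function names; Python's signed integers stand in for sign extension.

```python
def booth_radix2_digits(multiplier: int, bits: int = 8) -> list[int]:
    """Recode an n-bit two's-complement multiplier into digits in {-1, 0, +1}, LSB first."""
    a = multiplier & ((1 << bits) - 1)
    digits = []
    prev = 0                       # implied bit to the right of the LSB
    for i in range(bits):
        cur = (a >> i) & 1
        digits.append(prev - cur)  # 01 -> +1, 10 -> -1, 00 or 11 -> 0
        prev = cur
    return digits


def booth_radix2_multiply(multiplicand: int, multiplier: int, bits: int = 8) -> int:
    """Sum the shifted partial products selected by the recoded digits."""
    return sum(d * (multiplicand << i)
               for i, d in enumerate(booth_radix2_digits(multiplier, bits)))


print(booth_radix2_digits(-99))          # [-1, 1, -1, 0, 0, 1, 0, -1]
print(booth_radix2_multiply(-118, -99))  # 11682
```

With B = -118d, the digit list reads -B, B, -B, 0, 0, B, 0, -B, matching the partial-product selection in the example.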

  16. Array Multiplier
      • Combinatorial, so it is very fast: delay ∝ N
      • Can be pipelined
      • Very regular structure
      [Figure: partial-product rows of the -118d x -99d example (-B, B, -B, 0, 0, B, 0, -B)]

  17. Array Multiplier Structure
      [Figure: array multiplier structure. Source: J. Kuo and J. Lou, Low-Voltage CMOS VLSI Circuits, 1999]

  18. Radix-4 Booth Multiplication
      • Similar to Radix-2, but looks at two low-order bits at a time (instead of 1)
      Recall, 99d = 0110 0011b; invert to get 1001 1100b, then add 1b: -99d = 1001 1101b
      [Figure: Radix-4 Booth recoding of -99d, based on the two low-order bits and the last bit shifted out]

  19. Radix-4 Booth Multiplication
      Example: Multiply -118d by -99d
      • Reduces number of partial products by half!
        B = -118d = 1000 1010b
        -B = 118d = 0111 0110b
        2B = -236d = 1 0001 0100b
        -2B = 236d = 0 1110 1100b
        A = -99d = 1001 1101b
      Radix-4 Booth
        Step 1) Initialize: recoded multiplier -99d, low-order digit first: B, -B, 2B, -2B
        Step 2) Find partial products (with sign extension):
                111111110001010b, 01110110b, 11100010100b, 011101100b
        Step 3) Sum up the shifted partial products: 0010 1101 1010 0010b
        Convert 2’s comp back to decimal: 0010 1101 1010 0010b = 11682d
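
The radix-4 digits can be derived with the standard rule digit = a[2j-1] + a[2j] - 2·a[2j+1], again with an implied 0 below the LSB. The sketch below assumes that rule and an even bit width; the function names are illustrative.

```python
def booth_radix4_digits(multiplier: int, bits: int = 8) -> list[int]:
    """Recode an n-bit two's-complement multiplier into digits in {-2..+2}, LSB first."""
    if bits % 2:
        raise ValueError("this sketch assumes an even bit width")
    a = (multiplier & ((1 << bits) - 1)) << 1  # append the implied 0 below the LSB
    digits = []
    for j in range(0, bits, 2):
        low = (a >> j) & 1         # last bit shifted out
        mid = (a >> (j + 1)) & 1   # lower bit of the current pair
        high = (a >> (j + 2)) & 1  # upper bit of the current pair
        digits.append(low + mid - 2 * high)
    return digits


def booth_radix4_multiply(multiplicand: int, multiplier: int, bits: int = 8) -> int:
    """Sum the partial products; digit g is weighted by 4^g."""
    return sum(d * (multiplicand << (2 * g))
               for g, d in enumerate(booth_radix4_digits(multiplier, bits)))


print(booth_radix4_digits(-99))          # [1, -1, 2, -2]
print(booth_radix4_multiply(-118, -99))  # 11682
```

Four digits replace the eight of the radix-2 recoding, which is the halving of partial products noted above.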

  20. Tree Multiplier
      • Wallace Tree
        • Reduces the total number of full-adders
        • Uses 3:2 Compressor (aka Full Adder)
        • Delay ∝ log_(3/2) N
        • Irregular structure is difficult to lay out
      [Figure: original structure vs. tree structure. Source: J. Kuo, et al., Low-Voltage CMOS VLSI Circuits, 1999]
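
The logarithmic depth can be seen in a word-level model in which each level groups the remaining rows in threes and replaces every group with two rows via a 3:2 compressor. This is a simplification (a real Wallace tree compresses bit columns rather than whole words), and the function name is hypothetical.

```python
def wallace_reduce(rows: list[int]) -> tuple[list[int], int]:
    """Reduce non-negative partial products to at most two rows; return (rows, levels)."""
    rows = list(rows)
    levels = 0
    while len(rows) > 2:
        nxt = []
        full = len(rows) - len(rows) % 3
        for i in range(0, full, 3):
            x, y, z = rows[i:i + 3]
            nxt.append(x ^ y ^ z)                            # sum word
            nxt.append(((x & y) | (x & z) | (y & z)) << 1)   # carry word
        nxt.extend(rows[full:])   # leftover rows pass through to the next level
        rows = nxt
        levels += 1
    return rows, levels


# Partial products of 118d x 99d, one row per multiplier bit.
pps = [(118 << i) if (99 >> i) & 1 else 0 for i in range(8)]
rows, levels = wallace_reduce(pps)
print(levels, sum(rows))  # 4 11682
```

Eight rows shrink as 8, 6, 4, 3, 2, so four compressor levels precede the final carry-propagating addition.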

  21. Twin Pipe Serial-Parallel Multiplier
      • Features
        • Low Power
        • Low Area
        • High latency
      [Figure: one operand is fed in parallel and the other serially; even data bits on rising clock, odd data bits on falling clock]
      Source: S. Shah, et al., “Comparison of 32-bit Multipliers for Various Performance Measures”, 2000.

  22. Cluster Multiplication
      • Divide circuit into clusters of nibble-wide multiplications
      • If all bits in a nibble are zeroes, then use clock-gating to gate multiplication for that nibble
      • Features
        • Low Power (claims 13% savings)
      Source: A. Fayed, M. Bayoumi, “A Novel Architecture for Low-Power Design of Parallel Multipliers”, 2001.
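
As a rough software analogy only (not the architecture from the cited paper), the idea of skipping work for all-zero nibbles can be sketched as below. The function name and the count of active clusters are assumptions introduced for illustration; in hardware the skipped clusters would be clock-gated rather than bypassed in a loop.

```python
def cluster_multiply(a: int, b: int, bits: int = 8) -> tuple[int, int]:
    """Multiply unsigned operands nibble by nibble; return (product, active_clusters)."""
    product = 0
    active = 0
    for i in range(0, bits, 4):
        a_nib = (a >> i) & 0xF
        for j in range(0, bits, 4):
            b_nib = (b >> j) & 0xF
            if a_nib == 0 or b_nib == 0:
                continue               # this cluster contributes nothing: gate it
            active += 1
            product += (a_nib * b_nib) << (i + j)
    return product, active


print(cluster_multiply(118, 99))     # (11682, 4)
print(cluster_multiply(0x0F, 0x30))  # (720, 1)
```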

  23. Multiplexer-Based Array Multiplier
      • Characteristics
        • Fast (because it is array-based)
        • Unlike Booth, does not require encoding logic
        • Processes 1 bit of multiplier and 1 bit of multiplicand at a time, thus it is symmetric
        • Has a zigzag shape, thus not layout-friendly
      Source: K. Pekmestzi, “Multiplexer-Based Array Multipliers”, 1999.

  24. Area-Efficient Multiplexer-Based Multiplier
      • Characteristics
        • Increases each row to have N+1 cells (instead of N)
        • Depth is cut in half (increases “squareness”)
      Source: Y. Wang, Y. Jiang, E. Sha, “On Area-Efficient Low Power Array Multipliers”, 2001.

  25. Low Latency Booth-Encoding-based Pipeline Multiplier
      • Features
        • Delay ∝ N/4
        • Needs (N+N/2)-bit addition at end
        • Uses CLAs instead of CSAs because the longest stage (i.e. the adder at the end) determines the fastest operating frequency
      Source: X. Wu, H. Chen, S. Wei, “Design of a Low Latency High Speed Pipelining Multiplier”, 2001.

  26. Two’s Complement Gray-Encoded Array Multiplier
      • Characteristics
        • Uses Gray code to reduce the switching activity of the multiplier
        • Claims that traditional Booth uses 45% more power
        • Greater area than traditional Booth
      Source: E. Costa, et al., “A New Architecture for 2’s Complement Gray Encoded Array Multiplier”, 2002.
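
The link between Gray code and switching activity can be illustrated with the standard reflected Gray code, in which consecutive values differ in exactly one bit. This sketch only demonstrates that property; it does not model the encoding or the multiplier from the cited paper, and the function names are illustrative.

```python
def binary_to_gray(n: int) -> int:
    """Standard reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_to_binary(g: int) -> int:
    """Inverse of binary_to_gray."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def toggles(seq: list[int]) -> int:
    """Total number of bit flips between consecutive values in a sequence."""
    return sum(bin(x ^ y).count("1") for x, y in zip(seq, seq[1:]))


counting = list(range(16))
print(toggles(counting))                               # 26
print(toggles([binary_to_gray(i) for i in counting]))  # 15
```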

  27. Project Plan

  28. References
      • S. Shah, A.J. Al-Khalili, D. Al-Khalili, “Comparison of 32-bit Multipliers for Various Performance Measures”, Proc. 2000 Int’l Conf. Microelectronics, pp. 75-80, 2000.
      • D. Patterson, J. Hennessy, Computer Architecture – A Quantitative Approach, 2nd ed., San Francisco, CA: Morgan Kaufmann Publishers, Inc., 1996.
      • X. Wu, H. Chen, S. Wei, “Design of a Low Latency High Speed Pipelining Multiplier”, Proc. 2001 Int’l Conf. on ASIC, pp. 551-554, 2001.
      • J. Wakerly, Digital Design – Principles and Practices, 2nd ed., Englewood Cliffs, NJ: Prentice Hall, 1994.
      • J. Kuo and J. Lou, Low-Voltage CMOS VLSI Circuits, New York, NY: John Wiley & Sons, Inc., 1999.
      • K. Pekmestzi, “Multiplexer-Based Array Multipliers”, IEEE Trans. on Computers, vol. 48, pp. 15-23, 1999.
      • A. Fayed, M. Bayoumi, “A Novel Architecture for Low-Power Design of Parallel Multipliers”, Proc. 2001 IEEE Computer Society Workshop on VLSI, pp. 149-154, 2001.
      • Y. Wang, Y. Jiang, E. Sha, “On Area-Efficient Low Power Array Multipliers”, Proc. 2001 IEEE Int’l Conf. on Electronics, Circuits and Systems, vol. 3, pp. 1429-1432, 2001.
