A Timing-Driven Hybrid-Compression Algorithm for Faster Sum-of-Products

A Timing-Driven Hybrid-Compression Algorithm for Faster Sum-of-Products Sabyasachi Das Synplicity Inc Sunil P. Khatri Texas A&M University

e d f b c a q = c * d p = a * b q p z = p + q + e + f z What is a Sum-of-Product? • IC block that performs addition of multiple product and sum terms • Computationally-intensive • Wide usage in DSP, Graphics, Microprocessors

Multiplication {assign z = a * b} MAC {assign z = (a * b) + c} 2-operand Addition {assign z = a + b} Squarer {assign z = a * a} Adder-Tree {assign z = a + b + c + d} Generalized SOP {assign z = (a * b) + (c * d) + (e * f) + g + h + k} Examples of Sum-of-Product Blocks

Structure of Sum-of-Products Inputs • Sum-of-Products block consists of 3 parts (written in the order of data-flow) • Partial Product Generator (PPGen) • Partial Product Reduction Tree (PPRT) • Final Carry-Propagation Adder (CPA) Partial Product Generator (PPGen) Partial Product Reduction Tree (PPRT) Final Carry Propagation Adder (CPA) Output

Partial Product Reduction Tree • In Partial Product Reduction Tree, total number of elements in each bit gets reduced to upto two • Partial Product Reduction Tree (PPRT) consumes >50% delay of the SOP block • Hence the performance of PPRT is crucial to the performance of the SOP block

ai bi Ci+1 Si ai bi ci Ci+1 Si Two Reduction Counters in PPRT • (2:2) Counter • Reduces 2 inputs (ai and bi) to 2 outputs (Si and Ci+1) • (3:2) Counter • Reduces 3 inputs (ai, bi and ci) to 2 outputs (Si and Ci+1)

ai bi ci di Ci+2Ci+1 Si 4:3 Reduction Counters • (4:3) Counter • 4 inputs to 3 outputs • The functionality of the Ci+2 is a 4-input AND gate. • Faster reduction at ith column • Produces element to (i+2)th column at an earlier time • Has larger area than other two counters Key idea is to use (4:3) counter as much as possible in conjunction with the (3:2) and (2:2) counters

Explanation of our approach • Perform column-wise reduction (LSB to MSB) • For each column (or BitSlice/BitCluster) • Sort inputs based on arrival time • Is (2:2) reduction fast? If yes, instantiate that • Else is (3:2) reduction fast? If yes, instantiate that • Else instantiate (4:3) reduction • After each reduction, re-sort the signals and continue

An example of our approach P07 P06 P05 P04 P03 P02 P01 P00 P17 P16 P15 P14 P13 P12 P11 P10 P27 P26 P25 P24 P23 P22 P21 P20 P37 P36 P35 P34 P33 P32 P31 P30 P47 P46 P45 P44 P43 P42 P41 P40 P57 P56 P55 P54 P53 P52 P51 P50 C02 C01 S00 C11 S10 C03 S01 C13 C12 S02

Results On an average, our approach produces about 3.5% speed improvement with 4.3% area penalty

Summary • A 4:3 reduction counter is designed • Reduces elements in the given column at a faster pace • Produces an element to the (i+2)th column at an earlier time • 4:3 reduction counter is used extensively (in conjunction with the existing 3:2 and 2:2 counters) • A timing-driven algorithm selects the correct type of counter that needs to be instantiated • On an average, 3.5% improvement in speed with 4.3% area penalty.

Thank you

A Timing-Driven Hybrid-Compression Algorithm for Faster Sum-of-Products

A Timing-Driven Hybrid-Compression Algorithm for Faster Sum-of-Products

Presentation Transcript

A Timing-Driven Synthesis Approach of a Fast Four-Stage Hybrid Adder in Sum-of-Products

Compression Algorithm

gFPC: A Self-Tuning Compression Algorithm

Sum of Products

Timing-Driven Synthesis for Fast Barrel Shifters

The Sum-Product Algorithm

L12-Algorithm timing

Timing Event-driven simulation

Timing-Driven Placement for Heterogeneous FPGA

A Thermal-Driven Floorplanning Algorithm for 3D ICs

Energy Sum Algorithm

Image Compression and Graphics: More Than a Sum of Parts?

Hybrid Algorithm for mean T

Sum-of-Products (SOP)

A Faster Approximation Scheme for Timing Driven Minimum Cost Layer Assignment

A Hybrid Match Algorithm for XML Schemas

A Faster Approximation Scheme for Timing Driven Minimum Cost Layer Assignment

A Timing-Driven Synthesis Approach of a Fast Four-Stage Hybrid Adder in Sum-of-Products

Algorithm of Aggregate Function SUM

Sum-of-Products (SOP)

Image Compression and Graphics: More Than a Sum of Parts?

A hybrid quantization scheme for image compression