Reducing the Pressure on Routing Resources of FPGAs with Generic Logic Chains
Hadi P. Afshar
Joint work with: Grace Zgheib, Philip Brisk and Paolo Ienne
FPGAs and ASICs Gaps*
• Performance
  • Ratio: 3-4
• Area
  • Ratio: 20-35
• Power
  • Ratio: 7-15
Routing resources consume ≈60-80% of the chip area and are significant contributors to circuit delay.
• How to narrow the gap?
  • Specialized (DSP) blocks
  • Coarser-grained logic blocks
  • Hard-wired connections
• Concerns:
  • Lack of generality and flexibility
  • Underutilization
  • Change in routing structure
*I. Kuon and J. Rose, "Measuring the gap between FPGAs and ASICs", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 26, no. 2, February 2007, pp. 203-215.
Fracturable LUTs
[Figure: a 3-LUT with inputs i0, i1, i2 and configuration bits S0-S7, shown together with 2-LUTs]
Motivation
[Figure: grid of CLBs; a logic cell with 8 inputs hosting LUTs of sizes 3 to 6]
The fracturable LUT structure and extra CLB outputs reduce the under-utilization of large LUTs.
What is the solution?
[Figure: grid of CLBs; a logic cell with 8 inputs and four 4-LUTs, with open questions ("+ ?") on how to extend it]
• More input bandwidth
• Improved logic density
• Dedicated and faster connections
Vertical Look-Up Tables
A 5-LUT can be built from two 4-LUTs with shared inputs and a multiplexer that selects between the two sub-LUTs and is controlled by the 5th input.
[Figure: logic cell with two 5-LUTs, each built from two 4-LUTs, connected vertically with fanout]
• Two 5-LUTs in the logic cell with disjoint inputs
• No routing wire is needed for the interconnection
• No change in the routing network interface
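The 5-LUT construction on this slide (two 4-LUTs with shared inputs plus a multiplexer driven by the 5th input) can be checked with a small software model. This is only an illustrative sketch: the helper names (`make_lut`, `read_lut`, `lut5_from_two_lut4`) and the sample function `parity_and` are hypothetical and do not come from the slides.

```python
from itertools import product


def make_lut(k, func):
    """Build a k-input LUT as a truth table with 2**k entries."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=k)]


def read_lut(table, bits):
    """Read a LUT entry; the first input is the most significant index bit."""
    index = 0
    for b in bits:
        index = (index << 1) | b
    return table[index]


def lut5_from_two_lut4(f5):
    """Model a 5-LUT as two 4-LUTs (cofactors of the 5th input) and a 2:1 mux."""
    low = make_lut(4, lambda a, b, c, d: f5(a, b, c, d, 0))   # used when i4 = 0
    high = make_lut(4, lambda a, b, c, d: f5(a, b, c, d, 1))  # used when i4 = 1

    def evaluate(i0, i1, i2, i3, i4):
        shared = (i0, i1, i2, i3)  # both sub-LUTs see the same four inputs
        return read_lut(high, shared) if i4 else read_lut(low, shared)

    return evaluate


def parity_and(i0, i1, i2, i3, i4):
    """Hypothetical 5-input function used only to exercise the model."""
    return (i0 ^ i1 ^ i2) & (i3 | i4)


composed = lut5_from_two_lut4(parity_and)
matches = all(
    composed(*bits) == parity_and(*bits)
    for bits in product((0, 1), repeat=5)
)
```

Here `matches` compares the composed structure with the original function over all 32 input combinations, which is the Shannon expansion on the 5th input written out as a lookup.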
Example
[Figure: functions F(i0, i1, ... i15) and F(i0, i1, ... i12) mapped using hard-wired logic chain connections and routing wires]
Chaining Heuristic
We need to find chains of functions with 5 or fewer inputs that can be mapped onto the logic chains (vertical 5-LUTs).
[Figure: three netlist graphs running from Input to Output, with numeric node labels]
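The slides do not spell out the heuristic itself, so the following is only a sketch of one plausible approach: treat the netlist as a DAG and search for the longest path that passes exclusively through functions with 5 or fewer inputs. The netlist representation (`fanins`), the function name `longest_chain`, and the example circuit are all assumptions made for illustration.

```python
from functools import lru_cache

MAX_INPUTS = 5  # eligibility limit stated on the slide


def longest_chain(fanins):
    """Return the longest chain of eligible functions in a DAG.

    fanins maps each function name to the list of signals driving it.
    Names that are not keys of fanins are treated as primary inputs.
    """

    def eligible(node):
        return node in fanins and len(fanins[node]) <= MAX_INPUTS

    @lru_cache(maxsize=None)
    def best_ending_at(node):
        # Longest eligible chain whose last element is `node`.
        best = ()
        for driver in fanins[node]:
            if eligible(driver):
                candidate = best_ending_at(driver)
                if len(candidate) > len(best):
                    best = candidate
        return best + (node,)

    chains = [best_ending_at(n) for n in fanins if eligible(n)]
    return list(max(chains, key=len, default=()))


# Hypothetical netlist: g has 6 inputs, so it cannot join a chain.
example = {
    "f1": ["a", "b"],
    "f2": ["f1", "c"],
    "f3": ["f2", "d", "e"],
    "g": ["a", "b", "c", "d", "e", "f1"],
    "f4": ["f3", "g"],
}
chain = longest_chain(example)
```

For this example the search returns the chain f1, f2, f3, f4. A complete flow would also need to extract several disjoint chains and respect the fixed chain direction in the fabric, which this sketch ignores.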
Synthesis and Chaining Results
* The minimum threshold for the chain length is 4, except for “des”, for which it is 3.
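One way to read the footnote is that chains shorter than the threshold are not mapped onto the logic chains. Under that assumption, the filter can be sketched as follows; the function name `keep_chains` and the sample chains are hypothetical, and only the benchmark name "des" and the thresholds 4 and 3 come from the slide.

```python
DEFAULT_MIN_LENGTH = 4             # threshold used for most benchmarks
MIN_LENGTH_OVERRIDES = {"des": 3}  # exception named in the footnote


def keep_chains(benchmark, chains):
    """Keep only the chains that meet the benchmark's minimum length."""
    threshold = MIN_LENGTH_OVERRIDES.get(benchmark, DEFAULT_MIN_LENGTH)
    return [chain for chain in chains if len(chain) >= threshold]


# Hypothetical candidate chains of lengths 3 and 4.
candidates = [["a", "b", "c"], ["d", "e", "f", "g"]]
kept_default = keep_chains("some_benchmark", candidates)
kept_des = keep_chains("des", candidates)
```

With these inputs, `kept_default` holds only the length-4 chain, while `kept_des` keeps both candidates.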
Experimental Methodology
[Figure: experimental flow with the labels Quartus-II, VQM Parser, ABC?, VPR? and Quartus-II; the logic cell built from 4-LUTs and 5-LUTs keeps a similar interface]
Goal: Extract chains of eligible functions from the synthesized netlist in order to place them on the logic chains; the non-chained functions remain unchanged.
Logic Cell Utilization: 4% saving in ALM count
Local Routing Wires: 37% saving in the number of local wires
Total Wire Lengths: 12% saving in total wire length
Delay: No average delay penalty from the placement restriction
Did I say something new?!
• Local connections in Altera Stratix and Cyclone
  • Use available logic cell bandwidth
  • No fracturable LUT structure
• Local connections in Xilinx FPGAs go through multiplexers
  • Carry look-ahead
  • Wide AND functions
• Cascading LUTs to build bigger LUTs in Xilinx Virtex-5
  • Routing wire
  • Few large functions
Conclusion
• Hardwired connections + dedicated logic
• Fewer routing wires
• Lighten the stress on routing resources
• More LC bandwidth
• More logic density
• Less circuit delay
• Less power
• Narrow the FPGA and ASIC gaps
Improved routability with a lighter network
Future Work
• Logic chain aware synthesis
• Guided chaining heuristic
• Multiple logic chains
• 2-D logic chains
Thanks for your attention. hadi.parandehafshar@epfl.ch