Routing Wire Optimization through Generic Synthesis on FPGA Carry. Hadi P. Afshar. Joint work with: Grace Zgheib, Philip Brisk and Paolo Ienne. FPGAs and ASICs Gaps*. How to narrow the gap? Specialized (DSP) blocks. Coarser-grained logic blocks. Hard-wired connections.
Routing Wire Optimization through Generic Synthesis on FPGA Carry
Hadi P. Afshar
Joint work with: Grace Zgheib, Philip Brisk and Paolo Ienne
FPGAs and ASICs Gaps*
• Performance
  • Ratio: 3-4
• Area
  • Ratio: 20-35
• Power
  • Ratio: 7-15
Routing resources consume ≈60-80% of the chip area and are significant contributors to circuit delay.
• How to narrow the gap?
  • Specialized (DSP) blocks
  • Coarser-grained logic blocks
  • Hard-wired connections
• Concerns:
  • Lack of generality and flexibility
  • Underutilization
  • Change in routing structure
*I. Kuon and J. Rose, "Measuring the gap between FPGAs and ASICs", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 26, no. 2, February 2007, pp. 203-215.
Carry Chains
[Figure: array of CLBs; logic cell with 8 inputs, 4-LUTs and adders (+)]
Problem Definition
LUT-mapped flow graph
Step1: Logic Matching
Step2: Chaining
Logic Matching
[Figure: macro cell with inputs A and B, LUTs and an adder (+), carry-in Cin and carry-out Cout]
Step1: Enumeration of the programmable part
Step2: Identifying regular and independent segments
Step3: Developing the alphabet library of the macro cell
Step4: Mask division and library matching
Logic Matching (Example) Step1: Enumeration
Logic Matching (Example) Step2: Regular and Independent Segments
Logic Matching (Example)
Step3: Alphabet library of the cell
A0 = 0, A1 = 0, B0 = 0, B1 = 0
A0 = 1, A1 = 0, B0 = 0, B1 = 0
A0 = 0, A1 = 1, B0 = 0, B1 = 0
A0 = 1, A1 = 1, B0 = 0, B1 = 0
A0 = 0, A1 = 0, B0 = 1, B1 = 0
Logic Matching (Example)
Step4: Mask segmented matching
[Figure: mask divided into four 8-bit segments, each matched against the library]
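The segmented matching in Step4 can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a 32-bit truth-table mask cut into four 8-bit segments (as drawn on the slide) and assumes that a function is mappable when every segment appears in the alphabet stored for that segment position. The names `split_mask` and `matches_library` and the alphabet values are hypothetical.

```python
from typing import List, Set

MASK_BITS = 32      # assumed mask width
SEGMENT_BITS = 8    # assumed segment width (four 8-bit segments)


def split_mask(mask: int, mask_bits: int = MASK_BITS,
               seg_bits: int = SEGMENT_BITS) -> List[int]:
    """Split a mask into equal-width segments, least significant first."""
    seg_mask = (1 << seg_bits) - 1
    return [(mask >> shift) & seg_mask for shift in range(0, mask_bits, seg_bits)]


def matches_library(mask: int, library: List[Set[int]]) -> bool:
    """Return True if every segment of the mask is in that segment's alphabet."""
    segments = split_mask(mask)
    if len(segments) != len(library):
        raise ValueError("library must hold one alphabet per segment")
    return all(seg in alphabet for seg, alphabet in zip(segments, library))


# Hypothetical per-segment alphabets, for illustration only.
library = [{0x00, 0x96}, {0x00, 0xE8}, {0xFF, 0x69}, {0x00, 0x17}]

mappable = matches_library(0x0069E896, library)      # every segment is known
not_mappable = matches_library(0x0069E897, library)  # lowest segment unknown
```

Each segment is looked up independently, so the library only has to store the per-segment alphabets rather than every complete mask.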
How much do we gain?
• Orders of magnitude less memory
• Orders of magnitude fewer comparisons
• Assume that the mask is 32-bit
  • N segments
  • M patterns in each segment
• Our library size = [formula missing] bits
• Number of all configurations = [formula missing]
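The scale of the saving can be estimated with a short calculation. The formulas below are an assumption, not the ones from the slide: they suppose N independent, equal-width segments with M patterns each, so that a segmented library stores N × M segment patterns while a flat library would store all M^N full-width masks. The function names are hypothetical.

```python
def segmented_library_bits(mask_bits: int, n_segments: int, m_patterns: int) -> int:
    """Bits needed to store M patterns for each of N equal-width segments."""
    seg_bits = mask_bits // n_segments
    return n_segments * m_patterns * seg_bits


def all_configurations(n_segments: int, m_patterns: int) -> int:
    """Number of full masks obtainable by combining independent segments."""
    return m_patterns ** n_segments


def flat_library_bits(mask_bits: int, n_segments: int, m_patterns: int) -> int:
    """Bits needed to store every full-width mask explicitly."""
    return all_configurations(n_segments, m_patterns) * mask_bits


# Illustrative numbers only: 32-bit mask, N = 4 segments, M = 16 patterns each.
segmented = segmented_library_bits(32, 4, 16)
flat = flat_library_bits(32, 4, 16)
ratio = flat // segmented
```

Under these assumptions the segmented library grows linearly in M, while the flat one grows as M to the power N, which is consistent with the "orders of magnitude" claim on the slide.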
Chaining Heuristic
We need to find chains of functions that are mappable to the macro cell, to be placed on the carry chains.
[Figure: three example graphs from inputs to outputs, with numbered nodes]
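One way to picture the chaining step is a greedy search for long paths of mappable nodes in the DAG. The slides do not spell out the heuristic, so the sketch below is a stand-in rather than the authors' algorithm: it repeatedly extracts the longest path through still-unused eligible nodes and stops once the best remaining path is shorter than a minimum length. The graph encoding, the function names, and the default threshold of 4 are assumptions made for illustration.

```python
from typing import Dict, List, Set


def longest_chain(graph: Dict[str, List[str]], eligible: Set[str]) -> List[str]:
    """Longest path in a DAG that visits only eligible nodes (memoized DFS)."""
    best: Dict[str, List[str]] = {}

    def visit(node: str) -> List[str]:
        if node in best:
            return best[node]
        tail: List[str] = []
        for succ in graph.get(node, []):
            if succ in eligible:
                candidate = visit(succ)
                if len(candidate) > len(tail):
                    tail = candidate
        best[node] = [node] + tail
        return best[node]

    chain: List[str] = []
    for node in sorted(eligible):  # sorted for deterministic tie-breaking
        candidate = visit(node)
        if len(candidate) > len(chain):
            chain = candidate
    return chain


def extract_chains(graph: Dict[str, List[str]], eligible: Set[str],
                   min_length: int = 4) -> List[List[str]]:
    """Greedily peel off disjoint chains of at least min_length nodes."""
    remaining = set(eligible)
    chains: List[List[str]] = []
    while True:
        chain = longest_chain(graph, remaining)
        if len(chain) < min_length:
            break
        chains.append(chain)
        remaining -= set(chain)
    return chains


# Hypothetical DAG: edges point from a function to the functions it feeds.
graph = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["e"],
         "x": ["y"], "y": ["c"], "e": []}
chains = extract_chains(graph, {"a", "b", "c", "d", "e", "x", "y"})
```

In this example the path a, b, c, d, e is extracted first; the leftover nodes x and y form a path shorter than the threshold and would stay in ordinary LUTs.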
Synthesis and Chaining Results * The minimum threshold for the chain length is 4, except for “des” which is 3.
Experimental Methodology
Flow: Quartus-II (LUT mapping & synthesis) → our synthesis engine (VQM parser → DAG generation → logic matching → chaining heuristic → netlist generation) → Quartus-II (place & route)
Goal: Extract chains of eligible functions from the synthesized netlist in order to place them on the logic chains; the non-chained functions remain unchanged.
Local Routing Wires
26% saving in the number of local wires
Total Wire Lengths
9% saving in total wire length
Delay
3% delay penalty due to the large input-to-output delay of the adder
Conclusion
• Narrow the FPGA and ASIC gaps
• Hardwired connections + dedicated logic
• Lighten the stress on routing resources
• Improved routability with a lighter network
Thanks for your attention. hadi.parandehafshar@epfl.ch