This study presents FPGA area reduction methods using multi-output sequential resynthesis. By integrating combinational and sequential approaches, up to 10% area reduction is achieved in comparison to combinational resynthesis alone. The research explores Boolean matching and SAT-based techniques for multiple-output functions. Experimental results with MIMO and MISO resynthesis approaches are analyzed, highlighting the benefits of sequential strategies in reducing FPGA area utilization. The work addresses limitations in existing resynthesis methods and proposes new algorithms for FPGA area reduction.
FPGA Area Reduction by Multi-Output Sequential Resynthesis • Yu Hu (1), Victor Shih (2), Rupak Majumdar (2) and Lei He (1) • (1) Electrical Engineering Dept., UCLA • (2) Computer Science Dept., UCLA • Presented by Yu Hu • Address comments to lhe@ee.ucla.edu
Outline • Background and Motivation • Combinational Resynthesis with MIMO Blocks • Sequential Resynthesis • Experimental Results • Conclusion and Future Work
Background • Area-optimal Technology Mapping for LUT-based FPGAs is NP-Hard [Farrahi, TCAD’94] • Post-mapping resynthesis is effective to reduce area (LUT#) [Ling, DAC’05] • Uses of resynthesis: area reduction, fault tolerance, power optimization, physical-aware optimization, and many others.
Boolean Matching Based Resynthesis • Attempt to re-map a logic block to reduce LUT# • BM can be used to handle both homogeneous and heterogeneous PLBs (Source: Andrew Ling, University of Toronto, DAC'05)
Overall Flow of BM-based Resynthesis • Multi-iterations of block-based Boolean Matching (Source: Andrew Ling, University of Toronto, DAC'05)
Limitations of Existing Work • Considering single-output logic blocks • Considering combinational portion of the circuit • A larger solution space can be explored and area could be reduced if • Multiple-output logic blocks are considered • FF boundaries are eliminated
Motivation Example – Retiming (2-LUT network) • Resynthesis is restricted by FF boundaries … • Retiming creates chances for resynthesis
Motivation Example – MISO Resynthesis (2-LUT network) • Function of O2 has to be preserved … • Only 1-LUT reduction
Motivation Example – MIMO Resynthesis (2-LUT network) • 60% area reduction is obtained by sequential MIMO resynthesis!
Major Contributions • Present a Boolean matching based resynthesis algorithm considering multi-output logic blocks • Propose a sequential resynthesis technique • Reduce area by up to 10% compared to combinational resynthesis, when both use MIMO blocks
Outline • Background and Motivation • Combinational Resynthesis with MIMO Blocks • SAT-based Boolean Matching for Multiple Output Functions • Resynthesis Algorithm • Experimental Results • Sequential Resynthesis • Experimental Results • Conclusion and Future Work
Existing Boolean Matching for MISO • Formulate the sub-problem of resynthesis as Boolean matching (BM) • BM: Can function f be implemented in circuit g? • Resynthesis: Is there a configuration of g such that, for all inputs to g, f is equivalent to g? [Figure: function f compared against a template circuit g built from 2-LUTs] (Source: Andrew Ling, University of Toronto, DAC'05)
SAT-BM for Multi-Output Functions • Characteristic function of a 2-input LUT (configuration bits L0 to L3 are encoded as SAT literals):
G_LUT[i1, i2, F] = (i1 + i2 + ¬L0 + F) (i1 + i2 + L0 + ¬F)
                   (i1 + ¬i2 + ¬L1 + F) (i1 + ¬i2 + L1 + ¬F)
                   (¬i1 + i2 + ¬L2 + F) (¬i1 + i2 + L2 + ¬F)
                   (¬i1 + ¬i2 + ¬L3 + F) (¬i1 + ¬i2 + L3 + ¬F)
• Characteristic function of the circuit: G = G_LUT1[x1, x2, F2] · G_LUT2[F2, x3, F1]
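The clause pattern above generalizes to any k-input LUT: each truth-table row contributes two clauses that tie the output to that row's configuration bit. The following is a minimal Python sketch of that encoding. The function names, the `(name, polarity)` literal representation and the `A`/`B` configuration-bit prefixes are illustrative assumptions, not part of the original slides or of any SAT library.

```python
from itertools import product

def lut_cnf(inputs, output, prefix):
    """Build the characteristic-function clauses of one k-input LUT.

    A literal is a (name, polarity) pair; polarity True is the positive literal.
    Configuration bit `prefix + str(row)` holds the LUT output for truth-table
    row `row`, where the first input is the most significant bit.
    """
    clauses = []
    for row, bits in enumerate(product([0, 1], repeat=len(inputs))):
        # These literals are all false exactly when the inputs equal `bits`.
        guard = [(name, bit == 0) for name, bit in zip(inputs, bits)]
        cfg = f"{prefix}{row}"
        clauses.append(guard + [(cfg, False), (output, True)])   # cfg set   -> output 1
        clauses.append(guard + [(cfg, True), (output, False)])   # cfg clear -> output 0
    return clauses

def satisfied(clauses, assignment):
    """Check a full assignment (dict name -> bool) against a clause list."""
    return all(any(assignment[n] == pol for n, pol in c) for c in clauses)

# Two cascaded 2-LUTs: G = G_LUT1[x1, x2, F2] * G_LUT2[F2, x3, F1]
G = lut_cnf(["x1", "x2"], "F2", "A") + lut_cnf(["F2", "x3"], "F1", "B")
print(len(G), "clauses")
```

Each 2-input LUT yields eight clauses, so the two-LUT circuit has sixteen before replication over input vectors.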
SAT-BM for Multi-Output Functions • G = G_LUT1[x1, x2, F2] · G_LUT2[F2, x3, F1] • Replicated SAT problem (one copy of G per input vector X = x1 x2 x3):
G_expand = G[X/000, F1/0, F2/0] · G[X/001, F1/0, F2/0]
         · G[X/010, F1/1, F2/0] · G[X/011, F1/0, F2/0]
         · G[X/100, F1/1, F2/0] · G[X/101, F1/0, F2/0]
         · G[X/110, F1/1, F2/1] · G[X/111, F1/1, F2/1]
• SAT! The solution of this SAT problem corresponds to the Boolean matching result
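To make the replication idea concrete, the sketch below checks one copy of the circuit per input vector while sharing the same configuration bits across all copies. It swaps the SAT solver for plain enumeration of the eight configuration bits, which is only workable at this toy size. The two-LUT cascade follows the structure of G above; the target functions and all names are hypothetical examples, not the truth table from the slide.

```python
from itertools import product

def lut(config, a, b):
    """Evaluate a 2-LUT: config[2*a + b] is the output for inputs (a, b)."""
    return config[2 * a + b]

def match_cascade(target):
    """Search configurations for LUT1(x1, x2) -> F2 and LUT2(F2, x3) -> F1.

    `target` maps every input triple (x1, x2, x3) to the required (F1, F2).
    Returns (config1, config2), or None when no configuration matches.
    """
    vectors = list(product([0, 1], repeat=3))
    for c1 in product([0, 1], repeat=4):
        for c2 in product([0, 1], repeat=4):
            # One constraint copy per input vector, all sharing c1 and c2.
            if all(
                (lut(c2, lut(c1, x1, x2), x3), lut(c1, x1, x2)) == target[(x1, x2, x3)]
                for x1, x2, x3 in vectors
            ):
                return c1, c2
    return None

# Hypothetical targets: F2 = x1 AND x2 in both cases.
realizable = {(a, b, c): ((a & b) | c, a & b) for a, b, c in product([0, 1], repeat=3)}
unrealizable = {(a, b, c): (a ^ c, a & b) for a, b, c in product([0, 1], repeat=3)}
print(match_cascade(realizable))
print(match_cascade(unrealizable))
```

The second target fails because F1 = x1 XOR x3 needs x1 directly, which the second LUT cannot see through F2. A SAT-based flow would report that case as UNSAT.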
Unique Problem of MIMO Synthesis • MIMO resynthesis can generate new paths in the block • A new path might cause a combinational cycle • Conservative solution: detect combinational cycles and discard resynthesis solutions with cycles [Figure: block with nodes 1 to 5 between PI and PO, annotated "False path?" and "Combinational cycle!"]
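The conservative check described above amounts to cycle detection on the combinational part of the netlist. Below is a minimal sketch using depth-first search with three node states. The adjacency-dict representation and the node names in the examples are assumptions for illustration and do not reproduce the figure.

```python
def has_combinational_cycle(edges):
    """Detect a directed cycle.

    `edges` maps each node to the list of nodes it drives. Only combinational
    connections should be listed; edges through FFs are left out.
    """
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on the DFS stack, finished
    color = {}

    def visit(node):
        color[node] = GRAY
        for succ in edges.get(node, []):
            state = color.get(succ, WHITE)
            if state == GRAY:                      # back edge closes a cycle
                return True
            if state == WHITE and visit(succ):
                return True
        color[node] = BLACK
        return False

    nodes = set(edges) | {s for fanout in edges.values() for s in fanout}
    return any(color.get(n, WHITE) == WHITE and visit(n) for n in nodes)

acyclic = {"PI": ["n1", "n2"], "n1": ["n3"], "n2": ["n3"], "n3": ["PO"]}
cyclic = {"PI": ["n1"], "n1": ["n2"], "n2": ["n3"], "n3": ["n1", "PO"]}
print(has_combinational_cycle(acyclic), has_combinational_cycle(cyclic))
```

A structural check like this also rejects cycles that are false paths, which is why the slide calls the approach conservative.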
Experimental Settings • Implementation in OAGear • SAT-BM uses miniSAT2.0 • 20 biggest MCNC benchmarks are tested • 10 combinational • 10 sequential • mapped with 4-LUTs by Berkeley ABC • Resynthesis settings • One traversal is performed • Blocks with up to 10 inputs are considered • Results are verified by ABC equivalency checkers
Experimental Settings – PLB templates • All three possible structures for PLBs with up to 10 inputs and fewer than four 4-LUTs [Ling, DAC’05] • All intermediate wires are treated as outputs in MIMO resynthesis
Combinational Resynthesis: MISO vs. MIMO • MIMO does not out-perform MISO significantly, probably due to • Rejecting “false paths” introduced by MIMO resynthesis • Narrow PLB templates • Small block size and LUT size • No iterations of re-synthesis
Outline • Background and Motivation • Combinational Resynthesis with MIMO Blocks • Sequential Resynthesis • Experimental Results • Conclusion and Future Work
Structure Impact on Sequential Resynthesis • The structure of a logic block determines the sequential resynthesis strategy • Retiming • Classic retiming • All edges have non-negative weights after retiming • Peripheral retiming • Results in a negative number of FFs at peripheral edges • Logic Duplication • Duplication allowed • Duplication not allowed
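The distinction between classic and peripheral retiming can be illustrated with the standard lag formulation of retiming, in which an edge u -> v carrying w FFs ends up with w + r(v) - r(u) FFs. The slides do not state this formula, so using it here is an assumption, and the edge and node names are hypothetical.

```python
def retime(edges, lag):
    """Apply a retiming.

    `edges` maps (u, v) to the FF count on that edge; `lag` maps a node to its
    retiming value r(node), defaulting to 0. The new FF count on (u, v) is
    w + r(v) - r(u).
    """
    return {(u, v): w + lag.get(v, 0) - lag.get(u, 0) for (u, v), w in edges.items()}

def is_classic(retimed):
    """Classic retiming requires every edge weight to stay non-negative."""
    return all(w >= 0 for w in retimed.values())

chain = {("in", "a"): 1, ("a", "b"): 0, ("b", "out"): 0}

forward = retime(chain, {"a": -1})     # move the FF from a's input to a's output
borrowed = retime(chain, {"b": -1})    # b has no input FF to move forward
print(forward, is_classic(forward))
print(borrowed, is_classic(borrowed))
```

In the second call the edge into `b` receives a negative weight. A negative FF count of this kind is what peripheral retiming tolerates on peripheral edges, on the expectation that it is cancelled once FFs are moved back after resynthesis.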
Case I: Classic Retiming w/o Duplication Step1: backward retiming Step2: combinational resynthesis Step3: forward retiming
Case II: Peripheral Retiming w/o Duplication Step1: peripheral retiming (borrow FFs from outside) Step2: combinational resynthesis Step3: check feasibility of forward retiming Result: a resynthesis solution w/ feasible retiming
Case II: Peripheral Retiming w/o Duplication Step4: forward retiming
Case III: Retiming w/ Duplication FF not movable! Duplication is required to enable retiming! FF# = 0 FF# = 1
Case III: Peripheral Retiming w/ Duplication FF not movable! Identical configuration for LUT-c and LUT-d.
Duplication or Not? – A Sufficient and Necessary Condition • An acyclic block is feasible for retiming w/o duplication iff [Brayton, TCAD’91] • All paths between a given input i and output j have the same FF# • There exist numbers αi and βj for input i and output j, s.t. FF# in (i,j) path is equal to (αi+βj) • Example (columns: inputs; rows: outputs; * marks an entry with no constraint):
        α1      α2      α3      α4
β1    α1+β1   α2+β1   α3+β1   α4+β1   =   1  0  1  1
β2      *       *     α3+β2   α4+β2   =   *  *  0  0
Solution: α1 = 1, α2 = 0, α3 = 1, α4 = 1, β1 = 0, β2 = -1
Duplication or Not? – A Sufficient and Necessary Condition • An acyclic block is feasible for retiming w/o duplication iff [Brayton, TCAD’91] • All paths between a given input i and output j have the same FF# • There exist numbers αi and βj for input i and output j, s.t. FF# in (i,j) path is equal to (αi+βj) • Time complexity • O(e min(m,n)) • Negligible for small blocks • Classic or peripheral retiming? • Classic retiming iff there exist non-negative αi and βj
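The α/β condition can be checked by propagating labels across the bipartite input-output relation and then verifying every entry. The sketch below is one straightforward way to do that and is not claimed to be the O(e min(m,n)) procedure cited above. It assumes the per-pair FF counts are already known and uniform over all paths of a pair, with `None` meaning no constraint; the function name and matrix layout (`ff[i][j]` for input i, output j) are illustrative.

```python
from collections import deque

def retiming_labels(ff):
    """Find alpha (per input) and beta (per output) with ff[i][j] == alpha[i] + beta[j].

    Returns (alpha, beta, classic), or None if no such labels exist, in which
    case retiming without duplication is infeasible. `classic` is True when a
    non-negative labelling exists.
    """
    m, n = len(ff), len(ff[0])
    alpha, beta = [None] * m, [None] * n
    for start in range(m):
        if alpha[start] is not None:
            continue
        alpha[start] = 0
        ins, outs = [start], []
        queue = deque([(True, start)])
        while queue:                               # label one connected component
            is_input, k = queue.popleft()
            if is_input:
                for j in range(n):
                    if ff[k][j] is not None and beta[j] is None:
                        beta[j] = ff[k][j] - alpha[k]
                        outs.append(j)
                        queue.append((False, j))
            else:
                for i in range(m):
                    if ff[i][k] is not None and alpha[i] is None:
                        alpha[i] = ff[i][k] - beta[k]
                        ins.append(i)
                        queue.append((True, i))
        shift = -min(alpha[i] for i in ins)        # normalize: smallest alpha becomes 0
        for i in ins:
            alpha[i] += shift
        for j in outs:
            beta[j] -= shift
    beta = [0 if b is None else b for b in beta]   # outputs with no constraint
    for i in range(m):
        for j in range(n):
            if ff[i][j] is not None and alpha[i] + beta[j] != ff[i][j]:
                return None
    return alpha, beta, all(b >= 0 for b in beta)

# Inputs 1-4, outputs 1-2, with FF counts 1, 0, 1, 1 to output 1 and 0, 0 to output 2.
example = [[1, None], [0, None], [1, 0], [1, 0]]
print(retiming_labels(example))
```

Within a connected component the labels are unique up to adding a constant to every α and subtracting it from every β. After shifting so the smallest α is 0, a non-negative labelling exists exactly when every β is non-negative, which is how the sketch decides between classic and peripheral retiming.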
Can We Accept Every Single Resynthesis? – Feasibility Checking for Sequential Resynthesis • Initial State Computation • Filter out some of the rewriting steps so that an equivalent initial state for the synthesized machine can be computed from a given initial state of the original machine. • Rewriting invariant [Brayton, IWLS’07] • Can be reduced to a SAT problem • Clock Period Preservation • A New Retiming-based Technology Mapping Algorithm for LUT-based FPGAs [Pan, FPGA’98] • Sequential arrival time: l-values
Experimental Results – Sequential vs. Combinational Resynthesis • Seq-resynthesis obtains up to 9% area reduction • Factors affecting seq-resynthesis • Sequential structure • All factors in combinational resynthesis
Outline • Background and Motivation • Combinational Resynthesis with MIMO Blocks • SAT-based Boolean Matching for Multiple Output Functions • Resynthesis Algorithm • Sequential Resynthesis • Conclusion and Future Work
Conclusions and Future Work • Proposed a new resynthesis approach considering both MIMO blocks and retiming • Results indicate that sequential resynthesis obtains more gain than MIMO resynthesis • Future work • PLBs from [Ling, DAC’05] are optimal only for MISO, and we will develop new PLB structures for MIMO resynthesis • Study the resynthesis for heterogeneous FPGAs
Thanks FPGA Area Reduction by Multi-Output Sequential Resynthesis Yu Hu, Victor Shih, Rupak Majumdar and Lei He