Design, Synthesis and Evaluation of Heterogeneous FPGA with Mixed LUTs and Macro-Gates. Yu Hu (1), Satyaki Das (2), Steve Trimberger (2), and Lei He (1). (1) Electrical Engineering Dept., UCLA; (2) Research Lab, Xilinx Inc. Presented by Yu Hu. Address comments to lhe@ee.ucla.edu.
Heterogeneous FPGA with Macro-Gates
• There is a trade-off between programmability and cost (performance, area, power, etc.).
• Xilinx V4 benefits from small gates (MUX2, XOR2) built into SLICEs.
• Seek a small set of wider logic functions (macro-gates) to replace a large portion of LUTs.
  • Reduces logic area and delay
• What is missing?
  • Design: What should be inside these macro-gates?
  • CAD: Flexible synthesis tools are needed to evaluate the architecture!
Selection of Logic Functions for Macro-Gates
[Figure: example LUT network (nodes a to h, output F). Candidate functions are annotated with triples such as ab'c'+a'bc' / 1 / 75%, abc / 1 / 50%, ab'+a'b / 1 / 50%, ab / 0 / 25%, and a / 1 / 25%; worked scores shown are 1+1*2/3+1*1/3=2, 1+1*1/3=1.33, 1+1*1/2=1.5, and 1*1/2=0.5.]
• Map with LUT-N
• Extract logic functions
• Generate utilization NPN diagram
• Calculate score for logic functions
• Rank logic functions
• Best function: ab'c'+a'bc'
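The extract-and-rank steps above can be illustrated with a small sketch. It assumes that functions are grouped by NPN class (equivalence under input negation, input permutation, and output negation) and ranked by a plain occurrence count. The scoring formula on the slide is not recoverable from this transcript, so the count is only a stand-in, and the names `truth_table`, `npn_canonical`, and `rank_functions` are hypothetical helpers, not part of the authors' tool.

```python
from collections import Counter
from itertools import permutations


def truth_table(func, n):
    """Pack an n-input Boolean function into an integer (bit i = f(minterm i))."""
    tt = 0
    for i in range(1 << n):
        bits = [(i >> k) & 1 for k in range(n)]
        if func(*bits):
            tt |= 1 << i
    return tt


def npn_canonical(tt, n):
    """Smallest truth table in the NPN orbit of tt (brute force, small n only)."""
    full = (1 << (1 << n)) - 1
    best = None
    for perm in permutations(range(n)):
        for neg in range(1 << n):
            new = 0
            for i in range(1 << n):
                # Map minterm i through the input negation mask and permutation.
                j = 0
                for k in range(n):
                    bit = ((i >> k) & 1) ^ ((neg >> k) & 1)
                    j |= bit << perm[k]
                if (tt >> j) & 1:
                    new |= 1 << i
            # Also consider the output-negated version.
            for cand in (new, new ^ full):
                if best is None or cand < best:
                    best = cand
    return best


def rank_functions(tables, n):
    """Rank NPN classes by occurrence count (assumed stand-in for the slide's score)."""
    counts = Counter(npn_canonical(tt, n) for tt in tables)
    return counts.most_common()


# Hypothetical functions extracted from a mapped netlist.
extracted = [
    truth_table(lambda a, b, c: (a ^ b) & (1 - c), 3),  # ab'c' + a'bc'
    truth_table(lambda a, b, c: (b ^ c) & (1 - a), 3),  # same class, inputs permuted
    truth_table(lambda a, b, c: a & b & c, 3),          # abc
]
ranking = rank_functions(extracted, 3)
```

Here the two XOR-and-inverted-input functions collapse into one class, so that class ranks first. The brute-force canonicalization is only practical for a handful of inputs; a real flow would need a faster canonical form.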
Proposed Macro-Gates and FPGA Architecture
• For the IWLS’05 benchmarks, the following four 6-input functions have the highest ranks:
  • GI1 = a b c d e f (AND-6)
  • GI2 = a' b' c' + b c f' + b c' d' + b' c e (MUX-4)
  • GI3 = a b' c d' e + b c e f + d e f
  • GI4 = a b' + a' c d' + b' c' + e' + f'
• [Figure: architecture of the proposed macro-gate and FPGA slice]
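As a quick check of the four expressions, the sketch below writes them as Python functions over 0/1 inputs and verifies two of the labels by exhaustive enumeration: GI1 is true for exactly one input combination (AND-6), and GI2 behaves as a 4-to-1 multiplexer with b and c as select lines and a', d', e, f' as data inputs. The helper names (`inv`, `on_set_size`, `gi2_is_mux4`) are illustrative and not from the slides.

```python
from itertools import product


def inv(x):
    """Complement of a 0/1 value."""
    return 1 - x


def gi1(a, b, c, d, e, f):
    return a & b & c & d & e & f


def gi2(a, b, c, d, e, f):
    return ((inv(a) & inv(b) & inv(c)) | (b & c & inv(f))
            | (b & inv(c) & inv(d)) | (inv(b) & c & e))


def gi3(a, b, c, d, e, f):
    return (a & inv(b) & c & inv(d) & e) | (b & c & e & f) | (d & e & f)


def gi4(a, b, c, d, e, f):
    return ((a & inv(b)) | (inv(a) & c & inv(d)) | (inv(b) & inv(c))
            | inv(e) | inv(f))


def on_set_size(func):
    """Number of input combinations for which a 6-input function is 1."""
    return sum(func(*bits) for bits in product((0, 1), repeat=6))


def gi2_is_mux4():
    """Check that GI2 selects a', d', e, or f' according to (b, c)."""
    for a, b, c, d, e, f in product((0, 1), repeat=6):
        data = {(0, 0): inv(a), (1, 0): inv(d), (0, 1): e, (1, 1): inv(f)}
        if gi2(a, b, c, d, e, f) != data[(b, c)]:
            return False
    return True
```

Calling `gi2_is_mux4()` confirms the MUX-4 reading of GI2, with some data inputs inverted.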
Mapping: Resource Utilization Balancer
• The number of available logic resources of each type in an FPGA is fixed.
• The technology mapper should balance the utilization of the different logic resources to minimize the packing area.
• A binary integer linear program is used to balance logic resource utilization while preserving timing.
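To make the balancing idea concrete, here is a toy version of such a binary program, solved by exhaustive enumeration rather than an ILP solver. Everything in it is an assumption for illustration: one binary variable per mapped node (1 = implement in a macro-gate, 0 = implement in a LUT), a hypothetical slice with 1 LUT and 3 macro-gates, and a `can_use_mg` flag that stands in for both function compatibility and the timing-preservation constraint. It is not the formulation from the paper.

```python
from itertools import product
from math import ceil


def slices_needed(num_lut, num_mg, luts_per_slice, mgs_per_slice):
    """Slices required when LUTs and macro-gates are packed independently."""
    return max(ceil(num_lut / luts_per_slice), ceil(num_mg / mgs_per_slice))


def balance(can_use_mg, luts_per_slice=1, mgs_per_slice=3):
    """Choose x[i] in {0, 1} (1 = macro-gate) to minimize the slice count.

    can_use_mg[i] is False when node i must stay in a LUT (assumed to cover
    both unsupported functions and moves that would hurt timing).
    """
    best_cost, best_x = None, None
    for x in product((0, 1), repeat=len(can_use_mg)):
        # Feasibility: a node may use a macro-gate only if it is allowed to.
        if any(xi and not ok for xi, ok in zip(x, can_use_mg)):
            continue
        num_mg = sum(x)
        num_lut = len(x) - num_mg
        cost = slices_needed(num_lut, num_mg, luts_per_slice, mgs_per_slice)
        if best_cost is None or cost < best_cost:
            best_cost, best_x = cost, x
    return best_cost, best_x


# Eight mapped nodes; the last two must remain LUTs.
cost, assignment = balance([True] * 6 + [False] * 2)
```

In this example, mapping everything to LUTs would need 8 slices, while moving all six eligible nodes to macro-gates fills both resource types evenly and needs only 2. Enumeration scales as 2^n, so a real mapper would hand the same objective and constraints to an ILP solver.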
Mapping: SAT-Based Slice Packing
• Formulate the slice packing problem as a localized place-and-route validation problem, which is solved by SAT:
  • Exclusivity constraint: (¬X@A) ∨ (¬X@B)
  • Presence constraint: (X@A) ∨ (¬X@B)
  • Input/output constraint: X@A → U5@N10
  • Routing constraint: (G0→out ∧ U5@N10) → U5@N12
  • More constraints in the paper …
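A minimal sketch of how placement constraints of this kind can be written as CNF clauses and checked follows. It uses brute-force enumeration in place of a real SAT solver, and the clause set is this sketch's own assumption: each block occupies at least one site, at most one site (the exclusivity clause above), each site holds at most one block, and one hypothetical compatibility restriction. The routing and input/output constraints from the slide are not modeled.

```python
from itertools import product


def solve(variables, clauses):
    """Return every assignment (as a dict) that satisfies all CNF clauses.

    Each clause is a list of (variable, polarity) literals.
    """
    solutions = []
    for values in product((False, True), repeat=len(variables)):
        assign = dict(zip(variables, values))
        if all(any(assign[v] == pol for v, pol in clause) for clause in clauses):
            solutions.append(assign)
    return solutions


blocks = ["X", "Y"]
sites = ["A", "B"]
variables = [f"{b}@{s}" for b in blocks for s in sites]

clauses = []
for b in blocks:
    # At least one site per block (assumed form of a presence constraint).
    clauses.append([(f"{b}@{s}", True) for s in sites])
    # At most one site per block: (not b@A) or (not b@B).
    clauses.append([(f"{b}@A", False), (f"{b}@B", False)])
for s in sites:
    # Each site holds at most one block.
    clauses.append([(f"X@{s}", False), (f"Y@{s}", False)])
# Hypothetical restriction: block Y cannot be implemented at site A.
clauses.append([("Y@A", False)])

solutions = solve(variables, clauses)
placement = sorted(v for v, val in solutions[0].items() if val)
```

With these clauses the only satisfying assignment places X at site A and Y at site B. If `solutions` came back empty, the candidate packing would be rejected as infeasible.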
Overall Flow for Technology Mapping
[Figure: example LUT network (nodes a to h, output F) mapped onto slices built from LUT6 and MG6 cells]
• Area weight setting
• Cut-based mapping
• Area-balance trade-off? (branches labelled Y and N)
• LUT-MG ratio balancer
• Packing
Architecture Evaluation
• Four architectures are compared: LUT4, LUT4 + macro-gate, LUT6, and LUT6 + macro-gate.
• Power and delay model: based on transistor count.
• For the IWLS’05 benchmarks, mixing LUTs and macro-gates reduces delay and device area.