2013 CAD Contest Technology Mapping for Macro Blocks Team: WCYLab-Bach Ching-Yi Huang, Wei-An Ji, Yu-Min Chou, Zheng-Shan Yu Date: 2013/7/22
Outline • Problem Formulation • Framework & Flow • Logical macro mapping • Continuous AND/NAND/OR/NOR/XOR/XNOR • Arithmetical macro mapping • Adder • Others • Framework • Cutting • Matching • Progress & Future Work
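Framework (I/O) [Figure: design D is converted by ABC to D.blif; the library is converted by VTR to L.blif; Mapping produces D’.blif, which ABC converts to D’.v]
[Figure: flow chart. design.v and lib.v → Verilog parser → BLIF → AIG → optimize (Resyn2, NMG+NAR), repeated while node# decreases → algorithm: 1. continuous AND/OR map, 2. adder, 3. mux, 4. K cut with Lazy man matching → output → standard cell mapping → Out.v]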
Framework & Flow • Classify the libraries into two categories • For logical macro blocks • Search for continuous AND/NAND/OR/NOR • Search for continuous XOR/XNOR • For arithmetic macro blocks • Adders • Mux • Others • Partition the multi-PO circuit into single-PO TFICs (transitive fan-in cones) • Only handle K-cut sub-circuits, where K < 13 • Map the cuts using the Lazy Man’s method • Greedily cut and match the macros • Standard cell technology mapping • Transform the AIG into a normal gate-level circuit • Isolate the mapped macro blocks at the same time
Continuous AND Gates • Use DFS and a disjoint set to search the netlist • Constraints • No fanout inside the cone • Maximum input count of the macro [Figure: DFS from Target back toward PI-1 and PI-2]
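The search on this slide can be sketched as follows. This is a minimal illustration, not the team's implementation: the netlist encoding (a dict from each 2-input AND gate to its fanins), the function names, and the greedy handling of `max_inputs` are all assumptions, and fanout is counted only among the AND gates in the dict.

```python
from collections import defaultdict


def find(parent, x):
    """Disjoint-set find with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def continuous_and_cones(and_gates, max_inputs):
    """Group chained AND gates into cones.

    and_gates: dict gate -> (fanin0, fanin1); names absent from the dict
    are treated as cone inputs (hypothetical encoding).
    Returns dict cone_root -> (sorted member gates, sorted cone inputs).
    """
    fanout = defaultdict(int)
    for fanins in and_gates.values():
        for f in fanins:
            fanout[f] += 1

    parent = {g: g for g in and_gates}
    leaves = {g: set(f) for g, f in and_gates.items()}

    def dfs(gate):
        for child in and_gates[gate]:
            # Only absorb AND gates with a single fanout (no fanout in the cone).
            if child in and_gates and fanout[child] == 1:
                dfs(child)
                top, sub = find(parent, gate), find(parent, child)
                merged = (leaves[top] - {child}) | leaves[sub]
                if len(merged) <= max_inputs:  # respect the macro's input limit
                    parent[sub] = top
                    leaves[top] = merged

    # Targets are gates that are not the single-fanout child of another AND.
    for gate in and_gates:
        if fanout[gate] != 1:
            dfs(gate)

    cones = defaultdict(list)
    for gate in and_gates:
        cones[find(parent, gate)].append(gate)
    return {r: (sorted(m), sorted(leaves[r])) for r, m in cones.items()}
```

For a chain `n1 = a & b`, `n2 = n1 & c`, `t = n2 & d`, a limit of 4 inputs yields one cone rooted at `t`; a limit of 3 splits it into `t` and a smaller cone rooted at `n2`.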
Continuous NAND/OR/NOR Gates • Use DFS and a disjoint set to search the netlist • Consider the locations of the inverters [Figure: cont. NAND, cont. OR, and cont. NOR cones, each with its Target]
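Why the inverter locations matter follows from De Morgan's laws: in an AIG every cone is built from ANDs, and the gate it realises depends on whether the cone inputs and the cone output are inverted. The sketch below assumes a simplified cone in which either all inputs or none are inverted; the function names are hypothetical.

```python
def classify_cone(leaves_inverted, output_inverted):
    """Name the gate realised by an AND cone from where its inverters sit."""
    table = {
        (False, False): "AND",
        (False, True): "NAND",
        (True, True): "OR",    # !(!a & !b) == a | b
        (True, False): "NOR",  # !a & !b == !(a | b)
    }
    return table[(leaves_inverted, output_inverted)]


def cone_value(inputs, leaves_inverted, output_inverted):
    """Evaluate the AND cone with the given inverter placement."""
    value = all((not x) if leaves_inverted else x for x in inputs)
    return (not value) if output_inverted else value
```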
Issue • Fanout-reconverge • Duplicate [Figure: Target fed through n1 and n2 from PI-1 and PI-2; in the duplicated version PI-1 appears twice]
XOR/XNOR Gate • Brute-force analysis • 3 structures for XOR • 3 structures for XNOR
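One way to check a candidate structure is to simulate the small AIG cone over both inputs and compare its truth table with XOR and XNOR. The sketch below assumes a hypothetical AIG encoding (each AND node maps to two `(fanin, inverted)` pairs); the two example structures are common AIG forms of XOR and are not claimed to be the three structures the slide refers to.

```python
def evaluate(aig, node, assignment):
    """Evaluate an AIG node; aig maps node -> ((fanin, inv), (fanin, inv))."""
    if node in assignment:
        return assignment[node]
    (f0, inv0), (f1, inv1) = aig[node]
    return (evaluate(aig, f0, assignment) ^ inv0) & (evaluate(aig, f1, assignment) ^ inv1)


def classify(aig, root, root_inverted, a, b):
    """Return 'XOR', 'XNOR', or None for a two-input cone."""
    table = 0
    for idx in range(4):  # idx bit 0 is input a, bit 1 is input b
        value = evaluate(aig, root, {a: idx & 1, b: (idx >> 1) & 1}) ^ root_inverted
        table |= value << idx
    return {0b0110: "XOR", 0b1001: "XNOR"}.get(table)


# Structure 1: !( !(a & !b) & !(!a & b) )
aig1 = {"p": (("a", 0), ("b", 1)), "q": (("a", 1), ("b", 0)), "r": (("p", 1), ("q", 1))}
# Structure 2: !(a & b) & !(!a & !b)
aig2 = {"p": (("a", 0), ("b", 0)), "q": (("a", 1), ("b", 1)), "r": (("p", 1), ("q", 1))}
```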
Continuous XOR gates • How to record? • Super gate & DFS & disjoint set [Figure: DFS over the netlist; each XOR is saved as a super gate, then adjacent super gates are united and saved]
Issue • Cont. AND vs. cont. OR/NOR/NAND • If only one library appears: no problem • If both appear? • AND or NOR? • Overlaps between XOR and cont. AND/…? [Figure: cone ending at Target]
Two directions of mapping • True macros • Gate reusing
Adder • Full adder • Structural mapping
Adder [Figure: two half adders (HA) producing Sum and Cout; the truth tables of Sum and Cout are taken over a 3-cut]
Adder • 3-bit adder [Figure: adder with Cin; the truth tables of Sum and Cout are taken over a 3-cut]
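The truth-table check on a 3-cut can be sketched as below. The bit ordering (input `i` is bit `i` of the minterm index) and the function names are assumptions made for illustration. Because both Sum and Cout of a full adder are symmetric functions, the comparison does not depend on the order of the three cut inputs.

```python
def truth_table(func, n):
    """Pack the outputs of an n-input function into an integer truth table."""
    table = 0
    for idx in range(1 << n):
        bits = [(idx >> i) & 1 for i in range(n)]
        table |= func(*bits) << idx
    return table


FA_SUM = truth_table(lambda a, b, c: a ^ b ^ c, 3)                     # 0x96
FA_COUT = truth_table(lambda a, b, c: (a & b) | (a & c) | (b & c), 3)  # 0xE8
HA_SUM = truth_table(lambda a, b: a ^ b, 2)                            # 0b0110
HA_COUT = truth_table(lambda a, b: a & b, 2)                           # 0b1000


def is_full_adder(sum_table, cout_table):
    """True if two 3-cut truth tables are the Sum and Cout of a full adder."""
    return (sum_table, cout_table) == (FA_SUM, FA_COUT)


def two_half_adders(a, b, c):
    """Full adder built from two half adders and an OR."""
    s1, c1 = a ^ b, a & b
    return s1 ^ c, c1 | (s1 & c)
```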
[Figure: adder chain with constant 0/1 inputs; node labels A0 xnor A1, (A0+A1) xor A2, (A0 A2) + (A1 A2)] • A1’ xor A0 = A1 xnor A0 • A1’A0 + A1 = A0 + A1 • A2 xor (A0+A1) • A2 and (A0+A1) = A0A2 + A1A2 • A3 xor (A0A2+A1A2) = A3’(A0A2+A1A2) + A3(A0A2+A1A2)’ • A3 and (A0A2+A1A2)
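The Boolean identities written on this slide can be confirmed by exhaustive enumeration over A0..A3. The helper name below is hypothetical; the check only verifies the algebra and makes no claim about how the figure wires the adder.

```python
from itertools import product


def check_identities():
    """Verify the slide's identities for every assignment of A0..A3."""
    for a0, a1, a2, a3 in product((0, 1), repeat=4):
        inv = lambda x: 1 - x
        # A1' xor A0 = A1 xnor A0
        assert inv(a1) ^ a0 == inv(a1 ^ a0)
        # A1'A0 + A1 = A0 + A1
        assert (inv(a1) & a0) | a1 == a0 | a1
        # A2 and (A0 + A1) = A0A2 + A1A2
        carry = a2 & (a0 | a1)
        assert carry == (a0 & a2) | (a1 & a2)
        # A3 xor X = A3'X + A3X', with X = A0A2 + A1A2
        assert a3 ^ carry == (inv(a3) & carry) | (a3 & inv(carry))
    return True
```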
Adder [Figure: half adder (HA) with constant-0 inputs and output Cout]
Adder • True full adders: inputs A0 B0 … An Bn, outputs Sum0~n and Cout • Structural gate reusing by (0,1) insertion: inputs A0, A1 (and A2) padded with constants 0 and 1, outputs Sum0~1 and Cout, or Sum0~2 and X • Only half adders: inputs A0 B0 padded with 0s, outputs Sum0 and Cout, or Sum0 and X
Mux • 4-bit mux [Figure: select inputs In1[0], In1[1]; data inputs In2[0] to In2[3]; a second version ties some inputs to constant 0]
Mux [Figure: MUX with inputs A, B, C and output F]
Adder3 + Adder4 • A+B+C or (A+B)+C? • Example of A+B+C: too complicated!? (5-cut) [Figure: adder structures with Sum, Cout1, Cout2, Cin1, Cin2]
Multiplier [Figure: 1st level and 2nd level]
Multiplier • Reusing? [Figure: (0 A2 A1 A0) × (B3 0 0 B0), with partial-product rows 0 A2B0 A1B0 A0B0 and 0 A2B3 A1B3 A0B3]
4-bit Adder3 [Figure: column-wise sum of A3..A0, B3..B0, and C3..C0 (Ai + Bi + Ci per bit), shown in two versions with the adder cells labelled HA and FA]
4-bit Adder4 [Figure: column-wise sum of A3..A0, B3..B0, C3..C0, and D3..D0 (Ai + Bi + Ci + Di per bit), shown in two versions with the adder cells labelled HA and FA]
Mux • 8-bit mux [Figure: select inputs In1[0] to In1[2]; data inputs In2[0] to In2[7]]
Mux • 4-bit mux [Figure: select inputs In1[0], In1[1]; data inputs In2[0] to In2[3]; the remaining inputs tied to constant 0]
Framework • 1. Partition POs into PO macros [Figure: macros grouped into PO macro 1, PO macro 2, PO macro 3]
Framework • 2. For each PO’s TFIC (PO macro), determine the PIs with constraint K = 12, 11, 10, …, 3, 2, and produce sub-macros • E.g. K = 12: which PIs will be selected? • Exhaustive if C(N, K) < 64, otherwise random [Figure: Sub-macro 1, Sub-macro 2]
Framework • 2. For each PO macro, determine the PIs with constraint K = 12, 11, 10, …, 3, 2 • E.g. K = 8 [Figure: Sub-macro 3, Sub-macro 4]
Framework • 3. For each sub-macro, after selecting the PIs, assign the constant values to create mini-macros • Random? • How many? [Figure: mini-macro whose remaining inputs are each marked "0, 1?"]
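The PI selection in step 2 can be sketched as below, reading the slide's rule as "enumerate every K-subset when C(N, K) < 64, otherwise sample at random". The function name, the number of random samples, and the fixed seed are assumptions; the slide does not say how many random subsets are drawn.

```python
import math
import random
from itertools import combinations


def select_pi_subsets(pis, k, limit=64, samples=64, seed=0):
    """Choose K-subsets of the PIs: all of them if few, else a random sample."""
    n = len(pis)
    if k >= n:
        return [tuple(pis)]
    total = math.comb(n, k)
    if total < limit:
        return list(combinations(pis, k))   # exhaustive
    rng = random.Random(seed)               # assumed: reproducible sampling
    chosen = set()
    while len(chosen) < min(samples, total):
        chosen.add(tuple(sorted(rng.sample(pis, k))))
    return sorted(chosen)
```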
Framework • 3. For each sub-macro, after selecting the PIs, assign the constant values to create mini-macros • Rule (1-level) • Logic implication? • How many random? • Exhaustive (< 26=128) + random [Figure: mini-macro with some inputs fixed to 1 and 0 and others assigned random 0/1]
Framework • 4. Cut-and-map • 5. Select the best mapping result in each PO macro • Greedy: cut-and-map from larger mini-macros to smaller mini-macros • Among the mini-macros with the same K constraint, select the best one; if some mini-macro can be mapped, go to 6 • 6. Greedily select the mapping results among all PO macros [Figure: three macros labelled 20, 8, and 35]
Framework • 6. Greedily select the mapping results among all POs’ TFICs • Only consider disjoint PO macros [Figure: macros ranked 1st (35) and 2nd (28)]
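Assigning a constant to an input amounts to taking a cofactor of the truth table, which leaves a function with one input fewer. The sketch below assumes truth tables packed into integers with input `i` as bit `i` of the minterm index; the function name is hypothetical.

```python
def cofactor(table, n, var, value):
    """Fix input `var` of an n-input truth table to `value`.

    Returns the truth table of the resulting (n-1)-input function.
    """
    out = 0
    pos = 0
    for idx in range(1 << n):
        if ((idx >> var) & 1) == value:
            if (table >> idx) & 1:
                out |= 1 << pos
            pos += 1
    return out
```

For example, fixing the third input of a full adder to 0 turns Sum (`0x96`) into XOR (`0b0110`) and Cout (`0xE8`) into AND (`0b1000`), which is a half adder.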
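Step 6 can be sketched as a greedy pass over the candidates in descending order of benefit, keeping a candidate only if it shares no node with those already kept. The tuple layout and the meaning of the numeric value (here called `gain`) are assumptions; the slide only shows numbers attached to macros.

```python
def greedy_select(candidates):
    """Pick disjoint mapping results, best first.

    candidates: iterable of (name, gain, nodes), where nodes is the set of
    circuit nodes the mapped macro covers (hypothetical layout).
    """
    chosen, used = [], set()
    for name, gain, nodes in sorted(candidates, key=lambda c: c[1], reverse=True):
        if used.isdisjoint(nodes):
            chosen.append(name)
            used.update(nodes)
    return chosen
```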
K cut • Leaf number: 3 <= K <= 12 • Mode 1: return all cuts with K <= 12 • Mode 2: ignore covered cuts • Max cuts per node: 1000
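A basic bottom-up K-feasible cut enumeration with a per-node cap can be sketched as follows. It is a generic textbook-style procedure under assumed data structures (a dict of 2-input AND nodes plus a topological order) and does not reproduce the two modes listed on the slide.

```python
def enumerate_cuts(aig, order, k=12, max_cuts=1000):
    """Enumerate cuts with at most k leaves for every AND node.

    aig: dict node -> (fanin0, fanin1); names absent from aig are PIs.
    order: the AND nodes in topological order (fanins first).
    """
    cuts = {}

    def cuts_of(x):
        # A PI only has its trivial cut.
        return cuts.get(x, [frozenset([x])])

    for node in order:
        f0, f1 = aig[node]
        found = {frozenset([node])}           # trivial cut
        for c0 in cuts_of(f0):
            for c1 in cuts_of(f1):
                merged = c0 | c1
                if len(merged) <= k:
                    found.add(merged)
        # Keep the smallest cuts first when the cap is hit.
        ranked = sorted(found, key=lambda c: (len(c), sorted(c)))
        cuts[node] = ranked[:max_cuts]
    return cuts
```

The lower bound on the slide (K >= 3) can then be applied when choosing which cuts to map, for instance with `[c for c in cuts[node] if len(c) >= 3]`.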
Matching • Lazy man’s semi-canonical form • Input: truth table F • Determine the polarity of F by the number of 1s in the truth table • Determine the polarity of each variable by the number of 1s in the negative cofactor w.r.t. that variable • Sort the input variables by the number of 1s in their negative cofactors and permute the inputs accordingly • Output: canonicized truth table F
Isolation between macros and other standard cells • Trick [Figure: mapped macros separated from the rest of the circuit by PPOs and PPIs]
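The steps on this slide can be sketched as below, with truth tables packed into integers (input `i` is bit `i` of the minterm index). Several details are assumptions, since the slide does not fix them: the output is complemented when more than half of the entries are 1, a variable is flipped when its negative cofactor holds more 1s than its positive cofactor, and ties in the sort keep the original variable order. Because of such ties the form is only semi-canonical.

```python
def ones_in_negative_cofactor(table, n, var):
    """Count the 1s among minterms where input `var` is 0."""
    return sum((table >> idx) & 1 for idx in range(1 << n) if not (idx >> var) & 1)


def flip_var(table, n, var):
    """Truth table of F with input `var` complemented."""
    out = 0
    for idx in range(1 << n):
        if (table >> idx) & 1:
            out |= 1 << (idx ^ (1 << var))
    return out


def permute(table, n, order):
    """New input i takes the role of old input order[i]."""
    out = 0
    for idx in range(1 << n):
        old = 0
        for i, v in enumerate(order):
            if (idx >> i) & 1:
                old |= 1 << v
        if (table >> old) & 1:
            out |= 1 << idx
    return out


def semi_canonical(table, n):
    size = 1 << n
    total = bin(table).count("1")
    if total > size // 2:                      # output polarity
        table ^= (1 << size) - 1
        total = size - total
    for var in range(n):                       # input polarities
        neg = ones_in_negative_cofactor(table, n, var)
        if neg > total - neg:
            table = flip_var(table, n, var)
    counts = [ones_in_negative_cofactor(table, n, v) for v in range(n)]
    order = sorted(range(n), key=lambda v: counts[v])   # input permutation
    return permute(table, n, order)
```

With this sketch, `a & b`, `a & !b`, `!a & b`, and `a | b` all reduce to the same table, so a cut implementing any of them matches a library cell stored under that form.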