Evaluation of Logic by Software Using BDDs Andrew Mihal EE219B Spring 2000 5/16/2000
Outline • Problem Statement • Simple Approach • Table-based BDD Approach • Branch-based BDD Approach • BDD Visualization • Clumping Algorithms
Problem Statement • Given: • A multilevel combinational network in BLIF format • Generate: • A software function that evaluates the network • void eval_network(int PI[], int PO[]);
Simple Approach • Perform topological sort on nodes • Output each node as a C assignment using boolean operators in SOP form
[Figure: network levelized into Rank 0 through Rank 3 between the PIs and the POs]
void eval_network(int PI[], int PO[]){
    int I[6];
    I[0] = PI[1] & PI[2];
    I[1] = ( & ) | ( & );
    ...
    PO[0] = I[4];
    PO[1] = I[5];
    return;
}
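As a minimal sketch of what complete generated code could look like, assume a hypothetical three-input, two-output network. The node functions below are invented for illustration and are not taken from the slides or from any benchmark; signal values are assumed to be 0 or 1.

```c
/* Hypothetical network, invented for illustration:
 *   I[0]  = PI[0] AND PI[1]
 *   I[1]  = (I[0] AND NOT PI[2]) OR (NOT I[0] AND PI[2])   (XOR in SOP form)
 *   PO[0] = I[0], PO[1] = I[1]
 * All signal values are assumed to be 0 or 1. */
void eval_network(int PI[], int PO[])
{
    int I[2];

    /* Nodes appear in topological order: every operand is computed
     * before the assignment that reads it. */
    I[0] = PI[0] & PI[1];
    I[1] = (I[0] & !PI[2]) | (!I[0] & PI[2]);

    PO[0] = I[0];
    PO[1] = I[1];
}
```

Because the body is straight-line code with no branches, the compiler is free to reorder and combine the assignments.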
Simple Approach • Pros: • Very simple to program • No control flow • Potential for compiler optimizations • Cons: • Θ(n) in time and space • Not sophisticated
Benchmarks • Input BLIF Files (mcnc91): • 1 to 3500 nodes
9symml, C1355, C17, C1908, C2670, C3540, C432, C499, C5315, C6288, C7552, C880, alu2, alu4, apex6, apex7,
b1, b9, c8, cc, cht, cm138a, cm150a, cm151a, cm152a, cm162a, cm163a, cm42a, cm82a, cm85a, cmb,
comp, cordic, count, cu, dalu, decod, des, example2, f51m, frg1, frg2, i1, i10, i2, i3, i4,
i5, i6, i7, i8, i9, k2, lal, majority, mux, my_adder, pair, parity, pcle, pcler8, pm1, rot,
sct, small, t, t481, tcon, term1, too_large, ttt2, unreg, vda, x1, x2, x3, x4, z4ml
Benchmarks • gcc -O3 -pg • Pentium-class machine • Each network tested with 100,000 random input vectors (deterministic) • Measure average time spent in each eval_network call • Scripts used to run tests and gather statistics
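The measurement loop described above might be sketched as follows. This is an assumption-laden illustration, not the harness used for the slides: it times with the standard `clock()` call instead of the gprof profile that `-pg` produces, the `benchmark` function and its checksum parameter are hypothetical, and `eval_network` is a tiny stand-in for a generated network. A fixed seed is what makes the random vectors deterministic.

```c
#include <stdlib.h>
#include <time.h>

#define NUM_PI 3   /* sizes of the hypothetical stand-in network */
#define NUM_PO 2

/* Stand-in for a generated network (hypothetical). */
static void eval_network(int PI[], int PO[])
{
    PO[0] = PI[0] & PI[1];
    PO[1] = PO[0] ^ PI[2];
}

/* Apply `vectors` pseudo-random input vectors and return the average
 * CPU time per iteration in seconds. The same seed always yields the
 * same vector sequence. The sum of all output bits is stored in
 * *checksum so that two runs can be compared. Note that the measured
 * time also includes generating each vector. */
double benchmark(unsigned seed, long vectors, unsigned long *checksum)
{
    int PI[NUM_PI], PO[NUM_PO];
    unsigned long sum = 0;
    clock_t start, stop;

    srand(seed);
    start = clock();
    for (long v = 0; v < vectors; v++) {
        for (int i = 0; i < NUM_PI; i++)
            PI[i] = rand() & 1;
        eval_network(PI, PO);
        for (int o = 0; o < NUM_PO; o++)
            sum += (unsigned long)PO[o];
    }
    stop = clock();

    *checksum = sum;
    return (double)(stop - start) / CLOCKS_PER_SEC / (double)vectors;
}
```

Calling `benchmark(42u, 100000L, &sum)` twice should give the same checksum both times, which is a cheap way to confirm that the vector stream is repeatable.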
Table-based BDD Approach • Instead of statements of the form: • I[1] = (PI[0] & PI[1]) | I[0]; • Use BDD evaluations: • I[1] = eval_bdd(bdd_1, PI, I); • Assume we have an efficient eval_bdd function • Statements are still in topological order • bdd_1 is a constant hardcoded table • BDD ordering with sift
Table-based Approach • Building a network node into a table • I[1] = (PI[0] · PI[1]) + I[0]
[Figure: BDD for I[1] with decision nodes I[0], PI[0], PI[1] and terminal 1]
const bddn bdd_1 = {
    {INT, I[0], 1, POS, 3, POS},
    {INT, PI[0], 3, NEG, 2, POS},
    {INT, PI[1], 3, NEG, 3, POS},
    {CONSTANT_1}
};
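The slides assume an eval_bdd function without showing it. One possible sketch of a table walker is given below. The `bddn` field layout is an assumption: each row is read as (variable source, variable index, else target, else polarity, then target, then polarity), and the `SRC_*` tags are hypothetical names that replace the slide's shorthand so that the table is valid C.

```c
enum { SRC_PI, SRC_INT, SRC_CONST1 };  /* where a row's variable lives (assumed tags) */
enum { POS = 0, NEG = 1 };             /* edge polarity: NEG complements the result */

typedef struct {
    int src;        /* SRC_PI, SRC_INT, or SRC_CONST1 for the terminal */
    int var;        /* index into PI[] or I[] */
    int else_next;  /* row to visit when the variable is 0 */
    int else_pol;   /* polarity of the else edge */
    int then_next;  /* row to visit when the variable is 1 */
    int then_pol;   /* polarity of the then edge */
} bddn;

/* I[1] = (PI[0] AND PI[1]) OR I[0], laid out like the table above. */
static const bddn bdd_1[] = {
    { SRC_INT,    0, 1, POS, 3, POS },  /* test I[0]  */
    { SRC_PI,     0, 3, NEG, 2, POS },  /* test PI[0] */
    { SRC_PI,     1, 3, NEG, 3, POS },  /* test PI[1] */
    { SRC_CONST1, 0, 0, POS, 0, POS }   /* constant 1 */
};

/* Walk the table from row 0 to the constant-1 terminal, flipping the
 * result each time a complemented (NEG) edge is taken. */
int eval_bdd(const bddn *bdd, const int PI[], const int I[])
{
    int node = 0;
    int result = 1;  /* value of the constant-1 terminal */

    while (bdd[node].src != SRC_CONST1) {
        const bddn *n = &bdd[node];
        int value = (n->src == SRC_PI) ? PI[n->var] : I[n->var];

        if (value) {
            result ^= n->then_pol;
            node = n->then_next;
        } else {
            result ^= n->else_pol;
            node = n->else_next;
        }
    }
    return result;
}
```

Every step of the walk costs a table load plus a data-dependent branch, on top of the call itself.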
Table-based Approach • Pros: • BDD may be more efficient than SOP form • Data hardcoded into program • All we need to write is eval_bdd function • Cons: • Compiler doesn't optimize hardcoded data • eval_bdd function is inefficient • Function call overhead • BDD data table indexing
Branch-based Approach • Get rid of tables and eval_bdd function calls • Replace eval_bdd statements with inline code • Still use topological sort
Branch-based Approach
[Figure: BDD for I[1] with decision nodes I[0], PI[0], PI[1] and terminal 1]
void eval_network(int PI[], int PO[]){
    int I[6];
    int complement;
    ...
NODE_1_START:
    complement = 1;              /* value of the constant-1 terminal */
NODE_1_0:
    if (I[0]) goto NODE_1_3; else goto NODE_1_1;
NODE_1_1:                        /* a complemented else edge flips the result */
    if (PI[0]) goto NODE_1_2; else { complement ^= 1; goto NODE_1_3; }
NODE_1_2:
    if (PI[1]) goto NODE_1_3; else { complement ^= 1; goto NODE_1_3; }
NODE_1_3:
    I[1] = complement;
    ...
Branch-based Approach • Pros: • No table lookups • No function calls • goto compiles straight to a simple jump • Cons: • Performance?
BDD Visualization • Instead of emitting BDDs as tables or branch structures, produce a graph • Uses DOT, a graph drawing tool from AT&T
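A minimal sketch of such an emitter is shown below, assuming the BDD is held in a table of hypothetical `bddn` rows (variable source, variable index, else target, else polarity, then target, then polarity). The drawing conventions are also assumptions: dashed lines mark else edges and an `odot` arrowhead marks complemented edges. The DOT attributes used (`shape`, `label`, `style`, `arrowhead`) are standard Graphviz attributes.

```c
#include <stdio.h>

enum { SRC_PI, SRC_INT, SRC_CONST1 };  /* assumed variable-source tags */
enum { POS = 0, NEG = 1 };             /* edge polarity */

typedef struct {
    int src, var;             /* which array and index the row tests */
    int else_next, else_pol;  /* target row and polarity when the variable is 0 */
    int then_next, then_pol;  /* target row and polarity when the variable is 1 */
} bddn;

/* I[1] = (PI[0] AND PI[1]) OR I[0] */
static const bddn bdd_1[] = {
    { SRC_INT,    0, 1, POS, 3, POS },
    { SRC_PI,     0, 3, NEG, 2, POS },
    { SRC_PI,     1, 3, NEG, 3, POS },
    { SRC_CONST1, 0, 0, POS, 0, POS }
};

/* Write one edge: dashed for an else edge, odot head if complemented. */
static void emit_edge(FILE *out, int from, int to, int is_else, int pol)
{
    fprintf(out, "  n%d -> n%d [style=%s%s];\n", from, to,
            is_else ? "dashed" : "solid",
            pol == NEG ? ", arrowhead=odot" : "");
}

/* Write the whole table as a DOT digraph; returns the number of edges. */
int emit_dot(FILE *out, const char *name, const bddn *bdd, int count)
{
    int edges = 0;

    fprintf(out, "digraph %s {\n", name);
    for (int i = 0; i < count; i++) {
        if (bdd[i].src == SRC_CONST1) {
            fprintf(out, "  n%d [shape=box, label=\"1\"];\n", i);
            continue;
        }
        fprintf(out, "  n%d [label=\"%s[%d]\"];\n", i,
                bdd[i].src == SRC_PI ? "PI" : "I", bdd[i].var);
        emit_edge(out, i, bdd[i].else_next, 1, bdd[i].else_pol);
        emit_edge(out, i, bdd[i].then_next, 0, bdd[i].then_pol);
        edges += 2;
    }
    fprintf(out, "}\n");
    return edges;
}
```

Writing the output of `emit_dot(stdout, "bdd_1", bdd_1, 4)` to a file and running `dot -Tpng` on it should render the graph.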
Clumping Algorithms • Can we improve performance by making BDDs larger? • Clumping: Collapse a node into its fanouts, removing it from the network • Example: F = A X + B with X = C D becomes F = A C D + B once X is collapsed into F
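In terms of generated code, the example collapse could look like the sketch below. The function names are hypothetical, and the exhaustive check only illustrates that collapsing X into F leaves the function of F unchanged.

```c
/* Before clumping: X is its own network node and F reads it. */
int eval_before(int A, int B, int C, int D)
{
    int X = C & D;
    int F = (A & X) | B;
    return F;
}

/* After clumping: X has been collapsed into its fanout F and removed. */
int eval_after(int A, int B, int C, int D)
{
    return (A & C & D) | B;
}

/* Returns 1 if both versions agree on all 16 input combinations. */
int clumping_preserves_function(void)
{
    for (int v = 0; v < 16; v++) {
        int A = v & 1;
        int B = (v >> 1) & 1;
        int C = (v >> 2) & 1;
        int D = (v >> 3) & 1;

        if (eval_before(A, B, C, D) != eval_after(A, B, C, D))
            return 0;
    }
    return 1;
}
```

The clumped version has one node fewer, but that node now depends on four inputs instead of three, which is the trade-off the clumping heuristics try to balance.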
Clumping Algorithms • Two different heuristics • Input clumping • Tries to make all BDDs have about N inputs • Greedy algorithm • Size clumping • Tries to make all BDDs have about N nodes • Greedy algorithm
Clumping Issues • Number of nodes decreases, but BDD size increases • Average number of BDD nodes we evaluate stays the same? • Synthesis and compile time very long and very memory intensive when using clumping • Flat method synthesizes and compiles quickly, and scales to larger networks