Placement Feedback: A Concept and Method for Better Min-Cut Placements. Andrew B. Kahng. Sherief Reda. CSE & ECE Departments University of CA, San Diego La Jolla, CA 92093 abk@cs.ucsd.edu. CSE Department University of CA, San Diego La Jolla, CA 92093 sreda@cs.ucsd.edu.
Placement Feedback: A Concept and Method for Better Min-Cut Placements Andrew B. Kahng Sherief Reda CSE & ECE Departments University of CA, San Diego La Jolla, CA 92093 abk@cs.ucsd.edu CSE Department University of CA, San Diego La Jolla, CA 92093 sreda@cs.ucsd.edu VLSI CAD Laboratory at UCSD http://vlsicad.ucsd.edu
Outline • Min-cut Placement and Terminal Propagation • Ambiguous Terminal Propagation • Placement Feedback • Iterated Controlled Feedback • Accelerated Feedback • Experimental Results • Conclusions
Min-Cut Placement: Objective • Min-cut placement objective: total wirelength minimization • A Steiner tree represents the minimum wirelength needed to connect a number of cells • Total wirelength is the sum of the lengths of the Steiner trees • Routed wirelength is typically larger than total wirelength due to detours arising from contention for routing resources • Half-Perimeter Wirelength (HPWL) correlates well with the routed wirelength, is a lower bound on the net length, and is fast to calculate
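As a minimal sketch of why HPWL is fast to calculate: each net contributes the half-perimeter of the bounding box around its cells. The function names and the dictionary-based data layout below are illustrative assumptions, not part of Capo.

```python
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float]


def net_hpwl(pins: Iterable[Point]) -> float:
    """Half-perimeter of the bounding box that encloses one net's pins."""
    xs, ys = zip(*pins)
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_hpwl(nets: List[List[str]], locations: Dict[str, Point]) -> float:
    """Sum of per-net HPWL; each net is a list of cell names (hypothetical layout)."""
    return sum(net_hpwl(locations[cell] for cell in net) for net in nets)


# Toy example: three cells, two nets.
locations = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (1.0, 1.0)}
nets = [["A", "B", "C"], ["A", "C"]]
print(total_hpwl(nets, locations))
```

Each net costs one pass over its pins, which is why HPWL is commonly used as the estimate inside the placement loop.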
Min-Cut Placement: Method • Min-cut placement method: sequential min-cut partitioning of the netlist (hypergraph) • Key issues: • How to partition a hypergraph? • Multilevel hypergraph partitioning using the Fiduccia/Mattheyses heuristic • How to propagate net connectivity information from one block to another? [Figure: the input block is bisected at Level 1 and the resulting blocks are bisected again at Level 2]
Terminal Propagation • Well-studied problem: • Terminal propagation (Dunlop/Kernighan85) • Global objectives/cycling (HuangK97, Zheng/Dutt00, Yildiz/Madden01) • Case I: Blocks are partitioned in isolation → optimal local partitioning results but far from optimal global results • Case II: Information about cells in one block is accounted for in the other block → local partitioning results are translated to global wirelength results [Figure: a simple hypergraph on cells A, B, C, D after the first placement level, partitioned under Case I and Case II]
Terminal Propagation Mechanism • B1 has been partitioned; B2 is to be partitioned • u is propagated as a fixed vertex uf to the subblock that is closer • uf biases the partitioner to move v upward [Figure: blocks B1 and B2 with cells u and v and the fixed vertex uf]
Ambiguous Terminal Propagation • Ambiguous propagation occurs when a terminal, e.g. Y4, is equally close to the two subblocks of a block under partitioning • Traditional solution: either propagate to both subblocks or do not propagate at all [Figure: block X under partitioning with fixed vertices f1, f2, f3 and terminals Y1 to Y4; Y4 lies in the region of partition fuzziness]
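The propagation rule and its ambiguous case can be sketched as follows. This is only an illustration: measuring closeness as Manhattan distance to the subblock centers, the tolerance `tol`, and all names are assumptions; the slides say only that a terminal goes to the closer subblock.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]


def manhattan(a: Point, b: Point) -> float:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def propagate_terminal(
    terminal: Point, center_a: Point, center_b: Point, tol: float = 1e-9
) -> Optional[str]:
    """Return 'a' or 'b' for the closer subblock, or None if the terminal is ambiguous.

    Assumption: closeness is the Manhattan distance to each subblock center.
    """
    da = manhattan(terminal, center_a)
    db = manhattan(terminal, center_b)
    if abs(da - db) <= tol:
        return None  # equally close to both subblocks: ambiguous propagation
    return "a" if da < db else "b"


# Subblock centers of a block cut horizontally (top half 'a', bottom half 'b').
top, bottom = (2.0, 4.0), (2.0, 0.0)
print(propagate_terminal((0.0, 5.0), top, bottom))  # closer to the top subblock
print(propagate_terminal((0.0, 2.0), top, bottom))  # on the cut line: ambiguous
```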
Effect of Ambiguous Terminal Propagations • Given an edge e with a set of cells I, each cell is closer to L than R, closer to R than L, or equally proximate to both L and R (ambiguous) • Terminal propagation decisions (without ambiguous cells): 1. Only cells closer to L → L 2. Only cells closer to R → R 3. Cells closer to L and cells closer to R → neither • Terminal propagation decisions (with ambiguous cells): 1. Cells closer to L and ambiguous cells → L or neither 2. Cells closer to R and ambiguous cells → R or neither 3. Cells closer to L, cells closer to R, and ambiguous cells → neither 4. Only ambiguous cells → neither or L or R • Conclusion: Ambiguous propagations lead to indeterminism in propagation decisions → wirelength increase [Figure: a block split into subblocks L and R]
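The decision cases for a single edge can be enumerated in a few lines. The sketch below is an illustration under assumed names: the labels "L", "R", and "A" stand for cells closer to L, closer to R, and equally proximate to both, and the empty-set return for an edge without outside cells is an assumption.

```python
from typing import Set


def propagation_options(cell_kinds: Set[str]) -> Set[str]:
    """Possible propagation decisions for one edge.

    cell_kinds holds the kinds of cells present on the edge:
    'L' (closer to L), 'R' (closer to R), 'A' (equally proximate, ambiguous).
    """
    has_l, has_r, has_a = "L" in cell_kinds, "R" in cell_kinds, "A" in cell_kinds
    if has_l and has_r:
        return {"neither"}  # pulled both ways, with or without ambiguous cells
    if has_l:
        return {"L", "neither"} if has_a else {"L"}
    if has_r:
        return {"R", "neither"} if has_a else {"R"}
    if has_a:
        return {"L", "R", "neither"}  # only ambiguous cells: any decision is possible
    return set()  # assumption: no outside cells means nothing to propagate


for kinds in ({"L"}, {"L", "A"}, {"L", "R", "A"}, {"A"}):
    print(sorted(kinds), "->", sorted(propagation_options(kinds)))
```

Any case that returns more than one option is a source of the indeterminism noted in the conclusion.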
Min-Cut Placement Flow • The input to the flow is the I/O pad locations and the circuit netlist, where all cells are collapsed at the center of the chip • The output of the flow is a global placement, where groups of cells are assigned portions of the chip's rows • A detailed placer determines the exact locations of all cells [Flow diagram: Terminal Propagation → Level 1 Partitioning → Terminal Propagation → Level 2 Partitioning → ... → Terminal Propagation → Level m Partitioning]
Mitigating Ambiguous Terminal Propagation • Two hyperedges: {A, B, C}, {X, A, B}. B1 is partitioned before B2 • C is ambiguously propagated → further partitioning → Cuts = 3, Wirelength = 6 • Undo and repartition: C is propagated to the top → further partitioning → Cuts = 2, Wirelength = 5 [Figure: blocks B1 and B2 with cells A, B, C, X, shown before and after the undo/repartition step]
Placement Feedback • For each placement level: - Undo all partitioning/block bisecting results, but retain the new cell locations for terminal propagation - Use the new cell locations to redo the level's placement [Figure: traditional placement flow (Terminal Propagation → Level 1 Partitioning → Terminal Propagation → Level 2 Partitioning → ... → Terminal Propagation → Level m Partitioning) next to the placement flow with feedback]
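The per-level feedback loop can be sketched as below. This is a hedged outline, not Capo's implementation: `partition_level` is a hypothetical callback that bisects every block of a level using the given cell locations for terminal propagation and returns the new sub-blocks together with updated cell locations.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]
Locations = Dict[str, Point]
# Hypothetical signature: (blocks, locations) -> (sub_blocks, new_locations)
PartitionLevel = Callable[[List[str], Locations], Tuple[List[str], Locations]]


def place_level_with_feedback(
    blocks: List[str],
    locations: Locations,
    partition_level: PartitionLevel,
    iterations: int = 1,
) -> Tuple[List[str], Locations]:
    """Partition one level, then redo it `iterations` times with feedback."""
    sub_blocks, new_locations = partition_level(blocks, locations)
    for _ in range(iterations):
        # Undo the bisection (start again from the same `blocks`), but keep
        # the new cell locations so terminal propagation sees them.
        sub_blocks, new_locations = partition_level(blocks, new_locations)
    return sub_blocks, new_locations


# Stand-in partitioner that only records what it was given.
seen: List[Locations] = []

def fake_partition(blocks: List[str], locs: Locations) -> Tuple[List[str], Locations]:
    seen.append(dict(locs))
    moved = {cell: (x + 1.0, y) for cell, (x, y) in locs.items()}
    return [b + suffix for b in blocks for suffix in ("0", "1")], moved


result = place_level_with_feedback(["B"], {"a": (0.0, 0.0)}, fake_partition, iterations=2)
print(result)
```

The stand-in partitioner shows the key property: every redo starts from the same blocks but receives the locations produced by the previous pass.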
Placement Feedback Assessment • Metrics: • Reduction in ambiguous terminal propagations • Associated reduction in HPWL • Experimental Setup • We implement feedback in Capo (version 8.7) • For each placement level: • Measure the number of ambiguous terminal propagations before and after feedback • Measure the HPWL estimate before and after feedback (assuming all previous placement levels had feedback)
Feedback Effects • Reductions in ambiguous terminals and HPWL per level are strongly correlated [Plots: percentage reduction in ambiguous propagations versus placement level; percentage reduction in HPWL versus placement level]
Iterative Placement Feedback • Since the feedback loop produces new outputs → iterate over the feedback loop a number of times • If the feedback response is not desirable → insert a feedback controller to enhance the response • The feedback controller should: • Evaluate and optimize some placement quality or objective • Decide when to terminate the feedback iterations [Figure: placement flow with feedback controllers]
Feedback Controller Objectives • Two possible objectives (placement qualities) to optimize: • Cut partitioning objective: QP = c1 + c2 • HPWL objective: QH = c1 × d1 + c2 × d2 • QP and QH are not correlated! • Example: Assume d1 = 6 and d2 = 8 • c1 = c2 = 100 → QP = 200 and QH = 1400 • c1 = 85, c2 = 112 → QP = 197 and QH = 1406 [Figure: blocks B1 and B2 labeled with c1, d1 and c2, d2]
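The example can be checked directly. The short sketch below evaluates both objectives for the two cases on the slide; the function names are illustrative.

```python
from typing import Sequence


def cut_objective(cuts: Sequence[int]) -> int:
    """QP = c1 + c2 + ..."""
    return sum(cuts)


def hpwl_objective(cuts: Sequence[int], dims: Sequence[int]) -> int:
    """QH = c1 * d1 + c2 * d2 + ..."""
    return sum(c * d for c, d in zip(cuts, dims))


dims = (6, 8)  # d1, d2 from the slide's example
for cuts in ((100, 100), (85, 112)):
    print(cuts, "QP =", cut_objective(cuts), "QH =", hpwl_objective(cuts, dims))
```

The second case lowers QP from 200 to 197 but raises QH from 1400 to 1406, so a controller that optimizes one objective can make the other worse.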
Feedback Controller Stopping Criteria • A. Monotonic Improvement Criterion: Iterate per placement level until there is no further improvement in QP (or QH) [Plot: QP or QH versus iteration, 0 to 5]
Feedback Controller Stopping Criteria • B. Best Improvement Criterion: Iterate per placement level a fixed number of times, but pass on the best result seen according to QP (or QH) [Plot: QP or QH versus iteration, 0 to 5]
Feedback Controller Stopping Criteria • C. Unconstrained Criterion: Iterate per placement level a fixed number of times and pass on the last result [Plot: QP or QH versus iteration, 0 to 5]
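The three stopping criteria differ only in which iteration's result is passed on. The sketch below illustrates this on a precomputed list of quality values (QP or QH, lower is better); in a real flow the monotonic controller would stop iterating instead of inspecting later values. The function name and the criterion strings are assumptions made for illustration.

```python
from typing import Sequence


def passed_iteration(quality: Sequence[float], criterion: str) -> int:
    """Index of the iteration whose result is passed on (lower quality is better).

    quality[0] is the result without feedback; quality[i] is the result
    after i feedback iterations.
    """
    if not quality:
        raise ValueError("need at least one quality value")
    if criterion == "monotonic":
        i = 0
        # Keep iterating only while each iteration strictly improves.
        while i + 1 < len(quality) and quality[i + 1] < quality[i]:
            i += 1
        return i
    if criterion == "best":
        # Run all iterations, pass on the best result seen.
        return min(range(len(quality)), key=quality.__getitem__)
    if criterion == "unconstrained":
        # Run all iterations, pass on the last result.
        return len(quality) - 1
    raise ValueError(f"unknown criterion: {criterion}")


trace = [10.0, 8.0, 9.0, 7.0, 7.5]  # made-up quality values per iteration
for name in ("monotonic", "best", "unconstrained"):
    print(name, "->", passed_iteration(trace, name))
```

On this made-up trace the monotonic criterion stops at iteration 1 and misses the later, better result at iteration 3, which is the kind of behavior a fixed iteration budget avoids.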
Controller Type Comparison • Combinations of the 3 stopping criteria and 2 objectives yield 5 controllers (the unconstrained criterion does not use an objective) • We study the aggregate impact of the different controllers on the final HPWL
Effect of Controller on Final Wirelength • QP (based on partitioning) controllers dominate QH (based on HPWL) controllers • Best Improvement controllers outperform Monotonic Improvement controllers • The Best Improvement QP controller slightly outperforms the Unconstrained controller [Plot: final HPWL versus number of iterations for the Monotonic QH, Best QH, Monotonic QP, Best QP, and Unconstrained controllers]
Asymptotic Controller Behavior • Results are the average of 6 seeds for up to 12 iterations using the Best Improvement QP controller • The final value oscillates slightly around a fixed value, with an 8-9% improvement in HPWL compared to the traditional placement flow [Plot: final HPWL versus number of iterations for the Best QP controller]
Accelerated Feedback • Feedback runtime ∝ number of feedback iterations • Typically, placers call the multilevel partitioner a number of times and use the best cluster-tree partitioning results • In iterated feedback, only the last feedback iteration determines the partitioning results; the other iterations serve to make terminal propagation accurate • To speed up our feedback implementation: → Call the multilevel partitioner once (1 V-cycle) for each feedback iteration → Restore the default placer settings (2 V-cycles) for the last feedback iteration [Figure: V-cycle of coarsening followed by uncoarsening]
Experimental Setup • We test our methodology in Capo version 8.7 • Cadence's WarpRoute is used for routed wirelength evaluation • Placement results are the average of 6 seeds • The implementation took 130 lines of C++ code • All experiments were conducted on a 2.4 GHz Xeon Linux workstation with 2 GB RAM • We evaluate feedback on the IBM version 1, IBM version 2, and PEKO benchmarks
HPWL Results (IBM Version 1) • We use 3 feedback iterations with the Best Improvement QP feedback controller [Chart: percentage improvement in HPWL (Half-Perimeter Wirelength) in comparison to Capo, for accelerated feedback (AFB) and feedback (FB)]
HPWL Results (IBM Version 1) • Accelerated Feedback: Max improvement 13.43% and average improvement 4.70% with 2.43x the original Capo runtime • Feedback: Max improvement 13.73% and average improvement 5.43% with 4.10x the original Capo runtime • PEKO benchmarks: Max improvement 10% and average improvement 5% for feedback at the expense of a 2-3x increase in Capo runtime
Routed Wirelength Results (IBM Version 2 - Hard) [Chart: percentage improvement in routed wirelength in comparison to Capo; number of routing violations]
Routed Wirelength Results (IBM Version 2 - Easy) [Chart: percentage improvement in routed wirelength in comparison to Capo; number of routing violations]
Conclusions • New understanding of how ambiguous terminal propagation leads to indeterminism in propagation results and degraded placer performance • Idea: reduce indeterminism by undoing placement results, but still using them to guide future partitioning. • Flavors of this approach proposed before, but for different contexts • Our approach is captured as feedback, which we tune using controllers • Detailed study of variant objectives that can be optimized by the controllers, as well as iterating criteria • Accelerated feedback: efficient implementations to reduce runtime impact • IBMv1 HPWL results: up to 14% (best) and 6% (avg) improvement over Capo • IBMv2 routed WL results: up to 10% improvement over Capo, with improved routability and reduced via count • Accelerated feedback is now the default mode in Capo
Acknowledgments We thank Igor Markov (University of Michigan) for helpful discussions.
Block Ordering • Regular ordering • Random ordering • Alternate ordering • Results are inconclusive! [Figure: blocks numbered 1 to 4]