Task 1091.001: Highly Scalable Placement by Multilevel Optimization
• Task Leaders: Jason Cong (UCLA CS) and Tony Chan (UCLA Math)
• Students with Graduation Dates:
  • Michalis Romesis (UCLA CS, March 2005, graduated)
  • Kenton Sze (UCLA Math, July 2006, graduated)
  • Min Xie (UCLA CS, September 2006, graduated)
  • Guojie Luo (UCLA CS, September 2010)
• Research Staff: Joe Shinnerl, UCLA CS
Industrial Liaisons
• Patrick McGuinness, Freescale Semiconductor, Inc.
• Natesan Venkateswaran, IBM Corporation
• Amit Chowdhary, Intel Corporation
UCLA VLSICAD LAB
Task Description and Anticipated Result
• Highly scalable multilevel, multiheuristic placement algorithms that address the critical placement needs of nanometer designs:
  • scalability
  • multi-constraint optimization: timing, routability, power, manufacturability, etc.
  • support of mixed-size placement and incremental design
• Quantitative study of the optimality and scalability of placement algorithms
• Construction of synthetic benchmarks with known optima to identify the deficiencies of existing methods
• Our goal is to achieve one-process-generation benefit through innovation of physical-design technologies, especially placement.
Task Deliverables
• Report on new placement benchmarks with known optimal or near-optimal solutions for all major objectives and constraints. Scalability and optimization studies on existing placement techniques (Completed 3-Nov-2003)
• Experiments and reports on the applicability of integrated AMG-based weighted aggregation and weighted interpolation. Improvement measured on both PEKO examples and industrial examples from SRC member companies (Completed 1-Jun-2004)
• Experiments and reports on multiheuristic, multilevel relaxation and the scalable incorporation of complex constraints into the enhanced multilevel framework. Improvement measured on both PEKO and industrial examples (Completed 1-Jun-2005)
• A highly scalable placement tool that (i) supports multi-constraint optimization, mixed-size placement, and incremental design and (ii) produces best-of-class results for both PEKO and industrial examples from SRC member companies (Completed 1-Jun-2006)
• Final report summarizing research accomplishments and future direction (Planned 31-Oct-2006)
Accomplishments in the Past Year
• Improvements in mPL for routing density control [Best quality, ISPD 2006 contest]
• Thermal-Driven Placement
• Heterogeneous Placement
A Brief History of mPL
[Chart: relative wirelength by year, 2000 to 2006, with regions labeled "uniform cell size" and "non-uniform cell size"]
• mPL 1.0 [ICCAD00]: ESC clustering; Goto relaxation
• mPL 1.1: FC clustering; partitioning added to legalization
• mPL 2.0: RDFL relaxation; primal-dual netlist pruning
• mPL 3.0 [ICCAD03]: QRS relaxation; AMG interpolation; multiple V cycles
• mPL 4.0: improved DP; backtracking V cycle
• mPL 5.0: multilevel force directed; mixed-size capability
• mPL 6.0: enhanced routability handling
mPL: Generalized Force-Directed Placement
• Use of accurate objective functions [Bertsekas, 82; Naylor et al., 01]
• Optimization-based bin-density constraint formulation
• Iterative Uzawa solver
• Multilevel for better runtime and wirelength
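One way to picture an "accurate objective function" for wirelength is a smooth stand-in for half-perimeter wirelength (HPWL). The sketch below assumes the log-sum-exp approximation commonly associated with the Naylor et al. citation; the slide does not spell out the formula, so treat the function names and the smoothing parameter `gamma` as illustrative assumptions rather than mPL's actual code.

```python
import math

def span(coords):
    """Exact extent of one net along one axis: max - min."""
    return max(coords) - min(coords)

def lse_span(coords, gamma):
    """Log-sum-exp approximation of max(coords) - min(coords).

    gamma > 0 is an assumed smoothing parameter. The result is never
    smaller than the exact span and approaches it as gamma shrinks.
    """
    hi, lo = max(coords), min(coords)
    # Shift by the max (or min) before exponentiating so exp() cannot overflow.
    smooth_max = hi + gamma * math.log(sum(math.exp((c - hi) / gamma) for c in coords))
    smooth_min = lo - gamma * math.log(sum(math.exp((lo - c) / gamma) for c in coords))
    return smooth_max - smooth_min

def hpwl(nets, x, y):
    """Exact half-perimeter wirelength; each net is a list of cell indices."""
    return sum(span([x[c] for c in net]) + span([y[c] for c in net]) for net in nets)

def lse_wirelength(nets, x, y, gamma):
    """Differentiable wirelength objective built from lse_span."""
    return sum(lse_span([x[c] for c in net], gamma) + lse_span([y[c] for c in net], gamma)
               for net in nets)
```

Because the smooth version is differentiable in every cell coordinate, it can be handed to a gradient-based solver, which the exact max/min form cannot. For a net with n pins the overestimate per axis is at most 2·gamma·ln(n).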
Core Engine for Density Control
[Figure: one V cycle from the initial finest problem, through repeated coarsening to the coarsest problem, then repeated interpolation to the final placement; stages labeled "GFD with density control" and "minimum perturbation"]
• Overall scheme
  • One V cycle with comparable quality
  • Minimum perturbation in the last stages of GFD
  • Significant speedup without losing solution quality
• Routing density handling
  • Residual density in each bin
  • Even distribution of dummy density into bins
  • Cell area inflation for better convergence
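The bin-based bookkeeping behind density control can be sketched as follows. This is a generic bin-density and overflow computation under assumed conventions (axis-aligned cells given as lower-left `x, y` plus `w, h`, a uniform bin grid); it is not the residual-density or dummy-density scheme from the slide, whose details are not given here, and all names are hypothetical.

```python
def bin_densities(cells, chip_w, chip_h, nx, ny):
    """Return an ny-by-nx grid of area utilisation (cell area / bin area).

    cells: iterable of (x, y, w, h) rectangles, (x, y) being the lower-left corner.
    """
    bw, bh = chip_w / nx, chip_h / ny
    grid = [[0.0] * nx for _ in range(ny)]
    for x, y, w, h in cells:
        # Range of bins the rectangle can touch, clamped to the chip.
        i0, i1 = max(0, int(x // bw)), min(nx - 1, int((x + w) // bw))
        j0, j1 = max(0, int(y // bh)), min(ny - 1, int((y + h) // bh))
        for j in range(j0, j1 + 1):
            for i in range(i0, i1 + 1):
                # Overlap of the cell with bin (i, j) along each axis.
                ox = min(x + w, (i + 1) * bw) - max(x, i * bw)
                oy = min(y + h, (j + 1) * bh) - max(y, j * bh)
                if ox > 0 and oy > 0:
                    grid[j][i] += ox * oy / (bw * bh)
    return grid

def total_overflow(grid, target):
    """Sum of utilisation above the target density over all bins."""
    return sum(max(0.0, d - target) for row in grid for d in row)
```

A placer would drive `total_overflow` toward zero while keeping wirelength low; inflating a cell's `w` and `h` before calling `bin_densities` is one simple way to reserve extra space around it.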
Macro Spreading
• Need area density below target value [Nam, ISPD06]
• Target distance between neighboring macros
• : target density
• Spreading represented as objective
• dxi and dyi: perturbation
• fxij and fyij: piecewise-linear functions
[Figure labels: H, A1, A2, w, w1, w2, W, fij, x, Hij]
Experiment Results on ISPD06
mPL6 produces the best solution quality under the ISPD06 routability-driven metric
Demonstration of mPL6
http://cadlab.cs.ucla.edu/cpmo/videos/mPL6-density.wmv
Accomplishments in the Past Year
• Improvements in mPL core engine for mixed-size global placement
• Thermal-Driven Placement
• Heterogeneous Placement
Motivation
• High power density due to technology scaling
• Problems caused by high temperature
  • Hot spots become more harmful
  • Higher temperature → higher leakage power → more heat
  • Previously negligible effects become first-order effects
  • Difficult estimation for power, timing, etc.
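Thermal Model
[Figure: mesh node $T_i$ with power input $P$, connected to four lateral neighbors $T_{j,1}$ to $T_{j,4}$ through $C_{xy}$ and to the heat sink $T_{sink}$ through $C_z$]
• One-layer mesh to model the substrate
  • $\sum_j (T_i - T_j)\,C_{xy} + (T_i - T_{sink})\,C_z = P_i$
  • $C_{xy}$ and $C_z$ are the thermal conductances of the substrate and the heat sink
• Solved by fast DCT
  • Solve $T$ from $CT = P$, given $C$ and $P$
  • Diagonalize $C = \Gamma^{\top} \Lambda \Gamma$
  • $\Gamma$ is the discrete cosine matrix
  • $\Lambda$ is a diagonal matrix
  • $T = \Gamma^{-1} \Lambda^{-1} \Gamma P$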
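The diagonalization step can be sketched in a few lines of plain Python. The sketch rests on assumptions not stated on the slide: the chip boundary is treated as insulated (edge nodes simply have fewer lateral neighbors), the conductances are uniform, and the unknown is the temperature rise above the sink. For readability it builds dense orthonormal DCT-II matrices instead of using a fast transform, so it shows the algebra rather than the speed. All function names are hypothetical.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k is the k-th cosine mode, inverse = transpose."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / n)
             * math.cos(math.pi * k * (i + 0.5) / n)
             for i in range(n)] for k in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def solve_temperature(power, c_xy, c_z, t_sink):
    """Solve sum_j (T_i - T_j) c_xy + (T_i - t_sink) c_z = P_i on a grid.

    power is an ny-by-nx list of lists; c_z must be positive so that the
    constant mode (eigenvalue 0 of the lateral part) stays invertible.
    """
    ny, nx = len(power), len(power[0])
    gy, gx = dct_matrix(ny), dct_matrix(nx)
    # Forward transform of the power map: P_hat = Gy P Gx^T.
    p_hat = matmul(matmul(gy, power), transpose(gx))
    # Eigenvalues of the 1-D lateral operator with insulated ends.
    lam_y = [2.0 - 2.0 * math.cos(math.pi * k / ny) for k in range(ny)]
    lam_x = [2.0 - 2.0 * math.cos(math.pi * k / nx) for k in range(nx)]
    # Divide each mode by its eigenvalue (the Lambda^{-1} step).
    t_hat = [[p_hat[j][i] / (c_xy * (lam_y[j] + lam_x[i]) + c_z)
              for i in range(nx)] for j in range(ny)]
    # Inverse transform: rise = Gy^T T_hat Gx, then add the sink temperature.
    rise = matmul(matmul(transpose(gy), t_hat), gx)
    return [[t_sink + v for v in row] for row in rise]
```

With a uniform power map the result is uniform, `t_sink + P / c_z`, since no heat flows laterally; a single hot cell produces a peak that decays with distance at a rate set by the ratio of `c_z` to `c_xy`.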
Formulation & Solution
• Implement i(x) and ti(x) with filler cells and “filler power” without area
• Tdes is given by the user
• Solved by the Uzawa algorithm
  • As an additional thermal-aware GFD pass following a WL-driven V-cycle
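The Uzawa iteration alternates a primal minimization of the Lagrangian with a projected ascent step on the multiplier. The sketch below applies it to a deliberately tiny stand-in problem, not the thermal formulation from the slide: minimize ½‖x − a‖² subject to Σ xᵢ ≤ b. The problem, the step size, and the function name are all assumptions chosen so the inner minimization has a closed form.

```python
def uzawa(a, b, step=0.2, iters=200):
    """Uzawa iteration for: minimise 0.5 * sum((x_i - a_i)^2) s.t. sum(x) <= b.

    step should stay below 2 / len(a) for this toy problem to converge.
    Returns the final point x and the multiplier lam.
    """
    lam = 0.0
    for _ in range(iters):
        # Primal step: for fixed lam the Lagrangian is minimised by x = a - lam.
        x = [ai - lam for ai in a]
        # Dual step: gradient ascent on lam using the constraint violation,
        # projected back onto lam >= 0.
        lam = max(0.0, lam + step * (sum(x) - b))
    # Recompute x so it is consistent with the final multiplier.
    x = [ai - lam for ai in a]
    return x, lam
```

When the constraint is inactive the multiplier stays at zero and `x` equals `a`; when it is active the multiplier settles at the value that brings the constraint to equality. In a placement setting the primal step would be a force-directed solve rather than a closed-form update.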
Experiment Results on IBM-FastPlace
• Quality improvement
• Teven is the ideal temperature with the same total power
• Max. on-chip temperature:
  • Tinit after Step 1
  • Tfinal = Tdes after Step
• More than 90% quality improvement within 5% WL increase
Accomplishments in the Past Year
• Improvements in mPL for routing density control [1st quality, ISPD 2006 contest]
• Thermal-Driven Placement
• Heterogeneous Placement
Motivation
• Need for placement on array-type chips with pre-fabricated resources
  • FPGA
  • Structured ASIC
• Need for heterogeneous capability
  • Memory, DSP, etc.
  • Block on sites of the same type
Related Work
• Academia
  • VPR [Betz & Rose 97], PATH [Kong 02], SPCD [Chen & Cong 04, 05], PPFF [Maidee et al., 03], CAPRI [Gopalakrishnan et al., 06]
  • Most comparisons to outdated tools
  • No heterogeneous capability
• Industry
  • Quartus II [Altera Corp.], ISE [Xilinx Inc.]
  • Proprietary chips only
  • Techniques not publicly documented
Heterogeneous Placement by mPL-H
• First analytical placer for heterogeneous placement
• Framework based on mPL6 [Chan et al., 05]
• Multiple layered placement
  • One logical layer for each resource
  • Forbidden regions blocked by obstacles
  • Uniform wirelength computation
  • Filler cells on each layer
[Figure: layers labeled DSP, M-RAM, LAB]
Demonstration of mPL-H
http://cadlab.cs.ucla.edu/cpmo/videos/mPL-H.wmv
Experiment Setting
[Flow diagram labels: Verilog netlist; Quartus_map; clustered .vqm netlist; Stratix description .xml; Quartus_fitter; mPL-H; chip type; .qsf placement; Quartus_router]
Wirelength Comparison
• WL still important for architecture evaluation
• mPL-H is 3% better in HPWL and 2% better in routed WL than Quartus II v5.0
Runtime Comparison
• mPL-H can be 2X faster than Quartus II v5.0 when the circuit becomes sufficiently large
Overall Accomplishments Over the Funding Period
• 34% reduction in WL over 3 years
• One technology generation advancement
Technology Transfer in 2006
• Discussions at conferences and workshops
  • ASPDAC 2006, Yokohama, Japan
  • ISPD 2006, San Jose, USA
  • DAC 2006, San Francisco, USA
• Benchmark releases (PEKO-MS): http://cadlab.cs.ucla.edu/~pubbench
• mPL release: http://cadlab.cs.ucla.edu/src_686_mpl/
Software Download Record
• PEKO/PEKU [2002 – now]
  • More than 360 downloads
  • SRC member companies: Cadence, IBM, Intel, Mentor Graphics, etc.
  • Non-SRC member companies: Synopsys, Magma, Monterey Design, etc.
  • Universities: CMU, Michigan, MIT, UC Berkeley, UCSD, etc.
• mPL [2001 – now]
  • More than 480 downloads
  • SRC member companies: Cadence, Intel, Mentor Graphics, etc.
  • Non-SRC member companies: Synopsys, Magma, Intrinsity, Oasys, etc.
  • Universities: CMU, Michigan, Stanford, UCSD, Nat’l Taiwan U., etc.
Publications in 2006
• Conference papers
  • ASPDAC 2006: J. Cong, M. Xie, “A Robust Detailed Placement for Mixed-size IC Designs.”
  • ISPD 2006: T. F. Chan, J. Cong, J. Shinnerl, K. Sze, and M. Xie, “mPL6: Enhanced Multilevel Mixed-size Placement.”
• Theses
  • Kenton Sze, “Multilevel Optimization for VLSI Circuit Placement.”
  • Min Xie, “Constraint-Driven Large Scale Circuit Placement Algorithms.”
Room for Further Improvement?
[Figure panels: mPL4, mPL5]
• “Swirls” are difficult to correct with localized refinement