Thermal Via Allocation for 3D ICs Considering Temporally and Spatially Variant Thermal Power • Tanay Karnik, Circuit Research Lab, Intel, USA • Hao Yu, Yiyu Shi and Lei He, Electrical Engineering Dept., UCLA, USA • Partially supported by NSF and a UC-MICRO fund from Intel
New Solution for High-performance Integration • 2D SoC design is limited in integration density and interconnect performance • Potential solution: 3D integration [Banerjee-Saraswat:IEEE’01] • Fabrication technologies: chip-level wafer bonding or die-level silicon epitaxial growth • Inter-layer vias play a crucial role in signaling, power delivery and heat removal • [Figure: 3D IC stack with labels: heat sources, active layer, vias, inter-layer, heat sink]
Thermal Challenges in 3D ICs • Temperature increases along the third dimension • Inter-layer dielectric layers are poor thermal conductors • High temperature degrades interconnect and device reliability and causes timing variation • Thermal analysis and thermal-aware design are therefore needed for 3D ICs • [Figure: temperature scale marked 150, 135, 100, 70 and 40 °C]
Via Planning Problem • Motivation • Inter-layer vias are good thermal conductors that remove heat • Inter-layer vias take additional chip area and routing resources • Previous work • Iterative via planning during placement [Goplen-Sapatnekar:ISPD’05] • Multilevel alternating-direction via planning during routing [Zhang-Cong:ICCAD’05] • Both use steady-state analysis and assume maximum thermal power, which may lead to over-design • Primary contributions of our work • Minimize a thermal-violation integral that accounts for transient temperature • Develop an efficient sensitivity-driven sequential programming method that uses a macromodel
Outline • Background and Problem Formulation • Structured and Parameterized Macromodel • Sequential Optimization • Experimental Results • Conclusions
Thermal Model Overview • Electric and thermal duality • Electric and thermal systems can both be described by an MNA (modified nodal analysis) equation • Via conductance g_i and capacitance c_i are both proportional to the via size A_i, or equivalently the via density A_i/a (a is the unit via area) • They can be added parametrically into the MNA equation
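One way to write the parameterized MNA description is sketched here. The notation is assumed for illustration and does not come from the slides: $T$ is the vector of tile temperatures, $u$ the heat-source inputs, $y$ the observed temperatures, $G_0$ and $C_0$ the via-free conductance and capacitance matrices, and $G_i$, $C_i$ the contribution of a unit density of via $i$.

```latex
\[
\Bigl(G_0 + \sum_{i} A_i G_i\Bigr)\,T(t)
+ \Bigl(C_0 + \sum_{i} A_i C_i\Bigr)\,\frac{dT(t)}{dt}
= B\,u(t),
\qquad
y(t) = L^{T}\,T(t)
\]
```

Because $g_i$ and $c_i$ scale linearly with $A_i$, the via densities enter the system matrices linearly, which is what makes a Taylor expansion in $A_i$ tractable.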
Steady State Model and Analysis • Active-device and inter-layer dielectric layers are discretized into tiles • Tiles are connected by thermal resistances (R) • Heat sources are modeled as time-invariant current sources • Steady-state temperature can be obtained by directly solving a time-invariant linear equation
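A minimal sketch of the steady-state solve, assuming a one-dimensional stack of three tiles above a heat sink. The resistance and power values and the helper names `solve` and `chain_conductance` are hypothetical; a real model would use a 3D mesh and a sparse solver.

```python
def solve(G, p):
    """Solve G t = p by Gaussian elimination with partial pivoting."""
    n = len(p)
    M = [row[:] + [p[i]] for i, row in enumerate(G)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    t = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * t[c] for c in range(r + 1, n))
        t[r] = (M[r][n] - s) / M[r][r]
    return t


def chain_conductance(resistances):
    """Conductance matrix of tiles in series; resistances[0] ties tile 0 to the heat sink."""
    n = len(resistances)
    G = [[0.0] * n for _ in range(n)]
    G[0][0] += 1.0 / resistances[0]
    for k in range(1, n):
        g = 1.0 / resistances[k]
        G[k][k] += g
        G[k - 1][k - 1] += g
        G[k][k - 1] -= g
        G[k - 1][k] -= g
    return G


# Hypothetical values: resistances in K/W, constant heat sources in W.
R = [1.0, 2.0, 2.0]
P = [0.0, 1.0, 1.0]
T = solve(chain_conductance(R), P)  # temperature rise of each tile above the heat sink
```

In this toy stack the temperature rise grows with distance from the heat sink, which mirrors the increase along the third dimension noted earlier.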
Transient Model and Analysis • Tiles are connected by thermal resistances and thermal capacitances (RC) • Heat sources are modeled as time-variant current sources • Transient temperature can be obtained by solving a time-variant linear equation
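A minimal sketch of a transient solve, assuming a single tile with one thermal conductance and one thermal capacitance, integrated with the backward Euler method. The function name `backward_euler` and all numeric values are hypothetical.

```python
def backward_euler(g, c, power, h, t0=0.0):
    """Integrate c*dT/dt + g*T = p(t) with fixed step h.

    power[n] is the heat source value at step n + 1; the result includes the
    initial temperature, so it has len(power) + 1 samples.
    """
    temps = [t0]
    lhs = g + c / h  # constant left-hand side, computed once
    for p in power:
        temps.append((c / h * temps[-1] + p) / lhs)
    return temps


# Hypothetical tile: g = 0.5 W/K, c = 0.1 J/K, power gated on for 3 s, then off for 3 s.
h = 0.01
power = [2.0] * 300 + [0.0] * 300
T = backward_euler(0.5, 0.1, power, h)
```

With the source switched on, the tile approaches the steady-state value p/g; with it switched off, the tile cools back toward zero. A steady-state analysis would only report the p/g value.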
Thermal Power Variation and Analysis • Different workloads and dynamic power management introduce temporal and spatial power variations • Thermal power is the running average of cycle-accurate power and is constant neither spatially nor temporally • Steady-state analysis must assume that all regions dissipate their maximum thermal power simultaneously • All parts of the chip seldom reach their maximum at the same time, so this assumption can result in over-design • Transient analysis is accurate but time-consuming • This calls for a transient thermal simulation that is both accurate and efficient enough for design automation
Thermal Violation Integral • A thermal violation is a temperature overshoot that lasts for a long enough period, so maximum temperature alone is not a good figure of merit (FOM) • The thermal-violation integral f_k(A) is a more accurate FOM • It integrates the time-domain transient temperature (y) above a defined ceiling temperature (T_ceiling) over a long enough period (t0 ~ tp) at the kth tile • f(A) is the FOM for a group (K) of critical tiles • A is the via density vector
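A minimal sketch of the figure of merit, assuming the integral is approximated by a rectangle-rule sum of the overshoot max(y - T_ceiling, 0) and that the group FOM is the sum of the per-tile integrals. Both choices, and the names `violation_integral` and `group_fom`, are assumptions made for illustration.

```python
def violation_integral(temps, t_ceiling, h):
    """Rectangle-rule approximation of the integral of max(y(t) - T_ceiling, 0) dt."""
    return sum(max(y - t_ceiling, 0.0) for y in temps) * h


def group_fom(waveforms, t_ceiling, h):
    """Sum of the per-tile integrals over a group of critical tiles (assumed combination)."""
    return sum(violation_integral(w, t_ceiling, h) for w in waveforms)


# Hypothetical sampled temperatures in degrees C, sample spacing h = 0.1 s.
tile_a = [50.0, 51.0, 56.0, 51.0]  # highest peak, but above 52 only briefly
tile_b = [53.5, 53.5, 53.5, 53.5]  # lower peak, but above 52 the whole time
f_a = violation_integral(tile_a, 52.0, 0.1)
f_b = violation_integral(tile_b, 52.0, 0.1)
f_group = group_fom([tile_a, tile_b], 52.0, 0.1)
```

Here tile_a has the higher peak temperature but the smaller integral, so ranking tiles by maximum temperature and ranking them by the integral give different answers.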
Problem Formulation • Find a via density vector A that minimizes the thermal-violation integral under global and local routing-congestion constraints • Two keys to solving this problem efficiently • Efficient models of the transient response and of its first-order and second-order sensitivities with respect to via density • Efficient yet effective mathematical programming
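The formulation can be sketched as a constrained minimization. The exact constraint forms do not survive in this transcript, so the version here is an assumption: the global constraint is taken to be a bound $A_{\max}$ on total via density, the local constraint a per-tile bound $A_{i,\max}$, and the group objective a sum over the critical tiles $K$.

```latex
\[
\min_{A}\; f(A) = \sum_{k \in K} f_k(A)
\quad \text{s.t.} \quad
\sum_{i} A_i \le A_{\max} \;\;(\text{global}),
\qquad
0 \le A_i \le A_{i,\max} \;\;(\text{local})
\]
```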
Macromodel by Moment Matching • Krylov-subspace-based projection reduces a large linear network to a small one and preserves accuracy by matching moments with respect to the inputs [Odabasioglu-Celik-Pileggi:TCAD’98] • Flat projection does not preserve block matrix structure such as sparsity • The reduced macromodel does not contain the sensitivity information needed for design automation • [Figure: large linear network reduced to a small linear network]
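For reference, projection-based moment matching of this kind is commonly written as follows. The symbols $G$, $C$, $B$, $L$, $V$ and the reduction order $q$ are generic notation assumed here, not copied from the slides.

```latex
\[
H(s) = L^{T}(G + sC)^{-1}B = \sum_{j \ge 0} m_j\, s^{j},
\qquad
m_j = L^{T}\bigl(-G^{-1}C\bigr)^{j} G^{-1} B
\]
\[
\operatorname{span}(V) = \mathcal{K}_q\bigl(G^{-1}C,\; G^{-1}B\bigr),
\qquad
\hat{G} = V^{T} G V,\quad
\hat{C} = V^{T} C V,\quad
\hat{B} = V^{T} B,\quad
\hat{L} = V^{T} L
\]
```

The reduced transfer function $\hat{L}^{T}(\hat{G} + s\hat{C})^{-1}\hat{B}$ then matches the leading moments $m_j$ of the original. Because $V$ is dense and mixes all state variables, the reduced matrices lose the block structure of the original, which is the limitation noted in the bullets.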
Parameterization (I) • The location of an inserted via is described by an adjacency matrix X • The via density (A_i) is parameterized and added into the MNA equation • The sensitivity needs to be separated from the nominal response • [Figure: eight-node example network; the via between nodes 2 and 6 is described by X(2,6), whose entries are 0, 1 and -1]
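A minimal sketch of how a via between two nodes could be stamped into a conductance matrix, assuming X(2,6) is a column vector with +1 at node 2, -1 at node 6 and 0 elsewhere, and that the via contributes density * g_unit * x x^T. The function names and the unit conductance value are hypothetical.

```python
def incidence_vector(n, a, b):
    """Vector of length n with +1 at node a and -1 at node b (1-based node numbers)."""
    x = [0.0] * n
    x[a - 1], x[b - 1] = 1.0, -1.0
    return x


def stamp_via(G, x, g_unit, density):
    """Return G + density * g_unit * x x^T without modifying G."""
    n = len(x)
    return [[G[r][c] + density * g_unit * x[r] * x[c] for c in range(n)]
            for r in range(n)]


n = 8
G0 = [[0.0] * n for _ in range(n)]          # via-free matrix left empty for clarity
x26 = incidence_vector(n, 2, 6)
G = stamp_via(G0, x26, g_unit=0.5, density=4.0)
```

The stamp touches only the four entries that couple nodes 2 and 6, and it is linear in the density, so the derivative of the matrix with respect to the via density is simply g_unit * x x^T.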
Parameterization (II) • Expand the state variables x(A_1,…,A_K,s) in a Taylor series w.r.t. A_i (up to second order) • x^(0), x^(1), and x^(2) are the nominal values, first-order sensitivities and second-order sensitivities • The expanded system has a lower-triangular structure • The system size is enlarged and needs to be reduced by projection • Traditional flat projection cannot separate the nominal state variables from their sensitivities [Li-Pileggi:ICCAD’04] • This can be solved by a structure-preserving projection [Yu-He-Tan:BMAS’05]
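The lower-triangular structure can be seen in a single-parameter sketch. The notation is assumed for illustration: one via density $A$, system matrices $G_0 + A G_1$ and $C_0 + A C_1$, and Taylor coefficients with the factorials absorbed, so that $x = x^{(0)} + A\,x^{(1)} + A^{2} x^{(2)}$. Substituting into $\bigl(G_0 + A G_1 + s(C_0 + A C_1)\bigr)x = Bu$ and collecting powers of $A$ gives:

```latex
\[
\begin{bmatrix}
G_0 + sC_0 & 0 & 0 \\
G_1 + sC_1 & G_0 + sC_0 & 0 \\
0 & G_1 + sC_1 & G_0 + sC_0
\end{bmatrix}
\begin{bmatrix}
x^{(0)} \\ x^{(1)} \\ x^{(2)}
\end{bmatrix}
=
\begin{bmatrix}
B\,u \\ 0 \\ 0
\end{bmatrix}
\]
```

Every diagonal block is the same nominal matrix $G_0 + sC_0$, so the sensitivities can be obtained by forward substitution once the nominal system is solved.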
Structured Projection (I) • Partition the projection matrix block-diagonally according to the sizes of the nominal state variables, first-order sensitivities, and second-order sensitivities • Structured projection yields a reduced triangular system in which the nominal values and sensitivities can be solved independently
Structured Projection (II) • The time-domain transient response can be solved using the backward Euler method • Nominal response and sensitivities can be solved separately and efficiently • The reduced model is sparse • Only one LU factorization is needed, that of the reduced diagonal block G0+(1/h)C0 • The generated sensitivities can be used in any gradient-based optimization
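A minimal sketch of the separate nominal and sensitivity solves, assuming a scalar model so that the "factorization" of G0+(1/h)C0 is a single number computed once and reused. The names `simulate` and `simulate_perturbed`, the single via parameter, and all values are hypothetical; in the matrix case the same LU factors would be reused for both right-hand sides.

```python
def simulate(g0, c0, g1, c1, power, h):
    """Backward Euler for the nominal response x0 and the first-order
    sensitivity x1 = dx/dA at A = 0, for (g0 + A*g1)*x + (c0 + A*c1)*dx/dt = p."""
    pivot = g0 + c0 / h  # computed once, shared by both solves
    x0, x1 = [0.0], [0.0]
    for p in power:
        new_x0 = (c0 / h * x0[-1] + p) / pivot
        # Right-hand side of the sensitivity equation uses only nominal values.
        rhs = c0 / h * x1[-1] - g1 * new_x0 - c1 / h * (new_x0 - x0[-1])
        x0.append(new_x0)
        x1.append(rhs / pivot)
    return x0, x1


def simulate_perturbed(g0, c0, g1, c1, a, power, h):
    """Reference check: re-simulate the full model at via density a."""
    x, _ = simulate(g0 + a * g1, c0 + a * c1, 0.0, 0.0, power, h)
    return x


power = [2.0] * 400
x0, x1 = simulate(0.5, 0.1, 0.1, 0.01, power, 0.01)
```

The sensitivity is negative here: adding via density lowers the temperature. A finite-difference comparison against `simulate_perturbed` is a cheap way to check the sensitivity recursion.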
Sequential Approximation of Objective Function • The objective function f(A) can be approximated by a linear model (flp) or a quadratic model (fqp) • Find the step ΔA that minimizes flp or fqp in each iteration • The objective function becomes semi-definite when the integration is approximated by a discretized summation [Visweswariah:TCAD’00] • Sequential programming converges for convex programming problems and still converges well for semi-definite problems
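The two local models are presumably the standard first-order and second-order Taylor approximations of $f$ around the current via density vector $A$. The slide equations are not preserved in this transcript, so the forms here are an assumption:

```latex
\[
f_{lp}(A + \Delta A) = f(A) + \nabla f(A)^{T} \Delta A
\]
\[
f_{qp}(A + \Delta A) = f(A) + \nabla f(A)^{T} \Delta A
+ \tfrac{1}{2}\, \Delta A^{T}\, \nabla^{2} f(A)\, \Delta A
\]
```

Each iteration would minimize one of these models over $\Delta A$ subject to the congestion constraints and then update $A \leftarrow A + \Delta A$. The gradient and Hessian come from the first-order and second-order sensitivities of the temperature response.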
Sensitivity Calculation • The sensitivity of the objective function is calculated directly • Structured and parameterized reduction provides an efficient calculation of both the nominal value and the sensitivity • The via density vector A can be updated efficiently in each iteration • The computation cost can be further reduced by using an adjoint Lagrangian method to calculate the sensitivity [Visweswariah:TCAD’00]
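A minimal sketch of the direct sensitivity calculation, assuming the objective is the rectangle-rule violation integral so that its gradient is the sum of the temperature sensitivities over the time steps where the ceiling is exceeded. The function name `fom_and_gradient` and the sample data are hypothetical.

```python
def fom_and_gradient(temps, sens, t_ceiling, h):
    """Discretized violation integral and its gradient w.r.t. the via densities.

    temps[n] is the temperature at step n; sens[i][n] is d temps[n] / d A_i.
    """
    hot = [n for n, y in enumerate(temps) if y > t_ceiling]
    f = sum(temps[n] - t_ceiling for n in hot) * h
    grad = [sum(s[n] for n in hot) * h for s in sens]
    return f, grad


# Hypothetical waveform (degrees C) and sensitivities for two candidate vias.
temps = [50.0, 54.0, 55.0, 51.0]
sens = [[-1.0, -2.0, -2.0, -1.0],   # via 0: strong cooling effect at this tile
        [0.0, -0.5, -0.5, 0.0]]     # via 1: weak cooling effect at this tile
f, grad = fom_and_gradient(temps, sens, 52.0, 0.1)
```

Only the steps above the ceiling contribute, so a via that cools the tile outside the violation window does not improve the objective. The more negative gradient entry identifies the via location where added density helps most.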
Experiment Settings • A modest 3D stack with 1 heat sink, 2 die layers and 2 dielectric layers is assumed; each layer is extracted as an RC mesh, and the layers are interconnected by an RC pair for each via • Clock gating is assumed with a period of 250 ms • The reduction algorithm uses SIMO (single-input-multiple-output) reduction when the number of inputs is large • Our method (SP-MACRO) is compared with the steady-state solution
Accuracy of Reduced Macromodel • Transient temperature responses of the exact and SP-MACRO models at ports 3, 18, and 58 of the top layer for a step input • The macromodel responses are visually identical to those of the exact models
Optimization Profile by SQP • Temperature reduction at a selected location during via allocation by SQP • The allocated vias result in a transient temperature that meets the targeted ceiling temperature of 52 °C
Temperature Map • Temperature maps of the top layer before and after via allocation • The maximum temperature before allocation is about 150 °C • The temperature after allocation meets the targeted ceiling temperature of 52 °C
Allocated-via and Runtime Comparison • Compared to the steady-state solution • SP-MACRO has smaller simulation and planning time as circuit size increases • It reduces the runtime by 126X • SP-MACRO predicts the required via insertion more accurately • It reduces the number of inserted vias by 2.04X
Conclusions • Via planning based on transient thermal analysis reduces the via number by 2.04X compared to planning based on steady-state thermal analysis • An efficient via planning algorithm is developed • Structured and parameterized model reduction provides both nominal values and sensitivities • Sequential linear/quadratic programming minimizes the thermal-violation integral • SP-MACRO is further extended to simultaneous power and thermal integrity driven via planning [Yu-Ho-He:ICCAD’06]