Dynamic thermal clock skew compensation using Tunable Delay Buffers (TDBs) Enrico Macii EDA GROUP POLITECNICO DI TORINO enrico.macii@polito.it
Outline • A quick tour of the EDA group • Why temperature effects are important in DSM design • Factors contributing to temperature increase • Temperature-aware design methodologies • Temperature effects in clock distribution networks • Thermal resilient clock tree (TRTC) design • Dynamic thermal clock skew compensation using Tunable Delay Buffers (TDBs) • Conclusions
A Quick Tour of the EDA Group (as of June 2006) • Part of the Computer Engineering Department • 3 Faculty Members: Enrico Macii (Full Professor), Massimo Poncino (Associate Professor), Alberto Macii (Associate Professor) • 6 PhD Students • 9 Research Assistants • 3 Undergrad Students • http://eda.polito.it
Topics under investigation • Leakage-aware design: • Automatic leakage model generation. • Synthesis of power-gated circuits. • Ongoing Intel project under SRC project – G. Kamhi. • ABB – Adaptive Body Biasing. • Cache and memory architecture synthesis. • Power optimization under thermal and process-variation constraints. • Power-driven clock-tree synthesis. • RTL power estimation and management. • Low-power memory and bus interface synthesis. • Power optimization solutions for LCD/plasma display units. • Regularity-driven synthesis and synthesis for DFM. • Design technologies for low-power secure hardware/crypto-processors.
Our research network (PoliTo EDA Group): IMEC (B), Intracom (GR), Stanford (US), USC (US), STM (I - F - CH), Colorado (US), EPFL (CH), Intel (IL), PoliMi (Sciuto), Philips (NL), Siemens (I), UniBo (Benini), Freescale (F - US), UniRm (Olivieri), UniUrb (Bogliolo), Infineon (D - A), Synopsys (US), PoliMi (Sami), UniVr (Fummi)
Why temperature effects are important in DSM design • The increase in chip power density has resulted in on-chip temperature gradients. • Reliability is a huge problem in current and future nanometric designs. • Leakage power, which constitutes a major portion of the power consumption in nanometric designs, is exponentially dependent on temperature. • Signal integrity degradation due to temperature gradients in high-performance ICs is becoming a major design problem to tackle.
Factors contributing to an increase in temperature • Aggressive interconnect scaling has resulted in higher current densities. • Increase in the number of metal layers. • Low-k dielectrics introduced in current silicon processes have very low thermal conductivity. • Voltage does not scale in the same proportion as the geometries, which leads to higher power density. • Dynamic voltage scaling and clock-gating techniques have contributed to large temperature gradients on chip.
Thermal-aware design methodologies • Even though thermal modeling has received much attention from the scientific community, very little has been done in circuit design techniques to reduce hazards caused by temperature gradients in the ICs. • Thermal-aware design must be a part of the design flow for future 3D ICs. • Thermal-aware placement, thermal floor-planning, thermal-aware clock distribution are some of the areas that are being explored.
Temperature effects on clock distribution network Thermal gradients – Sources • Power reduction techniques such as dynamic power management, clock-gating, operand isolation etc., induce temperature gradients on the substrate. • With decreasing feature sizes, the global metal layers on which the clock signal is routed are getting closer to the substrate. • Temperature gradients in clock networks may be induced due to self heating or thermal coupling from the substrate or metal layers underneath the clock.
Temperature effects on clock distribution network Effects of temperature gradients • Clock skew induced by temperature gradients is no longer negligible. • Buffer insertion in clock networks has to be revisited to account for temperature effects.
Temperature effects on clock distribution network • Assumption: Uniform thermal profile • Target: Zero-skew clock tree • [Figure: clock tree with wires a, b and c over a span of length L; clock insertion point at X = L/2]
Temperature effects on clock distribution network • Assumption: Thermal profile linearly increasing towards c • Target: Zero-skew clock tree • [Figure: clock tree with wires a, b and c over a span of length L; clock insertion point shifted off the midpoint (labels X > L/2 and X < L/2)]
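The shift of the insertion point under a thermal gradient can be reproduced with a small Elmore-delay model. The sketch below is illustrative only: it assumes a single wire of length L with a sink at each end, a wire resistance that grows linearly with temperature, and made-up electrical values (`r0`, `c_wire`, `c_load`, `tcr`, `delta_t` are all assumptions, not numbers from the talk). It searches for the tap position that equalises the delay to both sinks.

```python
# Illustrative sketch: wire model and all numeric values are assumptions.

def balanced_tap(length_um, r0, c_wire, c_load, tcr, delta_t, steps=2000):
    """Return the tap position x (measured from the cool end, in um) that
    equalises the Elmore delays to the sinks at x = 0 and x = length_um."""

    def r(s):
        # Resistance per unit length; temperature rises linearly by
        # delta_t degrees from s = 0 to s = length_um.
        return r0 * (1.0 + tcr * delta_t * s / length_um)

    def integrate(f, a, b):
        # Midpoint rule over [a, b].
        h = (b - a) / steps
        return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

    def delay_difference(x):
        # Elmore delay from the tap to each sink: resistance at s times the
        # capacitance downstream of s (remaining wire plus sink load).
        left = integrate(lambda s: r(s) * (c_wire * s + c_load), 0.0, x)
        right = integrate(
            lambda s: r(s) * (c_wire * (length_um - s) + c_load), x, length_um
        )
        return left - right

    # delay_difference grows monotonically with x, so bisection finds the root.
    lo, hi = 0.0, length_um
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if delay_difference(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Assumed values: 1000 um wire, 0.1 ohm/um, 0.2 fF/um, 10 fF sink loads.
params = dict(length_um=1000.0, r0=0.1, c_wire=0.2e-15, c_load=10e-15, tcr=0.004)
x_uniform = balanced_tap(delta_t=0.0, **params)
x_gradient = balanced_tap(delta_t=40.0, **params)
print(f"uniform profile:  X = {x_uniform:.1f} um")
print(f"linear gradient:  X = {x_gradient:.1f} um")
```

With a uniform profile the tap sits at the midpoint; with the gradient, this model moves it towards the hotter end, because the hotter half of the wire is more resistive and must be made shorter to stay balanced.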
Thermal resilient clock tree (TRTC) design • Solutions proposed for zero-skew clock trees and simple thermal profiles (e.g., Pan et al., ICCAD-05). Our Approach [DATE-06] • Capable of dealing with non-zero skew clock trees. • Significant reduction in worst case clock skew. • Minimum wirelength penalty. • Improves on a BST-DME based algorithm. • Robust to process variations as it is minimally intrusive and maintains the goodness of the original tree. • Starting point is an existing clock tree.
Fall of TRCT for varying temperature profile • [Figure: clock insertion point under Thermal Profile 1 and Thermal Profile 2; labels X > L/2, X = L/2, X > L/2]
Dynamic thermal clock skew compensation using Tunable Delay Buffers (TDBs) • Address the issue of non-stationary temperature distribution. • Exploit the buffers introduced during clock tree generation - Transform them into tunable delay elements. • Compensate for temperature induced delay variations by tuning the buffers accordingly.
Dynamic thermal clock skew compensation using Tunable Delay Buffers (TDBs) • Current high-performance microprocessors have sensors placed at potential hotspots, monitored by a TMU (thermal management unit). • Our technique borrows from online, post-silicon tuning, where TDBs are used to reduce timing violations induced by process variations. • As clock violations in our case stem from a deterministic source (temperature), our proposed methodology allows for finer control over solution quality in terms of: - Number of tunable buffers. - Range of tunable buffers.
Dynamic thermal clock skew compensation using Tunable Delay Buffers (TDBs) • A TDB, in its simplest form, consists of a number of taps providing fixed delays that, when activated, keep skew within bounds. • Our TDBs consist of a pair of inverters with capacitive loads between them that meet our delay requirements after proper sizing.
Dynamic thermal clock skew compensation using Tunable Delay Buffers (TDBs) • The inverters are sized so that the TDB with all taps OFF has the same delay as the library buffer it replaces. • Each ON tap delivers a constant delay of 8 ps, a value chosen to keep the area and power overheads within an admissible range while achieving substantial compensation. • Various TDBs corresponding to existing buffers in the library are built and characterized for area, timing and power.
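The delay behaviour described on this slide can be captured in a few lines. This is a minimal sketch that assumes the tap delays add linearly; the class name, field names and the 50 ps base delay are hypothetical, and only the 8 ps per tap comes from the slide.

```python
from dataclasses import dataclass

TAP_DELAY_PS = 8.0  # per-tap delay quoted on the slide


@dataclass
class TunableDelayBuffer:
    base_delay_ps: float  # delay with all taps OFF (matches the replaced buffer)
    num_taps: int         # tunable range, in taps
    taps_on: int = 0      # taps currently switched ON

    def set_taps(self, taps_on: int) -> None:
        # Reject settings outside the buffer's tunable range.
        if not 0 <= taps_on <= self.num_taps:
            raise ValueError(f"taps_on must be in 0..{self.num_taps}")
        self.taps_on = taps_on

    @property
    def delay_ps(self) -> float:
        # Assumption: each ON tap adds the same fixed delay.
        return self.base_delay_ps + TAP_DELAY_PS * self.taps_on


tdb = TunableDelayBuffer(base_delay_ps=50.0, num_taps=4)  # hypothetical buffer
delay_all_off = tdb.delay_ps
tdb.set_taps(3)
delay_three_on = tdb.delay_ps
print(delay_all_off, delay_three_on)
```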
Dynamic thermal clock skew compensation using Tunable Delay Buffers (TDBs) • Three phase thermal compensation flow • Build initial buffered clock tree with a given skew bound assuming a fixed thermal profile. • Calculate optimal number of TDBs and relative tuning taking into account every thermal variation that would occur during the lifetime of the design. • Embed TMU and TDBs.
Dynamic thermal clock skew compensation using Tunable Delay Buffers (TDBs) Example: Outcome of the proposed flow for a clock tree with m TDBs along with the tuning table that stores, for each thermal transition, the tuning of the m buffers.
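One plausible way to represent such a tuning table is a lookup keyed by the thermal state. The sketch below assumes 2 sensors with 2 temperature ranges each and m = 3 TDBs; the table entries and the function names are invented for illustration and do not come from the talk.

```python
# Hypothetical tuning table: thermal state -> taps ON for each of the m = 3 TDBs.
# A thermal state is a tuple with one temperature-range index per sensor.
TUNING_TABLE = {
    (0, 0): (0, 0, 0),
    (0, 1): (0, 1, 2),
    (1, 0): (2, 0, 0),
    (1, 1): (1, 1, 1),
}


def lookup_tuning(state):
    """Return the tap setting of every TDB for the given thermal state."""
    return TUNING_TABLE[tuple(state)]


def retune(old_state, new_state):
    """Return {buffer index: new taps} for the buffers whose setting changes
    on the thermal transition old_state -> new_state."""
    old = lookup_tuning(old_state)
    new = lookup_tuning(new_state)
    return {i: n for i, (o, n) in enumerate(zip(old, new)) if o != n}


print(lookup_tuning((0, 1)))
print(retune((0, 0), (0, 1)))
```

Storing one row per thermal state and deriving the per-transition changes keeps the table small; a table indexed directly by transitions would be an equally valid reading of the slide.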
Dynamic thermal clock skew compensation using Tunable Delay Buffers (TDBs) • Notation and problem formulation • The skew of a tree TR is defined as the maximum difference between the delays of any two sinks. • Given a tree TR with skew bound B, having m buffers and a temperature profile T, find an optimal assignment of the tunings R = {ri}, i = 1, ..., m, such that the skew of TR under T stays within B. • The overhead (power and area) of a TDB is proportional to its tunable range, so the optimal assignment minimizes the overall magnitude of the ri. • The problem is cast into an ILP formulation.
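The slide's equations did not survive, so the following is only a hedged reconstruction of what an ILP of this kind could look like. The symbols d_s(T) (insertion delay of sink s under profile T), P(s) (set of TDBs on the path from the root to s), \delta (delay of one tap) and r_max (largest allowed range) are assumptions introduced here; TR, B, m and r_i are from the slide.

```latex
\[
\begin{aligned}
\min_{r_1,\dots,r_m} \quad & \sum_{i=1}^{m} r_i \\
\text{s.t.} \quad & D_s = d_s(T) + \delta \sum_{i \in P(s)} r_i && \forall \text{ sinks } s \\
& D_s - D_t \le B && \forall \text{ sinks } s \ne t \\
& r_i \in \{0, 1, \dots, r_{\max}\} && i = 1, \dots, m
\end{aligned}
\]
```

Since the second constraint is imposed for both (s, t) and (t, s), it is equivalent to |D_s - D_t| <= B for every pair of sinks, which is the skew definition given on the slide. With several thermal profiles, one such set of constraints and one tuning vector would be needed per profile.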
Dynamic thermal clock skew compensation using Tunable Delay Buffers (TDBs) • Extract clock tree geometry from DEF file (after Place, CTGEN, global route). • Extract nominal delay from SDF parasitics file. • Read in thermal profiles – calculate insertion delays to each sink and generate constraints for ILP solver. • Embed TMU, TDBs in place of select buffers in the DEF file - use incremental placement option to PLACE and then re-route the design.
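The constraint-generation and solving step can be illustrated on a toy tree. The sketch below replaces the ILP solver with an exhaustive search, which is only workable for a handful of buffers; the tree topology, the delays, the bound and the function name are all hypothetical, and only the 8 ps tap delay is taken from the talk.

```python
from itertools import product

TAP_DELAY_PS = 8.0  # per-tap delay quoted in the talk


def min_tuning(sink_delays_ps, paths, bound_ps, max_taps):
    """Exhaustively search for the tap assignment with the smallest total
    number of taps that keeps the skew within bound_ps.

    sink_delays_ps[s]: insertion delay of sink s under the thermal profile.
    paths[s]: indices of the tunable buffers on the root-to-sink path of s.
    Returns a tuple of taps per buffer, or None if no assignment works.
    """
    num_buffers = 1 + max(i for path in paths for i in path)
    best = None
    for taps in product(range(max_taps + 1), repeat=num_buffers):
        tuned = [
            d + TAP_DELAY_PS * sum(taps[i] for i in path)
            for d, path in zip(sink_delays_ps, paths)
        ]
        skew = max(tuned) - min(tuned)
        if skew <= bound_ps and (best is None or sum(taps) < sum(best)):
            best = taps
    return best


# Hypothetical tree: buffer 0 at the root, buffer 1 feeding sinks 0 and 1,
# buffer 2 feeding sinks 2 and 3. The right-hand side is hotter, hence slower.
delays = [100.0, 104.0, 120.0, 126.0]
paths = [(0, 1), (0, 1), (0, 2), (0, 2)]
solution = min_tuning(delays, paths, bound_ps=12.0, max_taps=3)
print(solution)
```

In this toy case the untuned skew is 26 ps, and the search slows down the cooler left branch by two taps on buffer 1 to bring the skew back within the 12 ps bound.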
Dynamic thermal clock skew compensation using Tunable Delay Buffers (TDBs) Experimental Setup • Benchmarks obtained from opencores.org • Selected according to varying degrees of complexity and size • Single clock domain
Dynamic thermal clock skew compensation using Tunable Delay Buffers (TDBs) • Temperature induced violations nullified using dynamic delay tuning. • TDBs easy to design • Average power penalty of around 5% • Number of TDBs and tuning range minimized using ILP • Area increase insignificant Details available in paper to appear at ISLPED-06.
Dynamic thermal clock skew compensation using Tunable Delay Buffers (TDBs) • Implementation of TMU • Two styles: Centralized and distributed standard cell block. • Impact on design: Area, power, wire-length. • TMU: • N temperature sensors, each of which can sense up to p temperature values. • i-th TDB has ri taps (inputs are one-hot encoded). • TMU consists of N·log₂ p inputs and Σ ri outputs.
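The interface width of the TMU follows directly from these counts. The sketch below assumes binary-encoded sensor readings (rounded up to whole bits) and one output line per tap; the function names and the tap counts in the example are hypothetical.

```python
def tmu_io_width(num_sensors, levels_per_sensor, taps_per_tdb):
    """Return (input_bits, output_bits) of the TMU.

    Inputs: each sensor reading is binary encoded, needing ceil(log2(p)) bits.
    Outputs: one control line per tap of every TDB.
    """
    bits_per_sensor = (levels_per_sensor - 1).bit_length()  # ceil(log2(p))
    return num_sensors * bits_per_sensor, sum(taps_per_tdb)


def one_hot(selected, width):
    """One-hot control word: exactly one of `width` lines is high.
    Assumption: the TMU selects one setting per TDB this way."""
    if not 0 <= selected < width:
        raise ValueError("selected line out of range")
    return [1 if i == selected else 0 for i in range(width)]


# Example: 3 sensors with 4 temperature values each, and three TDBs with
# hypothetical tap counts 4, 4 and 2.
inputs, outputs = tmu_io_width(3, 4, [4, 4, 2])
print(inputs, outputs)
print(one_hot(2, 4))
```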
Dynamic thermal clock skew compensation using Tunable Delay Buffers (TDBs) • Implementation details • Assumptions: • 3 sensors. • 4 ranges of temperatures for each TDB. • Physical design: • 90nm HCMOS process with standard Vth from STMicroelectronics. • Synthesis flow: Synopsys front-end and Cadence back-end.
Dynamic thermal clock skew compensation using Tunable Delay Buffers (TDBs) Results Centralized implementation
Dynamic thermal clock skew compensation using Tunable Delay Buffers (TDBs) Results Distributed implementation Slightly less area and power efficient, better in terms of wire-length (flexibility in the placement).
Conclusions • Temperature effects can no longer be considered trivial and hence design techniques and accurate modeling are absolutely necessary to build robust chips. • Clock trees are even more susceptible to temperature gradients across the chip as they span the entire die, making thermal aware clock-tree construction an absolute necessity. • We have proposed a technique that dynamically compensates temperature induced clock skew using TDBs. • We have considered the issue of TMU design and implementation.
Contacting the Speaker • Prof. Enrico Macii, Politecnico di Torino, Dipartimento di Automatica e Informatica, Corso Duca degli Abruzzi 24, 10129 Torino, Italy • Phone: +39-011-564.7074 • Mobile: +39-347-067.5850 • Fax: +39-011-564.7099 • E-mail: enrico.macii@polito.it • URL: http://eda.polito.it/enrico.html