Study focusing on optimizing gate sizing and post-silicon tunability allocation to address process variations in sub-90nm technologies. Techniques, challenges, and two-stage stochastic programming formulation are discussed and experimentally validated.
Variability-Driven Formulation for Simultaneous Gate Sizing and Post-Silicon Tunability Allocation. Vishal Khandelwal and Ankur Srivastava, Department of Electrical and Computer Engineering, University of Maryland, College Park. http://www.ece.umd.edu/~vishalk
Introduction • Process variations cause significant spread in design performance in sub-90nm technologies • This spread impacts yield and reliability • It is necessary to explicitly consider the impact of process variations on design parameters • Several statistical analysis and optimization techniques have been proposed to improve timing/power yields
Handling Process Variations • Design-Time Optimization • Statistical Gate Sizing [Davoodi, DAC’06] [Sapatnekar, DAC’05] [Zhou, ICCAD’05] • Statistical Buffer Insertion • Post-Fabrication Tunability • Post-Silicon Tunable Clock-Tree Buffers [Chen, ICCAD’05] [Mahoney, ISSC’05] [Takahashi, 2003] [Tam, JSSC’00] • Adaptive Body-Biasing • Other cited work: [He, ISPD’06] [Davoodi, ICCD’05] [Wong, ICCAD’05] [Khandelwal, ICCAD’03] [Kim, ISLPED’03] [Orshansky, ICCAD’06]
Traditional Gate Sizing • Gate size: si • Minimize area, or power • Subject to: • meeting a delay constraint Tcons at the output • size constraints [Fishburn, Dunlop 1985] [Sapatnekar, 1993] (Figure: timing graph labeled with nodes 0, i, n, arrival times ti and tj, gate delay di, and output constraint Tcons)
Traditional Gate Sizing • Convex Formulation: Minimize Area, Power, … • Posynomial Gate Delay Expression [Fishburn, Dunlop 1985] [Sapatnekar, 1993]
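For reference, the classic sizing problem is commonly written along the following lines. This is a sketch, not a reconstruction of the slide's own equations: si, ti, tj, di and Tcons follow the slides, while the area weights a_i, the size bounds, the fanout set fo(i), and the delay coefficients \alpha_i, \beta_i, c_j, C_i are assumed symbols for a typical Elmore-style delay model.

```latex
\begin{align*}
\min_{s,\,t}\quad & \sum_i a_i s_i \\
\text{s.t.}\quad  & t_i + d_i(s) \le t_j
                    && \text{for every gate } j \text{ driven by gate } i, \\
                  & t_n \le T_{cons}, \\
                  & s_i^{\min} \le s_i \le s_i^{\max}, \\
\text{with}\quad  & d_i(s) = \alpha_i + \frac{\beta_i}{s_i}
                    \Big( C_i + \sum_{j \in fo(i)} c_j s_j \Big).
\end{align*}
```

Under this model each d_i(s) is a posynomial in the gate sizes, so the problem is a geometric program; the substitution s_i = e^{z_i}, t_i = e^{u_i} turns it into a convex problem.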
Effects of Process Variations • Set of random variables with arbitrary distributions • Delay of each gate becomes a random variable • Statistical Gate Sizing • [Davoodi, DAC’06] [Sapatnekar, DAC’05] • [Zhou, ICCAD’05] (Figure: transistor cross-section labeled Tox and Leff)
Post-Silicon Tunable (PST) Clock Tree Buffers • Tunable clock buffers can introduce extra slack into critical paths after fabrication • Design Overhead • Area, Clock-Tree Power • [Chen, ICCAD’05] • [Mahoney, ISSC’05] • [Takahashi, 2003] • [Tam, JSSC’00] (Figure: clock tree with buffers B1 to B7 and flip-flops FF 1 to FF 8)
Post-Silicon Tunable Clock Tree Buffers • Let Dij be the delay of the longest path between flip-flops i and j • Consider Flip-Flops 2 and 7: Tune buffers to change clock-skew (Figure: clock tree with buffers B1 to B7 and flip-flops FF 1 to FF 8)
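The effect of tuning on the path between flip-flops 2 and 7 can be sketched as follows. The tree topology (B1 at the root, B4 to B7 each driving two flip-flops), the function names, and the setup condition "Dij <= Tcons + skew" with setup time ignored are illustrative assumptions, not details taken from the slides.

```python
# Hypothetical topology: B1 is the root, B2/B3 are its children, B4..B7 are
# leaf buffers, and each leaf drives two flip-flops (B4 -> FF1, FF2; ...;
# B7 -> FF7, FF8).
PARENT = {"B2": "B1", "B3": "B1", "B4": "B2", "B5": "B2", "B6": "B3", "B7": "B3"}


def leaf_buffer(ff):
    """Name of the leaf buffer driving flip-flop number ff (1..8)."""
    return f"B{4 + (ff - 1) // 2}"


def clock_arrival(ff, tuning):
    """Sum of the tunable delays on the root-to-leaf path feeding flip-flop ff."""
    node, total = leaf_buffer(ff), 0.0
    while node is not None:
        total += tuning.get(node, 0.0)
        node = PARENT.get(node)
    return total


def setup_slack(i, j, d_ij, t_cons, tuning):
    """Slack of the longest path from flip-flop i to j: Tcons + skew - Dij."""
    skew = clock_arrival(j, tuning) - clock_arrival(i, tuning)
    return t_cons + skew - d_ij


# Path FF2 -> FF7 misses Tcons = 1.0 by about 0.1 without tuning ...
print(setup_slack(2, 7, 1.1, 1.0, {}))
# ... and meets it once B7 delays the capturing clock by 0.15.
print(setup_slack(2, 7, 1.1, 1.0, {"B7": 0.15}))
```

Tuning a buffer shared by both flip-flops (B1 here) leaves the skew unchanged, and delaying FF7's clock takes the same amount of slack away from paths that start at FF7, which is why the amount of tuning has to be chosen per path and per fabricated die.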
Optimization Objective: Tunability Cost • Metric to capture the overhead due to PST buffers in the design • Silicon Area • Clock-Tree Power
Optimization Objective: Binning Yield Loss (BYL) • Convex loss function Q(.) [V. Zolotov, DAC’04] [D. Blaauw, GLSVLSI’05] (Figure: loss versus delay (t), with Tcons marked on the delay axis)
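A minimal sketch of how such an objective could be estimated from samples is shown below. The hinge-shaped loss (zero up to Tcons, linear beyond it), the Gaussian delay model, and all function names are assumptions chosen for illustration; the slides only state that Q(.) is convex.

```python
import random


def loss(delay, t_cons, penalty=1.0):
    """Assumed convex loss: zero when timing is met, linear in the violation."""
    return penalty * max(0.0, delay - t_cons)


def estimate_byl(delay_samples, t_cons):
    """Monte Carlo estimate of the expected loss over sampled circuit delays."""
    return sum(loss(d, t_cons) for d in delay_samples) / len(delay_samples)


def sample_delays(nominal, sigma, count, seed=0):
    """Hypothetical variability model: Gaussian spread around a nominal delay."""
    rng = random.Random(seed)
    return [rng.gauss(nominal, sigma) for _ in range(count)]


# Expected loss of a design whose nominal delay just meets Tcons = 1.0.
print(estimate_byl(sample_delays(1.0, 0.05, 1000), 1.0))
```

Because the estimate is an average of convex functions of the delay, it stays convex in the delay, which is the property the later convexity argument relies on.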
Problem Statement • Given a sequential design with a synthesized PST clock-tree (known buffer locations), perform simultaneous • Statistical gate sizing • PST buffer tuning range determination • such that Binning Yield Loss and Tunability Cost are minimized
Two-Stage Formulation • Gate Size: x, Tuning Buffer Range: r • First Stage • Deterministic constraints: • meeting timing requirement assuming no variations • Capturing variability in objective
Second Stage Formulation • Second Stage: given a solution to the first stage problem and a variability sample, determine the loss Q with respect to Tcons • No Statistical Timing Analysis scheme exists to estimate the timing distribution of a circuit given gate sizes and tuning buffer ranges • Each sample of variability requires a different amount of tuning for maximum timing yield
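To illustrate why each variability sample needs its own tuning, here is a toy second-stage problem: two flip-flops i and j with one path in each direction and a single tunable skew limited to a given range. The two-path setting, the hinge loss, and the function name are assumptions made for this sketch; the formulation in the slides covers a full circuit.

```python
def second_stage(d_ij, d_ji, skew_range, t_cons):
    """Best skew and resulting loss for one sample of the two path delays.

    The skew x (clock at j minus clock at i) helps the path i -> j and hurts
    the path j -> i, so the worst effective delay is max(d_ij - x, d_ji + x).
    That function is convex in x, and its minimum over [-skew_range,
    skew_range] is the balancing point clipped to the allowed range.
    """
    balanced = (d_ij - d_ji) / 2.0
    skew = max(-skew_range, min(skew_range, balanced))
    worst = max(d_ij - skew, d_ji + skew)
    return skew, max(0.0, worst - t_cons)


# Same tuning range, two different variability samples:
print(second_stage(1.2, 0.8, 0.3, 1.0))   # needs a skew of about +0.2
print(second_stage(0.9, 1.1, 0.3, 1.0))   # needs a skew of about -0.1
```

A tuning range that is too small leaves a residual loss for some samples (for example `second_stage(1.2, 0.8, 0.1, 1.0)`), which is the trade-off between Binning Yield Loss and Tunability Cost that the first stage has to balance.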
Convex Problem • THEOREM: The proposed two-stage stochastic programming formulation is convex • PROOF: Detailed proof omitted for brevity • First stage constraints are convex • First stage objective is convex if BYL(x,r) is convex • Need to show that the second-stage problem for each sample is convex • From the second stage formulation one can show that BYL(x,r) is convex
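The structure of the argument can be sketched in generic two-stage notation. Everything beyond x, r, Q and BYL is an assumption introduced here: c(r) stands for the tunability cost, \omega for a variability sample, y for the per-sample buffer tuning, and the symmetric range |y_k| \le r_k is one possible way to model the tuning limit.

```latex
\begin{align*}
\min_{x,\,r}\quad & c(r) + \mathrm{BYL}(x,r)
   \quad \text{s.t. nominal timing and size constraints}, \\
\mathrm{BYL}(x,r) &= \mathbb{E}_{\omega}\big[\, Q^{*}(x,r,\omega) \,\big], \\
Q^{*}(x,r,\omega) &= \min_{y}\ \big\{\, Q(x,y,\omega) \;:\; |y_k| \le r_k \ \ \forall k \,\big\}.
\end{align*}
```

If Q(x,y,\omega) is jointly convex in (x,y) for every fixed \omega (in whatever variables make the first stage convex), then Q^{*} is convex in (x,r), because minimizing a jointly convex function over y on the convex set \{|y_k| \le r_k\} preserves convexity. BYL is then an expectation of convex functions and is itself convex, with no assumption on the distribution of \omega.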
Kelley’s Cutting Plane Algorithm • Iteratively solve the first and second stage formulations • Given a solution to the first stage formulation, we use the method of finite differences to generate a lower bound to BYL from the second stage formulation • Add this constraint to the first stage formulation at each iteration
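The iteration can be sketched on a one-variable toy problem: choose a tuning range r that minimizes a linear tunability cost plus BYL(r). The toy BYL (tuning by r reduces each sampled delay by r), the weighted-sum objective, and all names are assumptions for illustration; the actual first stage is a multi-variable convex program solved with a convex solver.

```python
T_CONS = 1.0
DELAY_SAMPLES = [0.9, 1.0, 1.1, 1.2, 1.3]   # hypothetical variability samples


def byl(r):
    """Toy second stage: mean hinge loss when tuning can absorb up to r of delay."""
    return sum(max(0.0, d - r - T_CONS) for d in DELAY_SAMPLES) / len(DELAY_SAMPLES)


def finite_difference_slope(f, x, h=1e-6):
    """Central finite-difference estimate of df/dx at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def kelley(cost_slope, f, lo, hi, tol=1e-6, max_iter=50):
    """Minimize cost_slope * r + f(r) over lo <= r <= hi with cutting planes."""
    cuts = []                                   # (slope, intercept) lower bounds on f
    r, best_r, best_val = lo, lo, float("inf")
    for iteration in range(1, max_iter + 1):
        # Second stage: evaluate f at the current point and build a linear cut.
        value = f(r)
        slope = finite_difference_slope(f, r)
        cuts.append((slope, value - slope * r))
        if cost_slope * r + value < best_val:
            best_r, best_val = r, cost_slope * r + value

        # First stage: minimize the piecewise-linear model. In one dimension
        # its minimum lies at an interval end or where two cuts intersect.
        def model(x):
            return cost_slope * x + max(g * x + b for g, b in cuts)

        candidates = [lo, hi]
        for k, (g1, b1) in enumerate(cuts):
            for g2, b2 in cuts[k + 1:]:
                if abs(g1 - g2) > 1e-12:
                    x = (b2 - b1) / (g1 - g2)
                    if lo <= x <= hi:
                        candidates.append(x)
        r = min(candidates, key=model)
        if best_val - model(r) <= tol:          # model lower bound meets best value
            break
    return best_r, best_val, iteration


print(kelley(0.3, byl, 0.0, 0.5))
```

Each cut is only as accurate as the finite-difference slope, so the lower bound is approximate near kinks of the loss; the stopping test compares the best objective value found so far with the minimum of the current model.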
Shortest-Path Constraints • Inherently non-convex in nature • Approximate gate delay using a linear approximation (lower bound) • The two-stage stochastic programming formulation can be modified to consider shortest path constraints
Experimental Results • Implemented the framework in SIS using MOSEK to solve the convex formulation • Used CAPO to place the netlist to get spatially correlated gate delays • Assumed 15% Vth variation in 90nm technology node [Predictive Technology Model] • Synthesized the PST clock-tree using the technique proposed in [Chen et al., ICCAD’05] (Figure: two placed gates i and j at coordinates (xi, yi) and (xj, yj))
Experimental Results • Experimental Comparison: ISCAS benchmarks • [Chen]: • Nominal gate sizing • PST clock-tree generation using [Chen et al., ICCAD’05] • Sensitivity: • Retain PST clock-tree location and range • Sensitivity-driven statistical gate sizing algorithm • Size the gate with maximum yield gain greedily (iterative) • Similar in spirit to [Zhou ICCAD’05, Zolotov DAC’05] • Stochastic: • Retain PST clock-tree buffer locations • Proposed simultaneous gate sizing and post-silicon tunability allocation algorithm
Runtime Comparison • Number of Iterations (figures)
Summary and Future Work • Variability-driven framework for simultaneous gate sizing and post-silicon tunability allocation to minimize binning yield loss and tunability cost • Efficient stochastic-programming-based scheme to solve the formulation • No assumptions about parameter distributions or their correlations • Need to develop a statistical timing analysis scheme that can consider the effect of post-silicon tunability