280 likes | 432 Views
On Timing Closure: Buffer Insertion for Hold-Violation Removal. Pei-Ci Wu Martin D. F. Wong. DAC’14. Outline. Introduction Preliminaries Linear Programming Based Optimization Bottom-up Buffer Insertion Experimental Results Concluding Remarks. Introduction.
E N D
On Timing Closure: Buffer Insertion for Hold-ViolationRemoval Pei-Ci Wu Martin D. F. Wong DAC’14
Outline • Introduction • Preliminaries • Linear Programming Based Optimization • Bottom-up Buffer Insertion • Experimental Results • Concluding Remarks
Introduction • Timing closure, which is to satisfy the timing constraints, is a key problem in the physical design • Setup (long-path) constraints • ensure that the signal transitions do not arrive too late • hold-time (short-path) constraints • ensure that the signal transitions do not arrive too early
Typically, hold violations are addressed after setup optimization has been performed. • Discrete cell sizes (i.e. discrete buffer sizes for hold optimization)in modern industrial designs
Cell libraries specified for the setup constraints and the hold-time constraints are usually different in modern industrial designs
Preliminaries • Negative setup slacks and negative hold slacks indicate setup violations and hold violations
TNS • the absolute value of the total negative setup slacks of all the pins in PO • THS • the absolute value of the total negative hold slacks of all the pins in PO • TNS must not be worsen during hold-violation removal
Given: • a design and a buffer library, • find a buffering solution such that: • THS and the cost of buffering (i.e. area and power consumption) are both minimized while TNS is not worsen.
Linear Programming Based Optimization • Inserting delay into wires to remove hold violations • A linear programming formulation • Extend such formulation for the complex timing constraints • Graph-reduction approach
Input • Combinational circuit C*s.t. for any pin p of C*, hold_slackp < 0 and setup_slackp > 0 • C*can then be represented as a directed acyclic graph G(V,E) • V is the pins of C* • (i, j) ∈ E represents an edge
I: the zero in-degree pins • O : the zero out-degree pins • for each pin i in V • three real-value variables, xi(delays inserted at pin i for hold-time constraints), hai, and sai
For buffer library characterization is necessary in order to get an empirical ratio such that • we assume that the buffer only affects the driver cell and the sink cells of the buffer
Delays introduced by inserting the buffer is • (a) • (b)
Objective : • The setup constraints limit the delays that can be inserted • riis only necessary when there is no feasible solution
Some pins with positive setup slacks and positive hold slacks that are not included
Bottom-up Buffer Insertion • Given: • a pin i, hold delay DH and setup delay DS • Find a buffering solution at pin ifrom a buffer library B: • hold delays introduced by the chosen buffers are as close to DH • setup delays introduced by the chosen buffers are not larger than DS • Minimize the area of the chosen buffers
DP based algorithm • A set of buffering candidates C(L, dh, ds, A) is kept during the process • For each buffer in B, we insert it to any of the existing candidates
New buffering candidates • (1) if d′s>DS, C′ is removed immediately • (2) if d′h<=dh, C′ is removed as well • d′h > DH +margin where margin is a parameter, then C′ is removed too • (3) C′ is dominated by any existing candidate C*(I*, d*h, d*s, A*) if d′h < d*hand A′ > A* • Chose the candidate that has the largest ratio of dh/A as the buffering solution
Bottom-up Methodology • process the pins by thebottom-up topological ordering (i.e. from PO to PI) • DP algorithm cannot realize the exact amount of hold delays/setup delays by inserting buffers(extra delays)
Suppose now we are processing pin p, collected extra delays • cur_setup_reqp = setup_reqp− ds_delay • extra delays = cur_setup_reqp– sap • Ds = xp + cur_setup_reqp− sap • Similarly to get Dh
Concluding Remarks • First propose a linear programming basedapproach that minimizes the number of inserted delays • Abottom-up buffer insertion and the flow of optimizing are presented