Timing Closure Today
Lou Scheffer, Cadence, San Jose, CA
Lou@cadence.com
Timing Closure Today
Flow: Design Entry → Synthesis → Timing → Place → Timing → Route → Timing
• Timing more accurate as flow progresses
• Sometimes an earlier stage thinks timing is OK, but it fails a later stage
• Need to repeat one or more steps with tighter constraints
• We have a timing closure problem when this process fails. Symptoms include:
  • Non-convergence
  • Too many iterations
  • Solution achievable, but this flow cannot find it
The Timing Closure Problem
Examples of Problems
Agenda
• Timing Analysis Overview
• Traditional design flows
• Summary of DSM Problems
• Correction Methods Overview
• Hierarchy and Timing Closure
• Block Level Timing Closure
• Experimental Results
• Summary
Timing Analysis
• Give accurate time values on each pin/port of the network
• Has to deal with design changes in optimization toolbox
• Static Timing Analysis
  • Simulation far too slow in optimization environment
  • Accuracy is more than enough
Timing Analysis Requirements
• Choose the combination of timing analyzer and delay calculator that is appropriate for the level of design
  • Give the best accuracy
  • For performance that can be tolerated
• Timing analysis / delay calculation must be able to cope with logic design changes
• Incremental
• Highest performance possible
• Non-linear delay models
Timing Analysis Requirements
• Must handle…
  • Difference between rising and falling delays
  • Delay dependent on slew rate
  • Slew and delay dependent on output load
  • Non-linear delay equations
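One common way to meet these requirements is a lookup table indexed by input slew and output load, with separate tables for rising and falling outputs. The sketch below is a minimal illustration under that assumption; the table values, units, and the names `lookup`, `RISE_DELAY`, and `FALL_DELAY` are hypothetical and do not come from the slides.

```python
from bisect import bisect_right

def interp_index(axis, v):
    """Return (i, t) so that v lies a fraction t of the way from axis[i] to axis[i+1].

    Values outside the axis range are extrapolated linearly from the end segment.
    """
    i = min(max(bisect_right(axis, v) - 1, 0), len(axis) - 2)
    t = (v - axis[i]) / (axis[i + 1] - axis[i])
    return i, t

def lookup(table, slews, loads, slew, load):
    """Bilinear interpolation in a table with input slew on rows and output load on columns."""
    i, ts = interp_index(slews, slew)
    j, tl = interp_index(loads, load)
    top = table[i][j] * (1 - tl) + table[i][j + 1] * tl
    bot = table[i + 1][j] * (1 - tl) + table[i + 1][j + 1] * tl
    return top * (1 - ts) + bot * ts

# Hypothetical characterization data (slew and delay in ns, load in pF).
SLEWS = [0.1, 0.5]
LOADS = [0.01, 0.10]
RISE_DELAY = [[0.10, 0.40], [0.20, 0.60]]
FALL_DELAY = [[0.08, 0.30], [0.16, 0.50]]

rise = lookup(RISE_DELAY, SLEWS, LOADS, slew=0.3, load=0.055)
fall = lookup(FALL_DELAY, SLEWS, LOADS, slew=0.3, load=0.055)
print(f"rise delay = {rise:.3f} ns, fall delay = {fall:.3f} ns")
```

An output slew table would be looked up the same way, so that the slew handed to the next stage also depends on load.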
Late Mode Analysis Definitions
[Figure: example network with nodes a, b, c, y, x]
• Constraints: assertions at the boundaries
  • Arrival times: ATa, ATb
  • Required arrival time: RATx
• Delay from a to x is the longest time it takes to propagate a signal from a to x
• Slack is required arrival time - arrival time.
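The definitions above can be sketched as a forward pass (latest arrival) and a backward pass (earliest required time) over the timing graph. This is a minimal sketch, not the deck's implementation: the topology (a and b feed y; y and c feed x), the unit delays, and the boundary values are assumptions chosen for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical arcs (from, to) -> delay; the topology is assumed for illustration.
ARCS = {("a", "y"): 1, ("b", "y"): 1, ("y", "x"): 1, ("c", "x"): 1}

def late_mode(arcs, arrival_in, required_out):
    """Return (arrival, required, slack) per node for late mode analysis."""
    preds, succs = {}, {}
    for (u, v), d in arcs.items():
        preds.setdefault(v, []).append((u, d))
        succs.setdefault(u, []).append((v, d))
    # TopologicalSorter takes node -> predecessors and yields predecessors first.
    graph = {v: [u for u, _ in p] for v, p in preds.items()}
    order = list(TopologicalSorter(graph).static_order())
    at = dict(arrival_in)
    for n in order:  # forward pass: latest arrival over all incoming arcs
        if n in preds:
            at[n] = max(at[u] + d for u, d in preds[n])
    rat = dict(required_out)
    for n in reversed(order):  # backward pass: tightest requirement over all outgoing arcs
        if n in succs:
            rat[n] = min(rat[v] - d for v, d in succs[n])
    slack = {n: rat[n] - at[n] for n in order}
    return at, rat, slack

# Assumed boundary assertions: ATa=0, ATb=1, ATc=0, RATx=2.
at, rat, slack = late_mode(ARCS, {"a": 0, "b": 1, "c": 0}, {"x": 2})
print(at["x"], slack)
```

With these assumed numbers the path b → y → x arrives at time 3 against a required time of 2, so b, y, and x carry a slack of -1 while c has a slack of +1.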
Example
[Figure: example network with nodes a, b, c, y, x; two delays labelled 1]
Early mode analysis
• Definitions change as follows
  • longest becomes shortest
  • slack = arrival - required
• Not as important since early violations are easier to fix
[Figure: example network with nodes a, b, c, y, x; two delays labelled 1]
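A minimal early mode sketch follows, using the same assumed network as a late mode example would (a and b feed y; y and c feed x, unit delays). Arrival is now the shortest path, the required time is read as the earliest moment a signal is allowed to arrive, and the sign of slack flips. All numbers and names are hypothetical.

```python
# Hypothetical arcs (from, to) -> delay; the topology is assumed for illustration.
ARCS = {("a", "y"): 1, ("b", "y"): 1, ("y", "x"): 1, ("c", "x"): 1}
ORDER = ["a", "b", "c", "y", "x"]  # a topological order of the nodes above

def early_mode(arcs, order, arrival_in, required_out):
    """Return early mode slack (arrival - required) per node."""
    at = dict(arrival_in)
    for n in order:  # forward pass: earliest arrival over all incoming arcs
        ins = [at[u] + d for (u, v), d in arcs.items() if v == n]
        if ins:
            at[n] = min(ins)
    rat = dict(required_out)
    for n in reversed(order):  # backward pass: latest "not before" time over outgoing arcs
        outs = [rat[v] - d for (u, v), d in arcs.items() if u == n]
        if outs:
            rat[n] = max(outs)
    return {n: at[n] - rat[n] for n in order}

# Assumed boundary assertions: early arrivals a=0, b=1, c=0; x must not change before time 2.
early_slack = early_mode(ARCS, ORDER, {"a": 0, "b": 1, "c": 0}, {"x": 2})
print(early_slack)
```

Here the short path c → x arrives at time 1, one unit too early, so c and x show a slack of -1. Adding one unit of delay on that arc would clear the violation, which is why early violations tend to be the easier ones to fix.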
Delay modeling
• Timing Model: Propagation Arcs and Test Arc
[Figure: timing model with pins labelled a, d, o, x, b, cl]
Agenda
• Timing Analysis Overview
• Traditional design flows
• Summary of DSM Problems
• Timing Correction Overview
• Approaches to Fixing Timing Closure
• Experimental Results
• Summary
Traditional Design Flows: Mid-1980s
Flow: Design Entry → Synthesis → Timing → Place → Timing → Route → Timing
• Tech independent optimization
• Tech mapping
• Rudimentary timing correction
Logic Synthesis
• Technology independent optimization
  • General goal: reduce connections, literals, redundancies, area
• Technology mapping
  • Map logic into technology library
• Timing correction added next
  • Find and fix critical timing paths
  • Fix electrical violations (load, slew)
Traditional Design Flows: 1990s
Flow: Design Entry → Synthesis w/Timing → Place w/Timing → Route → Timing
• Tech independent optimization
• Tech mapping
• Timing correction
Integrate timing with synthesis and placement
Traditional Design Flows: 2000s
Flow: Design Entry → Synthesis/Placement w/Timing → Global Route → Detailed Route → Timing
• Tech independent optimization
• Tech mapping
• Placement
• Timing correction
Integrate timing with synthesis and placement
Traditional Design Flows: 2001
Flow: Design Entry → Synthesis and Placement w/Timing and Global Route → Detailed Route → Timing
• Tech independent optimization
• Tech mapping
• Placement
• Timing correction
• Global route
Integrate timing with synthesis, placement and global route
The Wall
• Logic designers concentrate on logic and timing (as understood by synthesis)
• Design work done in abstract world
  • Was gates and wire load models
  • Now may include placement and global route
• Throw design over the wall when complete
• Physical designers concentrate on layout and ability to route
• Effective method for many years
General CMOS Problems
• Low drive strengths / low power
• Capacitance (not intrinsic delay) plays a large role in performance
• Huge variability – range between slowest possible and fastest possible
• Noise affects delay
• IR drop a big percentage of supply
• Crosstalk can change delay by a factor of 2
Additional DSM Problems
• High density / huge designs
• Very thin and resistive wires
• Very high frequencies
• Inductance becomes more important
• Smaller voltages
• IR drop a bigger fraction of signal swing
• Clock skew and latency
• Electromigration and noise
Clock Distribution Problems
• Most common design approach requires close to zero skew
• CMOS / DSM problems all affect clocks
• Distribution problem increasing
• Number of latches/flip-flops growing significantly
• Power consumed in clock tree significant
• I and noise also of concern
Process Designers are trying to help
• Many metal layers
• Different metal pitches
  • Small pitch for local interconnect
  • Big pitch/thick metal for long, fast wires
• Copper wires, thick metal to lower R
• SOI – Silicon On Insulator
• Low k dielectrics
• These help but are not enough
Timing Correction
• Fix electrical violations (slew and load). Takes priority since needed for reliability.
  • Resize cells
  • Buffer nets
  • Copy (clone) cells
• Fix timing problems
  • Local transforms (bag of tricks)
  • Path-based transforms
Local Transforms
• Resize cells
• Buffer or clone to reduce load on critical nets
• Decompose large cells
• Swap connections on commutative pins or among equivalent nets
• Move critical signals forward
• Pad early paths
• Area recovery
Transform Example: Double Inverter Removal
[Figure: removing two back-to-back inverters from a path; Delay = 4 before, Delay = 2 after]
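A minimal sketch of this transform on a single path, assuming every gate has unit delay and the net between the two inverters has no other fanout (both are assumptions, as is the gate chain itself):

```python
def remove_double_inverters(chain):
    """Cancel adjacent inverter pairs along a path given as a list of gate types."""
    out = []
    for gate in chain:
        if gate == "INV" and out and out[-1] == "INV":
            out.pop()  # two inverters in a row are logically a wire
        else:
            out.append(gate)
    return out

def path_delay(chain, gate_delay=1):
    """Unit-delay model: every gate on the path costs gate_delay."""
    return gate_delay * len(chain)

before = ["NAND", "INV", "INV", "NOR"]  # hypothetical path
after = remove_double_inverters(before)
print(path_delay(before), "->", path_delay(after))
```

With the assumed four-gate path the delay drops from 4 to 2, matching the numbers on the slide. An odd run of inverters leaves one in place, so the logic function is preserved.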
Resizing
[Figure: a gate of undetermined size (?) with inputs a, b driving sinks d, e, f labelled 0.2, 0.2, 0.3; candidate cells A (0.035) and C (0.026)]
Cloning
• Can also isolate critical sinks
[Figure: a gate with inputs a, b driving sinks d, e, f, g, h (0.2 each) is duplicated into copies A and B that share the sinks between them]
Buffering
[Figure: before and after views of a gate with inputs a, b driving sinks d, e, f, g, h (0.2 each); buffer B is inserted on the net, with a value of 0.1 labelled in the buffered version]
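The effect of buffering on the driving gate can be sketched with a simple linear delay model. Everything here is an assumption for illustration: the model (intrinsic delay plus drive resistance times load), the choice of d as the critical sink, the reading of 0.1 as the buffer input load, and the intrinsic and resistance values.

```python
def stage_delay(intrinsic, drive_res, load):
    """Linear delay model: intrinsic delay plus drive resistance times load."""
    return intrinsic + drive_res * load

SINKS = {"d": 0.2, "e": 0.2, "f": 0.2, "g": 0.2, "h": 0.2}  # sink loads
BUF_IN_CAP = 0.1      # assumed input load of the inserted buffer
CRITICAL = {"d"}      # assumed critical sink, kept on the original driver

# Before: the driver sees every sink.
load_before = sum(SINKS.values())
# After: the driver sees only the critical sink plus the buffer input.
load_after = sum(c for s, c in SINKS.items() if s in CRITICAL) + BUF_IN_CAP

delay_before = stage_delay(intrinsic=1.0, drive_res=2.0, load=load_before)
delay_after = stage_delay(intrinsic=1.0, drive_res=2.0, load=load_after)
print(f"driver delay {delay_before:.2f} -> {delay_after:.2f}")
```

The sinks moved behind the buffer pay the buffer's own delay, so this only helps when those sinks have slack to spare.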
Redesign Fan-in Tree
[Figure: inputs with Arr(a)=4, Arr(b)=3, Arr(c)=1, Arr(d)=0 feed a tree of gates with delay 1. Original tree: Arr(e)=6. Redesigned tree: Arr(e)=5.]
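The slide's numbers can be reproduced with a greedy, Huffman-style construction that always combines the two earliest-arriving signals first, so late signals pass through fewer gates. This is a sketch assuming 2-input gates with unit delay and assuming the original tree was balanced; the function names are hypothetical.

```python
import heapq

def greedy_fanin_arrival(arrivals, gate_delay=1):
    """Combine the two earliest signals repeatedly; return the output arrival time."""
    heap = list(arrivals)
    heapq.heapify(heap)
    while len(heap) > 1:
        first = heapq.heappop(heap)
        second = heapq.heappop(heap)
        heapq.heappush(heap, max(first, second) + gate_delay)
    return heap[0]

def balanced_fanin_arrival(arrivals, gate_delay=1):
    """Pair neighbouring signals level by level, ignoring arrival times."""
    level = list(arrivals)
    while len(level) > 1:
        nxt = [max(level[i], level[i + 1]) + gate_delay
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd signal passes through to the next level
        level = nxt
    return level[0]

arrivals = [4, 3, 1, 0]  # Arr(a), Arr(b), Arr(c), Arr(d) from the slide
print(balanced_fanin_arrival(arrivals), greedy_fanin_arrival(arrivals))
```

The balanced pairing (a with b, c with d) gives an output arrival of 6, while the skewed tree built greedily (c with d, then b, then a) gives 5.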
Redesign Fan-out Tree
[Figure: two fan-out trees with stage delays between 1 and 3, one with Longest Path = 4 and one with Longest Path = 5; annotation: slowdown of buffer due to load]
Decomposition
Swap Commutative Pins
• Simple sorting on arrival times and delay works
[Figure: inputs a, b, c reconnected among the commutative pins of a gate, before and after the swap]
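The sorting idea can be sketched as follows: give the latest-arriving signal the fastest pin. The arrival times, pin names, and pin delays below are hypothetical, and the sketch assumes all three pins are logically interchangeable.

```python
def assign_pins(arrivals, pin_delays):
    """Pair signals sorted latest-first with pins sorted fastest-first."""
    signals = sorted(arrivals, key=arrivals.get, reverse=True)
    pins = sorted(pin_delays, key=pin_delays.get)
    return dict(zip(pins, signals))

def output_arrival(assignment, arrivals, pin_delays):
    """Output arrival is the worst arrival-plus-pin-delay over all inputs."""
    return max(arrivals[s] + pin_delays[p] for p, s in assignment.items())

arrivals = {"a": 0, "b": 1, "c": 2}         # hypothetical arrival times
pin_delays = {"p1": 1, "p2": 1, "p3": 2}    # hypothetical pin-to-output delays

original = {"p1": "a", "p2": "b", "p3": "c"}  # latest signal on the slowest pin
swapped = assign_pins(arrivals, pin_delays)
print(output_arrival(original, arrivals, pin_delays),
      output_arrival(swapped, arrivals, pin_delays))
```

With these assumed values the output arrival improves from 4 to 3 without changing the logic function or adding any cells.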
Move Critical Signals Forward
[Figure: circuit with signals a, b, c, d, e, before and after rewiring]
• Based on ATPG
• Linear in circuit size
• Detects redundancies efficiently
• Efficiently finds wires to be added and removed
• Based on mandatory assignments
Path-based Transforms
• Path-based resizing
• Unmap / remap a path or cone
• Slack stealing
• Retiming
Slack Stealing
• Take advantage of timing behavior of level sensitive registers (latches)
• No change to logic!
[Figure: clocks C1 and C2 on a time axis marked 0, 1, 2; stages with Slack = +1 and Slack = -1 before stealing, Slack = 0 after]
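As a rough numerical sketch of the slide's example, slack stealing can be modelled as shifting time across a latch boundary from the stage with spare slack to the failing one, limited by how far the latch's transparent window allows. The function name and the `max_borrow` parameter are assumptions made for illustration; a real analysis would derive the limit from the clock waveforms.

```python
def steal_slack(slack_a, slack_b, max_borrow):
    """Balance slack across a latch boundary, shifting at most max_borrow time units."""
    shift = (slack_a - slack_b) / 2                       # amount that would equalize both sides
    shift = max(-max_borrow, min(max_borrow, shift))      # clamp to the transparent window
    return slack_a - shift, slack_b + shift

# Slide example: one stage has Slack = +1, its neighbour has Slack = -1.
balanced = steal_slack(1, -1, max_borrow=1)
print(balanced)
```

With a window of at least one time unit both stages end at Slack = 0, and the logic itself is untouched. A narrower window would leave part of the violation in place.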
Retiming
[Figure: Backward Delay=3, Forward Delay=2]
• Problem: Verification
• A more aggressive optimization since it changes the function
Solutions to Timing Closure
• Carry hierarchical logic design into physical
• Hand / Custom design
• Improved analysis
• More sophisticated clock design
• Modify existing flows
• More physically knowledgeable tools
• Many variations: combined synthesis/place/route, gain based synthesis, etc.
Agenda
• Analysis Methods Overview
• Traditional design flows
• Summary of DSM Problems
• Correction Methods Overview
• Hierarchy and Timing Closure
• Block Level Timing Closure
• Experimental Results
• Summary
Hierarchy and Physical Design
• Logical hierarchy can be carried over into physical design
• Seems a natural top-down approach, using floorplanning as a firm guide to physical design
• Use of hierarchy offers many advantages and many possible problems
• A new generation of tools for this problem
Pin Assignment and Timing Budgeting
[Figure: Block 1, Block 2, Block 3, each marked L]
Each block requires:
• Content definition
• Partitioning
• Pin locations
• Clock/timing definition
  • set_input_delay
  • set_output_delay
  • set_drive
  • set_load
  • Path exceptions (false, multicycle paths)
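One simple way to derive per-block budgets is to split the clock period across the blocks on a path in proportion to their estimated delays. The sketch below assumes a path that starts in Block 1, passes through Block 2, and ends in Block 3; the period, the delay estimates, and the function name are hypothetical, and proportional splitting is only one of several possible budgeting policies.

```python
def budget_path(period, estimates):
    """Split the clock period across blocks in proportion to their estimated delays."""
    total = sum(estimates.values())
    return {blk: period * d / total for blk, d in estimates.items()}

# Hypothetical estimated delays (ns) for the portion of the path inside each block.
estimates = {"block1": 2.0, "block2": 1.0, "block3": 1.0}
budgets = budget_path(period=10.0, estimates=estimates)

# Seen from block2, the time used upstream becomes its input delay and the
# time reserved downstream becomes its output delay.
block2_input_delay = budgets["block1"]
block2_output_delay = budgets["block3"]
block2_internal = budgets["block2"]
print(block2_input_delay, block2_internal, block2_output_delay)
```

These values are what would be passed to set_input_delay and set_output_delay for Block 2. The three numbers sum to the clock period, so the block constraints stay consistent with the top-level requirement.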
Hierarchy and Physical Design Advantages
• Run time of P&R tools
• Blocks can be built independently
• Early (and valuable) knowledge of global wires
• Limited wire delay within macro may allow simpler methodologies
• Contains the problem size
• Extends naturally to SOC and mixed A/D chips
• May be the only real method available
Physical Hierarchy Disadvantages
• Possible to overconstrain the design in many ways (see next slide)
• Hierarchy usually logic-based, not physically-based
• Designed for logical correctness, not physical implementation
Physical Hierarchy Overconstraints
• Placement solution perhaps overconstrained
  • Logical gates may not fit naturally in a rectangle
• Ability to find a routable solution hindered
  • Can’t detour through neighboring cell
• Boundary conditions explode and must be managed carefully to avoid surprises
  • A recent IBM design had 17,000 top level connections. A bad timing constraint on any one can make the whole design infeasible
[Figures: Hierarchy Example Plots (three slides)]
The Challenges
• How to derive sensible partitioning?
• How to achieve die utilization similar to “flat” approach?
• How to achieve clock speed and skews similar to “flat” approach?
• How to automatically generate optimal pin assignments for each module?
• How to automatically come up with realistic timing budgets for each module?