A Timed-Automaton Based Method for Accurate Computation of Delay in the Presence of Cross-Talk
Serdar Tasiran, Sunil P. Khatri, Sergio Yovine, Robert K. Brayton, Alberto L. Sangiovanni-Vincentelli
Department of Electrical Engineering & Computer Sciences, University of California, Berkeley
Overview
Problem: Computing the delay of a combinational circuit.
OUTLINE
• Why a new method?
• Timed automata
• Input waveforms
• Gate delay models
• Cross-talk models
• Computing delay with timed automata
• How to fight computational complexity:
  • A conjunctively-decomposed representation
  • Conservative delay computation
• Experimental results
• Future research
Why a new method?
• Before deep-submicron, a "solved problem"
  • Devadas, Keutzer, Malik '93
  • McGeer, Saldanha, Brayton, Sangiovanni-Vincentelli '93
  • Lam, Brayton '94
• Higher clock speeds
  • Fewer levels of logic
  • Greater timing accuracy required
• Increased effect of parasitics: cross-talk (coupling)
• New process technologies, circuit families, dynamic logic, complex gates
• Conventional gate delay models no longer adequate
  • Must model new effects at circuit level
  • Boolean behavior and timing very interdependent
  • Delay depends on relative arrival times and values of inputs
From the ICCAD '97 tutorial on timing analysis (Devgan et al.).
What about topological analysis or simulation?
• Topological delay does not account for cross-talk.
• Assuming worst-case cross-talk on all wires is too conservative.
• Only transistor-level simulation provides the desired accuracy.
• BUT: the number of possible input patterns is exponential in the number of inputs:
  • For large circuits, it is infeasible to simulate all patterns.
  • Delay is not guaranteed unless all patterns are simulated.
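To make the exponential growth concrete, here is a minimal counting sketch. It assumes the two-vector input model (each binary input has an old and a new value, so n inputs give 4^n vector pairs) and assumes an n-bit adder with two n-bit operands plus a carry-in; both function names are illustrative and not from the slides.

```python
def two_vector_patterns(num_inputs: int) -> int:
    """Number of (old vector, new vector) pairs over binary inputs."""
    return 4 ** num_inputs


def adder_inputs(bits: int) -> int:
    """Assumption: two operands of `bits` bits each, plus one carry-in."""
    return 2 * bits + 1


if __name__ == "__main__":
    for bits in (4, 8, 32):
        n = adder_inputs(bits)
        count = two_vector_patterns(n)
        print(f"{bits}-bit adder: {n} inputs, {count:.3e} two-vector patterns")
```

Even before different arrival times are considered, exhaustive simulation of every pattern is out of reach for the larger widths.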
OUR APPROACH
• Timed automata serve as delay models for circuit components
• Delay parameters obtained by
  • Simulation
  • Analytical methods
• Formal timing verification used to compute delay
  • All patterns covered; delay guaranteed.
From the ICCAD '97 tutorial on timing analysis (Devgan et al.).
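Timed Automata
• Clocks (timers): real-valued variables that all increase at the same rate.
• For each location:
  • an output assignment
  • an invariant: a clock predicate.
• Clock predicate: positive Boolean combination of constraints x ≤ d and x ≥ d.
[Figure: timed automaton for a delay element with input i, output o and clock x, with 2 ≤ delay ≤ 3. Locations: o = 0 (initial), o = 0 with invariant x ≤ 3, and o = 1. Edge labels: i = 1, x := 0; 2 ≤ x ≤ 3; i = 0, x ≤ 3. A timing diagram shows i, the reset of x, and o changing between times 2 and 3.]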
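The delay element above (the output follows a rising input after between 2 and 3 time units) can be written down as data. This is a minimal sketch, not the authors' representation: the names `Edge`, `LOCATIONS`, `EDGES` and `output_window` are illustrative, and only the rising path of the automaton is modeled.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    lo: float = 0.0        # guard lower bound: lo <= x
    hi: float = math.inf   # guard upper bound: x <= hi
    reset: bool = False    # x := 0 when the edge is taken


# Location name -> (output value o, upper bound of the invariant on clock x)
LOCATIONS = {
    "low": (0, math.inf),   # initial location, output stable at 0
    "pending": (0, 3.0),    # input has risen, output has not; invariant x <= 3
    "high": (1, math.inf),  # output has risen
}

EDGES = {
    "input_rises": Edge("low", "pending", reset=True),        # i = 1, x := 0
    "output_rises": Edge("pending", "high", lo=2.0, hi=3.0),  # 2 <= x <= 3
}


def output_window(t_in: float) -> tuple:
    """Earliest and latest absolute time at which o can become 1,
    given that the input rises (and x is reset) at time t_in."""
    edge = EDGES["output_rises"]
    _, invariant_hi = LOCATIONS[edge.src]
    latest = min(edge.hi, invariant_hi)  # the invariant forces the edge by x = 3
    return (t_in + edge.lo, t_in + latest)


if __name__ == "__main__":
    print(output_window(10.0))
```

The guard gives the earliest firing time and the invariant bounds the latest one, which is how the automaton encodes the interval 2 ≤ delay ≤ 3.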
Timed Automata as Delay Models
• Example: NAND gate
• Determine delay parameters using SPICE simulation
• Construct timed automaton model with these parameters.
[Figure: NAND gate with inputs a, b, output o and transistors T1 to T4, together with its timed automaton model. Locations carry the output values o = 0 and o = 1. Edge labels: a = b = 0, x := 0; a = 0, b = 1; a = 1, b = 0; a = b = 0; x := 0. Guards: d_rise,min ≤ x; d1_fall,min ≤ x; d2_fall,min ≤ x. Invariants: x ≤ d_rise,max; x ≤ d1_fall,max; x ≤ d2_fall,max.]
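A stateful NAND delay model in the spirit of this slide can be sketched as a lookup from the old and new input values to a delay window. Everything concrete here is an assumption: the numeric windows are hypothetical (in a real flow they would come from SPICE), the mapping of d1_fall and d2_fall to the previous input state is a guess at the figure, and the case where both inputs rise together is handled by taking the hull of the two fall windows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    lo: float  # minimum delay
    hi: float  # maximum delay


# Hypothetical delay windows in ns (placeholders for SPICE-derived values).
D_RISE = Window(0.10, 0.14)
D1_FALL = Window(0.08, 0.11)  # assumed: previous inputs were a = 0, b = 1
D2_FALL = Window(0.09, 0.13)  # assumed: previous inputs were a = 1, b = 0


def nand(a: int, b: int) -> int:
    return 0 if (a and b) else 1


def nand_delay(old: tuple, new: tuple):
    """Return (new output value, delay window), or None if o does not change."""
    if nand(*old) == nand(*new):
        return None
    if nand(*new) == 1:
        return (1, D_RISE)
    # Output falls: new == (1, 1) and at least one old input was 0.
    if old == (0, 1):
        return (0, D1_FALL)
    if old == (1, 0):
        return (0, D2_FALL)
    # old == (0, 0): both inputs rise together; assumed conservative hull.
    hull = Window(min(D1_FALL.lo, D2_FALL.lo), max(D1_FALL.hi, D2_FALL.hi))
    return (0, hull)


if __name__ == "__main__":
    print(nand_delay((0, 1), (1, 1)))
    print(nand_delay((1, 1), (0, 1)))
```

The point of the sketch is only that the window depends on the previous input state, which a single [d_min, d_max] pair per gate cannot express.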
Timed Automata as Delay Models
• Delay of this gate depends on
  • Old and new values of a, b, c, d, e
  • Relative arrival times of a, b, c, d, e
• Modeling this circuit with [d_min, d_max] is too coarse.
• Delay models with state are more powerful
• Timed automata can express sophisticated delay models
  • SPICE-simulate an individual circuit component exhaustively
  • Capture delay information into a timed automaton.
• Desired amount of detail can be incorporated into delay model
  • Allows complexity-accuracy trade-off
Modeling Cross-talk
• As feature sizes shrink
  • Wire delays become dominant
  • W and S shrink linearly
  • T shrinks sub-linearly
  • Wire-to-wire capacitance becomes more significant.
• Transitions on wires affect the delays of neighboring wires
• Timed automaton model obtained by
  • Extraction of parasitics from layout
  • SPICE simulation for various input patterns
• Simple cross-talk model
  • One wire switches
  • Wires switch in the same direction
  • Wires switch in the opposite direction
[Figure: cross-section of neighboring wires with dimensions labeled W, S, T and H, and the cross-talk timed automaton with locations stable, one switch, same and opposite. Entering a switching location resets the clock (x := 0). Invariants: x ≤ d_one,max; x ≤ d_same,max; x ≤ d_opposite,max. Guards: d_one,min ≤ x; d_same,min ≤ x; d_opposite,min ≤ x.]
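The three-case cross-talk model can be sketched as a classifier that picks a delay window from the transitions on a wire and its neighbor. The numeric windows are hypothetical, the function names are illustrative, and the sketch takes the point of view of a single "victim" wire and ignores whether the two transitions actually overlap in time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    lo: float  # d_min for this case
    hi: float  # d_max for this case


# Hypothetical windows in ns; real values would come from extraction + SPICE.
WINDOWS = {
    "one": Window(0.20, 0.25),       # only this wire switches
    "same": Window(0.15, 0.20),      # neighbor switches in the same direction
    "opposite": Window(0.28, 0.35),  # neighbor switches in the opposite direction
}


def coupling_case(victim: tuple, neighbor: tuple) -> str:
    """Classify a pair of (old, new) wire values into a cross-talk case."""
    v_old, v_new = victim
    n_old, n_new = neighbor
    if v_old == v_new:
        return "stable"
    if n_old == n_new:
        return "one"
    return "same" if v_new == n_new else "opposite"


def wire_delay_window(victim: tuple, neighbor: tuple):
    """Delay window for the victim wire, or None if it does not switch."""
    case = coupling_case(victim, neighbor)
    return None if case == "stable" else WINDOWS[case]


if __name__ == "__main__":
    print(coupling_case((0, 1), (1, 0)), wire_delay_window((0, 1), (1, 0)))
```

In the timed automaton, each case corresponds to one switching location whose invariant and exit guard carry the matching d_max and d_min.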
Representing Sets of Input Waveforms
• Two-vector delay: All inputs are initially stable and then switch simultaneously.
• Floating-mode:
[Figure: timed automata, one per input signal, for each class of input waveforms. Labels: i = i_old, i = arbitrary, i = i_new; clock = high, x := 0. Variants shown: different arrival times (x = arrive_i) and asynchronous input (d_min ≤ x ≤ d_max).]
Delay Computation with Timed Automata
GIVEN
• Set of primary input waveforms.
  • Represented by timed automaton I.
• A combinational circuit
  • Described as an interconnection of components G1, G2, …, Gk
MUST EXPLORE THE STATE SPACE OF
• Automaton representing primary output waveforms
  • F = (∃ primary inputs, internal variables)( I || G1 || G2 || ... || Gk )
COMPUTE
• Earliest and latest time each output of F changes its value
Exploiting the Structure of the Problem
OBSERVATIONS:
• State space has no cycles: otherwise the circuit oscillates
• Depth of state space limited by longest topological path: linear in circuit size
• S(k): Set of states that the system can be in after k transitions.
• Need to store S(k) only: Savings in space
• May revisit states: Trading off time for memory
• The representation for S(k) can be kept in conjunctively decomposed form.
[Figure: state space drawn as successive layers S(0), S(1), S(2), S(3), S(4), S(5).]
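The layer-by-layer exploration can be illustrated with a toy that keeps only the current layer S(k) in memory. This is a sketch under strong simplifications, not the MOCHA implementation: it uses explicit Python sets instead of BDDs, integer time instead of dense clocks, and a hypothetical acyclic fan-out table `FANOUT` in which a state is just a (signal, time) pair.

```python
# signal -> list of (next signal, min delay, max delay); hypothetical toy circuit
FANOUT = {
    "in": [("n1", 1, 2), ("n2", 2, 3)],
    "n1": [("out", 1, 1)],
    "n2": [("out", 2, 4)],
    "out": [],
}


def successors(state):
    """All states reachable in one transition from (signal, time)."""
    signal, t = state
    for nxt, lo, hi in FANOUT[signal]:
        for d in range(lo, hi + 1):
            yield (nxt, t + d)


def output_change_window(initial, target="out"):
    """Return (earliest, latest, number of layers) for changes of `target`."""
    frontier = set(initial)  # S(0)
    earliest = latest = None
    layers = 0
    while frontier:  # terminates because the state space has no cycles
        times = [t for sig, t in frontier if sig == target]
        if times:
            earliest = min(times) if earliest is None else min(earliest, *times)
            latest = max(times) if latest is None else max(latest, *times)
        # Build S(k+1); S(k) is dropped, and no visited set is kept,
        # so a state may be recomputed in a later layer.
        frontier = {nxt for s in frontier for nxt in successors(s)}
        layers += 1
    return earliest, latest, layers


if __name__ == "__main__":
    print(output_change_window({("in", 0)}))
```

Dropping the visited set is the time-for-memory trade named on the slide: memory is bounded by the largest single layer, while the number of layers is bounded by the longest topological path.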
Conjunctively Decomposed Representations
• Represent S(k) = ∧_i S_i(k), where
  • S_i(k)(s_i, x_i, s_{i-1}, x_{i-1}) represents (s_i, x_i) as a function of (s_{i-1}, x_{i-1})
• Compute S_i(k,k+1) separately for each i, based on S_{i-1}(k,k+1) only: S_i(k,k+1) = S_{i-1}(k,k+1) ∧ S_i(k) ∧ T_i
• Support of each partition kept small: Smaller BDDs.
MORE OBSERVATIONS:
• Circuit components have bounded memory
• State of component is correlated with components in its vicinity.
• Partition circuit into slices so that at step k the possible values of (s_i, x_i) are determined uniquely by (s_{i-1}, x_{i-1})
[Figure: circuit partitioned into slices, with slice variables s1 to s4 and x1 to x4.]
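Why a conjunctive decomposition saves space can be seen on a toy example. The sketch below stores a state set as a list of small relations, one per slice, each relating a slice's value to its predecessor's, and compares that with the expanded set. It uses explicit Python sets rather than BDDs, drops the clock variables, and the relation `rel` is an arbitrary illustration rather than anything from the case study.

```python
from itertools import product


def member(parts, state):
    """state = (s_0, ..., s_n); parts[i - 1] is a set of allowed (s_i, s_{i-1})."""
    return all((state[i], state[i - 1]) in parts[i - 1]
               for i in range(1, len(state)))


def expand(parts, domain):
    """Monolithic set denoted by the conjunction of all slice relations."""
    n = len(parts) + 1
    return {st for st in product(domain, repeat=n) if member(parts, st)}


# Toy relation: a slice and its predecessor are never both 1.
rel = {(0, 0), (0, 1), (1, 0)}
parts = [rel] * 10  # ten slice relations over eleven slice variables

stored_pairs = sum(len(p) for p in parts)
full_set = expand(parts, (0, 1))

if __name__ == "__main__":
    print("pairs stored in decomposed form:", stored_pairs)
    print("states in the expanded set:", len(full_set))
```

Membership can be checked slice by slice without ever building the expanded set, which mirrors the per-partition reached-state computation with small BDD supports.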
Implementation
• Timed-automaton-based delay computation algorithm implemented inside MOCHA.
  • BDD-based implementation
  • Circuit is partitioned into slices
  • Decomposed representation of state sets
  • Reached state computation is performed on a per-partition basis.
• Case study: n-bit carry-skip adder
Case Study
• Potential cross-talk
• Doesn't actually occur, because c_out and A3 are separated in time
• Algorithm must be cross-talk aware so as not to overestimate delay
Experimental Results (2)
• Compare: Monolithic representation cannot complete the 4-bit example in 1 GB.
Advantages of Approach
• Modeling issues and verification and analysis issues are decoupled.
  • Timed automata serve as a clean interface between the two.
• The same algorithms remain applicable
  • For different delay models
  • At different levels of the hierarchy
• Efficiency can be traded off for accuracy without modifying the analysis algorithm.
• Exact characterization of the delay computation problem
  • Allows sound conservative simplifications.
• Timing properties other than delay can be verified
  • Hold and set-up times
  • For dynamic logic, is the input pulse wide enough to discharge the output?
  • Is there a channel-connected path from supply to ground?
Status and Future Work
• Timed-automaton-based delay computation algorithm implemented inside MOCHA.
  • BDD-based implementation
  • Circuit is partitioned into slices
  • Decomposed representation of state sets
  • Reached state computation is performed on a per-partition basis.
• Best performance so far:
  • 32-bit carry-skip adder
  • 3 hours, ~80 MB
• FUTURE WORK: Exploit hierarchy