800 likes | 965 Views
Digital Integrated Circuits A Design Perspective. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikoli ć. Timing Issues. January 2003. Synchronous Timing. Timing Definitions. Latch Parameters. D. Q. Clk. T. Clk. PW m. t su. D. t hold. t d-q. t c-q. Q.
E N D
Digital Integrated CircuitsA Design Perspective Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolić Timing Issues January 2003
Latch Parameters D Q Clk T Clk PWm tsu D thold td-q tc-q Q Delays can be different for rising and falling data transitions
Register Parameters D Q Clk T Clk thold D tsu tc-q Q Delays can be different for rising and falling data transitions
Clock Uncertainties Sources of clock uncertainty
Clock Nonidealities • Clock skew • Spatial variation in temporally equivalent clock edges; deterministic + random, tSK • Clock jitter • Temporal variations in consecutive edges of the clock signal; modulation + random noise • Cycle-to-cycle (short-term) tJS • Long term tJL • Variation of the pulse width • Important for level sensitive clocking
Clock Skew and Jitter Clk • Both skew and jitter affect the effective cycle time • Only skew affects the race margin tSK Clk tJS
Clock Skew # of registers Earliest occurrenceof Clk edge Nominal – /2 Latest occurrenceof Clk edge Nominal + /2 Clk delay Insertion delay Max Clk skew
Positive Skew Launching edge arrives before the receiving edge
Negative Skew Receiving edge arrives before the launching edge
Timing Constraints Minimum cycle time: T - = tc-q + tsu + tlogic Worst case is when receiving edge arrives early (positive )
Timing Constraints Hold time constraint: t(c-q, cd) + t(logic, cd) > thold + Worst case is when receiving edge arrives lateRace between data and clock
Longest Logic Path in Edge-Triggered Systems TJI + d TSU Clk TClk-Q TLM T Latest point of launching Earliest arrivalof next cycle
Clock Constraints in Edge-Triggered Systems If launching edge is late and receiving edge is early, the data will not be too late if: Tc-q + TLM + TSU < T – TJI,1 – TJI,2 - d Minimum cycle time is determined by the maximum delays through the logic Tc-q + TLM + TSU + d + 2 TJI < T Skew can be either positive or negative
Shortest Path Earliest point of launching Clk TClk-Q TLm Clk TH Data must not arrivebefore this time Nominalclock edge
Clock Constraints in Edge-Triggered Systems If launching edge is early and receiving edge is late: Tc-q + TLM – TJI,1 < TH + TJI,2 + d Minimum logic delay Tc-q + TLM < TH + 2TJI+ d
Flip-Flop – Based Timing Skew Flip-flop delay Logic delay f TSU TClk-Q Flip-flop f = 1 f = 0 Logic Representation after M. Horowitz, VLSI Circuits 1996.
Flip-Flops and Dynamic Logic Logic delay TSU TSU TClk-Q TClk-Q f = 1 f = 0 f = 1 f = 0 Logic delay Precharge Evaluate Precharge Evaluate Flip-flops are used only with static logic
Latch timing When data arrives to transparent latch tD-Q Latch is a ‘soft’ barrier D Q Clk tClk-Q When data arrives to closed latch Data has to be ‘re-launched’
Single-Phase Clock with Latches f Latch Logic Tskl Tskl Tskt Tskt Clk PW P
Latch-Based Design L1 latch is transparentwhen f = 0 L2 latch is transparent when f = 1 f L1 L2 Logic Latch Latch Logic
Latch-Based Timing Skew Static logic f L2 latch f = 1 L1 L2 Logic Latch Latch L1 latch f = 0 Logic Long path Can tolerate skew! Short path
Clock Distribution H-tree Clock is distributed in a tree-like fashion
More realistic H-tree [Restle98]
The Grid System • No rc-matching • Large power
final drivers pre-driver 21164 Clocking tcycle= 3.3ns • 2 phase single wire clock, distributed globally • 2 distributed driver channels • Reduced RC delay/skew • Improved thermal distribution • 3.75nF clock load • 58 cm final driver width • Local inverters for latching • Conditional clocks in caches to reduce power • More complex race checking • Device variation tskew = 150ps trise = 0.35ns Clock waveform Location of clock driver on die
tcycle= 1.67ns trise = 0.35ns tskew = 50ps EV6 (Alpha 21264) Clocking 600 MHz – 0.35 micron CMOS • 2 Phase, with multiple conditional buffered clocks • 2.8 nF clock load • 40 cm final driver width • Local clocks can be gated “off” to save power • Reduced load/skew • Reduced thermal issues • Multiple clocks complicate race checking Global clock waveform
ps 5 10 15 20 25 30 35 40 45 50 ps 300 305 310 315 320 325 330 335 340 345 EV6 Clock Results GCLK Skew (at Vdd/2 Crossings) GCLK Rise Times (20% to 80% Extrapolated to 0% to 100%)
EV7 Clock Hierarchy Active Skew Management and Multiple Clock Domains + widely dispersed drivers + DLLs compensate static and low-frequency variation + divides design and verification effort - DLL design and verification is added work + tailored clocks
Self-timed and Asynchronous Design Functions of clock in synchronous design 1) Acts as completion signal 2) Ensures the correct ordering of events Truly asynchronous design 1) Completion is ensured by careful timing analysis 2) Ordering of events is implicit in logic Self-timed design 1) Completion ensured by completion signal 2) Ordering imposed by handshaking protocol
Completion Signal in DCVSL V V DD DD B 0 Start Done B 1 B 0 B 1 In 1 In 1 PDN PDN In 2 In 2 Start
Hand-Shaking Protocol Two Phase Handshake
2-Phase Handshake Protocol Advantage : FAST - minimal # of signaling events (important for global interconnect) Disadvantage : edge - sensitive, has state
Example: Self-timed FIFO All 1s or 0s -> pipeline empty Alternating 1s and 0s -> pipeline full