Digital Integrated Circuits: A Design Perspective. Jan M. Rabaey, Anantha Chandrakasan, Borivoje Nikolić. Timing Issues. January 2003. Synchronous Timing. Clock Uncertainties. Sources of clock uncertainty. Clock Nonidealities. Clock skew
Digital Integrated Circuits: A Design Perspective
Jan M. Rabaey, Anantha Chandrakasan, Borivoje Nikolić
Timing Issues
January 2003
Clock Uncertainties Sources of clock uncertainty
Clock Nonidealities
• Clock skew
  - Spatial variation in temporally equivalent clock edges; deterministic + random, tSK
• Clock jitter
  - Temporal variations in consecutive edges of the clock signal; modulation + random noise
  - Cycle-to-cycle (short-term) tJS
  - Long-term tJL
• Variation of the pulse width
  - Important for level-sensitive clocking
Clock Skew and Jitter
• Both skew and jitter affect the effective cycle time
• Only skew affects the race margin
[Figure: two Clk waveforms annotated with skew tSK and jitter tJS]
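The two bullets on this slide can be made concrete with the usual first-order constraints for edge-triggered registers. The sketch below is not taken from the slides: it assumes the common textbook form in which positive skew (launching edge before receiving edge) relaxes the cycle-time constraint, jitter costs twice its magnitude per cycle, and the race (hold) margin depends on skew only, as the slide states. All function and parameter names are hypothetical, and times are in picoseconds.

```python
def min_clock_period(t_cq, t_logic, t_su, skew, jitter):
    """Smallest period T with T >= t_cq + t_logic + t_su - skew + 2*jitter.

    Assumed sign convention: skew > 0 means the receiving clock edge
    arrives after the launching edge (positive skew).
    The factor 2 assumes the launching edge is late and the
    capturing edge is early by `jitter` in the worst case.
    """
    return t_cq + t_logic + t_su - skew + 2 * jitter


def race_margin(t_cq_cd, t_logic_cd, t_hold, skew):
    """Hold (race) margin; a positive result means no race.

    Uses contamination (minimum) delays. Jitter is left out on purpose,
    following the slide's statement that only skew affects the race margin.
    """
    return t_cq_cd + t_logic_cd - t_hold - skew


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to illustrate the trend.
    print(min_clock_period(100, 800, 50, skew=50, jitter=20))   # positive skew
    print(min_clock_period(100, 800, 50, skew=-50, jitter=20))  # negative skew
    print(race_margin(60, 40, 30, skew=50))
    print(race_margin(60, 40, 30, skew=-50))
```

Under these assumptions, positive skew shortens the usable period but eats into the race margin, while negative skew does the opposite.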
Positive Skew Launching edge arrives before the receiving edge
Negative Skew Receiving edge arrives before the launching edge
[Figure annotation: δ < 0 (negative skew)]
Impact of Jitter
• Absolute jitter: tjitter
• Cycle-to-cycle jitter: Tjitter
Clock Distribution
• Clock gating
• Clock conditioning
• Balanced-path trees
• RC delay
H-tree: Clock is distributed in a tree-like fashion
More realistic H-tree [Restle98]
The Grid System
• Grids are typically used in the final stage of the clock network to distribute the clock to the clocking element loads.
• The delay from the final driver to each load is not matched (no RC matching; this is the fundamental difference from an RC-matched CDN).
• Allows late design changes, since CLK is easily accessible on the die.
• Large power dissipation, since the grid contains a lot of unnecessary (excess) interconnect.
Example: DEC Alpha 21164 (single phase clock on dynamic logic) (width of final driver inverter)
21264 Clocking Clock hierarchy
EV6 (Alpha 21264) Clocking
600 MHz, 0.35 micron CMOS
tcycle = 1.67 ns, trise = 0.35 ns, tskew = 50 ps
• 2 phase, with multiple conditional buffered clocks
  - 2.8 nF clock load
  - 40 cm final driver width
• Local clocks can be gated "off" to save power
• Reduced load/skew
• Reduced thermal issues
• Multiple clocks complicate race checking
[Figure: global clock waveform]
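As a quick sanity check on the figures quoted for the 21264, the following sketch derives the cycle time from the 600 MHz clock rate and expresses the quoted skew and rise time as fractions of the cycle. Only the 600 MHz, 0.35 ns, and 50 ps values come from the slide; the function names are made up for illustration.

```python
def cycle_time_ns(f_clk_hz):
    """Clock period in nanoseconds for a clock frequency in hertz."""
    return 1e9 / f_clk_hz


def fraction_of_cycle(t_ns, f_clk_hz):
    """Fraction of one clock period taken up by a time t_ns."""
    return t_ns / cycle_time_ns(f_clk_hz)


if __name__ == "__main__":
    f_clk = 600e6  # 600 MHz, from the slide
    print(f"tcycle = {cycle_time_ns(f_clk):.2f} ns")
    print(f"skew   = {fraction_of_cycle(0.050, f_clk):.1%} of the cycle")
    print(f"trise  = {fraction_of_cycle(0.35, f_clk):.1%} of the cycle")
```

The period works out to about 1.67 ns, matching the slide; the 50 ps skew is roughly 3% of the cycle and the 0.35 ns rise time roughly 21%.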
Self-timed and Asynchronous Design
Functions of the clock in synchronous design
1) Acts as completion signal
2) Ensures the correct ordering of events
Truly asynchronous design
1) Completion is ensured by careful timing analysis
2) Ordering of events is implicit in logic
Self-timed design
1) Completion ensured by completion signal
2) Ordering imposed by handshaking protocol
Latch-based clocking
- A latch-based design is one in which combinational logic is separated by transparent latches.
- In an edge-triggered system, if a logic block finishes before the end of the clock period, it has to idle until the next input is latched in on the next system clock edge.
- The use of a latch-based methodology (as illustrated in Figure 10.26) enables more flexible timing, allowing one stage to pass slack to or steal time from following stages. This flexibility allows an overall performance increase.
Latch-based clocking
- However, there is an important performance-related difference. In a latch-based system, since the logic is separated by level-sensitive latches, it is possible for a logic block to use time that is left over from the previous logic block. This is referred to as slack borrowing.
- This approach requires no explicit design changes, as the passing of slack from one block to the next is automatic. The key advantage of slack borrowing is that it allows logic between cycle boundaries to use more than one clock cycle while satisfying the cycle time constraint.
- Stated another way, if the sequential system works at a particular clock rate and the total logic delay for a complete cycle is larger than the clock period, then unused time or slack has been implicitly borrowed from preceding stages.
- This implies that the clock rate can be higher than the worst-case critical path would allow! For
Minimum clock period required is 100 ns. Refer to [Bernstein98] for slack borrowing.
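Slack borrowing can be illustrated with a deliberately idealized model. The sketch below is an assumption-laden toy, not the example from the original figure: latches alternate between two phases with a 50% duty cycle, latch k is transparent during the k-th half-period, and latch delays and setup times are ignored. The function name, the delays, and the 10 ns period are all hypothetical.

```python
def check_latch_pipeline(period, delays):
    """Propagate data through a chain of transparent latches.

    delays[k] is the delay of the logic block between latch k and latch k+1.
    Latch k is assumed transparent during [k*T/2, (k+1)*T/2].
    Returns (ok, borrowed), where borrowed[k] is how far past the opening
    of latch k+1 the data arrives, i.e. the time block k borrowed from
    the following stage.
    """
    half = period / 2
    depart = 0.0          # data leaves latch 0 the moment it opens
    borrowed = []
    for k, d in enumerate(delays):
        arrive = depart + d
        opens = (k + 1) * half
        closes = (k + 2) * half
        if arrive > closes:
            return False, borrowed      # data misses the transparent window
        borrowed.append(max(0.0, arrive - opens))
        depart = max(arrive, opens)     # early data waits for the latch to open
    return True, borrowed


if __name__ == "__main__":
    # Nominal budget per block is T/2 = 5 ns; the 7 ns and 6 ns blocks
    # overrun it and borrow from their successors.
    print(check_latch_pipeline(10, [7, 3, 6, 4]))
    # A block longer than a full period cannot be rescued by borrowing here.
    print(check_latch_pipeline(10, [11, 1]))
```

In this model a block may overrun its nominal half-period as long as the data still arrives before the next latch closes, and the overrun is automatically absorbed by a shorter block downstream.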
Synchronous Pipelined Datapath Solution: Asynchronous design
Self-Timed Pipelined Datapath This approach assumes that each combinational function has a means of indicating that it has completed a computation for a particular piece of data.
Asynchronous-Synchronous Interface
• Sample the asynchronous signal at regular intervals and check its value.
• If the sampling rate is high enough, no transitions will be missed (Nyquist criterion).
• Problem: the signal may be sampled in the middle of a transition (e.g. a key press), producing an undefined state (crash).
• An asynchronous signal must be resolved to be either in the high or low state before it is fed into the synchronous environment. A circuit that implements such a decision-making function is called a synchronizer.
• A synchronizer needs some time to come to a decision; waiting helps reduce the failure rate.
Synchronizers and Arbiters
• Arbiter: circuit that decides which of 2 events occurred first
• Synchronizer: arbiter with the clock φ as one of the inputs
• Problem: the circuit HAS to make a decision in limited time; which decision it makes is not important
• Caveat: it is impossible to ensure correct operation
• But we can decrease the error probability at the expense of delay
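The trade-off between error probability and delay is often quantified with a first-order mean-time-between-failures model, MTBF = exp(t_resolve / τ) / (T_w · f_clk · f_data). This formula is a widely used approximation and is not stated on the slides; the function name and every parameter value below are hypothetical and serve only to show the exponential benefit of waiting longer.

```python
import math


def synchronizer_mtbf(t_resolve, tau, t_window, f_clk, f_data):
    """First-order MTBF estimate (seconds) for a synchronizer.

    t_resolve: time allowed for the metastable state to resolve (s)
    tau:       regeneration time constant of the latch (s)
    t_window:  width of the vulnerable sampling window (s)
    f_clk:     sampling clock frequency (Hz)
    f_data:    average transition rate of the asynchronous input (Hz)
    """
    return math.exp(t_resolve / tau) / (t_window * f_clk * f_data)


if __name__ == "__main__":
    # Hypothetical device and signal parameters.
    params = dict(tau=100e-12, t_window=50e-12, f_clk=100e6, f_data=1e6)
    for wait_ns in (1, 2, 4):
        mtbf = synchronizer_mtbf(wait_ns * 1e-9, **params)
        print(f"wait {wait_ns} ns -> MTBF about {mtbf:.3g} s")
```

Each additional τ of waiting multiplies the estimated MTBF by e, which is why adding a clock cycle of resolution time is such an effective remedy in this model.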
A Simple Synchronizer
• Data is sampled on the rising edge of the clock.
• Metastability: since the sampled signal is not synchronized to the clock signal, there is a finite probability that the set-up time or hold time of the latch is violated (the probability is a strong function of the transition frequencies of the input and the clock).
• As a result, when the clock goes high, there is a chance that the output of the latch resides somewhere in the undefined transition zone.
Arbiters
• An arbiter decides which of two events has occurred first (for example, two CPUs demanding the same resource).
• The output consists of two Ack(nowledge) signals that should be mutually exclusive.
• While Requests may occur concurrently, only one of the Acknowledges is allowed to go high.
PLL-Based Synchronization
To generate the higher frequency required by digital circuits, a phase-locked loop (PLL) structure is typically used. A PLL takes an external low-frequency reference signal from a crystal and multiplies its frequency by a rational number N.
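As a small numerical illustration of multiplying by a rational number, the sketch below computes the output frequency for a given reference and ratio using exact fractions. The reference frequencies and ratios are hypothetical and do not describe any particular PLL.

```python
from fractions import Fraction


def pll_output_hz(f_ref_hz, ratio):
    """Output frequency of an idealized PLL: f_out = N * f_ref.

    ratio is the rational multiplication factor N, given as a Fraction
    so that non-integer ratios such as 7/2 stay exact.
    """
    return Fraction(f_ref_hz) * ratio


if __name__ == "__main__":
    # Hypothetical 25 MHz crystal multiplied by N = 24.
    print(pll_output_hz(25_000_000, Fraction(24)))
    # Hypothetical 100 MHz reference multiplied by N = 7/2.
    print(pll_output_hz(100_000_000, Fraction(7, 2)))
```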