Introduction to asynchronous circuit design: specification and synthesis. Jordi Cortadella, Universitat Politècnica de Catalunya, Spain Michael Kishinevsky, Intel Corporation, USA Alex Kondratyev, Theseus Logic, USA Luciano Lavagno, Università di Udine, Italy. Outline.
Outline
• I: Introduction to basic concepts on asynchronous design
• II: Synthesis of control circuits from STGs
• III: Advanced topics on synthesis of control circuits from STGs
• IV: Synthesis from HDL and other synthesis paradigms
Note: no references in the tutorial
Introduction to asynchronous circuit design: specification and synthesis
Part I: Introduction to basic concepts on asynchronous circuit design
Outline
• What is an asynchronous circuit?
• Asynchronous communication
• Asynchronous logic blocks
• Micropipelines
• Control specification and implementation
• Delay models
• Why asynchronous circuits?
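Synchronous circuit
[Figure: chain of registers (R) separated by combinational logic blocks (CL), with every register driven by a common clock CLK]
• Implicit synchronization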
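Asynchronous circuit
[Figure: chain of registers (R) separated by combinational logic blocks (CL), with Req and Ack signals between stages instead of a common clock]
• Explicit synchronization: Req/Ack handshakes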
Synchronous communication
• Clock edges determine the time instants at which data must be sampled
• Data wires may glitch between clock edges (set-up/hold times must be satisfied)
• Data are transmitted at a fixed rate (clock frequency)
[Figure: waveform annotated with the bit values 1 1 0 0 1 0]
Dual rail
• Two wires per bit
  • “00” = spacer, “01” = 0, “10” = 1
• n-bit data communication requires 2n wires
• Each bit is self-timed
• Other delay-insensitive codes exist
[Figure: waveform annotated with the bit values 1 1 1 0 0 0]
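The code table above can be exercised directly. A minimal Python sketch, assuming each codeword is kept as a two-character string exactly as written on the slide; the helper names encode_word, is_valid and decode_word are illustrative and not part of the tutorial:

```python
# Codewords as written on the slide: "00" = spacer, "01" = 0, "10" = 1.
SPACER = "00"
ENCODE = {0: "01", 1: "10"}
DECODE = {"01": 0, "10": 1}

def encode_word(value, n):
    """Encode an n-bit integer as n dual-rail pairs, most significant bit first."""
    return [ENCODE[(value >> i) & 1] for i in reversed(range(n))]

def is_valid(pairs):
    """True once every pair carries a data codeword (no spacer, no "11")."""
    return all(p in DECODE for p in pairs)

def decode_word(pairs):
    """Recover the integer from a complete dual-rail word."""
    if not is_valid(pairs):
        raise ValueError("word is not a complete data word")
    value = 0
    for p in pairs:
        value = (value << 1) | DECODE[p]
    return value

word = encode_word(5, 3)          # three pairs, i.e. 2n = 6 wires
print(word, decode_word(word))
```

Because validity is carried by the data wires themselves, the receiver needs no separate timing signal: it waits until is_valid holds and then waits for every pair to return to the spacer.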
Bundled data
• Validity signal
  • Similar to an aperiodic local clock
• n-bit data communication requires n+1 wires
• Data wires may glitch when not valid
• Signaling protocols
  • level sensitive (latch)
  • transition sensitive (register): 2-phase / 4-phase
[Figure: waveform annotated with the bit values 1 1 0 0 1 0]
Example: memory read cycle
• Transition signaling, 4-phase
[Figure: timing diagram of the Address (A) and Data (D) signals, with the valid address and valid data intervals marked]
Example: memory read cycle
• Transition signaling, 2-phase
[Figure: timing diagram of the Address (A) and Data (D) signals, with the valid address and valid data intervals marked]
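The two protocols differ in how many wire transitions one transfer costs. A minimal Python sketch, assuming a generic pair of handshake wires named req and ack (illustrative names, not taken from the memory example):

```python
def four_phase(n):
    """Return-to-zero signaling: each transfer is req+ ack+ req- ack-."""
    return ["req+", "ack+", "req-", "ack-"] * n

def two_phase(n):
    """Non-return-to-zero signaling: each transfer is one toggle of req, then one of ack."""
    events, level = [], 0
    for _ in range(n):
        level ^= 1                      # every transfer flips the wire level
        edge = "+" if level else "-"
        events += ["req" + edge, "ack" + edge]
    return events

print(len(four_phase(3)), len(two_phase(3)))
```

The event sequence that carries one 4-phase transfer carries two 2-phase transfers, at the price of hardware that must react to both rising and falling edges.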
Asynchronous modules
• Signaling protocol: reqin+ start+ [computation] done+ reqout+ ackout+ ackin+ reqin- start- [reset] done- reqout- ackout- ackin-
  (more concurrency is also possible, e.g. by overlapping the return-to-zero phase of step i-1 with the evaluation phase of step i)
[Figure: DATA PATH block with Data IN and Data OUT, exchanging start and done with a CONTROL block whose handshake ports are req in / ack in and req out / ack out]
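The protocol is a 4-phase one: every wire rises once and falls once per cycle. A minimal Python sketch that replays the event list from the slide and checks this, assuming all wires start low; the [computation] and [reset] steps are left out because they are not wire events, and the function name replay is illustrative:

```python
# Event order copied from the signaling protocol above.
TRACE = ("reqin+ start+ done+ reqout+ ackout+ ackin+ "
         "reqin- start- done- reqout- ackout- ackin-").split()

def replay(trace):
    """Apply events in order; reject a '+' on a high wire or a '-' on a low wire."""
    state = {}
    for event in trace:
        name, edge = event[:-1], event[-1]
        level = state.get(name, 0)      # assumption: every wire starts at 0
        if (edge == "+") == bool(level):
            raise ValueError(f"{event} does not toggle {name}")
        state[name] = 1 - level
    return state

final = replay(TRACE)
print(final)
```

All six wires end at 0, so the module is ready for the next cycle without any extra reset.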
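Asynchronous latches: C element
[Figure: C element symbol with inputs A and B and output Z]
A  B  |  Z+
0  0  |  0
0  1  |  Z
1  0  |  Z
1  1  |  1
[Figure: transistor-level implementation between Vdd and Gnd, with inputs A and B and output Z]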
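The truth table translates into a one-line next-state function. A minimal Python sketch; the function names are illustrative, and the second form is the usual majority-gate expression ab + z(a + b), included only to show that it gives the same table:

```python
def c_element(a, b, z):
    """Next output Z+ of a C element: follow the inputs when they agree, else hold z."""
    return a if a == b else z

def c_element_majority(a, b, z):
    """The same function written as the majority gate ab + z(a + b)."""
    return (a & b) | (z & (a | b))

# Print Z+ for both values of the current output Z (0, then 1).
for a in (0, 1):
    for b in (0, 1):
        print(a, b, [c_element(a, b, z) for z in (0, 1)])
```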
Dual-rail logic
[Figure: dual-rail AND gate with inputs A.t, A.f, B.t, B.f and outputs C.t, C.f]
• Valid behavior for monotonic environment
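One common way to build such a gate, assumed here because the schematic did not survive in this transcript, is to AND the true rails and OR the false rails. A minimal Python sketch with each signal written as a (t, f) pair:

```python
SPACER, ZERO, ONE = (0, 0), (0, 1), (1, 0)

def dual_rail_and(a, b):
    """Dual-rail AND: C.t = A.t AND B.t, C.f = A.f OR B.f (assumed structure)."""
    a_t, a_f = a
    b_t, b_f = b
    return (a_t & b_t, a_f | b_f)

for a in (ZERO, ONE):
    for b in (ZERO, ONE):
        print(a, b, dual_rail_and(a, b))
```

Both rails are monotonic functions of the input rails, so if the environment only raises wires (spacer to data) or only lowers them (data to spacer), the outputs change without glitches. In this structure a single 0 input already produces a valid 0 output while the other input is still a spacer.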
Completion detection
[Figure: completion detection tree ending in a C element that drives the done signal]
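A minimal Python sketch of completion detection for dual-rail data, under two assumptions that are not stated on the slide: each bit is marked valid by ORing its two rails, and the tree of C elements is modelled as a single n-input C element. The names c_element_n and completion are illustrative:

```python
def c_element_n(inputs, z):
    """n-input C element: 1 when all inputs are 1, 0 when all are 0, else hold z."""
    if all(inputs):
        return 1
    if not any(inputs):
        return 0
    return z

def completion(pairs, done):
    """New value of done for a word of (t, f) rail pairs and the current done."""
    valid = [t | f for (t, f) in pairs]   # a bit is valid once either rail is high
    return c_element_n(valid, done)

done = 0
for word in ([(1, 0), (0, 0)],    # one bit still a spacer: done stays 0
             [(1, 0), (0, 1)],    # all bits valid: done rises
             [(0, 0), (0, 1)],    # partly reset: done holds 1
             [(0, 0), (0, 0)]):   # all spacers: done falls
    done = completion(word, done)
    print(word, done)
```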
Differential cascode voltage switch logic
[Figure: 3-input AND/NAND gate with control input start, dual-rail inputs A.t, A.f, B.t, B.f, C.t, C.f, dual-rail outputs Z.t, Z.f, and completion output done]
Bundled-data logic blocks
[Figure: logic block in parallel with a delay element that runs from start to done]
• Conventional logic + matched delay
Micropipelines (Sutherland 89)
[Figure: pipeline of latches (L) separated by logic blocks, controlled by a chain of C elements and delay elements, with handshake signals Rin, Ain, Rout, Aout]
Data-path / Control
[Figure: data path of latches (L) and logic blocks, driven by a CONTROL block with handshake signals Rin, Ain, Rout, Aout]
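Control specification
[Figure: block with input A and output B, and a signal transition graph over the events A+, B+, A-, B-]
Control specification
[Figure: circuit with signals A and B, and a signal transition graph over the events A+, B+, A-, B-]
Control specification
[Figure: circuit with signals A and B, and a signal transition graph over the events A+, B-, A-, B+]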
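Control specification
[Figure: C element with signals A, B and C, and a signal transition graph over the events A+, B+, C+, A-, B-, C-]
Control specification
[Figure: C element with signals A, B and C, and a signal transition graph over the events A+, B+, C+, A-, B-, C-]
Control specification
[Figure: FIFO controller (FIFO cntrl) with ports Ri, Ai, Ro, Ao, a circuit with two C elements, and a signal transition graph over the events Ri+, Ai+, Ro+, Ao+, Ri-, Ai-, Ro-, Ao-]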
A simple filter: specification
  y := 0;
  loop
    x := READ (IN);
    WRITE (OUT, (x+y)/2);
    y := x;
  end loop
[Figure: filter block with input channel IN (Rin, Ain) and output channel OUT (Rout, Aout)]
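The loop can be mirrored in software to see which values the filter produces. A minimal Python sketch, assuming true division for "/" (the specification does not say whether the division is an integer one) and using a hypothetical generator name, filter_stream:

```python
def filter_stream(samples):
    """Yield (x + y) / 2 for each input x, where y is the previous input (0 at the start)."""
    y = 0
    for x in samples:          # x := READ (IN)
        yield (x + y) / 2      # WRITE (OUT, (x+y)/2)
        y = x                  # y := x

out = list(filter_stream([2, 4, 8]))
print(out)
```

Each output is the average of the current and the previous input, so the circuit needs one storage element for x, one for y, and an adder.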
A simple filter: block diagram
[Figure: latches x and y and adder + between IN and OUT, with a control block driving Rx/Ax, Ry/Ay, Ra/Aa and connected to Rin/Ain and Rout/Aout]
• x and y are level-sensitive latches (transparent when R=1)
• + is a bundled-data adder (matched delay between Ra and Aa)
• Rin indicates the validity of IN
• After Ain+ the environment is allowed to change IN
• (Rout,Aout) control a level-sensitive latch at the output
A simple filter: control spec.
[Figure: filter block diagram with a signal transition graph over the events Rin+, Ain+, Rx+, Ax+, Ry+, Ay+, Ra+, Aa+, Rout+, Aout+ and Rin-, Ain-, Rx-, Ax-, Ry-, Ay-, Ra-, Aa-, Rout-, Aout-]
A simple filter: control impl.
[Figure: control circuit with a C element connecting Rin, Ain, Rx, Ax, Ry, Ay, Ra, Aa, Rout, Aout, shown next to the signal transition graph of the control specification]
Control: observable behavior
[Figure: control circuit with the additional signal z, and a signal transition graph over the rising and falling events of Rin, Ain, Rx, Ax, Ry, Ay, Ra, Aa, Rout, Aout and z]
Taking delays into account
[Figure: circuit with signals x, x’, y, z, z’, and a signal transition graph over the events x+, y+, z+, x-, y-, z-]
• Delay assumptions:
  • Environment: 3 time units
  • Gates: 1 time unit
events:  x+  x’-  y+  z+  z’-  x-  x’+  z-  z’+  y-
time:    3   4    5   6   7    9   10   12  13   14
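Taking delays into account
[Figure: circuit with signals x, x’, y, z, z’, one gate marked "very slow", and a signal transition graph over the events x+, y+, z+, x-, y-, z-]
• Delay assumptions: unbounded delays
events:  x+  x’-  y+  z+  x-  x’+  y-   failure!
time:    3   4    5   6   9   10   11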
Gate vs wire delay models
• Gate delay model: delays in gates, no delays in wires
• Wire delay model: delays in gates and wires
Delay models for async. circuits
• Bounded delays (BD): realistic for gates and wires.
  • Technology mapping is easy, verification is difficult
• Speed independent (SI): Unbounded (pessimistic) delays for gates and “negligible” (optimistic) delays for wires.
  • Technology mapping is more difficult, verification is easy
• Delay insensitive (DI): Unbounded (pessimistic) delays for gates and wires.
  • DI class (built out of basic gates) is almost empty
• Quasi-delay insensitive (QDI): Delay insensitive except for critical wire forks (isochronic forks).
  • Formally, it is the same as speed independent
  • In practice, different synthesis strategies are used
[Figure: diagram relating the classes DI, QDI, SI and BD]
Motivation (designer’s view)
• Modularity
  • Plug-and-play interconnectivity
• Reusability
  • IPs with abstract timing behaviors
• High performance
  • Average-case performance (no worst-case delay synchronization)
  • No clock skew (local timing assumptions)
• Many interfaces are asynchronous
  • Buses, networks, ...
Motivation (technology aspects)
• Low power
  • Automatic clock gating
• Electromagnetic compatibility
  • No peak currents around clock edges
• Robustness
  • High immunity to technology and environment variations (in-die variations, temperature, power supply, ...)
Dissuasion
• Concurrent models for specification
  • CSP, Petri nets, ...: no more FSMs
• Difficult to design
  • Hazards, synchronization
• Complex timing analysis
  • Difficult to estimate performance
• Difficult to test
  • No way to stop the clock
But ... some success stories
• Philips
• AMULET microprocessors
• Sharp
• Intel (RAPPID)
• IBM (interlocked pipeline)
• Start-up companies:
  • Theseus Logic, Cogency
• ...