Introduction to asynchronous circuit design: specification and synthesis. Part IV: Synthesis from HDL. Other synthesis paradigms
Outline
• Synthesis from standard HDL (Verilog) [L. Lavagno et al Async00]
  • Subset for asynchronous specification
  • Data-path/control partitioning
  • Circuit architecture. Control generation
• Synthesis from asynchronous HDL (CSP, Tangram)
  • CSP for control generation [A. Martin et al, Caltech]
  • Tangram for silicon compilation [K. van Berkel et al, Philips]
• Control synthesis using FSMs [K. Yun, S. Nowick]
  • Burst-mode machines
  • Comparison with STGs
• Disclaimer: this is NOT a comprehensive review
Motivation
• Language-based design is a key enabler of the success of synchronous logic
• Use an HDL as the single language for
  • specification
  • logic simulation and debugging
  • synthesis
  • post-layout simulation
• The HDL must support multiple levels of abstraction
Control-data partitioning
• Splitting of asynchronous control and synchronous data path
• Automated insertion of bundling delays
[Figure: CONTROL UNIT and DATA PATH blocks connected by request and acknowledge signals, with a delay element]
Design flow
[Figure: flow diagram with the boxes HDL specification, Control/data splitting, Synthesizable HDL (data), STG (control), Synthesis (Synopsys), Synthesis (petrify), Logic delays, Timing analysis (Synopsys), Delay insertion, Logic implementation, HDL implementation]
Asynchronous Verilog subset by example
• begin-end for sequencing, fork-join for concurrency, if-else for input choice
• Only a structured mix of sequencing, concurrency and choice can be specified

    always
    begin
      wait(start);
      R = SMP * 3;
      RES = SMP * 4 + R;
      if (RES[7] == 1)
        RES = 0;
      else
        begin
          if (RES[6] == 1)
            RES = 1;
        end
      done = 1;
      wait(!start);
      done = 0;
    end

[Figure: data path with input SMP and registers R and RES, and a control unit (C.U.) with signals start and done]
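As a rough cross-check of the data operations in the example above, the following Python sketch mirrors the statements between wait(start) and done = 1. The function name compute_res is made up for illustration, and the sketch assumes the registers are wide enough that nothing is truncated, since the slide does not give register widths.

```python
def compute_res(smp: int) -> int:
    """Mirror the data-path statements of the Verilog example.

    Assumption: unbounded integers stand in for the registers,
    because the slide does not state the widths of R and RES.
    """
    r = smp * 3
    res = smp * 4 + r           # same as 7 * smp
    if (res >> 7) & 1:          # RES[7] == 1
        res = 0
    elif (res >> 6) & 1:        # RES[6] == 1
        res = 1
    return res


for smp in (5, 10, 20):
    print(smp, compute_res(smp))
```

The three sample inputs exercise the three paths: no bit set, bit 6 set, and bit 7 set.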
Controller design flow
[Figure: flow diagram with the labels HDL, Syntax-directed translation, Petri Net, Trace Expressions, Transformations, Reductions, Synthesis, Circuit]
Trace expressions: example
( a || ( b ; c ) ) || ( d e )
[Figure: expression tree with operator nodes ||, || and ; over the leaves a, b, c, d, e]
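To make the meaning of the operators concrete, here is a minimal Python sketch that enumerates the event orderings allowed by the sub-expression a || ( b ; c ). The class names Ev, Seq and Par and the function traces are illustrative, not part of any tool mentioned in the slides. The ( d e ) part is left out because its operator did not survive in the transcript.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Ev:
    name: str


@dataclass(frozen=True)
class Seq:            # left ; right
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Par:            # left || right
    left: "Expr"
    right: "Expr"


Expr = Union[Ev, Seq, Par]


def shuffle(u, v):
    """All interleavings of two event tuples."""
    if not u:
        return {v}
    if not v:
        return {u}
    return ({(u[0],) + w for w in shuffle(u[1:], v)} |
            {(v[0],) + w for w in shuffle(u, v[1:])})


def traces(e):
    """Set of complete traces of a trace expression."""
    if isinstance(e, Ev):
        return {(e.name,)}
    ls, rs = traces(e.left), traces(e.right)
    if isinstance(e, Seq):
        return {l + r for l in ls for r in rs}
    return {w for l in ls for r in rs for w in shuffle(l, r)}


expr = Par(Ev("a"), Seq(Ev("b"), Ev("c")))
result = sorted("".join(t) for t in traces(expr))
print(result)  # ['abc', 'bac', 'bca']
```

Event a may occur anywhere relative to b and c, while b always precedes c.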
Reduction example
[Figure: Petri net over the events a, b, c, d, e, f, g, h, with the trace-expression fragments d ; a ; ( b || f ) and g ; h ; e]
Transformation: concurrency reduction
• Concurrency in TE: b and f have a common parallel father
[Figure: expression tree with a || node above ; nodes and the leaves a, b, f, c, d, beside the corresponding Petri net]
Transformation: concurrency reduction
• f and b are ordered
[Figure: the same expression tree and Petri net after the reduction, with f and b in sequence]
Synthesis
• Place-based encoding (based on a David-cell approach)
• Transformations to improve area and performance
• Structural methods to derive a circuit [Pastor et al., Transactions on CAD, Nov'98]
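Place-based encoding
• Marking codes: 1100, 0010, 0001
• ER(t1) = 111-
• ER(t2) = --11
[Figure: Petri net fragment with places p1, p2, p3, p4 and transitions t1, t2, beside the place-signal transitions p1+, p2+, p3+, p4+, p1-, p2-, p3-, p4-]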
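The idea of using one signal per place can be sketched with a small token game in Python. The net structure below (t1 consumes p1 and p2 and marks p3, t2 consumes p3 and marks p4) is an assumption read off the marking codes 1100, 0010 and 0001, since the drawing is not available. The sketch only shows the stable codes and does not model the intermediate codes visited while individual place signals switch.

```python
PLACES = ("p1", "p2", "p3", "p4")

# Assumed structure: transition -> (input places, output places)
TRANSITIONS = {
    "t1": ({"p1", "p2"}, {"p3"}),
    "t2": ({"p3"}, {"p4"}),
}


def code(marking):
    """One bit per place: 1 if the place holds a token."""
    return "".join("1" if p in marking else "0" for p in PLACES)


def enabled(marking, t):
    pre, _ = TRANSITIONS[t]
    return pre <= marking


def fire(marking, t):
    """Fire t in a safe net: remove input tokens, add output tokens."""
    pre, post = TRANSITIONS[t]
    if not pre <= marking:
        raise ValueError(f"{t} is not enabled")
    return (marking - pre) | post


m0 = frozenset({"p1", "p2"})
m1 = fire(m0, "t1")
m2 = fire(m1, "t2")
print(code(m0), code(m1), code(m2))  # 1100 0010 0001
```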
Synthesis example: VME bus
[Figure: STG of the VME bus controller over the signals DSr, LDS, LDTACK, D and DTACK, and its place encoding with the place signals p1 to p11]
VME bus spec after transforms
[Figure: the VME bus specification after reductions and transforms, over the signals dsr, lds, ldtack, d and dtack, with the remaining place signals]
Deriving next-state function
• Next-state function of signal y?
[Figure: Petri net with places p1 to p7 and transitions x+, x-, y+, y-, z+, z-; the places are annotated with the cubes 000, 1-0, 1-1, 0-1, -0-, -1-, 010]
Deriving next-state function
• Next-state function of signal y?
• y = x + z
[Figure: the same Petri net, with the additional cubes 10-, -01, 11-, -11 annotated on its places]
Conclusion
• Initial prototype of an automated flow without state explosion for ASIC design
  • From HDLs (control/data splitting)
  • Existing tools for data-path synthesis
  • Direct synthesis guarantees implementation (HDL to Petri net, Petri-net-based encoding)
• Synthesis of large controllers by efficient spec models (free-choice Petri nets + trace expressions)
• Exploration of the design space (optimization) by property-preserving transformations
• Logic synthesis by structural methods
• Quality of design often acceptable
• Timing post-optimization can be applied
Synthesis from asynchronous HDL
• CSP-based languages
  • CSP = communicating sequential processes [T. Hoare]
• Two synthesis techniques
  • based on program transformations [Caltech]
  • based on direct compilation [Philips]
• Tools are more mature than for asynchronous synthesis from standard HDL
• A complete shift in design methodology is required
Using CSP for control generation
• After li goes high, do a full handshake on the right, then complete the handshake on the left, and iterate.
[Figure: Q element with left port (li, lo) and right port (ro, ri)]
STG: li+ ro+ ri+ ro- ri- lo+ li- lo-
CSP: *[[li];ro+;[ri];ro-;[not ri];lo+;[not li];lo-]
• ";" = sequencing operator
• ro+ = ro goes high; ro- = ro goes low
• [li] = wait until li is high; [not li] = wait until li is low
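The CSP program above can be read as a list of waits and assignments. The Python sketch below steps through one iteration against a four-phase environment and records the resulting events. The environment model (li toggles whenever it equals lo, ri follows ro) and the names PROGRAM and run are assumptions made for illustration.

```python
# One loop body of *[[li];ro+;[ri];ro-;[not ri];lo+;[not li];lo-]
PROGRAM = [
    ("wait", "li", 1), ("set", "ro", 1),
    ("wait", "ri", 1), ("set", "ro", 0),
    ("wait", "ri", 0), ("set", "lo", 1),
    ("wait", "li", 0), ("set", "lo", 0),
]


def run(cycles=1):
    s = dict(li=0, ri=0, ro=0, lo=0)
    trace = []

    def emit(sig, val):
        s[sig] = val
        trace.append(sig + ("+" if val else "-"))

    for _ in range(cycles):
        for kind, sig, val in PROGRAM:
            if kind == "set":
                emit(sig, val)
                continue
            # Wait step: let the assumed environment move until the guard holds.
            while s[sig] != val:
                if s["li"] == s["lo"]:        # left side: li+ after lo-, li- after lo+
                    emit("li", 1 - s["li"])
                elif s["ri"] != s["ro"]:      # right side: ri follows ro
                    emit("ri", s["ro"])
                else:
                    raise RuntimeError("deadlock: no environment event possible")
    return trace


print(" ".join(run()))  # li+ ro+ ri+ ro- ri- lo+ li- lo-
```

Under these assumptions the recorded events match the STG sequence given on the slide.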
Using CSP for control generation
CSP: *[[li];ro+;[ri];ro-;[not ri];lo+;[not li];lo-]
Production rules:
  li -> ro+; ri -> ro-
  not ri -> lo+; not li -> lo-
[Figure: gate for ro driven by li and ri, with a weak feedback element]
• Conflict: the guards of ro+ and ro- are not mutually exclusive (since ri+ and li+ are not)
• Eliminate the conflict by state signal insertion (= CSC)
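The conflict can be observed by replaying the handshake sequence and testing both guards of ro in every state reached. This is a minimal sketch with illustrative names (TRACE, RULES, interfering_states); it checks only the states along this one sequence, not a full state space.

```python
TRACE = ["li+", "ro+", "ri+", "ro-", "ri-", "lo+", "li-", "lo-"]

# signal -> (guard of the rising rule, guard of the falling rule)
RULES = {
    "ro": (lambda s: s["li"], lambda s: s["ri"]),
}


def interfering_states(signal):
    """Events after which both guards of `signal` hold at the same time."""
    up, down = RULES[signal]
    s = dict(li=0, ri=0, ro=0, lo=0)
    found = []
    for ev in TRACE:
        s[ev[:-1]] = 1 if ev[-1] == "+" else 0
        if up(s) and down(s):
            found.append((ev, dict(s)))
    return found


for ev, state in interfering_states("ro"):
    print("after", ev, state)
```

Both guards hold from ri+ until ri-, which is exactly where the circuit is supposed to lower ro.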
Conflict elimination
CSP: *[[li];ro+;[ri];x+;[x];ro-;[not ri];lo+;[not li];x-;[not x];lo-]
Production rules:
  not x and li -> ro+; x or not li -> ro-
  x and not ri -> lo+; not x or ri -> lo-
  ri -> x+; not li -> x-
[Figure: implementation with a flip-flop (FF) holding x and not x, inputs li and ri, outputs ro and lo]
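A quick way to sanity-check the production rules above is to execute them against an environment and compare the event sequence with the CSP program. The Python sketch below does this for one cycle. The four-phase environment in ENV and the all-zero initial state are assumptions, and the simulator simply fires the single enabled transition at each step.

```python
# (signal, target value, guard) taken from the production rules
RULES = [
    ("ro", 1, lambda s: not s["x"] and s["li"]),
    ("ro", 0, lambda s: s["x"] or not s["li"]),
    ("lo", 1, lambda s: s["x"] and not s["ri"]),
    ("lo", 0, lambda s: not s["x"] or s["ri"]),
    ("x", 1, lambda s: s["ri"]),
    ("x", 0, lambda s: not s["li"]),
]

# Assumed four-phase environment on both ports
ENV = [
    ("li", 1, lambda s: not s["lo"]),
    ("li", 0, lambda s: s["lo"]),
    ("ri", 1, lambda s: s["ro"]),
    ("ri", 0, lambda s: not s["ro"]),
]


def enabled(s):
    """Transitions whose guard holds and that would change the signal."""
    return [(sig, val) for sig, val, guard in RULES + ENV
            if guard(s) and s[sig] != val]


def simulate(steps=10):
    s = dict(li=0, ri=0, ro=0, lo=0, x=0)
    trace = []
    for _ in range(steps):
        choices = enabled(s)
        if len(choices) != 1:
            raise RuntimeError(f"expected one enabled transition, got {choices}")
        sig, val = choices[0]
        s[sig] = val
        trace.append(sig + ("+" if val else "-"))
    return trace


print(" ".join(simulate()))
# li+ ro+ ri+ x+ ro- ri- lo+ li- x- lo-
```

With this environment exactly one transition is enabled in every state, and the sequence follows the order of the CSP program with the state signal x.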
Conclusions
• Generating circuits from a CSP control program is similar to STG synthesis
• One can be reduced to the other
• The particular technique may vary. Direct CSP program transformations can be (and were) used instead of methods based on state space generation
• See reference list for more details
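Buffer example in Tangram

    (a?byte & b!byte)
    begin
      x0: var byte
    | forever do a?x0 ; b!x0 od
    end

• Each circle is mapped to a netlist
[Figure: Buffer block with ports a and b; handshake-component diagram with the circles *, ; and T, T around the variable x, marked with passive and active ports; the ";" component corresponds to a Q element; data path between a and b]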
Summary
• A Tangram program is partitioned into data path and control
• The data path is implemented as dual rail or single rail
• Control is mapped to a composition of standard elements (";", "||", etc.)
• Each standard element is mapped to a circuit
• Post-optimization is done
• Composing islands of control elements and re-synthesizing them with STGs can give more aggressive optimization
• Philips made a few chips using Tangram, including a product: an 8051 micro-controller in the low-power pager Muna (25 weeks of battery life from one AAA battery)
• A similar approach is used in Balsa (Manchester Univ., public domain)
Burst mode FSM
• Close to synchronous FSMs with binary encoded I/O
• Work in bursts:
  • Input transitions fire
  • Output transitions fire
  • State signals change
• Mostly limited to fundamental mode: the next input burst cannot arrive before stabilization at the outputs
[Figure: burst-mode state graph with states s1, s2, s3, s4 and arcs labeled a+b+/y+, c+/y-, c-/y+, a-/x+y-, b-/x-]
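The burst behaviour can be sketched as a small interpreter in Python: input events are collected until they form a complete input burst of the current state, then the output burst fires and the state changes. The arc labels come from the slide, but the source and target state of each arc in SPEC are an assumption, because the drawing is not available. The sketch does not reject unexpected inputs or check fundamental-mode timing.

```python
# state -> list of (input burst, output burst, next state)
# Arc endpoints are assumed; only the labels appear in the slide text.
SPEC = {
    "s1": [({"a+", "b+"}, ["y+"], "s2")],
    "s2": [({"c+"}, ["y-"], "s3"), ({"a-"}, ["x+", "y-"], "s4")],
    "s3": [({"c-"}, ["y+"], "s2")],
    "s4": [({"b-"}, ["x-"], "s1")],
}


def run(input_events, state="s1"):
    """Return the final state and the list of output bursts fired."""
    pending = set()
    outputs = []
    for ev in input_events:
        pending.add(ev)
        for burst, outs, nxt in SPEC[state]:
            if burst == pending:          # input burst complete
                outputs.append(outs)
                state, pending = nxt, set()
                break
    return state, outputs


final, bursts = run(["b+", "a+", "c+", "c-", "a-", "b-"])
print(final, bursts)
```

In state s1 the inputs a+ and b+ may arrive in either order; y+ fires only once both have been seen.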
Extended Burst mode
• Directed don't cares (b*): some concurrency is allowed for input transitions that do not influence an output burst
• Conditional guards: <b+> = "if b=1 then …"
[Figure: extended burst-mode state graph with states s1, s2, s3, s4 and arcs labeled a+b*/y+, <b+>c+/y-, c-/y+, <b+>a-/x+y-, b-/x-]
Synthesis of XBM
• Next-state and output functions must be free of functional and logic hazards
• Sequential feedbacks should not introduce new hazards
• State assignment
  • one state of the BM spec is mapped to one layer of the Karnaugh map
  • compatible layers are merged
  • layers are compatible if merging does not introduce CSC violations or hazards
• Layers are encoded using a race-free encoding
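XBM and STG
[Figure: the extended burst-mode graph (states s1, s2, s3, s4; arcs a+b*/y+, <b+>c+/y-, c-/y+, <b+>a-/x+y-, b-/x-) beside an equivalent STG over the transitions a+, a-, b+, b-, c+, c-, x+, x-, y+, y- with a dummy transition eps]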
Summary
• Specification: XBM is a subclass of STGs
• Synthesis: techniques are extensions of synchronous state assignment and logic minimization
• Timing:
  • the environment is limited to fundamental mode (difficult for pipelined and highly concurrent systems)
  • internals are delay insensitive
• See reference list for details