
Synthesis of Speed Independent Circuits Based on Decomposition

This presentation discusses the synthesis of speed independent circuits by decomposing the specification with respect to each output, with the aim of reducing the cost of synthesis.


Presentation Transcript


  1. Synthesis of Speed Independent Circuits Based on Decomposition
     Tomohiro Yoneda (National Institute of Informatics, Tokyo Institute of Technology)
     Hiroomi Onda (Tokyo Institute of Technology)
     Chris Myers (University of Utah)

  2. Background
     • High-level synthesis
       • plays an important role in pushing asynchronous design into wide use
     • Major approach to high-level synthesis
       • Prepare basic cells that correspond to specification language constructs
       • Translate specifications to basic cell networks in a syntax-directed manner, with local optimizations
       • Very efficient
       • Global optimization may be difficult
     2004/4/21 Async2004

  3. Challenge
     • Our approach to high-level synthesis
       • Translate the high-level spec to a low-level spec (time Petri nets)
       • Use a timed logic synthesis technique
     • Global optimization becomes possible through
       • logic optimization
       • timing information
     • The cost of synthesis is very high

  4. How to reduce the cost
     • Translation technique to low-level spec
       • guarantees that the low-level spec has CSC by adding sufficient state variables
       • Idea: [Yoneda, Myers 2003]
       • Developing a Balsa compiler
     • Efficient logic synthesis technique (goal of this work)
       • decomposes the low-level spec w.r.t. each output
       • synthesizes each sub-circuit from its sub-spec
     • In this paper, speed independent circuit synthesis is discussed

  5. Decomposition based synthesis
     • Input
       • STG
         • 1-safe
         • output semi-modular
         • with CSC (Complete State Coding)
         • several more restrictions
     • Output
       • Reduced STG for each output
       • A g-C or atomic-gate implementation is synthesizable
     • Feature
       • Only state graphs for the reduced STGs are necessary
       • It is not necessary to explore the reachable states of the original STG

  6. Key issue - input set determination
     [Figure: gC gates with the signals req1, req2, ack1, ack2, and csc]

  7. Key issue - input set determination
     [Figure: gC gates with the signals req1, req2, ack1, ack2, and csc, with a "Reduction" label]

  8. Related works
     • Synthesizing each output separately
       • T.A. Chu, Synthesis of Self-Timed VLSI Circuits from Graph-theoretic Specification, PhD thesis, MIT, 1987
         • No idea for input set determination
       • R. Puri, J. Gu, A Modular Partitioning Approach for Asynchronous Circuit Synthesis, IEEE TCAD, 1995
         • Input set determination is performed based on the state graph of the original STG
         • Input signals are kept, if hiding them does not increase the number of CSC conflicts
       • W. Vogler, R. Wollowski, Decomposition in Asynchronous Circuit Design, Tech Report, Univ. Augsburg, 2002
         • STG reduction technique - net contraction - is formalized
         • No general idea for input set determination

  9. Our approach
     Step 1: Select possible trigger signals as the initial input set
     Step 2: Contract the original STG by deleting all signals except the output and those in the current input set
     Step 3: If the reduced STG has CSC, done
     Step 4: Otherwise, choose appropriate signals and add them to the input set
     Step 5: Go to Step 2
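The five steps above can be read as a simple refinement loop. The following Python sketch shows only that control flow. The callbacks `trigger_signals`, `contract`, `has_csc`, and `choose_signals` are hypothetical placeholders for the operations named on the slide; they are not taken from the authors' tool.

```python
from typing import Callable, FrozenSet, Iterable, Set

Signal = str


def find_input_set(
    output: Signal,
    all_signals: Iterable[Signal],
    trigger_signals: Callable[[Signal], Set[Signal]],
    contract: Callable[[FrozenSet[Signal]], object],
    has_csc: Callable[[object], bool],
    choose_signals: Callable[[object, Set[Signal]], Set[Signal]],
) -> Set[Signal]:
    """Grow the input set for one output until the reduced STG has CSC."""
    universe = set(all_signals) - {output}
    inputs = set(trigger_signals(output))                 # Step 1
    while True:
        # Step 2: keep only the output and the current input set.
        reduced = contract(frozenset(inputs | {output}))
        if has_csc(reduced):                              # Step 3
            return inputs
        # Step 4: pick additional signals among those not yet kept.
        extra = choose_signals(reduced, universe - inputs)
        if not extra:
            raise RuntimeError("no further signals can be added")
        inputs |= extra                                   # Step 5: repeat


# Toy stand-ins (assumptions): the "reduced STG" is just the kept signal set,
# and it is declared to have CSC as soon as signal "b" is kept.
result = find_input_set(
    "x",
    ["a", "b", "c", "x"],
    trigger_signals=lambda out: {"a"},
    contract=lambda kept: kept,
    has_csc=lambda reduced: "b" in reduced,
    choose_signals=lambda reduced, rest: {"b"} & rest,
)
```

With these toy callbacks the loop runs twice and ends with the input set {a, b}.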

  10. Contraction
      [Figure: the original STG is contracted to a reduced STG; possible trigger signals are marked]
      • Contraction is a bisimilar translation (i.e., by W. Vogler, R. Wollowski)

  11. Issues to be discussed
      • If the reduced STG has CSC, is a correct speed independent circuit synthesized from it?
      • How can appropriate signals be chosen without the state graph of the original STG?
      • How large is the overhead (performance degradation of the synthesized circuit)?

  12. Issues to be discussed (same list as slide 11)

  13. An example
      [Figure: state graph of the original STG over the signals (a b c x), with transitions a+/1, a+/2, a-/1, a-/2, b+, b-, c+, c-, x+, x- and the regions ES(x+) and ES(x-) marked]

  14. If a is deleted
      [Figure: state graph over the signals (a b c x), with the regions C_D(ES(x+)) and C_D(ES(x-)) marked]
      • C_D(S): the extension of the set S obtained by deleting the signals in D

  15. Irrelevant input set
      • A set D of signals is an irrelevant input set for an output x, if
        • D ⊆ In ∪ Out - {x}
        • C_D(ES(x+)) - UR = ES(x+)
        • C_D(ES(x-)) - UR = ES(x-)
      • In: input signal set of the original STG
      • Out: output signal set of the original STG
      • UR: unreachable state set of the original STG
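The definition above can be checked mechanically when the state sets are small enough to enumerate. The sketch below assumes one particular reading of the extension operator (CD(S) on the slides, written C_D(S) here): every state that agrees with some state of S on all signals outside D. That reading, the function names, and the three-signal toy state graph are assumptions for illustration; the slides only describe C_D(S) informally.

```python
from itertools import product


def extend_by_deleting(states, deleted, signals):
    """Assumed reading of C_D(S): all states that agree with some state of S
    on every signal that is not in D."""
    keep = [i for i, s in enumerate(signals) if s not in deleted]
    projections = {tuple(st[i] for i in keep) for st in states}
    return {
        st
        for st in product((0, 1), repeat=len(signals))
        if tuple(st[i] for i in keep) in projections
    }


def is_irrelevant(deleted, output, es_plus, es_minus, reachable, signals):
    """Check D within In u Out - {x}, and C_D(ES) - UR = ES for both regions."""
    if output in deleted or not set(deleted) <= set(signals):
        return False
    for es in (es_plus, es_minus):
        # Subtracting UR is the same as intersecting with the reachable states.
        if (extend_by_deleting(es, deleted, signals) & reachable) != es:
            return False
    return True


# Toy state graph (hypothetical, not the slide's example), codes are (a, b, x):
# 000 -a+-> 100 -b+-> 110 -x+-> 111 -a--> 011 -b--> 001 -x--> 000
SIGNALS = ("a", "b", "x")
REACHABLE = {(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1), (0, 1, 1), (0, 0, 1)}
ES_PLUS = {(1, 1, 0)}   # x+ is excited
ES_MINUS = {(0, 0, 1)}  # x- is excited

a_ok = is_irrelevant({"a"}, "x", ES_PLUS, ES_MINUS, REACHABLE, SIGNALS)
b_ok = is_irrelevant({"b"}, "x", ES_PLUS, ES_MINUS, REACHABLE, SIGNALS)
```

In this toy graph deleting a only adds unreachable states to the regions, so {a} passes the check, while deleting b merges the reachable state 100 into the extension of ES(x+), so {b} fails.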

  16. If a is deleted
      [Figure: state graph over the signals (a b c x), with the regions C_D(ES(x+)) and C_D(ES(x-)) marked]
      • C_D(ES(x+)) - UR ≠ ES(x+)
      • C_D(ES(x-)) - UR ≠ ES(x-)
      • {a} is not an irrelevant input set
      • If a non-irrelevant input set is deleted, the reduced STG does not have CSC

  17. If c is deleted
      [Figure: state graph over the signals (a b c x), with the regions C_D(ES(x+)) and C_D(ES(x-)) marked]
      • C_D(ES(x+)) - UR = ES(x+)
      • C_D(ES(x-)) - UR = ES(x-)
      • {c} is an irrelevant input set
      • If an irrelevant input set (including no possible trigger signals) is deleted, a correct circuit is obtained from the reduced STG

  18. Theorem 1
      • For an STG G that has CSC and is output semi-modular, if a reduced STG G' obtained from G by deleting some signal set V (including no possible trigger signals) has CSC, then a correct circuit is obtained from G'
      • If V is not an irrelevant input set, G' cannot have CSC → V must be an irrelevant input set → a correct circuit is obtained from G'

  19. Issues to be discussed (same list as slide 11)

  20. Contraction with initial input set
      [Figure: the original STG is contracted to a reduced STG; possible trigger signals are marked]

  21. Checking CSC
      • Constructing the state graph of the reduced STG
      [Figure: reduced STG and its state graph with the states 00, 1R, 11, 0F, 00, 10 and the transitions a+/1, x+, a-/1, x-, a+/2, a-/2; the states 1R and 10 are marked as a CSC conflict]
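A CSC check on an explicit state graph amounts to grouping states by their binary code and comparing which output transitions each one enables. The sketch below is a minimal illustration with a hypothetical data layout (a dict from state id to code and enabled transitions). The six states are one reading of the slide's figure over the signals (a x), so the exact arcs are an assumption.

```python
from collections import defaultdict


def csc_conflicts(states, outputs):
    """states: dict state_id -> (code, set of enabled transitions such as 'x+'
    or 'a+/1'). Returns pairs of state ids that share a code but enable
    different sets of output transitions."""
    by_code = defaultdict(list)
    for sid, (code, enabled) in states.items():
        # Strip an optional instance suffix ("/1") and the trailing sign.
        enabled_outputs = frozenset(
            t for t in enabled if t.split("/")[0][:-1] in outputs
        )
        by_code[code].append((sid, enabled_outputs))
    conflicts = []
    for group in by_code.values():
        for i in range(len(group)):
            for j in range(i + 1, len(group)):
                if group[i][1] != group[j][1]:
                    conflicts.append((group[i][0], group[j][0]))
    return conflicts


# One reading of the reduced state graph on the slide, codes are (a, x).
reduced_sg = {
    "s0": ("00", {"a+/1"}),
    "s1": ("10", {"x+"}),    # shown as 1R: x is about to rise
    "s2": ("11", {"a-/1"}),
    "s3": ("01", {"x-"}),    # shown as 0F: x is about to fall
    "s4": ("00", {"a+/2"}),
    "s5": ("10", {"a-/2"}),  # same code as s1, but x+ is not enabled
}
conflicts = csc_conflicts(reduced_sg, outputs={"x"})
```

Here s0 and s4 share the code 00 but enable no output transition in either state, so only the pair (s1, s5) is reported.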

  22. Guided Simulation
      [Figure: the abstracted trace in the state graph of the reduced STG (states 00, 1R, 11, 0F, 00, 10) next to the original trace in the state graph of the original STG; interface and noninterface transitions are marked]
      • The original trace can be obtained by simulating the original STG, without requiring the state graph of the original STG

  23. Generating original trace
      [Figure: original STG with the noninterface transitions t1, t2, t3 and the interface transitions a+, b+]
      • abstracted trace: a+ b+
      • original trace: t2 t3 a+ b+
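Generating an original trace from an abstracted trace can be sketched as a guided search over the original net: fire noninterface transitions freely, and fire an interface transition only when it is the next one in the abstracted trace. The sketch below assumes a 1-safe Petri net encoded as a dict (a hypothetical encoding) and uses depth-first search with backtracking. The small net is invented so that the result matches the traces on the slide; it is not the slide's actual STG.

```python
def guided_simulation(net, initial_marking, abstracted_trace):
    """net: dict name -> (preset, postset, label), where label is the
    interface transition label (e.g. 'a+') or None for a noninterface
    transition. Returns a firing sequence whose interface labels spell
    abstracted_trace, or None if there is none."""

    def fire(marking, t):
        pre, post, _ = net[t]
        return (marking - pre) | post

    def search(marking, pos, seen):
        if pos == len(abstracted_trace):
            return []
        if (marking, pos) in seen:          # already explored, or a cycle
            return None
        seen.add((marking, pos))
        for t, (pre, _, label) in net.items():
            if not pre <= marking:          # t is not enabled
                continue
            if label is None:               # noninterface: fire freely
                rest = search(fire(marking, t), pos, seen)
            elif label == abstracted_trace[pos]:
                rest = search(fire(marking, t), pos + 1, seen)
            else:                           # interface, but not the next one
                continue
            if rest is not None:
                return [t] + rest
        return None                         # dead end: caller backtracks

    return search(frozenset(initial_marking), 0, set())


# Hypothetical net: t1 and t2 are in conflict, and t1 leads to a dead end.
net = {
    "t1": (frozenset({"p0"}), frozenset({"r1"}), None),
    "t2": (frozenset({"p0"}), frozenset({"p1"}), None),
    "t3": (frozenset({"p1"}), frozenset({"p2"}), None),
    "a+": (frozenset({"p2"}), frozenset({"p3"}), "a+"),
    "b+": (frozenset({"p3"}), frozenset({"p4"}), "b+"),
}
original_trace = guided_simulation(net, {"p0"}, ["a+", "b+"])
```

The search first tries t1, hits a dead end, backtracks, and then finds t2 t3 a+ b+. Without conflicting transitions no backtracking would occur.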

  24. Analysis of original trace
      [Figure: original trace over the signals (a b c x), with interface and noninterface signals marked]
      • Find a noninterface signal that certainly changes an odd number of times in the marked part of the trace

  25. Analysis of original trace
      [Figure: original trace over the signals (a b c x), with interface and noninterface signals marked]
      • Add b to the input set
      • This resolves the CSC conflict

  26. Analysis of original trace
      [Figure: original trace over the signals (a b c x), with interface and noninterface signals and a "concurrent" annotation marked]
      • c also seems to satisfy this condition, but actually it does not
      • Select a noninterface signal that certainly changes an odd number of times in the marked part of the trace
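The counting part of this analysis is easy to sketch: take the segment of the original trace between the two conflicting states and keep the noninterface signals that change an odd number of times. The function name and the trace segment below are illustrative assumptions, not read off the slide's figure. The sketch covers only the parity test, so a signal such as c that passes it still has to be checked against the causality conditions.

```python
from collections import Counter


def odd_changing_signals(trace_segment, interface_signals):
    """Return the noninterface signals that change an odd number of times
    in the given segment of the original trace."""
    # "a+/2" -> "a": drop the optional instance suffix and the sign.
    counts = Counter(t.split("/")[0][:-1] for t in trace_segment)
    return {
        s for s, n in counts.items()
        if n % 2 == 1 and s not in interface_signals
    }


# Hypothetical segment between two states with the same interface code.
segment = ["x+", "a-/1", "x-", "c-", "b-", "a+/2"]
candidates = odd_changing_signals(segment, interface_signals={"a", "x"})
```

In this segment a and x each change twice while b and c each change once, so both b and c come out as candidates on parity alone.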

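  27. Formalization
      • w is odd-confined by f1:
        • w changes an odd number of times in f1
        • if w changes in f0, then w_e1 ⇒ e1
        • e1 ⇒ w_s2
        • w_e2 ⇒ e2
      • "⇒" represents the causality relation obtained from the structure of the STG
      [Figure: original trace from the initial state, divided into f0 and f1 at the CSC-conflict states; w_e1 is the last w in f0, w_s2 is the first w in f1, w_e2 is the last w in f1; e1 and e2 are marked at the CSC-conflict states, with an "interface signal" label]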

  28. Analysis of original trace
      [Figure: original trace over the signals (a b c x), divided into f0 and f1, with interface and noninterface signals marked]
      • b is odd-confined by f1
      • c is not odd-confined by f1

  29. Theorem 2
      • If w (and u_i) satisfies the following condition, adding w (and u_i) resolves the CSC conflict in f1 (sufficient condition)
        • w is odd-confined by f1
        • w does not change in l
        • If w changes before the first interface signal, for each odd i, u_i is odd-confined by h_i with the causality relation shown in the figure
      • For one CSC conflict, there exist many candidate sets of signals
        ↓
      • Analyze every CSC conflict
      • Set up the covering problem and solve it
      [Figure: trace divided into f0, f1, h1, h3, and l, with the transitions w+, w-, u+ and the interface transitions marked]
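The covering step can be illustrated as follows: each CSC conflict offers several candidate signal sets, and one candidate must be chosen per conflict so that the union of added signals stays small. The slides do not say how the covering problem is solved, so the exhaustive search below is only an assumed stand-in that is practical for small instances. The function name and the candidate sets are hypothetical.

```python
from itertools import product


def smallest_cover(candidates_per_conflict):
    """candidates_per_conflict: one list per CSC conflict, each containing
    the candidate signal sets that would resolve that conflict. Choose one
    candidate per conflict so that the union of chosen signals is smallest."""
    best = None
    for choice in product(*candidates_per_conflict):
        union = frozenset().union(*choice)
        if best is None or len(union) < len(best):
            best = union
    return best


# Hypothetical candidates for two CSC conflicts.
candidates = [
    [{"b"}, {"c", "d"}],   # conflict 1 is resolved by b, or by c and d
    [{"b"}, {"d"}],        # conflict 2 is resolved by b, or by d
]
to_add = smallest_cover(candidates)
```

Here adding b alone resolves both conflicts, so the search returns {b}. The number of combinations grows exponentially with the number of conflicts, so a real tool would presumably use a dedicated covering solver.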

  30. Drawback
      • For an STG with conflicting transitions, backtracking may be needed
        • Finding the noninterface transitions that actually fired is no longer deterministic once conflicting transitions have been deleted
        • If many conflicting transitions exist, backtracking can be costly
      • Approaches that seem practical are to
        • Keep all conflicting transitions, even if they are not related to backtracking, or
        • Manually specify some of the necessary conflicting transitions
      • Our compiler from a high-level language can automatically specify those conflicting transitions

  31. Issues to be discussed (same list as slide 11)

  32. Experimental results
      • Experiments
        • Implementation of the proposed method in C
        • Pentium 2.8GHz, 4GB memory
        • Final logic synthesis tool: petrify -gc -eqn
      • Benchmarks
        • Instruction cache controller of TITAC2
          • generated from a high-level spec by our compiler
          • large, but simple → input signal sets are small
          • compiler decisions are used for specifying conflicting transitions
        • Controllers of various filters
          • manually designed
          • medium, but complicated → input signal sets are large
          • all conflicting transitions are kept
        • Async Benchmarks
          • small and simple
          • all conflicting transitions are kept

  33. Experimental results
      • CPU times, memory usage
        • Benchmark1: significantly reduced
        • Benchmark2: reduced
      • Quality of synthesized circuits
        • Area (num. of transistors): almost no overhead

  34. Experimental results
      • For Async Benchmarks
        • Quality: no overhead (exactly the same area)
        • Cost: advantageous only for the largest specs
      [Figure: CPU times (sec), Proposed vs. Petrify]

  35. Conclusion
      • New algorithm to find input signal sets for the decomposition based synthesis method
        • the state graph of the original STG is not necessary
        • can handle larger circuits
      • Logic synthesis tool: NUTAS
        • Linux binary is downloadable from http://research.nii.ac.jp/~yoneda
      • Future work
        • extend the algorithms to support timed circuit synthesis
        • finish the compiler development for high-level synthesis and integrate both
