National Sun Yat-sen University Embedded System Laboratory
A Low-latency GALS Interface Implementation
Yuan-Teng Chang; Wei-Che Chen; Hung-Yue Tsai; Wei-Min Cheng; Chang-Jiu Chen; Fu-Chiung Cheng
Dept. of Comput. Sci., Nat. Chiao Tung Univ., Hsinchu, Taiwan
Circuits and Systems (APCCAS), 2010 IEEE Asia Pacific Conference on
Presenter: Ching-Hua Huang
Abstract
With VLSI technology improving rapidly, SoC has become the most important VLSI application. However, clock distribution and low power have become the two most important issues in SoC design. It is also important to integrate IPs so that they operate correctly with different clocks. Asynchronous circuits may resolve these problems by removing the "clock" signal, but implementing an entire circuit asynchronously is too hard. The GALS (Globally-Asynchronous Locally-Synchronous) design methodology balances these concerns by wrapping each synchronous design in an asynchronous interface. Each part of the circuit can then operate with its own clock, and communication between parts takes place over asynchronous channels. GALS provides reliable communication between modules. However, the latency of the GALS interface may seriously degrade performance, so reducing that latency is significant. In this paper, we implemented a small and simple stretchable-clock based GALS wrapper with low latency in Verilog HDL and synthesized the design with the TSMC 0.13 μm cell library. We also showed that the wrapper operates correctly with modules whose clock frequencies differ greatly. In addition, we recommend adding a FIFO storage element on the transmission path.
What's the problem
• IPs must perform operations correctly with different clocks.
• Synchronous circuits work by a "clock" signal
  • Some drawbacks
• Asynchronous circuits work by handshake protocols
  • High implementation costs and difficulties
• GALS (Globally-Asynchronous Locally-Synchronous) design methodology
  • Integrates the advantages of both synchronous and asynchronous circuits
• The latency of GALS seriously degrades performance.
• A stretchable-clock based GALS wrapper with low latency.
Related work
• [1,2,3,4] Some drawbacks of synchronous circuits
  1. clock skew
  2. difficulty in clock distribution
  3. worst-case performance
  4. not modular
  5. sensitive to variations in physical parameters
  6. synchronization failure
  7. noise (EMI)
• How to deal with these drawbacks
• [5] Asynchronous circuits
  • handshake protocols
  • high implementation costs and difficulties
• [6] To integrate the advantages of both synchronous and asynchronous circuits, the GALS methodology was proposed
  • GALS first appeared in 1984
• [7] GALS systems
  1. Input controller
  2. Output controller
• [8] GALS has large latency
• [9] Clock generators
  1. Pausible clock generator
  2. Stretchable clock generator
  • The major difference between them is the way to stop the clock
• [This paper] Reducing the latency of the asynchronous interface
Proposed method
• The new STG (Signal Transition Graph)
  • Composed of REQ, ACK, stretch, and WR (or RD)
• The proposed new wrapper
  • Input controller
  • Output controller
A pausible clock based module
1. Stoppable clock generator
2. The most commonly used approach so far
3. Uses an odd number of inverters to generate the local clock signal of the locally synchronous module
[Figure: pausible clock generator; signal labels lclk, Ri, Ai, rclk]
Truth table shown on the slide:
A B Y
0 0 1
0 1 1
1 0 1
1 1 0
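The odd-inverter ring on this slide can be illustrated with a minimal Python sketch. It assumes an idealized ring in which every stage has the same delay, so one full clock cycle takes two trips around the ring. The function names and the 0.2 ns stage delay are hypothetical and are not values from the paper. The `nand` helper simply mirrors the A/B/Y truth table above, which matches a two-input NAND; how that gate is wired into the ring is not shown in this transcript, so the sketch does not model it.

```python
def ring_oscillator_period(num_inverters: int, stage_delay_ns: float) -> float:
    """Period in ns of an idealized ring oscillator.

    Assumption: all stages have an identical delay. An edge has to travel
    around the ring twice (once per half cycle) to complete one period.
    """
    if num_inverters < 1 or num_inverters % 2 == 0:
        # An even number of inversions settles into a stable state
        # instead of oscillating.
        raise ValueError("a ring oscillator needs an odd number of inverters")
    return 2 * num_inverters * stage_delay_ns


def nand(a: int, b: int) -> int:
    """Two-input NAND, matching the A/B/Y truth table on the slide."""
    return 0 if (a and b) else 1


# Hypothetical numbers: 5 stages of 0.2 ns each.
period_ns = ring_oscillator_period(5, 0.2)
frequency_mhz = 1e3 / period_ns
print(f"period = {period_ns} ns, frequency = {frequency_mhz} MHz")
```

In this idealized model the local clock frequency is set only by the stage count and the stage delay, which is why the number of inverters is a design parameter of the local clock generator.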
A stretchable clock based module
1. The basic idea is similar to the above approach: stop the clock when data transfer occurs
2. The major difference from the above approach is the way to stop the clock
[Figure: stretchable clock generator, with two truth tables over inputs A, B and output Y (entries 0, 1, Hold)]
The symbol "C" represents a C-element, a self-timed latch
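The "Hold" behavior of a C-element can be sketched in a few lines of Python. This models the standard symmetric two-input Muller C-element: the output follows the inputs when they agree and keeps its previous value otherwise. The class name is hypothetical, and the truth tables on the slide may describe variants of this element, so treat it as a generic illustration rather than the paper's exact cell.

```python
class CElement:
    """Behavioral model of a standard two-input Muller C-element."""

    def __init__(self, initial: int = 0) -> None:
        self.y = initial  # stored output value

    def update(self, a: int, b: int) -> int:
        # When both inputs agree, the output takes their value.
        # When they differ, the output holds its previous value.
        if a == b:
            self.y = a
        return self.y


c = CElement()
inputs = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
trace = [c.update(a, b) for a, b in inputs]
print(trace)  # [0, 0, 1, 1, 0]
```

The held state is what makes the element usable as a self-timed latch: it changes only after both sides of a handshake have moved.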
The proposed new wrapper – output controller
[Figures: two slides annotating the controller's signal values (=0 / =1) step by step]
If the receiver needs to receive data
A latch to hold the data
The latch is controlled by ACK; data has to be stored correctly in the latch during the time from ACK+ to ACK-.
If a First-In-First-Out (FIFO) buffer is used instead, the sender can put the data into the FIFO and get the acknowledge earlier. Thus the sender can continue computation instead of waiting for the receiver.
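The benefit of the FIFO can be illustrated with a small, abstract timing model in Python. This is only a sketch under stated assumptions, not the paper's circuit: time is in arbitrary units, the sender computes for `send_period` between transfers, the receiver takes one item every `recv_period`, and `depth` is the number of storage slots between them (`depth=1` stands in for the single latch). The function name and all parameters are hypothetical.

```python
def sender_stall(n_items: int, send_period: int, recv_period: int, depth: int) -> int:
    """Total time the sender spends blocked waiting for a free storage slot."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    push = []  # time each item enters the storage
    pop = []   # time each item is taken by the receiver
    stall = 0
    for i in range(n_items):
        # The sender is ready again send_period after its previous transfer.
        ready = push[i - 1] + send_period if i else 0
        # A slot is free once item (i - depth) has been taken by the receiver.
        slot_free = pop[i - depth] if i >= depth else 0
        t_push = max(ready, slot_free)
        stall += t_push - ready
        # The receiver takes at most one item per recv_period.
        earliest_pop = pop[i - 1] + recv_period if i else t_push
        pop.append(max(t_push, earliest_pop))
        push.append(t_push)
    return stall


# Hypothetical burst of 4 items, fast sender and slow receiver.
print(sender_stall(4, send_period=2, recv_period=8, depth=1))  # 10
print(sender_stall(4, send_period=2, recv_period=8, depth=4))  # 0
```

In this model the FIFO absorbs a short burst completely. For a long stream with a permanently slower receiver the FIFO eventually fills, and the sender stalls again.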
Experimental environment and comparison table
• Implemented the proposed design
  • Gate level, in Verilog HDL
• Synopsys Design Compiler
  • Used to synthesize our gate-level design
  • With the TSMC 0.13 μm cell library
• Compared area and latency with two different GALS models proposed in [11]
Experimental Results
[Figure: clk sender = 555 MHz, clk receiver = 133 MHz]
[Figure: clk sender = 133 MHz, clk receiver = 555 MHz]
Conclusions
• This paper proposes a new GALS wrapper
  • Based on the four-phase handshake protocol
  • Consists of an input controller and an output controller
• The area and latency are improved.
• Compared to the C-element based design
  • The area of the new wrapper is only 30.8%
  • The latency of the new wrapper is only 39.7%
• Compared to the standard cell based design
  • The area of the new wrapper is only 63.5%
  • The latency of the new wrapper is only 55%
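The four-phase handshake that the wrapper is based on can be sketched as an event sequence in Python. This shows only the generic return-to-zero ordering REQ+, ACK+, REQ-, ACK-; the stretch and WR (or RD) signals of the paper's STG are not modeled, and the function names are hypothetical.

```python
FOUR_PHASE = ["REQ+", "ACK+", "REQ-", "ACK-"]


def transfer(words):
    """Send each word with one complete four-phase handshake cycle."""
    events, received = [], []
    for word in words:
        events.append("REQ+")   # sender raises request, data is valid
        received.append(word)   # receiver captures the data
        events.append("ACK+")   # receiver raises acknowledge
        events.append("REQ-")   # sender withdraws the request
        events.append("ACK-")   # receiver returns to the idle state
    return events, received


def is_valid_four_phase(events):
    """True if events form zero or more complete handshake cycles."""
    if len(events) % 4 != 0:
        return False
    return all(e == FOUR_PHASE[i % 4] for i, e in enumerate(events))


events, received = transfer([0xA, 0xB])
print(events)
print(is_valid_four_phase(events))  # True
```

Each word costs four signal transitions before the next transfer can start, which is one reason the latency of the interface matters.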
My comments
• This paper lists the history of GALS and its design principles
  • Like the GALS concept: Synchronous, Asynchronous, GALS
  • To ensure correct operation, the synchronous modules must be stopped when the data transfer occurs
• Improved my understanding of GALS
  • The control of the asynchronous wrapper
  • STG