
High-Speed and Low-Power On-Chip Global Link Using Continuous-Time Linear Equalizer

Yulei Zhang¹, James F. Buckwalter¹, and Chung-Kuan Cheng². ¹Dept. of ECE, ²Dept. of CSE, UC San Diego, La Jolla, CA. 19th Conference on Electrical Performance of Electronic Packaging and Systems.

  1. High-Speed and Low-Power On-Chip Global Link Using Continuous-Time Linear Equalizer • Yulei Zhang¹, James F. Buckwalter¹, and Chung-Kuan Cheng² • ¹Dept. of ECE, ²Dept. of CSE, UC San Diego, La Jolla, CA • 19th Conference on Electrical Performance of Electronic Packaging and Systems, Oct 25, 2010, Austin, USA

  2. Outline • Introduction • Equalized On-Chip Global Link • Overall structure • Basic working principle • Driver Design for On-Chip Transmission-Line • Guideline for tapered CML driver • Driver design example • Continuous-Time Linear Equalizer (CTLE) Design • CTLE modeling • CTLE design example • Driver-Receiver Co-Design for Low Energy per Bit • Methodology • Overall link design example • Conclusion

  3. Research Motivation • Global interconnect planning becomes a challenge in ultra-deep sub-micron (UDSM) processes • Performance gap between global wires and logic gates • Conventional buffer insertion adds significant power overhead • Uninterrupted wire configurations are used to tackle on-chip global communication issues • On-chip T-lines to reduce interconnect power • Equalization to improve the bandwidth • State of the art [Kim2009] • 2Gb/s/um, < 1pJ/b, signaling over a 10mm global wire in 90nm

  4. Our Contributions • Contributions • Build a novel equalized on-chip T-line structure for global communication • Tapered CML driver + CTLE receiver • Accurate small-signal modeling of the CTLE receiver to improve the optimization quality • A design methodology for driver-wire-receiver co-optimization that reduces the total energy per bit • Results of our design • 20Gbps signaling over a 10mm, 2.2um-pitch on-chip T-line • 11ps/mm latency and 0.2pJ/b energy per bit in 45nm
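The headline figures on this slide imply a few derived quantities that help when comparing links. The sketch below only does unit arithmetic on the numbers quoted above (20 Gb/s, 0.2 pJ/b, 11 ps/mm, 10 mm); the function name `link_figures` and its dictionary keys are made up for illustration and do not come from the presentation.

```python
def link_figures(bit_rate_bps, energy_per_bit_j, latency_s_per_mm, length_mm):
    """Derive power, end-to-end latency and unit interval from headline numbers."""
    return {
        # Average power is energy per bit times bits per second.
        "power_w": energy_per_bit_j * bit_rate_bps,
        # Latency scales linearly with wire length for a fixed ps/mm figure.
        "latency_s": latency_s_per_mm * length_mm,
        # One unit interval (UI) is the duration of a single bit.
        "unit_interval_s": 1.0 / bit_rate_bps,
    }


# Numbers quoted on the slide: 20 Gb/s, 0.2 pJ/b, 11 ps/mm, 10 mm.
figs = link_figures(20e9, 0.2e-12, 11e-12, 10)
print(f"power   = {figs['power_w'] * 1e3:.1f} mW")
print(f"latency = {figs['latency_s'] * 1e12:.0f} ps")
print(f"UI      = {figs['unit_interval_s'] * 1e12:.0f} ps")
```

With these inputs the link dissipates about 4 mW and has a wire latency of about 110 ps, a little over two 50 ps unit intervals.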

  5. Equalized On-Chip Global Link • Overall structure • Tapered current-mode logic (CML) drivers • Terminated differential on-chip T-line • Continuous-time linear equalizer (CTLE) receiver • Sense-amplifier based latch

  6. Basic Working Principle • Tapered CML Driver • Provide low-swing differential signals to drive the T-line • Taper factor u, number of stages N, fan-out X, final-stage current ISS, driver resistance RS • T-line • Differential wire with power/ground (P/G) shielding • Geometries (width, pitch) and termination resistance RT • CTLE Receiver • Recover the signal and improve eye quality • Load resistance RL, source degeneration resistance RD and capacitance CD, over-drive voltage Vod • Sense-amplifier based latch • Synchronize and convert the signal back to digital levels

  7. Tapered CML Driver Design • Need to design: output resistance RS, tail current ISS, transistor size W • Output swing constraint • Design guideline [Tsuchiya2006, Heydari2004] • Begin from the final stage • For a given VSW, the output resistance RS is optimized together with RT to increase the eye-opening • Transistor size • Taper factor u = 2.7 for delay reduction • Number of stages • Each previous stage is designed backward by scaling with the factor u
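The backward-scaling guideline above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the stage count follows the usual tapered-buffer relation N = ceil(ln X / ln u), with X read as the ratio of the final load to the first-stage input load, and it assumes every stage keeps the same swing ISS × R, so that current and width shrink by u while the load resistance grows by u. The function names and the example values (2 mA, 100 Ω, X = 10) are hypothetical.

```python
import math


def num_stages(fan_out, taper):
    """Assumed relation: each stage drives about `taper` times its own input load."""
    return max(1, math.ceil(math.log(fan_out) / math.log(taper)))


def taper_backward(iss_final, r_final, taper, stages):
    """Size the chain from the final stage backward.

    Assumption: constant swing per stage, so going one stage back divides
    the tail current (and transistor width) by u and multiplies the
    resistance by u.
    """
    chain = []
    for k in range(stages):
        scale = taper ** k  # k = 0 is the final stage
        chain.append({
            "stage": stages - k,
            "iss_a": iss_final / scale,
            "r_ohm": r_final * scale,
            "width_rel": 1.0 / scale,  # width relative to the final stage
        })
    return list(reversed(chain))  # first stage first


n = num_stages(fan_out=10, taper=2.7)
chain = taper_backward(iss_final=2e-3, r_final=100.0, taper=2.7, stages=n)
for st in chain:
    swing = st["iss_a"] * st["r_ohm"]
    print(st["stage"], f"{st['iss_a'] * 1e3:.3f} mA", f"{st['r_ohm']:.0f} ohm", f"{swing:.2f} V")
```

Keeping the swing constant is a design choice made here for simplicity; a real design would also check the output swing constraint and the RS/RT matching mentioned on the slide.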

  8. CML Driver Study with Loaded T-line • Assume 45nm 1P11M CMOS • T-line built on M9 with M1 as reference • T = 1.2um, H = 3.5um (fixed) • Optimize W and S for eye-opening • [Figure: change of the eye-opening with width for a fixed 2um pitch] • [Figure: change of the eye-opening with pitch for equal width/spacing]

  9. CML Driver Design Example • Experimental observations • The optimal eye occurs when width = spacing • Eye-opening improves with larger pitch • Design methodology • Choose the minimum pitch that satisfies the wire-end eye-opening requirement • Design example

  10. Accurate CTLE Modeling • Design variables: RL, RD, CD, Vod (size) [Hanumolu2005] • Small-signal circuit used to derive H(s) [figure]
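The slide's H(s) is not reproduced in this transcript. As a point of reference only, the sketch below evaluates the textbook first-order response of a source-degenerated differential pair, H(s) = [gm·RL / (1 + gm·RD/2)] · (1 + s·RD·CD) / [(1 + s·RD·CD / (1 + gm·RD/2)) · (1 + s·RL·CL)]. This ignores rds and device parasitics, so it is closer to a simple model than to the accurate model the slide proposes. The transconductance `gm`, the load capacitance `c_l`, and all numeric values are assumptions introduced for the example.

```python
import math


def ctle_gain(freq_hz, gm, r_l, r_d, c_d, c_l):
    """Textbook first-order CTLE response (no rds, no parasitics).

    gm and c_l are assumed parameters; r_l, r_d, c_d match RL, RD, CD.
    Returns the complex gain at freq_hz.
    """
    s = 2j * math.pi * freq_hz
    degen = 1 + gm * r_d / 2            # degeneration factor
    dc_gain = gm * r_l / degen          # low-frequency gain
    zero = 1 + s * r_d * c_d            # zero at 1 / (RD * CD)
    pole1 = 1 + s * r_d * c_d / degen   # pole at degen / (RD * CD)
    pole2 = 1 + s * r_l * c_l           # output pole at 1 / (RL * CL)
    return dc_gain * zero / (pole1 * pole2)


# Hypothetical values chosen only to show high-frequency peaking.
params = dict(gm=10e-3, r_l=200.0, r_d=400.0, c_d=100e-15, c_l=10e-15)
dc = abs(ctle_gain(0.0, **params))
hf = abs(ctle_gain(10e9, **params))
print(f"DC gain = {dc:.3f}, gain at 10 GHz = {hf:.3f}, peaking = {20 * math.log10(hf / dc):.1f} dB")
```

The zero boosts high frequencies relative to DC, which is what lets the receiver compensate the low-pass behavior of the wire.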

  11. CTLE Modeling Validation • <10% correlation error • >20% eye-opening increase • Test case: 10mm link, 16mV eye at the wire end • Blue lines: simple model that ignores rds and parasitics • Red line: model that considers rds only • Black line: the proposed accurate model

  12. CTLE Design Example • Observations of the CTLE study • Eye-opening improves as the power constraint is relaxed, but eventually saturates • Design example • Based on the pre-optimized CML driver + T-line design • Eye-opening improved by 4X after the CTLE

  13. Driver-Receiver Co-Design • Methodology • Optimize driver, wire and receiver together by setting Veye/Power as the cost function • Choose the pre-designed CML/T-line/CTLE as the initial solution • Optimization Flow • Driver-to-receiver step-response generation based on SPICE simulation and CTLE modeling • Eye-opening estimation based on the step response • SQP-based non-linear optimization • Variables: [ISS, RT, RL, RD, CD, Vod] • Performance Comparison • Option A: driver/receiver independent design • Option B: low-power driver/receiver co-design
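The slide does not say how the eye-opening is estimated from the step response. One common approach, shown here as an assumption rather than the authors' method, is worst-case (peak-distortion) analysis: sample the step response once per unit interval, difference it to get the single-bit pulse response, and subtract the summed inter-symbol interference from the main cursor. The function names and the sample values are hypothetical; in the flow above the samples would come from the SPICE-generated T-line response combined with the CTLE model.

```python
def pulse_from_step(step_samples):
    """Single-bit pulse response from a step response sampled once per UI.

    p[k] = s[k] - s[k-1], with the response taken as 0 before the first sample.
    """
    pulse = []
    prev = 0.0
    for s in step_samples:
        pulse.append(s - prev)
        prev = s
    return pulse


def worst_case_eye(step_samples):
    """Worst-case eye height: main cursor minus the sum of all ISI magnitudes."""
    pulse = pulse_from_step(step_samples)
    cursor = max(pulse)                          # largest sample is the main cursor
    isi = sum(abs(p) for p in pulse) - cursor    # everything else is ISI
    return cursor - isi


def cost(step_samples, power_w):
    """Cost function named on the slide: Veye / Power (to be maximized)."""
    return worst_case_eye(step_samples) / power_w


# Hypothetical normalized step response, one sample per unit interval.
step = [0.0, 0.1, 0.8, 0.95, 1.0]
eye = worst_case_eye(step)
print(f"worst-case eye = {eye:.2f} of full swing")
print(f"cost at 4 mW   = {cost(step, 4e-3):.1f} per watt")
```

An optimizer such as the SQP routine mentioned above would call a function like `cost` repeatedly while varying [ISS, RT, RL, RD, CD, Vod]; a negative eye value means the worst-case pattern closes the eye completely.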

  14. Low Energy-per-Bit Optimization Flow • Initial solution: pre-designed CML driver and pre-designed CTLE receiver • Driver-receiver co-design: change variables [ISS, RT, RL, RD, CD, Vod] • Cost function: Veye/Power • Co-design cost-function estimation: step-response-based eye estimation, using the SPICE-generated T-line step response and the receiver step response from CTLE modeling • Internal SQP (Sequential Quadratic Programming) routine to generate the best solution • Result: best set of design variables in terms of overall energy-per-bit

  15. Simulated Eye Diagrams • Methodology A: driver/receiver separate design • Methodology B: driver/receiver co-design for low power

  16. Summary of Performance Comparison • Note: the driver/receiver co-design methodology uses much larger driver and termination resistances to reduce power, which closes the eye at the driver output and at the wire end. The final eye is recovered by fully utilizing the CTLE.

  17. Conclusion • We propose a novel equalized on-chip global link using a CML driver and a CTLE receiver • Accurate CTLE modeling achieves <10% correlation error and improves the quality of eye-opening optimization • Our design achieves • 20Gbps signaling over a 10mm, 2.2um-pitch on-chip T-line • 11ps/mm latency and 0.2pJ/b energy

  18. Thank You! Q & A
