High-Speed and Low-Power On-Chip Global Link Using Continuous-Time Linear Equalizer. Yulei Zhang¹, James F. Buckwalter¹, and Chung-Kuan Cheng². ¹Dept. of ECE, ²Dept. of CSE, UC San Diego, La Jolla, CA. 19th Conference on Electrical Performance of Electronic Packaging and Systems
High-Speed and Low-Power On-Chip Global Link Using Continuous-Time Linear Equalizer • Yulei Zhang¹, James F. Buckwalter¹, and Chung-Kuan Cheng² • ¹Dept. of ECE, ²Dept. of CSE, UC San Diego, La Jolla, CA • 19th Conference on Electrical Performance of Electronic Packaging and Systems • Oct 25, 2010, Austin, USA
Outline • Introduction • Equalized On-Chip Global Link • Overall structure • Basic working principle • Driver Design for On-Chip Transmission-Line • Guideline for tapered CML driver • Driver design example • Continuous-Time Linear Equalizer (CTLE) Design • CTLE modeling • CTLE design example • Driver-Receiver Co-Design for Low Energy per Bit • Methodology • Overall link design example • Conclusion
Research Motivation • Global interconnect planning becomes a challenge in ultra-deep sub-micron (UDSM) processes • Performance gap between global wires and logic gates • Conventional buffer insertion adds a large power overhead • Uninterrupted wire configurations are used to tackle the on-chip global communication issues • On-chip T-lines to reduce interconnect power • Equalization to improve the bandwidth • State of the art [Kim2009] • 2Gb/s/um, < 1pJ/b, signaling over 10mm global wire in 90nm
Our Contributions • Contributions • A novel equalized on-chip T-line structure for global communication • Tapered CML driver + CTLE receiver • Accurate small-signal modeling of the CTLE receiver to improve the optimization quality • A design methodology for driver-wire-receiver co-optimization to reduce the total energy per bit • Results of our design • 20Gbps signaling over 10mm, 2.2um-pitch on-chip T-line • 11ps/mm latency and 0.2pJ/b energy per bit in 45nm
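The headline numbers above imply a few derived figures that are easy to check by hand. The short sketch below only does that arithmetic; it assumes the 0.2pJ/b figure covers the whole link and that bandwidth density is simply data rate divided by wire pitch (the same Gb/s/um unit used for [Kim2009]). The variable names are illustrative.

```python
# Figures quoted on the slide.
data_rate_bps = 20e9          # 20 Gbps
wire_length_mm = 10.0         # 10 mm T-line
pitch_um = 2.2                # 2.2 um wire pitch
latency_ps_per_mm = 11.0      # 11 ps/mm
energy_pj_per_bit = 0.2       # 0.2 pJ/b

# Derived quantities (assumption: energy per bit covers the whole link).
total_latency_ps = latency_ps_per_mm * wire_length_mm
link_power_mw = data_rate_bps * energy_pj_per_bit * 1e-12 * 1e3
bandwidth_density_gbps_per_um = data_rate_bps / 1e9 / pitch_um
unit_interval_ps = 1e12 / data_rate_bps
bits_in_flight = total_latency_ps / unit_interval_ps

print(f"end-to-end latency : {total_latency_ps:.0f} ps")
print(f"link power         : {link_power_mw:.1f} mW")
print(f"bandwidth density  : {bandwidth_density_gbps_per_um:.2f} Gb/s/um")
print(f"bits in flight     : {bits_in_flight:.1f}")
```

Under these assumptions the link burns a few milliwatts and holds roughly two bits on the wire at any time.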
Equalized On-Chip Global Link • Overall structure • Tapered current-mode logic (CML) drivers • Terminated differential on-chip T-line • Continuous-time linear equalizer (CTLE) receiver • Sense-amplifier based latch
Basic Working Principle • Tapered CML Driver • Provides low-swing differential signals to drive the T-line • Tapering factor u, number of stages N, fan-out X, final-stage current ISS, driver resistance RS • T-line • Differential wire w/ P/G shielding • Geometries (width, pitch) and termination resistance RT • CTLE Receiver • Recovers the signal and improves eye quality • Load resistance RL, source degeneration resistance RD and capacitance CD, over-drive voltage Vod • Sense-amplifier based latch • Synchronizes and converts the signal back to digital levels
Tapered CML Driver Design • Need to design: output resistance RS, tail current ISS, transistor size W • Output swing constraint • Design guideline [Tsuchiya2006, Heydari2004] • Begin from the final stage • For given VSW, output resistance RS optimized with RT to increase eye-opening • Transistor size • Tapering factor u = 2.7 for delay reduction • Number of stages • Each previous stage is designed backward by scaling with the factor u
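The backward-scaling guideline above can be sketched in a few lines. This is a minimal illustration, not the authors' sizing procedure: it assumes each earlier stage uses a tail current u times smaller and a load resistance u times larger, so the nominal swing I·R stays constant from stage to stage. The function name and the example values (2mA, 100Ω, three stages) are hypothetical.

```python
def tapered_cml_stages(i_ss_final, r_s_final, u=2.7, n_stages=3):
    """Return [(tail_current_A, load_resistance_ohm), ...] from first to final stage.

    Assumption: stage k counted back from the output carries i_ss_final / u**k
    and uses a load of r_s_final * u**k, which keeps I*R constant.
    """
    stages = []
    for k in range(n_stages):            # k = 0 is the final (output) stage
        scale = u ** k
        stages.append((i_ss_final / scale, r_s_final * scale))
    stages.reverse()                     # list the first stage first
    return stages


# Hypothetical example values, not taken from the slides.
stages = tapered_cml_stages(i_ss_final=2e-3, r_s_final=100.0, u=2.7, n_stages=3)
total_current = sum(i for i, _ in stages)

for idx, (i_tail, r_load) in enumerate(stages, start=1):
    print(f"stage {idx}: I = {i_tail * 1e3:.3f} mA, R = {r_load:.1f} ohm, "
          f"I*R = {i_tail * r_load * 1e3:.0f} mV")
print(f"total tail current: {total_current * 1e3:.3f} mA")
```

Transistor widths would typically scale with the tail current so that the over-drive voltage stays the same in every stage; that step is omitted here.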
CML Driver Study w/ Loaded T-line • Assume 45nm 1P11M CMOS • T-line built on M9 with M1 as reference • T = 1.2um, H = 3.5um (fixed) • Optimize W and S for eye-opening • [Figure: change of the eye-opening with width for fixed 2um pitch] • [Figure: change of the eye-opening with pitch for equal width/spacing]
CML Driver Design Example • Experimental observations • Optimal eye occurs when width = spacing • Eye-opening improves with larger pitch • Design methodology • Choose the minimum pitch that satisfies the wire-end eye-opening requirement • Design example
Accurate CTLE Modeling Design Variables: RL, RD, CD, Vod(Size) [Hanumolu2005] Small Signal Circuit to derive H(s):
CTLE Modeling Validation • <10% correlation error • >20% eye-opening increase • Test case: 10mm, 16mV eye at wire end • Blue lines: simple model that ignores rds and parasitics • Red line: considers rds only • Black line: the proposed accurate model
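For orientation, the kind of simple model referred to above (no rds, no device parasitics) is often written as a one-zero, two-pole transfer function for a source-degenerated differential pair. The sketch below uses that textbook half-circuit form; it is not the accurate model proposed in the talk. The transconductance `gm`, the output load capacitance `c_l`, the half-circuit convention for RD and CD, and all numeric values are assumptions made for illustration, and the inverting sign is omitted.

```python
import math


def ctle_gain(freq_hz, gm, r_l, r_d, c_d, c_l):
    """First-order CTLE half-circuit gain, ignoring rds and device parasitics.

    H(s) = gm*RL / (1 + gm*RD)
           * (1 + s*RD*CD) / ((1 + s*RD*CD / (1 + gm*RD)) * (1 + s*RL*CL))
    """
    s = 2j * math.pi * freq_hz
    degeneration = 1 + gm * r_d
    zero = 1 + s * r_d * c_d                     # zero at 1 / (RD*CD)
    pole_source = 1 + s * r_d * c_d / degeneration
    pole_output = 1 + s * r_l * c_l              # output pole at 1 / (RL*CL)
    return (gm * r_l / degeneration) * zero / (pole_source * pole_output)


# Hypothetical component values, not taken from the slides.
params = dict(gm=5e-3, r_l=400.0, r_d=800.0, c_d=50e-15, c_l=10e-15)

dc_gain = abs(ctle_gain(0.0, **params))
nyquist_gain = abs(ctle_gain(10e9, **params))    # 10 GHz = Nyquist at 20 Gbps

print(f"DC gain      : {20 * math.log10(dc_gain):.1f} dB")
print(f"10 GHz gain  : {20 * math.log10(nyquist_gain):.1f} dB")
print(f"peaking      : {20 * math.log10(nyquist_gain / dc_gain):.1f} dB")
```

The degeneration network lowers the low-frequency gain while CD bypasses RD at high frequency, which is what produces the high-frequency boost relative to DC.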
CTLE Design Example • Observations of CTLE study • Eye-opening improves with relaxed power constraints but tends to be saturated • Design example • Based on the pre-optimized CML driver + T-line design • Eye-opening improved by 4X after CTLE
Driver-Receiver Co-Design • Methodology • Optimize driver-wire-receiver together by setting Veye/Power as the cost function • Choose pre-designed CML/T-line/CTLE as initial solution • Optimization Flow • Driver-to-receiver step-response generation based on SPICE simulation and CTLE modeling • Eye-opening estimation based on step-response • SQP-based non-linear optimization • Variables: [ISS, RT, RL, RD, CD, Vod] • Performance Comparison • Option A: Driver/Receiver independent design • Option B: Low-power driver/receiver co-design
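The step-response-based eye estimation in the flow above can be illustrated with a standard worst-case (peak-distortion) calculation. This is a sketch of the general technique, not necessarily the authors' exact estimator: it assumes NRZ signaling, a linear time-invariant channel, a positive-going step response sampled uniformly, and it uses a first-order RC step as a stand-in for the SPICE-generated response. The function name and test values are hypothetical.

```python
import math


def worst_case_eye(step, samples_per_ui):
    """Worst-case eye height from a sampled step response (peak distortion).

    The single-bit pulse response is step(t) - step(t - UI). For each sampling
    phase, the eye height is the main cursor minus the sum of |ISI| taps.
    """
    n = samples_per_ui
    pulse = [step[i] - (step[i - n] if i >= n else 0.0) for i in range(len(step))]
    best = float("-inf")
    for phase in range(n):                 # try every sampling phase in the UI
        taps = pulse[phase::n]             # one sample per unit interval
        cursor = max(taps)
        isi = sum(abs(t) for t in taps) - abs(cursor)
        best = max(best, cursor - isi)
    return best


# Stand-in channel: first-order RC step with time constant tau = UI / 2.
samples_per_ui = 8
tau_samples = 4.0
rc_step = [1.0 - math.exp(-i / tau_samples) for i in range(200)]
ideal_step = [1.0] * 200

eye_rc = worst_case_eye(rc_step, samples_per_ui)
eye_ideal = worst_case_eye(ideal_step, samples_per_ui)
print(f"ideal channel eye : {eye_ideal:.3f} of full swing")
print(f"RC channel eye    : {eye_rc:.3f} of full swing")
```

Dividing such an eye estimate by the estimated power gives a Veye/Power figure of merit that a non-linear optimizer can maximize over [ISS, RT, RL, RD, CD, Vod].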
Low Energy-per-Bit Optimization Flow • Pre-designed CML driver + pre-designed CTLE receiver → driver-receiver co-design initial solution • Change variables: [ISS, RT, RL, RD, CD, Vod] • Cost function: Veye/Power • Co-design cost function estimation • Step-response based eye estimation • SPICE-generated T-line step response • Receiver step response using CTLE modeling • Internal SQP (Sequential Quadratic Programming) routine to generate the best solution • Output: best set of design variables in terms of overall energy per bit
Simulated Eye Diagrams • Methodology A: driver/receiver separate design • Methodology B: driver/receiver co-design for low power
Summary of Performance Comparison Note: driver/receiver co-design methodology uses much larger driver/termination resistance to reduce power, but will close the eye-opening at the driver output and wire-end. Final eye is recovered by fully utilizing CTLE.
Conclusion • We propose a novel equalized on-chip global link using CML driver and CTLE receiver • Accurate modeling for CTLE is provided to achieve <10% correlation error and will improve eye-opening optimization quality • Our design achieves • 20Gbps signaling over 10mm, 2.2um-pitch on-chip T-line • 11ps/mm latency and 0.2pJ/b energy
Thank You! Q & A