This study explores the advantages of using a ternary bus for on-chip interconnects, providing energy-efficient solutions. The proposed bus design enhances speed and power efficiency, addressing the challenges of modern VLSI design.
Energy Efficient and High Speed On-Chip Ternary Bus
Chunjie Duan, Mitsubishi Electric Research Labs, Cambridge, MA, USA
Sunil P. Khatri, Texas A&M University, College Station, TX, USA
Motivation
• Trends in VLSI design
  • Shrinking feature size
  • Deep SubMicron (DSM) and Very Deep SubMicron (VDSM) processes
  • Scaling down supply voltage
  • Increasing die size (e.g. SoC, NoC, CMP)
• Impacts
  • Smaller gate delay (high-speed logic)
  • Lower switching power per gate
  • High complexity (>billion gates)
  • Increasing power consumption
  • Higher leakage current (standby power)
  • Reduced noise margin
  • Increasing interconnect delay
  • Interconnect delay >> gate delay
  • Global interconnect becomes the performance bottleneck
On-chip Bus Interconnects
• The impact of DSM / VDSM:
  • W↓, P↓
  • L↑, T↑
    • to avoid a quadratic increase in the resistance of the wire
• Inter-wire capacitance CI is much greater than substrate capacitance CL → crosstalk becomes dominant
• λ = CI / CL > 10 for metal 4 in a 0.1 µm CMOS process
[Figure: wire cross-sections for an earlier process and a DSM process, labeled with W, P, T, CI and CL]
Ternary Bus and Mapping
• Advantage of a ternary bus
  • Low voltage step: Vdd/2 instead of Vdd
• We propose a bit-to-bit binary-ternary mapping scheme
  • Each binary bit is mapped directly to a line on the ternary bus.
  • A binary 0 is mapped to the middle value on the ternary bus, i.e. 0b → 0t.
  • A binary 1 is mapped to either the high or the low value on the ternary bus, i.e. 1b → +t or 1b → −t.
• Disadvantage: lower bit density (1 bit/line vs. 1.58 bits/line for a true ternary bus)
• Advantages: direct mapping and flexible polarity
  • Direct mapping avoids true ternary-to-binary conversion, which is very slow and complex
  • Flexible polarity results in low crosstalk, e.g. the ternary vectors +0+, −0−, +0− and −0+ all represent the same binary value 101.
• Each ternary value is represented by the polarity Pj and the magnitude Dj
[Table: ternary driver truth table]
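The direct mapping can be illustrated with a minimal Python sketch. The function names (`to_ternary`, `to_binary`, `parse`) and the integer encoding of the three levels as -1, 0, +1 are assumptions made for this illustration, not part of the slides. The sketch treats each line value as the product of polarity Pj and magnitude Dj, and decodes by magnitude only.

```python
import math

# Assumed encoding of the three line levels: -1 (low), 0 (middle), +1 (high).
SYMBOLS = {"+": 1, "0": 0, "-": -1}

def to_ternary(bits, polarities):
    """Map each binary bit to one ternary line: value = P_j * D_j.

    bits:       sequence of 0/1 (the magnitude D_j)
    polarities: sequence of +1/-1 (the polarity P_j; irrelevant when the bit is 0)
    """
    return [p * d for d, p in zip(bits, polarities)]

def to_binary(ternary):
    """Decode by magnitude only: 0t -> 0b, +t or -t -> 1b."""
    return [abs(t) for t in ternary]

def parse(vector):
    """Turn a string such as '+0-' into [1, 0, -1]."""
    return [SYMBOLS[c] for c in vector]

# The four vectors from the slide all decode to binary 101.
for vec in ["+0+", "-0-", "+0-", "-0+"]:
    print(vec, "->", to_binary(parse(vec)))

# Bit density of a true ternary line, log2(3), rounded to two places.
print(round(math.log2(3), 2))  # 1.58
```

Because decoding ignores the sign, the encoder is free to choose each polarity to suit the neighboring lines, which is the freedom the coding rules later exploit.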
Crosstalk in a Multi-valued Bus
• Define the effective crosstalk as [equation not recovered]
  • where δj,k = sgn(δj) ΔVk is the normalized voltage change, and [equation not recovered]. NOL is the number of logic levels
• Delay can be approximated as [equation not recovered]
  • for λ ≫ 1, [equation not recovered]
• Energy consumption is [equation not recovered]
  • when λ ≫ 1, [equation not recovered]
• For a ternary bus, Vstep = Vdd/2, and we know
  • max(Xeff,j) = 8
  • min(Xeff,j) = 0
• Bus speed/power is highly data pattern dependent!
[Table 1: Examples of Total Crosstalk]
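The defining equation for Xeff,j is not available in this transcript, so the following Python sketch uses an assumed stand-in metric: the sum, over the two neighbors, of the absolute difference between the victim's voltage step and the neighbor's step, measured in units of Vstep. This is only one plausible reading, chosen because it reproduces the stated bounds max = 8 and min = 0 for a ternary bus. The function name `effective_crosstalk` is hypothetical.

```python
from itertools import product

def effective_crosstalk(prev, curr, j):
    """ASSUMED metric, not the slide's formula:
    sum over neighbors k of |d_j - d_k|, with steps d in units of Vstep."""
    d = [c - p for p, c in zip(prev, curr)]  # per-line step, in {-2..2}
    total = 0
    for k in (j - 1, j + 1):
        if 0 <= k < len(d):                   # edge lines have one neighbor
            total += abs(d[j] - d[k])
    return total

# Enumerate every transition of a 3-line ternary bus and look at the middle line.
levels = (-1, 0, 1)
states = list(product(levels, repeat=3))
values = [effective_crosstalk(p, c, 1) for p in states for c in states]
print(max(values), min(values))  # 8 0
```

The worst case under this metric is the middle line swinging from − to + while both neighbors swing from + to −. The best case is all three lines moving together or not at all.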
A Low Power, High Speed 4X Ternary Bus
• Using direct bit-to-bit mapping
• Coding rules:
  • Rule #1: A direct − ↔ + transition is prohibited.
  • Rule #2: A 1b → 0b transition is mapped as −t → 0t or +t → 0t, depending only on the current polarity of the 1b.
  • Rule #3: For a 0b → 1b transition on bj, if bj-1 is transitioning, Pj is coded so both lines transition in the same direction.
  • Rule #4: For a 0b → 1b transition on bj, if bj-1 is not transitioning and bj+1 is transitioning from 1 to 0, Pj is coded so that the jth and (j+1)th lines transition in the same direction.
  • Rule #5: For a 0b → 1b transition on bj, if there is no transition on either neighbor, Pj is coded so {Pj = Pj-1 or Pj = Pj+1}, with Pj = Pj-1 having the higher priority.
• The 1st rule guarantees max(Xeff,j) = 4, therefore a 2X speed up from a conventional binary bus
• The other rules are designed to lower the probability of high-value Xeff,j occurring on the bus
• Identical encoder/decoder logic for each bit
[Table: an example of 4X ternary sequences]
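The five coding rules can be sketched as a word-level encoder in Python. This is an illustrative reading under stated assumptions, not the authors' circuit: levels are encoded as -1, 0, +1, and lines are processed left to right so that line j can see the new value of line j-1. The default polarity of +1 is an assumption for cases the rules leave open, such as when neither neighbor is moving or holding a 1. The names `encode_4x` and `sign` are hypothetical.

```python
def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)

def encode_4x(prev, bits):
    """Encode one bus word.

    prev: previous ternary word, entries in {-1, 0, +1}
    bits: new binary word, entries in {0, 1}
    """
    n = len(bits)
    new = [0] * n
    for j in range(n):
        if bits[j] == 0:
            new[j] = 0                    # 0b is always 0t, so 1b -> 0b ends at 0t (rule 2)
        elif prev[j] != 0:
            new[j] = prev[j]              # a held 1b keeps its polarity (rule 1)
        else:                             # 0b -> 1b: choose the polarity
            left_move = new[j - 1] - prev[j - 1] if j > 0 else 0
            right_falls = j + 1 < n and prev[j + 1] != 0 and bits[j + 1] == 0
            if left_move != 0:
                new[j] = sign(left_move)  # rule 3: move with the left neighbor
            elif right_falls:
                new[j] = -prev[j + 1]     # rule 4: move with the right neighbor
            elif j > 0 and new[j - 1] != 0:
                new[j] = new[j - 1]       # rule 5: copy left polarity (priority)
            elif j + 1 < n and prev[j + 1] != 0:
                new[j] = prev[j + 1]      # rule 5: copy polarity of a held right 1b
            else:
                new[j] = 1                # ASSUMPTION: default when rules are silent
    return new

# Left neighbor falls from + to 0, so the new 1b goes to - (rule 3).
print(encode_4x([1, 0, 0], [0, 1, 0]))   # [0, -1, 0]
# Right neighbor rises from - to 0, so the new 1b goes to + (rule 4).
print(encode_4x([0, 0, -1], [0, 1, 0]))  # [0, 1, 0]
```

Since a line only ever moves between 0 and one of the outer levels in a single cycle, every step is at most one Vstep, which is the property rule #1 is stated to provide.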
An Even Faster 3X Ternary Bus
• Partition the bus into 5-bit groups
• Insert a shield wire between groups
• Apply the same rules as for the 4X bus
• It can be proven that such a configuration guarantees max(Xeff) = 3
  • Additional 33% speed up over the 4X ternary bus
  • At the cost of 20% additional wires
[Figures: 4X bus encoder and driver circuit; 3X bus encoder and driver circuit]
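The wire overhead and speed figures can be checked with a small Python sketch. The function name `shielded_layout` and the wire labels are hypothetical, and the speed estimate assumes that worst-case delay scales with max(Xeff), so going from 4 to 3 gives a ratio of 4/3.

```python
def shielded_layout(n_bits, group=5):
    """Return a wire list: 'b<i>' for signal lines, 'S' for shield wires."""
    wires = []
    for i in range(n_bits):
        if i > 0 and i % group == 0:
            wires.append("S")        # one shield between adjacent groups
        wires.append(f"b{i}")
    return wires

layout = shielded_layout(20)
shields = layout.count("S")
print(len(layout), shields)          # 23 3

# One shield per five signal lines is the 20% figure; a finite bus with
# shields only between groups needs slightly fewer (here 3/20).
print(f"{1 / 5:.0%} {shields / 20:.0%}")  # 20% 15%

# ASSUMPTION: delay scales with max(Xeff), so 4 -> 3 is a 4/3 speed ratio.
print(f"{4 / 3 - 1:.0%}")            # 33%
```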
Circuit Implementations
• Encoder implemented based on the 5 rules
• Decoder is extremely simple (implemented with two 2-input gates)
• Ternary driver and receiver can be implemented in current or voltage mode
  • Current mode is more power hungry (static current)
  • Voltage mode requires a low-impedance Vdd/2 supply
[Figure: ternary driver and receiver circuits, (A) current mode, (B) voltage mode]
Experimental Results
• The power saving comes from the redistribution of Xeff
  • More transitions are pushed towards lower Xeff
• The average power saving is ~27%
[Figure: crosstalk distribution and normalized energy consumption comparison (coded ternary vs. half-swing binary)]
Legend: 4X: ternary bus using 4X code; HB: half-swing binary bus; RP: ternary bus with random polarity; TT: true ternary bus
Experimental Results
• The proposed 4X and 3X busses are advantageous over other bus coding schemes.
  • EF: normalized total energy
  • PDP: power delay product
[Table: bus performance comparison]
Legend: 4XT: ternary bus using 4X code; 3XT: ternary bus with 3X code; SB: binary bus with shielding; HB: half-swing binary bus; RP: ternary bus with random polarity; TT: true ternary bus
Experimental Results
[Figure: eye diagrams for uncoded and coded busses (10 mm)]
Summary
• Crosstalk classification was extended to multi-valued buses.
• We proposed a direct bit-to-bit binary-ternary mapping scheme which results in a simple CODEC design.
• We proposed a 4X coding scheme that allows us to double the speed of a conventional ternary bus and save energy.
• We proposed a coding scheme (3X coding) to attain an additional 33% speed gain at the cost of 20% area overhead.
• We designed and implemented the CODEC and ternary driver/receiver.
• Our experimental results show significant power saving (27%) and speed gain (2X or more) over other schemes.