ECE260B – CSE241A Winter 2005: Interconnects and Delay Calculation. Website: http://vlsicad.ucsd.edu/courses/ece260b-w05
Interconnect-Centric Methodology • Conventional component-centric design methodology • Interconnect impacts are negligible • Components characterized by cell libraries • Modern interconnect-centric design methodology • Interconnects dominate VLSI system performance • Needs accurate interconnect prediction and analysis • Approaches • Hierarchical “time-budgeting” • Top-level “chip-integration” • Slide courtesy of Sylvester/Shepard
SEMATECH Prototype BEOL stack, 2000 [Figure labels: passivation; dielectric; wire; etch stop layer; via; dielectric capping layer; copper conductor with barrier/nucleation layer; pre-metal dielectric; tungsten contact plug. Metal tiers: global (up to 5), intermediate (up to 4), local (2)] • What are some implications of reverse-scaled global interconnects? • Slide courtesy of Chris Case, BOC Edwards
Damascene and Dual-Damascene Process • Damascene process named after the ancient Middle Eastern technique for inlaying metal in ceramic or wood for decoration • Single Damascene: ILD deposition → oxide trench etch → metal fill → metal CMP • Dual Damascene: ILD deposition → oxide trench / via etch → metal fill → metal CMP
Cu Damascene Process • CMP steps: bulk copper removal, barrier removal, oxide over-polish • Polishing pad touches both up and down areas after the step height is removed • Different polish rates on different materials • Dishing and erosion arise from different polish rates for copper and oxide [Figure: Cu dual-damascene process; labels: oxide erosion, copper dishing]
Area Fill & Metal Slot for Copper CMP • Dishing can thin the wire or pad, causing higher-resistance wires or lower-reliability bond pads • Erosion can also result in a sub-planar dip on the wafer surface, causing short-circuits between adjacent wires on the next layer • Oxide erosion and copper dishing can be controlled by area filling and metal slotting [Figure labels: copper, oxide, metal slot, area fill]
Resistance & Sheet Resistance • $R = \rho L / (T W)$ • Sheet resistance: $R_\square = \rho / T$, so $R = R_\square \, (L / W)$ • Resistance seen by current going from left to right is the same in each block ($R_1 = R_2$) [Figure labels: L, W, T]
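The claim that two square blocks of different size have the same left-to-right resistance can be checked with a short sketch. It assumes the standard relations R = ρL/(TW) and R□ = ρ/T. The function names, the approximate copper resistivity, and the wire dimensions are illustrative assumptions, not values from the slides.

```python
# Hypothetical helper names; resistivity and dimensions are assumed values.

def sheet_resistance(rho, thickness):
    """Sheet resistance in ohms per square: R_sq = rho / T."""
    return rho / thickness

def wire_resistance(rho, length, width, thickness):
    """Wire resistance R = rho * L / (T * W) = R_sq * (L / W)."""
    return sheet_resistance(rho, thickness) * (length / width)

RHO_CU = 1.7e-8  # ohm*m, approximate bulk copper resistivity (assumption)

# Two square blocks (L == W) of different size on the same 0.5 um thick layer.
r_small = wire_resistance(RHO_CU, length=1e-6, width=1e-6, thickness=0.5e-6)
r_large = wire_resistance(RHO_CU, length=10e-6, width=10e-6, thickness=0.5e-6)

print(f"small square: {r_small:.4f} ohm, large square: {r_large:.4f} ohm")
```

Both squares come out at the sheet resistance of the layer, which is why routing resistance is usually quoted in ohms per square times the number of squares.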
Bulk Resistivity • Aluminum dominant until ~2000 • Copper has taken over in past 4-5 years • Copper as good as it gets
Capacitance: Parallel Plate Model • ILD = interlevel dielectric (SiO2) • Bottom plate of cap can be another metal layer or the substrate • $C_{int} = \varepsilon_{ox} \, (W L / t_{ox})$ [Figure labels: L, W, T, H, SiO2 ILD, substrate]
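A minimal sketch of the parallel-plate estimate follows. The permittivity constants are standard physical values. The function name and the wire geometry are illustrative assumptions. The model ignores fringing and lateral capacitance, so it underestimates the capacitance of narrow wires.

```python
EPS_0 = 8.854e-12   # F/m, vacuum permittivity
EPS_R_SIO2 = 3.9    # relative permittivity of SiO2

def parallel_plate_cap(width, length, t_ox, eps_r=EPS_R_SIO2):
    """C_int = eps_ox * W * L / t_ox, area term only (no fringing)."""
    return eps_r * EPS_0 * width * length / t_ox

# Assumed geometry: 1 um wide, 1 mm long wire, 1 um above the plate below it.
c_int = parallel_plate_cap(width=1e-6, length=1e-3, t_ox=1e-6)
print(f"C_int = {c_int * 1e15:.1f} fF")
```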
Line Dimensions and Fringing Capacitance • Line dimensions: W, S, T, H • Sometimes H is called T in the literature, which can be confusing [Figure labels: lateral cap, W, S]
Inductance • $V = L \, dI/dt$, $V_2 = M_{12} \, dI_1/dt$ • Faraday’s law: $V = N \, d(BA)/dt$, $B = \mu (N / l) I$, $L = \mu N^2 A / l$ • V = voltage, N = number of turns of the coil, B = magnetic flux density, A = area of magnetic field circled by the coil, l = height of the coil, t = time • At high frequencies, inductance can be a significant portion of total impedance: $Z = R + j\omega L$ ($\omega = 2\pi f$ = angular frequency) • Slide courtesy of Ken Yang, UCLA
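To get a feel for when the inductive term matters, the sketch below evaluates Z = R + jωL and the frequency at which ωL equals R. The function names and the example values (10 Ω, 1 nH) are assumptions chosen for illustration only.

```python
import math

def impedance(r, l, freq):
    """Series impedance Z = R + j*omega*L with omega = 2*pi*f."""
    return complex(r, 2 * math.pi * freq * l)

def corner_frequency(r, l):
    """Frequency at which omega*L equals R."""
    return r / (2 * math.pi * l)

R_LINE = 10.0   # ohms (assumed)
L_LINE = 1e-9   # henries (assumed)

f_c = corner_frequency(R_LINE, L_LINE)
z = impedance(R_LINE, L_LINE, f_c)
print(f"f_c = {f_c / 1e9:.2f} GHz, |Z| at f_c = {abs(z):.2f} ohm")
```

Below this frequency the line behaves mostly resistively; above it the inductive part dominates the magnitude of Z.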
Inductance is Important… e.g. • Faster clock speeds • Frequency of interest is determined by signal rise time, not clock frequency • Copper interconnects → R is reduced • Thick, low-resistance (reverse-scaled) global lines • Chips are getting larger → long lines → large current loops • Slide courtesy of Massoud/Sylvester/Kawa, Synopsys
On-Chip Inductance • Inductance is a loop quantity • Knowledge of return path is required, but hard to determine • For example, the return path depends on the frequency [Figure labels: signal line, return path] • Slide courtesy of Massoud/Sylvester/Kawa, Synopsys
Frequency-Dependent Return Path • At low frequency, R dominates ($R \gg \omega L$), and current tries to • minimize impedance • minimize resistance • use as many returns as possible (parallel resistances) • At high frequency, L dominates ($\omega L \gg R$), and current tries to • minimize impedance • minimize inductance • use smallest possible loop (closest return path); current returns “collapse” • Power and ground lines always available as low-impedance current returns [Figure: a signal line among ground lines, shown at low and at high frequency] • Slide courtesy of Massoud/Sylvester/Kawa, Synopsys
Inductance vs. Capacitance • Capacitance • Locality problem is easy: electric field lines “suck up” to nearest neighbor conductors • Local calculation is hard: all the effort is in “accuracy” • Inductance • Locality problem is hard: magnetic field lines are not local; current returns can be complex • Local calculation is easy: no strong geometry dependence; analytic formulae work very well • Intuitions for design • Seesaw effect between inductance and capacitance • Minimize variations in L and C rather than absolutes • E.g., would techniques used to minimize variation in capacitive coupling also benefit inductive coupling? • Slide courtesy of Sylvester/Shepard
Interconnect Modeling • Lumped load capacitance • Distributed R(L)C(K) network • π model for each uniform wire segment • Transmission line • Microwave domain [Figure: Vin to Vout, a single wire distributed into multiple π-model lumps]
Characterization • Signal • Propagation delay • Transition time (slew rate) • Interconnect transfer function H(s) in Laplace domain [Figure: Vin and Vout waveforms for a single wire distributed into multiple π-model lumps; delay marked at the 50% level, transition marked between the 20% and 80% levels]
Transition Degradation • Transition degradation leads to increased downstream (gate and interconnect) delays Step response of a distributed RC wire as function of location along wire and time Courtesy Prof. A. B. Kahng
Elmore Delay = First Moment of Transfer Function • H(t) = step input response • h(t) = impulse response = dH(t)/dt = transfer function in time domain • T50% = median of h(t) • TED = mean of h(t) • TED = first moment of h(t)
Elmore Delay = Simple Delay Metric • Upper bound on the 50% delay for RC trees • TED = T50% if symmetric h(t) • TED > T50% for monotonic waveforms • TED → T50% with increased transition time • TED = T50% / ln2 for an RC load driven by a step input • +/- 15% error for RC interconnects with a ratio • Simple (linear time) computation • Incremental • facilitate ECO (Engineering Change Order) [Figure: RC circuit and its impulse response h(t) versus t, with telm marked]
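The single-RC relation TED = T50% / ln2 can be checked numerically. The sketch below assumes the textbook impulse response of a one-pole RC load, h(t) = exp(-t/RC)/RC, integrates t·h(t) with the trapezoidal rule to get the mean, and compares it with the closed-form median RC·ln2. Function names, the step count, and the RC value are illustrative assumptions.

```python
import math

def rc_impulse_response(t, rc):
    """Impulse response of a single-pole RC load: h(t) = exp(-t/RC) / RC."""
    return math.exp(-t / rc) / rc

def first_moment(rc, n=200_000, t_max_factor=40.0):
    """Trapezoidal estimate of the mean of h(t): integral of t*h(t) from 0 to 40*RC."""
    dt = t_max_factor * rc / n
    total = 0.0
    for i in range(n + 1):
        t = i * dt
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * t * rc_impulse_response(t, rc) * dt
    return total

rc = 1e-10                   # 100 ps time constant (assumed)
t_ed = first_moment(rc)      # Elmore delay = mean of h(t), expected to be RC
t_50 = rc * math.log(2)      # median of h(t) = 50% point of the step response

print(f"T_ED = {t_ed:.4e} s, T_50 = {t_50:.4e} s, ratio = {t_50 / t_ed:.4f}")
```

The printed ratio is close to ln2 ≈ 0.693, so for this load the Elmore delay overestimates the 50% delay by roughly 44%.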
Elmore Delay Computation in an RC Tree Courtesy Prof. A. B. Kahng
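A minimal sketch of the linear-time computation, assuming the usual RC-tree formulation: the Elmore delay at a node is the sum, over every resistor on the path from the driver to that node, of the resistance times the total capacitance downstream of it. The tree encoding, node names, and unit-less R and C values are hypothetical.

```python
def elmore_delays(tree, root):
    """Elmore delay from root to every node of an RC tree.

    tree maps node -> (parent, r_to_parent, c_node); the root has parent None.
    """
    children = {node: [] for node in tree}
    for node, (parent, _, _) in tree.items():
        if parent is not None:
            children[parent].append(node)

    # Preorder traversal: every parent appears before its descendants.
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children[node])

    # Bottom-up pass: accumulate the capacitance downstream of each node.
    downstream = {node: tree[node][2] for node in tree}
    for node in reversed(order):
        parent = tree[node][0]
        if parent is not None:
            downstream[parent] += downstream[node]

    # Top-down pass: delay(node) = delay(parent) + R(node) * C_downstream(node).
    delay = {root: 0.0}
    for node in order:
        parent, r, _ = tree[node]
        if parent is not None:
            delay[node] = delay[parent] + r * downstream[node]
    return delay

# Hypothetical tree: src -> n1, then n1 branches to n2 and n3.
example_tree = {
    "src": (None, 0.0, 0.0),
    "n1": ("src", 1.0, 1.0),
    "n2": ("n1", 2.0, 2.0),
    "n3": ("n1", 3.0, 3.0),
}
delays = elmore_delays(example_tree, "src")
print(delays)
```

Both passes visit each node once, which is the linear-time property the slides mention; changing one R or C only affects the sums along one root path, which is what makes incremental updates cheap.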
Asymptotic Waveform Evaluation (AWE), etc. • Moment matching → poles and residues → time domain [Figure: Vin to Vout, a single wire distributed into multiple π-model lumps]
Interconnect Model Order Reduction • Direct matrix solver (AWE): numerical instability • Padé via Lanczos (PVL) • Block Arnoldi (PRIMA) [Figure labels: uN, Iout, g1, g2, v1, v2, v3, c1, c2]
Capacitive Coupling (Crosstalk) • Two coupled lines • Cross-section view • Interwire capacitance allows neighboring wires to interact • Charge injected across Cc results in temporary (in static logic) glitch in voltage from the supply rail at the victim
Crosstalk Noise • Glitches caused by capacitive coupling between wires • An “aggressor” wire switches • A “victim” wire is charged or discharged by the coupling capacitance (cf. charge-sharing analysis) • An otherwise quiet victim may look like it has temporarily switched • This is bad if: • The victim is a clock or asynchronous reset • The victim is a signal whose value is being latched at that moment • What are some fixes? [Figure labels: aggressor, victim] • Slide courtesy of Paul Rodman, ReShape
Crosstalk Delay Variation: Timing Pull-In • A switching victim is aided (sped up) by coupled charge • This is bad if your path now violates hold time • Fixes include adding delay elements to your path [Figure labels: aggressor, victim] • Slide courtesy of Paul Rodman, ReShape
Crosstalk Delay Variation: Timing Push-Out • A switching victim is hindered (slowed down) by coupled charge • This is bad if your path now violates setup time • Fixes include spacing the wires, using strong drivers, … [Figure labels: aggressor, victim] • Slide courtesy of Paul Rodman, ReShape
Delay Uncertainty • Relatively greater coupling noise due to line dimension scaling • Tighter timing budgets to achieve fast circuit speed (“all paths critical”) [Figure: delay uncertainty from noise relative to nominal delay, ΔTd / Td (%), versus technology generation from 0.35 μm down to 0.10 μm; inset labels: aggressor, victim] • Slide courtesy of Kevin Cao, Berkeley
Crosstalk Delay Calculation: Levels of Accuracy • Discard coupling capacitances • De-coupling by replacing coupling caps by double ground caps • De-coupling by Miller factors • Simulating multi-input multi-output (MIMO) networks [Figure labels: input 1, output 1, input 2, output 2]
Miller Factor • $Q = C_{cv} \, \Delta V_v = C_c \, (\Delta V_v - \Delta V_A)$ • $C_{cv} = \frac{\Delta V_v - \Delta V_A}{\Delta V_v} \, C_c$ • Miller factor roughly between 0 and 2 • Or between –1 and 3 (for 50% delay calculation)? Courtesy Prof. A. B. Kahng
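The three textbook cases behind the "0 to 2" range follow directly from the relation on this slide, where the effective grounded capacitance is the coupling capacitance scaled by (ΔVv − ΔVA) / ΔVv. The sketch below evaluates it for a quiet aggressor, an aggressor switching with the victim, and one switching against it. Function names and the example voltages and capacitance are assumptions.

```python
def miller_factor(dv_victim, dv_aggressor):
    """Miller factor (dV_v - dV_A) / dV_v for a switching victim."""
    if dv_victim == 0:
        raise ValueError("victim voltage swing must be non-zero")
    return (dv_victim - dv_aggressor) / dv_victim

def decoupled_cap(c_c, dv_victim, dv_aggressor):
    """Equivalent grounded capacitance C_cv that replaces the coupling cap C_c."""
    return miller_factor(dv_victim, dv_aggressor) * c_c

C_C = 10e-15  # 10 fF coupling capacitance (assumed)
VDD = 1.0     # full-swing transition (assumed)

quiet = decoupled_cap(C_C, VDD, 0.0)       # aggressor holds its value
same = decoupled_cap(C_C, VDD, VDD)        # aggressor switches with the victim
opposite = decoupled_cap(C_C, VDD, -VDD)   # aggressor switches against the victim
print(quiet, same, opposite)
```

The wider –1 to 3 range raised on the slide arises when the aggressor swing seen during the victim's delay-measurement interval differs from the victim's own swing, which this full-swing example does not cover.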
Multi-Input Multi-Output Model • RLC interconnect is linear • Superposition • Each of the drivers is simulated in turn • Other Thevenin voltage sources are shorted • AWE/PRIMA model order reduction techniques [Figure labels: input 1, output 1, input 2, output 2]
Worst Case Aggressor Scenario • Stimuli vector • For RC interconnects • Aggressors take opposite transition → max delay • Aggressors take identical transition → min delay • For RLC interconnects • ? • Aggressor alignment • For (linear) interconnects • Aggressors are aligned with each other to make max crosstalk noise peak • Align the noise peak to make max delay variation • For worst case gate delay • ? [Figure labels: aggressor 1, aggressor 2, alignment, noise, Δ delay]
Calculation Flow • Timing window overlaps enable crosstalk delay variation • Chicken-egg dilemma: delay vs. crosstalk • Iteration • Starting with the assumption that all timing windows are overlapped (pessimistic about the unknowns) • Refine calculation by reducing pessimism [Figure: loop from timing window assumptions to crosstalk delay calculation (Δ delay) and back via refinement; labels: aggressor, victim, overlap]
Gate Timing Characterization • “Extract” exact transistor characteristics from layout • Transistor width, length, junction area and perimeter • Local wire length and inter-wire distance • Device modeling and simulation by BSIM or SPICE (differential-equations solver) [Figure labels: A, B, D, F, CL] Courtesy Prof. A. B. Kahng
Static Timing Analysis • Conservatism (worst case scenario) • True gate delay depends on input arrival time patterns • STA will assume that only 1 input is switching • Will use worst slope among several inputs • For a number of different input slews and load capacitances, simulate the circuit of the cell • Propagation time (e.g., 50% Vdd at input to 50% at output) • Output slew (e.g., 20% Vdd at output to 80% Vdd at output) [Figure: voltage versus time with Vdd, tpd, and tslew marked] Courtesy Prof. A. B. Kahng
Look-Up Table • $D_G = f(C_L, S_{in})$ and $S_{out} = f(C_L, S_{in})$ • Non-linear • Interpolate between table entries • Polynomial representation vs. lookup tables [Figure: two tables indexed by load capacitance and input slew; one gives gate delay (delay of the gate), the other gives output slew (resulting waveform)]
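Interpolating between table entries can be sketched with plain bilinear interpolation. The index and value arrays below are copied from the cell_rise table of the .lib example in this deck. The function name is hypothetical, and bilinear interpolation is one common choice rather than necessarily what any particular timing tool does. Queries outside the table are extrapolated linearly from the nearest cell.

```python
from bisect import bisect_right

def interp_table(index_1, index_2, values, x1, x2):
    """Bilinear interpolation in a 2-D table; values[i][j] is at (index_1[i], index_2[j])."""
    def bracket(axis, x):
        # Lower index of the enclosing interval, clamped to the table range.
        i = bisect_right(axis, x) - 1
        i = max(0, min(i, len(axis) - 2))
        frac = (x - axis[i]) / (axis[i + 1] - axis[i])
        return i, frac

    i, a = bracket(index_1, x1)
    j, b = bracket(index_2, x2)
    v00, v01 = values[i][j], values[i][j + 1]
    v10, v11 = values[i + 1][j], values[i + 1][j + 1]
    return (v00 * (1 - a) * (1 - b) + v01 * (1 - a) * b
            + v10 * a * (1 - b) + v11 * a * b)

# cell_rise table from the .lib example: input slew (ns) x load capacitance (pf).
SLEW = [0.0375, 0.2329, 0.6904, 1.5008]
LOAD = [0.0010, 0.9788, 2.2820, 5.1139]
CELL_RISE = [
    [0.013211, 0.071051, 0.297500, 0.642340],
    [0.028657, 0.110849, 0.362620, 0.707070],
    [0.053289, 0.165930, 0.496550, 0.860400],
    [0.091041, 0.234440, 0.661840, 1.091700],
]

delay = interp_table(SLEW, LOAD, CELL_RISE, x1=0.1, x2=0.5)
print(f"interpolated cell rise delay: {delay:.6f} ns")
```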
Delay Calculation • Fall delay = 0.178ns (Cell Fall table) • Rise delay = 0.261ns (Cell Rise table) • Fall transition = 0.147ns (Fall Transition table) • Rise transition = … [Figure: lookup-table example with labels 0.1ns, 0.12ns, 1.0pf] Courtesy Prof. A. B. Kahng
Effective Capacitance • Resistive shielding effect → effective capacitance < total load capacitance • Ceff → gate delay [Figure: Vin to Vout, a single wire distributed into multiple π-model lumps; driver output current Iout versus t, with Tr marked]
Timing Library Example (.lib)
library(my_lib) {
  delay_model : table_lookup;
  library_features (report_delay_calculation);
  time_unit : "1ns";
  voltage_unit : "1V";
  current_unit : "1mA";
  leakage_power_unit : 1uW;
  capacitive_load_unit(1,pf);
  pulling_resistance_unit : "1kohm";
  nom_voltage : 1.08;
  nom_temperature : 125.0;
  nom_process : 1.0;
  slew_derate_from_library : 0.500000;
  default_operating_conditions : slow_125_1.08 ;
  lu_table_template("load") {
    variable_1 : input_net_transition;
    variable_2 : total_output_net_capacitance;
    index_1( "1, 2, 3, 4" );
    index_2( "1, 2, 3, 4" );
  }
  cell("INV") {
    pin(Z) {
      direction : output;
      function : "!A";
      max_transition : 1.500000;
      max_capacitance : 5.1139;
      timing() {
        related_pin : "A";
        cell_rise(load) {
          index_1( "0.0375, 0.2329, 0.6904, 1.5008" );
          index_2( "0.0010, 0.9788, 2.2820, 5.1139" );
          values ( \
            "0.013211, 0.071051, 0.297500, 0.642340", \
            "0.028657, 0.110849, 0.362620, 0.707070", \
            "0.053289, 0.165930, 0.496550, 0.860400", \
            "0.091041, 0.234440, 0.661840, 1.091700" );
        }
        /* remaining tables and pins are not shown on the slide */
      }
    }
  }
}
PVT (Process, Voltage, Temperature) Derating Actual cell delay = Original delay x KPVT Courtesy Prof. A. B. Kahng
PVT Derating: Example + Min/Typ/Max Triples Proc_var (0.5:1.0:1.3) Voltage (5.5:5.0:4.5) Temperature (0:20:50) KP = 0.80 : 1.00 : 1.30 KV = 0.93 : 1.00 : 1.08 KT = 0.80 : 1.07 : 1.35 KPVT = 0.60 : 1.07 : 1.90 Cell delay = 0.261ns Derated delay = 0.157 : 0.279 : 0.496 {min : typical : max} Courtesy Prof. A. B. Kahng
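The numbers in this example can be reproduced with a few lines, assuming KPVT is the product KP × KV × KT rounded to two decimals before it is applied to the 0.261ns cell delay (this rounding order matches the slide's figures). The function and variable names are illustrative.

```python
def derate(delay_ns, k_p, k_v, k_t):
    """Return (K_PVT, derated delay), with K_PVT = K_P * K_V * K_T rounded to 2 decimals."""
    k_pvt = round(k_p * k_v * k_t, 2)
    return k_pvt, round(delay_ns * k_pvt, 3)

CELL_DELAY_NS = 0.261

# (K_P, K_V, K_T) per corner, taken from the slide's min : typical : max triples.
corners = {
    "min": (0.80, 0.93, 0.80),
    "typical": (1.00, 1.00, 1.07),
    "max": (1.30, 1.08, 1.35),
}

results = {name: derate(CELL_DELAY_NS, *ks) for name, ks in corners.items()}
for name, (k_pvt, delay) in results.items():
    print(f"{name}: K_PVT = {k_pvt:.2f}, derated delay = {delay:.3f} ns")
```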