GTX Update: Framework and New Studies

GTX Update:Framework and New Studies Dirk Stroobandt Ghent University / UCLA

Outline • Introduction • New additions and developments • New studies • Model analysis • Study of the impact of parameters • Design optimization studies • Future plans

GTX User inputs Pre-packaged Knowledge Implementation Parameters (data) Engine (derivation) Rules (models) GUI (presentation) Rule chain (study) GTX: GSRC Technology Extrapolation System • GTX is set up as a framework for technology extrapolation

GTX Knowledge Implementation Parameters (data) User inputs Engine (derivation) Rules (models) GUI (presentation) Pre-packaged Rule chain (study) New Implementation Developments • Engine extensions • Support for name spaces (as in C++) currently being implemented • Allows to attach parameter names to a specific module/study • GUI extensions • Runs on three platforms (Solaris, Windows and Linux) • Batch mode for automatic testing, scripting and macro recording • User support • Software to automatically check parameter naming

GTX GTX Knowledge Implementation Parameters (data) User inputs Engine (derivation) Rules (models) GUI (presentation) Pre-packaged Rule chain (study) New Data and Models • New device and power modules (Synopsys / Berkeley) • New SOI device model (Synopsys / Berkeley) • Inductance models (Silicon Graphics / Berkeley / Synopsys) • Integration of GENESYS in GTX (with help from Georgia Inst. Tech.) • Integration of RIPE in GTX (with help from Rensselaer Univ.) • Yield models (Carnegie-Mellon Univ.)

New Studies: Overview Model analysis • Comparison bulk Si versus SOI • influence on clock frequency and power • parameter sensitivity Design optimization • Wire sizing • Inductance • RC versus RLC models • effect on wire sizing • load capacitance model • wire shielding • Repeater optimization • repeater sizing • placement uncertainty • staggered repeaters • Influence of effective load capacitance • Via parasitics Impact of parameters

S G D Subs Bulk Si Versus SOI Device Models • New device models for bulk Si and Silicon-on-Insulator (SOI) devices • Provided by Dennis Sylvester (Synopsys) and Kevin Cao (UC Berkeley) • SOI model assumes partially-depleted SOI (PD-SOI) technology and is based on popular BSIM3SOI models • Both modules compared to BSIM3 HSPICE runs; results match within 10% • General study • floating body effect in PD-SOI: changes in vth and Idsat • calculate range of possible Idsat values • model ignores the impact of capacitive coupling on body voltage • dynamic delay (due to coupling capacitances between same-layer interconnects)

Bulk Si SOI P (W) % P (W) % Logic + local wires 26.20 46.18 28.99 43.91 Global interconnects 2.20 3.88 2.60 3.93 I/O drivers + pads 11.71 20.65 13.35 20.22 Clock distribution 7.93 13.98 9.65 14.62 Memory 0.94 1.66 0.86 1.31 Short circuit 7.68 13.54 10.21 15.47 Leakage 0.07 0.12 0.36 0.54 Total power 56.74 100.00 66.03 100.00 Bulk Si Versus SOI Device Models (Cont.) • Influence of device technology on clock frequency and power • Best case: largest Idsat (realizable due to floating body effect, only for SOI) and no effective coupling capacitance: f from 1.03 GHz (bulk) to 1.31 GHz (SOI) • Worst case: smallest Idsat and switching factor of 2: 867 MHz and 1.05 GHz • Power results • SOI: 16% increase in power versus Bulk but 24% increase in frequency

Bulk Si Versus SOI Device Models (Cont.) • Parameter sensitivity of both models • Several technology related parameters are varied by +/- 10% • SOI slightly less sensitive to input parameter changes • Process spread (between best-case and worst-case) larger for SOI

R Ctot Ceff R   Interconnect delay  R   Total gate delay  Ctot 5/6 Ctot Ceff 1/6 Ctot Influence of Effective Capacitance • Compare use of Ceff against Ctotfor various gate sizes, wire lengths and wire widths • Ctot leads to 100% overestimation of gate delay • Total gate delay = intrinsic gate delay + gate load delay • Resistive shielding of gate load: effective capacitance! Interconnect delay Intrinsic gate delay Gate load delay

Ctot 5/6 Ctot Ceff 1/6 Ctot Influence of Effective Capacitance (Cont.) • Gate delay increases with increasing line width for Ceff because the resistive shielding effect weakens

Repeater Optimization • Most commonly cited optimal buffer sizing expression (Bakoglu) • In GTX: • Sweep repeater size for single stage in the chain • Examine both delay and energy-delay product 2.4 Lseg = 2.14 mm W=S=1mm 6 2.2 W=S=0.5mm 2.0 5 1.8 Critical Path Delay (ns) 4 1.6 Normalized Energy-Delay Product 1.4 3 1.2 2 1.0 0.8 1 0 100 200 300 400 500 Bakoglu Repeater Size (X min size) optimal sizing

1.4 1.8 SF = 1 SF = 2 1.7 SF = 3 1.3 1.6 1.5 1.2 1.4 Normalized Delay Normalized Peak Noise 1.3 1.1 1.2 1.1 1.0 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Repeater Placement Uncertainty e Repeater Optimization (Cont.) • Repeater placement uncertainty • Large amount of repeaters; are placed in repeater islands [Cong et al, ICCAD99] • Segment length becomes “uncertain”: • Half of segments will be overdriven, half will be underdriven

0.30 1.0 Switch Factor = 2 0.28 0.9 Switch Factor = 3 SOI (NS) 0.26 0.8 0.24 0.7 0.22 0.6 Reduction in Peak Noise (V) 0.20 bulk (NS) 0.5 Normalized Delay Uncertainty 0.18 10% of Vdd 0.4 SOI (S) 0.16 0.3 0.14 0.2 0.12 bulk (S) 0.1 0.10 0.08 0.0 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0 Interconnect Spacing (mm) Repeater Optimization (Cont.) • Staggered repeaters • First introduced in [Kahng et al, VLSI Design 99] to reduce delay and noise

Inductance analysis • Five different models implemented in GTX • Bakoglu’s model (RC_B) • [Alpert, Devgan and Kashyap, ISPD 2000] (RC_ADK) • [Ismail, Friedman and Neves, TCAD 19(1), 2000] (RLC_IFN) • [Kahng and Muddu, TCAD 1997] (RLC_KM) • Extension of [Alpert, Devgan and Kashyap, ISPD 2000] (RLC_ADK)

k = 100 k = 50 2.2 2.2 [Cong & Pan 99] [Cong & Pan 99] 2.0 2.0 RC, 1 pole RC, 1 pole RLC RLC 1.8 1.8 Repeater size = 100 X min 1.6 1.6 1.4 1.4 Optimal Wire Width (mm) 1.2 1.2 1.0 1.0 Repeater size = 50 X min 0.8 0.8 0.6 0.6 0.4 0.4 0 2 4 6 8 10 Line Length (mm) Inductance analysis (Cont.) • Effect of RLC line delay models on wire sizing • [Cong and Pan, DAC 99]: • Optimal line width found by sweeping width in GTX for RC and RLC

400 (Ceff + RLC_KM) models 350 (Ctot + RLC_KM) models 300 Stage Delay (ps) 250 200 150 100 3.5 4.5 5.5 6.5 7.5 8.5 9.5 Wire Length (mm) Inductance analysis (Cont.) • Effect of Ceff for two-pole RLC line delay model

NS 3.45 5.75 8.75 RC, 1 pole 1S 5.55 7.40 9.25 2S 7.65 7.65 7.65 NS 3.45 6.25 9.00 RC, 2 pole 1S 5.55 7.40 9.25 2S 7.65 7.65 7.65 Model Shield SF=1 SF=2 SF=3 NS 2.85 4.60 6.75 RLC 1S 5.10 6.00 7.40 2S 7.05 7.05 7.05 Inductance analysis (Cont.) • Effect of shielding • Optimize cost function = wire pitch x repeater size x number of repeaters • Parameters: • width of shield and signal wires • spacing between signal wires and from signal wires to shield wires • Constraints: • delay (max. 1 ns) • delay uncertainty (difference RC 2-pole and RLC delays) • noise peak (20% of Vdd) • max. slew rate at rep. inputs = 0.5 ns • Topologies: • No shielding (NS) • 1 shield (1S) • 2 shields (2S)

130 200 width=1.0um 120 190 width=1.0mm width=1.5um 110 180 width=1.5mm 100 170 90 160 Area cost for 16-bit signal lines (mm) 80 150 Stage Delay (ps) Ctot model for gate delay 70 140 60 130 50 120 Ceff model for gate delay 40 110 30 100 0 5 10 15 0 5 10 15 Number of signal lines between shield Number of signal lines between shield Inductance analysis (Cont.) • Effect of shielding and Ceff versus Ctot (with two-pole RLC delay model)

Benefits of using GTX for studies • Easy to add new models to the GTX framework • Bulk device, power, SOI models • Reuse of models already present • BACPAC cycle time model reused • Sweeping ability allows assessment of effects of changing parameters • Influence of effective capacitance (sweep wire length, width etc.) • Easy rule substitution allows comparison between models • RC versus RLC models • Constraints allow elimination of infeasible solutions • Shielding study used delay and noise constraints

Future extensions to the GTX engine • Annotated parameters (e.g., t_cycle_local<1>) • Example: • old rule computes the local cycle time “t_cycle_local” • new rule adds clock skew: t_cycle_local<2> = t_cycle_local<1> + t_skew • Benefits: • Allows parameter updating without changing previous rules or renaming old parameters (not annotated = <0> annotation + annotation collapsing) • No need for different names for different versions of the same parameter • Allows the same parameter as input and output to the same rule • Iteration (possibly benefits from annotations) • Several rules are combined in a single iteration loop

Future studies in GTX • Variability studies • Cost and yield • Reliability • … (suggestions welcome)

Conclusions • New work on GTX mainly on new studies • Bulk Si versus SOI • Influence of effective capacitance • Repeater optimization studies • Inductance analysis • Inclusion of more models will allow exponential growth in new studies possible in GTX • We keep searching for experts in the field willing to provide new models for GTX

GTX Update: Framework and New Studies