
ITRS-2001 Grenoble Meeting April 25, 2001 U.S. Design TWG



  1. ITRS-2001 Grenoble Meeting, April 25, 2001, U.S. Design TWG

  2. 4/26 Agenda and Issues • ASIC-LP → SOC (product class driven by cost and power) • SOC = system driver class, is roadmapped in the System Drivers Chapter; would like to keep correlation • Participation of other TWGs in the System Drivers Chapter • Design ITWG: (1) ORTC Lines (freq, density, power, chip size / logic-memory tx counts, pkg pins/balls); (2) System Drivers Chapter (MPU, SOC, AMS/RF, DRAM); (3) STRJ models; (4) AMS/RF models

  3. SYSTEM DRIVERS Chapter • Defines segments of silicon market that drive process and design technology • Along with ORTCs, serves as “glue” for ITRS • 4 Drivers: SOC (Japan), MPU (USA), DRAM (Korea), M/S (Europe) • SOC: driven by cost, power, integration • SOC: same as “ASIC-LP”; drives device requirements, packaging I/O counts • M/S: driven by applications in networking/telecomm • Each section: • Formal definition of this driver • Nature, market, past/present/future • What market forces apply to this driver? • For what factors (process, device, design technology) is this a driver? • Key figures of merit, and futures • Participation of other ITWGs

  4. DESIGN Chapter • Context • Scope of Design Technology • High-level summary of complexities (at level of “issues”) • Cost, productivity, quality, and other metrics of Design Technology • Overview of Needs • Driver classes and associated emphases {SOC, MPU, DRAM, MS} • Resulting needs (e.g., power, …, cost-driven design) • Summary of Difficult Challenges • Detailed Statements of Needs, Potential Solutions • System-Level, Circuit, Logic/Physical, Verification, Test

  5. 4/27 DESIGN Presentation Outline • Mixed-Signal Roadmap summary slide • Low-Power Scenario slide • Clock Frequency Model slide • SRAM Density model slide • Decreasing Memory Content slide • Design Cost (and Quality) Requirement slide • System Drivers Chapter slide • Design Chapter slide

  6. Outline • MPU diminishing returns • New MPU clock frequency model • MPU futures and ASIC/MPU/SOC convergence • New logic (ASIC, MPU) and SRAM density models • Required logic decrease due to power constraint • Design cost / design quality requirement, gap analysis • Summary of changes and errata (ORTCs, other TWGs)

  7. MPU Diminishing Returns • Pollack’s Rule • In a given process technology, new uArch takes 2-3x area of old (last generation) uArch, and provides only 40% more performance (see Slide) • Slide: process generations (x-axis) versus (1) ratio of Area of New/Old uArch, (2) ratio of Performance of New/Old (approaching 1) • Slides: SPECint, SPECfp per MHz, SPECint per Watt all decreasing rapidly • Power knob running out • Speed == Power • 10W/cm2 limit for convection cooling, 50W/cm2 limit for forced-air cooling • Large currents, large power surges on wakeup • Cf. 140A supply current, 150W total power at 1.2V Vdd for EV8 (Compaq) • Speed knob running out • Historically, 2x clock frequency every process generation • 1.4x from device scaling (running into t_ox, other limits?) • 1.4x from fewer logic stages (from 40-100 down to around 14 FO4 INV delays) • Clocks cannot be generated with period < 6-8 FO4 INV delays • Pipelining overhead (1-1.5 FO4 INV delay for pulse-mode latch, 2-3 for FF) • Around 14 FO4 INV delays is limit for clock period (L1 $ access, 64b add) • Unrealistic to continue 2x frequency trend in ITRS

  8. Performance Efficiency of Microarchitectures – Pollack’s Rule [Chart: area growth (X) and performance, lead vs. compaction, across technology generations 1.5, 1.0, 0.7, 0.5, 0.35, 0.18 µm; performance measured using SpecINT and SpecFP] • Implications (in the same technology) • New microarchitecture ~2-3X die area of the last microarchitecture • Provides 1.4-1.7X performance of the last microarchitecture • We are on the Wrong Side of a Square Law (Intel: Gelsinger talk, ISSCC-2001)
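Pollack’s Rule as stated on this slide can be put in executable form. The square-root relationship (performance ∝ √area) is the standard reading of the rule and reproduces the 2-3X-area-for-1.4-1.7X-performance figures above; this is a minimal sketch, not ITRS-supplied code:

```python
import math

def pollack_performance_gain(area_ratio):
    """Pollack's Rule: performance grows roughly as the square root
    of the area devoted to a new microarchitecture (same process)."""
    return math.sqrt(area_ratio)

# 2-3X the old microarchitecture's area buys only ~1.4-1.7X its
# performance -- the "wrong side of a square law" on the slide.
for area_ratio in (2.0, 3.0):
    print(f"{area_ratio:.0f}x area -> "
          f"{pollack_performance_gain(area_ratio):.2f}x performance")
```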

  9. Decreasing SPECint/MHz

  10. Decreasing SPECfp/MHz

  11. Decreasing SPECfp/Watt

  12. Addendum: SPEC Company List (www.specbench.org): Advanced Micro Devices, Alpha Processor, BULL S.A., Compaq Computer, Data General Corp., Dell Computer, Digital Equipment, Fujitsu, Gateway 2000, HAL Computer Systems, Hewlett-Packard, Hitachi Ltd., IBM, Intel, Intergraph Corp., KryoTech, Motorola, Pyramid Technology, ROSS Technology, SGI, Siemens, Sun Microsystems, Tandem Computers, UNISYS Corp.

  13. MPU Clock Frequency Trend Intel: Borkar/Parkhurst

  14. MPU Clock Cycle Trend (FO4 Delays) Intel: Borkar/Parkhurst

  15. Outline • MPU diminishing returns • New MPU clock frequency model • MPU futures and ASIC/MPU/SOC convergence • New logic (ASIC, MPU) and SRAM density models • Required logic decrease due to power constraint • Design cost / design quality requirement, gap analysis • Summary of changes and errata (ORTCs, other TWGs)

  16. New MPU Clock Model • Global clock: flat at 14 FO4 INV delays • FO4 INV delay = delay of an inverter driving a load equal to 4 times its input capacitance • no local interconnect: negligible, scales with device performance • no (buffered) global interconnect: (1) was unrealistically fast in Fisher98 (ITRS99) model, and (2) global interconnects are pipelined (clock frequency is set by time needed to complete local computation loops, not time for global communication - cf. Pentium-4 and Alpha-21264) • Local clock: flat at 6 FO4 INV delays • somewhat meaningless: only for ser-par conversion, small iterative structures, “marketing interpretation” of phase-pipelining • reasonable alternative: delete from Roadmap • ASIC/SOC: flat at 40-50 FO4 INV delays • absence of interconnect component justified by same pipelining argument, and by convergence of ASIC / structured-custom design methodologies, tool sets • higher ASIC/SOC frequencies possible, but represent tradeoffs with design cost, power, other figures of merit • information content is nil; reasonable to delete from Roadmap
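The fixed-FO4 clock model can be sketched numerically. The conversion of FO4 ≈ 360 ps per micron of drawn gate length is a common rule-of-thumb approximation, not an ITRS-supplied value, so the absolute frequencies below are illustrative; the linear (not superlinear) scaling with 1/F is the point:

```python
FO4_PS_PER_UM = 360.0  # FO4 inverter delay, ps per um of drawn gate
                       # length -- a common rule of thumb, not an ITRS value

def clock_ghz(feature_um, fo4_per_cycle=14):
    """Frequency implied by a fixed clock period measured in FO4
    inverter delays: 14 for the MPU global clock, 6 for the local
    clock, 40-50 for ASIC/SOC (per the slide)."""
    period_ps = fo4_per_cycle * FO4_PS_PER_UM * feature_um
    return 1000.0 / period_ps  # 1/ps -> GHz

print(clock_ghz(0.18))      # global clock at 180 nm, roughly 1.1 GHz
print(clock_ghz(0.18, 6))   # 6-FO4 local-clock bound
# Frequency scales linearly with 1/F, not superlinearly:
print(clock_ghz(0.09) / clock_ghz(0.18))
```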

  17. Outline • MPU diminishing returns • New MPU clock frequency model • MPU futures and ASIC/MPU/SOC convergence • New logic (ASIC, MPU) and SRAM density models • Required logic decrease due to power constraint • Design cost / design quality requirement, gap analysis • Summary of changes and errata (ORTCs, other TWGs)

  18. MPU Futures (1) • Drivers: power, I/O bandwidth, yield, ... • Multiple small cores per die, more memory hierarchy on board • core can be reused across multiple apps/configs • replication → redundancy, power savings (lower freq, Vdd while maintaining throughput); better use of area than memory; avoid overhead of time-multiplexing • IBM Power4 (2 CPU + L2); IBM S390 (14 MPU, 16MB L2 (8 chips) on 1 MCM (31 chips, 1000W, 1.4B xtors, 4224 pins)) • Processor-in-Memory (PIM): O(10M) xtors logic per core • 0.5Gb eDRAM L3 by 2005 • high memory content gives better control of leakage, total chip power • I/O bandwidth major differentiator • double-clocking, phase-pipelining in par/ser data conversion hits 6 FO4 limit • I/O count may stay same or decrease due to integration • roughly constant die size (200-350 mm2) also limits I/O count • Evolutionary uArch changes • superpipelining (for freq), superscalar (beyond 4-way) running out of steam • more multithreading support for parallel processing • more complex hardwired functions (networking, graphics, communications, ...) (megatrend: shift of flexibility-efficiency tradeoff point away from GPP)

  19. MPU Futures (2) • Circuit design • ECC for SEU • pass gates on the way out due to low Vt • more redundancy to compensate for yield loss • density models are impacted • Clocking and power (let’s be reasonable about “needs”!) • 1V supplies, 10-50W total power both flat • SOI (5% or 25%), multi-Vth (10%), multi-Vdd (30-50%), min-energy sizing under throughput constraints (25%), parallelism … (synergy not guaranteed) • multiple clock domains, grids; more gating/scheduling • adaptive voltage and frequency scaling • frequency: +1 GHz/year ... BUT: marketing focus shifts to system throughput • Bifurcation of MPU requirements via “centralized processing”? • smart interface remedial processing (SIRP): basic computing and power efficiency, SOC integration of RF, M/S, digital (wireless mobile multimedia) • centralized computing server: high-performance computing (traditional MPU) • The preceding gives example content for definition of MPU (high-volume custom) in System Drivers Chapter

  20. ASIC-SOC-MPU Convergence • Custom vs. ASIC headroom diminishing • density of custom == 1.25x ASIC (logic, memory) • “custom quality on ASIC schedule” achieved by on-the-fly, tuning, liquid etc. cell-based methodologies (cf. IBM, Motorola) • convergence of ASIC, structured-custom methodologies (accelerated by COT model, tool limitations) to “hierarchical ASIC/SOC” • ASIC-SOC convergence • ASIC = business model • SOC = product class (like MPU, DRAM), driven by cost and integration • ASICs are rapidly becoming indistinguishable from SOCs in terms of content, design methodology • MPU-SOC convergence • MPUs evolving into SOCs in two ways • MPUs designed as cores to be included in SOCs • MPUs themselves designed as SOCs to improve reuse • (recall also SIRP = SOC integration) • Thus, four System Driver Classes: MPU (high-volume custom), SOC, DRAM, AMS/RF

  21. Outline • MPU diminishing returns • New MPU clock frequency model • MPU futures and ASIC/MPU/SOC convergence • New logic (ASIC, MPU) and SRAM density models • Required logic decrease due to power constraint • Design cost / design quality requirement, gap analysis • Summary of changes and errata (ORTCs, other TWGs)

  22. ASIC Logic Density Model • Average size of gate (4t) = 32MP2 = 320F2 • MP is contacted lower-level metal pitch • sets size of a standard cell (e.g., 7-track, 9-track, etc.) • ITRS Interconnect chapter: MP ~ 3.1-3.2 * F → 1 MP2 = 10F2 (consistent throughout technologies) • 32 comes from: • 8 tracks (expected height for dense std-cell library) by 4 tracks (avg width of 2-input NAND gate) • close match with claimed gate densities (published and unpublished data) – e.g., 100K gates/mm2 at 0.18um • Overhead/white space factor = 0.5 • effective gate size = 64MP2 • logic density = 19.3Mt/cm2 at 180nm (compare to 20Mt/cm2 in ITRS2000, total density) • Scales quadratically • e.g., density 1.39Bt/cm2 at 30nm will be 36X that at 180nm (compare with current ITRS)
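The density arithmetic above can be checked directly. With 1 MP2 = 10F2, the effective gate of 64 MP2 = 640 F2 reproduces the slide’s 19.3 Mt/cm2 figure and the quadratic 36X scaling from 180 nm to 30 nm:

```python
def asic_logic_density_per_cm2(feature_nm):
    """Transistors/cm^2 from the slide's model: a 4-transistor gate
    occupies 32 MP^2 = 320 F^2; the 0.5 overhead/white-space factor
    doubles this to an effective 64 MP^2 = 640 F^2."""
    f_cm = feature_nm * 1e-7           # nm -> cm
    effective_gate_cm2 = 640.0 * f_cm ** 2
    return 4.0 / effective_gate_cm2    # 4 transistors per gate

d180 = asic_logic_density_per_cm2(180)
d30 = asic_logic_density_per_cm2(30)
print(d180 / 1e6)   # ~19.3 Mt/cm^2, matching the slide
print(d30 / d180)   # quadratic scaling: (180/30)^2 = 36X
```

The MPU logic density on the next slide is 1.25X this value (24.13 Mt/cm2 at 180 nm).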

  23. MPU Logic Density Model • Custom logic density == 1.25X ASIC logic density • Example: MPU logic density 24.13Mt/cm2 at 180nm (equal to 60K gates/mm2) • Suggest breaking out logic and SRAM density separately for MPU, rather than lumping together

  24. SRAM Density • SRAM cell size expressed as A*F2 • SRAM A factor essentially constant, barring paradigm shifts in architecture/stacking • Slight reduction with scaling, as seen in following slide • N.B.: 1-T SRAM (www.mosys.com): 2-3x area reduction, 4x power reduction, in production (Broadcom, Nintendo) • Overhead (periphery) • Best current estimate = 100% → effective bitcell size = 2*actual • Periphery area can be more exact function of memory size • smaller caches experience more overhead (could pertain to cost-perf vs. high-perf MPUs) • A word * B bit SRAM: core area = A*B*C (Artisan TSMC25: C = 240 F2); periphery area = K*log(A)*B (Artisan TSMC25: K = 4000-5000 F2)
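The core-plus-periphery model in the last bullet can be sketched as follows. C = 240 F2 and K = 4000-5000 F2 are the Artisan TSMC25 values quoted above; using log base 2 is an assumption of this sketch (the slide does not give the base). It illustrates the point that smaller arrays carry proportionally more periphery overhead:

```python
import math

def sram_area_f2(words, bits, c_f2=240.0, k_f2=4500.0):
    """Array area in F^2 per the slide's model: core = words*bits*C
    plus periphery = K*log(words)*bits. C and K are the Artisan
    TSMC25 values quoted above; log base 2 is an assumption."""
    core = words * bits * c_f2
    periphery = k_f2 * math.log2(words) * bits
    return core, periphery

# Smaller arrays carry proportionally more periphery overhead:
for words in (256, 4096, 65536):
    core, periph = sram_area_f2(words, 32)
    print(words, f"periphery overhead = {periph / core:.1%}")
```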

  25. Collection of 6T SRAM Cell Sizes from TSMC, Toshiba, Motorola, IBM, UMC, Samsung, Fujitsu, Intel [Chart: SRAM A-factor (cell area normalized to F2) vs. F (DRAM half-pitch, micron), F = 0.1-0.4; linear fit: A-Factor = 50.546F + 133.19; without overhead]

  26. SRAM Density • At 180nm → 65.2 Mt/cm2 (compare to 35 Mt/cm2 in ITRS00 for cost-performance MPU) • Easier to understand: 10.87 Mbits/cm2, since the Mt/cm2 definition ignores peripheral transistor count • At 30nm → 414.6 Mbits/cm2 or 2.49 Bt/cm2 (compare to 3.5Bt/cm2 in ITRS00) • Difference is due to non-quadratic scaling in ITRS00
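These density figures follow from the A-factor fit on slide 25 (A = 50.546F + 133.19, F in microns) plus the 100% periphery overhead (effective bitcell = 2x actual). A sketch that reproduces them to within rounding:

```python
def sram_density_mbits_per_cm2(feature_um):
    """Bit density from the A-factor fit on slide 25 with the 100%
    periphery overhead (effective bitcell = 2x actual cell)."""
    a_factor = 50.546 * feature_um + 133.19         # cell area, F^2
    cell_cm2 = 2.0 * a_factor * (feature_um * 1e-4) ** 2
    return 1.0 / cell_cm2 / 1e6

mb180 = sram_density_mbits_per_cm2(0.18)
print(mb180, 6 * mb180)                  # ~10.9 Mbits/cm^2, ~65 Mt/cm^2 (6T)
print(sram_density_mbits_per_cm2(0.03))  # ~412 Mbits/cm^2 at 30 nm
```

The small differences from the 65.2 Mt/cm2 and 414.6 Mbits/cm2 figures above come from rounding in the fit.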

  27. Outline • MPU diminishing returns • New MPU clock frequency model • MPU futures and ASIC/MPU/SOC convergence • New logic (ASIC, MPU) and SRAM density models • Required logic decrease due to power constraint • Design cost / design quality requirement, gap analysis • Summary of changes and errata (ORTCs, other TWGs)

  28. Memory/Logic Power Study Setup • Motivation: Is current ITRS MPU model consistent with power realities? Does it drive the right set of needs? • Ptotal = Plogic + Pmemory = constant (say, 50W or 100W) • Plogic composed of dynamic and static power, calculated as densities • Pmemory = 0.1*Pdensity_dynamic • power density in memories is around 1/10th that of logic • Logic power density (dynamic) determined using active capacitance density (Borkar, Micro99) • dynamic power density Pdensity_dynamic = Cactive * Vdd^2 * fclock • fclock uses new fixed-FO4 inverter delay model (linear, not superlinear, with scale factor) • Cactive = 0.25nF/mm2 at 180nm • increases with scale factor (~1.43X)
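The dynamic term can be sketched at the 180 nm point. Cactive = 0.25 nF/mm2 is the slide’s number; the Vdd = 1.8 V and fclock = 1.2 GHz operating point is an illustrative assumption of this sketch, not a slide value:

```python
def dynamic_power_w_per_mm2(c_active_nf_per_mm2, vdd_v, f_ghz):
    """P_density = C_active * Vdd^2 * f_clock (the slide's formula)."""
    return c_active_nf_per_mm2 * 1e-9 * vdd_v ** 2 * f_ghz * 1e9

# C_active = 0.25 nF/mm^2 at 180 nm is the slide's value; Vdd = 1.8 V
# and f = 1.2 GHz are illustrative assumptions.
p = dynamic_power_w_per_mm2(0.25, 1.8, 1.2)
print(f"{p:.2f} W/mm^2")
print(f"{p * 157:.0f} W for a 157 mm^2 all-logic die")
```

Roughly 1 W/mm2 of active logic, i.e. around 150 W for an all-logic 157 mm2 die, which is why a constant 50-100 W budget forces logic content down.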

  29. Memory/Logic Power Study Setup • Static power model considers dual Vth values • 90% of logic gates use high-Vth with Ioff from PIDS Table 28a/b • 10% of logic gates use low-Vth with Ioff = 10X Ioff from PIDS Table 28a/b (90/10 split is from IBM and other existing dual-Vth MPUs) • Operating temp (80-100C) → Ioff is 10X of Table 28a/b (room temp) • Width of each gate determined from IBM SA-27E library • 150nm technology; 2-input NAND = basic cell • performance level E: smallest footprint, next to fastest implementation → W of each device ~ 4um • Weff (effective leakage width) for each gate = 4um • 0.8*Weff*Ioff (per um) = Ileak / gate (0.8 comes from avg leakage over input patterns)
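The static model above can be sketched directly. The Ioff = 10 nA/um room-temperature value used here is a placeholder, not a PIDS Table 28a/b number; the 90/10 split, 10X low-Vth ratio, 10X temperature factor, Weff = 4 um, and 0.8 input-pattern average are the slide’s parameters:

```python
def leakage_ua_per_gate(ioff_na_per_um, low_vth_fraction=0.10,
                        weff_um=4.0, temp_factor=10.0):
    """Average leakage per gate under the slide's model: 90% of
    gates at Ioff, 10% at 10x Ioff, a 10x penalty for operating
    temperature (80-100C vs. room temp), W_eff = 4 um (IBM SA-27E
    2-input NAND), and 0.8x averaging over input patterns."""
    ioff_eff = ((1.0 - low_vth_fraction) * ioff_na_per_um
                + low_vth_fraction * 10.0 * ioff_na_per_um)
    ioff_eff *= temp_factor
    return 0.8 * weff_um * ioff_eff / 1000.0  # nA -> uA

# Ioff = 10 nA/um at room temp is a placeholder, not a PIDS value.
print(leakage_ua_per_gate(10.0))
```

Multiplying Ileak/gate by gate density and Vdd then gives the static power density used in the study.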

  30. Memory/Logic Study Setup • Calculate densities, then find allowable logic component (percent of total area) to achieve constant power (or power density) • Amemory + Alogic = Achip • recall that Achip is flat at 157 mm2 from 1999-2004, then increases by 20% every 4 years • Constant power and constant power density scenarios same until 65nm node (because chip area flat until then)

  31. Power as a Constraint: Implications • Constant power or power density → decreasing logic content • cannot scale logic, SRAM in lock step as in current ITRS • Anomaly going from 45nm to 32nm due to constant Vdd

  32. Power as a Constraint: Implications Using same constraints, calculate #MPU cores (12Mt/core) and Mbytes SRAM allowable (again, anomaly at 32nm due to constant Vdd)

  33. Outline • MPU diminishing returns • New MPU clock frequency model • MPU futures and ASIC/MPU/SOC convergence • New logic (ASIC, MPU) and SRAM density models • Required logic decrease due to power constraint • Design cost / design quality requirement, gap analysis • Summary of changes and errata (ORTCs, other TWGs)

  34. Design Cost Requirement • “Largest possible ASIC” design cost model • engineer cost per year increases 5% per year ($181,568 in 1990) • EDA tool cost per year increases 3.9% per year ($99,301 in 1990) • #Gates in largest ASIC design per ORTCs (0.25M in 1990, 250M in 2005) • %Logic Gates constant at 70% (see next slide) • #Engineers / Million Logic Gates decreasing from 250 in 1990 to 5 in 2005 • Productivity due to 7 Design Technology innovations (3.5 of which are still unavailable): RTL methodology; In-house P&R; Tall-thin engineer; Small-block reuse; Large-block reuse; IC implementation suite; Intelligent testbench; ES-level methodology • Small refinements: (1) whether 30% memory content is fixed; (2) modeling increased amount of large-block reuse (not just the ability to do large-block reuse). No discussion of other design NRE (mask cost, etc.). • #Engineers per ASIC design still rising (44 in 1990 to 875 in 2005), despite assumed 50x improvement in designer productivity • New Design Technology -- beyond anything currently contemplated -- is required to keep costs manageable
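The cost model above can be sketched numerically. The 1990/2005 endpoint values and growth rates are from the slide; interpolating geometrically between the endpoints is an assumption of this sketch. It reproduces the 44 (1990) and 875 (2005) engineers per design:

```python
def design_cost_model(year):
    """Sketch of the 'largest possible ASIC' cost model. Endpoints
    (0.25M -> 250M gates, 250 -> 5 engineers per million logic gates,
    1990 salary/tool costs and growth rates) are slide values;
    geometric interpolation between 1990 and 2005 is an assumption."""
    t = year - 1990
    gates_m = 0.25 * (250.0 / 0.25) ** (t / 15.0)
    logic_gates_m = 0.70 * gates_m                       # 70% logic
    eng_per_mgate = 250.0 * (5.0 / 250.0) ** (t / 15.0)
    engineers = logic_gates_m * eng_per_mgate
    cost_per_eng = 181_568 * 1.05 ** t + 99_301 * 1.039 ** t
    return engineers, engineers * cost_per_eng

for year in (1990, 2005):
    engineers, cost = design_cost_model(year)
    print(year, round(engineers), f"${cost / 1e6:.1f}M")
```

Despite the assumed 50x productivity gain, headcount per design grows ~20x, which is the gap the slide says new Design Technology must close.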

  35. Design Cost Requirement • Source: Dataquest (2001)

  36. ASIC Memory Content Trends • Source: Dataquest (2001)

  37. Design Quality Requirement • “Normalized transistor” quality model • speed, power, density in a given technology • analog vs. digital • custom vs. semi-custom vs. generated • first-silicon success • other: simple / complex clocking, … • developing quality normalization model within MARCO GSRC; VSIA, Numetrics, others pursuing similar goals • Design quality: gathering evidence, will have metric, historical trend / needs table • Design quality, and quality/cost, will show red bricks?

  38. Outline • MPU diminishing returns • New MPU clock frequency model • MPU futures and ASIC/MPU/SOC convergence • New logic (ASIC, MPU) and SRAM density models • Required logic decrease due to power constraint • Design cost / design quality requirement, gap analysis • Summary of changes and errata (ORTCs, other TWGs)

  39. Pre-Meeting Design Changes (1/2) • New clock frequency requirements • FO4 based, no global interconnect • global clock tracks 14 FO4 INV delays • local clock tracks 6 FO4 INV delays (or can be deleted) • New layout density requirements • “A” factors for SRAM, logic (custom), logic (semi-custom) • adjustments for overheads (memories) • adjustments for redundancy, error correction • adjustments for change in “MPU” architecture (multi-core, L3 on board, ...) • New MPU power requirements • bring total chip power down (e.g., flat at 90W, or perhaps 50W) • socially responsible, reasonable “need”, if nothing else • New MPU figures of merit and requirements • statement of need: increase utility (SPECint, throughput, etc.), not frequency • server: SPEC/W, I/O or request handling bandwidth • smart interface: power, form factor, reusability, reprogrammability

  40. Pre-Meeting Design Changes (2/2) • #Metal layers • formal model: #Layers grows as log (#Transistors) (DeHon 2000) • add: dedicated metal layers for inductive shielding (1 per generation; these are not “interconnect” layers) • Package pins/balls • Variability • performance uncertainty due to variation of Leff, Vt, Tox, W_int, t_ILD, etc. is managed by design of synchronization, logic, circuits • these tolerances can be increased (removing some red bricks), and in any case should be developed via critical-path, other design models • ASIC-SOC convergence • SOC (= System-LSI) is the “product class” that is analogous to MPU, DRAM • System Drivers Chapter: SOC, MPU, AMS/RF, (DRAM) • references to “ASIC” in ITRS should be adjusted/removed accordingly
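The proposed metal-layer model (#Layers grows as log(#Transistors), DeHon 2000) can be sketched. The 0.25-layers-per-doubling constant is an illustrative calibration chosen here so that ~20M-transistor designs map to ~7 layers; it is not a slide or DeHon value, and the dedicated inductive-shielding layers are excluded (the slide does not count them as interconnect):

```python
import math

def metal_layer_count(transistors, layers_per_doubling=0.25):
    """DeHon-style model cited on the slide: #Layers grows as
    log(#Transistors). The 0.25-layers-per-doubling constant is an
    illustrative calibration, not a slide value; dedicated
    inductive-shielding layers are excluded."""
    return math.ceil(layers_per_doubling * math.log2(transistors))

print(metal_layer_count(20e6), metal_layer_count(1e9))
```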

  41. Design Pre-Meeting Notes (ORTCs 1/1) • ORTCs: inconsistent density metrics • ASIC, high-perf MPU give total density; cost-perf MPU breaks down logic vs. memory • ORTCs: density scales super-quadratically • 180nm to 90nm gives 5X rise in density (instead of 4X) • 180nm to 30nm gives 100X rise in density (instead of 36X) • ORTCs: ASIC total density == high-perf MPU total density • MPU logic density should be 1.25X ASIC • even if SRAM densities same, overall MPU density should be >5% larger (more if ASIC memory component is smaller than MPU) • ORTC00: MPU pad counts, Tables 3a/3b • flat from 2001-2005 • but in this time period, chip current draw increases 64% • ORTCs: Distinguish b/w high-perf MPU, ASIC power? • currently no estimates for high-end ASIC power consumption

  42. Post-Meeting Notes (PIDS 1/2) • AR for DESIGN: Obtain MPU designers’ input • AR for PIDS: contribute to the common LITHO-PIDS-FEP-DESIGN linked spreadsheet • AR for DESIGN and PIDS: Review device specs and characteristics in terms of circuit implications and requirements • Rds (> 10% of Ron → nearing 22-25%) • Rho_gate (sheet resistance spec on gate lead, increasing from previous 4-6 ohm values) • Ion, Ioff values (Ion > 2.5mA/um? Ioff = 1 uA/um at 65nm node (room temperature)) • Delta Lg, Delta Vt (25mV at 65nm node?), etc. • AR for PIDS: Answer DESIGN questions/concerns re key device characteristics and statistical variations, especially variations that define Delta Lg – goal is to have a common understanding of variability requirements • AR for DESIGN: give a power dissipation spec (tunneling, Ioff) • AR for DESIGN: Consider both high performance and low power scenarios • AR for PIDS: Pass along current work on ASIC-LP roadmap to Hiwatashi-san of Japan Design TWG; this will lead to SOC input within the System Drivers Chapter

  43. Post-Meeting Notes (PIDS 2/2) • Original list of comments on PIDS Chapter that DESIGN ITWG had before meeting (we did not get to discuss most of these…) • Rds #’s in spreadsheet from PIDS are higher than 15-20% numbers – more like 22-25% near end of ITRS. Drive current penalty is < 14%. Necessary Vth reduction to compensate will adversely affect Ioff. Since Ioff is already a major headache this might not be good. • What is “effective Vth”? • Very strong dependence when in tunneling regime: leakage increases 10X with 2 angstrom differences in oxide thickness • Ioff values seem much worse than in the current ITRS: 1 uA/um at 65nm (is this room temp or 100C? Answer from PIDS: room temp). This is listed as “user-adjustable parameter”. (Answer from PIDS: higher Vth means that it’s less temp-dependent.) • From DESIGN perspective, these values are problematic. • From DAC-01 work, also seem pessimistic, particularly if assuming negligible gate depletion and inversion layer quantization effects, Cox scaling should help alleviate Vth reductions which are otherwise the primary way of getting Ion to be 750 uA/um. • 10% static power constraint in Table 28a/b • Justification? • 100X increase in Ioff from room temp to 100C is high (better estimate = 10-20X; see Borkar IEEE Micro99) • W/L of 3 for all devices – including memory? Max Ioff used? Pessimistic; should use simpler gate-level (not xtor-level) approach • Suggest that PIDS be as clear as possible re both electrical and physical gate oxide thickness (Answer from PIDS: already in the table) • can incorporate expected gate material enhancements to reduce gate depletion effects (GDE) • can give better depiction of how Ion scales and how significant Ioff will be as a result • DESIGN suggestion: optimize over all possibilities by exhaustive search.

  44. Post-Meeting Notes (FEP 1/1) • AR for FEP (with PIDS, Litho): close on definitions of CD variation • AR for FEP, DESIGN: collaborate with PIDS, Litho to develop a big spreadsheet to call out interdependencies between these four TWGs as much as possible • AR for FEP: explain to DESIGN ITWG the models or derivations that are behind the following variability specs: gate oxide, gate length, effective channel length, CD bias iso-dense • Comment from FEP TWG: etch bias does not include every density, every feature size (comprehended by RET in the mask) – rather, just one single isolated line • Comment from FEP TWG: companies will start with larger printed feature size, larger etch bias to get down to given physical gate length → need more control (what is a model for this trend?) • AR for FEP: explain to DESIGN ITWG the critical area definition/model used • AR for DESIGN: develop implications of low-power circuit techniques on FEP technology requirements • must address both low operational power, low standby power regimes • what are leakage requirements for low-power? • N.B.: gate leakage (need high-k) and subthreshold leakage (Vth) are at roughly same order of magnitude. Also, leakage requirement is now getting to 100A/cm^2

  45. Post-Meeting Notes (M&S 1/1) • AR for M&S: provide model of SEU phenomena → send to DESIGN folks • affects need for ECC, SRAM layout density, etc. • Note: ABK did not take notes at this meeting • mostly, was a freeform discussion with no obvious AR’s for either TWG at the conclusion (?) • Some questions that came up for M&S from DESIGN: • best practices: best-in-class analysis and simulation approximations (e.g., power, timing) to fit various CPU, information regimes • compact modeling issues (4-terminal devices, SiGe, …) • modeling of critical performance indicators (leakage, transconductance, max frequency of CMOS, Ion/Ioff ratios, etc.) • Implicit AR for DESIGN: make some requests (?) • variability modeling • timing and power models (e.g., crosstalk delay uncertainty model) • AR for M&S: Can M&S ITWG come up with optimizable models (i.e., suitable as objectives for optimization)? • AR for M&S: In addition to purely modeling and simulation, can M&S ITWG come up with silicon calibration methodologies, eyecharts, etc. to provide the validation and test structures that accompany models and simulations? • Were there any AR’s for DESIGN?

  46. Post-Meeting Notes (A&P 1/1) • AR for DESIGN (from A&P): Address codesign of die and package: RF, passives, … • 3 domains: EM, thermal-mech, microstructural (stress/strain) → at minimum, need to pass datasets back and forth (e.g., design of RF front-end w/flip-chip) • Differential heating (transient, not structural) + mechanical stresses in flip-chip → die is the weak link (microstructure, fracturing, …) • (What about redistribution, terminal assignment, etc. – and what about existing tool development in industry – cf. Cadence-Agere announcement?) • A&P Goals: (1) short term: data; (2) medium-term: codesign; (3) long-term: cost-driven die-package co-optimization? (May have written this down wrong…) • AR for A&P: Send new writeup on multi-die packaging to DESIGN folks • AR for A&P: Send new writeup on SOC / optoelectronics end of spectrum to DESIGN folks • A&P cited four areas for roadmapping: MEMS, MCM, materials, optoelectronics • Pre-Meeting Questions from DESIGN to A&P (we did not get to discuss most of these) • Effective bump pitch roughly constant at 350um throughout ITRS • Why does bump/pad count scale with chip area only, not with technology demands (IR drop, L*di/dt)? • Implication – metal resource needed to ensure <10% IR drop skyrockets since Ichip and wiring resistance increase • Later technologies (30-40nm) have too few bumps to carry required maximum current draw • 1250 Vdd pads at 30nm: with bump pitch of 250um can carry 150mA (bumps at 350um can carry more, not shown in ITRS) • 187.5A max capability but Ichip/Vdd > 300A • 100,000 hour reliability #’s build cushion into this calculation, but… could A&P provide details of analysis? • Why is hand-held power 2.6W in 2005 (monotonically increasing 1999-2005) but then 2.1W in 2008 (resumes increasing)?
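The current-delivery concern above is simple arithmetic on the slide’s numbers (1250 Vdd pads, 150 mA per bump, Ichip/Vdd > 300 A required):

```python
def vdd_bump_capability_a(n_vdd_pads, ma_per_bump):
    """Total supply current the Vdd bumps can deliver."""
    return n_vdd_pads * ma_per_bump / 1000.0

# Slide numbers: 1250 Vdd pads at the 30 nm node, 150 mA per bump
# at 250 um pitch, versus a required Ichip/Vdd > 300 A.
capability = vdd_bump_capability_a(1250, 150)
print(capability)            # 187.5 A of delivery capability
print(capability >= 300.0)   # False: too few bumps for the demand
```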

  47. Post-Meeting Notes (Interconnect 1/1) • GLOBAL INTERCONNECT STUDY GROUP: perhaps should not expect activity this renewal cycle • AR for INTERCONNECT: Linked spreadsheet → send to DESIGN folks • AR for INTERCONNECT: “Algorithm” for creating idealized interconnect stack → send to both ITWGs • AR for Werner Weber (DESIGN): Distribute the IBM paper he referred to (re optimal interconnect stacks) • AR for DESIGN: Variability control requirement → send to INTERCONNECT folks • planarization, width/spacing, … issues are all subsumed within this AR • Example metric: “cross-sectional variance per unit length”… • Note: INTERCONNECT ITWG would like a more well-supported/motivated planarization metric… (due to time considerations, INTERCONNECT ITWG was not intending to address variability in detail in this renewal cycle) • AR for DESIGN: (Hiwatashi-san) Work with Japan TWG (Ohsaki-san) to understand the interconnect stack performance requirement issue • Is the position of the Japan TWG that both #levels and “minimum dimensions” be increased? If so, what is the motivation/thinking behind this? (Initial reaction: increasing #levels is probably justifiable; increasing minimum dimensions is not as necessary since designers have non-minimum dimensions available to them, and since global interconnect performance is solvable by pipelining, etc.) • AR for Chris Case: broadcast the Japan TWG slides (Ohsaki-san’s slides) • AR for DESIGN: address the question of #Metal levels (should they increase?) by proposing a model → send to INTERCONNECT folks • AR for DESIGN: Comment on NEED for particular effective ILD permittivities, particularly red brick values • At least, will create a dialogue about any differences in “requirements” • AR for INTERCONNECT: Explain CEP and any potential changes for Design • e.g., how does need for dummy features change? • AR for DESIGN: How should Interconnect parameters be driven by design?
• Example hope: “Crosstalk metric” (e.g., DRAM makers must decide when to invest …) • Potential criteria for metrics: Delay uncertainty? Power delivery? Via impact factor / porosity / routability • AR for DESIGN: Variability requirement for on-chip passives (e.g., driven by Q variability needs) • motivation: thick metal planarization (e.g., 5um high spiral inductors) • AR for DESIGN/INTERCONNECT: Work out whether PSM of LI/M1 layers needed • Check spreadsheet to determine whether pitches vs. wavelengths for use of PSM • AR for DESIGN/INTERCONNECT: work with LITHO, PIDS, FEP to build common big spreadsheet

  48. Post-Meeting Notes (Litho 1/2) • AR for LITHO: Can cost of mask be added into the ITRS? • As a “precedent”, we would cite the existence of a “Data Volume” requirement in Table 41. • Independent of this issue, can Litho TWG help us gather this information? • AR for LITHO: Is local interconnect going to use strong phase-shifting? If so, when? • AR for LITHO: Can the CD variability requirement (Table 39) be relaxed based on input from DESIGN ITWG? • Design also joins in the apparently existing request for a more precise definition of CD control (see below). • What should the process be if Design and Litho end up with different requirements for CD control? • AR for LITHO: Continuing with the variability discussion, could the LITHO ITWG supply DESIGN ITWG with explanations of WHY particular requirements exist? • These requirements, in our opinion, must exist to achieve some target level of control over the manufactured features. (If we are wrong in this understanding, please tell us.) • Design needs to know the relationship between the Litho requirement and the eventual effect in terms of control over functional aspects of the manufactured devices and interconnects. As examples, we list the following. • Table 41: wafer overlay, magnification, mask minimum image size, mask OPC feature size, image placement, mask design grid, etc. (One observation is that Litho ITWG defines a mask OPC feature size requirement, yet is unable to tell Design ITWG the underlying assumptions as to what the TYPE of OPC might be!) • Table 41 (continued): AttPSM transmission mean deviation from target (e.g., why is this mean only?), AttPSM transmission uniformity (don’t these transmission deviation/uniformity requirements depend on the nominal values of transmission?) • AR for LITHO: Please tell us what needs to be improved (i.e., by DESIGN Technology) in the design-litho flow. • For example, does the mask cost structure require function- and cost-driven mask data prep / RET? 
• As suggested above, we are confused by Litho ITWG’s inability to provide guidance re efficiency and cost issues, given that there are data volume requirements, OPC minimum feature size, etc. requirements in Table 41. It seems to us that data volume requirements (for example) are completely driven by throughput, storage, etc. – i.e., cost – considerations. • With respect to the design-litho flow, what information would LITHO like to see from DESIGN?

  49. Post-Meeting Notes (Litho 2/2)
  • AR for LITHO: Please supply formal (exact) definitions of the variabilities that are coming into design. (Some of the following may overlap with previous AR statements.)
    • Are the variabilities corrected or uncorrected?
    • What correction mechanisms are assumed? (It matters GREATLY to Design whether these assume OAI, annular/quadrupole illumination, particular OPC technologies (SRAFs, level of OPC aggressiveness), particular PSM technologies (AltPSM (full-poly? LI layer?), AttPSM), etc.)
    • Specify precisely where in the reticle these variabilities occur. Design requires knowledge of the decomposition of variability into systematic and random components; we need to know what is correctable in the mask data prep flow (RET), what is correctable in Design (e.g., compensating for coma by assuming that “lower-quality” features will print from the periphery of the reticle), and what is uncorrectable. Put another way, the agreement during the discussion was that Litho would break out variability into systematic / random and cross-chip / local components.
  • AR for LITHO: Please define (formally) OPC and PSM for us.
    • The level of detail we seek is suggested in the previous AR.
  • AR for LITHO: Please explain the model/derivation of magnification (Table 41).
  • AR for DESIGN: Can large stitching tolerances be allowed in the mask-making process? The EPL tool requires such stitching every M microns (M = 250 or 8000? Not clear from the discussion). This would apparently yield a new problem or constraint for layout and floorplanning.
  • AR for DESIGN: Are the presently specified overlay tolerances acceptable?
  • LITHO comments during discussion: (1) Lithography sets the pace of the roadmap; (2) cost does not come into play in Litho’s roadmapping (???); (3) many answers to the questions from Design cannot be shared in an open forum (but why is Litho discussing SRAF metrology requirements with Metrology while at the same time being unable to provide a one-bit (yes/no) indication to Design as to whether SRAFs should be assumed in future masks?); (4) CD variability requirements may vary by level.
  • Comment re Data Volume, and AR for LITHO: Please indicate the model used to set the “requirement”.
    • Members of the Design ITWG have studied this issue (including the GDSII Stream, MEBES, etc. formats) and are curious how the numbers were derived.
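The systematic/random decomposition requested above can be made concrete with a toy numerical sketch. All quantities here are hypothetical illustrations (field count, site grid, delay magnitudes, noise levels), not ITRS or Table 39/41 numbers: the point is only that the systematic (correctable) component can be estimated by averaging repeated exposure fields, leaving a random residual.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 50 repeated exposure fields, each sampled at 25 sites.
# Total CD error = systematic across-field signature (identical in every field)
#                + random local noise (independent per site and field).
n_fields, n_sites = 50, 25
x = np.linspace(-1.0, 1.0, n_sites)
systematic = 1.5 * x**2 - 0.5                                   # e.g., a bowl-shaped lens signature (nm)
random_local = rng.normal(0.0, 0.8, size=(n_fields, n_sites))   # local noise (nm)
cd_error = systematic + random_local

# Estimate the systematic (correctable via RET / design compensation) part
# as the per-site mean over fields; the residual is the random part.
sys_est = cd_error.mean(axis=0)
resid = cd_error - sys_est

var_total = cd_error.var()
var_sys = sys_est.var()
var_rand = resid.var()
print(f"total variance:      {var_total:.3f} nm^2")
print(f"systematic variance: {var_sys:.3f} nm^2")
print(f"random variance:     {var_rand:.3f} nm^2")
# Because the residual has zero per-site mean, the components add:
# var_total == var_sys + var_rand.
```

This is why Design asks which components are correctable: the systematic part can in principle be removed in mask data prep or compensated in layout, while only the random part must be budgeted as uncorrectable variation.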

  50. Post-Meeting Notes (Test 1/2)
  • AR for DESIGN: Statistically aware timing/crosstalk analysis: is this good enough for signoff, or does Test have to be a backstop for this / clean it up?
  • AR for DESIGN: Send the model for #logic and #memory transistors on-chip to the TEST folks.
  • AR for DESIGN: What percentage of the memory arrays have BIST? (This can avoid having APGs on test equipment.)
  • AR for DESIGN: Send to the TEST folks all information on normal-operation power dissipation, frequencies, #IOs, etc. for all System Drivers (i.e., the types of blocks we have) – highest-frequency pins (split power/ground vs. signal pins) (IEEE 1394), SDRAM/DDR/RDRAM speeds, pin inductance, …
  • AR for DESIGN: What kind of design debug environment is envisaged? (first-silicon design debug)
  • AR for TEST: Describe the boundary between design and test.
  • AR for TEST: Provide test design rules for design. (N.b.: memory test is a totally different animal.)
  • AR for TEST: Set a constraint on Design (in terms of test cost, etc.).
  • AR for TEST: Send to the DESIGN folks all comments on test_column_feb01.ppt.
  • AR for TEST: Identify power delivery, dI/dt, and other constraints from Test that may impact Design.
  • AR for DESIGN AND TEST: Need to split the Design and Test material among three chapters (System Drivers, Design, Test).
  • AR for DESIGN: Describe the System Drivers chapter to Test.
  • AR for TEST: Pass along the SOC test material.
  • AR for DESIGN AND TEST: Define the proper group of liaison folks (Mike, Tom, Tim, Rob, …?). Should also set up processes/schedule for interaction and convergence.
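The signoff question about statistically aware timing analysis turns on how per-gate variation accumulates along a path. A minimal Monte Carlo sketch illustrates the difference from corner-based signoff; the gate count, nominal delay, sigma, and independence assumption below are all illustrative, not roadmap data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical critical path: 12 gates, nominal delay 50 ps each,
# with 10% (one-sigma) process-induced variation, assumed independent.
n_gates, nominal_ps, sigma_frac = 12, 50.0, 0.10
samples = rng.normal(nominal_ps, sigma_frac * nominal_ps, size=(100_000, n_gates))
path_delay = samples.sum(axis=1)

# Corner-based signoff sums worst-case (3-sigma) gate delays;
# statistical signoff takes a high percentile of the path delay distribution.
corner_ps = n_gates * nominal_ps * (1 + 3 * sigma_frac)
statistical_ps = np.percentile(path_delay, 99.87)   # ~3-sigma tail of the path
print(f"corner-based bound:      {corner_ps:.1f} ps")
print(f"statistical 3-sigma est: {statistical_ps:.1f} ps")
# With independent per-gate variation, sigmas add in quadrature (RSS),
# so the statistical bound is much tighter than the linear corner sum.
```

If Design signs off against the tighter statistical bound, the question posed to Test is whether manufacturing test must then screen the (rare) parts in the tail that the corner analysis would have covered by construction.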
