The Design-Manufacturing Roadmap • Andrew B. Kahng • UC San Diego CSE & ECE Departments • http://vlsicad.ucsd.edu
Outline • The Design Roadmap • DFM: Symptoms, Problem, Solution • DFM Futures: Some Examples
Big Picture • Message: Cost of Design threatens continuation of the semiconductor roadmap • Design cost model • Challenges are now Crises • Strengthen bridge from semiconductors to applications, software, architectures • Hertz and bits are not the same as efficiency and utility • System Drivers chapter, with productivity and power foci • Strengthen bridges among ITRS technologies • “Shared red bricks” can be solved (or, worked-around) more cost-effectively • “Manufacturing Integration” cross-cutting challenge • “Living ITRS” framework to promote consistency validation
“Living ITRS” Framework • “Living roadmap”: internally consistent, transparent models as basis of ITRS predictions • ORTCs: Models for layout density, system clock speed, total system power in various drivers, circuit fabrics • Visualization tool (at Sematech website) for capture, exploration of ITRS models under alternative scenarios • “Is --- worth it?”
Design Challenges - Silicon • Silicon Complexity = impact of process scaling, new materials, new device/interconnect architectures • Non-ideal scaling (leakage, power management, circuit/device innovation, current delivery) • Coupled high-frequency devices and interconnects (signal integrity analysis and management) • Manufacturing variability (library characterization, analog and digital circuit performance, error-tolerant design, layout reusability, static performance verification methodology/tools) • Scaling of global interconnect performance (communication, synchronization) • Decreased reliability (SEU, gate insulator tunneling and breakdown, joule heating and electromigration) • Complexity of manufacturing handoff (reticle enhancement and mask writing/inspection flow, manufacturing NRE cost)
Design Challenges - System • System Complexity = exponentially increasing transistor counts, with increased diversity (mixed-signal SOC, …) • Reuse (hierarchical design support, heterogeneous SOC integration, reuse of verification/test/IP) • Verification and test (specification capture, design for verifiability, verification reuse, system-level and software verification, AMS self-test, noise-delay fault tests, test reuse) • Cost-driven design optimization (manufacturing cost modeling and analysis, quality metrics, die-package co-optimization, …) • Embedded software design (platform-based system design methodologies, software verification/analysis, codesign w/HW) • Reliable implementation platforms (predictable chip implementation onto multiple fabrics, higher-level handoff) • Design process management (team size / geog distribution, data mgmt, collaborative design, process improvement)
Design Chapter Outline • Introduction • Scope of design technology • Complexities (silicon, system) • Design Cross-Cutting Challenges • Productivity • Power • Manufacturing Integration • Interference • Error-Tolerance • Details of five traditional technology areas: Design Process, System-Level, Logical/Physical/Circuit, Functional Verification, Test • Key 2003 changes • Increased analog and circuits content • Refinement of design cost metrics • Design system architecture and flow • SEU and reliability
Design Technology Crises • 2-3X more verification engineers than designers on microprocessor teams • Software = 80% of system development cost (and analog design hasn’t scaled) • Design NRE > 10’s of $M vs. manufacturing NRE $1M • Design TAT = months or years vs. manufacturing TAT = weeks • Without DFT, test cost per transistor grows exponentially relative to mfg cost • [Figure labels: Incremental Cost Per Transistor; Test; Manufacturing; SW Design; HW Design; Verification; NRE Cost; Turnaround Time]
Challenge: “Manufacturing Integration” • Goal: share red bricks with other ITRS technologies • Lithography CD variability requirement → new Design techniques that can better handle variability? • Mask data volume requirement → new Design-Mfg interfaces and flows that pass functional requirements, verification knowledge to mask writing and inspection? • ATE cost and speed red bricks → new DFT, BIST/BOST techniques for high-speed I/O, signal integrity, analog/MS? • Can technology development reflect ROI (value / cost) analysis: Who should solve a given red brick? • [Figure label: Shared Red Bricks]
Example: Manufacturing Test • High-speed interfaces (networking, memory I/O) • Frequencies on same scale as overall tester timing accuracy • Heterogeneous SOC design • Test reuse • Integration of distinct test technologies within single device • Analog/mixed-signal test • Reliability screens failing • Burn-in screening not practical with lower Vdd, higher power budgets → overkill impact on yield • Design Challenges: DFT, BIST • Analog/mixed-signal • Signal integrity and advanced fault models • BIST for single-event upsets (in logic as well as memory) • Reliability-related fault tolerance
Example: Lithography • 10% CD uniformity requirement causes red bricks • 10% < 1 atomic monolayer at end of ITRS • This year: Lithography, PIDS, FEP agreed to relax CD uniformity requirement (but we still see red bricks) • Design challenge: Design for variability • Novel circuit topologies • Circuit optimization (conflict between slack minimization and guardbanding of quadratically increasing delay sensitivity) • Centering and design for $/wafer • Design challenge: Design for when devices, interconnects no longer 100% guaranteed correct • Can this save $$$ in manufacturing, verification, test costs?
Outline • The Design Roadmap • DFM: Symptoms, Problem, Solution • DFM Futures: Some Examples
Symptoms: Routing Rules (1) • Minimum area rules and via stacking • Stacking vias through multiple layers can cause minimum area violations (alignment tolerances, etc.) • Via cells can be created that have more metal than minimum via overlap (used for intermediate layers in stacked vias) • Multiple-cut vias • Use multiple-cut via cells to increase yield and reliability • Can be required for wires of certain widths • Multiple via cut patterns have different spacing rules • Four cuts in quadrilateral; five cuts in cross; six cuts in 2x3 array; … • With wide-wire spacing rules, complicates pin access • Cut-to-cut spacing rules check both cut-to-cut and metal-to-metal when considering via-to-via spacing • Line-end extensions • Vias or line ends need additional metal overlap (0th-order OPC)
Symptoms: Routing Rules (2) • Width- and Length-dependent spacing rules • Width-dependent rules: domino effects • Variant: “parallel-run rule” (longer parallel runs → more spacing) • Measuring length and width: halo rules affect computation • Influence rules or stub rules • A fat wire, e.g., power/ground net, will influence the spacing rule within its surroundings: any wire within X um of the fat wire needs to be at least Y um away from any other geometry • Example: fat wire with thin tributaries • Bigger spacing around every wire within certain distance of the thin tributaries • ECO insertion of a tributary causes complications • Strange jogs and spreading when wires enter an influenced area
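A minimal sketch of how an influence rule of this kind might be checked, assuming axis-aligned rectangular wires and made-up values for the fat-wire threshold, X, and Y. The names `Rect`, `gap`, and `influence_violations` are hypothetical and do not come from any router or rule deck.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Rect:
    """Axis-aligned wire shape; coordinates in um (illustrative only)."""
    name: str
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def width(self) -> float:
        # Wire width taken as the smaller dimension of the rectangle.
        return min(self.x2 - self.x1, self.y2 - self.y1)


def gap(a: Rect, b: Rect) -> float:
    """Edge-to-edge Euclidean distance (0 if the shapes touch or overlap)."""
    dx = max(a.x1 - b.x2, b.x1 - a.x2, 0.0)
    dy = max(a.y1 - b.y2, b.y1 - a.y2, 0.0)
    return (dx * dx + dy * dy) ** 0.5


def influence_violations(shapes, fat_width, x_um, y_um):
    """Pairs where a wire within x_um of a fat wire is closer than y_um to another shape."""
    fat = [s for s in shapes if s.width >= fat_width]
    influenced = {
        s for s in shapes
        if s not in fat and any(gap(s, f) <= x_um for f in fat)
    }
    return [
        (a.name, b.name)
        for a, b in combinations(shapes, 2)
        if (a in influenced or b in influenced) and gap(a, b) < y_um
    ]


shapes = [
    Rect("vdd", 0.0, 0.0, 10.0, 2.0),  # fat power wire (width 2.0)
    Rect("a", 0.0, 2.8, 10.0, 3.0),    # thin wire inside the influence region
    Rect("b", 0.0, 3.3, 10.0, 3.5),    # thin wire about 0.3 um from "a"
    Rect("c", 0.0, 6.0, 10.0, 6.2),    # far from the fat wire
    Rect("d", 0.0, 6.5, 10.0, 6.7),    # about 0.3 um from "c", but not influenced
]
violations = influence_violations(shapes, fat_width=1.0, x_um=1.0, y_um=0.5)
```

Wires "c" and "d" have the same mutual spacing as "a" and "b" but are not flagged, which is the domino-like context dependence that makes such rules hard for a router: legality of a spacing depends on what else is nearby.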
Symptoms: Routing Rules (3) • Density • Grounded metal fills (dummy fill*) • Via isodensity rules and via farm rules (via layers must be filled and slotted, have width-dependent spacing rule analogs, etc.) • Non-rectilinear (λ-geometry) routing • X-Architecture: http://www.xinitiative.org/ • Y-Architecture: http://vlsicad.ucsd.edu/Yarchitecture/ , LSI Logic patents • Landing pad shapes (isothetic rectangle vs. octagon vs. circle), different spacings (~1.1x) between diagonal and Manhattan wires, etc. • More exceptions • More non-default classes (timing, EM reliability, …) • Not just power and clock • >0.25um width may be “wide” → many exceptions
Symptoms: Routing Rules • Degrade completion rates, runtime efficiency • “Postprocessing” likely no longer suffices • E.g., antennas • There is no chip until the router is done • Must / Should / Can tomorrow’s IC routers “independently” address these issues?
Corollaries of Moore’s Law • Data volume, mask write time explosion • RET layers explosion • Design rules explosion • [Figure: number of design rules per process node]
Whose Job Is It To Solve: • Mask NRE cost (runtimes, shapes, complexity) • BEOL catastrophic yield loss • Deposited copper can infer yield loss mechanisms • Open faults more prevalent than short or bridging faults • High-resistance via faults • Cf. “non-tree routing” for reliability and yield? • Variability budget for planarization • Copper is soft → dual-material polish mechanisms • Oxide erosion and copper dishing → cross-sectional variability, inter-layer bridging faults, … • Low-k: thermal properties, anisotropy, nonuniformity • Resistivity at small conductor dimensions
The Problem: Evolution • Conflicting goals • Designer: “freedom”, “reuse”, “migration” • EDA: “maintenance mode” • Process/foundry: “enhance perceived value” (= add rules) • Prisoner’s Dilemma: who will invest in change? • Fiddling: Incremental, linear extrapolation of current trajectory • “GDS-3” • Thin post-processing layers (decompaction, RET insertion, …) • Leads to “dark future” (12th Japan DA Show keynote)
DAC-2003 Nanometer Futures Panel: Where should extra R&D $ be spent?
The Solution: Co-Evolution • Designer, EDA, and process communities cooperate and co-evolve to maintain the cost (value) trajectory of Moore’s Law • Must escape Prisoner’s Dilemma • Must be financially viable • At 90nm to 65nm transition, this is a matter of survival for the worldwide semiconductor industry • Example Focus Areas: • Explicit manufacturability and cost/value optimization • Restricted layout • Intelligent mask data prep • “Analog” (not binary) rules • (Many layout and design optimizations) • Disclaimer: Not a complete listing
Example: Today’s RET Flow • [Figure: flow stages: Litho/Process (Tech. Development); Design Rules, Device Models; Library (Library Team); Layout & libs (Corner Case Timing); Design (ASIC Chip); Layout (collection of polygons?); Tapeout; RET; Mask: Dataprep (Mask House)] • Guardbanding all the way in all stages!! (e.g., clock ACLV guardband ~ 30%) • What do we lose? • Performance: too much worst-casing • Turnaround time: huge OPC runtimes, overdesign • Predictability: RET is applied post-design • Mask costs: overcorrection • Designer’s intent: RET is not driven by design
Foundation of the DFM Solution • Bidirectional design-manufacturing data pipe • Fundamental drivers: cost, value • Pass functional intent to manufacturing flow • Example: RET for predictable timing slack, leakage, yield • RETs should win $$$, reduce performance variation • Cost-driven, parametric yield constrained RET • Pass limits of manufacturing flow up to design • Example: avoid corrections that cannot be manufactured or verified • E.g., design should be aware of metrology • N.B.: 1998-2003 papers/tutorials: http://vlsicad.ucsd.edu/~abk/TALKS/
Outline • The Design Roadmap • DFM: Symptoms, Problem, Solution • DFM Futures: Some Examples
#1: Design for Value* • Mask cost trend → Design for Value (DFV) • Design for Value Problem: Given • Performance measure $f$ • Value function $v(f)$ • Selling points $f_i$ corresponding to various values of $f$ • Yield function $y(f)$ • Maximize Total Design Value $= \sum_i y(f_i)\, v(f_i)$ [or, Minimize Total Cost] • Probabilistic optimization regime • * See "Design Sensitivities to Variability: Extrapolation and Assessments in Nanometer VLSI", IEEE ASIC/SoC Conference, September 2002, pp. 411-415.
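The Design for Value objective (yield times value, summed over selling points) can be sketched numerically as below. This is an illustration under stated assumptions, not the formulation from the cited paper: the Gaussian performance distribution, the binning rule (each part is sold at the highest selling point it meets), the linear price function, and the names `total_design_value` and `bin_yields` are all hypothetical.

```python
from statistics import NormalDist


def total_design_value(selling_points, yield_fn, value_fn):
    """Sum over selling points of yield at that point times its value."""
    return sum(yield_fn(f) * value_fn(f) for f in selling_points)


def bin_yields(selling_points, dist):
    """Assumed binning: a part is sold at the highest selling point it meets.

    Returns {selling_point: fraction of parts landing in that bin}.
    """
    pts = sorted(selling_points)
    yields = {}
    for lo, hi in zip(pts, pts[1:] + [float("inf")]):
        upper = 1.0 if hi == float("inf") else dist.cdf(hi)
        yields[lo] = upper - dist.cdf(lo)
    return yields


# Hypothetical numbers: performance in GHz, Gaussian across manufactured parts.
dist = NormalDist(mu=2.0, sigma=0.2)
points = [1.8, 2.0, 2.2]
y = bin_yields(points, dist)

# Assumed value function: price grows linearly with performance.
value = total_design_value(points, y.__getitem__, lambda f: 100.0 * f)
```

Parts slower than the lowest selling point contribute nothing, so tightening the distribution or shifting its mean changes total value through the yield terms. That trade-off is what a probabilistic optimization would explore.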
Obvious Step: Function-Aware OPC • Annotate features with “required amount” of OPC • E.g., why correct dummy fill? • Determined by design properties such as setup and hold timing slacks, parametric yield criticality of devices and features • Reduce total OPC inserted (e.g., SRAF usage) • Decreased physical verification runtime, data volume • Decreased mask cost resulting from fewer features • Supported in data formats (OASIS, IBM GL-I, OA/UDM) • Design-through-mask tools need to make, use annotations • N.B.: General RET trajectory: rules → models → libraries
DFV in OPC Regime • Given: Admissible levels of (OPC) correction for each layout feature, and corresponding delay impact (mean and variance) • Find: Level of correction for each layout feature, such that a prescribed selling point delay is attained • Objective: Minimize total cost of corrections
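A toy instance of this problem, solved by exhaustive search, is sketched below. It assumes a single path whose features have perfectly correlated variation, so the mean-plus-k-sigma delays of the features simply add; the correction levels, costs, and delay numbers are invented, and `min_cost_correction` is a hypothetical helper rather than the MinCorr algorithm itself.

```python
from itertools import product


def min_cost_correction(features, delay_target, k=3.0):
    """Pick one correction level per feature to meet a delay target at minimum cost.

    features: list (one entry per feature) of lists of (cost, mu, sigma) options.
    Assumes all features lie on one path with perfectly correlated variation,
    so the selling-point delay is the sum of (mu + k * sigma) over features.
    Returns (total_cost, chosen_options) or None if the target is unreachable.
    """
    best = None
    for choice in product(*features):
        cost = sum(c for c, _, _ in choice)
        delay = sum(mu + k * sigma for _, mu, sigma in choice)
        if delay <= delay_target and (best is None or cost < best[0]):
            best = (cost, choice)
    return best


# Hypothetical levels: more correction costs more but tightens sigma.
levels = [
    (1, 10.0, 2.0),   # no OPC:         mu + 3 sigma = 16.0
    (3, 10.0, 1.0),   # medium OPC:     mu + 3 sigma = 13.0
    (6, 10.0, 0.5),   # aggressive OPC: mu + 3 sigma = 11.5
]
features = [levels, levels]

best = min_cost_correction(features, delay_target=28.0)
infeasible = min_cost_correction(features, delay_target=20.0)
```

Here two medium corrections (cost 6) beat one aggressive plus one uncorrected feature (cost 7). Exhaustive search is exponential in the number of features, which is why a scalable formulation is needed for real designs.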
Variation-Aware Library Models • Each capacitance or delay value replaced by (μ,σ) pair • Variation-aware .lib:
    pin(A) {
      direction : input;
      capacitance : (0.002361,0.0003) ;
    }
    …
    timing() {
      related_pin : "A";
      timing_sense : positive_unate;
      cell_rise(delay_template_7x7) {
        index_1 ("0.028, 0.044, 0.076");
        index_2 ("0.00158, 0.004108, 0.00948");
        values ( \
          "(0.04918,0.001), (0.05482,0.0015), (0.06499,0.002)", ….
Correction = Mask Cost = CD Control • Levels of RET = levels of CD control • OPC solutions due to K. Wampler, MaskTools, March 2003 • CD studies due to D. Pramanik, Numerical Technologies, December 2002
Generic SSTA-Based Cost of Correction Methodology • Statistical STA (SSTA) provides PDFs of arrival times at all nodes • Assume variation-aware library models (for delay) are available • Statistical STA currently has runtime and scalability issues • [Figure: flowchart with boxes Nominally Correct SP&R Netlist; Min. Corrected Library; SSTA; Yield Target met? (Y: EXIT, N: Correction Algorithm); All Correction Libraries]
MinCorr: Parallels to Gate Sizing • Assume • Gaussian-ness of distributions prevails • $\mu + 3\sigma$ corresponds to 99% yield • Perfect correlation of variation along all paths • Die-to-Die variation • $\mu_{1+2} + 3\sigma_{1+2} = \mu_1 + 3\sigma_1 + \mu_2 + 3\sigma_2$ • Resulting linearity allows propagation of $(\mu + 3\sigma)$ or 99% (selling point) delay to primary outputs using standard Static Timing Analysis (STA) tools • (See DAC-2003 paper)
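The linearity claim can be checked with a few lines of arithmetic. The sketch below compares the perfectly correlated case assumed on this slide, where standard deviations add, against independent variation, where variances add; stage values are invented and the function names are illustrative.

```python
import math


def corner_delay(mu, sigma, k=3.0):
    """Mean-plus-k-sigma delay of a single stage or path."""
    return mu + k * sigma


def series_correlated(stages):
    """Two or more stages in series with perfectly correlated variation: sigmas add."""
    return sum(m for m, _ in stages), sum(s for _, s in stages)


def series_independent(stages):
    """Same stages with independent variation: variances add instead."""
    return sum(m for m, _ in stages), math.sqrt(sum(s * s for _, s in stages))


stages = [(10.0, 1.0), (20.0, 2.0)]  # (mu, sigma) per stage, hypothetical units

# Sum of per-stage corner delays, which is what a standard STA tool would propagate.
sum_of_corners = sum(corner_delay(m, s) for m, s in stages)

correlated = corner_delay(*series_correlated(stages))
independent = corner_delay(*series_independent(stages))
```

With perfect correlation the propagated sum matches the true path corner exactly (39.0 here). With independent variation the same sum would overestimate the path corner, so the perfect-correlation assumption is what makes a deterministic STA tool usable for this purpose.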
MinCorr: Parallels to Gate Sizing • Gate Sizing Problem: Given allowed areas and corresponding delays of each cell, minimize total die area subject to a cycle time constraint • Correspondence (gate sizing → MinCorr): cell area → cost of correction; delay → delay ($\mu + k\sigma$); cycle time → selling point delay; die area → cost of OPC
MinCorr Methodology (DAC-03) • Mapping of area minimization to RET cost optimization • “Yield library” analogous to timing libraries (e.g., .lib) • Synthesis tool (Design Compiler) performs “gate sizing” • Figure counts, critical dimension (CD) variations derived from Numerical Technologies OPC tool* • Restricted TSMC 0.13 µm library (7 cell masters: BUF, INV, NAND, NOR) • Approach tested on small combinational circuits • alu128: 8064 cells • c7552: 2081 cell ISCAS85 circuit • c6288: 2769 cell ISCAS85 circuit • Up to 79% reduction in figure complexity without any parametric yield impact
Library-Based OPC • OPC applied post-tapeout • Overcorrection (matching corners) → mask cost • Large runtimes • Impact of OPC on performance unknown • Designer’s intent → OPC quality metrics • CD (Poly over active) • Non-critical poly needs less control • Contact Coverage • “Perfect” corners not needed if there is enough contact overlap
Library-Based OPC • Idea: Dataprep each cell once per definition (during library generation) rather than once per placement • Model-based OPC very compute-intensive • Reduce runtime and data size by (#cell placements)/(#cell definitions) (~100s to millions) • Radius of influence for 193nm light is about 600nm • Most cells have 200-300nm empty space at the boundaries → distance to nearest poly line in any placement > 400nm • Small loss in accuracy for fingers at the periphery of the library cell • Post-OPC GDSII much smaller in size • Impact of RET predictable during design: characterize library cells post-OPC → can prevent a lot of guardbanding, avoid intricate OPC • [Figure dimensions: 640nm, 760nm]
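As a worked example of the two quantities on this slide, the sketch below computes the placements-per-definition ratio and checks whether a neighboring cell's poly can fall inside the optical radius of influence. It assumes the neighbor distance is the sum of the two abutting cells' boundary margins; the function names are hypothetical, and the cell counts are borrowed from the alu128 test case purely for illustration.

```python
def dataprep_reduction(num_placements, num_definitions):
    """Ratio of per-placement work to per-definition work."""
    return num_placements / num_definitions


def neighbor_within_influence(margin_a_nm, margin_b_nm, radius_nm=600):
    """True if poly in an abutting cell can lie inside the radius of influence.

    Assumes the nearest neighboring poly is at least margin_a + margin_b away.
    """
    return margin_a_nm + margin_b_nm < radius_nm


# Illustrative only: 8064 placed cells drawn from 7 cell masters.
reduction = dataprep_reduction(8064, 7)

# Two cells with 200nm of empty space each: neighbors at >= 400nm, inside 600nm.
tight = neighbor_within_influence(200, 200)

# Two cells with 300nm of empty space each: neighbors at >= 600nm.
loose = neighbor_within_influence(300, 300)
```

The `tight` case is why some accuracy is lost for fingers at the cell periphery: at the minimum margins, a neighbor's poly can still sit inside the radius of influence, so the emulated environment used at library time matters.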
Experiments: Environment • Need to emulate a “typical” environment for the cell in a placement • Border Poly: 160nm from outline • Affects final CD • Top-Bottom Poly: 70nm from outline • Affects contact coverage, mask rule violations • Contact Poly: depends on contacts • Affects contact coverage, mask rule violations
Results: Average CD • CD error: Library-OPC vs. full-chip OPC • N-i% denotes % of devices with less than i% error w.r.t. full-chip OPC • Library OPC runtime is 90 seconds for 10 masters
#2: Process-Aware Design • Anisotropy in H vs. V bias • Features in one direction (scanning, raster write, …) may be better controllable than those in the orthogonal direction • Single orientation throughout layout is preferred • Dominant (critical-feature) orientation in layout design should match write direction • Wafer symmetries (e.g., etch gradient due to spin-on) • Iso-Dense balancing (imaging through focus)
Systematic ACLV • ACLV = Through-pitch variation (50%) + Topography variation (10%) + Mask variation + Etch, residuals • Current timing analysis (statistical or deterministic STA) assumes all variation is ‘random’ • 50% of ACLV can be predictable by analyzing layout • “Smile-frown” plots indicate: • Through-focus variation is systematic • Corners for timing analysis are derived from worst-case ACLV tolerance → instance-specific tolerances are much tighter • Figure courtesy ASML MaskTools
Taming Pattern and Focus Variation • Obtain a set of nominal CD (wafer image simulation) for typical environments of the cell in a chip → environment-specific timing libs (typical ASIC libs → very limited set of environments) • Run in-context STA (post-placement) with context-specific timing libs → accurate nominal timing at zero focus condition • Input-to-output delay modeling based on the iso-ness and dense-ness of transistors in the input-to-output paths → more accurate delay variation analysis in STA
Example of Smile-Frown Aware STA • If all timing arcs frown, then the path delay will always decrease through focus → one corner is trimmed off! • If slopes of smile/frown curves are known → circuit sensitivity to focus variation can be computed • [Figure: composition of smile/frown curves along a path]
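A small sketch of the corner-trimming argument, under a loudly stated assumption: each timing arc's delay is modeled as a quadratic in defocus, `d0 + c * z**2`, with `c > 0` for a "smile" (delay rises through focus) and `c < 0` for a "frown" (delay falls). The quadratic form, the numbers, and the function names are illustrative and are not taken from the slides.

```python
def path_delay(arcs, z):
    """Sum of arc delays at defocus z; each arc is (nominal_delay, curvature)."""
    return sum(d0 + c * z * z for d0, c in arcs)


def focus_corners(arcs, z_max):
    """(fastest, slowest) path delay over defocus in [-z_max, z_max].

    With the symmetric quadratic model, the extremes occur at z = 0 or |z| = z_max.
    """
    nominal = path_delay(arcs, 0.0)
    edge = path_delay(arcs, z_max)
    return min(nominal, edge), max(nominal, edge)


def focus_sensitivity(arcs):
    """Net curvature of the path: sum of the arc curvatures."""
    return sum(c for _, c in arcs)


all_frown = [(100.0, -50.0), (80.0, -20.0)]  # every arc gets faster through focus
mixed = [(100.0, 50.0), (80.0, -20.0)]       # one smile, one frown

frown_fast, frown_slow = focus_corners(all_frown, z_max=0.2)
mixed_fast, mixed_slow = focus_corners(mixed, z_max=0.2)
```

For the all-frown path the slowest delay is the nominal, zero-focus delay, so no extra slow corner is needed for focus variation. For the mixed path the smile and frown partly cancel, and the net curvature gives the path's sensitivity to focus.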
#3: Intelligent MDP + Mask Write • MDP driven by (write error * MEEF) = wafer CD error • Partitioning into multiple gray-scale writing passes • Apertures, beam currents, dwell times, shot ordering, … • EDA tools define stripe, major field, subfield boundaries! • Electrical / functional defect criteria
#4: Mask Write Optimizations • Conflicting goals: resolution, CD control, throughput • Resist heating = large contributor to mask CD variation • Knobs: beam current, flash size, idle times, grayscale passes • Subfield writing order = example new knob • Reduced heating → increased beam current density • Reduced dwell time → compensates for travel and settling time • [Figure: Ordering #1 vs. Ordering #2] • Ordering #2 is “self-avoiding” → lower pre-flash temps
“Self-Avoiding” Subfield Order for Mask Write • SPIE Microlithography ’03, Photomask Japan ’03 • Simulation of subfield temperatures within a main deflection field for sequential vs. greedily optimized writing schedules • Sequential schedule: max 48.85 °C, mean 27.59 °C • Greedily optimized schedule: max 32.68 °C, mean 16.07 °C
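The idea of a greedily optimized, self-avoiding writing order can be sketched with a toy heat model. Everything here is an assumption made for illustration: heat from each written subfield falls off as a Gaussian in distance and decays geometrically per write step, and the next subfield is simply the coolest unwritten one. This is not the thermal simulator or the scheduling algorithm from the cited papers, and `preflash_temp` and `greedy_schedule` are hypothetical names.

```python
import math


def preflash_temp(cell, written, step, sigma=1.0, decay=0.8):
    """Toy temperature at `cell` just before it is written at `step`.

    written: list of (cell, step_written) for subfields already exposed.
    """
    return sum(
        math.exp(-((cell[0] - q[0]) ** 2 + (cell[1] - q[1]) ** 2) / (2 * sigma ** 2))
        * decay ** (step - s)
        for q, s in written
    )


def greedy_schedule(n, sigma=1.0, decay=0.8):
    """Order the n x n subfields by always writing the currently coolest one."""
    remaining = [(r, c) for r in range(n) for c in range(n)]
    written, order = [], []
    for step in range(n * n):
        nxt = min(remaining, key=lambda p: preflash_temp(p, written, step, sigma, decay))
        remaining.remove(nxt)
        written.append((nxt, step))
        order.append(nxt)
    return order


def schedule_temps(order, sigma=1.0, decay=0.8):
    """Pre-flash temperature seen by each subfield for a given writing order."""
    written, temps = [], []
    for step, cell in enumerate(order):
        temps.append(preflash_temp(cell, written, step, sigma, decay))
        written.append((cell, step))
    return temps


sequential = [(r, c) for r in range(4) for c in range(4)]
greedy = greedy_schedule(4)
seq_temps = schedule_temps(sequential)
greedy_temps = schedule_temps(greedy)
```

Comparing `max(...)` and the mean of `seq_temps` and `greedy_temps` gives the toy analog of the max/mean figures quoted on the slide. The greedy order jumps to the far corner after the first flash instead of stepping to a neighbor, which is the self-avoiding behavior.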
#5: Fill Parametric Yield Impact • Performance Impact Limited Fill (PIL-Fill), DAC-2003 • Fill adds capacitance, hurts timing and SI closure • Plain capacitance minimization objective is not sufficient • CMP modeling layout density vs. dimensions built into RLCX • [Figure: top view of active lines showing width w, fill grid pitch, and buffer distance, with columns labeled A-G and rows labeled 1-6]
Min-Slack, Fill-Constrained PIL-Fill • Inputs: LEF/DEF, extracted RSPF, STA (slack) report • Drive ILP and greedy PIL-Fill methods by estimated lateral coupling and Elmore delay impact • Baseline comparison = LP/Monte-Carlo methods • Iterated greedy method for MSFC PIL-Fill reduces timing slack impact of fill by 80% (average over all nets), 63% (worst net)
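Why plain capacitance minimization is not sufficient can be seen from the Elmore delay of a simple RC chain: the same added fill capacitance costs more delay the further it sits from the driver. The sketch below is a generic Elmore calculation with each segment's capacitance lumped at its downstream node; the units and values are arbitrary and the function names are hypothetical, not part of the PIL-Fill tools.

```python
def elmore_delay(r_driver, segments):
    """Elmore delay from driver to the far end of an RC chain.

    segments: list of (resistance, capacitance), capacitance lumped at the
    downstream node of each segment (a simplifying assumption).
    """
    delay = 0.0
    upstream_r = r_driver
    for r, c in segments:
        upstream_r += r
        delay += upstream_r * c
    return delay


def delay_with_fill(r_driver, segments, index, fill_cap):
    """Elmore delay after adding fill coupling capacitance at one segment."""
    loaded = list(segments)
    r, c = loaded[index]
    loaded[index] = (r, c + fill_cap)
    return elmore_delay(r_driver, loaded)


segments = [(10, 2), (10, 2), (10, 2)]  # arbitrary units
base = elmore_delay(100, segments)

near_driver = delay_with_fill(100, segments, index=0, fill_cap=1)
near_sink = delay_with_fill(100, segments, index=2, fill_cap=1)
```

The delay increase equals the added capacitance times the total upstream resistance, so identical fill near the sink is more expensive than fill near the driver. A fill method driven by this estimate can weight candidate fill sites by their position on the net and by the net's slack.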
#6: Analog Rules • We don’t need no $#(*&(! “rules” • Rules just make lithographers feel better (?) • Ultimately, bottom line is cost of ownership, TCOG • Given adequate models of MDP, RET and Litho flows, design tools can and should optimize parametric yield, $/wafer, profits • More examples: critical-area reduction by decompaction, introducing redundancy (vias, wires), … • Automated learning of models and “implicit rules” • Current approach: test wafers, test structures, second-hand understanding • Future: machine learning techniques
#7: Restricted Layout • “Soft reset” = 1-time hit on Moore’s Law density scaling • Restricted Design Rules (“RDR”) can be compensated many ways • Embedded 1-T SRAM fabric, stacking, I/O circuit design, … • N.B.: Moore’s Law is a “meta” Law! • Example: PhasePhirst! (Levenson et al.) • [Figure: first exposure with dark-field PSMs (islands or checkerboard phase shifters, 0/180; transparent/opaque), trim mask exposure, and dual exposure result; M. D. Levenson, 2003]