Futures for DSM Physical Implementation: Where is the Value, and Who Will Pay? Andrew B. Kahng, abk@cs.ucla.edu, http://vlsicad.cs.ucla.edu, UCLA Computer Science Department. 12th DA Show, Tokyo, July 14, 2000.
Subwavelength Optical Lithography • Subwavelength gap since 0.35 µm (source: Numerical Technologies, Inc.)
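“The Design Productivity Gap”
[Figure: potential design complexity (Logic Tr./Chip) and designer productivity (Tr./S.M., transistors per staff-month) over time; the widening difference is labeled “Equivalent Added Complexity”. Price tags $10, $3, $1: “How many gates can I get for $N?”]
• Complexity growth rate: 68%/yr compounded
• Productivity growth rate: 21%/yr compounded
Staffing for a 3-year design (source: SEMATECH; the year entries of the first three rows were lost in extraction):
Year | Technology | Chip Complexity | Frequency | Staff | Staff Cost*
(missing) | 250 nm | 13 M Tr. | 400 MHz | 210 | $90 M
(missing) | 250 nm | 20 M Tr. | 500 MHz | 270 | $120 M
(missing) | 180 nm | 32 M Tr. | 600 MHz | 360 | $160 M
2002 | 130 nm | 130 M Tr. | 800 MHz | 800 | $360 M
* @ $150k per staff-year (in 1997 dollars)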
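The two compound rates on this slide determine how quickly the gap widens. The following is a minimal sketch, assuming both rates stay constant and both curves are normalized to the same starting point (an assumption made for illustration; the slide does not state a common baseline). The function name `gap_factor` is hypothetical.

```python
# Growth rates quoted on the slide (source: SEMATECH).
# Holding them constant over time is an assumption of this sketch.
COMPLEXITY_GROWTH = 0.68    # potential design complexity, per year
PRODUCTIVITY_GROWTH = 0.21  # designer productivity, per year


def gap_factor(years: int) -> float:
    """Ratio of complexity to productivity after `years`, both starting at 1.0."""
    return ((1 + COMPLEXITY_GROWTH) / (1 + PRODUCTIVITY_GROWTH)) ** years


for years in (1, 3, 5, 10):
    print(f"after {years:2d} yr: gap grows by a factor of {gap_factor(years):.1f}")
```

Under these assumptions the gap widens by roughly 39% per year, because 1.68 / 1.21 is about 1.39.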
Outline • Future DSM physical implementation technologies • design closure • design-manufacturing interface • Valuations • the significance of design productivity and design quality • structural aspects of the EDA industry • Values • toward maturity and a design productivity renaissance • Conclusions: Who Will Pay ?
What is design closure? • [Flow diagram with stages: user constraints, RTL, synthesis, netlist, logic optimization / timing verification, placement, routing, layout] • “front end consistent with back end”: meet constraints here ⇔ meet constraints there • What is the problem? (source: K. Keutzer, DAC 2000)
“Olympic Flame” (Aristo, DAC-2000 panel) • [Flow diagram comparing the ARISTO flow with a typical design flow. Labels: RTL Verilog, Gate-Level Verilog, Hard Blocks, Library, IP Blocks, Design Constraints, Design Netlist; Concurrent Block Synthesis; Block Shaping, Compaction & Concurrent Port Placement; Concurrent Block Partitioning, Clustering & Placement; Early Planning; Gate-Level Optimization; Design Refinement; Gate-Level Place & Route; Top-Level Routing; Chip Assembly; RC Extraction; Timing Analysis] • Predictable hierarchical design convergence
“Recycle Bin” (Monterey, DAC-2000 panel) • [Flow diagram with increasing modeling detail. Labels: RTL, Behavioral / RTL synthesis, statistical WLM, timing library, Timing, logic, Place, Route, Physical Prototyping, Design Signoff, GDSII]
“Anakin Skywalker’s Pod Racer” (Sequence, DAC-2000 panel) • [Flow diagram. Labels: RTL, Synthesis, Prepare Database, Place, Route, Timing Analysis (twice), Interconnect-Driven Optimization (twice), 3D Extraction, True-3D Parasitics, Delay Calculation, Timing Sign-off] • Driver sizing, topology-based optimization
Clear Thinking: Basics of Design Convergence • What must converge ? • logic, timing, and spatial embedding • support front-end signoff, provide predictable back-end • Ways to achieve Convergence through Predictability • correct by construction (“assume, then enforce”) • constraints and assumptions passed downstream; not much goes upstream • ignores concerns via guardbanding • separates concerns as able (e.g., FE logic/timing vs. BE spatial embedding) • construct by correction (“tight loops”) • logic-layout unification; synthesis-analysis unification, concurrent optimization • elimination of concerns • reduced degrees of freedom, pre-emptive design techniques • e.g., power distribution, layer assignment / repeater rules, GALS/LIS
What Must A Design Closure Tool Look Like ? • Input • RT-level HDL + technology + constraints • Output • “go”: recipe for invocation and composition of “commodity” SP&R • “no go”: diagnosis of RTL code problems • Logical and physical hierarchies co-evolve • spatial: top-down coarse placement physical hierarchy • logic/timing: implementable RTL logical hierarchy • limits of human fanout, organizations always have hierarchy • natural sequence of no-floorplanning, phys-floorplanning, RTL-floorplanning... • Details (must construct, predict, ignore, eliminate, ...) • pin optimizations, interconnect planning, hierarchy reconciliations, budgeting mechanisms, compatibility with downstream SP&R, ...
DON’T Develop This RTL Planning Technology • Don’t spend too much time packing blocks that will change • goal = early diagnosis, or handoff to commodity SP&R • pre-synthesis uncertainty = +/- 15% area, timing • wirelength, path timing → must be connectivity-centric, not packing-centric • easier to work on direct realizations of the floorplan, not representations • need relative coarse placement that adapts to incremental ECOs • Don’t over-constrain block shaping (rectangles, L’s, T’s) • placers handle constraints w/ granularity = site spacing, row height • constructive pin assignment → don’t need roundness • path timing optimization → may even want disconnected shapes • Don’t under-constrain layout region • fixed-die planning: simultaneous zero-whitespace, zero-overlap
Do Allow the Following... • [Figure: blocks Blk A and Blk B in a layout region, with labels 1.0, 0.5,0.5 and 1.0]
Do Develop This RTL Planning Technology • RTL partitioning • understand interaction b/w block definition and placement quality • recognize and cure a physically challenged logic hierarchy • Global interconnect planning and optimization • symbolic route representations to support block plan ECOs • Controllable SP&R back end (including power/clock/scan) • Incremental / ECO optimizations, and optimizations that are “robust” under partial or imperfect design knowledge • Better estimators (“initial WLMs”) • to account for resource, topological heterogeneity • to account for optimizations (placement, ripup/reroute, timing) • “earliest RTL signoff with detailed P&R knowledge”
Conclusion • RTL-to-GDSII will commoditize SP&R market sectors • Many solutions are reasonable and will survive in the marketplace → RTL-down SP&R becomes a “commodity” • No solution is complete • Key missing pieces include RTL partitioning; hierarchy and block management; real working RTL diagnosis and signoff • Individual point technologies (e.g., global placement or detailed routing) become less valuable → integration is most important
Outline • Future DSM physical implementation technologies • design closure • design-manufacturing interface • Valuations • the significance of design productivity and design quality • structural aspects of the EDA industry • Values • toward maturity and a design productivity renaissance • Conclusions: Who Will Pay ?
Subwavelength Optical Lithography • Subwavelength gap since 0.35 µm (source: Numerical Technologies, Inc.)
Optical Proximity Correction (OPC) • [Figure: original layout and OPC corrections, with results shown with OPC and with no OPC] • Corrective modifications to improve process control • improve yield (process window) • improve device performance
Future OPC-Related Technologies • WYSIWYG broken → (mask) verification bottleneck • Function-aware OPC insertion • OPC insertion is for predictable circuit performance, function • tool understands functional intent, makes only the corrections that win $$$, reduce performance variation • applies to mask inspection as well • OPC- and manufacturing-aware layout • don’t make corrections that can’t be manufactured or verified • model effects of geometry on OPC cost needed to yield function • understand (data volume, verification) costs of breaking hierarchy • Difficult solutions to flow issues • e.g., how to avoid making same corrections 3x (library, router, PV)
Phase Shifting Masks (PSM) • [Figure: conventional mask vs. phase shifting mask (glass, chrome, phase shifter), comparing E at mask, E at wafer, and I at wafer]
Double-Exposure Bright-Field Alternating PSM • Positive photoresists for poly, metal • unexposed areas = printed features • [Figure: two exposures combined into the printed result; phase regions labeled 0 and 180]
Why is Alternating PSM Valuable and Essential? • PSM enables smaller transistor gate lengths Leff • “critical” polysilicon features only (gate Leff) • faster device switching → faster circuits • better critical dimension (CD) control → better parametric yield, $/wafer • Full-chip PSM (poly, local interconnect) → denser layouts • smaller die area → more $/wafer • achieving Roadmap for device density depends on PSM • Data points • 25 nm gates manufactured with 248 nm DUV steppers (NTI + MIT Lincoln Labs, June 2000) • 90 nm gates in production at Motorola, Lucent since 1999 • Alternative: $5 B fab with equipment that doesn’t exist yet
The Phase Assignment Problem • Assign 0, 180 phase regions such that critical features with width < B are induced by adjacent phase regions with opposite phases • [Figure: a critical feature flanked by phase regions 0 and 180]
Key: Global 2-Colorability • Odd cycle of “phase implications” → layout cannot be manufactured • layout verification becomes a global, not local, issue • [Figure: cycle of phase regions labeled 0 and 180, with one region marked “?”]
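The 2-colorability condition can be checked with a standard breadth-first traversal. The sketch below is a generic bipartiteness test, not the tool described in this talk; it assumes the layout has already been reduced to a graph whose nodes are phase regions and whose edges mean "these two regions must receive opposite phases". The function name `two_color` and the toy graphs are illustrative only.

```python
from collections import deque


def two_color(num_nodes, edges):
    """Return a list of 0/1 colors if the graph is 2-colorable, else None.

    Each edge (u, v) means "u and v must receive opposite phases".
    Color 0 stands for phase 0 and color 1 for phase 180.
    """
    adj = [[] for _ in range(num_nodes)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    color = [None] * num_nodes
    for start in range(num_nodes):
        if color[start] is not None:
            continue  # already colored as part of an earlier component
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if color[v] is None:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return None  # odd cycle: no valid phase assignment
    return color


print(two_color(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))  # even cycle: colorable
print(two_color(3, [(0, 1), (1, 2), (2, 0)]))          # odd cycle: None
```

The check has to visit whole connected components, which is why a single odd cycle anywhere makes verification a global rather than a local question.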
Critical features: F1, F2, F3, F4 • [Figure: layout with critical features F1-F4]
Opposite-Phase Shifters (0, 180) • [Figure: features F1-F4, each flanked by a pair of opposite-phase shifters]
Shifters: S1-S8 • [Figure: shifter pairs S1/S2 around F1, S3/S4 around F2, S5/S6 around F3, S8/S7 around F4] • PROPER Phase Assignment: • Opposite phases for opposite shifters • Same phase for overlapping shifters
Phase Conflict • [Figure: features F1-F4 with shifters S1-S8; overlapping shifters form a phase conflict] • Proper Phase Assignment is IMPOSSIBLE
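The two rules for a proper phase assignment (opposite phases across a critical feature, same phase for overlapping shifters) can be checked incrementally with a union-find structure that tracks parity. This is a minimal sketch, not the method used by the authors. The pairing of shifters around F1-F4 is read off the label order on the slides; the list of overlapping shifters is hypothetical, because the figure geometry did not survive extraction.

```python
class ParityDSU:
    """Union-find that tracks each node's phase parity relative to its root."""

    def __init__(self, names):
        self.parent = {n: n for n in names}
        self.parity = {n: 0 for n in names}  # 0 = same phase as parent

    def find(self, x):
        """Return (root, parity of x relative to root), compressing the path."""
        if self.parent[x] == x:
            return x, 0
        root, p = self.find(self.parent[x])
        self.parity[x] ^= p
        self.parent[x] = root
        return root, self.parity[x]

    def relate(self, a, b, differ):
        """Require opposite phases if differ, else equal. False on conflict."""
        ra, pa = self.find(a)
        rb, pb = self.find(b)
        if ra == rb:
            return (pa ^ pb) == int(differ)
        self.parent[ra] = rb
        self.parity[ra] = pa ^ pb ^ int(differ)
        return True


shifters = [f"S{i}" for i in range(1, 9)]
# Shifter pairs flanking critical features F1..F4 (inferred from slide labels).
opposite = [("S1", "S2"), ("S3", "S4"), ("S5", "S6"), ("S7", "S8")]
# Hypothetical overlaps, chosen so that an odd cycle appears.
overlap = [("S2", "S3"), ("S4", "S7"), ("S8", "S1")]

dsu = ParityDSU(shifters)
conflicts = [(a, b) for a, b in opposite if not dsu.relate(a, b, True)]
conflicts += [(a, b) for a, b in overlap if not dsu.relate(a, b, False)]
print(conflicts)
```

With these assumed overlaps the cycle S1, S2, S3, S4, S7, S8 contains three "opposite" constraints, so the last overlap cannot be satisfied and is reported as the conflict.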
Phase Conflict Resolution • [Figure: features F1-F4 with shifters S1-S8 and the phase conflict marked] • feature shifting to remove overlap
Phase Conflict Resolution • [Figure: features F1-F4 with shifters S1-S4, S7, S8 remaining and the phase conflict marked] • feature widening to turn conflict into non-conflict
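Both resolution moves delete one constraint from the phase-assignment problem: shifting features apart removes a "same phase" overlap constraint, and widening a feature until it is no longer critical removes an "opposite phase" constraint. The sketch below finds a smallest set of constraints to delete by exhaustive search. It is exponential, practical only for toy inputs, and is not the conflict-resolution algorithm from this talk. The constraint list reuses a hypothetical set of overlaps, since the real geometry is not recoverable from the slides.

```python
from itertools import combinations


def consistent(constraints):
    """True if phases in {0, 1} can satisfy every (a, b, differ) constraint."""
    adj = {}
    for a, b, differ in constraints:
        adj.setdefault(a, []).append((b, differ))
        adj.setdefault(b, []).append((a, differ))
    phase = {}
    for start in adj:
        if start in phase:
            continue
        phase[start] = 0
        stack = [start]
        while stack:
            u = stack.pop()
            for v, differ in adj[u]:
                want = phase[u] ^ int(differ)
                if v not in phase:
                    phase[v] = want
                    stack.append(v)
                elif phase[v] != want:
                    return False  # contradictory cycle found
    return True


def min_removal(constraints):
    """Smallest list of constraints whose removal makes the rest consistent."""
    n = len(constraints)
    for k in range(n + 1):
        for drop in combinations(range(n), k):
            kept = [c for i, c in enumerate(constraints) if i not in drop]
            if consistent(kept):
                return [constraints[i] for i in drop]


# True = opposite phases across a critical feature; False = overlapping shifters.
constraints = [
    ("S1", "S2", True), ("S3", "S4", True), ("S5", "S6", True), ("S7", "S8", True),
    ("S2", "S3", False), ("S4", "S7", False), ("S8", "S1", False),  # hypothetical
]
print(consistent(constraints))
print(min_removal(constraints))
```

A practical tool would also weight each deletion by its layout cost (area added by shifting or widening); this sketch treats all deletions as equally expensive.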
Future PSM-Related Technologies • UCLA-Cadence: first comprehensive methodology for AltPSM layout design • 3-way shared responsibility for phase-assignability • good layout practices (local geometry) • no T shapes, no doglegs, even-length transistor fingers, ... • but no complete set of “rules” exists • automatic phase conflict resolution (global 2-colorability) • latest technology: optimal conflict resolution for 50K polygons in 6 sec • reuse of layout (free composability) • problem: guarantee reusability of phase-assigned layouts, such that no odd cycles can occur when the layouts are composed together in a larger layout • Changes all flows: library design, custom design, SP&R
Macroscopic Process Effects • Dummy fill controls several types of process distortions: CMP, SOG, RIE, CVD (source: R. Pack, Cadence)
Field-Dependent Aberration • [Figure: lens above wafer plane; edge of field: high aberrations, center: minimal aberrations] • Field-dependent aberrations cause placement errors and distortions (source: R. Pack, Cadence)
Conclusions • RTL-to-GDSII commoditizes existing SP&R market sectors • Design-manufacturing interface will change EDA • Closely related to foundry capital expenditure • Unites EDA with much of mask industry, even process development • Expands scope of physical “verifications”, moves awareness upstream into “syntheses” (logic, layout) • Very comprehensive changes to data model, infrastructure, flows • Unified, front-to-back solutions will win
Outline • Future DSM physical implementation technologies • design closure • design-manufacturing interface • Valuations • the significance of design productivity and design quality • structural aspects of the EDA industry • Values • toward maturity and a design productivity renaissance • Conclusions: Who Will Pay ?
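The Productivity Gap
[Figure: potential design complexity (Logic Tr./Chip) and designer productivity (Tr./S.M., transistors per staff-month) over time; the widening difference is labeled “Equivalent Added Complexity”. Price tags $10, $3, $1: “How many gates can I get for $N?”]
• Complexity growth rate: 68%/yr compounded
• Productivity growth rate: 21%/yr compounded
Staffing for a 3-year design (source: SEMATECH; the year entries of the first three rows were lost in extraction):
Year | Technology | Chip Complexity | Frequency | Staff | Staff Cost*
(missing) | 250 nm | 13 M Tr. | 400 MHz | 210 | $90 M
(missing) | 250 nm | 20 M Tr. | 500 MHz | 270 | $120 M
(missing) | 180 nm | 32 M Tr. | 600 MHz | 360 | $160 M
2002 | 130 nm | 130 M Tr. | 800 MHz | 800 | $360 M
* @ $150k per staff-year (in 1997 dollars)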
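The staff-cost column on this slide can be reproduced from the footnote rate. This is a small arithmetic sketch, assuming the listed headcount is held for the full three-year design (an assumption; the slide gives only the totals). The function name `staff_cost_musd` is hypothetical.

```python
STAFF_YEAR_COST = 150_000  # dollars per staff-year (slide footnote, 1997 dollars)
DESIGN_YEARS = 3           # "3 Yr. Design" on the slide

# (technology, staff, staff cost quoted on the slide in $M)
ROWS = [
    ("250 nm", 210, 90),
    ("250 nm", 270, 120),
    ("180 nm", 360, 160),
    ("130 nm", 800, 360),
]


def staff_cost_musd(staff: int) -> float:
    """Total staff cost in millions of dollars for one design."""
    return staff * DESIGN_YEARS * STAFF_YEAR_COST / 1e6


for tech, staff, quoted in ROWS:
    print(f"{tech}: computed ${staff_cost_musd(staff):.1f}M, slide says ${quoted}M")
```

The computed values land within about 5% of the quoted ones, which suggests the slide rounds its totals.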
Mask Cost • O(25 mask levels) ~ “$1M mask set” in 130nm • But: average only 500 wafers per mask set!
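The consequence of those two numbers is easiest to see as an amortized cost. This is a minimal sketch using only the figures on the slide; the dies-per-wafer value in the second call is a hypothetical input for illustration, not a number from the talk.

```python
MASK_SET_COST = 1_000_000   # dollars, "$1M mask set" at 130nm (from the slide)
WAFERS_PER_MASK_SET = 500   # average quoted on the slide


def mask_cost_per_wafer(mask_set_cost: float, wafers: int) -> float:
    """Mask-set cost amortized over every wafer printed with that set."""
    return mask_set_cost / wafers


def mask_cost_per_die(mask_set_cost: float, wafers: int, dies_per_wafer: int) -> float:
    """Amortized mask cost carried by each die (yield loss ignored)."""
    return mask_cost_per_wafer(mask_set_cost, wafers) / dies_per_wafer


print(mask_cost_per_wafer(MASK_SET_COST, WAFERS_PER_MASK_SET))     # 2000.0
print(mask_cost_per_die(MASK_SET_COST, WAFERS_PER_MASK_SET, 200))  # 200 dies/wafer is hypothetical
```

At the quoted average, every wafer carries $2,000 of mask cost before any processing cost is counted.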
“Keep the Fabs Full” • Design technology must keep manufacturing facilities fully utilized with: • high-volume parts • high-margin parts • Foundry capital cost > $2B • How much value of new designs is needed to fill the fab ???
Design Productivity Need + DSM = 2 EDA Trends • [Figure: level of abstraction (Application / Behavior, SW/HW, RTL, Gate-level “platform”, Mask) vs. effort/value; design entry level today and tomorrow; implementation gap] (source: MARCO GSRC)
Close the Implementation Gap • [Figure: level of abstraction (Application, SW/HW, RTL, Mask) vs. effort/value; design entry level, hand-off “platform”, fab amortization] (source: MARCO GSRC)
Design Productivity Gap → Low-Value Designs? • [Figure: percent of die area that must be occupied by memory to maintain SOC design productivity] (source: Japanese system-LSI industry)
Reduce Back-End Effort? • Example: repeating dense wiring fabric pattern at minimum pitch • eliminates signal integrity, delay uncertainty concerns • but has at least 60% - 80% density cost • [Figure: wiring track patterns with tracks labeled V, S and G] (source: MARCO GSRC)
Improve IP Reuse Productivity? • [Figure: Pearls P1-P7 (the IP processes) wrapped in MicroShells (the IP requirements) and MacroShells (the protocol interface), connected by communication channels] (source: MARCO GSRC)
QUALITY Problem: > 1000x Energy-Flexibility Gap • [Figure: energy efficiency in MOPS/mW (or MIPS/mW), log scale from 0.1 to 1000, vs. flexibility (coverage)] • Dedicated HW: 100-200 MOPS/mW • Reconfigurable processor/logic: 10-50 MOPS/mW • ASIPs, DSPs: 1 V DSP, 3 MOPS/mW • Embedded µProcessors: LP ARM, 0.5-2 MIPS/mW • Source: Prof. Jan Rabaey, UC Berkeley
“Keep the Fabs Full” • Design technology must keep manufacturing facilities fully utilized with: • high-volume parts • high-margin parts • What happens when design technology “fails” ? • not enough high-value designs • the semiconductor industry will find a “workaround” • reconfigurable logic • platform-based design
Platform-Based Design • [Figure: layers from System / Application down to Silicon Process. Labels: Application Compilation (once per application), ranging from Simple & Direct to Sophisticated Compiler; Architecture (Structured Custom, RTL Flow, FPGA, FPGA & GPP, Config. Processor, DSP, GPP); Microarchitecture; Platform Compilation (once per family)] (source: MARCO GSRC)
Conclusions • RTL-to-GDSII commoditizes existing SP&R market sectors • Design-manufacturing interface will change EDA • Design productivity gap threatens design quality → ASIC business model is at risk • TAT achieved at cost of QOR • low QOR → low silicon value • electronics industry chooses reprogrammable, platform-based “workarounds”