Timing Yield-Aware Color Reassignment and Detailed Placement Perturbation for Double Patterning Lithography. Mohit Gupta, Kwangok Jeong and Andrew B. Kahng. UCSD VLSI CAD Laboratory, ECE Department, University of California, San Diego. kjeong@vlsicad.ucsd.edu
Outline • Bimodal CD Distribution in DPL • Impact on design timing • Mitigating Impact of Bimodal CD Distribution • Bimodal-Aware Timing Library • Optimization 1: Color Reassignment (Max Alternation) • Optimization 2: Placement Perturbation (DPL-Correctness) • Experimental Framework and Results • Impact of Color Reassignment • Impact of Placement Perturbation • Conclusion
Bimodal CD distribution in DPL • Two patterning steps → Two different CDs • Two different colorings → Two different timings • C12: ODD polys in BLUE, EVEN polys in GREEN • C21: ODD polys in GREEN, EVEN polys in BLUE • [Figure: lines from 1st patterning and lines from 2nd patterning; C12-type cell and C21-type cell; gates from CD group1 and gates from CD group2] (Jeong et al., ASPDAC’09)
Impact of Bimodality on Guardband • Comparison of design guardband (Min-Max delay) • FACT 1: Unimodal representation is too pessimistic! • [Figure labels: Large CD group, Small CD group, CD mean difference] (Jeong et al., ASPDAC’09)
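Impact of Bimodality on Path Delay • By definition, $\sigma^2(x+y) = \sigma^2(x) + \sigma^2(y) + 2\,\mathrm{cov}(x,y)$ • Delay variation of a timing path: the same identity extended to the sum of all stage delays, $\sigma^2\left(\sum_i d_i\right) = \sum_i \sigma^2(d_i) + 2\sum_{i<j}\mathrm{cov}(d_i, d_j)$ • Since $\mathrm{cov}(d(g_i), d(q_j)) < \mathrm{cov}(d(g_i), d(g_j))$ or $\mathrm{cov}(d(q_i), d(q_j))$, where g and q denote gates from the two different CD groups, the delay variation under a bimodal distribution is smaller than under a unimodal distribution • Validated by simulation results • FACT 2: Alternate (mixed) coloring has smaller delay variation! • [Figure: Sigma / Mean (%)] (Jeong et al., ASPDAC’09)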
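The covariance argument can be checked numerically. The following is a minimal sketch, assuming every stage has the same delay sigma, that stages in the same CD group are fully correlated, and that stages in different groups are uncorrelated. These simplifications and the function name are ours, not the slide's.

```python
from itertools import combinations

def path_delay_variance(groups, sigma=1.0, rho_same=1.0, rho_cross=0.0):
    """Variance of the sum of stage delays along one path.

    groups[k] is the CD group (1 or 2) of stage k. Applies
    var(sum d_k) = sum var(d_k) + 2 * sum_{j<k} cov(d_j, d_k),
    with an assumed correlation per pair of stages.
    """
    var = len(groups) * sigma ** 2
    for j, k in combinations(range(len(groups)), 2):
        rho = rho_same if groups[j] == groups[k] else rho_cross
        var += 2 * rho * sigma ** 2
    return var

uniform = path_delay_variance([1, 1, 1, 1])    # every stage from one CD group
alternate = path_delay_variance([1, 2, 1, 2])  # alternate (mixed) coloring
print(uniform, alternate)
```

Under these assumptions the mixed path has half the variance of the uniform one, which is the direction FACT 2 states.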
Impact of Bimodality on Clock Skew • Different coloring sequences in a clock network → Clock skew • FACT 3: Same color on all clock buffers is better! • [Figure: launch and capture clock paths, Case1 and Case2; axis: Clock skew (s)] (Jeong et al., ASPDAC’09)
Bimodal CD Distribution: 3 Key Facts 1. Design requires bimodal-aware timing models • Unimodal representation is too pessimistic 2. Data paths benefit from alternate (mixed) coloring • Exploit existence of two uncorrelated CD populations • Minimize correlated variations in a given path 3. Clock paths benefit from uniform coloring • Correlated variation between launch and capture paths minimizes bimodality-induced clock skew
DPL Layout-to-Mask Flow • RTL-to-GDS → DPL Mask Coloring → Bimodal-Aware Timing Analysis • Optimization 1: ILP to Maximize Alternate Coloring (Datapaths) • Optimization 2: Placement Perturbation for Color Conflict Removal (Clock and Datapaths)
Outline • Bimodal CD Distribution in DPL • Impact on design timing • Mitigating Impact of Bimodal CD Variation • Bimodal-Aware Timing Library • Optimization 1: Color Reassignment (Max Alternation) • Optimization 2: Placement Perturbation (DPL-Correctness) • Experimental Framework and Results • Impact of Color Reassignment • Impact of Placement Perturbation • Conclusion
Bimodality-Aware Timing Model and Analysis • Timing model • Two timing libraries: • G1L-G2S: group1 has larger CD than group2 • G1S-G2L: group1 has smaller CD than group2 • Two coloring versions of a cell in each library • C12: leftmost poly is in group1 • C21: leftmost poly is in group2 • CD mean difference • Chosen from process information • E.g., 2nm, 4nm and 6nm • Timing analysis • For each CD mean difference, check timing slack using each of the timing libraries G1L-G2S and G1S-G2L • The worse timing between the G1L-G2S and G1S-G2L libraries is regarded as the actual worst-case timing • [Figure: poly lines labeled G1 G2 (C12) and G2 G1 (C21)]
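The analysis rule on this slide reduces to taking the worse slack of the two library runs for each CD mean difference. A minimal sketch of that bookkeeping follows; the slack numbers are hypothetical placeholders, not results from the deck, and the function name is ours.

```python
def worst_case_slack(slack_by_library):
    """Return the worse (smaller) slack across the two bimodal libraries."""
    return min(slack_by_library["G1L-G2S"], slack_by_library["G1S-G2L"])

# Hypothetical slacks in ns, keyed by CD mean difference in nm.
slacks = {
    2: {"G1L-G2S": -0.10, "G1S-G2L": -0.05},
    4: {"G1L-G2S": -0.12, "G1S-G2L": -0.20},
    6: {"G1L-G2S": -0.30, "G1S-G2L": -0.25},
}

worst = {diff: worst_case_slack(s) for diff, s in slacks.items()}
for diff, slack in worst.items():
    print(f"{diff}nm mean difference: worst-case slack {slack} ns")
```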
Optimization 1: Maximum Alternate Coloring • Maximize alternate (mixed) coloring → Minimize delay variation • How to quantify alternation of a coloring sequence? → New metric: Coloring Sequence Cost (CSC) • Represents delay variation due to the coloring
Delay and Coloring • Rise delay depends on PMOS transistor: ~10% variation • Fall delay depends on both NMOS transistors: ~1% variation • [Figure: NAND2 schematics (inputs A1, A2; output ZN; PMOS MP1, MP2; NMOS MN1, MN2; VDD, VSS) under G1L-G2S and G1S-G2L]
Coloring Sequence Cost (CSC) for NAND2 • Two observations • Activated transistors determine the delay • The impact on delay is averaged when more than one transistor is activated • Assign CSC for a single transistor • Group1: −1 (CSC_MP1 = CSC_MN1 = −1) • Group2: +1 (CSC_MP2 = CSC_MN2 = +1) • CSC for NAND2 gate • A1→ZN rise (by MP1): −1 • A2→ZN rise (by MP2): +1 • A2→ZN fall (by MN1 and MN2): (1 + (−1)) / 2 = 0 • A1→ZN fall (by MN1 and MN2): (−1 + 1) / 2 = 0 • [Figure: NAND2 schematic and layout with A1, A2, ZN, MP1, MP2, MN1, MN2, VDD, VSS]
CSC Calculation for Cells - Examples • AND2 gate • A1→Z fall: {−1} + (−1) = −2 • A1→Z rise: {(−1 + 1) / 2} + (−1) = −1 • A2→Z fall: {1} + (−1) = 0 • A2→Z rise: {(−1 + 1) / 2} + (−1) = −1 • BUFFER gate • A→Z fall: −1 + 1 = 0 • A→Z rise: −1 + 1 = 0 • [Figure: AND2 schematic and layout (A1, A2, Z; MP1 to MP3, MN1 to MN3) and BUFFER schematic and layout (A, Z; MP1, MP2, MN1, MN2)]
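The CSC values listed for NAND2, AND2 and BUFFER follow from two rules: average the ±1 values of the transistors activated in a stage, then add the stages of the arc. The sketch below reproduces the slide's numbers under our reading of its arithmetic, namely that AND2 is a NAND2 stage followed by an output stage in group1 and that BUFFER is a group1 stage followed by a group2 stage. The function and variable names are hypothetical.

```python
GROUP_CSC = {1: -1, 2: +1}  # CSC of a single transistor, by CD group

def stage_csc(active_groups):
    """Average CSC over the transistors activated in one stage."""
    return sum(GROUP_CSC[g] for g in active_groups) / len(active_groups)

def arc_csc(stages):
    """CSC of a timing arc: sum of the per-stage averages."""
    return sum(stage_csc(s) for s in stages)

# Each arc is a list of stages; each stage lists the CD groups of the
# transistors it activates (MP1/MN1 in group1, MP2/MN2 in group2).
nand2 = {
    "A1->ZN rise": [[1]],      # MP1
    "A2->ZN rise": [[2]],      # MP2
    "A1->ZN fall": [[1, 2]],   # MN1 and MN2
    "A2->ZN fall": [[1, 2]],   # MN1 and MN2
}
and2 = {  # assumed: NAND2 stage, then an output stage in group1
    "A1->Z fall": [[1], [1]],
    "A1->Z rise": [[1, 2], [1]],
    "A2->Z fall": [[2], [1]],
    "A2->Z rise": [[1, 2], [1]],
}
buffer_cell = {  # assumed: group1 stage, then group2 stage
    "A->Z fall": [[1], [2]],
    "A->Z rise": [[1], [2]],
}

for name, cell in (("NAND2", nand2), ("AND2", and2), ("BUFFER", buffer_cell)):
    for arc, stages in cell.items():
        print(name, arc, arc_csc(stages))
```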
Coloring Sequence Cost for Path (CSCP) • CSCP = Sum of CSC values of stages in path, weighted by stage delay ($D_l$) • $CSCP_i = \sum_{l \in i} D_l \cdot CSC_l$, where l is a timing arc in path i • Correlation between CSCP and delay variation • 1,300 different colorings of a timing path • CSCP metric is strongly correlated with delay variation of timing paths • Correlation coefficient: 0.902 • CSCP reduction → Delay variation reduction
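As a rough illustration of the metric, the sketch below computes CSCP as the plain delay-weighted sum described on the slide. The stage delays are hypothetical values in ps and the function name is ours.

```python
def cscp(arcs):
    """Delay-weighted sum of CSC values along one timing path.

    arcs is a list of (stage_delay, csc) pairs, one per timing arc.
    """
    return sum(delay * csc for delay, csc in arcs)

# Hypothetical four-stage path with stage delays in ps.
uniform_path = [(20, -1), (20, -1), (30, -1), (30, -1)]    # all group1
alternate_path = [(20, -1), (20, +1), (30, -1), (30, +1)]  # mixed coloring

print(cscp(uniform_path))    # large magnitude: one CD group dominates
print(cscp(alternate_path))  # contributions of the two groups cancel
```

A CSCP close to zero means the two CD groups contribute about equally to the path delay, which is the situation the slide associates with smaller delay variation.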
Maximization of Alternate Coloring • Optimal timing path coloring problem: • Given a set of timing-critical paths: P • Color each cell in the union of timing paths to minimize the maximum CSCP • ILP to minimize maximum CSCP • Objective: • Subject to:
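The min-max structure of the problem can be illustrated on a toy instance. The sketch below is not the ILP from the slide: it swaps in exhaustive search, assumes each cell contributes a single CSC of −1 or +1 depending on its coloring, and minimizes the largest CSCP magnitude over the paths. All names, delays and the use of the absolute value are our assumptions.

```python
from itertools import product

def max_abs_cscp(coloring, paths):
    """Largest |CSCP| over all paths for a given cell coloring."""
    return max(
        abs(sum(delay * coloring[cell] for cell, delay in path))
        for path in paths
    )

def best_coloring(cells, paths):
    """Exhaustively try every -1/+1 assignment (toy sizes only)."""
    best_cost, best_assignment = None, None
    for signs in product((-1, +1), repeat=len(cells)):
        coloring = dict(zip(cells, signs))
        cost = max_abs_cscp(coloring, paths)
        if best_cost is None or cost < best_cost:
            best_cost, best_assignment = cost, coloring
    return best_cost, best_assignment

# Two hypothetical critical paths sharing cell "b"; delays in ps.
cells = ["a", "b", "c", "d"]
paths = [
    [("a", 10), ("b", 10)],
    [("b", 10), ("c", 10), ("d", 20)],
]
cost, coloring = best_coloring(cells, paths)
print(cost, coloring)
```

Exhaustive search grows as 2^n in the number of cells, which is why a real flow would need an ILP or similar solver.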
Impact of Alternate Coloring Optimization • Alternate coloring improves timing slack and reduces timing variation (JPEG, 70% utilization case) • TNS improves by 11%~27% • [Figure: TNS (ns), initial coloring vs. alternate coloring]
Optimization 2: Placement Perturbation • DPL feasibility: distance between same-color polys must be larger than minimum resolution • Coloring assignment from Optimization 1 can introduce additional coloring conflicts into an existing layout • Placement perturbation for DPL-Correctness • dpb: distance from poly to cell boundary • Resmin: minimum resolution • [Figure: (a) Original placement, (b) Alternate coloring, (c) Conflict removal; annotations: 2dpb > Resmin, > Resmin, coloring conflict, logical connection]
DP Using Cost of Coloring Conflicts • HCost: Horizontal placement cost under constraints • Cost of placing a cell “a” to a placement site “b” • Considers the spacing between poly lines in different cells: $\mathrm{spacing} = x_a + b + LPS_a - (x_{a-1} + i + w_{a-1} - RPS_{a-1})$ (b: displacement of cell a to site b) • HCost is defined as: • If ((spacing < Rmin) && (LPC_a == RPC_{a−1})): HCost(a, b, a − 1, i) = • Otherwise: HCost(a, b, a − 1, i) = 0 • [Figure: rightmost poly of cell a−1 and leftmost poly of cell a, labeled LPS_a, RPS_{a−1}, RPC_{a−1} = 0, RPC_a = 1, LPC_a = 0, LPC_{a−1} = 1, w_{a−1}, w_a, x_{a−1}, x_a]
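The spacing formula and the conflict test can be written out directly. In this sketch the cost charged for a conflict is not taken from the slide, where the value did not survive; we assume an infinite penalty so that conflicting placements are never chosen. All numeric values are hypothetical.

```python
import math

def poly_spacing(x_a, b, lps_a, x_prev, i, w_prev, rps_prev):
    """Spacing between the leftmost poly of cell a (displaced by b)
    and the rightmost poly of cell a-1 (displaced by i)."""
    return x_a + b + lps_a - (x_prev + i + w_prev - rps_prev)

def hcost(spacing, r_min, lpc_a, rpc_prev, conflict_cost=math.inf):
    """Horizontal cost: penalize same-color polys closer than r_min.

    conflict_cost is an assumed value, not the slide's definition.
    """
    if spacing < r_min and lpc_a == rpc_prev:
        return conflict_cost
    return 0

# Cell a-1 at x=0 with width 10; cell a abuts it at x=10.
s0 = poly_spacing(x_a=10, b=0, lps_a=1, x_prev=0, i=0, w_prev=10, rps_prev=1)
s1 = poly_spacing(x_a=10, b=1, lps_a=1, x_prev=0, i=0, w_prev=10, rps_prev=1)

print(s0, hcost(s0, r_min=3, lpc_a=0, rpc_prev=0))  # too close, same color
print(s1, hcost(s1, r_min=3, lpc_a=0, rpc_prev=0))  # shifted by one site
print(s0, hcost(s0, r_min=3, lpc_a=0, rpc_prev=1))  # too close, other color
```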
Two Dynamic Programming Approaches • DP Algorithm 1: SHIFT • Minimize total displacement cost, considering HCost • DP Algorithm 2: SHIFT+RECOLOR • Necessary when high utilization blocks Algorithm 1 • Performs simultaneous recoloring of non-timing critical cells • Cost is defined for each color of cell instances, e.g., C12 and C21 • Other DP variants: MAX, FLIP *Timing criticality weight for displacement
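A SHIFT-style dynamic program can be sketched for a single row. This is a simplified illustration rather than the authors' algorithm: it assumes integer site displacements within a small window, a per-cell weight standing in for timing criticality, no row boundaries, and it treats a coloring conflict as an infinite cost. The cell dictionary keys and the function name are hypothetical.

```python
import math

def shift_dp(cells, r_min, max_shift=2):
    """Choose a displacement per cell (left to right) that removes
    same-color poly conflicts at minimum weighted displacement.

    Each cell is a dict with keys x, w, lps, rps, lpc, rpc, weight.
    Returns (total_cost, displacements), or (inf, None) if infeasible.
    """
    shifts = range(-max_shift, max_shift + 1)
    # cost[b]: best cost so far with the current cell displaced by b
    cost = {b: abs(b) * cells[0]["weight"] for b in shifts}
    back = []
    for prev, cur in zip(cells, cells[1:]):
        new_cost, choice = {}, {}
        for b in shifts:
            best, arg = math.inf, None
            for i in shifts:
                left = cur["x"] + b
                prev_right = prev["x"] + i + prev["w"]
                if left < prev_right:
                    continue  # cells would overlap
                spacing = left + cur["lps"] - (prev_right - prev["rps"])
                if spacing < r_min and cur["lpc"] == prev["rpc"]:
                    continue  # coloring conflict (infinite cost)
                c = cost[i] + abs(b) * cur["weight"]
                if c < best:
                    best, arg = c, i
            new_cost[b], choice[b] = best, arg
        cost = new_cost
        back.append(choice)
    b = min(cost, key=cost.get)
    total = cost[b]
    if math.isinf(total):
        return math.inf, None
    result = [b]
    for choice in reversed(back):  # trace the chosen shifts backwards
        b = choice[b]
        result.append(b)
    return total, result[::-1]

row = [
    dict(x=0, w=10, lps=1, rps=1, lpc=0, rpc=0, weight=2),   # timing-critical
    dict(x=10, w=10, lps=1, rps=1, lpc=0, rpc=1, weight=1),
]
total, moves = shift_dp(row, r_min=3)
print(total, moves)
```

In this toy row the cheaper fix is to move the less critical cell one site to the right, leaving the timing-critical cell in place.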
Outline • Bimodal CD Distribution in DPL • Impact on design timing • Mitigating Impact of Bimodal CD Variation • Bimodal-Aware Timing Library • Optimization 1: Color Reassignment (Max Alternation) • Optimization 2: Placement Perturbation (DPL-Correctness) • Experimental Framework and Results • Impact of Color Reassignment • Impact of Placement Perturbation • Conclusion
Experiment Framework • Flow: placed and routed design (SOC Encounter), orig.def → Initial Coloring, initial_colored.def → Timing Analysis (PrimeTime-SI), slack.list → Optimization 1: ILP instance, Optimal Coloring (Alternate Coloring maximization), keep_color.list, opt_colored.def → Optimization 2: Conflicts Removal (SHIFT, SHIFT+RECOLOR), opt.def
Optimization 1: Max Alternate Coloring • Testcases with 45nm Nangate Open Cell Library • [Figure: two panels comparing initial (Init.) and optimized (Opt.) coloring at 2nm, 4nm and 6nm CD mean difference; annotations: 59% reduction, 85% reduction]
Optimization 2: Placement Perturbation • #CC (number of coloring conflicts), SDT (sum of displacement of timing-critical cells), SDNT (sum of displacement of non-timing-critical cells), #RC (number of recolored cells) • All SHIFT runtimes for JPEG are 204-354 seconds • All SHIFT+RECOLOR runtimes are 578-678 seconds
Overall Timing Improvement • Bimodal timing model → Reduce pessimism • Alternate coloring → Improve timing • Placement perturbation → Remove conflicts • The impact of bimodality can be effectively mitigated!
Conclusion • Contributions • New CSC metric to represent the timing variation in double patterning • ILP-based color reassignment to improve timing slack and variation • DP-based placement perturbation to remove coloring conflicts after color reassignment • Results (45nm Nangate Open Library) • Up to 232ps WNS reduction and 36.22ns TNS reduction • WNS variation reduction from 380ps to 84ps • TNS variation reduction from 64ns to 22ns • Ongoing work • More accurate metrics for timing path color balancing to enhance timing quality • Golden DPL timing and placement optimizer based on simultaneous timing-aware coloring and conflict removal
Property(2): Clock Skew and Timing Slack • Timing slack calculation • Timing slack: • Timing slack variation: • Clock skew • In particular, clock skew from uncorrelated launching and capturing clock paths is the major source of timing slack variation. • Example: large correlation is better for timing slack • (a) Worst slack in DPL: Data (10 ± 2 = 8~12ns), Clock (10 ± 2 = 8~12ns); worst slack = min(clock) − max(data) = 8 − 12 = −4ns. Small delay variation but large negative slack • (b) Worst slack in single exposure: BC: Data (10 − 5 = 5ns), Clock (10 − 5 = 5ns), worst slack = 5 − 5 = 0ns; WC: Data (10 + 5 = 15ns), Clock (10 + 5 = 15ns), worst slack = 15 − 15 = 0ns. Large delay variation but zero slack
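The example on this slide is plain interval arithmetic and can be replayed in a few lines. The sketch below assumes, as the slide's numbers imply, that the DPL case lets clock and data vary independently while the single-exposure case moves both together corner by corner. The function name is ours.

```python
def worst_slack(clock_range, data_range):
    """Worst slack = earliest clock arrival minus latest data arrival."""
    return min(clock_range) - max(data_range)

# (a) DPL: clock and data each 10 +/- 2 ns, varying independently.
dpl_slack = worst_slack((8, 12), (8, 12))

# (b) Single exposure: clock and data both 10 +/- 5 ns but fully
# correlated, so each corner (BC, WC) is evaluated on its own.
corners = {"BC": (5, 5), "WC": (15, 15)}
single_slack = min(
    worst_slack((clock, clock), (data, data))
    for clock, data in corners.values()
)

print(dpl_slack, single_slack)
```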
Simulation Setup: Skew and Slack • Testcase • AES from Opencores, Nangate 45nm library, PTM 45nm • Extracted critical path • Data path: 30 stages • Clock launch: 14 stages • Clock capture: 14 stages • Exhaustive tests ($4 \times 2^{54}$) are not feasible, so we fix the data path coloring.
Experiments on Clock Skew and Timing Slack • Clock skew • Even for the zero mean difference case, clock skew exists, and it increases with mean difference • Pooled unimodal cannot distinguish this clock skew • Timing slack • Originally zero slack turns out to be significant negative slack • Pooled unimodal shows very pessimistic slack • [Figure annotations: 53ps, 22ps]