Enhanced Metamodeling Techniques for High-Dimensional IC Design Estimation Problems Andrew B. Kahng, Bill Lin and Siddhartha Nath VLSI CAD LABORATORY, UC San Diego Presented by: SeokHyeong Kang
Outline
• Motivation
• Our Work
• Metamodeling Background
• Hybrid Surrogate Modeling (HSM)
• Sampling Strategies
• Low-dimension: NoC
• High-dimension: PDN-Noise, CTS
• Conclusions
Estimation in IC Design Problems
• Combinatorial explosion in parameters
  • Microarchitectural
    • E.g., NoC flit-width, #buffers, #VCs, #Ports
  • Operational
    • E.g., workload activity factor, supply voltage
  • Design implementation
    • E.g., core area, tool knobs, constraints
  • Technology
    • E.g., library, corners
  • Manufacturing
    • E.g., guardbands
Why Surrogate Modeling?
• Implications of large parameter space
  • Complex interactions between parameters
  • Difficult to capture effects in closed-form analytical model
• Surrogate models can be accurate
  • Models derived from actual physical implementation data
  • High accuracy demonstrated in previous works, e.g., Samadi10, Nath12
Axes of Our Studies
• Modeling techniques
  • Multivariate Adaptive Regression Splines (MARS)
  • Radial Basis Functions (RBF)
  • Kriging (KG)
  • Hybrid Surrogate Modeling (HSM)
• Resource Metrics
  • Number of dimensions (D)
  • Number of samples (N)
• Sampling strategies
  • Latin Hypercube Sampling (LHS)
  • Adaptive Sampling (AS)
• Quality-of-Results Metrics
  • Maximum and average percentage errors
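The quality-of-results metrics named above can be sketched as follows. This is a minimal illustration that assumes percentage error is measured relative to the golden (actual) value; the function name `percentage_errors` and the sample numbers are hypothetical and not taken from the slides.

```python
import numpy as np

def percentage_errors(actual, predicted):
    """Return (maximum, average) absolute percentage error.

    Assumption: error is normalized by the actual (golden) value.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    pct = 100.0 * np.abs(predicted - actual) / np.abs(actual)
    return float(pct.max()), float(pct.mean())

# Hypothetical golden values and model estimates
max_err, avg_err = percentage_errors([100.0, 200.0, 50.0], [110.0, 190.0, 50.0])
print(max_err, avg_err)
```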
Our IC Design Estimation Problems
• Network-on-Chip (NoC)
  • Estimate: area and power
  • Dimensionality: low
  • Parameters: microarchitectural and implementation
• Power Delivery Network (PDN)
  • Estimate: cell delay and slew in presence of PDN noise
  • Dimensionality: high
  • Parameters: implementation and technology
• Clock Tree Synthesis (CTS)
  • Estimate: wirelength and buffer area of clock trees
  • Dimensionality: high
  • Parameters: implementation and technology
Key Contributions
• Demonstrate accuracy limits of popular metamodeling techniques as D increases
  • RBF and KG are preferred at low-D
  • MARS is preferred at high-D
• Demonstrate application of Adaptive Sampling (AS) to reduce errors and sample set sizes
  • Up to 1.5x reduction in worst-case estimation errors
  • Up to 1.2x reduction in sample set size
• Present Hybrid Surrogate Modeling (HSM) to achieve up to 3x reduction in worst-case estimation error
Brief Background on Metamodeling
• General form of estimation: [equation image not preserved]
• Terms annotated in the equation: predicted response, deterministic response, random noise function, regression coefficients
Metamodel Classification
• Tree-based
  • MARS
• Gaussian process-based
  • RBF
  • KG
• We use cross-validation to make models generalizable
Regression Function: MARS
[equation image not preserved]
where
• I_i: number of interactions in the i-th basis function
• b_ji: ±1
• x_v: v-th parameter
• t_ji: knot location
• Knot = value of the parameter at which the line segment changes slope
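The legend above matches the standard MARS formulation, so a plausible reconstruction of the regression function is given below. The symbols c_0 (intercept), c_i (basis-function coefficients), I (number of basis functions) and the index map v(j,i) are assumptions taken from the usual textbook form, not from the slide.

```latex
\hat{y}(\mathbf{x}) = c_0 + \sum_{i=1}^{I} c_i \prod_{j=1}^{I_i}
  \left[\, b_{ji}\,\bigl(x_{v(j,i)} - t_{ji}\bigr) \right]_{+},
\qquad [u]_{+} = \max(0, u)
```

Each factor is a hinge function: it is zero on one side of the knot t_ji and linear on the other, with b_ji = ±1 selecting the side.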
Regression Function: RBF
[equation image not preserved]
where
• a_j: coefficients of the kernel function
• K(.): kernel function
• µ_j: centroid
• r_j: scaling factors
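A minimal sketch of an RBF surrogate in terms of the legend above, under stated assumptions: the model is taken to be ŷ(x) = Σ_j a_j K(‖x − µ_j‖ / r_j), the kernel K is Gaussian, every training sample serves as a centroid µ_j, and a single shared scaling factor r is used. None of these choices is confirmed by the slides, and the function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(d):
    # K(d) = exp(-d^2); one common kernel choice (assumption)
    return np.exp(-d ** 2)

def rbf_fit(X, y, r=1.0):
    """Solve for the coefficients a_j, using each training sample as a centroid."""
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    return np.linalg.solve(gaussian_kernel(dists / r), y)

def rbf_predict(X_train, a, X_new, r=1.0):
    """Evaluate y_hat(x) = sum_j a_j * K(||x - mu_j|| / r)."""
    dists = np.linalg.norm(X_new[:, None, :] - X_train[None, :, :], axis=2)
    return gaussian_kernel(dists / r) @ a

# Hypothetical 1-D training data
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.sin(X[:, 0])
a = rbf_fit(X, y)
y_fit = rbf_predict(X, a, X)  # reproduces y at the training points
```

Because the centroids coincide with the training samples, this variant interpolates the training data exactly; using fewer centroids than samples would turn the solve into a least-squares fit instead.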
Regression Function: KG
[equation image not preserved]
where
• R(.): correlation function (Gaussian, linear, spherical, cubic, …)
• correlation function parameter (symbol not preserved; conventionally written θ)
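For orientation, the commonly used kriging formulation is sketched below. It is an assumption that the slide follows this form; the symbols β_j, f_j, z, σ² and θ come from the standard formulation rather than from the slide itself.

```latex
\hat{y}(\mathbf{x}) = \sum_{j=1}^{k} \beta_j f_j(\mathbf{x}) + z(\mathbf{x}),
\qquad
\operatorname{Cov}\bigl[z(\mathbf{x}_i), z(\mathbf{x}_l)\bigr]
  = \sigma^2 R(\theta, \mathbf{x}_i, \mathbf{x}_l)
```

With a Gaussian correlation function, one of the options listed above, R takes the form

```latex
R(\theta, \mathbf{x}_i, \mathbf{x}_l)
  = \exp\!\Bigl(-\sum_{d=1}^{D} \theta_d \,(x_{i,d} - x_{l,d})^2\Bigr)
```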
Multicollinearity at High-D
• Occurs if a parameter is a linear combination of one or more other parameters
  • The N x D matrix of parameter values is ill-conditioned
  • Large variance in the regression coefficients
  • Proper relationship between the parameters and the response is hard to determine
• Impact on estimation results
  • Large errors between predicted and actual responses as D increases
• Diagnostic tests to detect multicollinearity
  • Variance Inflation Factor (VIF)
  • F-test
  • ANOVA
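The VIF diagnostic can be sketched as follows, assuming the textbook definition VIF_j = 1 / (1 − R_j²), where R_j² comes from regressing parameter j on all other parameters. The slides do not state which definition or normalization they use, and the function name and data below are hypothetical.

```python
import numpy as np

def variance_inflation_factors(X):
    """Textbook VIF for each column of an N x D parameter matrix."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    vifs = np.empty(d)
    for j in range(d):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress column j on the remaining columns plus an intercept
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((target - target.mean()) ** 2).sum())
        r2 = 1.0 - ss_res / ss_tot
        vifs[j] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return vifs

# Hypothetical data: column 2 is almost a linear combination of columns 0 and 1,
# while column 3 is independent of the rest.
rng = np.random.default_rng(0)
p0 = rng.normal(size=200)
p1 = rng.normal(size=200)
p2 = p0 + p1 + 0.01 * rng.normal(size=200)
p3 = rng.normal(size=200)
vifs = variance_inflation_factors(np.column_stack([p0, p1, p2, p3]))
print(vifs)
```

In this example the three entangled columns receive very large VIF values, while the independent column stays close to 1.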
Hybrid Surrogate Modeling
• “Cure” adverse effects of multicollinearity as D increases
• Variant of Weighted Surrogate Modeling, but uses least-squares regression to determine the weights
[equation image not preserved]
where
• w1: weight of the predicted response of the MARS surrogate model
• w2: weight of the predicted response of the RBF surrogate model
• w3: weight of the predicted response of the KG surrogate model
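Read literally, the legend above describes a weighted sum ŷ_HSM = w1·ŷ_MARS + w2·ŷ_RBF + w3·ŷ_KG with the weights obtained by least-squares regression. The sketch below illustrates that idea under assumptions the slides do not confirm: no intercept term, weights fitted against the golden responses of the same sample set, and synthetic stand-ins for the three component predictions. The function names are hypothetical.

```python
import numpy as np

def fit_hsm_weights(component_predictions, actual):
    """Least-squares weights for combining the component surrogate predictions."""
    P = np.column_stack(component_predictions)
    w, *_ = np.linalg.lstsq(P, np.asarray(actual, dtype=float), rcond=None)
    return w

def hsm_predict(component_predictions, w):
    """Weighted sum of the component predictions."""
    return np.column_stack(component_predictions) @ w

# Synthetic stand-ins for the MARS, RBF and KG predictions (not real model output)
rng = np.random.default_rng(0)
actual = np.linspace(1.0, 10.0, 20)
p_mars = 1.10 * actual + rng.normal(0.0, 0.4, 20)
p_rbf = actual + rng.normal(0.0, 0.5, 20)
p_kg = 0.90 * actual + rng.normal(0.0, 0.3, 20)

components = [p_mars, p_rbf, p_kg]
w = fit_hsm_weights(components, actual)
y_hsm = hsm_predict(components, w)

def sse(pred):
    return float(((pred - actual) ** 2).sum())

sse_hsm = sse(y_hsm)
sse_components = [sse(p) for p in components]
```

On the data used to fit the weights, the combined model's sum of squared errors cannot exceed that of any single component, because each component alone corresponds to one particular choice of weights. Accuracy on unseen test points is not guaranteed by this argument and has to be measured separately.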
Metamodeling Flow
[Flowchart; boxes listed in extraction order, arrows not preserved]
• Generate golden data points
• Generate test data points
• Derive model (MARS/RBF/KG/…)
• Generate training samples (LHS, AS)
• Surrogate models
• Estimate response
• Compute model accuracy
Latin Hypercube Sampling
• Sample uniformly (“exploration”) across parameter space
[Figure not preserved; labels: "Only 5 samples", "Error"]
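A minimal sketch of Latin Hypercube Sampling on the unit hypercube, assuming the basic randomized variant: each dimension is split into N equal strata and every stratum receives exactly one sample. The function name is hypothetical; mapping the unit-cube points onto real parameter ranges is a simple affine rescale.

```python
import numpy as np

def latin_hypercube(n_samples, n_dims, rng):
    """Return an (n_samples, n_dims) array of LHS points in [0, 1)."""
    jitter = rng.random((n_samples, n_dims))
    samples = np.empty((n_samples, n_dims))
    for d in range(n_dims):
        # Assign each sample to a distinct stratum in this dimension
        strata = rng.permutation(n_samples)
        samples[:, d] = (strata + jitter[:, d]) / n_samples
    return samples

rng = np.random.default_rng(0)
pts = latin_hypercube(5, 3, rng)  # 5 samples in a 3-parameter space

# Rescale to hypothetical parameter ranges [lo, hi] per dimension
lo = np.array([1.0, 0.8, 10.0])
hi = np.array([8.0, 1.2, 100.0])
scaled = lo + pts * (hi - lo)
```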
Adaptive Sampling
• Sample using “exploration” and “exploitation” across parameter space
[Figure not preserved; labels: "Only 5 samples", "Error"]
Results of Our PDN Studies
• AS reduces
  • error by 1.5x compared to LHS for the same #samples
  • #samples by 1.2x compared to LHS for the same % error
[Plot not preserved; annotations: ~1.5x in error, ~1.2x in #samples]
Experimental Setup: NoC (Low-D)
• Metrics to estimate
  • Total area of standard cells and total power
• Parameters
  • Microarchitectural: #Ports, #VCs, #Buffers, Flit-Width
  • Implementation: Clock frequency
• Others
  • Technology libraries: TSMC65GPLUS and TSMC45GS
  • SP&R Tools: Synopsys Design Compiler and Cadence SoC Encounter
  • Router RTL: Netmaker from Cambridge University
• Methodology
  • Perform SP&R with above tools and parameters
  • Extract post-P&R area and power
  • Derive surrogate models
Maximum Estimation Error: NoC (Low-D)
• With a training sample set size of 36 data points
  • RBF and KG (Gaussian process-based) have in general 1.5x less error than MARS (tree-based)
  • HSM can have up to 3x less error than MARS
• RBF, KG and HSM are highly accurate at low dimensions
Experimental Setup: PDN (High-D)
• Metrics to estimate
  • Cell delay and slew
• Parameters
  • Implementation:
    • Cell: cell size, load capacitance, input slew, body bias
    • PDN noise: noise amplitude, noise slew, noise offset
    • Corner: temperature, process-performance ratio
  • Technology: supply voltage, threshold voltage
• Others
  • Technology libraries: TSMC65GPLUS
  • Tool: Synopsys HSPICE
  • Netlist: 10-stage INV chain
• Methodology
  • Perform SPICE simulation with above parameters
  • Extract delay and slew of cells
  • Derive surrogate models
Maximum Estimation Error: PDN (High-D)
[Plot not preserved; results are labeled by D]
• With a training sample set size of 700 data points
  • MARS and HSM have 3x less error than RBF with ridge regression
  • At D = {8, 9}, MARS and HSM have similar accuracy, because other models have large average errors
• MARS and HSM are highly accurate at high dimensions
Experimental Setup: CTS (High-D)
• Metrics to estimate
  • Wirelength and total buffer area
• Parameters
  • Implementation: #sinks, buffer type, max. # levels, core area, max. skew, max. delay
  • Technology: max. buffer size, max. buffer and sink transition times, max. wire widths
• Others
  • Technology libraries: TSMC65GPLUS and TSMC45GS
  • Tool: Cadence SoC Encounter
  • Testcase: Uniformly placed sinks
• Methodology
  • Perform CTS with SoC Encounter and above parameters
  • Extract wirelength and buffer area of clock trees
  • Derive surrogate models
Maximum Estimation Error: CTS (High-D)
[Plot not preserved; results are labeled by D]
• With a training sample set size of 84 data points
  • HSM has up to 3x less error than all other surrogate models
  • Errors grow with D in MARS, RBF, KG due to multicollinearity
• HSM remains highly accurate at high dimensions
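IC Design Modeling Guidelines
[Decision flowchart; branch connections not preserved]
• Decision nodes: D > 5? (Y/N); All VIF values < 0.33? (Y/N, on both branches); Estimates with small µ and σ²? (Y/N)
• Recommendations at the leaves, in extraction order: Try MARS; Try HSM/RBF/KG; Try HSM/MARS/RBF/KG; Try MARS; Try HSM/MARS/RBF/RBF+RR/KG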
Conclusions
• Metamodeling techniques can be effective for IC design estimation problems
• We study three problems along multiple axes
  • NoC, PDN, CTS
  • Quality and resource metrics, modeling techniques and sampling strategies
• We use AS and demonstrate
  • 1.5x reduction in error vs. LHS
  • 1.2x reduction in sample size vs. LHS
• We propose Hybrid Surrogate Modeling (HSM) to “cure” multicollinearity at high dimensions
  • HSM can be up to 3x more accurate than MARS, RBF and KG at low and high dimensions
• Ongoing: (1) techniques to reduce multicollinearity, (2) dimensionality reduction, and (3) application to other IC physical design problems