Explicit Modeling of Control and Data for Improved NoC Router Estimation
Andrew B. Kahng+*, Bill Lin* and Siddhartha Nath+
UCSD CSE+ and ECE* Departments
{abk, billlin, sinath}@eng.ucsd.edu
Outline
• Motivation
• Our work: Overview
• Methodology
• Flit-level power estimation
• Summary
NoC Modeling So Far… (ORION)
[Figure: router block diagram with SRC, SINK, arbiter, XBAR, links, and buffers BUF I, BUFE, BUFW, BUFN, BUFS. Labels: ORION1.0 (2002), ORION2.0 (2009), leakage power, clock power, and the logic template 6NOR + 2INV + DFF.]
What Is The Problem?
[Figure: router block diagram (SRC, SINK, arbiter, XBAR, links, buffers) annotated with the logic template 6NOR + 2INV + DFF.]
• RTL code mismatch
• Logic transformation and technology mapping mismatch
How Bad Is It?
• Router RTL generators:
  • Netmaker (Cambridge, UK)
  • Stanford NoC (Stanford)
[Chart: error labels 460% and 89%.]
• Why such large errors?
  • Assumed logic template inaccurate
  • Control logic not modeled
  • Implementation details missing
We Propose: Step 1
• Derive router component block parametric models from post-synthesis netlists
[Figure: router blocks annotated with scaling terms ~$P^2$, ~$F$, ~$P^2$; for the crossbar, XBAR ~ $P^2 F$.]
• Parameters: P = #Ports, V = #VCs, B = #BUFs, F = flit width
• Key idea: No assumed logic template
• Component models derived from actual RTL synthesized with cell libraries
We Propose: Step 2
• XBAR ~ $P^2 F$ is fitted by LSQR as $\mathrm{XBAR}_{\mathrm{area}} = a_1 \cdot P^2 F + a_0$
• Automatic fitting of models with post-P&R power and area
• Key idea: Capture implementation details using automatic regression fit
• Characterization performed only once and usable for multiple design space explorations
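The Step 2 fit can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the helper names `fit_line` and `xbar_area` are hypothetical, and the training points are made-up numbers (arbitrary units) standing in for post-P&R crossbar areas.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a1*x + a0 (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a1 = sxy / sxx
    a0 = mean_y - a1 * mean_x
    return a1, a0

# Hypothetical training set: (P, F, post-P&R crossbar area).
# The areas are invented for illustration only.
training = [(4, 16, 612.0), (5, 32, 1700.0), (5, 64, 3300.0), (8, 32, 4196.0)]

# The parametric term from Step 1 is P^2 * F.
xs = [p * p * f for p, f, _ in training]
ys = [area for _, _, area in training]
a1, a0 = fit_line(xs, ys)

def xbar_area(ports, flit_width):
    """Estimate crossbar area for an unseen (P, F) configuration."""
    return a1 * ports * ports * flit_width + a0

print(f"a1={a1:.3f}, a0={a0:.3f}, area(P=6, F=32)={xbar_area(6, 32):.1f}")
```

Once `a1` and `a0` are characterized, `xbar_area` can be reused across design space explorations without rerunning P&R, which is the point of the one-time characterization.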
Model Development
• Inputs: NoC router RTL generators; µArch params: P, V, B, F; Impl. params: clock frequency
• Two RTL generators:
  • Netmaker (Cambridge, UK)
  • Stanford NoC
• SP&R tools:
  • Cadence RC & Synopsys DC for hierarchical synthesis to analyze each block
  • Cadence SOC Encounter for P&R
• Flow: synthesis and P&R (DC/RC, SOCE), then analysis of blocks (XBAR, SW & VC arbiter, input & output buffers), then new models for each component block
Overall Methodology
[Figure: flow diagram. Labels: ORION_NEW models; technology library; post-P&R data per block; basic regression fit; std. cell count & area; cell area; cell leakage; pin cap.; internal energy; leakage power; internal power; switching power; Manual; LSQR. Outputs: estimates for gate count, area, and power (leakage, internal, switching).]
• LSQR
  • Accurate (captures implementation details)
  • One-time overhead (generation of P&R training data points)
• Manual
  • Quick and easy
  • Misses implementation details
Results: Area And Power
• Area: 4x reduction in worst-case estimation error
• Power: 6.5x reduction in worst-case estimation error
• Methodology scales across technologies and router RTL generators
Flit-level Power Estimation
[Figure: flow diagram. Labels: post-P&R router netlist; testbench; gate-level simulation; VCD; power analysis; power report; ORION_NEW models; regression fit; flit-level power model; GARNET; gem5; flit-level power estimates.]
• Dynamic power estimation using flit-level bit encodings
• Have integrated with full-system NoC simulator (GARNET)
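To make the idea of a regression-fitted flit-level model concrete, here is a hedged Python sketch. The slide does not say which features are derived from the flit bit encodings; the sketch assumes one plausible choice, the number of bit toggles between consecutive flits. The names `toggles`, `fit_line`, and `estimate_flit_power`, and all numeric values, are hypothetical.

```python
def toggles(prev_flit, flit):
    """Number of bit positions that differ between two flit payloads."""
    return bin(prev_flit ^ flit).count("1")

def fit_line(xs, ys):
    """Ordinary least squares for y = k1*x + k0 (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k1 = sxy / sxx
    return k1, mean_y - k1 * mean_x

# Hypothetical training trace: 8-bit flit payloads seen by the testbench,
# and the dynamic power reported for each transition (made-up values,
# standing in for the power report from gate-level simulation).
flits = [0x00, 0x01, 0xFF, 0xF0, 0x0F]
reported_power = [2.5, 5.5, 4.0, 6.0]

# One feature per transition: toggle count between consecutive flits.
features = [toggles(a, b) for a, b in zip(flits, flits[1:])]
k1, k0 = fit_line(features, reported_power)

def estimate_flit_power(prev_flit, flit):
    """Assumed linear model: power = k1 * toggles + k0."""
    return k1 * toggles(prev_flit, flit) + k0

print(features, k1, k0, estimate_flit_power(0x0F, 0x0C))
```

A simulator such as GARNET could call a function like `estimate_flit_power` per flit traversal; how the actual integration is wired up is not described on the slide.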
Results: Flit-level Power
• 3.6x reduction in worst-case error for dynamic power estimation
• Accurate estimation of flit-level dynamic power
Summary
• New hybrid modeling methodology: relax the template mindset
  • Explicitly models control and data signals
  • Captures RTL and implementation details
• Using the proposed parametric regression methodology, worst-case estimation errors are reduced by a factor of
  • 6.5x from ORION2.0 for power
  • 4x from ORION2.0 for area
• We propose an application of our methodology for flit-level dynamic power modeling and integration with GARNET
  • 3.6x worst-case error reduction in dynamic power estimation
• Ongoing: Non-parametric modeling of post-P&R power and area
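Regression analysis approach
• Multi-step regression fit
• Step 1: Fit instances of each router component with post-layout instance counts
  $a_1 \cdot \mathrm{Insts}^{\mathrm{model}}_{\langle\mathrm{component}\rangle} + a_0 = \mathrm{Insts}^{\mathrm{tool}}_{\langle\mathrm{component}\rangle}$
  $\mathrm{InstsR}^{\mathrm{model}}_{\langle\mathrm{component}\rangle} = a_1 \cdot \mathrm{Insts}^{\mathrm{model}}_{\langle\mathrm{component}\rangle} + a_0$
• Step 2a: Fit area of each router component with post-layout area
  $b_1 \cdot \mathrm{InstsR}^{\mathrm{model}}_{\langle\mathrm{component}\rangle} + b_0 = \mathrm{Area}^{\mathrm{tool}}_{\langle\mathrm{component}\rangle}$
• Step 2b: Fit power of each router component with post-layout power (leakage, internal, switching separately)
  $\{c_5, d_5, e_5\} \cdot \mathrm{InstsR}^{\mathrm{model}}_{\mathrm{XBAR}} + \{c_4, d_4, e_4\} \cdot \mathrm{InstsR}^{\mathrm{model}}_{\mathrm{SWVC}} + \{c_3, d_3, e_3\} \cdot \mathrm{InstsR}^{\mathrm{model}}_{\mathrm{InBUF}} + \{c_2, d_2, e_2\} \cdot \mathrm{InstsR}^{\mathrm{model}}_{\mathrm{OutBUF}} + \{c_1, d_1, e_1\} \cdot \mathrm{InstsR}^{\mathrm{model}}_{\mathrm{CLKCTRL}} + \{c_0, d_0, e_0\} = \{P^{\mathrm{tool}}_{\mathrm{leak}}, P^{\mathrm{tool}}_{\mathrm{int}}, P^{\mathrm{tool}}_{\mathrm{SW}}\}$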
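The multi-step fit can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' implementation: the helpers `solve` and `lsq` are hypothetical, all data are made up, and Step 2b is simplified to two components (XBAR and InBUF) and one power type (leakage) to keep the example short.

```python
def solve(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def lsq(rows, ys):
    """Least-squares weights for y = w[0]*x0 + ... + w[k-1]*x(k-1) + w[k],
    obtained from the normal equations (X^T X) w = X^T y."""
    X = [list(r) + [1.0] for r in rows]  # append the intercept column
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(X, ys)) for i in range(k)]
    return solve(xtx, xty)

# Made-up data for four router configurations.
xbar_model = [100.0, 200.0, 400.0, 800.0]    # instance counts from the model
xbar_tool = [130.0, 250.0, 490.0, 970.0]     # post-layout instance counts
xbar_area_tool = [215.0, 395.0, 755.0, 1475.0]
inbuf_refined = [300.0, 320.0, 900.0, 1000.0]  # assumed already refined
leak_tool = [7.8, 9.4, 23.4, 30.2]

# Step 1: refine the model's instance counts against the tool's counts.
a1, a0 = lsq([[x] for x in xbar_model], xbar_tool)
xbar_refined = [a1 * x + a0 for x in xbar_model]

# Step 2a: fit component area against the refined instance counts.
b1, b0 = lsq([[x] for x in xbar_refined], xbar_area_tool)

# Step 2b (simplified): fit leakage power against refined counts of
# two components plus a constant term.
c2, c1, c0 = lsq(list(zip(xbar_refined, inbuf_refined)), leak_tool)

print(a1, a0, b1, b0, c2, c1, c0)
```

The full Step 2b would use all five component terms and repeat the fit for internal and switching power, which needs at least six training configurations per fit.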
Related work
NoC Modeling
• Architecture templates
• ORION2.0
• Gate-level analytical models
• Parametric regression
• Pre- and post-layout power estimation
• RTL simulations
• Non-parametric regression
• MARS
[Figure: taxonomy chart. Labels: Circuit model; Regression model; Analytical; Arch templates; Parametric; Non-parametric; Control; Tool; ORION_NEW + regression; flit-level.]
• Significant departure: Relax the “template” mindset
Results
• Avg. estimation error in # instances reduced from 109.5% to 8.8%
• Avg. estimation error in area reduced to 9.8%
• Avg. estimation error in power reduced to 4.58%