Dynamic Power Analysis of Custom Macros
Stephen Bijansky, Bassam Mohd, Baker Mohammad
Outline • Motivation • HSIM Power Analysis • ESP-CV Power analysis • ESP-CV Flow • Results • Conclusions
Motivation • Power characterization is an important part of low-power design • Active power is hard to model for custom macros designed at the transistor level • SPICE-level simulation is slow • Characterizing all custom cells is a big task • A detailed gate-level model is needed to use the ASIC design flow • Changes at the top level affect macro power • Power must be re-modeled on an ongoing basis as new stimulus arrives
Overview • Power estimation for custom macros • Transistor level schematics • Post-layout capacitance extraction • Reduce analysis time • Improve accuracy for long test cases • This work is used extensively in Qualcomm’s 45nm low power DSPs
Traditional Approach (.lib) • Fast SPICE simulator HSIM • Assume certain activities on data • Append power into .lib files • Conditional statements based on control signals • Limitations of conditional statements • Mutually exclusive • Depends on internal state nodes • Has the macro just come out of reset, or has it been running for a while? • Potentially 2^(M+N) entries in the .lib file
Traditional Approach (HSIM), Cont. • Fast SPICE simulator • Accuracy within 2% to 3% of HSPICE • Use HSIM to run the entire power benchmark • Power benchmark might be thousands of cycles • Potential for long run time • Large macros could take days or weeks • Reduce benchmark to only 100 cycles • Which 100-cycle window should be used? • Power estimate could be too high or too low • Setting initial conditions can be time consuming and error prone
1st Order Power Equation Power = Activity Factor * Cap * Voltage^2 * Freq • Capacitance – LPE • Voltage – Fixed • Frequency – Fixed • Activity Factor – Unknown
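As a worked instance of the first-order equation, the sketch below evaluates it for a single node. The function name and every numeric value (0.094 pF, 1.0 V, 500 MHz, activity factor 0.15) are illustrative assumptions, not figures reported in this work.

```python
def dynamic_power_w(activity_factor, cap_f, voltage_v, freq_hz):
    """First-order dynamic power in watts: AF * C * V^2 * f."""
    return activity_factor * cap_f * voltage_v ** 2 * freq_hz


# Hypothetical node: 0.094 pF, 1.0 V supply, 500 MHz clock, activity factor 0.15.
power_w = dynamic_power_w(
    activity_factor=0.15,
    cap_f=0.094e-12,
    voltage_v=1.0,
    freq_hz=500e6,
)
print(f"{power_w * 1e6:.2f} uW")
```

With these assumed values the node dissipates about 7.05 uW. Capacitance, voltage, and frequency are known ahead of time, so the only quantity that has to come from simulation is the activity factor.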
ESP-CV Simulation • Symbolic equivalence checking of schematics vs RTL • Input to ESP-CV is a standard Verilog testbench • Use ESP-CV as a Verilog simulator for schematics • Verilog simulation is orders of magnitude faster than SPICE • Functional simulation • Only need to determine activity factor
[Figure: comparison of the simulators HSPICE, VCS, HSIM, and ESP-CV. Recoverable labels: "Gold standard" for accuracy; "Extremely fast", no timing; "Functional accuracy", automated modeling; "High performance" for accuracy; RC Verilog switch-level simulator; transistor terminals G, S, D.]
ESP-CV Simulation • ESP-CV converts the schematic to switch-level Verilog • Special directives for transistor strengths • Internal node names in a custom macro are not in RTL • ESP-CV uses the internal nodes in the schematic • Run entire benchmarks using thousands of cycles • Same benchmarks used in PT-PX for power estimation of synthesized logic • Includes reset and initialization • Fast run time allows running many more benchmarks
Flow Steps • Inputs to the flow • SPICE netlist • VCD on the macro boundaries • Cap file • Output: power value in W • Integrate the flow with the PT-PX chip-level run
RTL Simulation and Testbench Creation • Entire benchmark is simulated for the top level design • Verilog VCS simulation • Starts from reset, performs initialization, then benchmark • Single fsdb dump file for each benchmark • Vtran converts the fsdb dump of the benchmark to a Verilog testbench • Macro testbench has all of the same inputs as the top level simulation
Calculate Activity Factor • Process the ESP-CV VCD dump file and calculate an activity factor for each node • Vcd2saif produces a Switching Activity Interchange Format (SAIF) file • Time spent at 0/1/Z, number of transitions, … • Computed only for the window of interest • Process the SAIF file to get the activity factor for each node • Activity factor = transitions / number of cycles
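The last step above, transitions divided by number of cycles, can be sketched as follows. This is a minimal illustration and not the flow's actual script: the function name, node names, and counts are hypothetical, the transition counts are assumed to have already been read from the SAIF file, and how a "transition" is counted (every edge, or one rise/fall pair) is not stated in the slides.

```python
def activity_factors(transition_counts, num_cycles):
    """Map each node to transitions / number of cycles in the window."""
    if num_cycles <= 0:
        raise ValueError("num_cycles must be positive")
    return {
        node: count / num_cycles
        for node, count in transition_counts.items()
    }


# Hypothetical per-node transition counts inside a 100-cycle window of interest.
counts = {"xblock/data0": 30, "xblock/sel": 5, "xblock/idle": 0}
af = activity_factors(counts, num_cycles=100)

for node, factor in sorted(af.items()):
    print(f"{node:16s} {factor:.2f}")
```

Restricting the counts to the window of interest before dividing matters: including reset or idle cycles in the denominator would dilute the activity factor.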
Node Capacitances • Calibre layout parasitic extraction (LPE) • Nanotime calculates the total cap of every node • Reads the Calibre SPEF file • Adds gate, diffusion, and wire caps • Qcs_process_cap_rpt.pl • Converts the Nanotime report to an easy-to-use, column-based text file format • For nodes, such as bitlines, that do not have a full rail swing, the caps can be scaled
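The slides do not say how partial-swing capacitances are scaled. One plausible approach, shown here purely as an assumption, is to multiply a node's capacitance by the fraction of the supply rail it actually swings, since the charge moved per transition is proportional to the swing. The function name, node names, capacitance values, and the 25% bitline swing are all hypothetical.

```python
def scale_partial_swing_caps(node_caps, swing_fraction):
    """Return a copy of node_caps with listed nodes scaled by their swing fraction.

    swing_fraction maps a node name to (voltage swing / supply voltage).
    Nodes not listed are assumed to swing rail to rail and are left unchanged.
    """
    scaled = dict(node_caps)
    for node, fraction in swing_fraction.items():
        if not 0.0 <= fraction <= 1.0:
            raise ValueError(f"swing fraction for {node} must be in [0, 1]")
        if node in scaled:
            scaled[node] = scaled[node] * fraction
    return scaled


# Hypothetical caps (same unit as the cap report) and a bitline swinging 25% of the rail.
caps = {"clk": 0.094, "xarray/bl0": 0.200}
effective = scale_partial_swing_caps(caps, {"xarray/bl0": 0.25})
print(effective)
```

Scaling the capacitance up front keeps the later power calculation unchanged, because voltage stays a single fixed value for every node.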
Calculate Power • Qcs_calc_power.pl • Combines switching activities with the capacitances to compute the power • Voltage and frequency are fixed • Output is a text file with the power, activity factor, capacitance, and name for each node • Easy to sort to determine which nodes use the most power • Retains hierarchy, so results are easy to filter • Can partition to determine power on multiple supply nets
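The combination step can be sketched as below. This is not Qcs_calc_power.pl itself, only an illustration of the same idea: the function names, the 1.0 V and 500 MHz operating point, the activity factors, and the assumption that capacitances are in farads after unit conversion are all hypothetical.

```python
def node_power_report(activity, caps_f, voltage_v, freq_hz):
    """Return (power_w, activity_factor, cap_f, node) rows, largest power first."""
    rows = []
    for node, af in activity.items():
        cap = caps_f.get(node)
        if cap is None:
            continue  # no extracted capacitance for this node, so skip it
        power_w = af * cap * voltage_v ** 2 * freq_hz
        rows.append((power_w, af, cap, node))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows


def filter_hierarchy(rows, prefix):
    """Keep only the rows whose hierarchical node name starts with prefix."""
    return [row for row in rows if row[3].startswith(prefix)]


# Hypothetical inputs: activity factors and capacitances in farads.
activity = {"clk": 1.0, "xblock/lclk": 1.0, "xblock/data0": 0.15}
caps_f = {"clk": 0.094e-12, "xblock/lclk": 0.127e-12, "xblock/data0": 0.020e-12}

rows = node_power_report(activity, caps_f, voltage_v=1.0, freq_hz=500e6)
for power_w, af, cap, node in rows:
    print(f"{power_w * 1e6:8.2f} uW  {af:4.2f}  {cap * 1e12:6.3f} pF  {node}")

xblock_rows = filter_hierarchy(rows, "xblock/")
```

Because node names keep their hierarchy, a simple prefix match is enough to total the power of one sub-block. A similar grouping by supply net would need a node-to-supply mapping, which is not shown here.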
100 Cycle Validation • Run ESP-CV with the same 100-cycle window that is used for HSIM • For tests that use more than 1 mW of power, ESP-CV is within 3% of HSIM • For tests that use less than 1 mW, ESP-CV is within 0.08 mW of HSIM • ESP-CV correlates well with HSIM
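The two-part criterion above (relative error for larger tests, absolute error for smaller ones) can be written as a simple check. This is a sketch under assumptions: the function name and sample numbers are hypothetical, the HSIM value is assumed to decide which side of the 1 mW threshold a test falls on, and a test at exactly 1 mW is treated with the absolute bound.

```python
def correlates_with_hsim(esp_mw, hsim_mw, rel_tol=0.03, abs_tol_mw=0.08,
                         threshold_mw=1.0):
    """Return True if an ESP-CV power value meets the stated bound against HSIM."""
    error_mw = abs(esp_mw - hsim_mw)
    if hsim_mw > threshold_mw:
        # Larger tests: compare relative error against 3%.
        return error_mw <= rel_tol * hsim_mw
    # Smaller tests: compare absolute error against 0.08 mW.
    return error_mw <= abs_tol_mw


# Hypothetical (ESP-CV, HSIM) power pairs in mW.
print(correlates_with_hsim(2.05, 2.00))  # 2.5% off a 2 mW test
print(correlates_with_hsim(2.10, 2.00))  # 5% off a 2 mW test
print(correlates_with_hsim(0.45, 0.50))  # 0.05 mW off a 0.5 mW test
```

Switching to an absolute bound below 1 mW avoids flagging tests whose relative error looks large only because the power itself is tiny.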
Results • 100 cycles do not accurately model an entire test • Test3 reported 4.7X more power using 100 cycles compared to the entire test • Test4 reported 55% less power using 100 cycles compared to the entire test • Difficult to choose a good 100 cycle window
Run Time Comparison • ESP-CV full test simulations • Test3 with 49,101 cycles took 406 seconds • Test4 with 240,510 cycles took 3,267 seconds • Event-based simulation scales with the number of cycles • ESP-CV 100-cycle simulations needed 21 seconds • Not many events in 100 cycles • HSIM needed between 1,950 seconds (Test5) and 9,468 seconds (Test2) to run 100 cycles • Large differences in run time even with a fixed number of cycles
IR Drop Analysis • Compute fixed activity factor power for use in Redhawk IR drop analysis • Every clock node is assigned an activity factor of 100% • Every non-clock node is assigned an activity factor of 15%, which is 3 transitions per 10 clock cycles • This is a worst-case analysis used to stress the power grid and find its weak points
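The fixed assignment can be sketched as below. The function name and node names are hypothetical, and the set of clock nodes is assumed to be known from elsewhere. The arithmetic in the final comment is one reading that reconciles the two figures: if the two edges a clock makes per cycle count as 100%, then 3 transitions in 10 cycles is 3 / (10 * 2) = 15%.

```python
CLOCK_AF = 1.00      # every clock node: 100%
NON_CLOCK_AF = 0.15  # every non-clock node: 15%


def fixed_activity(nodes, clock_nodes):
    """Assign the worst-case fixed activity factor to every node."""
    return {
        node: CLOCK_AF if node in clock_nodes else NON_CLOCK_AF
        for node in nodes
    }


# Hypothetical node list; which nodes are clocks is an assumed input.
nodes = ["clk", "xblock/lclk", "xblock/data0"]
ir_activity = fixed_activity(nodes, clock_nodes={"clk", "xblock/lclk"})
print(ir_activity)

# Assumed normalization: 2 clock edges per cycle correspond to 100%.
non_clock_check = 3 / (10 * 2)
```

These fixed factors replace the simulated ones only for the IR drop run, so the per-node capacitances and the power equation stay the same.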
Conclusion • Simulate an entire benchmark instead of trying to guess at a subset of the benchmark • The wrong subset led to a 4.7X overestimation of power • Includes reset and initialization • Fast simulation enables running more benchmarks • ESP-CV is being used to generate power estimations of longer benchmarks
Future Work • Short-circuit power modeling • Not addressed by the current flow • Leakage power modeling • Active leakage power is not accurately modeled • Enable other methods to calculate node capacitances • More calibration on different circuit families
Thank You! Questions
Nanotime Capacitance Report
# max rise CAP
NODE : clk
C_diff : 0.000
C_overlap : 0.004
C_gate : 0.003
C_wire : 0.081
C_pin : 0.006
C_total : 0.094
# max rise CAP
NODE : xblock/lclk
C_diff : 0.000
C_overlap : 0.013
C_gate : 0.012
C_wire : 0.093
C_pin : 0.009
C_total : 0.127
Process Capacitance Report
use List::Util qw(max);

# Keep the largest C_total seen for each node in the capacitance report.
my %nodeCap = ();
while (my $line = <CAPFILE>) {
    if ($line =~ /^NODE : (\S+)/) {
        my $node = $1;
        # C_total is the sixth line after the NODE line.
        $line = <CAPFILE> for 1 .. 6;
        if (defined $line && $line =~ /^C_total\s*:\s*(\S+)/) {
            my $ctotal = $1;
            # Treat a node not seen before as 0 so max() has two defined values.
            $nodeCap{$node} = max($ctotal, $nodeCap{$node} // 0);
        }
    }
}