
SHARPE Toolflow


Presentation Transcript


  1. SHARPE Toolflow Jayanand Asok Kumar

  2. Statistical timing
     • Signal arrival time is statistical
       • Depends on applied input vector
       • Affected by process variations
     • Static Timing Analysis (STA)
       • Design for worst-case timing
       • Conservative clock rates
     • Statistical STA (SSTA)
       • Allow for some input patterns to violate the timing constraint
       • Faster clock for the design
       • Tradeoff: Computational errors vs. speed
     A “better-than-worst-case” design methodology is advocated

  3. SSTA approach
     • Conventional analysis
       • SSTA at gate level
       • Iterative process to meet timing requirements
     • Shift analysis to RT level
       • Better design choices early in the design cycle
     • Metric of interest
       • For a given timing specification T: “What is the probability that the delay at a module output is less than T?”
       • Defined as the critical probability
     A high-level SSTA approach is proposed
     [Figure: RTL design → Synthesis → Gate-level netlist]
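
The metric on slide 3 can be illustrated with a minimal Python sketch. The delay distribution below is hypothetical data chosen for illustration and does not come from the presentation; the function simply sums the probability of every delay value below T.

```python
from fractions import Fraction

def critical_probability(delay_dist, T):
    """Return P(delay < T) for a discrete delay distribution.

    delay_dist maps a delay value to its probability.
    """
    return sum(p for d, p in delay_dist.items() if d < T)

# Hypothetical delay distribution at a module output (delay in ns -> probability).
delay_dist = {
    Fraction(1): Fraction(1, 2),
    Fraction(2): Fraction(3, 10),
    Fraction(3): Fraction(3, 20),
    Fraction(4): Fraction(1, 20),
}

# Probability that the output settles strictly before T = 3 ns.
crit = critical_probability(delay_dist, T=Fraction(3))
print(crit)
```

Exact fractions are used so that the probabilities add up to 1 without rounding error.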

  4. What is SHARPE?
     • SHARPE is Statistical High level Analysis and Rigorous Performance Estimation
     • SHARPE is our CAD toolflow for formal SSTA in RTL
     • We consider delay variations due to input patterns

  5. SHARPE toolflow
     [Flow diagram, steps as labeled on the slide]
     • RTL modules/blocks
     • Obtain RTL signal probabilities
     • Obtain RTL delay models from gate-level
     • Obtain RTL delay distribution
     • Compute critical probability

  6. Inner details of SHARPE
     [Flow diagram, steps as labeled on the slide]
     • Obtain RTL signal probabilities
     • Obtain RTL delay models from gate-level
     • Obtain RTL delay distribution
     • Compute critical probability

  7. Delay macromodeling
     Express delay as a function of RTL signals

  8. RTL Delay modeling
     • Delay is a gate-level artifact
       • Need to shift the delay models to RT level
       • RTL assignment statements are considered
     • Delay macromodel
       • Delay of RTL operator = function of operands (i.e., inputs)
     • A library of macromodels is constructed
       • One-time characterization effort
     [Figure: RTL operator → Simulate gate-level implementation (multiple implementations) → Tabulate delay vs. input vector → Find f: Delay = f(inputs) → Library; repeated for other RTL operators]
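
The characterization loop on slide 8 (simulate, tabulate delay vs. input vector, derive f) can be sketched in Python as follows. The unit-delay timing model in `ripple_carry_delay` is an assumption made for illustration and stands in for a real gate-level simulation; the names `ripple_carry_delay`, `table` and `macromodel` are hypothetical and not part of SHARPE.

```python
from collections import Counter
from itertools import product

def ripple_carry_delay(a, b, width):
    """Settling time of a toy ripple-carry adder under an assumed unit-delay model."""
    carry_time = 0              # carry-in is stable at time 0
    worst = 0
    for i in range(width):
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        sum_time = carry_time + 1        # sum bit waits for the incoming carry
        if a_i != b_i:
            carry_time += 1              # propagate: carry-out follows carry-in
        else:
            carry_time = 1               # generate/kill: carry-out decided locally
        worst = max(worst, sum_time, carry_time)
    return worst

WIDTH = 4

# Tabulate delay vs. input vector (exhaustive for a small operator).
table = {(a, b): ripple_carry_delay(a, b, WIDTH)
         for a, b in product(range(2 ** WIDTH), repeat=2)}

def macromodel(a, b):
    """Delay = f(inputs); here f is simply a lookup in the characterization table."""
    return table[(a, b)]

# How many input vectors produce each delay value.
histogram = Counter(table.values())
print(sorted(histogram.items()))
```

A lookup table is the simplest possible f. A fitted closed-form function would be more compact for wide operators, but the slide does not say which form the authors use.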

  9. Delay of an RTL statement
     • Delay is measured from the rising edge of the clock
     • 2 types of assignment statements
     • Can use more sophisticated functions than max( )

  10. Source Code Static Analysis
      Primary inputs assumed to be independent
      Propagate probabilities from primary inputs

  11. Running Example: RTL code
        input A, B, C;
        reg D, E, F, G;   // declared reg: assigned inside a clocked always block
        output reg X;
        always @(posedge clk) begin
          D <= A;
          E <= B + C;
          F <= D & E;
          G <= B & C;
          X <= F + G;
        end
      We want to find the critical probability at X

  12. Static analysis
      D <= A; E <= B + C; G <= B & C;
      • Time annotation
          D(t) = A(t-1)
          E(t) = B(t-1) + C(t-1)
          G(t) = B(t-1) & C(t-1)
      • D and E are independent
      • E and G are correlated: shared variables (B, C) with the same time annotation
      [Figure: dependency graph with A, B, C at t-1 and D, E, G at t]

      F <= D & E;
      • F can be expressed in terms of D and E
      • F and G are independent: B, C are shared, but with different time annotations
      [Figure: dependency graph with A, B, C at t-2; D, E, B, C at t-1; F, G at t]
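
The independence test on slide 12 can be sketched as a comparison of time-annotated support sets. This is a minimal Python sketch, assuming every assignment in the running example is nonblocking and therefore adds one cycle of delay; the `RTL` dictionary and the helper names are hypothetical.

```python
# Running example: each registered signal maps to the signals it reads.
RTL = {
    "D": ["A"],
    "E": ["B", "C"],
    "F": ["D", "E"],
    "G": ["B", "C"],
    "X": ["F", "G"],
}

def support(signal, t=0):
    """Return the (primary input, time offset) pairs that signal(t) depends on."""
    if signal not in RTL:                  # primary input: depends only on itself
        return {(signal, t)}
    deps = set()
    for operand in RTL[signal]:
        deps |= support(operand, t - 1)    # nonblocking assignment: one-cycle delay
    return deps

def independent(sig1, sig2):
    """Independent if no primary input is shared with the same time annotation."""
    return support(sig1).isdisjoint(support(sig2))

print(independent("D", "E"))   # D reads A, E reads B and C
print(independent("E", "G"))   # both read B(t-1) and C(t-1)
print(independent("F", "G"))   # F reads inputs at t-2, G at t-1
```

The recursion only terminates for feed-forward code such as the running example; a signal that feeds back into itself would need a bounded unrolling depth.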

  13. Optimization
      • Step back only till all reached signals are independent
      • Compact representation of X in terms of F, G
      • Avoid stepping back till primary inputs A, B and C
      • F and G are independent
      [Figure: time-annotated dependency graph with A, B, C at t-3; D, E, B, C at t-2; F, G at t-1; X at t]
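
The stepping-back optimization on slide 13 can be sketched as a loop that expands the operands of the target signal one cycle at a time and stops as soon as they are pairwise independent. This is a minimal Python sketch under the same one-cycle-per-assignment assumption as the running example. The signal `Y` is a hypothetical addition that shows a case where stepping back is required, and the helper names are not taken from SHARPE.

```python
from itertools import combinations

RTL = {
    "D": ["A"],
    "E": ["B", "C"],
    "F": ["D", "E"],
    "G": ["B", "C"],
    "X": ["F", "G"],
    "Y": ["E", "G"],   # hypothetical extra statement Y <= E & G, not in the slides
}

def support(signal, t):
    """Return the (primary input, time offset) pairs that signal(t) depends on."""
    if signal not in RTL:
        return {(signal, t)}
    deps = set()
    for operand in RTL[signal]:
        deps |= support(operand, t - 1)
    return deps

def step_back(target):
    """Return the shallowest set of (signal, offset) pairs that are pairwise independent."""
    frontier = {(op, -1) for op in RTL[target]}
    while True:
        clash = any(not support(s1, t1).isdisjoint(support(s2, t2))
                    for (s1, t1), (s2, t2) in combinations(frontier, 2))
        if not clash:
            return frontier
        # Expand every non-primary signal by one cycle and try again.
        expanded = set()
        for s, t in frontier:
            if s in RTL:
                expanded |= {(op, t - 1) for op in RTL[s]}
            else:
                expanded.add((s, t))
        frontier = expanded

print(sorted(step_back("X")))   # stops at F and G, one cycle back
print(sorted(step_back("Y")))   # E and G clash, so it steps back to B and C
```

For X the loop stops immediately at F(t-1) and G(t-1), which matches the compact representation on the slide. The sketch expands all non-primary signals together rather than only the clashing ones, which is simpler but may step back further than strictly necessary.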

  14. Generating RTL-DTMCs
      Assume knowledge of probability distributions of primary inputs

  15. RTL-DTMCs
      • RTL represented as Discrete Time Markov Chains (DTMCs)
        • Called RTL-DTMCs
      • State variables of RTL-DTMCs
        • Independent signals obtained from static analysis
        • Transition probability = Product of state variable probabilities
      • In each state of RTL-DTMC, X = F + G
        • Relation obtained from static analysis
        • Probabilities for F and G need to be computed first
      • Smaller RTL-DTMC than by using A, B, C as state variables
      [Figure: RTL-DTMC for signal X]
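
The product rule on slide 15 can be illustrated with a small Python sketch. It assumes F and G are 1-bit signals with hypothetical distributions (in the toolflow these would be computed first, as the slide notes), builds one state per (F, G) pair, and derives the distribution of X = F + G.

```python
from fractions import Fraction
from itertools import product

# Hypothetical distributions of the independent state variables F and G.
p_F = {0: Fraction(3, 4), 1: Fraction(1, 4)}
p_G = {0: Fraction(1, 2), 1: Fraction(1, 2)}

# Probability of entering state (f, g) = product of the state variable probabilities.
states = {}
for (f, pf), (g, pg) in product(p_F.items(), p_G.items()):
    states[(f, g)] = pf * pg

# In each state X = F + G, so p(X = x) sums the states that yield x.
p_X = {}
for (f, g), p in states.items():
    x = f + g
    p_X[x] = p_X.get(x, 0) + p

for x in sorted(p_X):
    print(x, p_X[x])
```

With F and G as state variables there are four states; using three 1-bit primary inputs A, B, C instead would need eight, which is the size reduction the slide refers to.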

  16. Generating RTL-Delay-DTMCs
      RTL-Delay-DTMCs model statistical delay distribution

  17. Reward model
      • RTL-Delay-DTMCs
        • Timing expressed as a function of state variables
        • Each state represents timing for one input vector
      • Tag states where timing < constraint T
      • Expected value of reward
        • Equivalent to critical probability
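
The equivalence stated on slide 17 can be written out explicitly. The symbols are assumptions introduced here for illustration: \pi(s) is the probability of being in state s, d(s) is the timing associated with state s, and \rho(s) is the reward that tags the states meeting the constraint T.

```latex
\rho(s) =
\begin{cases}
  1 & \text{if } d(s) < T \\
  0 & \text{otherwise}
\end{cases}
\qquad
\mathbb{E}[\rho] \;=\; \sum_{s} \pi(s)\,\rho(s)
\;=\; \sum_{s \,:\, d(s) < T} \pi(s)
\;=\; \Pr[\, d < T \,]
```

Because the reward is an indicator of the tagged states, its expected value is exactly the probability mass of those states, which is the critical probability.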

  18. PRISM Probabilistic Model Checking
      Critical probability is the required timing invariant

  19. Computing critical probability
      • Compute expected value of reward at time N
      • Simulation-based methods
        • Time-consuming and incomplete
      • Probabilistic model checking
        • Has never been applied to hardware designs
        • Explores all possible paths of length N
      • For RTL designs
        • Computations converge for a small value of N
        • N = 3 for combinational and N = 10 for sequential variables
      • PRISM, a symbolic probabilistic model checker, is used
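
The quantity on slide 19, the expected reward at time N, can be sketched for a tiny chain in plain Python. This is not PRISM and not the authors' model: the two-state transition matrix, the initial distribution and the reward vector are hypothetical, and the explicit matrix iteration stands in for what a symbolic model checker computes.

```python
from fractions import Fraction as Fr

# Hypothetical 2-state DTMC: P[i][j] is the probability of moving from state i to j.
P = [[Fr(1, 2), Fr(1, 2)],
     [Fr(1, 4), Fr(3, 4)]]
reward = [1, 0]              # state 0 is tagged (timing < T), state 1 is not
initial = [Fr(1), Fr(0)]     # start in state 0

def step(dist, P):
    """One transition: new_dist[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def expected_reward_at(N, dist, P, reward):
    """Expected reward after exactly N transitions, covering all paths of length N."""
    for _ in range(N):
        dist = step(dist, P)
    return sum(p * r for p, r in zip(dist, reward))

for N in (0, 1, 3, 10):
    print(N, float(expected_reward_at(N, initial, P, reward)))
```

In this toy chain the expected reward approaches its limit of 1/3 geometrically, which illustrates why a small N can be sufficient.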

  20. Experiments
      • OR1200 processor
        • We consider timing analysis only for the datapath
        • MSB of 6-bit ALU output is considered
        • We assign distributions to the primary inputs
      Computed signal probabilities are verified to be accurate

  21. Macromodels
      • Delay macromodels are obtained
        • NANGATE standard cell library is used
      • Adder
        • Ripple carry implementation
        • Delay prediction error < 1%
      • Macromodels obtained also for
        • Bitwise AND, OR, XOR
        • MUX blocks

  22. SHARPE
      • RTL_1: Synchronized inputs to ALU
      • SHARPE is compared against gate-level simulations
      • Timing constraints expressed as percentage of Worst-Case Timing (WCT)
      Maximum deviation of 6.45% among given constraints

  23. SHARPE
      • RTL_2: Inputs to ALU have skewed arrival times
      • Delay model for blocking statements is used
      Maximum deviation of 6.07% among given constraints

  24. Revising the design choice

  25. Revision of RTL design
      • Compare different gate-level implementations
        • Ripple-carry adder: Critical probability of 99.92% (90% WCT)
        • Carry lookahead adder: Only 93.2% (90% WCT)
      [Figure: RTL_2 → SHARPE: Critical probability = 0.9531 @ 70% WCT, does not meet timing specification. Introduce pipelining (same as RTL_1!) → SHARPE: Critical probability = 0.9772 @ 70% WCT, meets timing specification.]

  26. Future work
      • Improve scalability of SHARPE
        • Incorporate automated decomposition methods
      • Include other sources of randomness
        • Process variations
        • Soft errors

  27. SHARPE: Backup Slides

  28. Selecting a macromodel
      X <= F + G
      • Operator = ‘+’, Implementation = ‘Ripple-carry’
      • Delay Macromodel Library
        • Ripple-carry ‘+’: Delay = f1(adder inputs)
        • Carry lookahead ‘+’: Delay = f2(adder inputs)
      • Select delay macromodel: Delay at X = f1(F, G)

  29. Next step
      • We have shifted delay entirely to RTL
        • Delay modeled as a function of RTL signals
      • RTL signal probabilities need to be computed
        • We assume probability distributions of primary inputs
        • Probabilities are propagated to other signals
      • Next step:
        • Source code static analysis

  30. Stochastic independence
      • Notation
        • Signal(t): Signal at time instant t
      • Assumptions regarding independence
        • Primary inputs are stochastically independent
        • Probability distributions are independent across time instants
        • Signal(t1) and Signal(t2) are independent if t1 ≠ t2
      • Two signals are independent if
        • They have no shared signals that have the same time annotation
      • Independent signals are identified during static analysis

  31. Reward model
      • Reward
        • Cost associated with being in a state
      • Tag states of interest
        • States where X = i
        • i is one of the possible values that X can take
      • Expected value of reward
        • Equivalent to probability that X = i, i.e., p(X = i)
      • Repeat for all values of i
