SHARPE Toolflow Jayanand Asok Kumar
Statistical timing • Signal arrival time is statistical • Depends on applied input vector • Affected by process variations • Static Timing Analysis (STA) • Design for worst-case timing • Conservative clock rates • Statistical STA (SSTA) • Allow some input patterns to violate the timing constraint • Faster clock for the design • Tradeoff: computational errors vs. speed • A “better-than-worst-case” design methodology is advocated
SSTA approach • Conventional analysis • SSTA at gate level • Iterative process to meet timing requirements • Shift analysis to RT level • Better design choices early in the design cycle • Metric of interest • For a given timing specification T: “What is the probability that the delay at a module output is less than T?” • Defined as the critical probability • A high-level SSTA approach is proposed • Figure: RTL design → Synthesis → Gate-level netlist
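The metric of interest can be written compactly. In the sketch below, d(v) (the delay at the module output for input vector v) and P_crit are notation introduced here for illustration; they do not appear in the slides. It assumes the delay is fully determined by the applied input vector.

```latex
P_{\mathrm{crit}}(T) \;=\; \Pr\big[\, d < T \,\big] \;=\; \sum_{v} \Pr[v] \cdot \mathbf{1}\big[\, d(v) < T \,\big]
```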
What is SHARPE? • SHARPE stands for Statistical High-level Analysis and Rigorous Performance Estimation • SHARPE is our CAD toolflow for formal SSTA at RTL • We consider delay variations due to input patterns
SHARPE toolflow • Input: RTL modules/blocks • Obtain RTL signal probabilities • Obtain RTL delay models from gate level • Obtain RTL delay distribution • Compute critical probability
Inner details of SHARPE • Obtain RTL signal probabilities • Obtain RTL delay models from gate level • Obtain RTL delay distribution • Compute critical probability
Delay macromodeling Express delay as a function of RTL signals
RTL delay modeling • Delay is a gate-level artifact • Need to shift the delay models to RT level • RTL assignment statements are considered • Delay macromodel • Delay of RTL operator = function of operands (i.e., inputs) • A library of macromodels is constructed • One-time characterization effort • Characterization flow: RTL operator → Simulate gate-level implementation → Tabulate delay vs. input vector → Find f: Delay = f(inputs) → Library • Repeated for multiple implementations and for other RTL operators
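The characterization flow above can be sketched as follows. This is only an illustration: `gate_level_delay` is a hypothetical unit-delay timing model of a 4-bit ripple-carry adder standing in for a real gate-level simulation, and the names `characterize` and `f` are assumptions introduced here, not part of SHARPE.

```python
from itertools import product

WIDTH = 4  # assumption: small enough to enumerate every input vector


def gate_level_delay(a, b, width=WIDTH):
    """Toy unit-delay settle time of a ripple-carry adder (not a real simulator)."""
    carry_ready = 0  # carry-in is constant and available at time 0
    latest_sum = 0
    for i in range(width):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        # Sum bit: XOR of (a_i ^ b_i), ready at time 1, with the incoming carry.
        sum_ready = max(1, carry_ready) + 1
        if ai ^ bi:
            # Propagate: the carry-out has to wait for the carry-in.
            carry_ready = max(1, carry_ready) + 2
        else:
            # Generate or kill: the carry-out depends on a_i and b_i only.
            carry_ready = 2
        latest_sum = max(latest_sum, sum_ready)
    return max(latest_sum, carry_ready)


def characterize(width=WIDTH):
    """Tabulate delay vs. input vector, then compress the table into a macromodel."""
    table = {(a, b): gate_level_delay(a, b, width)
             for a, b in product(range(1 << width), repeat=2)}
    # In this toy model the delay depends only on the propagate pattern a ^ b,
    # so the macromodel stores one delay per pattern instead of per vector.
    macro = {}
    for (a, b), delay in table.items():
        macro.setdefault(a ^ b, delay)
    return table, macro


def f(macro, a, b):
    """Macromodel: Delay = f(inputs)."""
    return macro[a ^ b]


table, macro = characterize()
max_error = max(abs(f(macro, a, b) - d) for (a, b), d in table.items())
print(len(table), len(macro), max_error)
```

For this toy model the macromodel reproduces the table exactly; a macromodel fitted to real gate-level data would generally carry a small prediction error.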
Delay of an RTL statement • Delay is measured from rising edge of clock • 2 types of assignment statements • Can use more sophisticated functions than max( )
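The slide does not spell out the two delay formulas, so the following is a minimal sketch under stated assumptions: for a non-blocking assignment the operands are taken to be stable at the rising clock edge, and for a blocking assignment the operator is taken to start once its latest operand arrives, which is where max( ) enters. The function names and numeric values are hypothetical.

```python
def nonblocking_delay(op_delay):
    """Assumed model: operands are registered and stable at the clock edge (time 0)."""
    return op_delay


def blocking_delay(op_delay, operand_arrivals):
    """Assumed model: wait for the latest operand, then add the operator delay."""
    return max(operand_arrivals) + op_delay


# Hypothetical numbers: operator delay of 5.0, operands arriving at 0.0 and 3.0.
print(nonblocking_delay(5.0))
print(blocking_delay(5.0, [0.0, 3.0]))
```

A more sophisticated combining function could replace `max` in `blocking_delay` without changing the rest of the flow.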
Source Code Static Analysis Primary inputs assumed to be independent Propagate probabilities from primary inputs
Running Example: RTL code
input A,B,C;
wire D,E,F,G;
output X;
always @(posedge clk)
begin
  D <= A;
  E <= B + C;
  F <= D & E;
  G <= B & C;
  X <= F + G;
end
We want to find the critical probability at X
Static analysis • D <= A; E <= B + C; G <= B & C; • Time annotation: D(t) = A(t-1), E(t) = B(t-1) + C(t-1), G(t) = B(t-1) & C(t-1) • D and E are independent • E and G are correlated: shared variables (B, C) with the same time annotation • F <= D & E; • F can be expressed in terms of D and E at t-1, which depend on A, B, C at t-2 • B, C are shared between F and G, but with different time annotations (t-2 for F, t-1 for G) • F and G are independent
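The independence check on the running example can be sketched as below. The dictionary encoding of the RTL and the helper names `support` and `independent` are assumptions made for illustration; the rule itself (no shared signal with the same time annotation) is the one stated in the slides.

```python
# Right-hand-side operands of each non-blocking assignment in the running example.
RTL = {
    "D": ["A"],
    "E": ["B", "C"],
    "F": ["D", "E"],
    "G": ["B", "C"],
    "X": ["F", "G"],
}
PRIMARY_INPUTS = {"A", "B", "C"}


def support(signal, t=0):
    """Return the set of (primary input, time offset) pairs that signal(t) depends on."""
    if signal in PRIMARY_INPUTS:
        return {(signal, t)}
    deps = set()
    for operand in RTL[signal]:
        # Non-blocking assignment: the right-hand side is read one cycle earlier.
        deps |= support(operand, t - 1)
    return deps


def independent(s1, s2):
    """Two signals are independent if they share no input with the same time annotation."""
    return support(s1).isdisjoint(support(s2))


print(independent("D", "E"))  # D depends on A(t-1); E on B(t-1), C(t-1)
print(independent("E", "G"))  # both depend on B(t-1), C(t-1)
print(independent("F", "G"))  # F depends on inputs at t-2; G on inputs at t-1
```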
Optimization • Step back only until all reached signals are independent • Compact representation of X in terms of F, G • Avoid stepping back to the primary inputs A, B and C • Time annotation: X at t; F, G at t-1; D, E, B, C at t-2; A, B, C at t-3 • F and G are independent
Generating RTL-DTMCs Assume knowledge of probability distributions of primary inputs
RTL-DTMCs • RTL represented as Discrete Time Markov Chains (DTMCs) • Called RTL-DTMCs • State variables of RTL-DTMCs • Independent signals obtained from static analysis • Transition probability = Product of state variable probabilities • In each state of RTL-DTMC, X = F + G • Relation obtained from static analysis • Probabilities for F and G need to be computed first • Smaller RTL-DTMC than by using A, B, C as state variables • Figure: RTL-DTMC for signal X
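A minimal sketch of how the probabilities of X follow from the independent state variables F and G. The distributions `p_F` and `p_G` are hypothetical values chosen for illustration; in the toolflow they would be computed first, as noted above.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Hypothetical distributions of the independent state variables F and G.
p_F = {0: Fraction(1, 2), 1: Fraction(1, 2)}
p_G = {0: Fraction(3, 4), 1: Fraction(1, 4)}


def distribution_of_x(p_f, p_g):
    """Enumerate states (f, g); each has probability p(f) * p(g) and value X = f + g."""
    p_x = defaultdict(Fraction)
    for (f, pf), (g, pg) in product(p_f.items(), p_g.items()):
        p_x[f + g] += pf * pg
    return dict(p_x)


p_X = distribution_of_x(p_F, p_G)
print(p_X)
```

Only four states are enumerated here, whereas using A, B and C as state variables would require one state per combination of primary-input values.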
Generating RTL-Delay-DTMCs RTL-Delay-DTMCs model statistical delay distribution
Reward model • RTL-Delay-DTMCs • Timing expressed as a function of state variables • Each state represents timing for one input vector • Tag states where timing < constraint T • Expected value of reward • Equivalent to critical probability
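The reward construction can be sketched as follows. The 2-bit uniform state variables and the function `toy_delay` are hypothetical stand-ins for the real signal probabilities and delay macromodel; only the tagging and the expected-reward computation mirror the slide.

```python
from fractions import Fraction
from itertools import product

VALUES = range(4)           # assumption: F and G are 2-bit signals
P_VALUE = Fraction(1, 4)    # assumption: each value is equally likely


def toy_delay(f, g):
    """Hypothetical delay macromodel: grows with the number of differing bits."""
    return 3 + 2 * bin(f ^ g).count("1")


def critical_probability(T):
    """Expected reward when states with delay < T carry reward 1 and all others 0."""
    expected_reward = Fraction(0)
    for f, g in product(VALUES, repeat=2):
        state_prob = P_VALUE * P_VALUE          # F and G are independent
        reward = 1 if toy_delay(f, g) < T else 0
        expected_reward += state_prob * reward
    return expected_reward


print(critical_probability(6))
```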
PRISM Probabilistic Model Checking Critical probability is the required timing invariant
Computing critical probability • Compute expected value of reward at time N • Simulation-based methods • Time-consuming and incomplete • Probabilistic model checking • Has never been applied to hardware designs • Explores all possible paths of length N • For RTL designs • Computations converge for a small value of N • N = 3 for combinational and N = 10 for sequential variables • PRISM, a symbolic probabilistic model checker, is used
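To make "expected value of reward at time N" concrete, the sketch below propagates a state distribution through a small DTMC for N steps and then weights it by the reward. The two-state chain, its transition matrix and the reward vector are hypothetical, and the explicit matrix-vector product is only an illustration; it does not reproduce how PRISM performs the computation symbolically.

```python
from fractions import Fraction

# Hypothetical 2-state DTMC: P[i][j] is the probability of moving from state i to j.
P = [
    [Fraction(1, 2), Fraction(1, 2)],
    [Fraction(1, 4), Fraction(3, 4)],
]
INITIAL = [Fraction(1), Fraction(0)]   # start in state 0
REWARD = [1, 0]                        # state 0 is tagged (timing < T)


def step(dist, matrix):
    """One DTMC step: new_dist[j] = sum_i dist[i] * matrix[i][j]."""
    n = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def expected_reward_at(n_steps):
    """Expected reward after n_steps transitions from the initial distribution."""
    dist = list(INITIAL)
    for _ in range(n_steps):
        dist = step(dist, P)
    return sum(p * r for p, r in zip(dist, REWARD))


for n in (0, 1, 2, 3, 10):
    print(n, float(expected_reward_at(n)))
```

In this toy chain the value settles quickly toward 1/3, which illustrates why a small N can be sufficient once the computation has converged.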
Experiments • OR1200 processor • We consider timing analysis only for the datapath • MSB of 6-bit ALU output is considered • We assign probability distributions to the primary inputs • Computed signal probabilities are verified to be accurate
Macromodels • Delay macromodels are obtained • NANGATE standard cell library is used • Adder • Ripple carry implementation • Delay prediction error < 1% • Macromodels obtained also for • Bitwise AND, OR, XOR • MUX blocks
SHARPE • RTL_1: Synchronized inputs to ALU • SHARPE is compared against gate-level simulations • Timing constraints expressed as a percentage of Worst-Case Timing (WCT) • Maximum deviation of 6.45% among the given constraints
SHARPE • RTL_2: Inputs to ALU have skewed arrival times • Delay model for blocking statements is used • Maximum deviation of 6.07% among the given constraints
Revision of RTL design • Compare different gate-level implementations • Ripple-carry adder: Critical probability of 99.92% (90% WCT) • Carry lookahead adder: Only 93.2% (90% WCT) • RTL_2 → SHARPE: Critical probability = 0.9531 @ 70% WCT, does not meet timing specification • Introduce pipelining (same as RTL_1) → SHARPE: Critical probability = 0.9772 @ 70% WCT, meets timing specification
Future work • Improve scalability of SHARPE • Incorporate automated decomposition methods • Include other sources of randomness • Process variations • Soft errors
SHARPE: Backup Slides
Selecting a macromodel • X <= F + G • Operator = ‘+’ • Implementation = ‘Ripple-carry’ • Delay macromodel library • Ripple-carry ‘+’: Delay = f1(adder inputs) • Carry lookahead ‘+’: Delay = f2(adder inputs) • Select delay macromodel • Delay at X = f1(F, G)
Next step • We have shifted delay entirely to RTL • Delay modeled as a function of RTL signals • RTL signal probabilities need to be computed • We assume probability distributions of primary inputs • Probabilities are propagated to other signals • Next step: • Source code static analysis
Stochastic independence • Notation • Signal(t): Signal at time instant t • Assumptions regarding independence • Primary inputs are stochastically independent • Probability distributions are independent across time instants • Signal(t1) and Signal(t2) are independent if t1 ≠ t2 • Two signals are independent if • They have no shared signals that have the same time annotation • Independent signals are identified during static analysis
Reward model • Reward • Cost associated with being in a state • Tag states of interest • States where X = i • i is one of the possible values that X can take • Expected value of reward • Equivalent to probability that X=i , i.e., p(X=i) • Repeat for all values of i