Power Estimation

Dr. Elwin Chandra Monie

Abstraction, Complexity, Accuracy

Spice Simulation
• Circuit/device-level analysis.
• Circuit modeled as a network of transistors, capacitors, resistors and voltage/current sources.
• Node current equations are written using Kirchhoff’s current law.
• Average and instantaneous power are computed from supply voltage and device current.
• Analysis is accurate but expensive.
• Used to characterize parts of a larger circuit.

Gate-Level Power Analysis
• Pre-simulation analysis:
  • Partition circuit into channel-connected gate components.
  • Determine node capacitances from layout analysis (accurate) or from a wire-load model* (approximate).
  • Determine dynamic and static power from Spice for each gate.
  • Determine gate delays using Spice or the Elmore delay model.
* A wire-load model estimates the capacitance of a net from its pin count. See Yeap, p. 39.

Gate-Level Power Analysis (Cont.)
• Run discrete-event (event-driven) logic simulation with a set of input vectors.
• Monitor the toggle count of each net and obtain capacitive power dissipation:
  $P_{cap} = \sum_{\text{all nodes } k} C_k V^2 f$
• Where:
  • $C_k$ is the total node capacitance being switched, as determined by the simulator.
  • $V$ is the supply voltage.
  • $f$ is the clock frequency, i.e., the number of vectors applied per unit time.

Gate-Level Power Analysis (Cont.)
• Monitor dynamic energy events at the input of each gate and obtain internal switching power dissipation:
  $P_{int} = \sum_{\text{gates } g} \sum_{\text{events } e} E(g,e)\, F(g,e)$
• Where:
  • $E(g,e)$ = energy of event e of gate g, pre-computed from Spice.
  • $F(g,e)$ = occurrence frequency of event e at gate g, observed by logic simulation.

Gate-Level Power Analysis (Cont.)
• Monitor the static power dissipation state of each gate and obtain the static power dissipation:
  $P_{stat} = \sum_{\text{gates } g} \sum_{\text{states } s} P(g,s)\, T(g,s) / T$
• Where:
  • $P(g,s)$ = static power dissipation of gate g for state s, obtained from Spice.
  • $T(g,s)$ = duration of state s at gate g, obtained from logic simulation.
  • $T$ = vector period.

Gate-Level Power Analysis
• Sum up all three components of power:
  $P = P_{cap} + P_{int} + P_{stat}$
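The three sums above can be sketched in a few lines of Python. This is only an illustration: the dictionary layout, the function names and every numeric value are hypothetical, standing in for data that a logic simulator and Spice characterization would supply.

```python
# Minimal sketch of P = Pcap + Pint + Pstat. All names and numbers are
# hypothetical placeholders for simulator and Spice characterization data.

def capacitive_power(switched_cap, vdd, f):
    """Pcap = sum over nodes k of C_k * V^2 * f (C_k: switched capacitance in farads)."""
    return sum(c * vdd ** 2 * f for c in switched_cap.values())

def internal_power(event_energy, event_freq):
    """Pint = sum over (gate, event) of E(g,e) * F(g,e)."""
    return sum(event_energy[key] * event_freq[key] for key in event_energy)

def static_power(state_power, state_time, period):
    """Pstat = sum over (gate, state) of P(g,s) * T(g,s) / T."""
    return sum(state_power[key] * state_time[key] for key in state_power) / period

# Hypothetical example data for a one-gate, two-node circuit.
switched_cap = {"n1": 10e-15, "n2": 20e-15}                  # farads
event_energy = {("g1", "rise"): 5e-15, ("g1", "fall"): 3e-15}  # joules per event
event_freq = {("g1", "rise"): 2e7, ("g1", "fall"): 2e7}       # events per second
state_power = {("g1", "out0"): 1e-9, ("g1", "out1"): 3e-9}    # watts
state_time = {("g1", "out0"): 6e-9, ("g1", "out1"): 4e-9}     # seconds
period = 10e-9                                                 # seconds

p_cap = capacitive_power(switched_cap, vdd=1.0, f=100e6)
p_int = internal_power(event_energy, event_freq)
p_stat = static_power(state_power, state_time, period)
p_total = p_cap + p_int + p_stat
print(f"Pcap={p_cap:.3e} W, Pint={p_int:.3e} W, Pstat={p_stat:.3e} W, P={p_total:.3e} W")
```

Keeping the three components as separate functions mirrors the slides, where each term is driven by a different observation from the logic simulation (toggle counts, input events, and state durations).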

Switching Frequency
Number of transitions per unit time, where $N(t)$ is the number of transitions observed in an interval of length $t$:
  $T = N(t) / t$
For a continuous signal:
  $T = \lim_{t \to \infty} N(t) / t$
$T$ is defined as the transition density.

Static Signal Probabilities
• Observe signal for interval $t_0 + t_1$.
• Signal is 1 for duration $t_1$.
• Signal is 0 for duration $t_0$.
• Signal probabilities:
  • $p_1 = t_1 / (t_0 + t_1)$
  • $p_0 = t_0 / (t_0 + t_1) = 1 - p_1$

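Static Transition Probabilities
• Transition probabilities, assuming the next value is independent of the present value:
  • $T_{01} = p_0 \cdot \text{Prob}\{\text{signal is 1} \mid \text{signal was 0}\} = p_0 p_1$
  • $T_{10} = p_1 \cdot \text{Prob}\{\text{signal is 0} \mid \text{signal was 1}\} = p_1 p_0$
  • $T = T_{01} + T_{10} = 2 p_0 p_1 = 2 p_1 (1 - p_1)$
• Transition density: $T = 2 p_1 (1 - p_1)$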

Static Transition Frequency
[Figure: plot of $f = p_1 (1 - p_1)$ versus $p_1$ for $p_1$ from 0 to 1; the vertical axis runs from 0.0 to 0.25.]

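Inaccuracy in Transition Density
[Figure: three waveforms drawn against the clock period 1/fck. Each has p1 = 0.5, with T = 1.0, T = 4/6 and T = 1/6 respectively.]
Observe that the formula $T = 2 p_1 (1 - p_1)$ is not correct: it gives T = 0.5 for every signal with p1 = 0.5, whatever its actual switching rate.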

Cause for Error and Correction
• Probability of transition is not independent of the present state of the signal.
• Determine probability $p_{01}$ of a 0→1 transition.
• Recognize $p_{01} \neq p_0 \times p_1$.
• We obtain
  $p_1 = (1 - p_1)\, p_{01} + p_1 p_{11}$
  $p_1 = \dfrac{p_{01}}{1 - p_{11} + p_{01}}$

Correction (Cont.)
• $p_{11} + p_{10} = 1$, i.e., given that the signal was previously 1, its present value can be either 1 or 0.
• Therefore,
  $p_1 = \dfrac{p_{01}}{p_{10} + p_{01}}$
This uniquely gives the signal probability as a function of the transition probabilities.

Transition and Signal Probabilities
[Figure: three waveforms drawn against the clock period 1/fck, with p01 = p10 = 0.5, p01 = p10 = 1/3 and p01 = p10 = 1/6 respectively; p1 = 0.5 in each case.]

Probabilities: p0, p1, p00, p01, p10, p11
• $p_{01} + p_{00} = 1$
• $p_{11} + p_{10} = 1$
• $p_0 = 1 - p_1$
• $p_1 = \dfrac{p_{01}}{p_{10} + p_{01}}$

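Transition Density
• $T = p_0 p_{01} + p_1 p_{10} = \dfrac{2 p_{10} p_{01}}{p_{10} + p_{01}} = 2 p_1 p_{10} = 2 p_0 p_{01}$
• This reduces to $T = 2 p_1 (1 - p_1)$ only when $p_{01} = p_1$ and $p_{10} = p_0$, i.e., when transitions are independent of the present state.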
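As a quick numeric check of these relations, the sketch below computes the signal probability and the transition density from a pair of conditional transition probabilities and compares the result with the static formula. The function names and the values p01 = 0.2, p10 = 0.6 are chosen here purely for illustration.

```python
# Sketch: signal probability and transition density (per clock period) from
# the conditional transition probabilities p01 and p10. The example values
# are hypothetical.

def signal_probability(p01: float, p10: float) -> float:
    """p1 = p01 / (p10 + p01)."""
    return p01 / (p10 + p01)

def transition_density(p01: float, p10: float) -> float:
    """T = p0*p01 + p1*p10, with p0 = 1 - p1."""
    p1 = signal_probability(p01, p10)
    return (1.0 - p1) * p01 + p1 * p10

def static_density(p1: float) -> float:
    """T = 2*p1*(1 - p1); valid only if transitions ignore the present state."""
    return 2.0 * p1 * (1.0 - p1)

p01, p10 = 0.2, 0.6
p1 = signal_probability(p01, p10)
t_exact = transition_density(p01, p10)
t_static = static_density(p1)
t_alt = 2.0 * p10 * p01 / (p10 + p01)   # closed form from the slide

print(f"p1 = {p1:.4f}, T = {t_exact:.4f}, closed form = {t_alt:.4f}, static = {t_static:.4f}")
```

For this input the exact density is about 0.3 while the static formula gives about 0.375, which illustrates why the present-state dependence matters.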

Power Calculation
• Power can be estimated if transition density is known for all signals.
• Calculation of transition density requires:
  • Signal probabilities.
  • Transition densities for primary inputs, computed from vector statistics.

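Signal Probabilities
For gates with independent inputs of signal probabilities x1 and x2:
• AND gate: output probability = $x_1 x_2$
• OR gate: output probability = $x_1 + x_2 - x_1 x_2$
• NOT gate: output probability = $1 - x_1$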

Signal Probabilities
[Figure: two-level circuit with inputs x1 = x2 = x3 = 0.5; the first gate output has probability x1x2 = 0.25 and the circuit output Y has probability 0.625.]
$y = 1 - (1 - x_1 x_2)\, x_3 = 1 - x_3 + x_1 x_2 x_3 = 0.625$

X1 X2 X3 | Y
 0  0  0 | 1
 0  0  1 | 0
 0  1  0 | 1
 0  1  1 | 0
 1  0  0 | 1
 1  0  1 | 0
 1  1  0 | 1
 1  1  1 | 1

Ref: K. P. Parker and E. J. McCluskey, “Probabilistic Treatment of General Combinational Networks,” IEEE Trans. on Computers, vol. C-24, no. 6, pp. 668-670, June 1975.
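The 0.625 value can be cross-checked in two ways: by propagating probabilities gate by gate, and by summing the probabilities of the truth-table rows where Y = 1. The sketch below assumes the circuit realizes Y = X1 X2 + X3' (which matches the truth table above); the helper names are illustrative.

```python
from itertools import product

# Gate-level probability rules for independent inputs.
def p_and(a, b):
    return a * b

def p_or(a, b):
    return a + b - a * b

def p_not(a):
    return 1.0 - a

def exact_probability(func, probs):
    """Sum the probability of every input combination for which func is true."""
    total = 0.0
    for bits in product((0, 1), repeat=len(probs)):
        if func(*bits):
            weight = 1.0
            for bit, p in zip(bits, probs):
                weight *= p if bit else 1.0 - p
            total += weight
    return total

x1 = x2 = x3 = 0.5

# Gate-by-gate propagation: AND of x1, x2, then OR with the complement of x3.
y_gate = p_or(p_and(x1, x2), p_not(x3))

# Truth-table enumeration of the same function, Y = X1 X2 + X3'.
y_exact = exact_probability(lambda a, b, c: (a and b) or (not c), [x1, x2, x3])

print(y_gate, y_exact)
```

Both routes agree here because the circuit is fanout-free, so every gate sees independent inputs.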

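Correlated Signal Probabilities
[Figure: circuit Y = X1 X2 + X2’ with x1 = x2 = 0.5; the AND output has probability x1x2 = 0.25, and gate-by-gate computation gives 0.625 at the output, which is questionable because x2 fans out and reconverges.]
$y = 1 - (1 - x_1 x_2)\, x_2 = 1 - x_2 + x_1 x_2 x_2 = 1 - x_2 + x_1 x_2 = 0.75$ (correct value)

X1 X2 | Y
 0  0 | 1
 0  1 | 0
 1  0 | 1
 1  1 | 1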

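Correlated Signal Probabilities
[Figure: circuit Y = (X1 + X2) X2 with x1 = x2 = 0.5; the OR output has probability x1 + x2 – x1x2 = 0.75, and gate-by-gate computation gives 0.375 at the output, which is questionable because x2 fans out and reconverges.]
$y = (x_1 + x_2 - x_1 x_2)\, x_2 = x_1 x_2 + x_2 x_2 - x_1 x_2 x_2 = x_1 x_2 + x_2 - x_1 x_2 = x_2 = 0.5$ (correct value)

X1 X2 | Y
 0  0 | 0
 0  1 | 1
 1  0 | 0
 1  1 | 1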
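The gap between the gate-by-gate values and the correct ones can be reproduced with a short sketch. It treats each gate's inputs as independent (the naive computation) and compares against an exact truth-table sum; the function names are illustrative.

```python
from itertools import product

def exact_probability(func, probs):
    """Exact output probability: sum over input rows where func evaluates true."""
    total = 0.0
    for bits in product((0, 1), repeat=len(probs)):
        if func(*bits):
            weight = 1.0
            for bit, p in zip(bits, probs):
                weight *= p if bit else 1.0 - p
            total += weight
    return total

x1 = x2 = 0.5

# Circuit 1: Y = X1 X2 + X2'. The naive rule treats the AND output and X2' as independent.
and_out = x1 * x2
inv_out = 1.0 - x2
naive_1 = and_out + inv_out - and_out * inv_out
exact_1 = exact_probability(lambda a, b: (a and b) or (not b), [x1, x2])

# Circuit 2: Y = (X1 + X2) X2. The naive rule treats the OR output and X2 as independent.
or_out = x1 + x2 - x1 * x2
naive_2 = or_out * x2
exact_2 = exact_probability(lambda a, b: (a or b) and b, [x1, x2])

print(f"circuit 1: naive {naive_1}, exact {exact_1}")
print(f"circuit 2: naive {naive_2}, exact {exact_2}")
```

In both circuits the error comes from multiplying probabilities of signals that share the input X2 and are therefore not independent.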

Observation • Numerical computation of signal probabilities is accurate for fanout-free circuits.

Remedies • Use Shannon’s expansion theorem to compute signal probabilities. • Use Boolean difference formula to compute transition densities.

Shannon’s Expansion Theorem
• C. E. Shannon, “A Symbolic Analysis of Relay and Switching Circuits,” Trans. AIEE, vol. 57, pp. 713-723, 1938.
• Consider:
  • Boolean variables, X1, X2, . . . , Xn
  • Boolean function, F(X1, X2, . . . , Xn)
• Then F = Xi F(Xi=1) + Xi’ F(Xi=0)
• Where:
  • Xi’ is the complement of Xi
  • Cofactors, F(Xi=j) = F(X1, X2, . . , Xi=j, . . , Xn), j = 0 or 1

Expansion About Two Inputs
• F = XiXj F(Xi=1, Xj=1) + XiXj’ F(Xi=1, Xj=0) + Xi’Xj F(Xi=0, Xj=1) + Xi’Xj’ F(Xi=0, Xj=0)
• In general, a Boolean function can be expanded about any number of input variables.
• Expansion about k variables will have $2^k$ terms.

Correlated Signal Probabilities
[Figure: circuit with inputs X1 and X2 realizing Y = X1 X2 + X2’.]

X1 X2 | Y
 0  0 | 1
 0  1 | 0
 1  0 | 1
 1  1 | 1

Shannon expansion about the reconverging input, X2:
Y = X2 Y(X2 = 1) + X2’ Y(X2 = 0) = X2 (X1) + X2’ (1)

Correlated Signals
• When the output function is expanded about all reconverging input variables:
  • All cofactors correspond to fanout-free circuits.
  • Signal probabilities for cofactor outputs can be calculated without error.
  • A weighted sum of cofactor probabilities gives the correct probability of the output.
• For two reconverging inputs:
  f = xi xj f(Xi=1, Xj=1) + xi (1 – xj) f(Xi=1, Xj=0) + (1 – xi) xj f(Xi=0, Xj=1) + (1 – xi)(1 – xj) f(Xi=0, Xj=0)

Correlated Signal Probabilities
[Figure: circuit with inputs X1 and X2 realizing Y = X1 X2 + X2’.]
Shannon expansion about the reconverging input, X2:
Y = X2 Y(X2=1) + X2’ Y(X2=0) = X2 (X1) + X2’ (1)
y = x2 (0.5) + (1 – x2) (1) = 0.5 (0.5) + (1 – 0.5) (1) = 0.75

X1 X2 | Y
 0  0 | 1
 0  1 | 0
 1  0 | 1
 1  1 | 1
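The same expansion can be worked for the second correlated circuit from the earlier slide, Y = (X1 + X2) X2, again with x1 = x2 = 0.5. This is a worked illustration added for comparison, not part of the original slide.

```latex
\begin{align*}
Y &= X_2\, Y(X_2{=}1) + X_2'\, Y(X_2{=}0) \\
  &= X_2\, \bigl[(X_1 + 1)\cdot 1\bigr] + X_2'\, \bigl[(X_1 + 0)\cdot 0\bigr] \\
  &= X_2\,(1) + X_2'\,(0) \\
y &= x_2 \cdot 1 + (1 - x_2) \cdot 0 = 0.5
\end{align*}
```

Both cofactors are constants here, so the output probability equals x2 regardless of x1, in agreement with the truth table for that circuit.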

Example
[Figure: a supergate with one reconverging signal of probability 0.5 and a marked point of reconvergence; node probabilities are annotated for the reconverging signal fixed at 1 and at 0. Gate-by-gate computation gives 0.375 at the output.]
Signal probability for supergate output = 0.5 Prob{rec. signal = 1} + 1.0 Prob{rec. signal = 0} = 0.5 × 0.5 + 1.0 × 0.5 = 0.75
S. C. Seth and V. D. Agrawal, “A New Model for Computation of Probabilistic Testability in Combinational Circuits,” Integration, the VLSI Journal, vol. 7, no. 1, pp. 49-75, April 1989.

Probability Calculation Algorithm
• Partition circuit into supergates.
  • Definition: A supergate is a circuit partition with a single output such that all fanouts that reconverge at the output are contained within the supergate.
• Identify reconverging and non-reconverging inputs of each supergate.
• Compute signal probabilities from PI to PO:
  • For a supergate whose input probabilities are known:
    • Enumerate reconverging input states.
    • For each input state, do gate-by-gate probability computation.
    • Sum up the corresponding signal probabilities, weighted by state probabilities.
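The inner loop of this algorithm (enumerate reconverging input states, compute gate by gate, take the weighted sum) might be sketched as follows. Supergate partitioning itself is not shown; the function `supergate_probability` and the example supergate are hypothetical illustrations, using the circuit Y = X1 X2 + X2' with X2 as the reconverging input.

```python
from itertools import product

def supergate_probability(gate_level_prob, reconv_probs, other_probs):
    """Weighted sum over all states of the reconverging inputs.

    gate_level_prob(reconv_values, other_probs) must return the output
    probability from a gate-by-gate computation in which every reconverging
    input is pinned to the constant 0.0 or 1.0.
    """
    total = 0.0
    for state in product((0, 1), repeat=len(reconv_probs)):
        weight = 1.0
        for bit, p in zip(state, reconv_probs):
            weight *= p if bit else 1.0 - p
        total += weight * gate_level_prob([float(b) for b in state], other_probs)
    return total

def example_supergate(reconv_values, other_probs):
    """Gate-by-gate probabilities for Y = X1 X2 + X2' (X2 is reconverging)."""
    (x2,) = reconv_values
    (x1,) = other_probs
    and_out = x1 * x2                            # AND gate
    inv_out = 1.0 - x2                           # inverter on X2
    return and_out + inv_out - and_out * inv_out # OR gate

y = supergate_probability(example_supergate, reconv_probs=[0.5], other_probs=[0.5])
print(y)
```

Pinning the reconverging inputs to constants removes the correlation between the internal branches, so the ordinary gate rules are exact inside each enumerated state. The cost grows as 2^k for k reconverging inputs.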

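Calculating Transition Density
[Figure: a Boolean function block with inputs 1 to n; input i has signal probability xi and transition density Ti; the output has signal probability y and transition density T(Y) = ?]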

Boolean Difference
$\text{Boolean diff}(Y, X_i) = \dfrac{\partial Y}{\partial X_i} = Y(X_i{=}1) \oplus Y(X_i{=}0)$
• Boolean diff(Y, Xi) = 1 means that a path is sensitized from input Xi to output Y.
• Prob(Boolean diff(Y, Xi) = 1) is the probability of transmitting a toggle from Xi to Y.
• Probability of Boolean difference is determined from the probabilities of cofactors of Y with respect to Xi.
F. F. Sellers, M. Y. Hsiao and L. W. Bearnson, “Analyzing Errors with the Boolean Difference,” IEEE Trans. on Computers, vol. C-17, no. 7, pp. 676-683, July 1968.
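As a worked illustration (added here, assuming independent inputs with signal probabilities x1 and x2), the Boolean difference of two-input AND and OR gates with respect to X1 is:

```latex
\begin{align*}
\text{AND: } Y = X_1 X_2
  &\;\Rightarrow\; \frac{\partial Y}{\partial X_1} = (1 \cdot X_2) \oplus (0 \cdot X_2) = X_2,
  &\text{Prob}\!\left(\frac{\partial Y}{\partial X_1} = 1\right) &= x_2 \\
\text{OR: } Y = X_1 + X_2
  &\;\Rightarrow\; \frac{\partial Y}{\partial X_1} = (1 + X_2) \oplus (0 + X_2) = 1 \oplus X_2 = X_2',
  &\text{Prob}\!\left(\frac{\partial Y}{\partial X_1} = 1\right) &= 1 - x_2
\end{align*}
```

In words: a toggle on X1 reaches the output of an AND gate only when X2 = 1, and the output of an OR gate only when X2 = 0.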

Transition Density
$T(Y) = \sum_{i=1}^{n} T(X_i)\, \text{Prob}(\text{Boolean diff}(Y, X_i) = 1)$
F. Najm, “Transition Density: A New Measure of Activity in Digital Circuits,” IEEE Trans. CAD, vol. 12, pp. 310-323, Feb. 1993.

Power Computation
• For each primary input, determine signal probability and transition density for given vectors.
• For each internal node and primary output Y, find the transition density T(Y), using supergate partitioning and the Boolean difference formula.
• Compute power,
  $P = \sum_{\text{all } Y} 0.5\, C_Y V^2\, T(Y)$
  where $C_Y$ is the capacitance of node Y and $V$ is the supply voltage.

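Transition Density and Power
[Figure: inputs X1 (0.2, 1) and X2 (0.3, 2) drive an AND gate whose output (0.06, 0.7) has capacitance Ci; that output and X3 (0.4, 3) drive an OR gate whose output Y (0.436, 3.24) has capacitance CY. Each pair is (signal probability, transition density).]
Power = $0.5\, V^2 (0.7\, C_i + 3.24\, C_Y)$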
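The numbers in this example can be reproduced with a small sketch of the transition density formula. It computes each Boolean-difference probability by enumerating the other inputs, assuming independent primary inputs, and assumes the first gate is an AND and the second an OR (consistent with the annotated values). Function names and the values of V, Ci and CY are hypothetical.

```python
from itertools import product

def boolean_difference_prob(func, i, probs):
    """Prob{ func(Xi=1) != func(Xi=0) } for independent inputs."""
    n = len(probs)
    others = [k for k in range(n) if k != i]
    total = 0.0
    for bits in product((0, 1), repeat=len(others)):
        assign = dict(zip(others, bits))
        weight = 1.0
        for k, b in assign.items():
            weight *= probs[k] if b else 1.0 - probs[k]
        hi = [assign.get(k, 1) for k in range(n)]   # Xi forced to 1
        lo = [assign.get(k, 0) for k in range(n)]   # Xi forced to 0
        if bool(func(*hi)) != bool(func(*lo)):
            total += weight
    return total

def transition_density(func, probs, densities):
    """T(Y) = sum_i T(Xi) * Prob(Boolean diff(Y, Xi) = 1)."""
    return sum(
        densities[i] * boolean_difference_prob(func, i, probs)
        for i in range(len(probs))
    )

probs = [0.2, 0.3, 0.4]        # signal probabilities of X1, X2, X3
densities = [1.0, 2.0, 3.0]    # transition densities of X1, X2, X3

t_and = transition_density(lambda a, b: a and b, probs[:2], densities[:2])
t_y = transition_density(lambda a, b, c: (a and b) or c, probs, densities)

# Hypothetical electrical values, only to show the final power expression.
V, Ci, CY = 1.0, 10e-15, 20e-15
power = 0.5 * V ** 2 * (t_and * Ci + t_y * CY)
print(t_and, t_y, power)
```

Enumeration is exponential in the number of inputs, so this form suits small cones of logic; it is meant to make the formula concrete rather than to scale.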

Prob. Method vs. Logic Sim. * CONVEX c240

Problem 1
For equiprobable inputs, analyze the 0→1 transition probabilities of all gates in the two implementations of a four-input AND gate shown below. Assuming that the gates have zero delays, which implementation will consume less average dynamic power?
[Figure: tree structure (E = AB, F = CD, G = EF) and chain structure (E = AB, F = EC, G = FD), each with inputs A, B, C, D.]

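Problem 1 Solution
Given the primary input probabilities, P(A) = P(B) = P(C) = P(D) = 0.5, signal and transition (0→1) probabilities are as follows, where the 0→1 probability of a gate output with signal probability p is p(1 – p) for independent consecutive vectors:

Structure  Gate   P(output = 1)  P(0→1)
Chain      E      0.2500         0.1875
Chain      F      0.1250         0.1094
Chain      G      0.0625         0.0586
Chain      Total                 0.3555
Tree       E      0.2500         0.1875
Tree       F      0.2500         0.1875
Tree       G      0.0625         0.0586
Tree       Total                 0.4336

The tree implementation consumes 100 × (0.4336 – 0.3555)/0.3555 = 22% more average dynamic power. This advantage of the chain structure may be somewhat reduced because of glitches caused by unbalanced path delays.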
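The totals 0.3555 and 0.4336 can be checked with a few lines of Python. The sketch assumes zero-delay gates, independent equiprobable inputs, and independent consecutive vectors, so that a node with signal probability p makes a 0→1 transition with probability p(1 - p); the gate connectivity (chain: E = AB, F = EC, G = FD; tree: E = AB, F = CD, G = EF) is taken from the Problem 2 waveforms.

```python
def rise_probability(p1: float) -> float:
    """Probability of a 0->1 transition between two independent vectors."""
    return (1.0 - p1) * p1

pa = pb = pc = pd = 0.5

# Chain structure: E = AB, F = EC, G = FD.
chain = {}
chain["E"] = pa * pb
chain["F"] = chain["E"] * pc
chain["G"] = chain["F"] * pd

# Tree structure: E = AB, F = CD, G = EF.
tree = {}
tree["E"] = pa * pb
tree["F"] = pc * pd
tree["G"] = tree["E"] * tree["F"]

chain_total = sum(rise_probability(p) for p in chain.values())
tree_total = sum(rise_probability(p) for p in tree.values())
extra_percent = 100.0 * (tree_total - chain_total) / chain_total

for name, probs in (("chain", chain), ("tree", tree)):
    for gate, p in probs.items():
        print(f"{name} {gate}: p1 = {p:.4f}, P(0->1) = {rise_probability(p):.4f}")
print(f"chain total = {chain_total:.4f}, tree total = {tree_total:.4f}, "
      f"tree uses {extra_percent:.0f}% more")
```

Multiplying the gate input probabilities is exact here because neither structure has reconvergent fanout.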

Problem 2
Assume that the two-input AND gates in Problem 1 each have one unit of delay. Find input vector pairs for each implementation that will consume the peak dynamic power. Which implementation consumes less peak dynamic power?
[Figure: the same tree and chain structures as in Problem 1.]

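Problem 2 Solution
For the chain structure, the vector pair {A B C D} = {1110},{1011} produces four gate transitions, as shown below.
[Figure: chain waveforms over time units 0 to 3, with initial and final values A = 11, B = 10, E = 10, C = 11, F = 10, D = 01, G = 00.]
E falls at time 1 and F falls at time 2. G starts and ends at 0 but glitches: it rises at time 1, when D is already 1 and F is still 1, and falls at time 3.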

Problem 2 Solution (Cont.)
The tree structure has balanced delay paths, so it cannot make more than 3 gate transitions. The vector pair {A B C D} = {1111},{1010} produces three transitions, as shown below.
[Figure: tree waveforms over time units 0 to 3, with initial and final values A = 11, B = 10, E = 10, C = 11, D = 10, F = 10, G = 10.]
Therefore, just counting the gate transitions for these vector pairs, the chain consumes 100 × (4 – 3)/3 = 33% higher peak power than the tree.
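The transition counts can be checked with a small unit-delay simulation. The sketch below is an assumption-laden model, not the slide author's tool: inputs switch at time 0, every AND gate output at time t+1 is the AND of its inputs at time t, and only gate-output toggles are counted. The netlist dictionaries and function names are illustrative.

```python
from itertools import product

# Gate name -> (input a, input b), listed in topological order.
CHAIN = {"E": ("A", "B"), "F": ("E", "C"), "G": ("F", "D")}
TREE = {"E": ("A", "B"), "F": ("C", "D"), "G": ("E", "F")}

def settle(gates, inputs):
    """Steady-state values of all nets for a fixed input vector."""
    values = dict(inputs)
    for gate, (a, b) in gates.items():
        values[gate] = values[a] & values[b]
    return values

def count_transitions(gates, before, after, steps=3):
    """Count gate-output toggles after the inputs switch from before to after."""
    values = settle(gates, dict(zip("ABCD", before)))
    values.update(zip("ABCD", after))            # inputs switch at time 0
    toggles = 0
    for _ in range(steps):                       # one iteration per unit delay
        nxt = {g: values[a] & values[b] for g, (a, b) in gates.items()}
        toggles += sum(nxt[g] != values[g] for g in gates)
        values.update(nxt)
    return toggles

def peak_transitions(gates):
    """Exhaustive search over all 16 x 16 vector pairs."""
    vectors = list(product((0, 1), repeat=4))
    return max(count_transitions(gates, b, a) for b in vectors for a in vectors)

chain_pair = count_transitions(CHAIN, (1, 1, 1, 0), (1, 0, 1, 1))
tree_pair = count_transitions(TREE, (1, 1, 1, 1), (1, 0, 1, 0))
chain_peak = peak_transitions(CHAIN)
tree_peak = peak_transitions(TREE)
print(chain_pair, tree_pair, chain_peak, tree_peak)
```

Under this model the two vector pairs from the solution give 4 and 3 transitions, and the tree's exhaustive maximum is 3. The exhaustive search for the chain reports 5 (for example with the pair {1100},{1011}, where F and G both glitch), which suggests the four-transition pair is a high-activity case but may not be the chain's worst case. The conclusion that the chain has the higher peak is unchanged.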
