An Instruction-Level Functionality-based Energy Estimation Model for 32-bits Microprocessors

An Instruction-Level Functionality-based Energy Estimation Model for 32-bits Microprocessors. C. Brandolese, W. Fornaciari, F. Salice, D. Sciuto. June 5-9, 2000. CEFRIEL Research Center, Italy; Politecnico di Milano, Italy. Presenter: William Fornaciari, fornacia@elet.polimi.it

Presentation outline • Problem definition • Model for functional decomposition • Single processor model • Multi processor model • Experimental results • Identification of functionalities • Instruction characterization • Generalization properties • Concluding remarks

Problem formulation • The pervasive use of μPs makes the analysis of software power consumption a major need • Approaches to μP power characterization • Architectural: block analysis of the μP structure • Instruction-level, measurement-based • Problems • poor knowledge of the μP internals • complex and time-consuming measurement requirements • processor-dependent approaches • lack of formal statistical models

Goals • Definition of a formal instruction-level model, based on a functional decomposition of the μP activities • Intra-processor generalization • Capability to cope with an incomplete Instruction Set (IS) power characterization • Inter-processor generalization • Capability to predict the power characteristics of a μP based on the data of other μPs and its IS • Simplification of the measurements • Statistical validation of the measurement data

The model: definitions • Functionality $F_i$: set of activities involving, partially or totally, one or more μP units • Space-disjoint: if $F_i$ and $F_j$ involve different units • Time-disjoint: if $F_i$ and $F_j$ are accomplished at different times • Compatible model: a set of functionalities forms a compatible model iff the current $i_s$ absorbed by each instruction $s$ can be expressed as a linear combination of the currents $i_{f_j}$ associated with $F_j$: $i_s = \sum_{j=1}^{k} i_{f_j} a_{s,j}$ • The coefficients $a_{s,j}$ characterize each instruction in terms of the different functionalities • Adherence to physical reality imposes $i_{f_j} > 0$
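As an illustration of the compatible-model definition, the sketch below evaluates the linear combination of functionality currents for a single instruction. The function name and every numeric value are hypothetical, chosen only for the example; they are not measurements from the presentation.

```python
def instruction_current(functionality_currents, coefficients):
    """Return the instruction current as sum_j i_fj * a_sj.

    functionality_currents: currents i_fj, one per functionality (must be > 0)
    coefficients: coefficients a_sj of one instruction, same length
    """
    if len(functionality_currents) != len(coefficients):
        raise ValueError("one coefficient per functionality is required")
    if any(i_f <= 0 for i_f in functionality_currents):
        # Physical constraint stated on the slide: every i_fj is positive.
        raise ValueError("functionality currents must be positive")
    return sum(i_f * a for i_f, a in zip(functionality_currents, coefficients))


# Hypothetical example with k = 3 functionalities (currents in mA).
i_f = [100.0, 50.0, 20.0]
a_s = [1.0, 0.5, 0.0]  # the third functionality is not used by this instruction
i_s = instruction_current(i_f, a_s)
print(i_s)  # 125.0
```

A zero coefficient simply means the instruction does not stimulate that functionality.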

The model: single processor • The energy $e_s$ absorbed by instruction $s$ is: $e_s = \sum_{j=1}^{k} e_{s,j} = V_{dd} \cdot i_s \cdot n_{ck,s} \cdot \tau$ • Let $I_N = \{ i_s \, n_{ck,s} \}$: $I_N = A \, I_F + R$ • Solving this equation in the least-squares sense gives an estimate of the functionality currents: $I_F^{est} = A^{*} I_N = I_F + A^{*} R$, where $A^{*} = (A' A)^{-1} A'$
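The least-squares step above can be sketched in plain Python by solving the normal equations (A'A) x = A'b, which is equivalent to applying (A'A)^-1 A' to the measurement vector. This is a minimal sketch rather than the authors' implementation: the helper names, the two-functionality matrix and the noise-free measurements are all assumptions made for the example.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def solve(m, rhs):
    """Solve the square system m x = rhs by Gauss-Jordan elimination."""
    n = len(m)
    aug = [list(row) + [r] for row, r in zip(m, rhs)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("A'A is singular: functionalities not separable")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def estimate_functionality_currents(A, I_N):
    """Least-squares estimate of the functionality currents from A and I_N."""
    At = transpose(A)
    AtA = [[sum(x * y for x, y in zip(row, col)) for col in At] for row in At]
    AtI = [sum(x * y for x, y in zip(row, I_N)) for row in At]
    return solve(AtA, AtI)


# Hypothetical data: 4 instructions (rows), 2 functionalities (columns).
A = [[1, 1], [1, 2], [2, 1], [1, 3]]
true_currents = [100.0, 60.0]  # assumed ground truth, in mA
I_N = [sum(a * i for a, i in zip(row, true_currents)) for row in A]  # R = 0
estimate = estimate_functionality_currents(A, I_N)
print(estimate)  # close to [100.0, 60.0] because no noise was added
```

With real measurements the residual R is not zero, so the estimate differs from the true currents by the term obtained by applying the same operator to R.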

The model: single processor (cont’d) • A statistical analysis proves the formal correctness • Applicability to the available measures can be justified through a $z_{0.95}$ or $z_{0.99}$ test • The linearity of the model and its statistical properties allow one to • Derive the model parameters $I_F^{est}$ from a limited set of instruction measures (learning-set) • Extrapolate the energy of other, uncharacterized instructions (generalization-set)

The model: multiple processors • For the same type of instruction • the absolute current strongly depends on the considered μP • the relative currents ($i_{rel,s}$) are nearly independent of the μP • A general model based on a set $P$ of $p$ μPs for learning has been defined using the $i_{rel,s}$, defined as: $i_{rel,s} = i_s / i_{ref} = \sum_{j=1}^{k} i_{f_{rel},j} a_{s,j}$, where $k$ is the number of functionalities considered • For the generic $q$-th μP of $P$, characterized by $I_{N,rel,q} = \{ i_{rel,s} \, n_{ck,s} \}$ and $A_q$: $I_{N,rel,q} = A_q I_{F,rel,q} + R_{rel,q}$
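The normalization to relative currents can be sketched as a division by a reference current. Taking the current of one chosen instruction as the reference is an assumption of this example, as are the function name, the instruction names and all numbers; the two hypothetical processors are constructed so that their relative currents coincide.

```python
def relative_currents(currents, reference):
    """Divide every instruction current by the current of a reference instruction."""
    i_ref = currents[reference]
    if i_ref <= 0:
        raise ValueError("reference current must be positive")
    return {name: i_s / i_ref for name, i_s in currents.items()}


# Hypothetical absolute currents (mA) for two processors.
cpu_a = {"NOP": 100.0, "ADD": 120.0, "LOAD": 150.0}
cpu_b = {"NOP": 40.0, "ADD": 48.0, "LOAD": 60.0}

rel_a = relative_currents(cpu_a, "NOP")
rel_b = relative_currents(cpu_b, "NOP")
print(rel_a)
print(rel_b)
```

Although the absolute values differ by a factor of 2.5, both processors yield the same relative profile in this constructed case, which is the property the general model relies on.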

The model: multiple processors • Solving the above equation in the least-squares sense, and considering that the model should depend on a general set of parameters $I_{F,rel}$ instead of microprocessor-specific ones: $I_{F,rel,q}^{est} = A_q^{*} I_{N,rel,q} = I_{F,rel} + A_q^{*} R_{rel,q}$ • Adding up the relations for all the $p$ μPs of the set and dividing by $p$: $\frac{1}{p} \sum_{q=1}^{p} I_{F,rel,q}^{est} = I_{F,rel} + \frac{1}{p} \sum_{q=1}^{p} A_q^{*} R_{rel,q}$

The model: multiple processors • Result: the average of the estimated parameters of each μP composing $P$ is an adequate estimator for the parameters of the general model • Statistical analysis proves the formal correctness of this choice • As the number of considered processors increases, the variance of the parameters decreases as: $\mathrm{VAR}[I_F^{est}] = \frac{1}{p^2} \sum_{q=1}^{p} \mathrm{VAR}[I_{F,rel,q}^{est}]$
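The averaging estimator and its variance can be sketched as below. The variance expression is the standard one for the mean of uncorrelated estimates, so the sketch assumes the per-processor estimates are uncorrelated. Function names and numbers are hypothetical.

```python
def average_estimates(per_processor_estimates):
    """Average the k relative functionality currents over the p processors."""
    p = len(per_processor_estimates)
    if p == 0:
        raise ValueError("at least one processor is required")
    k = len(per_processor_estimates[0])
    return [sum(est[j] for est in per_processor_estimates) / p for j in range(k)]


def variance_of_average(per_processor_variances):
    """Element-wise (1/p^2) * sum over q of the per-processor variances."""
    p = len(per_processor_variances)
    if p == 0:
        raise ValueError("at least one processor is required")
    k = len(per_processor_variances[0])
    return [sum(var[j] for var in per_processor_variances) / p**2 for j in range(k)]


# Hypothetical estimates for p = 3 processors and k = 2 functionalities.
estimates = [[1.0, 0.5], [1.2, 0.7], [0.8, 0.6]]
variances = [[0.09, 0.036], [0.09, 0.036], [0.09, 0.036]]

general_model = average_estimates(estimates)       # about [1.0, 0.6]
general_variance = variance_of_average(variances)  # about [0.03, 0.012]
print(general_model, general_variance)
```

When every processor has the same variance, the variance of the average is that value divided by p, so adding processors to the learning set tightens the general model.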

Functionality identification • Identification of the functionalities: a compromise between accuracy and specific knowledge of the μP architecture • Selection of five functionalities, based on measured power consumption figures • Principal component analysis shows that no functionality can be neglected without affecting model accuracy

Functionality identification • The values of $a_{s,j}$ have to be determined according to the identified functionalities • Example: decomposition into F&D and Exec: $a_{s,F\&D} \, i_{f_{F\&D}} + a_{s,Exec} \, i_{f_{Exec}} = i_s \, n_{ck,s}$ • where a reasonable choice for $a_{s,F\&D}$ and $a_{s,Exec}$ is: $a_{s,F\&D} = n_{ck,s,F\&D}$, $a_{s,Exec} = n_{ck,s,Exec}$ • General case: $F_j$ is involved in the execution of the instruction $s$ if the activation coefficient $b_{s,j} = 1$ • The relation between $b_{s,j}$ and $a_{s,j}$ is [equation missing in the source]
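The F&D/Exec example can be made concrete with a small sketch in which each coefficient is the number of clock cycles the instruction spends in that functionality, as the slide suggests. The cycle counts, the currents and the helper names are hypothetical; the general relation involving the activation coefficients is not reproduced because it is not available in the transcript.

```python
FUNCTIONALITIES = ("F&D", "Exec")


def coefficient_row(cycles_per_functionality):
    """Build the coefficients of one instruction from per-functionality cycle counts."""
    return [float(cycles_per_functionality.get(f, 0)) for f in FUNCTIONALITIES]


def weighted_current(row, functionality_currents):
    """Left-hand side of the example: a_FD * i_FD + a_Exec * i_Exec."""
    return sum(a * i_f for a, i_f in zip(row, functionality_currents))


# Hypothetical instruction: 2 cycles in F&D and 3 cycles in Exec.
cycles = {"F&D": 2, "Exec": 3}
row = coefficient_row(cycles)        # [2.0, 3.0]
currents = [100.0, 60.0]             # hypothetical currents in mA

in_s = weighted_current(row, currents)  # equals i_s * n_ck,s
n_ck = sum(cycles.values())             # total clock cycles of the instruction
i_s = in_s / n_ck                       # average current per clock cycle
print(in_s, i_s)  # 380.0 76.0
```

Stacking one such row per characterized instruction yields the matrix used in the least-squares estimation.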

Estimation on a single processor • The stimulated functionalities depend on the operation class and the addressing mode

Estimation on a single processor • 6 μPs considered, with similar results • ARM7, StrongARM, i960JF, i960HD, SPARC, i80486DX • Example: i486DX. The Gaussian noise hypothesis holds: $\mu_R = 9.94$ falls in the range $z_{0.95} = 40$ • Comparison with 18 energy-characterized instructions

Generalization on single processor • Procedure • Select a learning-set from the characterized instructions • Estimate the currents of the remaining instructions (generalization set) • Compute the learning and generalization errors • Errors tend to compensate (mean $\sim 10^{-10}$), so absolute error values have been used to assess the accuracy of the methodology; the absolute error is always < 9%
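The error bookkeeping in the procedure above can be sketched as follows. The example shows why signed errors are a poor accuracy measure: they can cancel out while the absolute errors do not. The function names and the measured and estimated values are hypothetical.

```python
def percentage_errors(measured, estimated):
    """Signed relative error of each estimate, in percent of the measured value."""
    return [100.0 * (e - m) / m for m, e in zip(measured, estimated)]


def mean(values):
    return sum(values) / len(values)


# Hypothetical currents (mA) for three instructions of a generalization set.
measured = [100.0, 200.0, 50.0]
estimated = [105.0, 190.0, 50.0]

errors = percentage_errors(measured, estimated)     # [5.0, -5.0, 0.0]
mean_signed = mean(errors)                          # errors compensate: 0.0
mean_absolute = mean([abs(e) for e in errors])      # about 3.33
worst_case = max(abs(e) for e in errors)            # 5.0
print(mean_signed, mean_absolute, worst_case)
```

The same functions apply unchanged to the learning-set, giving the learning error next to the generalization error.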

Generalization over different μPs • Combinations of μPs are used to generate different μP learning-sets, whose estimated model parameters allow prediction of the parameters of other μPs • In agreement with the theoretical model, the general mean error decreases as the μP learning-set size increases, and so does the variance

Generalization over different μPs • Example: estimation of the model parameters for the i486 based on those of SPARClite and ARM7TDMI

Concluding remarks • Identification of a widely applicable general model to estimate the power consumption of 32-bits μPs • Single μP: copes with partially characterized Instruction Sets • Multiple μPs: considering relative currents allows prediction of the characteristics of a μP by capitalizing on the knowledge of other characterized μPs • Multiple μPs: if needed, the current per clock cycle of a single specific instruction might be used as reference to compute absolute values

Concluding remarks (cont’d) • Soundness of the model has been proved both formally and experimentally • Six μPs have been considered • The model deals with static aspects of the power consumption of each instruction • Dynamic effects have to be accounted for through other models that consider higher-level features of code execution
