Estimating the Worst-Case Energy Consumption of Embedded Software

Ramkumar Jayaseelan, Tulika Mitra, Xianfeng Li. School of Computing, National University of Singapore.


Presentation Transcript


  1. Estimating the Worst-Case Energy Consumption of Embedded Software Ramkumar Jayaseelan Tulika Mitra Xianfeng Li School of Computing National University of Singapore

  2. Motivation • Conventional scheduling techniques give timing guarantees • Processor cycles are the critical resource • The worst-case execution time (WCET) of each task is a required input • Battery life is equally important for mobile devices • Scheduling techniques have to give energy guarantees • The Worst-Case Energy Consumption (WCEC) of each task is a required input

  3. Remotely Deployed Systems • Available energy is unevenly distributed among nodes • Spatio-temporal scheduling benefits from WCEC • (Figure: sensor network with a local station)

  4. Energy-Based Guarantees • Scheduling critical and non-critical tasks in a battery-operated system • Non-critical tasks can be run only if energy constraints for critical tasks are satisfied • Worst-case energy estimation is crucial

  5. Reward-Based Scheduling • Energy consumption ∝ Voltage • Delay ∝ (1 / Voltage) • Reward-based scheduling attempts to satisfy constraints on energy and timing • An energy guarantee is possible only if the worst-case energy consumption of the tasks is known

  6. Outline • Background • Relation between WCET and Worst-case energy consumption • Estimation technique: Simplified model • Instruction cache and speculation • Experimental results • Conclusion

  7. Background • Power and energy are often used interchangeably • Power is energy consumed per unit time • Energy consumed during program execution E = P × t • Approximation as P is also a function of time

  8. E = P × t is an approximation • In reality, power varies while a program executes • Energy is the area under the power-versus-time curve: E = ∫P(t)dt
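
To make the difference between E = P × t and E = ∫P(t)dt concrete, here is a minimal sketch that integrates a sampled power trace with the trapezoidal rule. The trace values, the units, and the name `energy_trapezoid` are illustrative assumptions, not data from the presentation.

```python
def energy_trapezoid(times, powers):
    """Approximate E = integral of P(t) dt with the trapezoidal rule."""
    total = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        total += 0.5 * (powers[i] + powers[i - 1]) * dt
    return total


# Hypothetical power trace (assumed values): time in seconds, power in watts.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
powers = [1.0, 3.0, 2.0, 4.0, 1.0]

duration = times[-1] - times[0]

# Area under the curve.
energy_integral = energy_trapezoid(times, powers)

# E = P x t is exact only when P is the average power over the interval.
average_power = energy_integral / duration

# Using a single power value such as the peak misestimates the energy.
energy_peak = max(powers) * duration

print(energy_integral, average_power, energy_peak)
```

With a single constant power value the product P × t is exact only if that value happens to be the average power, which is not known before the trace is integrated.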

  9. WCEC versus WCET Full Input Space Expansion for a 5-element Insertion Sort program

  10. Cannot Estimate WCEC from WCET • Possible underestimation using WCEC = WCET × power

  11. WCEC versus WCET • WCEC path need not be the same as the WCET path • WCEC cannot be directly estimated from the WCET value
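
A small sketch of why the two worst-case paths can differ. It assumes, for illustration only, that a path's energy is a time-independent dynamic part plus a fixed per-cycle cost; the two paths and all numbers are hypothetical.

```python
# Hypothetical paths through a program (all numbers are assumed).
paths = {
    "stall_heavy": {"cycles": 120, "dynamic_nj": 200},    # slow, little switching
    "compute_heavy": {"cycles": 100, "dynamic_nj": 400},  # fast, much switching
}

# Assumed per-cycle cost (for example clock and leakage), in nJ per cycle.
PER_CYCLE_NJ = 1


def path_energy(path):
    """Energy of a path: time-independent part plus time-dependent part."""
    return path["dynamic_nj"] + PER_CYCLE_NJ * path["cycles"]


wcet_path = max(paths, key=lambda name: paths[name]["cycles"])
wcec_path = max(paths, key=lambda name: path_energy(paths[name]))

print("WCET path:", wcet_path, path_energy(paths[wcet_path]), "nJ")
print("WCEC path:", wcec_path, path_energy(paths[wcec_path]), "nJ")
```

In this example the longest-running path consumes less energy than the shorter one, so taking the energy of the WCET path would underestimate the WCEC.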

  12. A closer look at Power • Dynamic Power : Power Consumption due to switching of transistors • Leakage Power: Power consumed independent of switching activity • Dynamic power forms the bulk of power consumption in today’s processors

  13. Dynamic Power • Dynamic power: P = (1/2) × A × V² × C × f, where V is the supply voltage, C is the capacitance of the circuit, f is the frequency, and A is the activity factor • V, C, f are independent of program execution • Variation in P is due to the variation in A
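
The formula can be evaluated directly. The sketch below uses assumed values for V, C, and f (they are not taken from the presentation) and shows that, with those three fixed, power scales linearly with the activity factor A.

```python
def dynamic_power(activity, voltage, capacitance, frequency):
    """P = (1/2) * A * V^2 * C * f, in watts for SI inputs."""
    return 0.5 * activity * voltage ** 2 * capacitance * frequency


# Assumed, illustrative parameters.
V = 1.2      # supply voltage in volts
C = 1e-9     # switched capacitance in farads
F = 1e9      # clock frequency in hertz

p_low = dynamic_power(0.2, V, C, F)   # component rarely active
p_high = dynamic_power(0.4, V, C, F)  # component active twice as often

print(p_low, p_high, p_high / p_low)
```

Doubling A doubles P, which is why the analysis concentrates on modeling the variation in A.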

  14. Variation in Activity Factor (A) • Not all parts of the processor are used in every cycle • e.g., data-cache is used only for loads/stores • Clock gating disables unused components • Activity factor (A) varies during the execution of the program • Model variation in A through static analysis

  15. Switch-off Energy • An inactive component cannot be fully switched off • A certain portion of the peak energy is consumed even in idle cycles • Switch-off energy is proportional to the number of idle cycles

  16. Clock Energy and Leakage Energy • Clock power: power consumed in clock distribution network • Leakage power: power consumed due to leakage in transistors • Clock energy and leakage energy are directly proportional to the execution time

  17. Energy Components Summary • Dynamic Energy • Switching of transistors during execution • Independent of execution time • Switch-off Energy • Energy consumed in unused components • Depends on idle cycles • Clock and Leakage energy • Directly proportional to execution time

  18. WCEC versus WCET Full Input Space Expansion for a 5-element Insertion Sort program

  19. Our Analysis: Overview • Operate on the control flow graph • Estimate worst-case energy of basic blocks • Formulate estimation for whole program as an integer linear programming (ILP) problem

  20. ILP Formulation • Input: Control flow graph of the program • Objective function: maximize Worst-Case Energy = Σ_B (WCEC_B × count_B) • Need to estimate the Worst-Case Energy Consumption (WCEC_B) for each basic block B

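  21. Flow Constraints • Inflow = Basic Block Execution Count = Outflow • Bounds on maximum loop iterations • Example control flow graph with basic blocks B0, B1, B2, B3, where E_i,j is the execution count of the edge from Bi to Bj: • E0,1 = B0 = 1 • E2,3 + E1,3 = B3 = 1 • E0,1 + E2,1 = E1,2 + E1,3 = B1 • E1,2 = E2,3 + E2,1 = B2 • Loop bound: E2,1 <= 100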
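
The following sketch maximizes Σ WCEC_B × count_B under the flow constraints of this example. It is not the authors' tool: instead of an ILP solver it enumerates the two free variables (E2,1 and E1,3), which is feasible only because the example is tiny. The per-block energy values are hypothetical.

```python
# Hypothetical worst-case energy per basic block, in nJ (assumed values).
WCEC = {"B0": 50, "B1": 20, "B2": 80, "B3": 30}
LOOP_BOUND = 100  # E2,1 <= 100


def solve():
    """Enumerate feasible edge counts and return (max energy, block counts)."""
    best = None
    for e21 in range(LOOP_BOUND + 1):   # back edge B2 -> B1
        for e13 in (0, 1):              # exit edge B1 -> B3
            e01 = 1                     # E0,1 = B0 = 1
            e23 = 1 - e13               # E2,3 + E1,3 = B3 = 1
            b1 = e01 + e21              # inflow of B1
            e12 = b1 - e13              # outflow of B1: E1,2 + E1,3 = B1
            b2 = e12                    # inflow of B2
            if e12 < 0 or b2 != e23 + e21:
                continue                # outflow of B2 must equal its inflow
            counts = {"B0": 1, "B1": b1, "B2": b2, "B3": 1}
            energy = sum(WCEC[b] * counts[b] for b in counts)
            if best is None or energy > best[0]:
                best = (energy, counts)
    return best


max_energy, block_counts = solve()
print(max_energy, block_counts)
```

For realistic programs the search space is far too large for enumeration, which is why the estimation is formulated as an ILP problem and handed to a solver.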

  22. Worst-Case Energy of a Basic Block • Processor Model • Energy Components • Instruction Specific Energy • Pipeline Specific Energy

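  23. Processor Model • (Figure: out-of-order pipeline with stages IF, ID, ISSUE, EX, WB, CM; an instruction buffer (IBUF) after fetch; a reorder buffer (ROB) holding in-flight instructions; functional units ALU, MULT, FPU)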

  24. Pipelined Execution of Instructions • Example sequence: ADD R1,R2,R3; MUL R4,R5,R6; SUB R7,R8,R9 • (Figure: pipeline diagram over clock cycles 1 to 8, each instruction passing through IF, ID, IS, EX, WB, CM) • Difficult to statically predict the energy consumption in each cycle

  25. Pipelined Execution of Instructions • Example sequence: ADD R1,R2,R3; MUL R4,R5,R6; SUB R7,R8,R9 • (Figure: the same pipeline diagram over clock cycles 1 to 8, now with stall cycles) • Difficult to statically predict the energy consumption in each cycle

  26. Our Approach • Determine the maximum energy consumed on a component by component basis • Static analysis to determine the maximum energy consumed by a component in a specified interval

  27. Execution of an Instruction • Pipeline stages: IF, ID, ISSUE, EX, WB, CM

  28. Instruction Specific Energy • Energy consumed due to the sub-tasks associated with execution of an instruction • e.g., register file access, ALU usage, etc. • Depends on the type of executed instruction • No correlation with execution time

  29. Pipeline Specific Energy • During program execution energy is consumed due to • Switch-off power (idle cycles) • Leakage power (every cycle) • Clock network power (every cycle) • Cannot be attributed to any instruction • Energy consumed even in idle cycles

  30. Energy Components • Observation: Energy consumed can be separated out as • Instruction Specific energy • Energy associated with the execution of a particular instruction • Independent of execution time • Pipeline Specific energy • Energy consumed in other components such as clock network, leakage etc. • Related to execution time

  31. Worst-case Energy of a Basic Block • WCEC_BB = dynamic_BB + switchoff_BB + leakage_BB + clock_BB • dynamic_BB: instruction-specific energy for BB • switchoff_BB, leakage_BB and clock_BB are the energy consumed in unused components, leakage and the clock network during WCET_BB
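
A minimal sketch of how the four components might be combined for one basic block. The function name, parameter names, and all numbers are assumptions for illustration; the 10% switch-off fraction and the proportionality of clock and leakage energy to execution time come from other slides of this presentation, but the exact formulas used by the authors are not preserved in this transcript.

```python
def wcec_basic_block(dynamic_nj, components, wcet_cycles,
                     leakage_nj_per_cycle, clock_nj_per_cycle,
                     switchoff_fraction=0.10):
    """Sum the four energy components of a basic block (all in nJ).

    components maps a component name to (peak energy per cycle, idle cycles).
    """
    # Switch-off energy: a fraction of peak energy for every idle cycle.
    switchoff = sum(switchoff_fraction * peak * idle
                    for peak, idle in components.values())
    # Leakage and clock energy are proportional to execution time.
    leakage = leakage_nj_per_cycle * wcet_cycles
    clock = clock_nj_per_cycle * wcet_cycles
    return dynamic_nj + switchoff + leakage + clock


# Hypothetical basic block (assumed values).
components = {
    "alu": (4.0, 6),      # peak nJ per cycle, idle cycles during WCET_BB
    "dcache": (10.0, 8),
}
total = wcec_basic_block(dynamic_nj=50.0, components=components,
                         wcet_cycles=10, leakage_nj_per_cycle=0.5,
                         clock_nj_per_cycle=2.0)
print(total)
```

Only the dynamic term is independent of WCET_BB; the other three grow with the block's worst-case execution time.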

  32. Instruction Specific Energy • Energy consumed due to switching activity generated by the instructions in BB • Sum of energy consumed by individual instructions in BB

  33. Switch-off Energy • Unused units consume 10% of peak energy • Switch-off energy for a specific component (C) • Switch-off energy for basic block BB

  34. Clock Energy and Leakage Energy • Clock Energy • Leakage Energy

  35. Overlap among basic blocks • (Figure: timeline from t1 to t5 showing the execution of BB, its WCET_BB, and its overlap with the neighboring blocks B1, B2 and B3)

  36. Switch-off Energy • Unused units consume 10% of peak energy • Switch-off energy for a specific component (C) • Switch-off energy for basic block BB

  37. Instruction Cache Modeling • Context-based ILP formulation used in WCET analysis [Li et al RTSS 2004] • Each basic block is divided into memory blocks • A context maps each of these memory blocks to a hit or a miss • Estimate the worst-case energy of each context, taking into account main memory access energy

  38. Modeling Branch Misprediction • (Figure: timeline from t1 to t3 showing blocks BB', BX and BB)

  39. Objective function • count(c,ω) is the number of times the basic block Bi is executed when reached from Bj with the branch predicted correctly • count(m,ω) is defined similarly for the case where the branch is mispredicted • energy(c,ω) and energy(m,ω) are defined in the same manner • The ILP problem is solved to generate values for count using constraints similar to WCET analysis

  40. Results • Platform: SimpleScalar toolset • Modified WCET analysis tool [Li et al RTSS 2004] to estimate worst-case energy • Energy values for processor components derived from parameterized models in Wattch • ILP problem is solved using CPLEX

  41. Results • Compare estimated WCEC against the observed values for eleven benchmarks • Observed values are obtained using the Wattch power simulator • The actual inputs producing the WCEC are unknown • Manually select inputs that might produce the WCEC

  42. Styles of Clock Gating • Simple: Peak power is consumed even if there is one access to a specific component • Ideal : Power consumed is proportional to the number of ports accessed • Realistic: Same as ideal but unused components consume switch-off power
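
The three styles can be contrasted with a small per-cycle power function. This is a sketch under stated assumptions: the function name and arguments are hypothetical, the 10% switch-off fraction is taken from the switch-off energy slide, and the treatment of a completely unused component under the simple style (zero power) is an assumption not stated in the slide.

```python
def component_power(style, peak, ports_used, ports_total,
                    switchoff_fraction=0.10):
    """Per-cycle power of one component under a given clock gating style."""
    if style == "simple":
        # Full peak power as soon as there is at least one access.
        # Assumption: zero power when the component is not accessed at all.
        return peak if ports_used > 0 else 0.0
    scaled = peak * ports_used / ports_total
    if style == "ideal":
        # Power proportional to the number of ports accessed.
        return scaled
    if style == "realistic":
        # Same as ideal, but an unused component still draws switch-off power.
        return scaled if ports_used > 0 else peak * switchoff_fraction
    raise ValueError("unknown clock gating style: " + style)


# Hypothetical component: peak power 8.0 (arbitrary units), 4 ports.
for style in ("simple", "ideal", "realistic"):
    print(style,
          component_power(style, 8.0, 1, 4),   # one port accessed
          component_power(style, 8.0, 0, 4))   # component idle
```

With one of four ports in use, the simple style charges the full peak power while the other two charge a quarter of it; the realistic style differs from the ideal one only in idle cycles.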

  43. Results • (Figure: results for simple clock gating and ideal clock gating) • Results for ideal clock gating are more accurate than for simple clock gating because of the distribution of accesses

  44. Results • (Figure: results for realistic clock gating and ideal clock gating) • Results for ideal clock gating are more accurate than for realistic clock gating because of conservative WCET estimation

  45. Conclusion • Static worst-case energy estimation technique that takes into account pipelining, instruction cache and branch prediction • Future work • Validation using commercial processors • Explore the possibility of providing thermal guarantees

  46. Execution of an Add Instruction • IF: I-cache access • ID: instruction decode + rename logic • ISSUE: wakeup + selection logic • EX: register file read + add unit access • WB: result bus • CM: ROB retire + register file update

  47. Instruction Specific Energy • Each component is accessed once • Selection logic may be accessed multiple times • Instruction Specific Energy is
