Prabal Dutta, Rodrigo Fonseca, Thomas Schmid. Energy Metering and Tracking with iCount and Quanto. IPSN 2009 Tutorial – San Francisco – April 16th, 2009
quanto /’kwänto/ Portuguese (from Latin quantu) interrogative pronoun - how much, what amount, what quantity, what number, what price
Introductions: tell us about yourselves. prabal@eecs.berkeley.edu, rodrigo.fonseca@gmail.com, thomas.schmid@ucla.edu
Schedule • 1:00 – 2:30 Presentation • 2:30 – 3:00 Break • 3:00 – 5:00 Hands on • Goals: • Present the concepts behind Quanto • Get you excited by instrumenting, running, and analyzing simple applications
Outline • Demo: Blink • Introduction • How much energy? iCount • Principles • Calibration • What is using my energy? • Energy breakdown • Why is it using my energy? • Activity Tracking • Quanto in Practice • Architecture • Interesting Findings • Recording Information • Hands on Session
Blink: What’s happening? [Figure: 48 seconds of Blink, showing dedicated resources, logical threads of execution, and shared resources]
Where have all the Joules gone? [Figure: 48 seconds of Blink, “sliced” by device and “tracked” by activity]
Blink: Instrumentation

Original Blink:

  module BlinkC {
    uses interface Timer<TMilli> as Timer0;
    uses interface Timer<TMilli> as Timer1;
    uses interface Timer<TMilli> as Timer2;
    uses interface Leds;
    uses interface Boot;
  }
  implementation {
    event void Boot.booted() {
      call Timer0.startPeriodic(250);
      call Timer1.startPeriodic(500);
      call Timer2.startPeriodic(1000);
    }
    event void Timer0.fired() { call Leds.led0Toggle(); }
    event void Timer1.fired() { call Leds.led1Toggle(); }
    event void Timer2.fired() { call Leds.led2Toggle(); }
  }

Instrumented Blink (an activity label is set before each timer is started):

  module BlinkC {
    uses interface Timer<TMilli> as Timer0;
    uses interface Timer<TMilli> as Timer1;
    uses interface Timer<TMilli> as Timer2;
    uses interface Leds;
    uses interface Boot;
    // CPUActivity must also appear in this uses list; the slide omits its declaration.
  }
  implementation {
    event void Boot.booted() {
      // Each timer started below inherits the label that is current at start time.
      call CPUActivity.set(ACT_LED0);
      call Timer0.startPeriodic(250);
      call CPUActivity.set(ACT_LED1);
      call Timer1.startPeriodic(500);
      call CPUActivity.set(ACT_LED2);
      call Timer2.startPeriodic(1000);
    }
    event void Timer0.fired() { call Leds.led0Toggle(); }
    event void Timer1.fired() { call Leds.led1Toggle(); }
    event void Timer2.fired() { call Leds.led2Toggle(); }
  }
Real applications: many services making concurrent use of the same hardware • Hardware: MCU, Radio, Sensors, Storage, Power Supply • “Trio Network” [Dutta06]
Three basic challenges • Energy metering (iCount): measure energy usage, i(t) → p(t) → ∫p(t)dt • Energy breakdown (iCount + P-states + Regression): slice usage horizontally; allocate usage to energy sinks • Activity tracking (Labels): dice usage vertically; track causal connections
Measuring: wide horizontal/vertical dynamic range. [Figure: power traces at time scales of 86,400,000 ms, 640,000 ms [Farkas00], 4,000 ms, and 30 ms; TX packet at 1% duty cycle (20 ms / 2 s)]
Dynamic range in power draw exceeds 10,000:1 (from < 1 µW to > 50 mW)
Current energy metering techniques are inadequate. [Figure: existing approaches, including the DS2438, ADM1191, BQ2019, and BQ27500 ICs and [Jiang07], annotated with their drawbacks: cumbersome, expensive, not distributed, not scalable, not embedded, low resolution, low responsiveness, high cost, high quiescent power]
Key insight: switching regulators inherently meter energy. If your platform has a PFM switching regulator… (many do), add a wire. [Figure: iCount energy meter design]
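How does it work? [Figure: PFM regulator circuit with Vin, Cin, switches S1 and S2, inductor L at node LX (voltage VLX, current iLX), Cout, Rload, and Vout, cycling through three phases: energize, transfer, monitor] Energy stored in the inductor: $E = \tfrac{1}{2} L i^2$. Source: Maxim Semiconductor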
Each cycle transfers a fixed energy quantum to the load: $\Delta E = \tfrac{1}{2} L i^2$, $P = \Delta E / \Delta t$. Counting regulator cycles translates to measuring energy.
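To make the counting idea concrete, here is a minimal host-side sketch in Python. The inductance, peak current, and cycle count are made-up values rather than iCount calibration data, and the function names are hypothetical. In practice the per-cycle energy would more likely come from calibration than from nominal component values.

```python
def energy_per_cycle(inductance_h, peak_current_a):
    """Energy delivered per regulator cycle, dE = 1/2 * L * i^2, in joules."""
    return 0.5 * inductance_h * peak_current_a ** 2


def energy_and_power(cycle_count, delta_e_j, interval_s):
    """Convert a cycle count seen over interval_s into energy (J) and mean power (W)."""
    energy_j = cycle_count * delta_e_j
    return energy_j, energy_j / interval_s


# Hypothetical values: 10 uH inductor, 0.5 A peak inductor current.
DELTA_E = energy_per_cycle(10e-6, 0.5)
# Hypothetical reading: 8000 regulator cycles counted in one second.
ENERGY_J, POWER_W = energy_and_power(8000, DELTA_E, 1.0)
print(DELTA_E, ENERGY_J, POWER_W)
```

With these assumed numbers each cycle carries about 1.25 µJ, so 8000 cycles in one second correspond to roughly 10 mJ, or an average of about 10 mW.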
This simple design works surprisingly well. MAX1724 prototype implementation. [Figure: platforms labeled [Senseweb] (WSU), [Quanto+] (UCLA), [HydroWatch] (UCB), [Benchmark] (UCB), [Quanto] (UCB)]
Hardware costs: a wire and a counter. [Figure: HydroSolar Node (v2), with the “wire” and the counter marked]
iCount Performance summary * Frequency averaged over 1 second
Slicing: breaking down the envelope into its parts “Itsy” Marc A. Viredaz and Deborah A. Wallach, “Power Evaluation of a Handheld Computer”, IEEE Micro, Jan-Feb, 2003
Not all energy sinks can be instrumented. [Figure: block diagram with Power, MCU, USART, CPU, OSC, ADC, DMA, Timer, PA, LNA, Radio, Flash, Sensors, LEDs, RX, TX, and Control blocks, connected by power and data paths]
A different approach to energy slicing: power state tracking* • Export device power states (e.g., On, Off) through a narrow interface • OS tracks/logs state transitions * H. Zeng et al. “ECOSystem: Managing Energy as a First Class Operating Systems Resource”, ASPLOS’02, 2002.
Estimate energy breakdowns with regression • For every power state transition • Snapshot system-wide power states ($\alpha_1, \dots, \alpha_n$) • Snapshot global energy usage ($\Delta E$) • Snapshot system clock ($\Delta t$) • Generate an equation of the form $\Delta E/\Delta t = \alpha_1 p_1 + \dots + \alpha_n p_n$ (the $p_i$ are the unknown power draws) • Solve for the $p_i$ using weighted multivariate least squares • A high-resolution, high-speed energy meter is key for good results
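Example power state equations, each of the form $\Delta E/\Delta t = \alpha_1 p_1 + \alpha_2 p_2 + \alpha_3 p_3 + \alpha_4 p_4 + \alpha_5 p_5$:
$2/1 = 1 \cdot p_1 + 1 \cdot p_2 + 1 \cdot p_3 + 1 \cdot p_4 + 0 \cdot p_5$
$3/1.1 = 1 \cdot p_1 + 1 \cdot p_2 + 1 \cdot p_3 + 1 \cdot p_4 + 1 \cdot p_5$
$1/0.4 = 1 \cdot p_1 + 0 \cdot p_2 + 1 \cdot p_3 + 1 \cdot p_4 + 1 \cdot p_5$
$0/0.4 = 1 \cdot p_1 + 0 \cdot p_2 + 0 \cdot p_3 + 1 \cdot p_4 + 1 \cdot p_5$
$1/0.6 = 1 \cdot p_1 + 0 \cdot p_2 + 0 \cdot p_3 + 0 \cdot p_4 + 1 \cdot p_5$
$0/0.6 = 1 \cdot p_1 + 0 \cdot p_2 + 0 \cdot p_3 + 0 \cdot p_4 + 0 \cdot p_5$
In matrix form: $Y = \Delta E/\Delta t$, $AP = Y$, and $P = A^{-1}Y$ when $A$ is square and invertible. With weights $W = \mathrm{diag}(\sqrt{\Delta t \cdot \Delta E})$, the weighted least squares solution is $P = (A^{T} W A)^{-1} A^{T} W Y$.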
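The weighted least squares step can be sketched offline in plain Python. This is an illustrative sketch, not Quanto's own toolchain: the three devices, their "true" power draws, and the observation intervals are synthetic, and the helper names are hypothetical. The weights follow the slide's $W = \mathrm{diag}(\sqrt{\Delta t \cdot \Delta E})$.

```python
import math


def solve_linear(m, b):
    """Solve m x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: power states are not separable")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def estimate_power(states, delta_e, delta_t):
    """Weighted least squares: P = (A^T W A)^-1 A^T W Y, with Y = dE/dt."""
    y = [e / t for e, t in zip(delta_e, delta_t)]
    w = [math.sqrt(e * t) for e, t in zip(delta_e, delta_t)]
    rows, n = range(len(states)), len(states[0])
    atwa = [[sum(w[k] * states[k][i] * states[k][j] for k in rows)
             for j in range(n)] for i in range(n)]
    atwy = [sum(w[k] * states[k][i] * y[k] for k in rows) for i in range(n)]
    return solve_linear(atwa, atwy)


# Synthetic example: columns are [baseline, LED, radio]; draws in mW (made up).
TRUE_POWER = [1.0, 4.0, 60.0]
STATES = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1]]
DELTA_T = [2.0, 1.0, 0.5, 0.25]
DELTA_E = [t * sum(a * p for a, p in zip(row, TRUE_POWER))
           for row, t in zip(STATES, DELTA_T)]

ESTIMATE = estimate_power(STATES, DELTA_E, DELTA_T)
print(ESTIMATE)
```

Because the synthetic observations are noise-free, the estimate should match the assumed draws up to floating-point error. Note that an interval with $\Delta E = 0$ gets zero weight under this scheme, so enough intervals with nonzero energy are needed to keep the system solvable.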
Tracking: gap between what is measured and what matters • Itsy measured: breakdown by subsystem, for each application • PowerScope measured: breakdown by PC, breakdown by PID • Marc A. Viredaz and Deborah A. Wallach, “Power Evaluation of a Handheld Computer”, IEEE Micro, Jan-Feb, 2003 • Jason Flinn and M. Satyanarayanan, “Energy-Aware Adaptation for Mobile Apps.”, SOSP’99, Kiawah Island, SC, 1999
What’s wrong with subsystems, PCs, and PIDs? • Subsystem • No distinction between different logical activities • Is radio sending a routing beacon or data packet? • Program Counter • Pinpoints code hotspots • But, not the data on which the code operates • Routing beacon? Time sync packet? Data packet? • Process ID • Most sensornet applications are I/O bound • Most energy is spent outside the CPU • For I/O-bound processes, PID-based sampling is biased
Activities* are what actually matter • A causally-connected set of operations… • whose distinct resource consumptions… • should be grouped together for accounting* * M.B. Jones et al., “Modular Real-Time Resource Management in the Rialto Operating System”, HotOS’95, 1995. G. Banga et al. “Resource Containers: A New Facility for Resource Management in Server Systems”, OSDI’99, 1999. H. Zeng et al. “ECOSystem: Managing Energy as a First Class Operating Systems Resource”, ASPLOS’02, 2002.
Three steps to activity tracking • Annotating • Any abstraction can introduce an annotation • Associates an activity “label” with an execution • Labels are < origin-node : activity-identifier > pairs • Propagating • System software transfers activity labels • Across subsystems, nodes, and deferred computations • Recording • Track, log, and post-process resource usage
Annotating an activity “paints” causally-connected actions • “Sensing” involves • Sensor... • CPU, ADC, I2C bus, … • “Storing” involves • Flash… • CPU, SPI bus, timers, … • Initiate annotation with CPUActivity.set(<label>) • Quanto automatically propagates labels
[Figure: Node A with CPU, Sensor, and Flash; actions painted as “Act: sensing” or “Act: storing”]

  ...
  call CPUActivity.set(ACT_SENSING);
  call Sensor.read();
  ...
  call CPUActivity.set(ACT_STORING);
  call Flash.write(...);
  ...
Propagating labels over deferred computations • Examples • CPU → post task (deferred function call) → CPU • CPU → queue object → CPU • Task Scheduler • Add activity field to task structure • Set activity field on task posting • Restore activity on task invocation • Queue • Tag each entry with its activity label • Write activity label on enqueue • Restore activity label on dequeue
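The save-on-post, restore-on-run rule for tasks can be modeled in a few lines of Python. This is a toy model, not the TinyOS scheduler: the `Cpu` and `TaskScheduler` classes, the task names, and the `ACT_IDLE` label are all hypothetical, while `ACT_SENSING` and `ACT_STORING` are the labels used on the slides.

```python
from collections import deque


class Cpu:
    """Hypothetical stand-in for the CPU's single current activity label."""

    def __init__(self):
        self.activity = "ACT_IDLE"
        self.trace = []  # (activity, action) pairs, in execution order

    def run(self, action):
        self.trace.append((self.activity, action))


class TaskScheduler:
    """FIFO task queue that saves the label on post and restores it on run."""

    def __init__(self, cpu):
        self.cpu = cpu
        self.tasks = deque()

    def post(self, name):
        # Set the activity field when the task is posted.
        self.tasks.append((name, self.cpu.activity))

    def run_next(self):
        name, saved = self.tasks.popleft()
        # Restore the poster's activity when the task is invoked.
        self.cpu.activity = saved
        self.cpu.run(name)


cpu = Cpu()
sched = TaskScheduler(cpu)
cpu.activity = "ACT_SENSING"
sched.post("process_sample")
cpu.activity = "ACT_STORING"
sched.post("flush_log")
cpu.activity = "ACT_IDLE"  # the label changes before the tasks actually run
sched.run_next()
sched.run_next()
print(cpu.trace)
```

Each task runs under the label that was current when it was posted, even though the CPU's label changed in between. The same tag-on-enqueue, restore-on-dequeue pattern applies to queue entries.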
Propagating labels over the network • Add hidden field to packet • Sender’s OS sets activity field
[Figure: Node A (Sensor, CPU, Radio) sends a packet to Node B (Radio, CPU, Flash); “Act: sensing” and “Act: sending” on Node A, packet Tx, then a proxy Rx activity on Node B]

  Radio.send(message_t* msg) {
    . . .
    msg->header->activity = call CPUActivity.get();
    . . .
  }
Propagating unknown labels using proxy activities • Every interrupt causes energy consumption before the activity label is identified (Interrupt → CPU, Timer → CPU, Radio → CPU) • A proxy activity provides an ephemeral label • Binding with the real activity occurs when the label is clear
[Figure: Node A (Sensor, CPU, Radio) sends a packet to Node B (Radio, CPU, Flash); “Act: sensing” and “Act: sending” on Node A, packet Tx, then a proxy Rx activity on Node B]

  message_t* Radio.recv(message_t* msg, void* payload, uint8_t len) {
    . . .
    call CPUActivity.bind(msg->hdr->activity);
    . . .
  }
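The sender-stamps, receiver-binds flow can be illustrated with a small accounting model in Python. This is a sketch under assumptions: the `Ledger` class, the `PROXY_RX` label, the charge amounts, and the rule that binding moves the proxy's accumulated usage onto the real label are illustrative choices, not a description of Quanto's internals. Labels are written as (origin-node, activity-identifier) pairs.

```python
from collections import defaultdict


class Ledger:
    """Hypothetical per-activity usage ledger (units are arbitrary)."""

    def __init__(self):
        self.usage = defaultdict(float)
        self.current = ("B", "ACT_IDLE")

    def charge(self, amount):
        self.usage[self.current] += amount

    def bind(self, real_label):
        # Move everything charged to the proxy onto the real activity,
        # then continue under the real label.
        proxy = self.current
        self.usage[real_label] += self.usage.pop(proxy, 0.0)
        self.current = real_label


def send(sender_label, payload):
    """Sender side: the OS stamps the hidden activity field on the packet."""
    return {"activity": sender_label, "payload": payload}


packet = send(("A", "ACT_SENSING"), b"sample")

node_b = Ledger()
node_b.current = ("B", "PROXY_RX")  # interrupt fires; real label not yet known
node_b.charge(3.0)                   # cost of receiving and decoding the header
node_b.bind(packet["activity"])      # label is now clear
node_b.charge(2.0)                   # further processing under the real label
print(dict(node_b.usage))
```

After the bind, all five units on node B are attributed to node A's sensing activity, including the three that were spent before the label was known.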
Concurrent activities on shared resources
[Figure: timeline in which Activity A and Activity B each call Timer.start and later receive Timer.fired; the timer’s activity set goes A, then A/B, then B as labels are added (Add A, Add B) and removed (Rem A, Rem B)]

  async command void Timer.start() {
    call TimerActivity.add(call CPUActivity.get());
    ...
  }
  async event void HardwareTimer.fired() {
    signal Timer.fired();
    call TimerActivity.remove(call CPUActivity.get());
    ...
  }
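A shared resource with a set of active labels can be modeled as follows. The slide does not say how usage is divided while several activities hold the resource, so the equal split here is an assumption, and the class name and timestamps are hypothetical.

```python
class MultiActivityDevice:
    """Hypothetical model of a shared resource serving several activities."""

    def __init__(self):
        self.active = set()
        self.last_t = 0.0
        self.time_by_activity = {}

    def _account(self, now):
        # Assumption: elapsed time is split equally among the current activities.
        if self.active:
            share = (now - self.last_t) / len(self.active)
            for act in self.active:
                self.time_by_activity[act] = self.time_by_activity.get(act, 0.0) + share
        self.last_t = now

    def add(self, act, now):
        self._account(now)
        self.active.add(act)

    def remove(self, act, now):
        self._account(now)
        self.active.discard(act)


timer = MultiActivityDevice()
timer.add("A", 0.0)     # Timer.start for activity A
timer.add("B", 2.0)     # Timer.start for activity B
timer.remove("A", 6.0)  # timer fired for A
timer.remove("B", 7.0)  # timer fired for B
print(timer.time_by_activity)
```

Here A is charged 2 time units alone plus half of the 4 shared units, and B is charged half of the shared units plus 1 unit alone, so the 7 units of timer time are fully accounted for.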
Summary of Quanto energy profiling architecture • Application: annotate code with activity labels; log and process activity/power data • Interfaces: <PowerStateTrack>, <SingleActivityTrack>, <MultiActivityTrack>, <EnergyMeter> • Operating System: propagate labels to/from devices; propagate labels over deferred computations • Interfaces: <PowerState>, <SingleActivity>, <MultiActivity> • Device Drivers: save/restore/expose activity; monitor/expose power states • Hardware: meter energy usage
Interesting Findings: catching power bugs. Why is TIMERA firing at 16 Hz?!?
How much time and energy does using DMA really save? Using DMA can subvert MAC layer fairness
What’s the cost of false alarms in Low-Power Listening? [Figure: LPL timeline with TX preamble and data (D), RX listen interval Tlisten, and noise] Overhearing adds significant unpredictability to node lifetime.
Recording: log, export, and post-process data • Challenge is getting data off the node • Reason most applications are still toys • Options: • Continuous: Parallel Out + Ethernet • Burst: 10KB RAM Log + UART Out • Burst: 128K FIFO Log + UART Out • Continuous: Compression + UART Out
Logging Solutions • Log to RAM: • 12 bytes per logged event • Buffer for 700 events • Dump to UART once buffer is full • Log to parallel port • Fast, real-time logging • Not scalable to multiple nodes • Log to the UART, compressed • Blink running at 50ms, no compression: logging takes 80.9% of CPU time • With compression: 25% of CPU time, compression of 3.84X (each log entry ~ 3.1 bytes) • Not enough for larger applications • RadioCountToLeds, 2 nodes @100ms: • 427 ev/s • With LPL: 1617 ev/s
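The figures above can be checked with a little arithmetic. The sketch below uses only numbers from the slide (12 bytes per event, a 700-event buffer, 3.84X compression, 427 and 1617 events per second); the function names are hypothetical.

```python
BYTES_PER_EVENT = 12
BUFFER_EVENTS = 700
COMPRESSION_RATIO = 3.84

# RAM needed for the uncompressed log buffer.
buffer_bytes = BYTES_PER_EVENT * BUFFER_EVENTS
# Average size of a log entry after compression.
compressed_entry_bytes = BYTES_PER_EVENT / COMPRESSION_RATIO


def seconds_until_full(events_per_second):
    """How long the RAM buffer lasts at a given event rate."""
    return BUFFER_EVENTS / events_per_second


def raw_bytes_per_second(events_per_second):
    """Uncompressed log data produced per second."""
    return events_per_second * BYTES_PER_EVENT


print(buffer_bytes, compressed_entry_bytes)
for rate in (427, 1617):  # RadioCountToLeds without and with LPL
    print(rate, seconds_until_full(rate), raw_bytes_per_second(rate))
```

The buffer takes 8400 bytes, a compressed entry averages 3.125 bytes (the slide's "~3.1 bytes"), and at 427 events per second the 700-event buffer fills in under two seconds, which is why burst logging does not scale to larger applications.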
Logging Solutions • Don’t trace: count • Step 1: online accounting of activities • Time per activity per resource • Almost ready • Step 2: online regression • Accumulating power state info • Doing regression online • Current Status • The next release of Quanto will have online accounting of the activity times
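Counting instead of tracing can be sketched as a fixed-size table that is updated on each label change. This is a minimal illustration of online accounting of time per activity per resource, not the planned Quanto implementation: the class, the timestamps, and the `ACT_IDLE` label are hypothetical.

```python
from collections import defaultdict


class OnlineAccounting:
    """Accumulate time per (resource, activity) instead of logging every event."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.current = {}  # resource -> (activity, start time)

    def set_activity(self, resource, activity, now):
        previous = self.current.get(resource)
        if previous is not None:
            prev_activity, since = previous
            # Charge the elapsed time to the label that was active until now.
            self.totals[(resource, prev_activity)] += now - since
        self.current[resource] = (activity, now)


acct = OnlineAccounting()
acct.set_activity("CPU", "ACT_LED0", 0.0)
acct.set_activity("CPU", "ACT_LED1", 0.25)
acct.set_activity("CPU", "ACT_LED0", 0.75)
acct.set_activity("CPU", "ACT_IDLE", 1.0)
print(dict(acct.totals))
```

Memory use depends on the number of (resource, activity) pairs rather than on the number of events, which is the point of counting rather than tracing.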
Hands-on Session • Instrument, record data, process, visualize • Basic: Blink • Cross-network: RadioCountToLeds • If we have time: • LPL: Bounce • Another example: FTSP