Minimizing Expected Energy Consumption in Real-Time Systems through Dynamic Voltage Scaling. Ruibin Xu, Daniel Mossé, and Rami Melhem
Problem Theme • Context: frame-based hard real-time systems • Given: one or more tasks with the same period (deadline = period), a given order of execution, and a probability distribution of execution cycles for each task • One processor with DVS support • Goal: schedule the tasks (time allocation and speed) to minimize expected energy consumption
Problems & System Models • Problems: Intra-Task DVS, Inter-Task DVS, Hybrid • System models: Ideal, Realistic
Problems • Intra-Task DVS: only one task; compute the speed of each cycle or group of consecutive cycles • Inter-Task DVS: multiple tasks and their order of execution; compute the fraction of remaining time to allot to each task; at run time, the speed changes only at task boundaries • Hybrid: combine intra-task and inter-task DVS
System Models • Ideal model: unrestricted continuous speed; no time or energy overhead for changing speed; well-defined power-frequency relation, p(f) = c_0 + c_1·f^α • Realistic model: predefined set of discrete speeds; changing speed incurs time and energy overhead; no assumption on the power-frequency relation
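Intra-Task DVS + Ideal System • Minimize [expected-energy objective; equation not preserved in the source], where [definitions not preserved] • Subject to [constraint not preserved] • Optimal solution is [equation not preserved] • Algorithms: PACE, GRACE.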
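The equations on this slide did not survive, so the following is only a sketch of the kind of schedule it describes, under stated assumptions. It assumes the ideal model with c_0 = 0, so p(f) = c_1·f^α and one cycle at speed f costs c_1·f^(α-1). Writing F_i = 1 - cdf(i-1) for the probability that cycle i is executed, minimizing the sum of F_i·c_1·f_i^(α-1) subject to the worst-case time (the sum of 1/f_i) equaling the deadline D gives f_i proportional to F_i^(-1/α). The function names and the sample distribution are hypothetical and are not taken from the slides.

```python
def pace_speeds(prob_cycles, deadline, alpha=3.0):
    """Per-cycle speeds for the assumed power model p(f) = c1 * f**alpha.

    prob_cycles[k] is the assumed probability that the task needs exactly
    k + 1 cycles; the worst case is len(prob_cycles) cycles and the last
    probability must be positive.
    """
    tail = []              # tail[i] = probability that cycle i+1 executes
    remaining = 1.0
    for p in prob_cycles:
        tail.append(remaining)
        remaining -= p
    # Time per cycle is proportional to F**(1/alpha); speed is the inverse.
    weights = [F ** (1.0 / alpha) for F in tail]
    total = sum(weights)
    return [total / (deadline * w) for w in weights]


def expected_energy(speeds, prob_cycles, c1=1.0, alpha=3.0):
    """Expected energy: each cycle's cost weighted by its execution probability."""
    remaining, energy = 1.0, 0.0
    for f, p in zip(speeds, prob_cycles):
        energy += remaining * c1 * f ** (alpha - 1)
        remaining -= p
    return energy


probs = [0.25, 0.25, 0.25, 0.25]      # hypothetical: 1 to 4 cycles, equally likely
speeds = pace_speeds(probs, deadline=4.0)
worst_case_time = sum(1.0 / f for f in speeds)
constant = [1.0] * 4                  # constant speed that also meets the deadline
```

In this sketch the speed rises from cycle to cycle: early cycles, which are almost certain to run, execute slowly, and the rarely reached late cycles run fast enough to keep the worst case within D.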
Intra-Task DVS + Realistic System • First approach: patch the solution obtained under the ideal system • GRACE: round the speed up to the closest discrete frequency • PACE: round the speed up or down to the closest discrete frequency • Problems: the patched schedule can miss the deadline, and it ignores speed-change overhead • PACE: scan all phases and adjust the speed, subtracting the maximum time penalty from the allotted time
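The two rounding patches can be contrasted with a small sketch. The frequency levels, ideal speeds, and cycle counts below are made-up values, and `round_nearest` is only one assumed reading of "round up or down to the closest discrete frequency".

```python
from bisect import bisect_left


def round_up(speed, levels):
    """Smallest available frequency >= speed (levels sorted ascending)."""
    idx = bisect_left(levels, speed)
    if idx == len(levels):
        raise ValueError("requested speed exceeds the maximum frequency")
    return levels[idx]


def round_nearest(speed, levels):
    """Closest available frequency, which may be lower than requested."""
    return min(levels, key=lambda f: abs(f - speed))


def finish_time(cycles, speeds):
    """Worst-case completion time when each group runs at its own speed."""
    return sum(c / f for c, f in zip(cycles, speeds))


levels = [0.4, 0.6, 0.8, 1.0]      # hypothetical normalized frequencies
ideal = [0.45, 0.65, 0.85]         # hypothetical ideal-model speeds per cycle group
cycles = [10, 10, 10]              # hypothetical cycles per group

up = [round_up(f, levels) for f in ideal]
nearest = [round_nearest(f, levels) for f in ideal]
deadline = finish_time(cycles, ideal)   # the ideal schedule exactly meets this
```

With these numbers, rounding up finishes before the ideal schedule's completion time, while rounding to the nearest level rounds every speed down and overruns it. That is the deadline-miss problem listed on the slide. Neither helper accounts for speed-change overhead.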
Intra-Task DVS + Realistic System • Second approach: design the DVS scheme for the realistic system model directly (PPACE) • Accounts for the time penalty of a speed change • Accounts for the energy penalty of a speed change • Given r transition points, partition the range of execution cycles [1, W] into r+1 phases: [b_0, b_1 - 1], [b_1, b_2 - 1], ..., [b_r, b_{r+1} - 1], where b_0 = 1 and b_{r+1} = W + 1 • Not necessarily the optimal partitioning!
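The partitioning rule above can be sketched in a few lines. The helper name `phases` and the sample transition points are illustrative assumptions.

```python
def phases(transition_points, wcec):
    """Split cycles [1, wcec] into r + 1 phases given r transition points.

    transition_points are b_1 .. b_r; b_0 = 1 and b_{r+1} = wcec + 1 are
    added here, so phase k covers cycles [b_k, b_{k+1} - 1].
    """
    bounds = [1] + list(transition_points) + [wcec + 1]
    if any(lo >= hi for lo, hi in zip(bounds, bounds[1:])):
        raise ValueError("transition points must be strictly increasing and lie in [2, W]")
    return [(lo, hi - 1) for lo, hi in zip(bounds, bounds[1:])]


example = phases([4, 8], wcec=10)   # hypothetical: r = 2, W = 10
```

Each phase would then run at a single frequency, so a speed change (and its time and energy penalty) can occur only at a transition point.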
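Intra-Task DVS + Realistic System • [Figure: cdf(x), ranging from 0 to 1, plotted against execution cycles x; transition points b_0 = 1, b_1, b_2, b_3, ..., b_r mark the phases, which run at frequencies f_0, f_1, f_2, ..., f_r] *Graph borrowed from a presentation by Ruibin Xu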
Intra-Task DVS + Realistic System • Minimize [objective not preserved in the source] • Subject to [constraint not preserved] • Where [definitions not preserved]
Intra-Task DVS + Realistic System • An energy-time label l is a 2-tuple (e, t), where e and t denote energy and time, respectively *Graph borrowed from a presentation by Ruibin Xu
Intra-Task DVS + Realistic System • [Animation: nodes v_0, v_1, v_2, v_3] • |LABEL(0)| = 1, |LABEL(1)| = M, |LABEL(2)| = M^2, |LABEL(3)| = M^3 • Exponential growth! *Animation borrowed from a presentation by Ruibin Xu
Intra-Task DVS + Realistic System • Use approximation • Approximations that preserve optimality but do not guarantee polynomial running time • Approximation based on a factor ε, so that the solution is within a factor of (1+ε) of optimal and the running time is polynomial
Intra-Task DVS + Realistic System • [Animation: nodes v_0, v_1, v_2, v_3] • Without approximation: |LABEL(0)| = 1, |LABEL(1)| = M, |LABEL(2)| = M^2, |LABEL(3)| = M^3 • With approximation: |LABEL(1)| << M and |LABEL(2)| << M^2, hopefully! *Animation borrowed from a presentation by Ruibin Xu
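The label sets and the two kinds of pruning can be sketched as follows. This is a generic label-propagation sketch, not the authors' exact scheme: each phase offers M hypothetical (energy, time) options, dominance pruning drops labels that are worse in both coordinates, and the optional `thin` step keeps one label per geometric energy bucket. The bucket ratio `delta` and all numbers are assumptions, and the sketch makes no claim about the resulting (1+ε) bound.

```python
import math
from itertools import product


def extend(labels, options):
    """Every label combined with every option: up to M times more labels."""
    return [(e + de, t + dt) for (e, t) in labels for (de, dt) in options]


def prune_dominated(labels, deadline):
    """Keep feasible labels that no other label beats in both energy and time."""
    kept, best_time = [], float("inf")
    for e, t in sorted(set(labels)):           # ascending energy, then time
        if t <= deadline and t < best_time:
            kept.append((e, t))
            best_time = t
    return kept


def thin(labels, delta):
    """Keep only the fastest label in each energy bucket of ratio 1 + delta."""
    buckets = {}
    for e, t in labels:
        key = math.floor(math.log(e, 1 + delta)) if e > 0 else -1
        if key not in buckets or t < buckets[key][1]:
            buckets[key] = (e, t)
    return sorted(buckets.values())


def solve(phase_options, deadline, delta=None):
    labels = [(0.0, 0.0)]                      # LABEL(0) holds a single label
    for options in phase_options:
        labels = prune_dominated(extend(labels, options), deadline)
        if delta is not None:
            labels = thin(labels, delta)
    return labels


def brute_force(phase_options, deadline):
    """Enumerate all M**n combinations; only for checking tiny examples."""
    best = float("inf")
    for combo in product(*phase_options):
        if sum(t for _, t in combo) <= deadline:
            best = min(best, sum(e for e, _ in combo))
    return best


options = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]   # hypothetical (energy, time), M = 3
phase_options = [options] * 3
exact = solve(phase_options, deadline=7.0)
approx = solve(phase_options, deadline=7.0, delta=0.5)
```

Here the unpruned label set would hold 3^3 = 27 labels after three phases, while the dominance-pruned set is much smaller and still contains the minimum-energy feasible label.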
Inter-Task DVS + Ideal System • Algorithm: OITDVS • [Animation: a frame of length D divided into β_1·D and (1-β_1)·D, with the slack and β_1 labeled] *Animation borrowed from a presentation by Ruibin Xu
Inter-Task DVS + Ideal System • [Animation: tasks T1, T2, T3, T4 with fractions β_1 = xx%, β_2 = xx%, β_3 = xx% (each task versus the tasks after it) and β_4 = 100%] *Animation borrowed from a presentation by Ruibin Xu
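How the fractions are used at run time can be sketched as below, assuming the ideal model (any speed is available, no switching overhead). The β values here are placeholders chosen for illustration; the sketch does not show how OITDVS derives them from the execution-cycle distributions.

```python
def run_frame(wcec, actual, betas, deadline):
    """Simulate one frame and return (speeds, finish_time).

    wcec[i]   worst-case cycles of task i
    actual[i] cycles task i really uses in this frame (<= wcec[i])
    betas[i]  fraction of the remaining time allotted to task i;
              the last task gets 100% of what is left
    """
    now, speeds = 0.0, []
    for w, a, beta in zip(wcec, actual, betas):
        allotted = beta * (deadline - now)   # share of the time still available
        f = w / allotted                     # just fast enough for the worst case
        speeds.append(f)
        now += a / f                         # early completion leaves slack behind
    return speeds, now


wcec = [100, 100, 100, 100]                  # hypothetical worst-case cycles
betas = [0.25, 1 / 3, 0.5, 1.0]              # hypothetical fractions, last is 100%
speeds, finish = run_frame(wcec, [50, 100, 25, 100], betas, deadline=40.0)
_, worst_finish = run_frame(wcec, wcec, betas, deadline=40.0)
```

Because each task runs fast enough to cover its worst case within its allotment, the frame finishes by the deadline even when every task takes its worst-case cycles. Slack left by an early finisher is shared among the later tasks, which can then run more slowly.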
Inter-Task DVS + Realistic System • Use the ideal system to compute the fraction of time allotted to each task • Patch the solution to work on the realistic system: • Before computing the speed of a task, subtract from the remaining time the maximum possible time penalty for all remaining speed changes • Make sure each task runs at one of the discrete speed steps • Algorithms: PITDVS, PITDVS2
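The two patches can be sketched as a single speed-selection step. This is an illustrative reading, not the PITDVS code: it assumes at most one speed change per remaining task (including the current one) and a single worst-case switching time. The function and parameter names are hypothetical.

```python
from bisect import bisect_left


def pick_speed(wcec, beta, remaining_time, tasks_left, max_switch_time, levels):
    """Discrete speed for the next task under the patched inter-task scheme.

    tasks_left counts the current task and all later ones; one speed change
    per remaining task is an assumption of this sketch.
    """
    # Reserve the worst-case switching time before sharing out the rest.
    usable = remaining_time - tasks_left * max_switch_time
    if usable <= 0:
        raise ValueError("no time left after reserving switching overhead")
    ideal = wcec / (beta * usable)            # speed under the ideal model
    idx = bisect_left(levels, ideal)          # round up to a discrete step
    if idx == len(levels):
        raise ValueError("worst case does not fit even at the top frequency")
    return levels[idx]


levels = [2.0, 4.0, 6.0, 8.0]                 # hypothetical sorted frequency steps
speed = pick_speed(wcec=100, beta=0.5, remaining_time=50.0,
                   tasks_left=2, max_switch_time=1.0, levels=levels)
```

Rounding up rather than to the nearest step keeps the worst case within the allotted time, at the cost of some extra energy.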
Hybrid (Intra + Inter-Task DVS) + Ideal System • Combine intra and inter-task DVS. • Compute fractions per cycle per task, instead of just per task. • Algorithm: GOPDVS
Hybrid (Intra + Inter-Task DVS) + Realistic System • Two approaches • First (PGOPDVS): • Compute time allocation fraction per phase (instead of cycle) per task. • From the above fractions, compute fraction per task. • At run time, use task fractions to allot time to each task. • Compute task speed by applying patches as in inter-task DVS
Hybrid (Intra + Inter-Task DVS) + Realistic System • Second (PIT-PPACE): • Compute the time allocation fraction per task (inter-task DVS) • At run time, use intra-task DVS to compute the speed schedule of each task according to its allotted time • Because the above step is time-consuming, a set of intra-task DVS solutions can be computed for each task in advance and the best one applied at run time
Energy-Aware Scheduling for Streaming Applications on Chip Multiprocessors. Ruibin Xu, Rami Melhem, and Daniel Mossé
Problem • Streaming applications operate on streams of data and are compute-intensive. Examples: video streaming, automatic target recognition • A stream of data can be abstracted as a sequence of requests • Thus, a streaming application can be modeled as a periodic task in a real-time system • QoS requirements: throughput (T) and deadline (D)
Problem • Streaming applications are highly parallelizable, so they can be run on chip multiprocessors (CMPs) • CMPs support: • Turning off cores to reduce leakage • DVS to reduce dynamic energy consumption • Goal: schedule tasks to minimize energy consumption while meeting the QoS requirements
Models • The application is modeled as a DAG in which nodes represent tasks and edges represent precedence relations and communication requirements • Communication cost of transferring B bits: • Delay is [formula not preserved in the source] • Energy is [formula not preserved in the source] • Each processor has M discrete frequency steps
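Effect of Static Power • Power function of a processor core: [formula not preserved in the source] • Consider Y-oriented load only • Assume a job consists of c cycles and we use y cores; then each core is assigned [formula not preserved] cycles and runs at speed [formula not preserved] • Energy consumption is [formula not preserved] • To minimize energy consumption: [formula not preserved] • A similar result holds for X-oriented load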
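Since the slide's formulas are missing, the trade-off can only be illustrated under assumptions that are not taken from the source. The per-core power is taken to be p(f) = p_s + c_1·f^α, the c cycles are split evenly, every active core runs for the whole time budget t at speed c/(y·t), and unused cores are switched off. Under those assumptions the energy is E(y) = y·t·(p_s + c_1·(c/(y·t))^α).

```python
def energy(y, cycles, t, p_static, c1=1.0, alpha=3.0):
    """Energy of y cores sharing `cycles` evenly over time t (assumed model)."""
    f = cycles / (y * t)                       # per-core speed that just fits in t
    return y * t * (p_static + c1 * f ** alpha)


def best_core_count(max_cores, cycles, t, p_static, c1=1.0, alpha=3.0):
    """Try every core count and keep the one with the least energy."""
    return min(range(1, max_cores + 1),
               key=lambda y: energy(y, cycles, t, p_static, c1, alpha))


# Hypothetical numbers: 8 cycle units, time budget 1, up to 16 cores.
low_static = best_core_count(16, cycles=8.0, t=1.0, p_static=1.0)
high_static = best_core_count(16, cycles=8.0, t=1.0, p_static=16.0)
```

With low static power, spreading the work over many slow cores wins. As static power grows, each extra active core costs more and the best core count drops.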
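Scheduling for Y-Oriented Load • The solution has a recursive nature • Let [symbol not preserved in the source] denote the optimal scheduling of tasks i through n with end-to-end delay t • [Figure: tasks i, i+1, ..., j take time d and the remaining tasks take t-d; other labels: "Computes single value of", WCEC, running time, q, energy]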
Scheduling for Y-Oriented Load • We need to consider all M frequencies and all n-i+1 possible mappings of consecutive tasks, starting from task i, to the first stage.
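The recursion can be sketched as a memoized search. This is a simplified illustration and not the paper's recurrence. It treats the tasks as a chain, assumes each stage is a single core running consecutive tasks at one frequency, requires every stage to fit within the period, and ignores communication cost and idle power. The function name, the `power` table, and the numbers are assumptions.

```python
from functools import lru_cache


def schedule_chain(wcec, freqs, power, period, deadline):
    """Return (energy, plan) for a chain of tasks; plan lists (i, j, f) stages."""
    n = len(wcec)

    @lru_cache(maxsize=None)
    def best(i, t):
        """Least energy for tasks i..n-1 within end-to-end delay t."""
        if i == n:
            return 0.0, ()
        result = (float("inf"), ())
        for j in range(i, n):                   # first stage holds tasks i..j
            cycles = sum(wcec[i:j + 1])
            for f in freqs:                     # try all M frequencies
                d = cycles / f                  # running time of this stage
                if d > period or d > t:
                    continue                    # breaks throughput or delay bound
                rest_energy, rest_plan = best(j + 1, t - d)
                total = power[f] * d + rest_energy
                if total < result[0]:
                    result = (total, ((i, j, f),) + rest_plan)
        return result

    return best(0, deadline)


wcec = [4, 4, 8]                                # hypothetical worst-case cycles
power = {1.0: 1.0, 2.0: 8.0}                    # hypothetical power at each frequency
energy, plan = schedule_chain(wcec, (1.0, 2.0), power, period=8.0, deadline=12.0)
```

For each starting task i, the inner loops cover the n-i+1 ways to pick the last task j of the first stage and the M frequencies. The remaining tasks are then solved recursively with the leftover delay t-d.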
Scheduling for X-Oriented Load • Use list scheduling to map tasks to cores in the same stage • Perform speed computation for the tasks of the stage • Use hill climbing to improve the solution
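The first step can be sketched with a standard greedy list scheduler. The priority order (longest task first) and the function name are assumptions, since the slide does not say which priority list is used. The speed computation and hill-climbing steps are not shown.

```python
import heapq


def list_schedule(cycles, num_cores):
    """Assign each task to the currently least-loaded core, longest task first.

    Returns {core: [task indices]}; independent tasks within one stage are assumed.
    """
    heap = [(0, core) for core in range(num_cores)]   # (load, core id)
    heapq.heapify(heap)
    assignment = {core: [] for core in range(num_cores)}
    for task in sorted(range(len(cycles)), key=lambda k: -cycles[k]):
        load, core = heapq.heappop(heap)              # least-loaded core so far
        assignment[core].append(task)
        heapq.heappush(heap, (load + cycles[task], core))
    return assignment


cycles = [7, 5, 4, 3, 1]                              # hypothetical task lengths
mapping = list_schedule(cycles, num_cores=2)
loads = {core: sum(cycles[k] for k in tasks) for core, tasks in mapping.items()}
```

A balanced mapping matters here because the most heavily loaded core determines the speed the stage needs in order to finish on time.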
Simulation Results • Comparing Scheduling2D against a baseline • Baseline algorithm: • Period = deadline • Try all possible numbers of cores to find the minimum energy consumption • Use the ETF heuristic to perform task mapping • Use a convex programming approach to obtain the execution speed of each task given the task mapping
Simulation Results • Percentage of static power in total power: • 70nm: 22% • 50nm: 44% • 35nm: 67% • As static power increases, the energy savings obtained by Scheduling2D decrease
Simulation Results • As period increases, energy savings decrease • For a given period, increasing deadline will initially result in increased energy savings