Ensieea Rizwani. Green Scheduling: A Scheduling Policy for Improving the Energy Efficiency of Fair Scheduler. By: Tao Zhu1,2, Chengchun Shu1, Haiyan Yu1
Motivation Reducing the energy consumption of data centers is critical to cutting operational costs as well as minimizing their impact on the environment. On one hand, if the performance per watt of servers does not improve, power cost could easily overtake hardware cost. On the other hand, CO2 emissions of global data centers will reach up to 259 million tons by 2020, which will accelerate global warming.
Outline • Introduction • Overview • Power Conservation Mechanism • Structure • Simulation and Measurement • Conclusion • Related Work
In the last few years, a lot of effort has been devoted to improving the energy efficiency of data centers. • Hardware (efficient building blocks) • Reference to last presentation • Software techniques At the software level, the goal is to improve the energy efficiency of MapReduce. MapReduce has been the dominant framework deployed in data centers for processing large data sets: by 2010, Google processed approximately 1000 PB of data daily using MapReduce [11], and Yahoo had 38000 servers running Hadoop (an open-source implementation of MapReduce) in production [12]. Improving its energy efficiency will therefore help reduce data center energy consumption.
Data Center Fact Servers in data centers are not power-proportional (the energy consumed is not proportional to the work completed). In our experiments, the slave consumes 54.5 W at idle and 87.5 W at peak utilization. A server's energy efficiency therefore improves as utilization increases and is highest at peak utilization.
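The effect of non-proportional power draw can be illustrated with a small calculation. The sketch below uses the 54.5 W idle and 87.5 W peak figures from the slide; the linear interpolation between them and the "work per watt" metric are assumptions made for illustration, not measurements from the paper.

```python
# Idle and peak power of one slave, as reported on the slide (watts).
P_IDLE_W = 54.5
P_PEAK_W = 87.5


def power_watts(utilization: float) -> float:
    """ASSUMPTION: power rises linearly from idle to peak with utilization."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be in [0, 1]")
    return P_IDLE_W + (P_PEAK_W - P_IDLE_W) * utilization


def work_per_watt(utilization: float) -> float:
    """Illustrative efficiency metric: fraction of peak work done per watt drawn."""
    return utilization / power_watts(utilization)


if __name__ == "__main__":
    for u in (0.1, 0.25, 0.5, 0.75, 1.0):
        print(f"utilization={u:.2f}  power={power_watts(u):5.1f} W  "
              f"work/W={work_per_watt(u):.4f}")
```

Under this assumed model, the large idle power means a lightly loaded slave pays almost the full power bill for little work, so the work-per-watt figure keeps rising until utilization reaches its peak.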
Management System of HPC MapReduce's energy efficiency is closely tied to its scheduler. We find that the fair scheduler outperforms the FIFO scheduler in energy efficiency when a CPU-intensive job and an IO-intensive job run simultaneously on the cluster, because the fair scheduler achieves better resource utilization by overlapping resource-complementary tasks on slaves. We propose an energy-efficient scheduling policy called green scheduling, which relaxes fairness slightly to create as many opportunities as possible for overlapping resource-complementary tasks. The results show that green scheduling reduces the energy consumption of the fair scheduler by 7% to 9%.
We believe the energy saving results from the better resource utilization that the fair scheduler achieves by overlapping CPU-intensive and IO-intensive tasks on slaves. The two types of tasks are complementary: an IO-intensive task leaves the CPU idle, so running a CPU-intensive task alongside it increases CPU utilization. The effect on IO is the mirror image: a CPU-intensive task leaves IO idle, while an IO-intensive task keeps IO busy.
Simulation to Validate We compare our cluster's CPU and IO utilization under the FIFO scheduler and the fair scheduler when the CPU-intensive job Pi estimator and the IO-intensive job RandomWriter run simultaneously. Experimental results are shown in Figure 1. Under the FIFO scheduler, CPU utilization fluctuates between 60% and 100% while IO utilization stays below 10% until Pi estimator finishes. After RandomWriter starts, CPU utilization drops dramatically and IO utilization increases significantly. In contrast, the fair scheduler keeps both CPU and IO at high utilization for the duration of the two jobs. Clearly, the fair scheduler leads to better resource utilization than the FIFO scheduler.
This motivates us to propose an energy-efficient scheduling policy called green scheduling: when a slave asks for a new task, if the loss of fairness is within a permissible range, our scheduler chooses the job whose resource requirement is the most complementary to the slave's current resource utilization, maximizing the slave's utilization while having minimal impact on fairness.
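One way to picture "most complementary" is a score that rewards a job for using capacity the slave currently leaves idle. The sketch below is a hypothetical illustration only: the `JobProfile` fields, the `complementarity` score, and the utilization numbers are assumptions, and the slides do not state the formula the authors actually use.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class JobProfile:
    """Hypothetical per-task resource demand of a job, each in [0, 1]."""
    name: str
    cpu_demand: float
    io_demand: float


def complementarity(job: JobProfile, slave_cpu: float, slave_io: float) -> float:
    """ASSUMED score: how much of the slave's idle CPU and IO one task would fill."""
    idle_cpu = 1.0 - slave_cpu
    idle_io = 1.0 - slave_io
    return min(job.cpu_demand, idle_cpu) + min(job.io_demand, idle_io)


def pick_most_complementary(jobs: Sequence[JobProfile],
                            slave_cpu: float,
                            slave_io: float) -> JobProfile:
    """Return the job whose demand best matches the slave's idle resources."""
    return max(jobs, key=lambda j: complementarity(j, slave_cpu, slave_io))


if __name__ == "__main__":
    jobs = [JobProfile("pi-estimator", cpu_demand=0.9, io_demand=0.05),
            JobProfile("random-writer", cpu_demand=0.1, io_demand=0.8)]
    # A slave that is CPU-busy but IO-idle should receive the IO-heavy job.
    print(pick_most_complementary(jobs, slave_cpu=0.9, slave_io=0.1).name)
```

The fairness check mentioned in the text is deliberately left out here; this sketch covers only the selection step that runs once relaxing fairness is permitted.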
Priority The default scheduler in Hadoop is the FIFO scheduler. All running jobs are sorted and queued according to their priority and submit time. Five priority levels are defined: • very high • high • normal • low • very low When a slave is ready to accept a new task, the FIFO scheduler always picks the first job in the queue and assigns one of its tasks to the slave. Note: UB Data center CCR implements group priority
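The queue ordering described above can be sketched as a sort on the pair (priority, submit time). This is a minimal illustration, not Hadoop's actual code: the `Priority` enum, the `Job` class, and both function names are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional, Sequence


class Priority(IntEnum):
    """The five levels from the slide; a lower value is served first."""
    VERY_HIGH = 0
    HIGH = 1
    NORMAL = 2
    LOW = 3
    VERY_LOW = 4


@dataclass(frozen=True)
class Job:
    name: str
    priority: Priority
    submit_time: float  # e.g. seconds since some reference point


def fifo_order(jobs: Sequence[Job]) -> list:
    """Sort by priority first, then by submit time (earlier first)."""
    return sorted(jobs, key=lambda j: (j.priority, j.submit_time))


def next_job(jobs: Sequence[Job]) -> Optional[Job]:
    """The job at the head of the queue, or None if there are no jobs."""
    queue = fifo_order(jobs)
    return queue[0] if queue else None


if __name__ == "__main__":
    jobs = [Job("a", Priority.NORMAL, 0.0),
            Job("b", Priority.HIGH, 5.0),
            Job("c", Priority.NORMAL, 1.0)]
    print([j.name for j in fifo_order(jobs)])
```

Because the head of the queue is served until it has no more tasks to hand out, a later job of equal or lower priority waits behind it regardless of how short it is.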
Starvation One drawback of the FIFO scheduler is its poor response time. Consider a concrete example: • Job i at time t, duration: 3 days • Job j at time t+1, duration: 10 min Under the FIFO scheduler, the response time of job j is almost 433 times its duration. To address this problem, the fair scheduler was proposed, which assigns each job a certain share to avoid starvation.
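The factor of roughly 433 follows from simple arithmetic: job j waits for the remainder of job i (3 days, i.e. 4320 minutes) and then runs for 10 minutes. The sketch below reproduces that calculation; the function name is hypothetical, and because the slide does not give the unit of "t+1", the arrival gap is left as a parameter that defaults to zero.

```python
MINUTES_PER_DAY = 24 * 60


def fifo_response_ratio(first_job_min: float,
                        second_job_min: float,
                        arrival_gap_min: float = 0.0) -> float:
    """Response time of the second job divided by its own duration.

    ASSUMPTION: the first job occupies the whole cluster, so the second
    job cannot start until the first one has finished.
    """
    wait = max(first_job_min - arrival_gap_min, 0.0)
    response = wait + second_job_min
    return response / second_job_min


if __name__ == "__main__":
    ratio = fifo_response_ratio(first_job_min=3 * MINUTES_PER_DAY,
                                second_job_min=10)
    print(f"response time is {ratio:.1f} times the job duration")
```

With a small nonzero arrival gap the ratio falls slightly below 433, which matches the slide's wording of "almost 433 times".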
IV. GREEN SCHEDULING The fair scheduler is often more energy efficient than the FIFO scheduler when complementary jobs run simultaneously on the cluster. However, it does not take the slaves' and tasks' resource utilization into account when scheduling jobs. To investigate the opportunity to improve the energy efficiency of the fair scheduler, we analyze slot allocation on one slave under FIFO and fair sharing.
D. Green Scheduling To achieve better energy efficiency, green scheduling takes into account the slave's resource utilization and the task's resource utilization when choosing which job to schedule next. However, this may violate the primary design goal of the fair scheduler: fairness. To minimize the impact on fairness, we consider the slave's resource utilization as a factor in choosing a job only in two scenarios: both jobs are needy, or neither is needy. The justification is that in these two scenarios the two jobs have received relatively fair shares. In the scenario where one job is needy and the other is not, the shares the two jobs have received are already clearly unfair, so relaxing fairness there would aggravate the unfairness.
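The two-scenario rule above can be written as a small gate in front of the utilization-based choice. The sketch below is hypothetical: the slides do not define "needy", so it is assumed here to mean a job running fewer tasks than its fair share, and the mixed case is assumed to fall back to serving the needy job. `JobState`, `is_needy`, and `choose_job` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobState:
    name: str
    running_tasks: int
    fair_share: float  # number of slots the job is entitled to


def is_needy(job: JobState) -> bool:
    """ASSUMPTION: a job is needy when it runs fewer tasks than its share."""
    return job.running_tasks < job.fair_share


def choose_job(job_a: JobState, job_b: JobState,
               utilization_pick: JobState) -> JobState:
    """Pick the next job for a slave that asked for a task.

    utilization_pick is whichever of the two jobs better complements the
    slave's current resource utilization (computed elsewhere).
    """
    a_needy, b_needy = is_needy(job_a), is_needy(job_b)
    if a_needy == b_needy:
        # Both needy or neither needy: shares are relatively fair,
        # so the utilization-based choice is allowed.
        return utilization_pick
    # Exactly one job is needy: keep fairness (assumed fallback).
    return job_a if a_needy else job_b


if __name__ == "__main__":
    cpu_job = JobState("cpu-job", running_tasks=2, fair_share=4)
    io_job = JobState("io-job", running_tasks=6, fair_share=4)
    # Only cpu-job is needy, so it wins even if io-job fits the slave better.
    print(choose_job(cpu_job, io_job, utilization_pick=io_job).name)
```

Keeping the gate separate from the utilization scoring makes it easy to see that fairness is only ever relaxed when the two jobs are in the same state.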
A MapReduce job usually consists of a set of map tasks and reduce tasks. For simplicity, we only consider scheduling map tasks to achieve better utilization.
Conclusion This paper presented a new scheduling policy called green scheduling to improve the energy efficiency of the fair scheduler. Knowing the job's resource requirement and the slave's resource utilization, green scheduling can create as many opportunities as possible for overlapping CPU-intensive and IO-intensive tasks. The key insight is that overlapping complementary tasks achieves better energy efficiency as well as better utilization. We perform an evaluation using different workloads that consist of CPU-intensive and IO-intensive jobs, and the results show that fair sharing with green scheduling reduces energy consumption by 7%-9% compared with naïve fair sharing.
Related Work • Energy efficiency of Hadoop: Chen et al. [5] • Overlapping CPU-intensive jobs with IO-intensive jobs in scheduling leads to better resource utilization: Wiseman et al. [17]