
Research on Embedded Hypervisor Scheduler Techniques


Presentation Transcript


  1. Research on Embedded Hypervisor Scheduler Techniques 2014/10/02

  2. Background • Asymmetric multi-core is becoming increasingly popular over homogeneous multi-core systems. • An asymmetric multi-core platform consists of cores with different capabilities. • ARM: big.LITTLE architecture. • Qualcomm: asynchronous Symmetric Multi-Processing (aSMP). • Nvidia: variable Symmetric Multiprocessing (vSMP). • …etc.

  3. Motivation • Scheduling goals differ between homogeneous and asymmetric multi-core platforms. • Homogeneous multi-core: load balancing. • Distribute workloads evenly in order to obtain maximum performance. • Asymmetric multi-core: maximize power efficiency with modest performance sacrifices.

  4. Motivation (Cont.) • New scheduling strategies are needed for asymmetric multi-core platforms. • The power and computing characteristics vary across different types of cores. • The scheduler should take these differences into consideration.

  5. Project Goal • Research the current scheduling algorithms for homogeneous and asymmetric multi-core architectures. • Design and implement a hypervisor scheduler for asymmetric multi-core platforms. • Assign virtual cores to physical cores for execution. • Minimize power consumption while guaranteeing performance.

  6. Hypervisor Architecture with VMI • [Architecture diagram, summarized.] Several guests (GUEST1 with low, GUEST2 with high computing resource requirements) each run an Android framework on a Linux kernel (e.g. Linaro) with its own scheduler; guest tasks carry CPU masks such as [1|0] and [0|1]. • The VM Introspector (VMI) gathers task information from the guest OS. • A Task-to-vCPU Mapper modifies the CPU mask of each task according to the task information from the VMI; a vCPU on which only tasks with low computing requirements are scheduled is treated as a LITTLE core. • The b-L vCPU scheduler in the hypervisor schedules big vCPUs to the ARM Cortex-A15 (performance) and LITTLE vCPUs to the Cortex-A7 (power-saving).

  7. Hypervisor Scheduler • Assigns the virtual cores to physical cores for execution. • Determines the execution order and amount of time assigned to each virtual core according to a scheduling policy. • Xen - credit-based scheduler • KVM - completely fair scheduler

  8. Virtual Core Scheduling Problem • For every time period, the hypervisor scheduler is given a set of virtual cores. • Given the operating frequency of each virtual core, the scheduler generates a scheduling plan such that power consumption is minimized and performance is guaranteed.

  9. Scheduling Plan • Must satisfy three constraints. • Each virtual core should run on each physical core for a certain amount of time to satisfy the workload requirement. • A virtual core can run only on a single physical core at any time. • The virtual core should not switch among physical cores frequently, so as to reduce the overheads.

  10. Example of a Scheduling Plan • [Plan table, summarized.] Rows are physical cores Core0–Core3 and columns are execution slices t1, t2, t3, t4, …, t100; each cell names the virtual core (V1–V4) scheduled in that slice, and x marks an idle physical core.

  11. Three-phase Solution • [Phase 1] generates the amount of time each virtual core should run on each physical core. • [Phase 2] determines the execution order of the virtual cores on each physical core. • [Phase 3] exchanges the order of execution slices to reduce the number of core switches.

  12. Phase 1 • Given the objective function and the constraints, we use integer programming to find a_{i,j}, the number of time slices virtual core i should run on physical core j. • A time interval is divided into time slices. • Integer programming finds a feasible solution quickly when the numbers of vCPUs and pCPUs are small constants. A plausible formulation is sketched below.
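The slides leave the exact objective implicit. One plausible sketch of the phase-1 program, where a_{i,j} is the number of slices vCPU i gets on pCPU j, r_i is the number of slices vCPU i requires (derived from its operating frequency), T is the number of slices in the interval, and P_j is the per-slice power cost of pCPU j (r_i, T, and P_j are notation introduced here for illustration, not taken from the slides):

```latex
\begin{aligned}
\min \quad & \textstyle\sum_{i}\sum_{j} P_j \, a_{i,j} \\
\text{s.t.} \quad & \textstyle\sum_{j} a_{i,j} \ge r_i && \text{for every vCPU } i \text{ (workload requirement)} \\
& \textstyle\sum_{i} a_{i,j} \le T && \text{for every pCPU } j \text{ (slices available per interval)} \\
& a_{i,j} \in \mathbb{Z}_{\ge 0}
\end{aligned}
```

The no-two-pCPUs-at-once constraint from slide 9 is deliberately absent here; it is restored by the ordering computed in phase 2.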

  13. Phase 1 (Cont.) • If the relationship between power and load is linear, a greedy method can be used instead: assign each virtual core to the physical core with the lowest power/instruction ratio whose load stays under 100%. A minimal sketch follows.
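A minimal C sketch of that greedy rule; all names, ratios, and loads below are illustrative assumptions, not values from the slides:

```c
#include <stdio.h>

#define NUM_PCPU 4
#define NUM_VCPU 6

/* Hypothetical per-physical-core data; real ratios would come from
 * the measured power model. */
typedef struct {
    double power_per_insn;  /* power/instruction ratio of this core */
    double load;            /* load already assigned, 0.0 .. 1.0    */
} pcpu_t;

/* Greedy rule from the slide: place the virtual core on the physical
 * core with the lowest power/instruction ratio whose load stays
 * under 100%. Returns the chosen pCPU, or -1 if none has room. */
static int assign_vcpu(pcpu_t pcpus[], int n, double vcpu_load)
{
    int best = -1;

    for (int j = 0; j < n; j++) {
        if (pcpus[j].load + vcpu_load > 1.0)
            continue;  /* this core would be overloaded */
        if (best < 0 || pcpus[j].power_per_insn < pcpus[best].power_per_insn)
            best = j;
    }
    if (best >= 0)
        pcpus[best].load += vcpu_load;
    return best;
}

int main(void)
{
    /* Two "big" cores (costly) and two "LITTLE" cores (cheap). */
    pcpu_t pcpus[NUM_PCPU] = {
        { 1.0, 0.0 }, { 1.0, 0.0 }, { 0.4, 0.0 }, { 0.4, 0.0 }
    };
    double vcpu_load[NUM_VCPU] = { 0.5, 0.4, 0.2, 0.2, 0.1, 0.1 };

    for (int i = 0; i < NUM_VCPU; i++)
        printf("vCPU%d -> pCPU%d\n", i,
               assign_vcpu(pcpus, NUM_PCPU, vcpu_load[i]));
    return 0;
}
```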

  14. Phase 2 • With the information from phase 1, the scheduler has to determine the execution order of the virtual cores on each physical core. • A virtual core cannot appear on two or more physical cores at the same time.

  15. Example • Time interval t = 0 to t = 100. Each vCPU is labeled with the time it requires on each of the four physical cores: vCPU0 (50, 40, 0, 0), vCPU1 (20, 20, 20, 20), vCPU2 (10, 10, 20, 20), vCPU3 (10, 10, 20, 20), vCPU4 (10, 10, 10, 10), vCPU5 (0, 0, 10, 10).

  16. Phase 2 (Cont.) • We can formulate the problem as an open-shop scheduling problem (OSSP). • OSSP with preemption can be solved in polynomial time [1]. A toy illustration of the idea follows. [1] T. Gonzalez and S. Sahni. Open shop scheduling to minimize finish time. J. ACM, 23(4):665–679, Oct. 1976.
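The following C sketch illustrates the underlying idea only: repeatedly run, in parallel, at most one distinct virtual core per physical core until the phase-1 matrix is exhausted, which is why no vCPU ever runs on two pCPUs at once. It is not the Gonzalez and Sahni algorithm and does not guarantee a minimal number of distinct slices; the matrix values are made up:

```c
#include <stdio.h>
#include <stdbool.h>

#define NUM_VCPU 4
#define NUM_PCPU 3

/* a[i][j]: remaining slices vCPU i must still run on pCPU j (the
 * phase-1 output; values here are made up). Each round picks at most
 * one vCPU per pCPU, never the same vCPU twice, and runs the picks
 * in parallel for as long as every pick stays valid. */
int main(void)
{
    int a[NUM_VCPU][NUM_PCPU] = {
        { 3, 0, 0 },
        { 2, 1, 0 },
        { 0, 2, 2 },
        { 0, 0, 3 },
    };

    for (int round = 1; ; round++) {
        int pick[NUM_PCPU];
        bool used[NUM_VCPU] = { false };
        bool any = false;

        /* Greedy matching: first unused vCPU with remaining time. */
        for (int j = 0; j < NUM_PCPU; j++) {
            pick[j] = -1;
            for (int i = 0; i < NUM_VCPU; i++) {
                if (!used[i] && a[i][j] > 0) {
                    pick[j] = i;
                    used[i] = true;
                    any = true;
                    break;
                }
            }
        }
        if (!any)
            break;  /* every a[i][j] is zero: plan fully ordered */

        /* Run all picks in parallel for the minimum remaining time,
         * so no assignment overshoots its phase-1 allotment. */
        int len = -1;
        for (int j = 0; j < NUM_PCPU; j++)
            if (pick[j] >= 0 && (len < 0 || a[pick[j]][j] < len))
                len = a[pick[j]][j];

        printf("round %d (length %d):", round, len);
        for (int j = 0; j < NUM_PCPU; j++) {
            if (pick[j] >= 0) {
                a[pick[j]][j] -= len;
                printf("  pCPU%d:vCPU%d", j, pick[j]);
            } else {
                printf("  pCPU%d:idle", j);
            }
        }
        printf("\n");
    }
    return 0;
}
```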

  17. After Phase 1 & 2 • After the first two phases, the scheduler generates a scheduling plan. • [The plan table from slide 10 is repeated: physical cores Core0–Core3 against execution slices t1…t100, with x marking an idle core.]

  18. Phase 3 • Migrating tasks between cores incurs overhead. • Reduce the overhead by reordering the execution slices to minimize the number of core switches.

  19. Number of Switching Minimization Problem • Given a scheduling plan, we want to find an order of the execution slices such that the switching cost is minimized. • The problem is NP-complete, by reduction from the Hamiltonian path problem. • We therefore propose a greedy heuristic, sketched below.
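The slides do not spell the heuristic out. One plausible greedy form, in C: treat each execution slice as a column of per-core assignments, start from an arbitrary slice, and repeatedly append the remaining slice that changes the fewest cores. The slice data and the cost definition below are illustrative assumptions:

```c
#include <stdio.h>
#include <stdbool.h>

#define NUM_PCPU 4
#define NUM_SLICE 6

/* Assumed cost: one switch whenever pCPU j's assignment changes and
 * the incoming assignment is a real vCPU (-1 means idle). */
static int cost(const int a[NUM_PCPU], const int b[NUM_PCPU])
{
    int c = 0;
    for (int j = 0; j < NUM_PCPU; j++)
        if (a[j] != b[j] && b[j] != -1)
            c++;
    return c;
}

int main(void)
{
    int slice[NUM_SLICE][NUM_PCPU] = {  /* made-up phase-2 output */
        { 0, 1, 2, 3 }, { 0, 1, 3, 2 }, { 1, 0, 2, 3 },
        { 0, 1, 2, -1 }, { 1, 0, 3, 2 }, { 0, 1, -1, 3 },
    };
    bool done[NUM_SLICE] = { false };
    int order[NUM_SLICE];
    int total = 0;

    order[0] = 0;          /* start from an arbitrary slice */
    done[0] = true;

    /* Greedily append the cheapest remaining slice. */
    for (int k = 1; k < NUM_SLICE; k++) {
        int best = -1, best_cost = 0;
        for (int s = 0; s < NUM_SLICE; s++) {
            if (done[s])
                continue;
            int c = cost(slice[order[k - 1]], slice[s]);
            if (best < 0 || c < best_cost) {
                best = s;
                best_cost = c;
            }
        }
        order[k] = best;
        done[best] = true;
        total += best_cost;
    }

    printf("slice order:");
    for (int k = 0; k < NUM_SLICE; k++)
        printf(" %d", order[k]);
    printf("  (#switching = %d)\n", total);
    return 0;
}
```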

  20.–27. Example • [Animation frames stepping through the greedy ordering: #switching stays at 0 across the first frames, rises to 1, and reaches 7 in the final frame.]

  28.–29. Evaluation • Conduct simulations to compare the power consumption of our asymmetry-aware scheduler with that of a credit-based scheduler. • Compare the number of core switches from our greedy heuristic with that from an optimal solution.

  30. Environment • Two types of physical cores: • power-hungry “big” cores, frequency 1600MHz • power-efficient “little” cores, frequency 600MHz • The DVFS mechanism is disabled.

  31. Power Model • The relation between power consumption, core frequency, and load, measured with the bzip2 benchmark. [Chart not reproduced; a generic form is sketched below.]
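The measured curves are not preserved in this transcript. A commonly assumed shape for such a model, and the linear power-load case that the phase-1 greedy relies on, is an affine function of load at a fixed frequency; the form and coefficients below are placeholders to be fitted from measurements such as the bzip2 runs, not the slide's fitted model:

```latex
P(f, u) \;\approx\; P_{\mathrm{idle}}(f) + \bigl(P_{\mathrm{busy}}(f) - P_{\mathrm{idle}}(f)\bigr)\, u ,
\qquad u \in [0, 1]
```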

  32. Scenario I – 2 Big and 2 Little • Two dual-core VMs. • Two sets of input: • Case 1: both VMs with light workloads (250MHz for each virtual core). • Case 2: one VM with heavy workloads (1200MHz for each virtual core), the other with modest workloads (600MHz for each virtual core).

  33. Scenario I - Results • Case 1: the energy consumption of the asymmetry-aware method is about 43.2% of that of the credit-based method. • Case 2: the asymmetry-aware method uses 95.6% of the energy used by the credit-based method.

  34. Scenario 2 – 4 Big and 4 Little • Quad-core VM. • Three cases

  35. Scenario 2 - Results • In case 3, the load of the physical cores is 100% under both methods. • Power cannot be saved when the computing resources are insufficient.

  36. Evaluation (Cont.) • Second comparison: the number of core switches from our greedy heuristic versus that from an optimal solution.

  37. Setting • 25 sets of input. • 4 physical cores, 12 virtual cores, 24 distinct execution slices. • Optimal solution: enumerate all permutations of the execution slices, using A* search to prune the search space.

  38. Evaluation Result • [Chart comparing the number of core switches from the greedy heuristic with that from the optimal solution; not reproduced.]

  39. Xen Hypervisor Scheduler: Code Study

  40. Xen Hypervisor • Scheduler: • xen/common/ • schedule.c • sched_credit.c • sched_credit2.c • sched_sedf.c • sched_arinc653.c

  41. xen/common/schedule.c • Generic CPU scheduling code; implements support functionality for the Xen scheduler API. • The default scheduler is the credit-based scheduler. • static void schedule(void): de-schedules the current domain and picks a new one.

  42. xen/common/sched_credit.c • Credit-based SMP CPU scheduler. • static struct task_slice csched_schedule(…): the implementation of credit-based scheduling. • SMP load balancing: if the next highest-priority local runnable VCPU has already eaten through its credits, look on other PCPUs to see if we have more urgent work.

  43. xen/common/sched_credit2.c • Credit-based SMP CPU scheduler. • Based on an earlier version. • static struct task_slice csched2_schedule(…): selects the next runnable local VCPU (i.e., the top of the local run queue). • static void balance_load(const struct scheduler *ops, int cpu, s_time_t now);

  44. Scheduling Steps • Xen calls do_schedule() of the current scheduler on each physical CPU (PCPU). • The scheduler selects a virtual CPU (VCPU) from its run queue and returns it to the Xen hypervisor. • The Xen hypervisor deploys that VCPU on the current PCPU. A simplified sketch of this flow appears below.
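A simplified, self-contained C sketch of the calling convention, with shapes modeled loosely on Xen's scheduler interface (the real struct scheduler and struct task_slice in xen/include/xen/sched-if.h differ in detail, and the toy policy here is an assumption for illustration):

```c
#include <stdio.h>

typedef long long s_time_t;

struct vcpu { int id; };

/* What a scheduler hands back to the generic scheduling path. */
struct task_slice {
    struct vcpu *task;   /* VCPU chosen to run next on this PCPU   */
    s_time_t     time;   /* time until the scheduler runs again, ns */
};

/* A scheduler is a named bundle of callbacks; only do_schedule()
 * is modeled here. */
struct scheduler {
    const char *name;
    struct task_slice (*do_schedule)(const struct scheduler *ops,
                                     s_time_t now);
};

/* Toy policy: always pick vcpu0 for a 10 ms slice. */
static struct vcpu vcpu0 = { 0 };

static struct task_slice toy_do_schedule(const struct scheduler *ops,
                                         s_time_t now)
{
    (void)ops; (void)now;
    struct task_slice ts = { &vcpu0, 10000000LL };
    return ts;
}

/* Steps 1-3 from the slide: the generic path calls the active
 * scheduler's do_schedule() on this PCPU, gets a VCPU back, and
 * would then context-switch to it (elided here). */
int main(void)
{
    struct scheduler sched = { "toy", toy_do_schedule };
    struct task_slice next = sched.do_schedule(&sched, 0);

    printf("next VCPU: %d for %lld ns\n", next.task->id, next.time);
    return 0;
}
```

The real csched_schedule() consults credits and the run queue; the toy policy above only shows the shape of the callback.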

  45. Adding Our Scheduler • Our scheduler periodically generates a scheduling plan. • The run queue of each physical core is organized according to the scheduling plan. • The Xen hypervisor then assigns VCPUs to PCPUs according to these run queues, as sketched below.
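A toy C sketch of the plan-driven pick, where plan[s][j] names the vCPU that pCPU j should run during slice s; the table values and the helper name pick_vcpu() are illustrative assumptions, not the actual patch:

```c
#include <stdio.h>

#define NUM_PCPU 4
#define NUM_SLICE 4

/* plan[s][j]: vCPU id that pCPU j should run during slice s
 * (-1 = idle). The three-phase algorithm would regenerate this
 * table periodically; the values below are made up but valid
 * (no vCPU appears twice within a slice). */
static const int plan[NUM_SLICE][NUM_PCPU] = {
    {  4, -1,  2,  1 },
    {  4,  3,  1,  2 },
    { -1,  3,  4,  1 },
    { -1, -1,  4,  1 },
};

static int current_slice = 0;

/* What a do_schedule()-style hook would return for one pCPU:
 * simply the vCPU that the plan names for the current slice. */
static int pick_vcpu(int pcpu)
{
    return plan[current_slice][pcpu];
}

int main(void)
{
    for (current_slice = 0; current_slice < NUM_SLICE; current_slice++) {
        printf("slice %d:", current_slice);
        for (int j = 0; j < NUM_PCPU; j++) {
            int v = pick_vcpu(j);
            if (v >= 0)
                printf("  pCPU%d->vCPU%d", j, v);
            else
                printf("  pCPU%d->idle", j);
        }
        printf("\n");
    }
    return 0;
}
```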

  46. Current Status • We propose a three-phase solution for generating a scheduling plan on asymmetric multi-core platforms. • Our simulation results show that the asymmetry-aware strategy yields energy savings of up to 56.8% relative to the credit-based method. • Ongoing: implementing the solution in the Xen hypervisor.

  47. Questions or Comments?
