A Secure System-wide Process Scheduling across Virtual Machines Hidekazu Tadokoro (Tokyo Institute of Technology) Kenichi Kourai (Kyushu Institute of Technology) Shigeru Chiba (Tokyo Institute of Technology)
Scheduling Problem across VMs
(Diagram labels: two VMs, each with its own OS, running Indexing and WEB, on top of the VMM and the hardware.)
• Server consolidation using virtual machines (VMs)
  • To improve resource utilization
• VMs make it difficult to execute processes as administrators intend
  • Each guest OS schedules only its own processes
  • A low-priority process in one VM may interfere with a high-priority process in another VM
System-wide Process Scheduler
(Diagram: the system-wide scheduler checks whether the VMs are idle and, if so, runs Indexing.)
• A system-wide scheduler is necessary for scheduling processes across VMs
  • It can suppress the execution of less important processes
  • It knows which processes are important among all VMs
  • E.g. it can run the file indexing process only when the whole system is idle
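The idle-time policy above can be sketched as a simple predicate. This is a minimal illustration, not the authors' code: `struct vm_status` and its field are hypothetical names invented for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-VM summary; the field name is illustrative only. */
struct vm_status {
    int runnable_important;  /* number of runnable important processes */
};

/* Idle-time policy: the low-priority indexing process may run only
 * when no VM in the whole system has an important process to run. */
bool may_run_indexing(const struct vm_status *vms, size_t nr_vms)
{
    for (size_t i = 0; i < nr_vms; i++) {
        if (vms[i].runnable_important > 0)
            return false;  /* some VM is busy, keep indexing suspended */
    }
    return true;
}
```

The decision needs a view over all VMs at once, which is why a scheduler inside a single guest OS cannot make it.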
Issue: Difficult to Implement
(Diagram: a semantic gap between the VMs and the VMM; the VMM cannot tell what process is running.)
• Implementing a system-wide process scheduler in the VMM is difficult
  • The VMM cannot recognize processes
    • Processes are an abstraction of OSes
  • Passing process information to the VMM requires modification of guest OSes 1), 2)
    • Modification of guest OSes is often unacceptable
1) Guest-aware VM scheduling [Euro-Par’08 Kim et al.]
2) Task grain scheduling [HPCC’08 Kinebuchi et al.]
Issue: Vulnerable to a DoS Attack
(Diagram: a malicious loop in one VM keeps the VMs from being idle, so the system-wide scheduler never runs Indexing.)
• A process in a compromised VM can prevent processes in other VMs from running through the scheduler
  • E.g. a busy-loop process can easily stop the file indexing process in other VMs
  • The indexing is configured to run at idle time
Monarch Scheduler
(Diagram: the Monarch scheduler in the VMM changes the scheduling of Indexing and WEB in the VMs.)
• A system-wide process scheduler in the VMM
  • Manipulates internal data in guest OSes for process scheduling
  • Recognizes processes
• Hybrid scheduling to mitigate a DoS attack
  • Periodically switches between system-wide process scheduling and the original scheduling
Process Scheduling by the VMM
(Diagram: the Monarch scheduler modifies VM memory that holds the run queue and the processes.)
• The VMM monitors and manipulates the run queue and the process structures in guest OSes
• Suspending a process
  • Remove it from the run queue
  • Rewrite its state so that it stops spontaneously
• Resuming a process
  • Insert it into a run queue
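Suspending and resuming by editing the run queue can be illustrated with a toy model. The sketch below assumes a single circular doubly linked list with a sentinel head and a two-value process state. All type and function names are invented for illustration, and the real Linux 2.6 run queue is more elaborate than this.

```c
#include <stdbool.h>
#include <stddef.h>

enum proc_state { PROC_RUNNING, PROC_STOPPED };  /* simplified states */

struct proc {
    int pid;
    enum proc_state state;
    struct proc *prev, *next;  /* run-queue links; NULL when not queued */
};

struct run_queue {
    struct proc head;          /* sentinel node, never scheduled */
    size_t nr_running;
};

void rq_init(struct run_queue *rq)
{
    rq->head.prev = rq->head.next = &rq->head;
    rq->nr_running = 0;
}

static bool on_queue(const struct proc *p)
{
    return p->next != NULL;
}

/* Resume: mark the process runnable and append it to the run queue. */
void resume_proc(struct run_queue *rq, struct proc *p)
{
    if (on_queue(p))
        return;
    p->state = PROC_RUNNING;
    p->prev = rq->head.prev;
    p->next = &rq->head;
    rq->head.prev->next = p;
    rq->head.prev = p;
    rq->nr_running++;
}

/* Suspend: unlink the process and rewrite its state to stopped. */
void suspend_proc(struct run_queue *rq, struct proc *p)
{
    if (!on_queue(p))
        return;
    p->prev->next = p->next;
    p->next->prev = p->prev;
    p->prev = p->next = NULL;
    p->state = PROC_STOPPED;
    rq->nr_running--;
}
```

In the VMM these writes would go to guest memory rather than to local structures, and the slide's "stops spontaneously" case for a currently running process is not modeled here.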
Hybrid Scheduling
(Diagram: in controlled mode the Monarch scheduler stops indexing while the malicious loop runs; after the switch to autonomous mode, both run freely.)
• To guarantee some CPU time to every process
• Periodically switches between two modes
  • Controlled mode: performs system-wide scheduling
  • Autonomous mode: stops system-wide scheduling
    • The VMM and guest OSes perform their own original scheduling
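The periodic switch can be sketched as a function of time. The period length and the autonomous share below are hypothetical tunables, and placing the autonomous phase at the end of each period is an assumption of this example, not something the slides state.

```c
#include <assert.h>
#include <stdint.h>

enum sched_mode { MODE_CONTROLLED, MODE_AUTONOMOUS };

/* Mode in effect at time now_ms, assuming each period of period_ms starts
 * in controlled mode and spends its last autonomous_pct percent in
 * autonomous mode. Both parameters are illustrative. */
enum sched_mode mode_at(uint64_t now_ms, uint64_t period_ms,
                        unsigned autonomous_pct)
{
    assert(period_ms > 0 && autonomous_pct <= 100);
    uint64_t phase = now_ms % period_ms;
    uint64_t autonomous_ms = period_ms * autonomous_pct / 100;
    return phase < period_ms - autonomous_ms ? MODE_CONTROLLED
                                             : MODE_AUTONOMOUS;
}
```

With a nonzero autonomous share, a process suppressed in controlled mode still gets a chance to run in every period, which is what bounds the effect of a busy loop in another VM.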
Implementation
(Diagram: a timer interrupt in Xen invokes the Monarch scheduler, which schedules processes in the run queue of a DomainU.)
• We implemented the Monarch scheduler in Xen 3.4.2
  • The supported guest OS is Linux 2.6 (x86_64)
• The scheduler is invoked by timer interrupts in the VMM
  • Pause a DomainU
    • To prevent conflicts between the Monarch scheduler and the guest OS
  • Get the CPU time of each process
  • Schedule processes when in the controlled mode
Accessing Kernel Data
(Diagram labels: kernel image, virtual address, page table, P2M table, DomU, machine memory, Xen VMM.)
• The Monarch scheduler accesses the internal data of guest OSes based on information about their kernels
  • Obtain debug information from the kernel image in advance
  • Translate virtual addresses of a DomainU into machine addresses of the VMM at run time
    • Page tables of guest OSes
    • P2M tables
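As a rough picture of the address translation, the toy model below chains two lookups: a flat page table from virtual page to pseudo-physical frame, then a P2M table from pseudo-physical frame to machine frame. This is an assumption-laden sketch: real x86_64 page tables are multi-level, the slides do not say exactly how the two tables are combined, and every name here is invented.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT    12
#define PAGE_MASK     ((UINT64_C(1) << PAGE_SHIFT) - 1)
#define INVALID_FRAME UINT64_MAX

/* Toy guest: one flat page table and one P2M table (both hypothetical). */
struct toy_guest {
    const uint64_t *page_table;  /* virtual page number -> pseudo-physical frame */
    size_t nr_vpages;
    const uint64_t *p2m;         /* pseudo-physical frame -> machine frame */
    size_t nr_pfns;
};

/* Translate a guest virtual address to a machine address.
 * Returns 0 on success and -1 if either lookup has no mapping. */
int guest_virt_to_machine(const struct toy_guest *g, uint64_t vaddr,
                          uint64_t *maddr)
{
    uint64_t vpn = vaddr >> PAGE_SHIFT;
    if (vpn >= g->nr_vpages)
        return -1;
    uint64_t pfn = g->page_table[vpn];
    if (pfn == INVALID_FRAME || pfn >= g->nr_pfns)
        return -1;
    uint64_t mfn = g->p2m[pfn];
    if (mfn == INVALID_FRAME)
        return -1;
    *maddr = (mfn << PAGE_SHIFT) | (vaddr & PAGE_MASK);
    return 0;
}
```

The page offset is carried over unchanged; only the frame number is rewritten by each lookup.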
Finding Process Structures
(Diagram: the process list in the Linux kernel, starting at init_task.)
• The Monarch scheduler traverses the process list
  • Every process structure is linked to the list
  • The starting point is init_task
  • The address of init_task is invariant in each kernel image
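The traversal can be sketched on a toy circular list. `struct toy_task` is a stand-in invented for this example; the real Linux process structure links tasks through an embedded list node, and the VMM would follow guest addresses rather than local pointers.

```c
#include <stddef.h>

/* Toy process structure; only what the traversal needs. */
struct toy_task {
    int pid;
    struct toy_task *next;  /* circular: the last task points back to init_task */
};

/* Walk the circular list once, starting and ending at init_task.
 * Stores up to cap PIDs in pids and returns the total number of tasks. */
size_t collect_pids(const struct toy_task *init_task, int *pids, size_t cap)
{
    size_t n = 0;
    const struct toy_task *t = init_task;
    do {
        if (n < cap)
            pids[n] = t->pid;
        n++;
        t = t->next;
    } while (t != init_task);
    return n;
}
```

Because the list is circular, the walk ends when it returns to its starting point, so a single known address is enough to reach every process.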
Finding Run Queues
    struct x8664_pda { task_t *current; ulong data_offset; … };
(Diagram: in Linux memory, the GS register points to x8664_pda, and data_offset + PER_CPU_RUNQUEUES gives the run queue.)
• The Monarch scheduler finds a run queue for each v-CPU
  • The address is unknown until the guest OS boots
  • The number of v-CPUs is not determined until boot
• The starting point is the GS register of each v-CPU
  • GS points to x8664_pda, which contains a pointer to a run queue
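The address computation in the diagram amounts to one addition. In the sketch below, `struct toy_pda` mirrors only the two fields shown on the slide, and the value passed as `per_cpu_runqueues` stands for the PER_CPU_RUNQUEUES symbol address, which is assumed here to come from the kernel's debug information.

```c
#include <stdint.h>

/* Toy copy of the per-CPU area that GS points to; guest addresses are
 * kept as plain integers because they are not valid VMM pointers. */
struct toy_pda {
    uint64_t current;      /* guest address of the running task */
    uint64_t data_offset;  /* offset of this CPU's per-CPU data */
};

/* Guest virtual address of the run queue belonging to this v-CPU. */
uint64_t runqueue_addr(const struct toy_pda *pda, uint64_t per_cpu_runqueues)
{
    return pda->data_offset + per_cpu_runqueues;
}
```

Repeating this for the GS value of each v-CPU yields one run-queue address per v-CPU.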
Guaranteeing Consistency
    schedule() {
        spin_lock(runqueue);
        /* run queue operation */
        spin_unlock(runqueue);
    }
(Diagram: the scheduler of the Linux OS takes the run queue spinlock around run queue operations; the Monarch scheduler only checks that lock.)
• The Monarch scheduler checks the lock of the data structure
  • To guarantee that the guest is not accessing the data whenever the Monarch scheduler accesses it
• Acquiring the lock is not needed
  • The domain is paused
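A check-without-acquire read might look like the sketch below. It assumes the domain is already paused and uses a toy lock encoding in which nonzero means held; the actual spinlock encoding depends on the kernel version, and the structure and function names are invented.

```c
#include <stdbool.h>

/* Toy run queue with a toy lock word (nonzero = held by the guest). */
struct toy_runqueue {
    unsigned lock;
    unsigned long nr_running;
};

/* Read nr_running only if the guest was not inside a run-queue critical
 * section when it was paused. The lock is inspected, never taken: a paused
 * guest cannot change the data while the VMM is looking at it. */
bool read_nr_running(const struct toy_runqueue *rq, unsigned long *out)
{
    if (rq->lock != 0)
        return false;  /* data may be half-updated; try again later */
    *out = rq->nr_running;
    return true;
}
```

When the check fails, one plausible reaction is to let the guest run briefly and retry at a later timer interrupt; the slides do not say what the Monarch scheduler does in that case.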
Monitoring Process Time
(Diagram: the Monarch scheduler tracks changes of CR3 and binds each CR3 value to a process.)
• The Monarch scheduler records the execution time of each process
• It tracks switches of virtual address spaces
  • By trapping modifications of the CR3 register
• It binds virtual address spaces to processes
  • By using process information in guest OSes
• Time recorded by guest OSes is inaccurate
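CR3-based accounting can be sketched as a small bookkeeping routine that runs on every trapped CR3 write. The fixed-size table, the nanosecond timestamps and all names are assumptions of this example, and the step that binds a CR3 value to a guest process is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SPACES 16  /* arbitrary capacity for this sketch */

struct cr3_clock {
    uint64_t cr3[MAX_SPACES];     /* address-space identifiers seen so far */
    uint64_t ran_ns[MAX_SPACES];  /* accumulated run time per identifier */
    size_t   count;
    uint64_t current;             /* CR3 value currently loaded */
    uint64_t since_ns;            /* time of the last CR3 write */
    int      started;             /* 0 until the first CR3 write is seen */
};

void cr3_clock_init(struct cr3_clock *c)
{
    c->count = 0;
    c->current = 0;
    c->since_ns = 0;
    c->started = 0;
}

/* Find the slot for a CR3 value, creating it if needed; NULL when full. */
static uint64_t *slot_for(struct cr3_clock *c, uint64_t cr3)
{
    for (size_t i = 0; i < c->count; i++)
        if (c->cr3[i] == cr3)
            return &c->ran_ns[i];
    if (c->count == MAX_SPACES)
        return NULL;
    c->cr3[c->count] = cr3;
    c->ran_ns[c->count] = 0;
    return &c->ran_ns[c->count++];
}

/* Called on every trapped CR3 write: charge the elapsed time to the
 * address space that was running, then start timing the new one. */
void on_cr3_write(struct cr3_clock *c, uint64_t new_cr3, uint64_t now_ns)
{
    if (c->started) {
        uint64_t *ran = slot_for(c, c->current);
        if (ran)
            *ran += now_ns - c->since_ns;
    }
    c->current = new_cr3;
    c->since_ns = now_ns;
    c->started = 1;
}

/* Accumulated run time for one address space (0 if never seen). */
uint64_t ran_ns_for(const struct cr3_clock *c, uint64_t cr3)
{
    for (size_t i = 0; i < c->count; i++)
        if (c->cr3[i] == cr3)
            return c->ran_ns[i];
    return 0;
}
```

Time is charged at switch boundaries observed by the VMM, so the measurement does not depend on the guest's own accounting.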
Experiments
Setup: Core 2 Duo 2.4 GHz, 6 GB memory, Xen 3.4.2; Dom0: Linux 2.6.18.8; DomU: Linux 2.6.16.33 (1 GB)
• Examining overheads
  • Scheduling overheads
  • Monitoring overheads
  • Performance degradation
• Examining the scheduling behavior
  • System-wide idle-time scheduling
  • Hybrid scheduling with the idle-time scheduling
• Examining the impact of updating the guest OS
Scheduling Overheads
• Time for traversing the process list
  • Changing the number of processes in one VM
  • Changing the number of VMs with a fixed number of processes
• Traversing time is negligible in the scheduling
  • 36 ns per process
  • 880 ns per VM
Monitoring Overheads
• Time for recording the execution time of processes with CR3
  • The total number of context switches per second
• Overhead is negligible
Performance Degradation
(Figure panels: Throughput, Response time.)
• Throughput and response time of lighttpd
  • Changing the scheduling interval
  • Only traversing the process list
  • Changing the number of processes
• Slightly degraded when the interval is 10 ms
System-wide Idle-time Scheduling
(Diagram labels: VM1, VM2, lighttpd, HyperEstraier, Xen VMM, run only at idle time. Figure panels: with scheduler, without scheduler.)
• Examining whether the Monarch scheduler correctly achieves the idle-time scheduling
  • Stop HyperEstraier whenever lighttpd runs
• The Monarch scheduler achieved the policy
  • HyperEstraier degrades lighttpd without scheduling
Hybrid Scheduling
• Examining the effectiveness of hybrid scheduling
  • Changing the ratio of the autonomous mode
• The indexing process was executed according to the ratio of the autonomous mode
  • A steep rise of CPU utilization when the ratio is more than 80%
Impact of Updating the Guest OS
• How much the Monarch scheduler has to be modified when the Linux kernel is updated
  • Inspected 33 versions of the Linux kernel 2.6
Related Work
• Guest-aware VM scheduling [Euro-Par’08 Kim et al.]
  • Guest OSes notify the VMM of their highest priority
  • Modification of guest OSes is required
• Task grain scheduling [HPCC’08 Kinebuchi et al.]
  • Guest OSes notify L4 of the priorities of all processes
  • Not suitable for Xen due to frequent VM switches
• Task-aware VM scheduling [VEE’09 Kim et al.]
  • Using gray-box knowledge
  • Not for process scheduling
Conclusion
• Monarch scheduler
  • A secure system-wide process scheduler running in the VMM
    • Monitors the execution of processes
    • Changes the scheduling behavior of each guest OS
    • Provides hybrid scheduling to mitigate a DoS attack
• Future work
  • Completing the support for Windows guest OSes