Checkpoint/Restore in the Palacios Virtual Machine Monitor EECS 441 – Resource Virtualization Steven Jaconette, Eugenia Gabrielova, Nicoara Talpes Instructor: Peter Dinda
Agenda • Background • Motivation • Design • Implementation • Future work
Virtual Machine Monitors • Virtual Machine: software emulation or virtualization of a machine • Virtual Machine Monitor (VMM): allows multiple OSes to access physical machine resources • Each guest OS believes it is running directly on hardware
Palacios VMM • Virtual Machine Monitor developed at Northwestern • Targeted at the Red Storm supercomputer at Sandia National Laboratories • Linked into a host OS; supports both 64-bit and 32-bit guests • Provides guests with the functionality of an Intel/AMD processor, memory, interrupts, and hardware devices
Checkpoint / Restore • Checkpoint: suspend a running OS instance and copy it somewhere else (kernel, disk) • Restore: copy the saved OS instance to its destination • Used as part of OS migration • Useful when you know a machine will fail and want to move memory to a different place quickly
Motivation • Palacios cannot currently put guests to sleep and later restore them with memory intact • This functionality is the first step toward live migration of guests • Both checkpointing and live migration have important applications in supercomputing
Checkpoint / Restore in Other Systems • Used in live OS migration • VMware Virtual Center: quiesces the VM after the pre-copy stage • Xen Virtual Machine Monitor, same procedure as VMware: the OS instance suspends itself and is moved to the destination host; the suspended copy of the VM state is then resumed
Guest State in Palacios • Structures that make up VM: VMCB, Registers, pointers from guest info • Devices • Interrupts • Static Information • Pointers
Guest State
struct guest_info {
    ullong_t rip;
    uint_t cpl;

    addr_t mem_size; // Probably in bytes for now....
    v3_shdw_map_t mem_map;

    struct vm_time time_state;

    v3_paging_mode_t shdw_pg_mode;
    struct shadow_page_state shdw_pg_state;
    addr_t direct_map_pt;
    // nested_paging_t nested_page_state;

    // This structure is how we get interrupts for the guest
    struct v3_intr_state intr_state;

    v3_io_map_t io_map;
    struct v3_msr_map msr_map;

    // device_map
    struct vmm_dev_mgr dev_mgr;

    struct v3_host_events host_event_hooks;

    v3_vm_cpu_mode_t cpu_mode;
    v3_vm_mem_mode_t mem_mode;

    struct v3_gprs vm_regs;
    struct v3_ctrl_regs ctrl_regs;
    struct v3_dbg_regs dbg_regs;
    struct v3_segments segments;

    v3_vm_operating_mode_t run_state;
    void * vmm_data;

    uint_t enable_profiler;
    struct v3_profiler profiler;

    void * decoder_state;

    v3_msr_t guest_efer;

    /* Do we need these ? */
    v3_msr_t guest_star;
    v3_msr_t guest_lstar;
    v3_msr_t guest_cstar;
    v3_msr_t guest_syscall_mask;
    v3_msr_t guest_gs_base;
};
Design 1: Serialization • "Flatten" guest state information at checkpoint • Not all guest information should be checkpointed • Devices, Interrupts • Static data from XML files • Restore from saved guest state information • Similar to configuring virtual machine at boot
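The flattening idea in Design 1 can be sketched in C as follows. This is only an illustration under stated assumptions: `toy_guest`, `toy_guest_flat`, `toy_checkpoint`, and `toy_restore` are hypothetical names, not Palacios code, and the choice of which fields to save is made up for the example. Plain-value fields are copied into a pointer-free record; the pointer field is skipped and supplied again at restore time, much like configuration at boot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, much-reduced stand-in for guest_info. */
struct toy_guest {
    uint64_t rip;       /* saved at checkpoint */
    uint32_t cpl;       /* saved at checkpoint */
    uint64_t mem_size;  /* saved at checkpoint */
    void    *vmm_data;  /* NOT saved: rebuilt when the guest is restored */
};

/* Flat, pointer-free record produced by a checkpoint. */
struct toy_guest_flat {
    uint64_t rip;
    uint64_t mem_size;
    uint32_t cpl;
};

/* Flatten g into buf. Returns bytes written, or 0 if buf is too small. */
size_t toy_checkpoint(const struct toy_guest *g, void *buf, size_t len)
{
    struct toy_guest_flat flat;

    assert(g != NULL && buf != NULL);
    if (len < sizeof flat)
        return 0;

    memset(&flat, 0, sizeof flat);  /* keep padding bytes deterministic */
    flat.rip = g->rip;
    flat.mem_size = g->mem_size;
    flat.cpl = g->cpl;
    memcpy(buf, &flat, sizeof flat);
    return sizeof flat;
}

/* Rebuild g from buf. The caller passes freshly created vmm_data,
 * since pointers are never read from the checkpoint.
 * Returns 0 on success, -1 if buf is too short. */
int toy_restore(struct toy_guest *g, const void *buf, size_t len,
                void *fresh_vmm_data)
{
    struct toy_guest_flat flat;

    assert(g != NULL && buf != NULL);
    if (len < sizeof flat)
        return -1;

    memcpy(&flat, buf, sizeof flat);
    g->rip = flat.rip;
    g->mem_size = flat.mem_size;
    g->cpl = flat.cpl;
    g->vmm_data = fresh_vmm_data;
    return 0;
}
```

The cost of this approach is that every structure reachable from the guest state needs its own flatten and rebuild routine, which is what the heap-based designs try to avoid.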
Design 2: Per-Guest Heap with Pointer Tagging • For each guest's heap: checkpoint heap and restore it to address space • Starting address for heap could be different after copy • Make sure pointers are not pointing to the wrong memory addresses by fixing them up • During copy, record start of heap and track the pointers for the addresses in the heap and save them as offsets • Problem: mallocs in external libraries and void pointers
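A minimal sketch of the offset-based fix-up described in Design 2, written in C. Everything here is an assumption made for illustration: `toy_heap`, `toy_tag`, and the two conversion functions are hypothetical names, the tag table is a fixed-size array, and the code assumes `sizeof(uintptr_t) == sizeof(void *)`, which holds on common platforms. It also shows the limitation named on the slide: only pointers that were explicitly tagged get fixed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HEAP_SIZE 256
#define MAX_TAGS  8

/* Hypothetical per-guest heap plus a table of tagged pointer slots. */
struct toy_heap {
    unsigned char *base;       /* current start address of the heap */
    size_t tag_off[MAX_TAGS];  /* heap offsets where pointers are stored */
    size_t ntags;
};

/* Record that the bytes at slot_off hold a pointer into this heap. */
int toy_tag(struct toy_heap *h, size_t slot_off)
{
    if (h->ntags == MAX_TAGS || slot_off > HEAP_SIZE - sizeof(void *))
        return -1;
    h->tag_off[h->ntags++] = slot_off;
    return 0;
}

/* Before the copy: rewrite every tagged pointer as an offset from base. */
void toy_pointers_to_offsets(struct toy_heap *h)
{
    size_t i;

    for (i = 0; i < h->ntags; i++) {
        unsigned char *p;
        uintptr_t v, b, off;

        memcpy(&p, h->base + h->tag_off[i], sizeof p);
        v = (uintptr_t)p;
        b = (uintptr_t)h->base;
        assert(v >= b && v - b < HEAP_SIZE);  /* must point into the heap */
        off = v - b;
        memcpy(h->base + h->tag_off[i], &off, sizeof off);
    }
}

/* After the copy: turn the offsets back into pointers under new_base. */
void toy_offsets_to_pointers(struct toy_heap *h, unsigned char *new_base)
{
    size_t i;

    for (i = 0; i < h->ntags; i++) {
        uintptr_t off;
        unsigned char *p;

        memcpy(&off, new_base + h->tag_off[i], sizeof off);
        assert(off < HEAP_SIZE);
        p = new_base + off;
        memcpy(new_base + h->tag_off[i], &p, sizeof p);
    }
    h->base = new_base;
}
```

A pointer created by an external library's own malloc, or hidden behind a `void *`, never reaches `toy_tag`, so it would still hold the old address after the copy.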
Design 3: Per-Guest Heap with User Space Mapping • Create a "per-guest" heap for each VM, as before. • Do not tag/fix pointers. • Map the heap to a well-known address in user space. • Mark the pages as "system" to prevent modification • On a checkpoint, copy from this address • For a restore, copy back to it • Change between VMs through process context switches.
Implementation • To create a per-guest heap, we must allocate a chunk of memory to represent the heap • The host OS provides Palacios with malloc/free functions • Currently these are the Kitten kernel's memory allocator functions • To allocate out of our chunk, we needed to define new allocation functions
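As a rough illustration of "allocation functions that allocate out of our chunk", here is a bump allocator in C. It is a sketch under assumptions, not the allocator from the project: `guest_heap`, `guest_heap_init`, `guest_malloc`, and `guest_free` are hypothetical names, the 16-byte alignment is an arbitrary choice, and the chunk is assumed to come from the host OS allocator and to be suitably aligned itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-guest heap carved out of one host-provided chunk. */
struct guest_heap {
    unsigned char *start;  /* first byte of the chunk */
    size_t size;           /* total bytes in the chunk */
    size_t used;           /* bytes handed out so far, including padding */
};

void guest_heap_init(struct guest_heap *h, void *chunk, size_t size)
{
    assert(h != NULL && chunk != NULL);
    h->start = chunk;
    h->size = size;
    h->used = 0;
}

/* Allocate n bytes at a 16-byte-aligned offset from the chunk start.
 * Returns NULL for n == 0 or when the chunk is exhausted. */
void *guest_malloc(struct guest_heap *h, size_t n)
{
    size_t aligned = (h->used + 15u) & ~(size_t)15u;

    if (n == 0 || aligned > h->size || n > h->size - aligned)
        return NULL;
    h->used = aligned + n;
    return h->start + aligned;
}

/* A bump allocator cannot release individual blocks; this placeholder
 * only keeps the malloc/free interface shape. */
void guest_free(struct guest_heap *h, void *p)
{
    (void)h;
    (void)p;
}
```

Because every allocation for a guest lands inside `[start, start + size)`, a checkpoint only has to copy that one range. A real implementation would need a working free path, for example a free list, which this sketch leaves out.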
Implementation • Checkpoint / Restore • Checkpoint: Find next available location in user space, then copy relevant info • Queue of previously checkpointed guest data locations • Restore: Get checkpoint address from queue, copy back to guest heap
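The checkpoint and restore steps above can be sketched in C like this. The sketch makes several assumptions that are not in the slides: a fixed array of slots stands in for "locations in user space", the queue is first-in first-out, and the names `ckpt_store`, `ckpt_save`, and `ckpt_restore` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HEAP_BYTES 64  /* size of the per-guest heap in this toy example */
#define MAX_CKPTS  4   /* number of checkpoint locations available */

/* Hypothetical checkpoint area with a circular queue of used slots. */
struct ckpt_store {
    unsigned char slot[MAX_CKPTS][HEAP_BYTES];
    size_t head;   /* index of the oldest checkpoint */
    size_t count;  /* number of checkpoints currently queued */
};

void ckpt_init(struct ckpt_store *s)
{
    s->head = 0;
    s->count = 0;
}

/* Checkpoint: take the next free location and copy the heap into it.
 * Returns 0 on success, -1 if no location is free. */
int ckpt_save(struct ckpt_store *s, const unsigned char *heap)
{
    size_t tail;

    assert(s != NULL && heap != NULL);
    if (s->count == MAX_CKPTS)
        return -1;
    tail = (s->head + s->count) % MAX_CKPTS;
    memcpy(s->slot[tail], heap, HEAP_BYTES);
    s->count++;
    return 0;
}

/* Restore: dequeue the oldest checkpoint and copy it back to the heap.
 * Returns 0 on success, -1 if the queue is empty. */
int ckpt_restore(struct ckpt_store *s, unsigned char *heap)
{
    assert(s != NULL && heap != NULL);
    if (s->count == 0)
        return -1;
    memcpy(heap, s->slot[s->head], HEAP_BYTES);
    s->head = (s->head + 1) % MAX_CKPTS;
    s->count--;
    return 0;
}
```

Since the heap is copied back to the same well-known address it was taken from, the pointers stored inside it stay valid and no fix-up pass is needed.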
Future Work • More coding is needed to test this design • It has the potential to greatly simplify checkpoint/restore of virtual machines • What's next: live migration of guests • User space per-guest heaps in a different host OS