System VMs — Chapter 8: System Virtual Machines
2005.11.9, Dong In Shin, Distributed Computing System Laboratory, Seoul National Univ.
Contents • 1. Performance Enhancement of System VMs • 2. Case Study: VMware Virtual Platform • 3. Case Study: The Intel VT-x Technology • 4. Case Study: Xen
Reasons for Performance Degradation • Setup • Emulation — some guest instructions must be emulated (usually via interpretation) by the VMM • Interrupt handling • State saving • Bookkeeping — e.g., the accounting of time charged to a user • Time elongation
Instruction Emulation Assists • The VMM emulates a privileged instruction using a routine whose operation depends on whether the virtual machine is supposed to be executing in system mode or in user mode. • Hardware assists can check the virtual machine's state and perform the corresponding actions.
Virtual Machine Monitor Assists • Context switch — using hardware to save and restore registers. • Decoding of privileged instructions — hardware assists that decode the privileged instruction on behalf of the VMM. • Virtual interval timer — the virtual counter is decremented by an amount the VMM estimates from how much the real timer decremented (see the sketch below). • Adding to the instruction set — a number of new instructions that are not part of the original ISA of the machine.
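As a rough illustration of the virtual-interval-timer assist, the following C sketch (hypothetical structure and function names, not taken from any real VMM) decrements each guest's virtual timer by the number of ticks the real timer advanced while that guest was running, and marks a virtual timer interrupt as pending when it expires.

    #include <stdint.h>
    #include <stdio.h>

    #define NUM_VMS 2

    /* Per-guest virtual interval timer (hypothetical layout). */
    struct vtimer {
        int64_t remaining;    /* virtual ticks left before the guest's timer fires */
        int     irq_pending;  /* set when the virtual timer has expired */
    };

    static struct vtimer vtimers[NUM_VMS];

    /* Called by the VMM after guest 'vm' has run: 'elapsed' is the amount
     * the real interval timer decremented during that interval. */
    static void vtimer_account(int vm, int64_t elapsed)
    {
        vtimers[vm].remaining -= elapsed;
        if (vtimers[vm].remaining <= 0) {
            vtimers[vm].remaining = 0;
            vtimers[vm].irq_pending = 1;  /* VMM will inject a virtual timer interrupt */
        }
    }

    int main(void)
    {
        vtimers[0].remaining = 100;
        vtimer_account(0, 60);            /* guest 0 consumed 60 real ticks */
        vtimer_account(0, 60);            /* another 60: the virtual timer expires */
        printf("VM0 timer interrupt pending: %d\n", vtimers[0].irq_pending);
        return 0;
    }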
Improving Performance of the Guest System • Non-paged mode • The guest OS disables dynamic address translation and defines its real address space to be as large as the largest virtual address space; page frames are mapped to fixed real pages. • The guest OS no longer has to perform demand paging. • No double paging. • No potential conflict between the paging decisions of the guest OS and those of the VMM.
Double Paging • Two independent layers of paging interact and perform poorly: • The guest OS may incorrectly believe a page is in physical memory. • The VMM may believe an unneeded page is still in use. • The guest may evict a page even though physical memory is still available.
Pseudo-Page-Fault Handling • A page fault in a VM system may be either • a page fault against a VM's own page table, or • a page fault against the VMM's page table. • Pseudo-page-fault handling (sketched below) • The VMM initiates a page-in operation from the backing store. • The VMM delivers a "pseudo page fault" to the guest. • The guest OS suspends only the faulting user process; the VMM does not suspend the whole guest. • On completion of the page-in operation • the VMM signals the guest's pseudo-page-fault handler again, and • the guest OS handler wakes up the blocked user process.
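The following C sketch mimics that handshake. All names (vmm_handle_guest_fault, guest_pseudo_fault_handler, and so on) are hypothetical; the point is only to show that the guest blocks a single process rather than the whole virtual machine while the VMM's page-in completes.

    #include <stdio.h>

    /* --- guest side (hypothetical) -------------------------------------- */
    static int blocked_process = -1;

    static void guest_pseudo_fault_handler(int pid, int completed)
    {
        if (!completed) {
            /* First notification: suspend only the faulting process and
             * let the guest scheduler pick another runnable process. */
            blocked_process = pid;
            printf("guest: process %d blocked, scheduling another process\n", pid);
        } else {
            /* Second notification: the page has arrived, wake the process. */
            printf("guest: page-in done, waking process %d\n", blocked_process);
            blocked_process = -1;
        }
    }

    /* --- VMM side (hypothetical) ----------------------------------------- */
    static void vmm_handle_guest_fault(int pid, unsigned long page)
    {
        printf("vmm: starting page-in of page %lu from backing store\n", page);
        guest_pseudo_fault_handler(pid, 0);   /* deliver the pseudo page fault */

        /* ... asynchronous disk I/O completes some time later ... */
        guest_pseudo_fault_handler(pid, 1);   /* notify completion */
    }

    int main(void)
    {
        vmm_handle_guest_fault(42, 0x1234);
        return 0;
    }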
Other Enhancements • Spool files • Without a special mechanism, the VMM must intercept the I/O commands and work out that several virtual machines are simultaneously attempting to send jobs to the I/O devices. • Handshaking allows the VMM to pick up a guest's spool file and merge it into its own buffer. • Inter-virtual-machine communication • Communication between two physical machines involves processing message packets through several protocol layers on the sender and receiver sides. • This processing can be streamlined, simplified, and made faster when the two machines are virtual machines on the same host platform.
Specialized Systems • Virtual-equals-real (V=R) virtual machine • The host address space representing the guest's real memory is mapped one-to-one to the host real memory address space. • Shadow-table bypass assist • The guest page tables can point directly to physical addresses if the dynamic address translation hardware is allowed to manipulate the guest page tables. • Preferred-machine assist • Allows a guest OS to operate in system mode rather than user mode. • Segment sharing • Shares the code segments of the operating system among the virtual machines, provided the operating system code is written in a reentrant manner.
Generalized Support for Virtual Machines • Interpretive Execution Facility (IEF) • The processor directly executes most of the functions of the virtual machine in hardware. • An extreme case of a VM assist. • Interpretive execution entry and exit • Entry • Start Interpretive Execution (SIE): the software gives up control to the IEF hardware and the processor enters interpretive-execution mode. • Exit, triggered by • a host interrupt, • an interception (an unsupported instruction, or an exception during execution of an interpreted instruction), or • certain other special cases.
Interpretive Execution Entry and Exit (figure): the VMM software issues SIE to enter interpretive-execution mode; execution leaves that mode either through an exit for interception (returning to the VMM's emulation routines) or through an exit for a host interrupt (entering the host interrupt handler).
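The entry/exit cycle in the figure can be summarized as a loop in the VMM. This is only a conceptual C sketch: start_interpretive_execution and the reason codes are invented names standing in for the SIE instruction and its exit conditions, and the exits are scripted so the example runs on its own.

    #include <stdio.h>

    /* Reasons the hardware leaves interpretive-execution mode (conceptual). */
    enum exit_reason { EXIT_INTERCEPTION, EXIT_HOST_INTERRUPT, EXIT_SHUTDOWN };

    /* Stand-in for the SIE instruction: runs the guest until some exit occurs. */
    static enum exit_reason start_interpretive_execution(void)
    {
        static const enum exit_reason script[] = {
            EXIT_INTERCEPTION, EXIT_HOST_INTERRUPT, EXIT_SHUTDOWN
        };
        static int n = 0;
        return script[n++];
    }

    static void emulate_intercepted_instruction(void) { printf("vmm: emulating\n"); }
    static void handle_host_interrupt(void)           { printf("host: interrupt\n"); }

    int main(void)
    {
        for (;;) {
            enum exit_reason why = start_interpretive_execution();
            if (why == EXIT_INTERCEPTION)         /* unsupported instruction, etc. */
                emulate_intercepted_instruction();
            else if (why == EXIT_HOST_INTERRUPT)  /* exit for host interrupt */
                handle_host_interrupt();
            else
                break;                            /* guest finished or was stopped */
        }
        return 0;
    }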
Full Virtualization Versus Para-Virtualization • Full virtualization • Provides a total abstraction of the underlying physical system and creates a complete virtual system in which the guest operating system can execute. • No modification is required in the guest OS or applications. • The guest OS and applications are not aware of the virtualized environment. • Advantages • Streamlines the migration of applications and workloads between different physical systems. • Complete isolation of different applications, which makes this approach highly secure. • Disadvantages • Performance penalty. • Examples: Microsoft Virtual Server and VMware ESX Server.
Full Virtualization Versus Para-Virtualization • Para-virtualization • A virtualization technique that presents a software interface to virtual machines that is similar, but not identical, to that of the underlying hardware. • The technique requires modifications to the guest OSes running on the VMs. • The guest OSes are aware that they are executing on a VM. • Advantages • Near-native performance. • Disadvantages • Some limitations, including potential insecurities such as exposure of guest OS cached data, unauthenticated connections, and so forth. • Example: the Xen system.
VMware Virtual Platform • A popular virtual machine infrastructure for IA-32-based PCs and servers. • An example of a hosted virtual machine system; VMware also offers a native-virtualization product, VMware ESX Server. • The book limits its discussion to the hosted system, VMware GSX Server (VMware 2001). • Challenges • The IA-32 environment is difficult to virtualize efficiently. • The openness of the system architecture. • The need for easy installation.
VMware's Hosted Virtual Machine Model (figure)
Processor Virtualization • Critical instructions in the Intel IA-32 architecture make it not efficiently virtualizable: • Protection system references — instructions that reference the storage protection system, memory system, or address relocation system (e.g., mov ax, cs). • Sensitive register instructions — instructions that read or change resource-related registers and memory locations (e.g., POPF). • Problem • Sensitive instructions executed in user mode do not behave as expected unless they are emulated. • Solution • The VM monitor substitutes another instruction sequence for the sensitive instruction and emulates the action of the original code (see the sketch below).
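One way a software VMM can deal with sensitive instructions is to scan guest code before it runs, replace each sensitive opcode with a trapping instruction, and emulate the original behavior in the trap handler. The C sketch below shows only the patching step for a single one-byte opcode (POPF, 0x9D, replaced by INT3, 0xCC). It is an illustrative simplification, not VMware's actual binary-translation algorithm; real x86 code requires full variable-length instruction decoding, so a raw byte scan like this would not be safe in practice.

    #include <stdint.h>
    #include <stddef.h>
    #include <stdio.h>

    #define OPC_POPF 0x9D   /* sensitive: silently drops IF changes in user mode */
    #define OPC_INT3 0xCC   /* traps to the monitor, which can emulate POPF */

    /* Replace every POPF opcode in a code buffer with a trapping INT3 and
     * remember where the patches were made so a trap handler could emulate. */
    static size_t patch_sensitive(uint8_t *code, size_t len,
                                  size_t *patch_sites, size_t max_sites)
    {
        size_t found = 0;
        for (size_t i = 0; i < len; i++) {
            if (code[i] == OPC_POPF && found < max_sites) {
                patch_sites[found++] = i;
                code[i] = OPC_INT3;
            }
        }
        return found;
    }

    int main(void)
    {
        uint8_t guest_code[] = { 0x90, 0x9D, 0x90, 0x9D };  /* NOP POPF NOP POPF */
        size_t sites[8];
        size_t n = patch_sensitive(guest_code, sizeof guest_code, sites, 8);
        printf("patched %zu sensitive instruction(s)\n", n);
        return 0;
    }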
Input/Output Virtualization • The PC platform supports many more devices, and types of devices, than any other platform. • Emulation in the VMMonitor • Intercepts IN and OUT I/O instructions and converts them into operations on the emulated device. • Requires some knowledge of the device interfaces. • New capability for devices through an abstraction layer • VMApp can insert a layer of abstraction above the physical device. • Advantage: reduces performance losses due to virtualization. • Example: a virtual Ethernet switch between a virtual NIC and the physical NIC (sketched below).
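As an illustration of the abstraction-layer idea, the sketch below implements a toy MAC learning/lookup table such as a virtual Ethernet switch might use to decide whether a frame from a virtual NIC should be delivered to another VM on the same host or forwarded to the physical NIC. All names and the single-table design are assumptions for the example, not VMware's implementation.

    #include <stdint.h>
    #include <string.h>
    #include <stdio.h>

    #define MAX_PORTS 8
    #define PHYS_NIC  (-1)   /* destination unknown: forward to the physical NIC */

    struct mac_entry { uint8_t mac[6]; int port; };

    static struct mac_entry table[MAX_PORTS];
    static int entries = 0;

    /* Learn which virtual port a source MAC address lives on. */
    static void vswitch_learn(const uint8_t mac[6], int port)
    {
        for (int i = 0; i < entries; i++)
            if (memcmp(table[i].mac, mac, 6) == 0) { table[i].port = port; return; }
        if (entries < MAX_PORTS) {
            memcpy(table[entries].mac, mac, 6);
            table[entries++].port = port;
        }
    }

    /* Pick an output port for a destination MAC; unknown MACs go to the NIC. */
    static int vswitch_forward(const uint8_t mac[6])
    {
        for (int i = 0; i < entries; i++)
            if (memcmp(table[i].mac, mac, 6) == 0) return table[i].port;
        return PHYS_NIC;
    }

    int main(void)
    {
        uint8_t vm1[6] = { 0x00, 0x0c, 0x29, 0x01, 0x02, 0x03 };
        uint8_t ext[6] = { 0x00, 0x11, 0x22, 0x33, 0x44, 0x55 };
        vswitch_learn(vm1, 1);
        printf("to vm1 -> port %d, to external host -> port %d\n",
               vswitch_forward(vm1), vswitch_forward(ext));
        return 0;
    }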
Using the Services of the Host Operating System • A guest I/O request is converted into a host OS call. • Advantages • The VMM is not limited in its access to the host OS's I/O features. • Support for running performance-critical applications.
Memory Virtualization • Paging requests of the guest OS • Are not directly intercepted by the VMM but are converted into disk reads/writes. • The VMMonitor translates them into requests on the host OS through VMApp. • Page replacement policy of the host OS • The host could replace critical pages of the VM system when they compete with other host applications. • The VMDriver therefore pins the virtual machine's critical pages in host memory.
VMware ESX Server • A native VM system: a thin software layer designed to multiplex hardware resources among virtual machines. • Provides higher I/O performance and complete control over resource management. • Full virtualization — for servers running multiple instances of unmodified operating systems.
Page Replacement Issues • Problem of double paging • Unintended interactions between the native memory management policies of the guest operating systems and those of the host system. • Ballooning • Reclaims the pages considered least valuable by the operating system running in a virtual machine. • A small balloon module is loaded into the guest OS as a pseudo-device driver or kernel service. • The module communicates with the ESX Server via a private channel.
Ballooning in VMware ESX Server • Inflating a balloon (when the server wants to reclaim memory) • The driver allocates pinned physical pages within the VM. • This increases memory pressure in the guest OS, which reclaims space to satisfy the driver's allocation request. • The driver communicates the physical page number of each allocated page to the ESX Server. • Deflating a balloon • Frees up memory for general use within the guest OS. A minimal sketch of the inflate and deflate paths appears below.
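A minimal sketch of ballooning, assuming a hypothetical guest page allocator and a hypothetical private channel to the ESX Server (alloc_pinned_guest_page and esx_report_page are invented names, not VMware's interface). Inflating allocates pinned guest pages and reports their physical page numbers so the server can reuse the underlying machine memory; deflating returns them to the guest.

    #include <stdint.h>
    #include <stdio.h>

    #define BALLOON_MAX 1024

    /* --- hypothetical guest and ESX interfaces --------------------------- */
    static uint64_t next_pfn = 0x1000;
    static uint64_t alloc_pinned_guest_page(void) { return next_pfn++; }
    static void     free_pinned_guest_page(uint64_t pfn) { (void)pfn; }
    static void     esx_report_page(uint64_t pfn, int reclaimed)
    {
        printf("balloon -> esx: pfn %#llx %s\n",
               (unsigned long long)pfn, reclaimed ? "reclaimable" : "returned");
    }

    /* --- balloon module --------------------------------------------------- */
    static uint64_t balloon[BALLOON_MAX];
    static int balloon_size = 0;

    static void balloon_inflate(int pages)     /* server wants memory back */
    {
        while (pages-- > 0 && balloon_size < BALLOON_MAX) {
            uint64_t pfn = alloc_pinned_guest_page(); /* raises guest memory pressure */
            balloon[balloon_size++] = pfn;
            esx_report_page(pfn, 1);                  /* server may reuse this page */
        }
    }

    static void balloon_deflate(int pages)     /* give memory back to the guest */
    {
        while (pages-- > 0 && balloon_size > 0) {
            uint64_t pfn = balloon[--balloon_size];
            esx_report_page(pfn, 0);
            free_pinned_guest_page(pfn);
        }
    }

    int main(void)
    {
        balloon_inflate(2);
        balloon_deflate(1);
        return 0;
    }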
Virtualizing I/O Devices on VMware Workstation • Virtual devices supported by VMware • PS/2 keyboard, PS/2 mouse, floppy drive, IDE controllers with ATA disks and ATAPI CD-ROMs, a SoundBlaster 16 sound card, serial and parallel ports, virtual BusLogic SCSI controllers, AMD PCnet Ethernet adapters, and an SVGA video controller. • Procedure • Intercept I/O operations issued by the guest OS (IA-32 IN and OUT instructions). • Emulate them either in the VMM or in the VMApp. • Drawbacks • Virtualizing I/O devices can incur overhead from world switches between the VMM and the host. • Handling the privileged instructions used to communicate with the hardware is costly.
Overview • VT-x (Vanderpool) technology for IA-32 processors • Enhances the performance of VM implementations through hardware enhancements to the processor. • Main feature • The new VMX mode of operation (VMX root / non-root operation). • VMX root operation • Fully privileged; intended for the VM monitor. • New instructions — the VMX instructions. • VMX non-root operation • Not fully privileged; intended for guest software. • Reduces guest software privilege without relying on rings.
Technological Overview (figure): a timeline of VMX transitions. VMXON takes the processor from regular mode into root mode (the VMM); VMLAUNCH and VMRESUME enter non-root operation for VM1 or VM2; each VM exit returns control to root mode; VMXOFF returns the processor to regular mode.
VT-x Operations (figure): VM 1 through VM n each run in VMX non-root operation with their own rings 0–3 and their own VMCS (VMCS 1 … VMCS n). The VMM runs in VMX root operation, also with rings 0–3. VM exits transfer control from a guest to the VMM, VMLAUNCH and VMRESUME re-enter a guest, and VMXON moves the processor from ordinary IA-32 operation into VMX root operation.
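The transitions in the two figures above can be captured in a toy state machine. The C sketch below only models the legal transitions between regular IA-32 operation, VMX root operation, and VMX non-root operation; it does not execute the actual VMX instructions.

    #include <stdio.h>

    enum mode  { REGULAR, VMX_ROOT, VMX_NON_ROOT };
    enum event { VMXON, VMLAUNCH, VM_EXIT, VMXOFF };

    static const char *names[] = { "regular", "VMX root (VMM)", "VMX non-root (guest)" };

    /* Apply one VMX transition; illegal events leave the mode unchanged. */
    static enum mode step(enum mode m, enum event e)
    {
        switch (e) {
        case VMXON:    return m == REGULAR      ? VMX_ROOT     : m;
        case VMLAUNCH: return m == VMX_ROOT     ? VMX_NON_ROOT : m;  /* or VMRESUME */
        case VM_EXIT:  return m == VMX_NON_ROOT ? VMX_ROOT     : m;
        case VMXOFF:   return m == VMX_ROOT     ? REGULAR      : m;
        }
        return m;
    }

    int main(void)
    {
        enum event script[] = { VMXON, VMLAUNCH, VM_EXIT, VMLAUNCH, VM_EXIT, VMXOFF };
        enum mode m = REGULAR;
        for (unsigned i = 0; i < sizeof script / sizeof script[0]; i++) {
            m = step(m, script[i]);
            printf("-> %s\n", names[m]);
        }
        return 0;
    }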
Capabilities of the Technology • A key aspect • The elimination of the need to run all guest code in user mode. • Maintenance of state information • A major source of overhead in a software-based solution. • A hardware technique allows all of the state-holding data elements to be mapped to their native structures: the VMCS (Virtual Machine Control Structure). • The hardware implementation takes over the tasks of loading and unloading the state from its physical locations.
Virtual Machine Control Structure (VMCS) • A control structure in memory • Only one VMCS is active per virtual processor at any given time. • VMCS payload • VM-execution, VM-exit, and VM-entry controls. • Guest and host state. • VM-exit information fields. A conceptual sketch of these groups appears below.
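For orientation only, the sketch below groups the kinds of information a VMCS carries into a C structure. The real VMCS has an implementation-defined layout and is read and written with the VMREAD/VMWRITE instructions using architected field encodings, so this struct is purely conceptual and the field names are assumptions.

    #include <stdint.h>

    /* Conceptual grouping of VMCS content -- NOT the architectural layout. */
    struct vmcs_sketch {
        /* VM-execution controls: which guest events force a VM exit,
         * e.g. certain instructions, interrupts, or exceptions. */
        uint32_t pin_based_controls;
        uint32_t processor_based_controls;

        /* VM-exit and VM-entry controls: how state is saved/loaded and
         * whether an event is injected into the guest on entry. */
        uint32_t exit_controls;
        uint32_t entry_controls;

        /* Guest-state area: loaded on VM entry, saved on VM exit. */
        uint64_t guest_rip, guest_rsp, guest_cr0, guest_cr3;

        /* Host-state area: loaded on VM exit so the VMM resumes cleanly. */
        uint64_t host_rip, host_rsp, host_cr3;

        /* VM-exit information: why the last exit happened. */
        uint32_t exit_reason;
        uint64_t exit_qualification;
    };

    int main(void)
    {
        struct vmcs_sketch v = { 0 };
        v.exit_reason = 0;   /* illustrative value only */
        (void)v;
        return 0;
    }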
Xen Design Principles • Support for unmodified application binaries is essential. • Supporting full multi-application operating systems is important. • Para-virtualization is necessary to obtain high performance and strong resource isolation.
Xen Features • Secure isolation between VMs. • Resource control and QoS. • Only the guest kernel needs to be ported: all user-level applications and libraries run unmodified (Linux 2.4/2.6, NetBSD, FreeBSD, WinXP). • Execution performance is close to native. • Live migration of VMs between Xen nodes.
Xen 3.0 Architecture (figure)
Xen Para-Virtualization • Arch xen/x86: privileged instructions are replaced with Xen hypercalls. • Hypercalls • Notifications are delivered to domains from Xen using an asynchronous event mechanism. • The guest OS is modified to understand the virtualized environment • Wall-clock time vs. virtual processor time: Xen provides both types of alarm timer. • Real resource availability is exposed to the guest. • Xen hypervisor • An additional protection domain between guest OSes and the I/O devices.
x86 Processor Virtualization • Xen runs in ring 0 (most privileged). • Rings 1 and 2 are used for the guest OS, ring 3 for user space. • Xen lives in the top 64 MB of the linear address space. • Segmentation is used to protect Xen, because switching page tables is too slow on standard x86. • Hypercalls jump to Xen in ring 0. • The guest OS may install a "fast trap" handler. • MMU virtualization: shadow vs. direct mode.
Para-Virtualizing the MMU • The guest OS allocates and manages its own page tables. • Hypercalls are used to change the page-table base. • The Xen hypervisor is responsible for trapping accesses to the virtual page table, validating updates, and propagating changes. • Xen must validate page-table updates before use • Updates may be queued and batch-processed (see the sketch below). • Validation rules are applied to each PTE: a guest may only map pages it owns. • XenoLinux implements a balloon driver • It adjusts a domain's memory usage by passing memory pages back and forth between Xen and XenoLinux.
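The queue-and-batch idea can be sketched as follows. hypercall_mmu_update is a stub standing in for Xen's page-table-update hypercall, and the structure layout is only loosely modeled on that interface; the point of the example is that a guest collects PTE updates and flushes them to the hypervisor in one batch instead of trapping on every write.

    #include <stdint.h>
    #include <stdio.h>

    #define QUEUE_LEN 16

    /* One page-table update request: which PTE to change and its new value
     * (hypothetical layout, loosely modeled on Xen's mmu_update interface). */
    typedef struct { uint64_t ptr; uint64_t val; } mmu_update_t;

    /* Stub standing in for the real hypercall: Xen would validate each
     * update (a guest may only map pages it owns) before applying it. */
    static int hypercall_mmu_update(const mmu_update_t *req, unsigned count)
    {
        (void)req;
        printf("hypercall: applying %u queued PTE update(s)\n", count);
        return 0;
    }

    static mmu_update_t queue[QUEUE_LEN];
    static unsigned queued = 0;

    /* Flush all pending updates with a single hypercall. */
    static void ptwr_flush(void)
    {
        if (queued) {
            hypercall_mmu_update(queue, queued);
            queued = 0;
        }
    }

    /* Queue a PTE update; batch-process when the queue is full. */
    static void queue_pte_update(uint64_t pte_machine_addr, uint64_t new_val)
    {
        if (queued == QUEUE_LEN)
            ptwr_flush();
        queue[queued].ptr = pte_machine_addr;
        queue[queued].val = new_val;
        queued++;
    }

    int main(void)
    {
        queue_pte_update(0x100000, 0x2000 | 0x1);  /* map a page, present bit set */
        queue_pte_update(0x100008, 0x3000 | 0x1);
        ptwr_flush();                              /* one hypercall for both updates */
        return 0;
    }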
MMU Virtualization (figure)
Writable Page Tables (figure)
I/O Architecture • Asynchronous buffer-descriptor rings, using shared memory. • Xen I/O spaces delegate to guest OSes protected access to specified hardware devices. • The guest OS passes buffer information vertically through the system; Xen performs validation checks. • Xen supports a lightweight event-delivery mechanism used for sending asynchronous notifications to a domain.
Data Transfer: I/O Descriptor Rings (figure)
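The descriptor rings in the figure are shared-memory circular buffers with separate producer and consumer indices, so a guest can post I/O requests and Xen (or a driver domain) can consume them asynchronously. The sketch below is a single-producer/single-consumer ring with hypothetical field names, not Xen's exact ring layout.

    #include <stdint.h>
    #include <stdio.h>

    #define RING_SIZE 8   /* power of two so indices can wrap with a mask */

    /* One I/O request descriptor (hypothetical fields). */
    struct io_desc { uint64_t buffer_addr; uint32_t length; uint32_t id; };

    /* Shared-memory ring: the guest advances 'req_prod', the backend 'req_cons'. */
    struct io_ring {
        volatile uint32_t req_prod;
        volatile uint32_t req_cons;
        struct io_desc    ring[RING_SIZE];
    };

    static int ring_put(struct io_ring *r, struct io_desc d)
    {
        if (r->req_prod - r->req_cons == RING_SIZE)
            return -1;                               /* ring full */
        r->ring[r->req_prod & (RING_SIZE - 1)] = d;
        r->req_prod++;                               /* then notify via an event channel */
        return 0;
    }

    static int ring_get(struct io_ring *r, struct io_desc *out)
    {
        if (r->req_cons == r->req_prod)
            return -1;                               /* nothing pending */
        *out = r->ring[r->req_cons & (RING_SIZE - 1)];
        r->req_cons++;
        return 0;
    }

    int main(void)
    {
        struct io_ring ring = { 0 };
        struct io_desc req = { 0xdeadbeef, 4096, 1 }, got;
        ring_put(&ring, req);                        /* guest posts a request */
        if (ring_get(&ring, &got))                   /* backend consumes it */
            return 1;
        printf("consumed request id %u, %u bytes\n", got.id, got.length);
        return 0;
    }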
Device Channel Interface (figure)
Performance (figure)