System Virtual Machines Chapter 8.3 ~ 8.7. October 25, 2006 Yoo Jonghun jhyoo@redwood.snu.ac.kr RTOS Lab., SoEECS, SNU. Presentation Outline. Resource Virtualization – Input/Output Performance Enhancement of System Virtual Machines Case Study: VMware Virtual Platform
System Virtual Machines Chapter 8.3 ~ 8.7 October 25, 2006 Yoo Jonghun jhyoo@redwood.snu.ac.kr RTOS Lab., SoEECS, SNU
Presentation Outline • Resource Virtualization – Input/Output • Performance Enhancement of System Virtual Machines • Case Study: VMware Virtual Platform • Case Study: The Intel VT-x (Vanderpool) Technology
Virtualizing Devices • Dedicated devices • Dedicated for long time while the guest VM is active • E.g., Display, keyboard, mouse • Partitioned devices • Partitioned for each guest VM • E.g., Disk • Shared devices • Shared among guest VMs at a fine time granularity • E.g., Network adapter
Virtualizing Devices • Spooled devices • Shared among guest VMs at a much coarser granularity (e.g., one whole print job at a time) • E.g., Printer • Two-level spool table
[Figure: two-level spool tables. Virtual Machine 1 and Virtual Machine 2 each keep their own spool table, and the VMM keeps a combined spool table; column headings include Program, Size, Location, Real loc, VM, and Status.]
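The two-level spool table can be sketched as follows. This is a minimal illustration, not the layout of any real VMM: the class names, field names, and status strings are assumptions chosen to echo the figure's column headings.

```python
from dataclasses import dataclass, field


@dataclass
class GuestSpoolEntry:
    """One row of a guest VM's own spool table (hypothetical layout)."""
    program: str
    size: int
    status: str = "Waiting"


@dataclass
class VmmSpooler:
    """Second-level spool table kept by the VMM for the real printer."""
    jobs: list = field(default_factory=list)

    def submit(self, vm_id: int, entry: GuestSpoolEntry) -> None:
        # From the guest's point of view the job is done once it is handed over.
        entry.status = "Completed"
        self.jobs.append({"vm": vm_id, "program": entry.program,
                          "size": entry.size, "status": "Waiting"})

    def print_next(self):
        # The real printer serves whole jobs, one at a time, in arrival order.
        for job in self.jobs:
            if job["status"] == "Waiting":
                job["status"] = "Printed"
                return job
        return None
```

Each guest sees only its own table, while the VMM interleaves complete jobs from all guests onto the single physical device.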
Virtualizing Devices • Nonexistent physical devices • E.g., Virtual network adapter • Virtual NIC in VMware
Virtualizing I/O Activity • I/O action • Major interfaces in I/O action • Possible interception points • System call interface • Device driver interface • Operation-level interface
[Figure: I/O path and its interfaces: Application → (system calls) → Operating system → (driver calls) → I/O drivers → (physical memory and I/O operations) → Hardware; the VMM can intercept at any of these interfaces.]
Virtualizing I/O Activity • Virtualizing at the I/O operation level • Instructions for I/O operations • Processors with memory-mapped I/O: loads from or stores to specific memory addresses • System/360 or IA-32: special I/O instructions • These instructions are easy for the VMM to intercept • However, it is extremely difficult for the VMM to determine exactly what I/O action is being requested
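The difficulty at the operation level is that one logical request arrives as many separately trapped instructions. The sketch below is a toy device model, under the assumption that the guest programs a legacy IDE-style disk through port writes; the class name is hypothetical, the port numbers and command code are illustrative, and real controllers have far more state.

```python
class IdeDiskModel:
    """Toy device model fed by trapped OUT instructions (one call per trap)."""

    # Illustrative register ports and command code for a legacy IDE-style disk.
    SECTOR_COUNT, LBA_LOW, LBA_MID, LBA_HIGH, COMMAND = 0x1F2, 0x1F3, 0x1F4, 0x1F5, 0x1F7
    READ_SECTORS = 0x20

    def __init__(self):
        self.regs = {}       # last byte written to each port
        self.requests = []   # high-level requests reconstructed so far

    def on_out(self, port, value):
        # A single trapped OUT only tells the VMM "byte X went to port Y".
        self.regs[port] = value & 0xFF
        # Only when the command register is written can the VMM piece
        # together what the guest driver was actually asking for.
        if port == self.COMMAND and self.regs[port] == self.READ_SECTORS:
            lba = (self.regs.get(self.LBA_LOW, 0)
                   | self.regs.get(self.LBA_MID, 0) << 8
                   | self.regs.get(self.LBA_HIGH, 0) << 16)
            self.requests.append(("read", lba, self.regs.get(self.SECTOR_COUNT, 0)))
```

Four parameter writes produce no request at all; the fifth write (the command) finally reveals a single disk read, which is why interception at the driver level is usually more practical.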
Virtualizing I/O Activity • Virtualizing at the device driver level • The VMM must have knowledge of the guest OS • Typical guest OS: Windows, Linux • Virtual device drivers for the guest OS can be distributed to users • Drivers in the host OS can be used in the case of a hosted VM • Virtualizing at the system call level • The VMM must have much broader knowledge of the guest OS to emulate ABI-level operations
I/O virtualization in hosted VMs • It is not necessary to provide device drivers in the VMM • Device drivers of host OS are used indirectly • VMM-n (native) • Intercepts traps • For performance critical small device drivers • VMM-u (user) • Uses host OS’s device drivers • VMM-d (driver) • Communication between VMM-n and VMM-u
Presentation Outline • Resource Virtualization – Input/Output • Performance Enhancement of System Virtual Machines • Case Study: VMware Virtual Platform • Case Study: The Intel VT-x (Vanderpool) Technology
Reasons for performance degradation • Setup • Setting up resources when a VM is activated • Emulation • Sensitive instructions must be emulated • Interrupt handling • Interrupts must be handled by the VMM first • State saving • Saving the state of the VM when control is transferred to the VMM • Bookkeeping • E.g., Accounting of time charged to a user • Time elongation • E.g., Accessing the shadow page table
Solutions • H/W techniques for improving performance • IBM VM/370 assist collection
Instruction emulation assists • The HW (via microcode) performs the emulation of special instructions • E.g., in System/370, 13 instructions are assisted by HW • LOAD PSW (LPSW), INSERT PSW KEY (IPK), INSERT STORAGE KEY (ISK), LOAD REAL ADDRESS (LRA), RESET REFERENCE BIT (RRB), SUPERVISOR CALL (SVC), SET STORAGE KEY (SSK), SET SYSTEM MASK (SSM), STORE CONTROL (STCTL), STORE THEN AND SYSTEM MASK (STNSM), STORE THEN OR SYSTEM MASK (STOSM), SET PSW KEY FROM ADDRESS (SPKA)
[Figure: LPSW example. Without the assist, LPSW traps; the VMM determines whether the guest VM is in system mode or in user mode and, if it is in system mode, loads the PSW with the corresponding value. With the assist, these steps are performed by HW.]
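The software path that the LPSW assist replaces can be sketched roughly as below. The dictionary layout and return strings are hypothetical, and the user-mode branch (reflecting the violation to the guest OS) is an assumption based on ordinary trap-and-emulate behavior rather than something stated on the slide.

```python
def emulate_lpsw(vm, new_psw):
    """Emulate a trapped LOAD PSW on behalf of a guest (illustrative only)."""
    if vm["virtual_mode"] == "system":
        # Guest OS code: give it the PSW value it asked for.
        vm["psw"] = new_psw
        return "loaded"
    # Guest application code: on real hardware this would be a
    # privileged-operation violation, so hand it to the guest OS (assumption).
    vm["pending_exception"] = "privileged-operation"
    return "reflected"
```

With the assist, the mode check and the PSW load happen in microcode, so the trap into the VMM and the associated state saving are avoided.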
Virtual machine monitor assists • Virtual machine monitor assists (1) • Context switch between VM and VMM • HW saves/restores registers • Decoding of privileged instructions • In a VM, privileged instructions always trap, whereas in a native environment they trap only in user mode • HW decodes privileged instructions to help the VMM
Virtual machine monitor assists • Virtual machine monitor assists (2) • Virtual interval timer • While the guest VM is running, a virtual timer held in a designated memory location is decremented automatically as the real timer advances • Additional instruction set • E.g., Obtain free space from free storage area • Return space to free storage • Page lock/unlock • Translate virtual address and test for shared page • Invalidate segment/page table
Improving Performance of the Guest System • System/370 provides handshaking, by which the guest OS sends a message to the VMM • Nonpaged mode • Turn off virtual memory of the guest OS • The guest OS disables dynamic address translation and defines its real address space to be as large as the largest virtual address space • Page frames are mapped to fixed real pages • No double paging • No potential conflicts in paging decisions by the guest OS and the VMM • Pseudo-page-fault handling • When a page fault occurs, the VMM gives control back to the same VM, so the guest OS can run another process while the page is brought in • Improves fairness among VMs
Improving Performance of the Guest System • Spool files • When a file is ready to print, the guest VM may issue an I/O operation, which is intercepted by the VMM • Instead of relying on that interception, handshaking allows the VM to signal the VMM that a file is ready • Inter-virtual-machine communication • Saves the overhead of processing message packets through communication layers • Paravirtualization • The interface presented by the VM is not identical to the architecture of the underlying processor, but is simplified to eliminate the effect of critical instructions
Specialized Systems • Virtual-equals-real (V=R) virtual machine • The host address space representing the guest real memory is mapped one-to-one to the host real memory address space • Channel programs do not need to be retranslated • Shadow-table bypass assist • Multi-level mapping is very expensive • With HW assistance, trusted guests are allowed to access the memory mapping table directly • IBM found that most guest operating systems are well behaved
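A small sketch may help show why V=R removes work. Page tables are modeled as plain dictionaries, and the function names are hypothetical; this illustrates the mapping composition only, not any IBM data structure.

```python
def build_shadow_table(guest_page_table, vmm_map):
    """Compose guest virtual -> guest real with guest real -> host real."""
    return {vpage: vmm_map[real] for vpage, real in guest_page_table.items()}


def identity_map(num_pages):
    """V=R: guest real page n is host real page n."""
    return {page: page for page in range(num_pages)}
```

In the general case the VMM must rebuild the composed (shadow) table whenever the guest changes its own table. Under V=R the composition is the identity, so the guest's table already holds host real addresses and can be used as is.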
Specialized Systems • Preferred-machine assist • Allows a guest operating system to operate in system mode rather than user mode • Only minimal checks are imposed on the use of privileged instructions by the guest • Segment sharing • Sharing the code segments of the operating system among the virtual machines, provided the operating system code is written in a reentrant manner • Alleviates TLB pressure
Generalized Support for Virtual Machines • Interpretive Execution Facility (IEF) • The processor directly executes most of the functions of the virtual machine in hardware. • An extreme case of a VM assist. • Interpretive Execution Entry and Exit • Entry • Start Interpretive Execution (SIE): The software gives up control to the hardware IEF and the processor enters interpretive execution mode. • Exit • Host interrupt • Interception • Unsupported hardware instructions. • Exception during the execution of an interpreted instruction. • Some special case…
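The entry/exit cycle can be sketched as a loop in which the VMM repeatedly issues SIE and reacts to the exit reason. This is a toy model: the instruction stream is a list of mnemonic strings, host interrupts are injected at chosen positions, and all function and parameter names are assumptions made for illustration.

```python
def sie(guest, pc, supported, pending_interrupts):
    """Model of interpretive execution: run until something forces an exit."""
    while pc < len(guest):
        if pc in pending_interrupts:
            pending_interrupts.discard(pc)
            return pc, "host_interrupt"      # exit for host interrupt
        if guest[pc] not in supported:
            return pc, "interception"        # hardware cannot interpret this one
        pc += 1                              # executed directly by hardware
    return pc, "done"


def vmm_loop(guest, supported, interrupt_at=()):
    """VMM software: enter with SIE, handle each exit, then re-enter."""
    pending = set(interrupt_at)
    pc, log = 0, []
    while True:
        pc, reason = sie(guest, pc, supported, pending)
        log.append(reason)
        if reason == "done":
            return log
        if reason == "interception":
            pc += 1  # VMM emulates the instruction in software, then skips it
        # After a host interrupt the host handler runs and the guest resumes
        # at the same instruction.
```

For example, `vmm_loop(["add", "lpsw", "add"], {"add"}, interrupt_at={2})` exits once for an interception and once for a host interrupt before the guest stream completes.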
Generalized Support for Virtual Machines
[Figure: interpretive execution control flow. VMM software issues SIE to enter interpretive execution mode; an exit for interception returns to emulation in the VMM software, and an exit for host interrupt goes to the host interrupt handler.]
Presentation Outline • Resource Virtualization – Input/Output • Performance Enhancement of System Virtual Machines • Case Study: VMware Virtual Platform • Case Study: The Intel VT-x (Vanderpool) Technology
Challenges for VMware • VMware is a popular virtual machine system for IA-32 • Challenges for IA-32 • Not intended to support multiple users • Openness of system architecture • Different types of devices • Installation/removal must be easy • A hosted VM design was chosen to cope with the above problems
VMware Components • VMApp (VMM-u) • VMDriver (VMM-d) • VMMonitor (VMM-n)
Processor Virtualization • The IA-32 architecture is not efficiently virtualizable • Theorem 1 (Popek and Goldberg) is violated • i.e., there are 17 instructions that are sensitive but not privileged • Hybrid VM • Discover critical instructions and patch them • Critical instructions • Protection system references • Reference the storage protection system, memory system, or address relocation system (e.g., mov ax, cs) • Sensitive register instructions • Read or change resource-related registers and memory locations such as a clock register or interrupt registers (e.g., POPF)
Processor Virtualization • Problems • Sensitive instructions executed in user mode do not behave as expected unless they are emulated • Solutions • The VM monitor replaces the instruction with another sequence of instructions that emulates the action of the original code
Processor Virtualization • For example, the popfd instruction • popfd pops a doubleword from the top of the stack and stores it in the EFLAGS register • One bit of EFLAGS is IF (Interrupt-enable Flag) • It is modified in system mode • It is left unchanged in user mode • Solution • The VMM scans the instruction stream • If it detects popfd, it substitutes a set of instructions that take the processor into privileged mode and emulate the popfd instruction
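The popfd problem and the scan-and-patch fix can be sketched as follows. This is a deliberately simplified model: only the IF bit (bit 9 of EFLAGS) is treated specially, the instruction stream is a list of mnemonic strings, and the function names and the `"vmm_emulate_popfd"` marker are hypothetical.

```python
IF_MASK = 1 << 9  # interrupt-enable flag is bit 9 of EFLAGS


def native_popfd(eflags, popped, privileged):
    """What the hardware does: in user mode IF is silently left unchanged."""
    if privileged:
        return popped
    return (popped & ~IF_MASK) | (eflags & IF_MASK)


def patch(stream):
    """VMM scan: replace each popfd with a call into the VMM's emulation."""
    return ["vmm_emulate_popfd" if op == "popfd" else op for op in stream]


def vmm_emulate_popfd(vcpu, popped):
    """Apply popfd to the guest's *virtual* EFLAGS using its virtual mode."""
    vcpu["eflags"] = native_popfd(vcpu["eflags"], popped,
                                  vcpu["virtual_privileged"])
```

Because `native_popfd` does not trap in user mode, a guest kernel running deprivileged would believe it had disabled interrupts when it had not; routing the instruction through `vmm_emulate_popfd` restores the behavior the guest expects.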
I/O Virtualization • The PC platform supports many more devices and types of devices than any other platform • Emulation in VMMonitor • Converting the in and out I/O to new I/O instructions • Requires some knowledge of the device interfaces
[Figure: Virtual device interface (e.g., IDE) → I/O device simulator in VMMonitor → Hardware device interface (e.g., IDE, SCSI)]
I/O Virtualization • Using the services of the host operating system
[Figure: Virtual device interface (e.g., disk read, screen write) → I/O device simulator in VMMonitor → I/O device simulator in VMApp → OS interface commands (e.g., commands in graphics language) → Host operating system (e.g., Linux, Windows) → Hardware device interface (e.g., IDE, SVGA)]
I/O Virtualization • New Capability for Devices Through Abstraction Layer • Undoable disk • The disk of the VM can be treated as a file on the host OS • An explicit command is needed to actually perform (commit) disk writes • Virtual Ethernet switch between a virtual NIC and a physical NIC • Reduces performance losses due to virtualization • Alternative user interface • A window can be used instead of the whole display device
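One way to picture an undoable disk is a base image plus a log of pending writes that is applied only on an explicit commit. The sketch below is an assumption about how such a layer could work, not VMware's implementation; the class and method names are hypothetical and blocks are modeled as dictionary entries.

```python
class UndoableDisk:
    """Virtual disk whose writes stay pending until explicitly committed."""

    def __init__(self, base_blocks):
        self.base = dict(base_blocks)  # contents of the backing file
        self.pending = {}              # writes made since the last commit

    def write(self, block, data):
        self.pending[block] = data     # never touches the base image

    def read(self, block):
        # The guest always sees its own latest write, committed or not.
        return self.pending.get(block, self.base.get(block))

    def commit(self):
        self.base.update(self.pending)
        self.pending.clear()

    def discard(self):
        self.pending.clear()           # undo everything since the last commit
```

The guest sees an ordinary disk throughout; only the layer underneath decides whether a session's writes ever reach the backing file.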
Memory Virtualization • Paging requests of the guest OS • Not directly intercepted by the VMM, but converted into disk reads/writes. • VMMonitor translates them into requests on the host OS through VMApp. • Page replacement policy of the host OS • The host could replace critical pages of the VM system in competition with other host applications. • VMDriver pins the critical pages of the virtual memory system.
Presentation Outline • Resource Virtualization – Input/Output • Performance Enhancement of System Virtual Machines • Case Study: VMware Virtual Platform • Case Study: The Intel VT-x (Vanderpool) Technology
Overview • VT-x (Vanderpool) technology for IA-32 processors • Conceptually similar to VM assists and Interpretive Execution • Available in recent CPUs • Pentium 4 6x2, Pentium D 9x0, Xeon, Core Duo, Core 2 Duo
The Intel VT-x (Vanderpool) Technology • Motivation • Virtualization problems of IA-32 architecture • Complexity of code and performance overhead • Main Feature • New VMX mode of operation • VMX root • Fully privileged, intended for VM monitor • VMX non-root • Not fully privileged, intended for guest software
Technology Overview
[Figure: VMX mode transitions over time. vmxon moves the processor from regular mode into root mode (VMM); vmlaunch and vmresume enter non-root mode for VM1 or VM2; each VM exit returns to root mode; vmxoff returns to regular mode.]
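The mode transitions in the figure can be modeled as a small state machine. This is a sketch of the transition rules only (vmlaunch for a VM's first entry, vmresume afterwards); the class, its method names, and the string mode labels are hypothetical and nothing here touches real hardware.

```python
class VmxCpu:
    """Toy model of regular / VMX root / VMX non-root mode transitions."""

    def __init__(self):
        self.mode = "regular"
        self.launched = set()   # VMs that have been entered at least once

    def _require_root(self, op):
        if self.mode != "root":
            raise RuntimeError(f"{op} is only valid in VMX root mode")

    def vmxon(self):
        if self.mode != "regular":
            raise RuntimeError("vmxon is only valid in regular mode")
        self.mode = "root"

    def vmlaunch(self, vm):
        self._require_root("vmlaunch")
        if vm in self.launched:
            raise RuntimeError("VM already launched; use vmresume")
        self.launched.add(vm)
        self.mode = f"non-root:{vm}"

    def vmresume(self, vm):
        self._require_root("vmresume")
        if vm not in self.launched:
            raise RuntimeError("VM not launched yet; use vmlaunch")
        self.mode = f"non-root:{vm}"

    def vm_exit(self):
        if not self.mode.startswith("non-root"):
            raise RuntimeError("VM exit can only happen in non-root mode")
        self.mode = "root"

    def vmxoff(self):
        self._require_root("vmxoff")
        self.mode = "regular"
```

Every path between two guests passes through root mode, which is where the VMM regains control and chooses what to run next.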
Capabilities of the Technology • A key aspect • Eliminates the need to run all guest code in user mode • Maintenance of state information • Major source of overhead in a software-based solution • A hardware technique allows all of the state-holding data elements to be mapped to their native structures • VMCS (Virtual Machine Control Structure) • The hardware implementation takes over the tasks of loading and unloading the state from their physical locations
Maintenance of state information • The state of a virtual machine is maintained in the VMCS data structure
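As a rough picture of what the hardware does with the VMCS, the sketch below keeps a guest-state area and a host-state area and swaps register contents on entry and exit. Registers are modeled as a dictionary, and the class and function names are hypothetical; the real VMCS also holds control and exit-information fields that are omitted here.

```python
from dataclasses import dataclass, field


@dataclass
class Vmcs:
    """Simplified VMCS with only its two state areas."""
    guest_state: dict = field(default_factory=dict)
    host_state: dict = field(default_factory=dict)


def vm_entry(cpu_regs, vmcs):
    # On entry the processor loads the guest's saved state.
    cpu_regs.clear()
    cpu_regs.update(vmcs.guest_state)


def vm_exit(cpu_regs, vmcs):
    # On exit the processor saves the guest's state, then loads the host's.
    vmcs.guest_state = dict(cpu_regs)
    cpu_regs.clear()
    cpu_regs.update(vmcs.host_state)
```

In a software-only VMM, each of these copies is explicit code on every transition; with VT-x they are performed by the processor as part of the entry and exit themselves.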