Xen and the Art of Virtualization
A paper from the University of Cambridge, presented by Charlie Schluting for CS533 at Portland State University
Virtualization
• Originally used to provide multiple environments identical to the host OS
• Then virtual machines were used to isolate applications from the machine: Java
• And now we have hardware emulation
Emulation types
• Full hardware emulation: completely simulates the hardware; guest OSes run unmodified
• Paravirtualization: the hardware is not emulated; instead the virtual machine monitor provides an API for guests. This is Xen.
• Native: limited emulation, just enough to let an unmodified guest run on the hardware while still providing isolation
Xen
• Why? Servers sit idle most of the time.
• Paravirtualization: guest OSes must be ported to use the API calls.
• Design goals:
• They don't care about OS compatibility
• They strongly care about resource provisioning and security (and these two go together)
• Performance shouldn't be compromised, either.
The Xen Way
• They give a few reasons why full virtualization of x86 is so slow:
• x86 requires many operations to occur in privileged mode, but attempting certain privileged instructions in non-privileged mode fails silently instead of causing a trap.
• All non-trapping privileged instructions must therefore be caught and handled some other way (see the demo below).
• Virtualizing the MMU is also difficult.
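The classic example of a silent failure is POPF: in ring 3 the CPU simply ignores an attempt to change the interrupt-enable flag (IF) rather than trapping, so a trap-and-emulate VMM never sees the guest's attempt. A minimal demo of this behavior (my own sketch, not from the paper; assumes GCC inline assembly on x86 Linux):

#include <stdio.h>

int main(void)
{
    unsigned long flags;

    /* Read EFLAGS in user mode. */
    __asm__ volatile ("pushf; pop %0" : "=r"(flags));
    printf("IF before: %lu\n", (flags >> 9) & 1);

    /* Try to clear IF (bit 9). In ring 0 this would disable interrupts;
     * in ring 3 the CPU silently leaves IF unchanged -- no trap is raised,
     * so a trap-and-emulate VMM cannot observe the attempt. */
    __asm__ volatile ("push %0; popf" :: "r"(flags & ~(1UL << 9)) : "cc");

    __asm__ volatile ("pushf; pop %0" : "=r"(flags));
    printf("IF after:  %lu\n", (flags >> 9) & 1);  /* still 1 */
    return 0;
}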
The Xen Way, cont'd
• Guest OSes also need to see the real hardware in some cases.
• TCP, for example, uses RTT measurements to size its window (the buffer of sent-but-unacknowledged data), so it needs real time.
• Seeing real machine memory addresses lets the guest OS perform better and page properly.
• Xen therefore provides a "machine abstraction that is similar but not identical to the underlying hardware" [1].
Memory Management
• Sadly, x86 doesn't have a software-managed TLB (misses are handled automatically in hardware), so Xen must ensure that all valid translations are present in the hardware page tables.
• Xen lets each guest OS manage its own page tables directly, with Xen validating every update, rather than maintaining shadow page tables.
• Xen lives in the top 64MB of every address space, to avoid TLB flushes when entering/leaving Xen.
Memory Management
• Guests allocate their own page tables and register them with Xen.
• Future updates are validated by Xen; "registering" means the guest gives up write access to the page-table memory and makes updates via hypercalls (see the sketch below).
• This ensures security: guests can't map memory that doesn't belong to them.
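As a rough illustration, a paravirtualized guest batches page-table writes into a hypercall that Xen validates. The sketch below is loosely modeled on the Xen guest interface (mmu_update_t and HYPERVISOR_mmu_update do appear in the Xen headers, but the exact encodings and the helper shown here are simplified assumptions, not the real API):

#include <stdint.h>

/* One queued page-table write: 'ptr' is the machine address of the PTE
 * (its low bits encode a command), 'val' is the new PTE contents. */
typedef struct mmu_update {
    uint64_t ptr;
    uint64_t val;
} mmu_update_t;

#define MMU_NORMAL_PT_UPDATE 0      /* assumed command encoding */
#define DOMID_SELF 0x7FF0           /* act on our own domain */

/* Provided by the hypervisor ABI: validate and apply 'count' updates. */
extern int HYPERVISOR_mmu_update(mmu_update_t *reqs, unsigned int count,
                                 unsigned int *done, uint16_t domid);

/* Instead of writing the (now read-only) PTE directly, the guest asks
 * Xen to do it; Xen refuses if 'new_pte' maps a frame the guest
 * doesn't own. */
static int set_pte(uint64_t pte_machine_addr, uint64_t new_pte)
{
    mmu_update_t req = {
        .ptr = pte_machine_addr | MMU_NORMAL_PT_UPDATE,
        .val = new_pte,
    };
    unsigned int done = 0;
    return HYPERVISOR_mmu_update(&req, 1, &done, DOMID_SELF);
}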
Physical Memory
• Each guest's allocation happens all at once, so memory is partitioned at domain creation, but a domain's reservation can grow dynamically.
• Xen provides a translation array, readable by all domains, that maps machine frames to guest-"physical" frames; each guest keeps the reverse map (see the sketch below).
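Both maps are just arrays indexed by frame number. A toy sketch of how a guest might bridge the machine/pseudo-physical split when building a PTE (the array and function names here are hypothetical, for illustration only):

#include <stdint.h>

/* Shared, read-only, maintained by Xen:
 * machine frame number -> guest-physical frame number. */
extern uint64_t machine_to_phys[];

/* Per-guest, maintained by the guest OS:
 * guest-physical frame number -> machine frame number. */
extern uint64_t phys_to_machine[];

/* The guest sees a contiguous "physical" space; the real machine frames
 * may be scattered, so every PTE the guest builds must name the machine
 * frame, not the pseudo-physical one. */
static uint64_t make_pte(uint64_t guest_pfn, uint64_t flags)
{
    uint64_t mfn = phys_to_machine[guest_pfn];
    return (mfn << 12) | flags;   /* 4KB frames: machine address | perm bits */
}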
CPU
• The CPU is virtualized, in a sense.
• Xen runs at a higher privilege level than all the guest OSes.
• Guest OSes are modified to run at a lower privilege level (i.e., the OS becomes an application).
• The guest OS protects itself by running in its own address space; context switches between guests go through Xen.
CPU
• x86 has rings 1 and 2, which are rarely used, so guests are modified to run in ring 1.
• They still can't execute privileged instructions, but they're isolated from applications running in ring 3.
• Exception handlers are registered with Xen, which validates them.
Page Faults
• An OS normally reads the faulting address from a privileged register (CR2), which a guest running in ring 1 can't do.
• Xen's handler (running in ring 0) copies the faulting address into an "extended stack frame" for the guest.
• Control is then returned to the guest OS's handler.
• Page faults are the special case; other exceptions and system calls can use "fast" exception handlers that bypass ring 0 entirely, because the guest's registered handler is installed directly in the hardware exception table (see the sketch below).
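A sketch of what handler registration might look like from the guest side, loosely based on the Xen trap-table interface (trap_info_t and HYPERVISOR_set_trap_table exist in the Xen headers, but the field encodings, selector value, and handler names below are simplified assumptions):

#include <stdint.h>

/* One entry in the guest's virtual IDT: which handler to run, and the
 * minimum privilege level allowed to invoke it. */
typedef struct trap_info {
    uint8_t       vector;   /* hardware exception vector, e.g. 14 = #PF */
    uint8_t       flags;    /* low bits: privilege level that may invoke */
    uint16_t      cs;       /* guest kernel code segment selector        */
    unsigned long address;  /* handler entry point                       */
} trap_info_t;

/* Xen validates every entry once at registration time (e.g. that 'cs'
 * never names ring 0), rather than on every exception. */
extern int HYPERVISOR_set_trap_table(trap_info_t *table);

extern void page_fault_entry(void);   /* guest's #PF handler (assumed)   */
extern void syscall_entry(void);      /* guest's int 0x80 handler        */

static trap_info_t traps[] = {
    { 14,   0, 0x11, (unsigned long)page_fault_entry },  /* 0x11: assumed CS */
    { 0x80, 3, 0x11, (unsigned long)syscall_entry },     /* callable from ring 3 */
    { 0, 0, 0, 0 },                                      /* terminator */
};

static void install_traps(void) { HYPERVISOR_set_trap_table(traps); }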
Device I/O
• No emulation!
• Data is passed to Xen via shared memory.
• Passing data is done with I/O rings.
• Ring: a circular queue of descriptors, allocated by the guest.
• Producer/consumer pointers signal Xen or the guest that data is ready (a minimal sketch follows).
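A minimal single-producer/single-consumer ring in the same spirit (this layout is a generic illustration, not the actual Xen descriptor format):

#include <stdint.h>

#define RING_SIZE 64   /* power of two so free-running indices wrap cleanly */

struct desc { uint64_t addr; uint32_t len; uint32_t id; };

/* Lives in a page shared between the guest and Xen. Each side only ever
 * advances its own index, so no lock is needed. */
struct io_ring {
    volatile uint32_t prod;        /* advanced by the request producer */
    volatile uint32_t cons;        /* advanced by the request consumer */
    struct desc ring[RING_SIZE];
};

/* Guest side: queue a request descriptor. The data itself stays where it
 * is; only this small descriptor goes on the ring. */
static int ring_put(struct io_ring *r, struct desc d)
{
    if (r->prod - r->cons == RING_SIZE)
        return -1;                         /* ring full */
    r->ring[r->prod % RING_SIZE] = d;
    __sync_synchronize();                  /* publish descriptor before index */
    r->prod++;
    return 0;
}

/* Xen side: consume the next request, if any. */
static int ring_get(struct io_ring *r, struct desc *out)
{
    if (r->cons == r->prod)
        return -1;                         /* ring empty */
    __sync_synchronize();                  /* read index before descriptor */
    *out = r->ring[r->cons % RING_SIZE];
    r->cons++;
    return 0;
}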
Network I/O
• A packet is sent by placing its buffer descriptor in a transmit ring.
• Packet payloads are never copied between the guest and Xen!
• Xen also implements "rules" that can act as a firewall: packet headers are inspected before being passed to the upper layer (the guest, in this case), just as an OS network stack already does.
Disk I/O
• Guests are given a fixed amount of virtual disk at creation time.
• Virtual Block Devices (VBDs) are presented to the guest.
• Xen handles the translation from VBD requests onto the real disk.
• Once Xen validates a request, the device DMAs directly into the guest's memory.
• Zero-copy (see the sketch below).
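In the same I/O-ring style, a block request might look roughly like this (the struct and field names are illustrative guesses, not the actual Xen block interface):

#include <stdint.h>

#define BLK_READ  0
#define BLK_WRITE 1

/* One virtual-block-device request descriptor placed on a disk I/O ring.
 * Xen checks that 'vbd_sector' falls inside this guest's VBD extent and
 * that 'frame' is a machine frame the guest actually owns, then lets the
 * controller DMA straight into/out of that frame -- no copy through Xen. */
struct blk_request {
    uint8_t  operation;    /* BLK_READ or BLK_WRITE */
    uint64_t vbd_sector;   /* sector within the guest's virtual disk */
    uint64_t frame;        /* guest-owned machine frame for the DMA */
    uint32_t nr_sectors;
    uint32_t id;           /* echoed back in the response descriptor */
};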
Performance is Key
• But OSes have to be modified...
• It's only about 3000 lines of code in Linux.
• They're working on a Windows port, but will never be able to release it.
Performance
• As they claim, Xen is only about 8% slower most of the time.
• It's interesting that they chose to test with a 500-byte MTU on a gigabit network (graph omitted).
Performance When Running Many Xens
• Their goal was to scale to 100 guests running at a time.
• On dual-processor machines, performance in all tests nearly doubled when running two OSes.
• Adding more guests, of course, slowed everything down.
• Xen outperforms everything else, though.
• Confusing graphs omitted.
Performance is a Security Issue Too
• They ran many benchmarks again, this time with malicious guest OSes running at the same time.
• One ran a fork bomb, one kept allocating 3GB of RAM, freeing it, and starting over, and another copied huge amounts of data from disk.
• The benchmarks were only mildly affected!
Xen, Since the Paper
• New, or previously unmentioned, features:
• Xen can be "live migrated."
• According to the web page, IA64 and PPC ports are underway.
• FreeBSD and NetBSD ports have been completed, and Novell now ships SUSE with Xen.
• Intel contributed support for its Vanderpool extensions, so unmodified VMs can run on Xen, though more slowly, obviously.