Unmodified Device Driver Reuse and Improved System Dependability via Virtual Machines – Joshua LeVasseur, Volkmar Uhlig, Jan Stoess, Stefan Götz (OSDI 2004). Presented by Raju Kumar, CS598C: Virtual Machines
Introduction • Device drivers make up about 70% of the Linux 2.4.1 code base for IA32 • A new OS must either rewrite drivers or reuse drivers from another OS • Reuse is hindered by unavailable source code, undocumented features, and the sheer extent of programming errors in driver code
Contribution • Unmodified reuse of existing device drivers • Strong isolation among device drivers • Fault containment • Configurable degree of driver collocation (isolation vs. resource overhead)
Related Work - Reuse • Binary driver reuse by cohosting (VMware Workstation) • Both the driver OS and the VMM run fully privileged • Transplanting (e.g., OS-Kit) • Requires glue code • Raises semantic conflicts • Forces compromises in the new OS • The transplanted driver still runs with full privileges
Related Work – Semantic Resource Conflicts • The transplanted driver and the host OS compete for the same resources • Accidental denial of service, e.g., the driver disables interrupts and starves the host • Sharing conflicts: the transplanted driver and the host OS are prone to each other's faults • Since driver and OS both run fully privileged, cooperation is required • But transplanting provides no mechanism for cooperation
Related Work – Engineering Effort • Are reused drivers functioning correctly? • Even with transplanting, 12% of OS-Kit's code is glue • Glue handles semantic differences and translates interfaces • Writing glue requires knowledge of the donor OS • Multiple donor OSes make writing glue even more difficult • Driver updates in the donor OS force the glue to be revisited
Related Work - Dependability • User-level device drivers: similar goal, different approach • Nooks • Isolates drivers within kernel protection domains • No privilege isolation, so complete fault isolation is impossible and malicious drivers cannot be detected • Adds 22,000 lines of privileged code to Linux • Uses interposition services to maintain the integrity of resources shared between drivers • This work avoids sharing resources between drivers, using request messages instead
Approach • Drivers are tightly coupled to the kernel; applications are not • Driver independence rests on the following principles • Resource delegation: receive only bulk resources • Separation of name spaces: each driver has its own address space • Separation of privilege: drivers execute in unprivileged mode • Secure isolation: among drivers, and between drivers and applications • Common API
Analysis of principles • In-kernel device drivers violate most of these principles • An OS running in a virtual machine violates none of them • Insight: transplant the whole OS, rather than just the driver
Architecture • DD/OS – an OS run solely to host a device driver • Each DD/OS runs in a VM • The driver controls its device directly via a pass-through enhancement to the VM hosting the DD/OS • A driver cannot access other DD/OSes • Translation module – added to the DD/OS to interface with clients (a request sketch follows) • One translation module can serve multiple DD/OSes, e.g., hard disks, floppy disks, optical media • Drivers execute in separate VMs • Isolates drivers from each other • Allows simultaneous use of drivers from incompatible OSes
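A minimal sketch of what a client request to a disk translation module might look like. The paper does not specify a wire format; the struct fields and names here are hypothetical, chosen only to illustrate that the client names a device-independent operation plus a pre-shared buffer.

```c
/* Hypothetical block-request message from a client to a disk
 * translation module; all names and fields are illustrative. */
#include <stdint.h>
#include <stdio.h>

enum blk_op { BLK_READ, BLK_WRITE };

struct blk_request {
    enum blk_op op;
    uint64_t    sector;       /* start sector on the virtual disk */
    uint32_t    count;        /* number of sectors */
    uint64_t    client_paddr; /* client buffer, pre-shared with the DD/OS */
};

int main(void) {
    struct blk_request req = { BLK_READ, 2048, 8, 0x100000 };
    printf("read %u sectors at %llu into 0x%llx\n",
           req.count, (unsigned long long)req.sector,
           (unsigned long long)req.client_paddr);
    return 0;
}
```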
Virtual Machine Environment • Hypervisor • VMM • DD/OSes • Clients in VMs • Translation modules
Inter-VM Communication • Low-overhead message notification • Request delivery: the source VM raises a communication interrupt in the destination VM • Request completion: the destination VM raises a completion interrupt in the source VM • Low-overhead memory sharing • Memory regions of one VM are registered into another VM's physical memory space
Requests and Responses • Client signals the DD/OS – the VMM delivers a virtual interrupt to the translation module • DD/OS signals the client – the translation module raises a trap into the VMM (sketch below)
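A sketch of the inter-VM request path under these mechanisms: the producer places a request in a ring in shared memory, then asks the VMM to raise a virtual interrupt in the consumer VM. The ring layout and the `vmm_raise_virq` hypercall are assumptions standing in for the real L4 primitives.

```c
/* Shared-ring notification sketch; names are illustrative. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define RING_SLOTS 16

struct ring {
    volatile uint32_t head;            /* written by producer */
    volatile uint32_t tail;            /* written by consumer */
    uint64_t          slot[RING_SLOTS];
};

/* Stand-in for a hypercall that injects a virtual IRQ into a VM. */
static void vmm_raise_virq(int vm_id, int irq) {
    printf("virq %d -> vm %d\n", irq, vm_id);
}

static int ring_put(struct ring *r, uint64_t req) {
    if (r->head - r->tail == RING_SLOTS) return -1;  /* ring full */
    r->slot[r->head % RING_SLOTS] = req;
    r->head++;                 /* real code needs a memory barrier here */
    return 0;
}

int main(void) {
    struct ring r; memset(&r, 0, sizeof r);
    if (ring_put(&r, 0xdead) == 0)
        vmm_raise_virq(/*dd_os vm=*/1, /*translation module irq=*/5);
    return 0;
}
```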
Enhancing Dependability • Driver isolation • Improves reliability by preventing fault propagation • Improves availability: a faulted driver is recovered by rebooting its VM • Continuum of configurations: one VM per driver vs. groups of drivers per VM
Driver Restart • Asynchronous – reset the driver, on fault detection or when a driver turns malicious • Synchronous – negotiation and quiescing, for live upgrades and proactive restarts • An indirection layer captures accesses to a restarting driver (sketch below) • The restart can be transparent to clients, or the fault can be signaled to them
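A sketch, under assumed names, of what the indirection layer could do during a restart: requests issued while the DD/OS is rebooting are parked and replayed once it comes back, which is what makes the restart transparent to clients.

```c
/* Restart-indirection sketch: park requests while the driver VM
 * reboots, replay afterwards. All names are hypothetical. */
#include <stdio.h>

enum drv_state { DRV_UP, DRV_RESTARTING };

static enum drv_state state = DRV_UP;
static int pending[32];
static int npending = 0;

static void deliver(int req) { printf("deliver %d\n", req); }

static void submit(int req) {
    if (state == DRV_RESTARTING && npending < 32)
        pending[npending++] = req;   /* park until restart completes */
    else
        deliver(req);
}

static void restart_done(void) {
    state = DRV_UP;
    for (int i = 0; i < npending; i++) deliver(pending[i]);
    npending = 0;
}

int main(void) {
    submit(1);
    state = DRV_RESTARTING;
    submit(2);                       /* queued, not lost */
    restart_done();                  /* replayed */
    return 0;
}
```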
Virtualization Issues • A DD/OS consumes more resources than a bare driver • DMA operations reference guest addresses • The timing assumptions of physical hardware can be violated • The host OS has to collaborate with the DD/OS to control the driver
DMA address translation • DMA addresses issued by the DD/OS reference guest physical addresses • These differ from host physical addresses • Translation: the VMM intercepts DMA setup and translates the addresses (sketch below)
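A sketch of the fixup the VMM performs: before an address is programmed into a device, the guest-physical page is looked up in the VM's page map and rewritten to host-physical. The map itself is an assumed data structure.

```c
/* Guest-physical to host-physical DMA translation sketch. */
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12
#define GUEST_PAGES 4

/* guest page number -> host page number, established at VM setup */
static const uint64_t g2h[GUEST_PAGES] = { 0x800, 0x801, 0x9f0, 0x9f1 };

static int dma_translate(uint64_t gpa, uint64_t *hpa) {
    uint64_t gpn = gpa >> PAGE_SHIFT;
    if (gpn >= GUEST_PAGES) return -1;       /* outside the VM: reject */
    *hpa = (g2h[gpn] << PAGE_SHIFT) | (gpa & ((1u << PAGE_SHIFT) - 1));
    return 0;
}

int main(void) {
    uint64_t hpa;
    if (dma_translate(0x2010, &hpa) == 0)
        printf("gpa 0x2010 -> hpa 0x%llx\n", (unsigned long long)hpa);
    return 0;
}
```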
DMA and Security • A DD/OS can perform DMA to physical memory that the memory protection system would forbid • E.g., use DMA to overwrite hypervisor code or data • In the absence of hardware support to restrict DMA access, device drivers remain part of the TCB
DMA and Trust • Cases of distrust by the hypervisor • Client only • Client and DD/OS • Client and DD/OS, which also distrust each other • When the DD/OS is untrusted • The hypervisor enables DMA permissions to client memory • And restricts the DD/OS's actions in client memory • When the DD/OS and client distrust each other • The client pins its own memory • The DD/OS verifies the pinning of the client's memory via the hypervisor (sketch below)
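A sketch of the mutual-distrust check: before using a client buffer for DMA, the DD/OS asks the hypervisor whether the client pinned it. The hypercalls and bookkeeping are assumptions; the point is only that the pinning record lives in the hypervisor, not in either untrusted party.

```c
/* Pin-verification sketch; hypercalls are hypothetical stand-ins. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_PINS 8
static uint64_t pinned_pages[MAX_PINS];   /* maintained by hypervisor */
static int npins = 0;

static void hv_pin(uint64_t page) {       /* invoked by the client */
    if (npins < MAX_PINS) pinned_pages[npins++] = page;
}

static bool hv_is_pinned(uint64_t page) { /* queried by the DD/OS */
    for (int i = 0; i < npins; i++)
        if (pinned_pages[i] == page) return true;
    return false;
}

int main(void) {
    hv_pin(0x42);
    printf("page 0x42 pinned: %d\n", hv_is_pinned(0x42));  /* 1 */
    printf("page 0x43 pinned: %d\n", hv_is_pinned(0x43));  /* 0 */
    return 0;
}
```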
DMA and Trust contd… • What if a VM faults and restarts while its device is performing DMA? • All targeted memory cannot be reclaimed until all such DMA operations complete or abort • What is "targeted memory"? DD/OS memory? The client's pinned memory? • No solution is provided to this problem • A client whose memory is pinned by a faulted, rebooting DD/OS should not use the pinned memory until the restart has completed • And then what? Will the DD/OS signal completion? What if the DMA completes before the VM restarts? What if the VM fails to start at all?
IO-MMU and IO Contexts • IO-MMU • Designed to overcome the 32-bit DMA address limitation on 64-bit systems • Can also enforce access permissions and address translation for DMA operations • With it, DD/OSes are hardware-isolated • Hence device drivers can be excluded from the TCB • More questions – so does this work assume device drivers are in the TCB or not? If in the TCB, we cannot do anything. If not, hardware isolation already prevents malicious drivers, so we do not need to do anything. So?
IO-MMU contd… • The IO-MMU does not support multiple address contexts • So the IO-MMU is time-multiplexed between PCI devices (sketch below) • Timeouts may occur in several device drivers • Question – how many PCI devices does a system generally have? The device drivers are ultimately the deciding granularity. Would it be better to group all device drivers into one DD/OS and avoid all contention? If yes, we have a tradeoff between performance and fault isolation. • The impact on a Gigabit Ethernet NIC is proportional to its bus access • The impact of multiplexing is decreased by dynamic bus allocation based on device utilization – prefer active and asynchronous devices • The IO-MMU must be used to ensure device driver isolation; there are no other options yet
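A toy sketch of utilization-based bus allocation for a single-context IO-MMU: the device with the highest recent DMA activity gets the next time slice, so active, asynchronous devices are preferred. The device list and weights are invented for illustration.

```c
/* Dynamic IO-MMU context allocation sketch. */
#include <stdio.h>

struct pci_dev { const char *name; int util; /* recent DMA activity */ };

static int pick_next(struct pci_dev *devs, int n) {
    int best = 0;
    for (int i = 1; i < n; i++)
        if (devs[i].util > devs[best].util) best = i;
    return best;
}

int main(void) {
    struct pci_dev devs[] = { {"disk", 10}, {"nic", 80}, {"floppy", 0} };
    int next = pick_next(devs, 3);
    printf("grant IO-MMU context to %s\n", devs[next].name);
    return 0;
}
```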
Resource Consumption • Each driver module now costs the memory of a whole OS • Periodic tasks in the DD/OS create cache and TLB footprints • Question – the paper claims periodic tasks in a DD/OS impose overhead on clients even when no device driver is in use. How? • Page sharing uses the schemes of VMware ESX Server (sketch below) • The steady-state cache footprint of multiple DD/OSes is low due to high sharing • VM pages can be swapped out to disk • Except pages of the VM hosting the DD/OS for the swap device • And pages of any VM hosting a DD/OS used by the swap device • More questions • When treating the DD/OS as a black box, we cannot swap out unused parts of the swap DD/OS via working-set analysis; all parts of the OS must always be in main memory to guarantee full functionality even for rare corner cases • Black box – we do not know which pages are used, and all parts of the OS must stay in main memory. Then what can be paged out, and how do we find it?
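A sketch of ESX-style content-based page sharing across DD/OS instances: pages are hashed, and pages with equal hashes (confirmed by a full compare) are backed by one read-only, copy-on-write frame. The toy FNV-1a hash stands in for ESX's real scheme.

```c
/* Content-based page sharing sketch (ESX-style). */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PAGE 4096

static uint64_t page_hash(const uint8_t *p) {
    uint64_t h = 1469598103934665603ull;        /* FNV-1a offset basis */
    for (int i = 0; i < PAGE; i++) { h ^= p[i]; h *= 1099511628211ull; }
    return h;
}

int main(void) {
    static uint8_t a[PAGE], b[PAGE];            /* two zero pages */
    if (page_hash(a) == page_hash(b) && memcmp(a, b, PAGE) == 0)
        printf("pages can share one frame\n");  /* map read-only, COW */
    return 0;
}
```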
Reducing Memory Footprint • In addition to memory sharing and swapping • Memory ballooning inside the DD/OS (sketch below) • Does it acquire pages and zero them out? Details not provided. • Zero pages are handled specially • Non-working-set pages that cannot be swapped are compressed, and uncompressed upon access • Periodic tasks increase the DD/OS footprint • These techniques may still not meet strict memory requirements
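A sketch of ballooning inside a DD/OS: a driver allocates guest pages (so the guest stops using them) and donates them to the hypervisor. Whether the pages are zeroed first is exactly the detail the slide flags as unspecified; this sketch zeroes them so they could also be shared as zero pages. `malloc` stands in for a guest page allocator, and `hv_donate` for a hypercall.

```c
/* Balloon-driver sketch; allocator and hypercall are stand-ins. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PAGE 4096

static void hv_donate(void *page) {          /* stand-in hypercall */
    printf("donated page at %p\n", page);
}

static void balloon_inflate(int pages) {
    for (int i = 0; i < pages; i++) {
        void *p = malloc(PAGE);              /* claim a guest page */
        if (!p) break;
        memset(p, 0, PAGE);                  /* zero before donating */
        hv_donate(p);                        /* balloon keeps the page */
    }
}

int main(void) { balloon_inflate(2); return 0; }
```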
Timing • Virtual time vs. real time • Devices can malfunction when drivers' timing assumptions are violated • Soft preemption • If the DD/OS has interrupts disabled, the VMM does not preempt the VM until interrupts are re-enabled • Hard preemption • Preempt even with interrupts disabled (policy sketch below)
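One way to read the soft/hard distinction as a single policy, sketched with an assumed grace budget: the VMM defers preemption while virtual interrupts are disabled, but forces it once the VM overruns its budget.

```c
/* Soft vs. hard preemption decision sketch; budget is hypothetical. */
#include <stdbool.h>
#include <stdio.h>

static bool should_preempt(bool irqs_disabled, int overrun_us, int budget_us) {
    if (!irqs_disabled) return true;     /* normal preemption */
    return overrun_us > budget_us;       /* hard preemption kicks in */
}

int main(void) {
    printf("%d\n", should_preempt(true, 50, 100));   /* 0: soft, wait */
    printf("%d\n", should_preempt(true, 500, 100));  /* 1: hard, force */
    return 0;
}
```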
Shared Hardware and Recursion • Devices must be time-shared • Time-sharing the PCI bus is difficult • Solution: let one DD/OS control PCI • This DD/OS interposes on all PCI access and applies a sharing policy (sketch below)
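A sketch of what that interposition could look like: the PCI DD/OS keeps an ownership table and lets each DD/OS touch only its own devices. The table and the check are assumptions about the policy, not the paper's code.

```c
/* PCI interposition sketch: per-VM device ownership check. */
#include <stdbool.h>
#include <stdio.h>

struct pci_slot { int bus, dev; int owner_vm; };

static const struct pci_slot slots[] = {
    { 0, 3, /*net DD/OS*/  1 },
    { 0, 5, /*disk DD/OS*/ 2 },
};

static bool pci_access_allowed(int vm, int bus, int dev) {
    for (unsigned i = 0; i < sizeof slots / sizeof slots[0]; i++)
        if (slots[i].bus == bus && slots[i].dev == dev)
            return slots[i].owner_vm == vm;
    return false;                        /* unknown device: deny */
}

int main(void) {
    printf("%d\n", pci_access_allowed(1, 0, 3));  /* 1: allowed */
    printf("%d\n", pci_access_allowed(1, 0, 5));  /* 0: denied */
    return 0;
}
```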
Results • Implemented a driver reuse system • Evaluated network, disk, and PCI drivers • The hypervisor and VMM form a paravirtualization environment
Virtualization Environment • Hypervisor: L4 • VMM: user-level L4 task • DD/OS: Linux kernel 2.4.22 • Client OS: Linux kernel 2.6.8.1
Translation Modules • Disk interface • Added to the DD/OS as a kernel module • Communicates with the block layer • Network interface (sketch below) • Added to the DD/OS as a device driver • Presents itself to the DD/OS as a network device attached to a virtual interconnect • Inbound packets are delivered asynchronously • Outbound – the transmitting device fetches packets from the client via DMA • Inbound – L4 copies packets from the DD/OS to the client • PCI interface • More questions – "When the PCI driver is isolated, it helps the other DD/OS instances discover their appropriate devices on the bus, and restricts device access to only the appropriate DD/OS instances." – ? • Executed at a lower priority than all other components • More questions – priority is not privilege; would PCI performance not affect system performance drastically? The paper says the PCI interface is not performance-critical. Why?
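A user-space model of the network translation module's trick: it looks like an ordinary network device to the DD/OS, but its transmit hook hands packets to the client VM instead of hardware. The struct and function names loosely mimic, and do not reproduce, the Linux 2.4 driver API.

```c
/* Model of a network translation module posing as a net device. */
#include <stdio.h>

struct net_device {
    const char *name;
    int (*hard_start_xmit)(struct net_device *dev, const void *pkt, int len);
};

static int vnet_xmit(struct net_device *dev, const void *pkt, int len) {
    (void)pkt;
    /* Real module: hand the packet to the client VM and raise a
     * virtual interrupt; here we just log the forwarding. */
    printf("%s: forwarded %d-byte packet to client VM\n", dev->name, len);
    return 0;
}

int main(void) {
    struct net_device vnet = { "vnet0", vnet_xmit };
    char pkt[64] = {0};
    vnet.hard_start_xmit(&vnet, pkt, sizeof pkt);
    return 0;
}
```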
Conclusion • Provides reuse of unmodified device drivers • Network throughput within 3–8% of native Linux • Each DD/OS consumes 0.6–1.8% of the CPU (approximately 0.12%)